HP Recommended

I recently got a refurb HP Z8-G4 with 2x Xeon Platinum 8273CL (28 cores each, 2.2GHz base clock) and 384GB of RAM (every slot w/ 16GB 2933Y), but my intense processing (PixInsight or Excel) does not seem to be fully utilizing both CPUs. HP Performance Advisor and Windows Task Manager both see both CPUs and all the RAM.

 

When PixInsight (PI, specialized astrophotography processing s/w) loads, it sees all 112 threads, but when running WBPP (calibrating, aligning, and stacking many photos into a single combined image) on 253x QHY268C subs, PI seems to be using only 1 CPU at 100% with the other CPU at <5%. By comparison, the prior Z840 with 2x Xeon E5-2680v3 (12 cores each, 2.5GHz base clock) and 256GB of RAM would use both CPUs with ~equivalent loads, peaking at 100% for long stretches.

 

I have noticed similar behavior when Excel is doing an intensive process, with CPU0 at 100% and CPU1 just above an idle.

Here is what the Z8-G4 CPU use looks like during an intense part of WBPP (debayering). Notice that CPU0 is pegged at 100%, while CPU1 is only lightly exercised. Also, at this intense point the CPU clock is only 2.6GHz (though later I saw 3.10GHz), far slower than its 3.7GHz turbo max. I am running several other things as well: many Chrome tabs, several Excel sessions, and JRiver playing music.

 

Task Mgr Z8 WBPP.png

Here is the Z8 block diagram:

 

Z8 Block Diagram.png

 

Here are the configuration details:

 

Z8 Configuration.png

When I am not doing anything intensive, e.g. no Excel calc or running PI processes, but many things open with only internet and music playing, both CPUs are at ~1%-3% with very similar variation.

 

How can I get the Z8 to use both CPUs ~ evenly during intensive processes?

5 REPLIES
HP Recommended

My Z840 never had this problem, as it would use both CPUs up to 100% and use much more RAM (my Z8 has 384GB vs. 256GB in my Z840).  PixInsight's developer, Juan, says it is a Windows core allocation problem.

 

Even w/ 1 hand tied behind its back, the Z8 is ~12% faster than the Z840 (on the exact same data set), but I was expecting more than 2x improvement.  To do the complete WBPP on 253x OSC QHY268C subs, the Z8 took 2hr04min to output separate R,G,B stacks.  I see the Z8 was not using nearly as much RAM as my Z840 would have, but that makes some sense as it was not using all the cores.
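The "more than 2x" expectation can be roughed out from the core counts and base clocks quoted in this thread. This is only a minimal sketch: it assumes throughput scales with cores x base clock, which ignores IPC differences, turbo behaviour, memory bandwidth, and how well WBPP actually parallelises. `core_ghz` is a hypothetical helper name, not anything from PixInsight or HP.

```python
def core_ghz(sockets: int, cores_per_socket: int, base_ghz: float) -> float:
    """Aggregate core-GHz: a crude proxy for peak parallel throughput."""
    return sockets * cores_per_socket * base_ghz


z840 = core_ghz(2, 12, 2.5)            # 2x Xeon E5-2680 v3
z8_both = core_ghz(2, 28, 2.2)         # 2x Xeon Platinum 8273CL
z8_one = core_ghz(1, 28, 2.2)          # only one socket doing the work

ratio_both = z8_both / z840
ratio_one = z8_one / z840

print(f"Z8 vs Z840, both sockets busy: {ratio_both:.2f}x")
print(f"Z8 vs Z840, one socket busy:   {ratio_one:.2f}x")
```

Under that crude assumption the Z8 comes out at roughly 2.05x with both sockets loaded and roughly 1.03x with one, so a ~12% gain is about what a single busy socket would be expected to deliver.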

 

I saw an old PI thread in which Juan mentioned that some of this had to do with Win10 (though Win10P can use up to 128 cores in a 2-socket install), and that Win11 should be better. But I have not had this problem in the past 4+ years w/ my Z840 running Win10. There must be some custom Windows setting difference between the two. When I first got the Z8, Windows Update wanted to migrate to Win11P, and I did. However, PI's Benchmark was much worse on Win11 (~60%), so I reverted. But I find PI's Benchmark to be almost worthless for the cases that I am concerned with (the big CPU- and RAM-intensive processes), as PI's Benchmark hardly uses any RAM.

HP Recommended

First of all, trying to compare your Z840 with the newer Z workstation is pointless, because:

 

1. The OSes are different, and the Win 11 task scheduler works differently from the Win 10 one.

2. The CPUs in the two systems are vastly different in their internal layout/design.

3. The motherboard chipsets/BIOSes are not even close to each other.

 

Because of these major differences, comparing per-CPU usage percentages between the two systems is not a valid comparison in your case.

 

The Win 11 CPU task scheduler is much improved over the Win 10 one, so I doubt you actually have a problem.

In this case I would ask the program maker's support whether they recommend using a script to set program or thread affinity to specific cores.
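If the vendor does suggest pinning, an affinity mask is just a bitmask with one bit per logical processor. Below is a sketch of building one. `affinity_mask` is a hypothetical helper, and the 64-bit limit is there because a Windows affinity mask describes a single processor group of at most 64 logical processors, so one mask cannot cover all 112 threads of this machine.

```python
def affinity_mask(cpus):
    """Return an integer bitmask with one bit set per logical CPU index."""
    mask = 0
    for cpu in cpus:
        # A single mask only has room for logical processors 0..63.
        if not 0 <= cpu < 64:
            raise ValueError(f"CPU index {cpu} does not fit in one 64-bit mask")
        mask |= 1 << cpu
    return mask


first_eight = affinity_mask(range(8))   # logical processors 0-7
print(f"{first_eight:X}")               # hex form of the mask
```

The hex form is the kind of value that cmd's `start /affinity` takes, but whether pinning helps PixInsight at all is a question for its developers.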

HP Recommended

I have Win10P on both machines.  I originally tried Win11P on the Z8, but its PI Benchmark was significantly worse than it was with Win10P on the Z8.  However, PI Benchmark is a very limited exercise, and uses very little RAM.  After this I reverted back to Win10P.

 

I have not run this bigger process with Win11P.   I could go back to Win11P and try it on a large process (which is why I need the advantage of the dual CPUs and large RAM).

 

PixInsight does have an Affinity option setting in its Global Preferences.  I have it unchecked.

 

I need PI to be able to use BOTH CPUs and ~ all the available RAM and Cores.

 

There is a mismatch in the RAM, but it is consistent within each CPU, and there are no POST warnings or warnings in HP Performance Advisor.

CPU0: Hynix        16x16GB 2Rx8 PC4-2933Y-RE2-12

CPU1: Samsung 16x16GB 1Rx4 PC4-2933Y-RC2-12

Does this RAM mismatch keep the applications from crossing over to using the other CPU?

HP Recommended

The Z820/Z840 and your Z workstation have a BIOS setting that either allows all RAM from both CPUs to be accessed by either CPU, or splits the RAM so each CPU only uses the RAM local to it.

 

Using this setting splits the 256GB into two 128GB banks, one per CPU, and this configuration can be faster because each CPU no longer has to wait to access RAM attached to the other CPU (more wait states).
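The same split applied to the memory sizes mentioned in this thread, as a quick sketch. It assumes the DIMMs are divided evenly between the two sockets, and `local_gb_per_cpu` is a made-up helper name.

```python
def local_gb_per_cpu(total_gb: int, sockets: int = 2) -> int:
    """Memory local to each CPU when RAM is split evenly across sockets."""
    if total_gb % sockets:
        raise ValueError("total memory does not divide evenly across sockets")
    return total_gb // sockets


for name, total in (("Z840", 256), ("Z8 G4", 384)):
    print(f"{name}: {total} GB total, {local_gb_per_cpu(total)} GB local to each CPU")
```

Any access beyond a CPU's local share has to go over the link to the other socket, which is where the extra wait comes from.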

 

For predictable latencies, try disabling NUMA (Non-Uniform Memory Access) mode: BIOS setup menu -> Advanced -> Performance Options -> Non-Uniform Memory Access.

 

Ideally, all RAM banks should have DIMMs of the same capacity and speed installed.

 

Please read the entire document linked below:

https://h20195.www2.hp.com/v2/GetDocument.aspx?docname=4AA7-1334ENW

 

Your listed RAM differs in rank, which describes how the memory chips on the DIMM are organized. In general, dual- or quad-rank DIMMs are more common for higher-capacity modules and can be slightly faster, as the computer's memory controller sees a dual-rank DIMM as the equivalent of two DIMMs on a single memory module.

 

You cannot mix ranks within a memory bank. Each memory bank on your system comprises two DIMM slots (2 x 12 = 24 slots), so both slots in a bank must hold modules of the same rank (single, dual, or quad). Installing a single-rank and a dual-rank module in the same bank will not work and will give a memory error on startup.

HP Recommended

It looks like the problem is solved by Win11P.  

With Win11 both CPUs are used ~ the same, and process time is ~80% that of Win10 on a real-world case. As suspected, PI's Benchmark is not representative of heavier real-world loads.
 
I did NOT try the BIOS change.
 
Task Mgr Z8 WBPP Win11P.png

Execution Z8 384GB WBPP 227 Mono Win11P.png