03-26-2020 10:19 PM - edited 03-29-2020 02:03 PM
The CPUs in our HP Z640 Workstation are overheating. Automatic fan speed control (based on CPU load / temperature) does not work. The CPU0 and CPU1 fans run at almost the same rpm whether the load is almost 0% with a CPU temperature of 35 °C, or 100% with a CPU temperature of 89 °C (192 °F). Edit: see the next post, even 93 °C (200 °F).
HP support does not know how to solve the problem at all. During the 2-month warranty claim they have already replaced the motherboard, the fans on the rear case wall, and the USB module on the front case wall. It did not solve the problem. The BIOS was updated, and the OS (Win 10 Pro) plus all drivers were reinstalled. Nothing has helped after 3 onsite visits by HP technicians.
I am very disappointed with HP support. I will never buy any HP product again.
Configuration: 2x CPU Intel Xeon E5-2690 v4 (14x core @ 2.6 GHz), 128 GB ECC RAM, HP NVIDIA Quadro M4000, 8GB GDDR5
Motherboard: HP model 212A, version 1.01
Chipset: Intel C612 (Wellsburg-G) https://ark.intel.com/content/www/us/en/ark/products/81759/intel-c612-chipset.html
CPU Intel Xeon E5-2690 v4 https://ark.intel.com/content/www/us/en/ark/products/91770/intel-xeon-processor-e5-2690-v4-35m-cache...
Current BIOS: M60 02.50 (newest available). I already tried downgrading to M60 02.48 to see if it would help, as I suspected a software bug in the BIOS. It did not help. Finally I tried downgrading to M60 02.31 from 2017, the year this PC was produced. That did not help either.
When I set the "Idle fan speed" option in the BIOS to maximum, the CPU fan speed was around 5000 rpm. Nice! But the fans on both the rear and front case walls also spin at thousands of rpm. With this cooling power the CPUs don't exceed 50°C even under 100% load. Nice. If this were not a workstation but a server placed somewhere in a data centre, I would keep it as it is. But it is on a desk in our office, and with such idle fan rpm it's very, very noisy.
This super powerful and super expensive workstation is useless when you cannot use its power because you don't want to destroy the CPUs. Instead of rendering in 3DS Max, it can only be used by a secretary for replying to emails and writing in MS Word 😞
I watch the CPU temperatures and fan rpm in the HP Performance Advisor software.
I asked Intel support for their opinion. They stated that at 89°C the CPU throttles so as not to exceed the TCASE temperature. Case Temperature is the maximum temperature allowed at the processor Integrated Heat Spreader (IHS). So the CPU shuts down core by core and reduces its frequency to avoid exceeding this temperature. It's clear that running the CPUs at these maximum temperatures will sooner or later kill them.
Many people have already described exactly the same problem in other topics on this forum. E.g. (it's totally stupid that on this forum I must create a new topic and cannot just post a reply to an existing one):
03-28-2020 06:19 PM - edited 03-28-2020 06:35 PM
CPU Intel Xeon E5-2690 v4 data sheet: https://ark.intel.com/content/www/us/en/ark/products/91770/intel-xeon-processor-e5-2690-v4-35m-cache...
TCASE: 89°C ... Case Temperature is the maximum temperature allowed at the processor Integrated Heat Spreader (IHS).
Today I measured even higher processor temperatures: CPU0 at 93°C and CPU1 at 90°C.
HP support answer: that's ok.
But almost every opinion I have read or heard says the same: high temperature is an enemy of electronic components, and the expected life of an electronic component is directly related to its operating temperature.
HP's stupid BIOS only allows specifying the minimum fan rpm (Idle fan speed). It should have an option called "Thermal management" with choices like:
- Silently die while overheating ... this is the current and only HP thermal management available, unfortunately
- Balanced noise vs. cooling
- Cooling is priority
- Maximum cooling
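The three non-silent modes in the wish list above boil down to different temperature-to-fan-duty curves. A minimal sketch of that idea, assuming linear interpolation between curve points; the profile names and all numbers are made up for illustration and are not HP or Intel values:

```python
from bisect import bisect_right

# Hypothetical curves: (CPU temperature in °C, fan duty in %).
# All numbers are illustrative assumptions, not HP or Intel values.
PROFILES = {
    "balanced": [(40, 30), (60, 50), (75, 80), (85, 100)],
    "cooling_priority": [(35, 40), (55, 70), (70, 100)],
    "maximum_cooling": [(0, 100)],
}


def fan_duty(profile: str, temp_c: float) -> float:
    """Linearly interpolate the fan duty (%) for a CPU temperature."""
    points = PROFILES[profile]
    temps = [t for t, _ in points]
    if temp_c <= temps[0]:
        return float(points[0][1])   # below the curve: lowest duty
    if temp_c >= temps[-1]:
        return float(points[-1][1])  # above the curve: highest duty
    i = bisect_right(temps, temp_c)
    (t0, d0), (t1, d1) = points[i - 1], points[i]
    return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)


if __name__ == "__main__":
    for name in PROFILES:
        print(name, [fan_duty(name, t) for t in (35, 50, 89)])
```

With curves like these, the fans would stay slow at 35°C and reach full speed well before 89°C, which is the behaviour missing on this machine.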
I studied Information Systems / Electronic Computer Systems at an Electrical Engineering school. I remember from there that semiconductor parts can be destroyed by running for a long time at high temperatures, and that it decreases their lifespan.
I will never buy HP products again. And generally not any product with such poor / terrible temperature management.
I have been running websites with around 1 million visitors (according to Google Analytics) for 20 years already. Currently I have 2 servers: the first one is a 2-CPU system, the second one a 4-CPU system. They are placed in a datacenter (server housing). Both servers are the SuperMicro brand https://www.supermicro.com/en/ which I trust to be the best available on the market. They have excellent cooling. In the datacentre there are hundreds of servers in every server room. I did not see many HP servers, but lots of SuperMicro servers. Guess why 🙂
As the CPU fans are connected to the motherboard, you as an HP customer have no choice in how to solve overheating. Changing the motherboard means changing the whole PC / workstation. When I do it, I will buy components and build my own, as it will be much, much better than some HP. Lots of motherboards on the market have great configuration options, including thermal management.
I can accept a higher level of noise, but I can never accept a processor quietly dying due to overheating.
03-29-2020 08:29 AM
Thank you for the PM.
I am no expert but I will reply to your problem on your thread so that others can comment on it as well.
From my understanding, it seems like your fans are working fine and do cool the CPUs at full speed, but the motherboard does not seem to automatically apply fan speed control based on CPU temperature or load. You mentioned that fan noise is better than a crippled or dead CPU. I am also assuming that you do not have a service contract with HP.
My daily workstations are a single CPU Z620 and a couple of dual CPU Z820s (one of which is liquid cooled). I do not own a Z640.
HP Z Workstations are designed to remove the guesswork for the busy professional and to allow quick-turnaround hardware fixes. They have not been designed with high configurability such as software fan control tuning. With that in mind, please allow me to make the following suggestions for your consideration.
1. Fix a suitable fan speed in BIOS and you may have to live with a noisier system.
2. Refresh the BIOS from the original to the latest version in proper sequence without skipping versions. Remember to clear CMOS after every upgrade. Proceed to the next BIOS upgrade only if fan rpm changes with CPU load as intended. I managed to get my Z820 with the liquid cooler to adopt a proper fan control strategy instead of using a very high rpm all the time. I suspect that the fan control settings in the BIOS get corrupted somehow when the BIOS is upgraded, and following this procedure keeps the workstation working and myself happy and sane.
3. Jury-rig a Noctua fan controller to both or each CPU fan for manual fan control. Be careful with the fan pinouts.
4. Repair the motherboard if you have access to board level schematics.
5. Purchase a replacement motherboard.
Hope you can get it up and running again.
03-29-2020 09:48 AM
Thank you very much for your reply.
1. Yes, but this does not solve the problem. And I have one note: if the customer is deaf-mute, it's great that he/she can set the idle fan speed to maximum, as he/she will not hear the noise and therefore does not need any thermal control. Maybe HP should sell the HP Z640 as a workstation for deaf-mute people. But they must live alone and have their own office at work, otherwise everyone else will suffer from the unnecessary noise.
2. Already tried, did not help.
3. Could you send me link to the product website you mentioned?
4. I think this option is completely impossible.
5. Motherboard was replaced for a new one during the warranty claim. Did not help.
I always want to keep the original design of the product. For example, implementing water cooling is quite a big intervention in the original cooling solution. And it's not only about those 2 CPU fans; there are 2 rear case fans too, which also suffer from the non-working thermal management. And there would be a problem with the space inside, as the case is not designed for water cooling.
What I will do now is push HP as hard as possible to find a solution. I will give this problem as much publicity as it takes until they start searching for one. Otherwise this will be a big embarrassment for HP and will have an impact on their business.
03-29-2020 10:11 AM
You can try using the Noctua NA-FC1 to control the fans. I hope you are able to find a way to connect it to each of the CPU fans on the fan shroud. The motherboard requires fan speed feedback to keep it from throwing a fault at POST. Please make sure you get the pinouts correct.
There are factory options for water-cooling for the CPU from HP but they are for high TDP CPUs like the E5-2687w in mine. You may not need it for yours once you get the fan control sorted out.
Have you tried booting your Z640 after a new BIOS flash and CMOS reset with a fresh install of Windows 10? I read my old post and it seems to work for me to get the motherboard fan control to behave normally. I would never have imagined that a fresh copy of Windows 10 could affect how the motherboard behaves. Maybe this can help you too?
Good luck in your endeavours and I sincerely hope you can get it working again. Do share your solution if you have found it please.
03-29-2020 03:20 PM - edited 03-29-2020 03:47 PM
Thank you for the link. I know about this product; a few days ago I already looked at it in an eshop. But it only adjusts fan speed manually. It does exactly the same thing as the "Idle fan speed" option in BIOS setup. I need to somehow implement automatic fan speed control based on CPU load / temperature, so unfortunately this will not help. But if something like this exists with a thermistor that adjusts fan speed based on temperature, it would be great. Just connect it between the fan and the motherboard connector. Then it would be possible to set "Idle fan speed" to maximum and let this device adjust the fan rpm automatically. I spent hours searching eshops but did not find anything like this.
Btw. the "Idle fan speed" setting could be useful for servers: you set the maximum fan rpm and place the machine in the datacentre. But definitely not for workstations.
I know PWM (pulse-width modulation) CPU fan connectors have 3 or 4 pins. In the 4-pin case it's Ground, +12 V, sense and control https://en.wikipedia.org/wiki/Computer_fan_control#Fan_connectors The Z640 connector has 6 pins, but three pins are connected together, so it's basically a 4-pin connection. See attached photo.
CPU fan dimensions are 92 x 92 x 25 mm. Below is attached photo of my Z640 CPU fan.
In this video https://www.youtube.com/watch?v=AVovy7UWDQ0 I found that there is an NMB 4W022 fan with a thermistor. Sounds great, as if it will adjust rpm by temperature automatically. Maybe? It has a 3-pin wire connector. It is a 12 V, 0.68 A fan. Dimensions 92 x 92 x 32 mm (7 mm thicker). Part Number: 3612KL-04W-B66 https://www.electronicsdatasheets.com/manufacturers/nmb-technologies/parts/3612kl04wb66 I am curious about the pinout; I hope it's Ground, +12 V and sense. It seems this fan was used in DELL servers (some people mentioned this on eBay). Here it's offered for $15.00 USD https://store.cwc-group.com/3612kl04wb66.html?viewfullsite=1 The listing says it has Input Power: 1.25W, Speed: 1400~3800 RPM Thermal Control with Heat Sensor and Air Flow: 25~69.91 CFM. They offer dozens of 92mm fans https://store.cwc-group.com/90mm1.html
I will share any information I find. Thanks again for your help.
ps. other customers with workstations overheating problems:
03-29-2020 06:34 PM
I found the Datech DS9225-12HBTA-B 3-Pin 92MM Fan https://store.cwc-group.com/ds9525.html which has exactly the same dimensions, 92 x 92 x 25 mm, as the HP Z640 CPU fans. It costs $15 USD.
- JMC Datech DS9225-12HBTA-B
- Dimensions: 92x92x25 MM
- Voltage: DC 12V
- Current: 0.75A
- Speed: 800 - 4500 RPM
- Feature: Temperature Sensor
- Termination: 3-Pin / 3-Wire / 20-Inch
But some other eshops / websites list it as 90x90x25mm.
03-29-2020 08:35 PM - edited 03-29-2020 08:55 PM
Summary of CPU0, CPU1 and both rear case fan specifications for the HP Z640.
CPU0 fan
- Placed on the main motherboard.
- Dimensions: 92 x 92 x 25 mm
- Connector: 6-pin
- Supply: DC 12V, 0.40A
- Model: Foxconn PVA092G12S
CPU1 fan
- Placed on the secondary (smaller) motherboard.
- Dimensions: 80 x 80 x 15 mm
- Connector: 4-pin
- Supply: DC 12V, 0.50A
- Model: Foxconn PVA080E12R
Rear case fan
- Placed at rear case wall, 2 pieces, connected by wires together
- Dimensions: 92 x 92 x 25 mm
- Connector: 6-pin
- Supply: DC 12V, 0.35A
- Model: Nidec FAN B T92T12MS3A7-57A03, HP ASSY PN 644315-001 REVB, 8802H S5 E.P.
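The label currents above translate into rated electrical power via P = V × I. A small sketch using only the voltage and current figures listed above; that the label current is a maximum rating is an assumption, so the actual draw at typical speeds is probably lower:

```python
# Label ratings from the summary above: (supply voltage in V, current in A).
FANS = {
    "Foxconn PVA092G12S": (12.0, 0.40),
    "Foxconn PVA080E12R": (12.0, 0.50),
    "Nidec T92T12MS3A7-57A03": (12.0, 0.35),
}


def rated_power_w(volts: float, amps: float) -> float:
    """Rated electrical power in watts (P = V * I), rounded to 0.01 W."""
    return round(volts * amps, 2)


if __name__ == "__main__":
    for model, (volts, amps) in FANS.items():
        print(f"{model}: {rated_power_w(volts, amps)} W")
```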
03-30-2020 07:30 PM - edited 03-30-2020 07:51 PM
I note your CPU0 fan speed is low compared to CPU1. The fan control via the motherboard's firmware is complex and tuned by HP engineers, and I'll assume they got it right. For you, however, something is wrong. Perhaps your CPU0 fan has gone bad in terms of its PWM control circuit. Could that corrupt the whole fan control paradigm? The electronics are interrelated.
You are right that HP uses the standard PWM wiring order..... negative, 12VDC, RPM sense, PWM control for pins 1,2,3,4. In the Z600/Z620 era the processor fan plugs were 5-holes. The first 4 were used for a Mainstream (lower TDP) processor and the fifth hole was empty. In a Performance fan that plug had still only 5 holes but the fifth was no longer empty. It was populated with a ground jumper wire from hole 1 to hole 5. That is how you could spoof the motherboard into thinking it had a Performance heatsink/fan which is demanded by the motherboard to run a faster high-TDP processor (or two). This would be fine if you were only doing email and Word work.... I reported this years ago on this forum, and the trick works great as long as you know what you are doing and can check your temps as needed, and don't let your spouse use it for plasma holographic modeling.
The Z640 comes along... very much like the Z440. Both of these workstations now have 6 holes on the white processor fan plug. However it is almost the same as the 5-hole fan plug for the Z420/single processor Z620.... You still have the ground jumper wire from hole 1 to hole 5, plus you now have a second ground jumper from hole 5 to hole 6. I have no idea why HP did that. However, that allows me to take a stock single-processor build heatsink/fan from a Z640 (or any one from a Z440) and transplant it over to a Z620 single-processor build (or any Z420), and I just hang that 6th hole out in space beyond the 5-pin motherboard's processor fan header. Why do that? Those next generation ZX40 heatsinks have almost exactly twice the aluminum fin surface area as the stock Z620/Z420 ones do, plus one more heat tube (4 instead of 3).
How does that all relate to you? This opens up a cheap source for a HP sourced perfect replacement fan for you. Just buy a Z440/single-processor build Z640 heatsink/fan and harvest the HP fan off that to put on your current CPU0 heatsink. You don't even need to remove the heatsink from the socket to do that if you are careful (remove from power, of course). Go to eBay and look up HP part number 749554-001. The best price currently is $15.00 USD, but I got one a few days back for $11.00 USD, including shipping costs.
Regarding Noctua PWM fans.... love them but not for this project. HP buys fast PWM fans made exactly to their specifications, and then applies a lot of PWM braking for normal fan speed settings. As the motherboard needs more cooling it uses less PWM braking. Same idea for the BIOS control of fan speed. So, the Noctua fans already are made to run slow and quiet and if you add on the standard HP PWM braking those go too slow. Out of balance. Bad idea......
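The mismatch described above can be put in rough numbers. This is only a sketch: it assumes fan speed scales linearly with PWM duty (real fans are not exactly linear), the 5000 rpm figure is roughly what was reported earlier in this thread for the maximum idle setting, and the quiet-fan speed and the idle duty are made-up values, not HP or Noctua specifications:

```python
def approx_rpm(max_rpm: float, duty_percent: float) -> float:
    """Estimate fan speed, assuming rpm scales linearly with PWM duty."""
    if not 0 <= duty_percent <= 100:
        raise ValueError("duty must be between 0 and 100 percent")
    return max_rpm * duty_percent / 100


# Hypothetical values for illustration only.
FAST_OEM_FAN_MAX_RPM = 5000  # fast fan that the board brakes heavily
QUIET_FAN_MAX_RPM = 2000     # retail fan that is already slow at 100% duty
IDLE_DUTY = 30               # low duty the board might apply at idle

if __name__ == "__main__":
    print(approx_rpm(FAST_OEM_FAN_MAX_RPM, IDLE_DUTY))  # 1500.0
    print(approx_rpm(QUIET_FAN_MAX_RPM, IDLE_DUTY))     # 600.0
```

Under these assumptions the same braking that leaves the fast fan at a usable speed drops the quiet fan to well under half of that, which is the "too slow" outcome warned about above.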
03-31-2020 06:05 AM - edited 03-31-2020 06:35 AM
I wouldn't go with an untested 3rd party fan that wasn't designed for the Z640. This is a software fix that we need to sort out. Also know the Z640 is great but it isn't really designed as a server and probably shouldn't be hosting production 😕 .
I've been happily using my Z640 as a Linux workstation with 2x E5-2620v3 (2x6 cores) but I recently upgraded them to 2x E5-2660v3. With 2x10 cores I'd like to make sure thermal controls are better than with my old 2x6 cores that I never had a problem with. I'd always just let them throttle down under load to prevent overheating (less noise). Now I'd like to do it correctly. Yes the BIOS only has settings for "Idle fan speed" and if your OS doesn't take control of the fan devices it won't ever adjust speed. Linux is happy to detect the PWM controllers on my GPU devices but there is no driver that recognizes the HP system fans, which you can see listed as type 8 (ports) in dmidecode.
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 2.8 present.
77 structures occupying 3313 bytes.
Table at 0xAB696000.

Handle 0x0033, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: P8 CPU0 FAN
	Internal Connector Type: Other
	External Reference Designator: Not Specified
	External Connector Type: None
	Port Type: Other

Handle 0x0034, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: P91 MEM FAN
	Internal Connector Type: Other
	External Reference Designator: Not Specified
	External Connector Type: None
	Port Type: Other

Handle 0x0035, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: P9 FRNT FAN
	Internal Connector Type: Other
	External Reference Designator: Not Specified
	External Connector Type: None
	Port Type: Other

Handle 0x0036, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: P95 REAR FANS
	Internal Connector Type: Other
	External Reference Designator: Not Specified
	External Connector Type: None
	Port Type: Other
If you dump the raw device info with dmidecode you can see some attributes. I'm trying to figure out if these are just standard PWM controllers or there is something special. If you're running Windows I'm not sure if you need to install their HP Coolsense (usually for laptops) to take control of the fans. That may be a special case where HP likes to license fan cooling software ($$). Maybe someone can clarify?
$ sudo dmidecode -u -H 0x0033
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 2.8 present.
77 structures occupying 3313 bytes.
Table at 0xAB696000.

Handle 0x0033, DMI type 8, 9 bytes
	Header and Data:
		08 09 33 00 01 FF 00 00 FF
	Strings:
		50 38 20 43 50 55 30 20 46 41 4E 00
		"P8 CPU0 FAN"
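The nine raw bytes can be decoded by hand to see how little they carry. A minimal sketch, assuming the SMBIOS type 8 (Port Connector Information) field layout noted in the comments; the lookup tables cover only the two codes that occur in this dump, and the function name is made up for illustration:

```python
# SMBIOS type 8 (Port Connector Information) layout, as byte offsets:
#   0 type, 1 length, 2-3 handle (little endian),
#   4 internal designator (string index), 5 internal connector type,
#   6 external designator (string index), 7 external connector type,
#   8 port type.
CONNECTOR_TYPES = {0x00: "None", 0xFF: "Other"}  # only the codes seen here
PORT_TYPES = {0x00: "None", 0xFF: "Other"}       # only the codes seen here


def decode_port_connector(header_hex: str, strings_hex: str) -> dict:
    """Decode one raw type 8 record as printed by `dmidecode -u`."""
    data = bytes.fromhex(header_hex)
    # The string area is a sequence of NUL-terminated ASCII strings.
    raw_strings = bytes.fromhex(strings_hex).split(b"\x00")
    strings = [s.decode("ascii") for s in raw_strings if s]

    def lookup(index: int) -> str:
        # String indices are 1-based; 0 means "no string".
        return strings[index - 1] if index else "Not Specified"

    return {
        "type": data[0],
        "length": data[1],
        "handle": int.from_bytes(data[2:4], "little"),
        "internal_designator": lookup(data[4]),
        "internal_connector": CONNECTOR_TYPES.get(data[5], hex(data[5])),
        "external_designator": lookup(data[6]),
        "external_connector": CONNECTOR_TYPES.get(data[7], hex(data[7])),
        "port_type": PORT_TYPES.get(data[8], hex(data[8])),
    }


record = decode_port_connector(
    "08 09 33 00 01 FF 00 00 FF",
    "50 38 20 43 50 55 30 20 46 41 4E 00",
)

if __name__ == "__main__":
    print(record)
```

Decoded this way, the record holds a label ("P8 CPU0 FAN") and three type codes and nothing else, so these entries on their own do not appear to say how the fan is controlled.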
I will see if I can find a udev rule to have the kernel classify these as fan devices so standard thermal control can take them over. Otherwise it may be HP proprietary. Anybody at HP able to answer that one? My Z600s work just fine detecting and controlling the system fans.
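One way to check what the kernel currently exposes is to walk the hwmon sysfs tree. A minimal sketch, assuming the standard Linux hwmon layout (`/sys/class/hwmon/hwmonN/` containing `name`, `fanN_input` and `pwmN` files); the function name is made up, and an empty result only means that no loaded driver exposes a fan, not that the hardware is necessarily proprietary:

```python
from pathlib import Path


def list_hwmon_fans(root: str = "/sys/class/hwmon") -> dict:
    """Map each hwmon chip to its fan tachometer and PWM control files."""
    base = Path(root)
    found = {}
    if not base.is_dir():
        return found
    for chip in sorted(base.glob("hwmon*")):
        name_file = chip / "name"
        name = name_file.read_text().strip() if name_file.is_file() else "unknown"
        fans = sorted(p.name for p in chip.glob("fan*_input"))
        # pwm1, pwm2, ... are the control files; skip pwm1_enable and similar.
        pwms = sorted(p.name for p in chip.glob("pwm*") if p.name[3:].isdigit())
        if fans or pwms:
            found[f"{chip.name}:{name}"] = {"fans": fans, "pwm": pwms}
    return found


if __name__ == "__main__":
    result = list_hwmon_fans()
    if not result:
        print("no fan or PWM files exposed by any hwmon driver")
    for chip, entries in result.items():
        print(chip, entries)
```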