Damo123
Message 1 of 22

Since upgrading Z620 from dual E5-2630v2 to dual E5-2660v2 errors on boot

Z620

I have a Z620 and when I bought this it had a dual CPU E5-2630v2. I have upgraded these to dual E5-2660v2 but on doing so keep getting the following messages on boot:

 

932 - Warning one of the QPI links is not operating

 

In addition to this message I also occasionally get the following message:

 

929 - Fatal MCA error. QPI0 error detected CPU0

 

When I complete the boot, the workstation seems to be working fine, although the performance increase isn't as much as I had hoped for from the upgrade (only really evident by stress testing, but I wonder if it's something to do with these errors). However, it is a real annoyance. I have had a look around and as far as I can tell it is something to do with the number of QPI links, but as far as I can see the number on an E5-2630v2 is the same as on an E5-2660v2, so I can't understand why it is happening. I have upgraded to the latest BIOS but this doesn't fix the issue. This leads to the following questions:

 

Are there any fixes for this?

 

If there aren't, which v2 CPUs can I upgrade to so that I won't see this error?

BambiBoomZ
Message 2 of 22

Damo123,

 

I'm a bit confused, as the post mentions: "...it had a dual CPU E5-2630 v2. I have upgraded these to dual E5-2630 v2."

 

I don't know the exact way QPI links work, but they are the links between multiple processors.  Errors in QPI links are related to:  the socket, motherboard, or the CPU. 

 

If the system was not showing the error with the previous CPU's, then the most likely cause is a fault in the new CPU0. 

 

If you are inclined, consider installing the free trial of Passmark Performance Test and running the CPU test. According to Passmark, the average CPU Mark for a single E5-2630 v2 is 10453 and for dual processors it is 16267.

 

You might try swapping the two CPUs; if the error message is then listed as occurring on CPU1, the processor needs to be replaced. But Xeon E5s have extremely good reliability (170,000 hours MTBF, I think) and I'd be inclined to look for incidental problems. While making the swap, inspect the socket carefully for bent pins, old thermal paste debris, or dust in the pins. If the error message still lists CPU0 after the swap, the fault is in the socket or motherboard.
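The swap test above comes down to one comparison, which can be written as a minimal Python sketch. The function name and the returned strings are hypothetical, chosen only for illustration; the labels are whatever the 929 message reports (for example "CPU0") before and after the two processors trade sockets.

```python
def diagnose_after_swap(label_before: str, label_after: str) -> str:
    """Interpret the CPU label named in the boot error before and after a swap.

    Hypothetical helper: the labels are copied from the 929 message by hand.
    """
    if label_before == label_after:
        # The error stayed with the socket position, not the chip.
        return "socket or motherboard"
    # The error moved along with the chip to the other position.
    return "processor"


# Error reported on CPU0 before the swap and still on CPU0 afterwards.
print(diagnose_after_swap("CPU0", "CPU0"))
```

This is only a way of keeping the reasoning straight; it does not rule out incidental causes such as debris in the socket.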

 

BambiBoom_Z

 

 
HP z620_2 (2017) > Xeon E5-1680 v2 (8-core@ 4.1GHz)  / 64GB DDR3-1866 ECC Reg / Quadro P2000 5GB / HP Z Turbo Drive M.2 256GB + Intel 730 480GB + Seagate Constellation ES.3 1TB / ASUS Essence STX PCIe sound card / 825W PSU / Windows 7 Prof.’l 64-bit  > 2X Dell Ultrasharp U2715H  (2560 X 1440) / Logitech z2300 2.1 Sound

 

[Passmark Rating = 6166 / CPU rating = 16934 / 2D = 820 / 3D= 8849 / Mem = 2991 / Disk = 13794] 4.24.17   Single Thread Mark = 2252'

 

Damo123
Author
Message 3 of 22
Hi BambiBoom_Z

Thanks for the reply. I just realised I had made a typo. It should read that I upgraded from the dual E5-2630 v2 to E5-2660 v2. Sorry about that.

I have also tried swapping the CPU's from one slot to the other and the error remains with the same CPU socket. I also tried another E5-2660 v2 CPU and the same error occurred.
BambiBoomZ
Message 4 of 22

Damo123,

 

Oh, no problem. I'd assumed you'd changed to another model processor.

 

Just to verify: Did the system perform well and without error messages with both E5-2630 v2's?

 

Also, you might  consider going to BIOS setup and resetting to factory defaults,  plus checking that all processors and all cores are enabled.

 

In Control Panel > System and Device Manager, how many processors / cores are listed? Is the total amount of RAM shown correctly?  I had the thought that the error may have an oblique connection to a memory problem and you might like to try swapping  and reseating the RAM associated with CPU0.

 

As the problem appears to be isolated to the CPU0 socket / motherboard and not the processors, if only to eliminate the possibility, consider removing CPU0 and making a close inspection of the socket for bent pins, debris, dust, etc., and have a look at the motherboard surrounding the sockets for anything discolored or burnt.

 

It would also be interesting to try the Passmark test with both CPUs running and then with the 2nd CPU riser removed. If the CPU score is about the same, that would verify the QPI link problem. Especially if the system runs on CPU0 only without the error message, my guess (I hope others will comment!) is that the motherboard is faulty.
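To make "about the same" concrete, here is a rough Python sketch that turns the two CPU Mark results into a scaling ratio. The function names and the 1.2 threshold are my own assumptions, not anything Passmark or HP publishes. As a reference point, the Passmark averages quoted earlier in the thread for the E5-2630 v2 (10453 single, 16267 dual) give a ratio of about 1.56.

```python
def dual_cpu_scaling(dual_score: float, single_score: float) -> float:
    """Ratio of the dual-CPU score to the single-CPU score."""
    return dual_score / single_score


def second_cpu_is_contributing(dual_score: float, single_score: float,
                               threshold: float = 1.2) -> bool:
    """True if the dual score clearly beats the single score.

    The threshold is an assumed cut-off for illustration only.
    """
    return dual_cpu_scaling(dual_score, single_score) >= threshold


# Passmark averages quoted in this thread for the E5-2630 v2.
ratio = dual_cpu_scaling(16267, 10453)
print(round(ratio, 2))
```

A ratio close to 1.0 on your own machine would suggest the second processor is adding little, which is what I would expect if the QPI link were badly degraded.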

 

 

BambiBoom_Z

Damo123
Author
Message 5 of 22

Yes the system worked perfectly fine with the old E5-2630 v2 CPU's and had no errors.

 

All cores and threads are enabled and showing and RAM is displaying fine. The system works as it should once it boots up past the error message.

 

Also, if I remove the 2nd CPU board and boot with just CPU0, then it boots up fine with no error message. I have also tested this using some CPU-intensive mining software, and performance is about half with just 1 CPU (as you would expect).

 

Tomorrow I will put back in the old E5-2630 v2 CPU's to test if the error comes up.

 

BambiBoomZ
Message 6 of 22

Damo123,

 

As I re-read the thread this morning, the fact that the system worked properly without the 2nd CPU riser may suggest a memory problem. You might consider using both E5-2660 v2s but with only one RAM module on the main board and one on the CPU riser board. The modules should be placed in the first positions on both boards according to the order of placement in the manual. However, if you have already changed back to the E5-2630 v2s, those results will be revealing also.

 

BambiBoom_Z 

Damo123
Author
Message 7 of 22

I've changed back to the old CPU's and am now still getting the same error:

 

932 - Warning one of the QPI links is not operating

SDH
Message 8 of 22

Possible bent pin in one of the sockets?

Javato
Message 9 of 22

Hi there!

 

  I happen to suffer the same situation. Just bought a refurbished z620 and riser card that came with 2x E5 2609. It seemed to work fine at first, but now it has begun to report at boot:

 

  929-Fatal MCA error

      QPI0 error detected CPU 1

      Generic interconnect level

       II other transaction Generic PP - bus interconnect error - phy control error.

 

  932-Warning one of the QPI links is not operating

 

  I tried swapping the memory DIMMs and even fitting slower DIMMs, so I would say it is not a memory issue. I took the CPU out and applied some compressed air, but the result is the same.

 

   The 929 does not show every time, the 932 I would say "always".

 

   As for the rest, I cannot confirm, as Damo123 did, that the system works as before, given that I had no operating system installed. I booted a Linux live image and got some recoverable kernel errors.

 

   The BIOS is updated to the latest version and the system does not report any error if the CPU riser card is out. Starting to suspect a riser card problem 😞

 

   Damo123, did you find a solution to it?

 

 

Thanks in advance,

Javier.

   

BambiBoomZ
Message 10 of 22

 Javier,

 

Are you using ECC registered RAM? What is the amount on the mainboard, amount on the riser board, and module size? I have read that a symmetrical amount on the mainboard and riser plus the proper module installation sequence is important. There is a diagram on the inside of the access door for the order of placing the RAM.  For example, in my dual CPU z620, I have 4X 8GB mainboard + 4X 8GB riser of HP DDR3-1600 ECC registered.

 

The conventional assessment for this situation is that the 2nd CPU riser has a defect, or the riser connector socket on the mainboard has a fault- bent pins or debris in pins. Another possibility is bent pins or debris in the riser CPU socket. The last consideration is the possible impending failure of the riser E5-2609.  Can you try the riser E5-2609 on the mainboard to test it?

 

BambiBoomZ

 

HP z620_1 (2012) (Rev 3) 2X Xeon E5-2690 (8-core @ 2.9 / 3.8GHz) / 64GB DDR3-1600 ECC reg) / Quadro K2200 (4GB) + Tesla M2090 (6GB) / HP Z Turbo Drive (256GB) + Samsung 850 Evo 250GB + Seagate Constellation ES.3 (1TB) / 800W / Windows 7 Professional 64-bit > > HP 2711x  (27"  1980 X 1080)
[ Passmark System Rating= 5675 / CPU= 22625 / 2D= 815 / 3D = 3580 / Mem = 2522 / Disk = 12640 ] 9.25.16   Single Thread Mark = 1903
[ Cinebench R15: CPU = 2209 cb / Single core 130 cb / OpenGL= 119.23 fps / MP Ratio 16.84x] 10.31.16

 

 

 
