Re: Tesla K20m on Z800 with Nvidia K5000
01-21-2023 09:10 AM - edited 01-21-2023 09:11 AM
Does anyone have experience with this configuration?
The Tesla card reports insufficient system resources to run correctly.
My hardware is an HP Z800 with the 1100W power supply
2x Xeon X5680
95GB RAM
Quadro K5000
running Windows 10
01-21-2023 11:12 AM
Hi,
this is not about the K20, but I think it may help:
https://forums.developer.nvidia.com/t/insufficient-system-resources-after-installing-k40c/166587/3
was this reply helpful , or just say thank you ? Click on the yes button
Please remember to mark the answers this can help other users
Desktop-Knowledge-Base
Windows 11 22h2 inside , user
------------------------------------------------------------------------------------------------------------
01-21-2023 10:26 PM - edited 01-21-2023 10:33 PM
The Z800 with the 1100 watt supply has three six-pin GPU connectors: one cable carries two of them and the other cable carries only one, for a total of two GPU power rails.
The K20m draws 225 watts.
The K5000 video card draws 122 watts.
I'm assuming you have the actively cooled k20m that has the blower fan.
The passively cooled one will not work as it requires a server-level airflow to cool it.
(you could try using the cooling shroud from a gtx 640 and see if that fits)
If power is the problem, try a K2000 card, which does not use any aux GPU power connectors.
When using a K20m, which is a Kepler-based GPU, the Quadro card needs to be a Kepler-based card as well (which the K5000 is).
The K20m needs to use BOTH POWER RAILS, so connect one six-pin from the cable with the two connectors to the K20m,
and also connect the cable with only one six-pin connector to it. This splits the K20m's draw across both GPU rails. If your K20m has an 8-pin connector, use an adapter.
Use the remaining six-pin connector on the cable with the two GPU connectors to power the K5000 card.
Finally, in the Z800 BIOS, disable the network boot ROMs and disable the LSI SAS boot ROM (do not disable the LSI SAS device itself).
The K20X should be in the primary GPU slot- and set as the primary in BIOS
Since the K20 with a K5000 was never an official HP-supported configuration, you are on your own in regards to support.
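The power split described above can be checked with some quick arithmetic. This is only a rough sketch: the 75 W limits for a x16 slot and for a six-pin connector come from the PCI Express specification, while the idea that each card takes its full 75 W from the slot first is an assumption made to keep the numbers simple. The function names are made up for illustration.

```python
# Rough power-budget check for the two cards discussed in this thread.
# Connector limits per the PCI Express specification.
SLOT_W = 75      # maximum draw through a x16 slot
SIX_PIN_W = 75   # maximum draw through one six-pin aux connector


def aux_power_needed(board_w, slot_w=SLOT_W):
    """Watts that must come from aux connectors.

    Assumption: the card takes its full share from the slot first.
    """
    return max(0, board_w - slot_w)


def six_pin_count(board_w):
    """Minimum number of six-pin connectors needed for a given board power."""
    aux = aux_power_needed(board_w)
    return -(-aux // SIX_PIN_W)  # ceiling division


# Board power figures as quoted in the post above.
cards = {"Tesla K20m": 225, "Quadro K5000": 122}

for name, watts in cards.items():
    print(f"{name}: {aux_power_needed(watts)} W aux, "
          f"{six_pin_count(watts)} six-pin connector(s)")

total_six_pin = sum(six_pin_count(w) for w in cards.values())
print(f"Six-pin connectors needed in total: {total_six_pin}")
```

Under those assumptions the two cards together need three six-pin connectors, which is exactly what the 1100 W supply provides, so there is no spare aux connector left over.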
01-22-2023 01:24 AM
Solution
This is a design limitation of the NVIDIA Grid Kx, Tesla Kxx/Mxx, and Quadro Kxxxx/Mxxxx cards. This is thus a permanent restriction associated with these NVIDIA cards.
Workaround
Option 1:
Remove system DIMMs from the system to reduce the total amount of installed system memory to less than 1 TB.
Option 2:
If users have exactly 1 TB of physical memory installed, users can use the boot options described below to restrict the memory address range of the system:
On Linux Operating Systems, the NVIDIA Linux driver attempts to identify the scenario where the host system has more memory than a given GPU can address (which is 1 TB on current generation GPUs). If this scenario is detected, the NVIDIA driver will drop back to allocations from the 4 GB Direct Memory Access (DMA) zone to avoid address truncation. This means that the driver will use the __GFP_DMA32 flag and limit itself to memory addresses below the 4 GB boundary. This is done on a per-GPU basis, so limiting one GPU will not limit other GPUs in the system. For example, if an NVIDIA Quadro K6000 (which can address 1 TB) and a Quadro 5000 (which can address 512 GB) are installed in a system with 1 TB of memory, then the Quadro K6000 operates normally with no limitations, while the Quadro 5000 falls back to the 4 GB limit. This is the behavior on R331 drivers starting with 331.93 and R340 drivers starting with 340.28. It is also possible to use the mem=1024G or max_addr=1024G kernel parameters to limit the amount of system memory that the operating system can access. This restriction is for the case where the system has remapped any physical memory based on the usage of lower memory addresses assigned to MMIO.
On VMware ESXi 5.1, ESXi 5.5, and ESXi 6.0, the NVIDIA VMware VIB drivers mimic the behavior of the NVIDIA Linux drivers, for example, they will fall back to the 4 GB limit when there is more than 1 TB of system memory.
On Windows Operating Systems (OSes) (Server 2008 R2, Server 2012, and Server 2012 R2) with an NVIDIA Tesla card in the default Tesla Compute Cluster (TCC) mode on systems with 1 TB or more system memory, after the NVIDIA Windows driver installer is run and completed, the Microsoft Device Manager will show a yellow bang for the GPU device with a 'This device cannot start. (Code 10)' error and hence NVIDIA Tesla cards in TCC mode will not operate on Windows systems with 1 TB or more of system memory.
On Windows OSes (Server 2008 R2, Server 2012, and Server 2012 R2) with an NVIDIA Tesla, Quadro, or an NVIDIA GRID card in WDDM (Windows Display Driver Model) mode on systems with 1 TB or more system memory, after the NVIDIA Windows driver installer is run and completed, the Microsoft Device Manager will indicate 'This device is working properly' even though the system will not be functional.
However, after the system is restarted, the Microsoft Device Manager will indicate 'Windows has stopped this device because it has reported issues (Code 43)'. On systems with multiple NVIDIA cards, multiple restarts may be required before the Microsoft Device Manager will indicate 'Windows has stopped this device because it has reported problems (Code 43).' This is the behavior on Windows R331 drivers starting with 333.44 and R340 drivers starting with 340.66.
Alternatively, for Microsoft Windows 2003 or prior releases, it is possible to use the /maxmem=1048576 boot option in the boot.ini file. For Microsoft Windows 2008 and later releases, it is possible to use the bcdedit /set {current} truncatememory 0xFFFFFFFFFF boot option.
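For anyone wondering where those boot option values come from, they all describe the same 1 TB boundary in different units. The sketch below just decodes them. It assumes the usual units for each option (truncatememory as a byte address, /maxmem in megabytes, mem= with a G suffix in gigabytes), and the variable names are made up for illustration. The 192 GB figure is the Z800 maximum quoted later in this thread.

```python
# Decode the memory-limit values quoted above and compare them with 1 TB.
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

truncate_memory = 0xFFFFFFFFFF  # bcdedit truncatememory, assumed to be a byte address
maxmem_mb = 1048576             # boot.ini /maxmem, assumed to be in megabytes
linux_mem_gb = 1024             # Linux mem=1024G, in gigabytes

# truncatememory is the last byte address below the 1 TB boundary.
print(truncate_memory + 1 == TIB)

# The other two values express 1 TB in their own units.
print(maxmem_mb * MIB == TIB)
print(linux_mem_gb * GIB == TIB)

# The Z800 maximum mentioned in this thread, for comparison.
z800_max_bytes = 192 * GIB
print(z800_max_bytes < TIB)
print(f"Z800 maximum is {z800_max_bytes / TIB:.1%} of the 1 TB limit")
```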
01-23-2023 10:17 AM - edited 01-23-2023 10:18 AM
Sorry, for my part I found only a few links, and they mix technical detail with a language that is not my own, so I cannot understand everything that is said.
Sometimes it seems to be solved simply, but sometimes it seems that no solution is possible!
If it is installed alone, will it still not work?
01-23-2023 11:04 PM
As I said previously, disabling the Z800's onboard network boot ROMs and the LSI boot ROM will free up system resources. I don't know if this will be sufficient to let the K20m work; you may also have to use a low-wattage Kepler-based video card.
In regards to the last post on memory, disregard it, as that only applies to a Lenovo server that can use more than 1TB of RAM.
The HP Z800 Workstation maxes out at 192GB of ECC Registered memory (12 x 16GB),
and unbuffered RAM tops out at 96GB (12 x 8GB). Both are a far cry from 1TB.