Hi
We are running a C++ program on Ubuntu 18.04 LTS.
The app runs fine on a Zotac 1060; however, when we run it on a Zotac 1060K it appears to freeze the display. We cannot switch sessions (e.g. Ctrl+Alt+F1 does not work).
The GPU "falls off the bus", causing the freeze, usually after a couple of hours, but sometimes only after as long as 16-20 hours.
Our application is a 32-bit application and is CPU intensive, using about 1/4 of the 4 cores (100% out of 400%).
The OS is 64-bit, but we also built a 32-bit OS and it shows the same issue.
The error in syslog says "GPU fallen off the bus".
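In case it helps anyone checking their own logs, here is a minimal sketch of how the syslog can be scanned for this message. It assumes the default Ubuntu log path /var/log/syslog and matches on the phrase "fallen off the bus" rather than the exact driver wording; the function names are illustrative only.

```python
import re
from typing import Iterable, List

# Case-insensitive match on the key phrase. The exact wording of the
# driver message is an assumption, so only the stable part is matched.
PATTERN = re.compile(r"fallen off the bus", re.IGNORECASE)


def find_bus_drop_lines(lines: Iterable[str]) -> List[str]:
    """Return every log line that mentions the GPU falling off the bus."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


def scan_file(path: str = "/var/log/syslog") -> List[str]:
    """Scan a log file (default path is an assumption) for matching lines."""
    # errors="replace" avoids a crash on stray non-UTF-8 bytes in the log
    with open(path, errors="replace") as log:
        return find_bus_drop_lines(log)
```

Calling `scan_file()` on an affected box returns the matching lines, whose timestamps show how long the system ran before each freeze.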
We have used the same image on both boxes; however, only the 1060K freezes.
This happens on multiple 1060Ks but not on 1060s, so it is not restricted to one box.
Among the many things we have tried in order to determine the root cause:
* updated the Nvidia drivers
* updated the motherboard BIOS
* disabled turbo mode (via the BIOS and via Linux)
* downgraded the video card BIOS on the 1060K to match the VBIOS on the 1060
* upgraded the Linux kernel (from 4.15 to 5.1)
* used a high-spec power supply
* disabled the internal Ethernet port and used a USB-to-Ethernet adapter
* set the internal fans to 100% to reduce any overheating (we are logging the temperature in syslog and it's not going over 70%)
* swapped the memory from a 1060 to a 1060K
* set the system to use 1 core in the BIOS (rather than all 4)
* disabled the USB ports
* increased the Linux open-file limit (though our reporting indicates this is not an issue)
* disabled power management (pcie_aspm=off)
* ran the application at a lower priority via nice
We can no longer source any Zotac 1060s and need to use Zotac 1060Ks.
Any assistance would be greatly appreciated.
Thanks in advance
Seek2019