NVIDIA driver working on CentOS 7.2, but not for all users
Good day LQ!!
I have a confusing issue that I've wasted most of a day trying to debug. I have a server (really, a cluster of servers, but I do not think that is relevant to the question) with an NVIDIA P100 installed in it. We have groups of researchers who run GPU-enabled codes on these servers. One group has been running NAMD on these successfully; however, they recently added a few users, and some of them are unable to run the code, failing with errors like: ... CUDA driver version is insufficient for CUDA runtime version However, other users are able to run the code (using the GPU) without any trouble. Now, I've looked at length at pretty much anything that I can think of that might be different about these users:
I've simplified their use case as far as I can. They normally access these servers through a job scheduler (SGE, which can be complicated and confusing), but I've logged into one of the target servers as several of the users and can verify that some of them can run the code directly on the server and others cannot. I'm at a loss and have run out of ideas; I would very much appreciate any help that might point me to the reason why certain users cannot use the GPUs while others can. Please give me ideas! Thank you all very much for your support and time! |
ping
All, I'm not trying to be a pain, but I really need help with this. Does anybody have any ideas, or perhaps a recommendation about where else I might post this question?
|
Compare the .bashrc / .profile / .bash_profile, etc. files?
|
scasey, thank you for your reply. While your suggestion didn't directly fix my problem, it did help me find that this particular application that we're trying to run has a secret hidden/dot file that it loads if it exists that I'd completely forgotten about. Once all users have this file, they can run the application.
The application did nothing at all to hint that this was the problem and the errors didn't indicate anything helpful either. Very frustrating and a crappy way to waste time! Thanks again for helping me find this solution! Cheers! |
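For anyone who lands here with a similar "works for some users, not others" problem, here is a minimal sketch of the dotfile comparison suggested above. It is only an illustration, not what I actually ran: the function name compare_dotfiles and the example home directories are made up, it only looks at regular hidden files at the top level of each home directory, and it needs to run as a user (typically root) who can read both homes.

```python
import filecmp
from pathlib import Path


def compare_dotfiles(home_a, home_b):
    """Compare top-level hidden regular files in two home directories.

    Hypothetical helper for illustration. Returns a dict listing the
    dotfiles found only in home_a, only in home_b, and those present
    in both whose contents differ.
    """
    a, b = Path(home_a), Path(home_b)

    def dotfile_names(home):
        # Only regular files whose name starts with "."; hidden
        # directories (e.g. .config) are deliberately skipped here.
        return {p.name for p in home.iterdir()
                if p.name.startswith(".") and p.is_file()}

    names_a, names_b = dotfile_names(a), dotfile_names(b)

    # shallow=False forces a byte-by-byte comparison instead of
    # trusting size and modification time alone.
    differing = sorted(
        name for name in names_a & names_b
        if not filecmp.cmp(a / name, b / name, shallow=False)
    )

    return {
        "only_in_a": sorted(names_a - names_b),
        "only_in_b": sorted(names_b - names_a),
        "differing": differing,
    }


# Example call (the paths are placeholders, not real accounts):
# print(compare_dotfiles("/home/working_user", "/home/failing_user"))
```

A file that shows up under "only_in_a" for a working user is a good first suspect, which is the situation I ended up in with the application's own dot file.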