kernel: amdgpu 0000:04:00.0: amdgpu: ERROR: GPU over temperature range(SW CTF) detected!
kernel: amdgpu 0000:04:00.0: amdgpu: ERROR: System is going to shutdown due to GPU SW CTF!
Question: How long has this problem been in the kernel, and why does it happen?
I am monitoring the temperature and never see more than 72℃, while the critical threshold is 110℃. If critical is 110℃, why does the kernel shut down the PC when the video card only reaches 72℃ at most?
This indicates that the hotspot for that card is 95C and it games at around 80C. So it seems to me that the 70C shutdown is unnecessary.
Seems I might as well point out the obvious... *BEFORE DOING ANY OF THIS* you had better make sure you really, truly know what you are getting into, because you can raise this safety check too high and smoke your card or system.
To change this, first you will have to find the appropriate hwmon directory for your GPU in /sys/class/hwmon.
Then cat the temp1_crit file to see what the setting is. Once you find the correct directory, and that file, you can change it.
It's likely going to be 70000 (which is 70C) and you can change it to something more befitting the operating temperatures of that card.
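The lookup described above can be sketched as a small read-only script. This is only an illustration under some assumptions: that the GPU's hwmon directory has a `name` file containing `amdgpu`, and that the driver exposes `tempN_input`, `tempN_label` and `tempN_crit` files holding millidegrees Celsius. The function names `read_millidegrees` and `gpu_temperatures` are made up for this example. It does not write anything to sysfs.

```python
from pathlib import Path


def read_millidegrees(path):
    """Return the value of a hwmon file in degrees C, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip()) / 1000.0
    except (OSError, ValueError):
        return None


def gpu_temperatures(root="/sys/class/hwmon", driver="amdgpu"):
    """List temperature sensors of every hwmon device whose name matches driver.

    The default driver name "amdgpu" is an assumption; check the contents of
    the name file on your own system.
    """
    results = []
    for hwmon in sorted(Path(root).glob("hwmon*")):
        try:
            name = (hwmon / "name").read_text().strip()
        except OSError:
            continue  # no readable name file, skip this directory
        if name != driver:
            continue
        for input_file in sorted(hwmon.glob("temp*_input")):
            prefix = input_file.name[: -len("_input")]  # e.g. "temp1"
            try:
                label = (hwmon / f"{prefix}_label").read_text().strip()
            except OSError:
                label = prefix  # fall back to the sensor prefix
            results.append(
                {
                    "dir": str(hwmon),
                    "sensor": prefix,
                    "label": label,
                    "current_c": read_millidegrees(input_file),
                    "crit_c": read_millidegrees(hwmon / f"{prefix}_crit"),
                }
            )
    return results


if __name__ == "__main__":
    for s in gpu_temperatures():
        current = "n/a" if s["current_c"] is None else f"{s['current_c']:.1f} C"
        crit = "n/a" if s["crit_c"] is None else f"{s['crit_c']:.1f} C"
        print(f"{s['dir']} {s['sensor']} ({s['label']}): now {current}, crit {crit}")
```

Running it while the card is under load shows each sensor next to its own critical value, which makes it easier to see whether a sensor other than temp1 is the one approaching its limit.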
I suspect that the "junction" temperature goes over the limit in games, and that the shutdown happens precisely because of it. I'll check later.
But what is this "junction" temperature?
I don't see any value like 70000.
I think that if the 6700 gets too hot it should thermal throttle instead of shutting down the whole system.
I have this card: https://www.techpowerup.com/review/s...-nitro/33.html
At idle my card runs 5℃-6℃ hotter: I see 49℃-50℃ instead of the temperature shown on the site above.