LinuxQuestions.org
04-05-2023, 12:38 PM   #1
pd27
LQ Newbie
 
Registered: Apr 2023
Posts: 4

Rep: Reputation: 0
PC shutting down during games


I see the following in the logs:
Code:
kernel: amdgpu 0000:04:00.0: amdgpu: ERROR: GPU over temperature range(SW CTF) detected!
kernel: amdgpu 0000:04:00.0: amdgpu: ERROR: System is going to shutdown due to GPU SW CTF!
Question: How long has this problem been in the kernel? Why does it happen?
I have been watching the temperature and never see more than 72℃. The critical threshold is 110℃. If critical is 110℃, why does the kernel shut down the PC when the video card only reaches 72℃ at most?
 
04-05-2023, 03:49 PM   #2
uteck
Senior Member
 
Registered: Oct 2003
Location: Elgin, IL, USA
Distribution: Ubuntu based stuff for the most part
Posts: 1,177

Rep: Reputation: 501
What distro and kernel are you using? Also, what hardware?
Kernel 5.8 had patches to deal with this issue.
 
04-05-2023, 07:21 PM   #3
Jan K.
Member
 
Registered: Apr 2019
Location: Esbjerg
Distribution: Windows 7...
Posts: 773

Rep: Reputation: 489
"Only" 72℃?

I would definitely check the fans and thermal paste ASAP!

Critical 110℃? That would probably kill any processor...
 
04-05-2023, 07:43 PM   #4
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,246

Rep: Reputation: 5323
Your kernel is detecting that your video card is overheating, and you think the problem is with the kernel?
 
04-05-2023, 08:15 PM   #5
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Rep: Reputation: 1694
This is a config issue. Or maybe a kernel issue.

My ATI has a max SAFE range of up to 110C (not a thing wrong with it in 4 years).

I wouldn't want my machine deciding to shut down when it's running as advertised, either.

https://www.pcgamer.com/fretting-ove...-spec-on-navi/

But it's all down to the hardware. Clearly the OP is expecting to run at higher temperatures, like mine?
 
04-05-2023, 08:43 PM   #6
frankbell
LQ Guru
 
Registered: Jan 2006
Location: Virginia, USA
Distribution: Slackware, Ubuntu MATE, Mageia, and whatever VMs I happen to be playing with
Posts: 19,361
Blog Entries: 28

Rep: Reputation: 6148
In addition to what Jan K. suggested, also check to ensure that cooling vents are free of dust and other obstructions.
 
04-06-2023, 12:51 AM   #7
pd27
LQ Newbie
 
Registered: Apr 2023
Posts: 4

Original Poster
Rep: Reputation: 0
openSUSE Leap 15.5
Kernel: 5.14.21-150500.46-default
CPU: AMD 3700X
Video card: RX 6700 XT Nitro+
The coolers are clean and the radiator is clean.

My previous card, an RX 570x, ran at the same temperatures but never shut down.
 
04-06-2023, 12:55 PM   #8
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Rep: Reputation: 1694
https://www.techpowerup.com/review/a...700-xt/33.html

This indicates that the hotspot for that card is 95C and that it games at around 80C. So it seems to me that a 70C shutdown is unnecessary.

Seems I might as well point out the obvious... *BEFORE DOING ANY OF THIS* you had better make sure you really, truly know what you are getting into, because you can raise this safety check too high and smoke your card or system.

To change this, first you will have to find the appropriate hwmon directory for your GPU in /sys/class/hwmon.

Then cat the temp1_crit file to see what the setting is. Once you find the correct directory and that file, you can change it.

It's likely going to be 70000 (which is 70C), and you can change it to something more befitting the operating temperatures of that card.
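
For reference, the lookup described above could be sketched roughly like this. It is only a sketch under assumptions: the function names find_hwmon and show_temps and the HWMON_ROOT override are made up for illustration, and it assumes the GPU's hwmon directory has a "name" file containing the driver name (for example amdgpu). It only reads values and changes nothing.

```shell
#!/bin/sh
# Sketch: print every hwmon directory whose "name" file matches a driver name.
# HWMON_ROOT is a hypothetical override for trying this on a fake tree;
# without it the function looks in /sys/class/hwmon.
find_hwmon() {
    driver="$1"
    root="${HWMON_ROOT:-/sys/class/hwmon}"
    for dir in "$root"/hwmon*; do
        [ -r "$dir/name" ] || continue
        if [ "$(cat "$dir/name")" = "$driver" ]; then
            printf '%s\n' "$dir"
        fi
    done
}

# Sketch: for each tempN_input in a hwmon directory, print its label (or the
# file prefix if there is no label), the current reading and the critical
# limit. All values are in millidegrees Celsius, as stored in sysfs.
show_temps() {
    dir="$1"
    for input in "$dir"/temp*_input; do
        [ -r "$input" ] || continue
        base="${input%_input}"
        label="$(cat "${base}_label" 2>/dev/null || echo "${base##*/}")"
        crit="$(cat "${base}_crit" 2>/dev/null || echo "n/a")"
        printf '%s: input=%s crit=%s\n' "$label" "$(cat "$input")" "$crit"
    done
}
```

Something like `show_temps "$(find_hwmon amdgpu)"` would then list the sensors, assuming exactly one matching directory. Whether temp1_crit accepts writes at all depends on the driver; it may well be read-only, so check the file permissions before assuming it can be changed.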
 
04-11-2023, 01:32 AM   #9
pd27
LQ Newbie
 
Registered: Apr 2023
Posts: 4

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by szboardstretcher
Then cat the temp1_crit file to see what the setting is. Once you find the correct directory, and that file, you can change it.

It's likely going to be 70000 (which is 70C) and you can change it to something more befitting the operating temperatures of that card.
Code:
~> cat /sys/class/hwmon/hwmon1/temp1_crit
110000
~> cat /sys/class/hwmon/hwmon1/temp1_emergency 
115000
~> cat /sys/class/hwmon/hwmon1/temp1_input 
50000
~> cat /sys/class/hwmon/hwmon1/temp1_label 
edge
~> cat /sys/class/hwmon/hwmon1/temp2_crit
110000
~> cat /sys/class/hwmon/hwmon1/temp2_emergency 
115000
~> cat /sys/class/hwmon/hwmon1/temp2_input 
57000
~> cat /sys/class/hwmon/hwmon1/temp2_label
junction
~> cat /sys/class/hwmon/hwmon1/temp3_crit
105000
~> cat /sys/class/hwmon/hwmon1/temp3_emergency 
110000
~> cat /sys/class/hwmon/hwmon1/temp3_input 
54000
~> cat /sys/class/hwmon/hwmon1/temp3_label
mem
I suspect that the "junction" temperature goes over the limit in games, and that this is exactly what triggers the shutdown. I'll check later.
But what is this "junction" temperature?
I don't have any value like 70000.
I think that if the 6700 gets too hot, it should thermal throttle instead of shutting down the whole system.
I have this card: https://www.techpowerup.com/review/s...-nitro/33.html
At idle my card runs 5℃-6℃ hotter: I see 49℃-50℃ instead of the temperature reported on the site above.
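
One way to check the junction suspicion would be to sample the sensor file while a game is running and keep the highest reading. The following is only a rough sketch: the function name max_temp_c and its arguments are made up for illustration, and it assumes the file holds a single integer in millidegrees Celsius, as in the output above.

```shell
#!/bin/sh
# Sketch: read a hwmon temperature file a fixed number of times and print the
# highest value seen, converted from millidegrees to whole degrees Celsius.
# Arguments: file to read, number of samples, seconds to wait between samples.
# The maximum starts at 0, so readings below 0 C are not tracked.
max_temp_c() {
    file="$1"
    samples="$2"
    interval="$3"
    max=0
    i=0
    while [ "$i" -lt "$samples" ]; do
        cur="$(cat "$file")"
        if [ "$cur" -gt "$max" ]; then
            max="$cur"
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo $((max / 1000))
}
```

For example, `max_temp_c /sys/class/hwmon/hwmon1/temp2_input 600 1` would watch the junction sensor from the listing above for about ten minutes. If the system shuts down before the loop finishes, the result is lost, so printing each sample to a file as well may be more useful in practice.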

Last edited by pd27; 04-11-2023 at 01:44 AM.
 
04-17-2023, 01:16 PM   #10
pd27
LQ Newbie
 
Registered: Apr 2023
Posts: 4

Original Poster
Rep: Reputation: 0
I changed the thermal compound and found that part of the thermal pad on the power section was missing.
Now I no longer have problems with shutdowns, and the GPU temperature is lower.
 
  


All times are GMT -5.
