LinuxQuestions.org

floppy_stuttgart 08-23-2021 07:13 AM

Upgrade from Debian 10 to Debian 11: issue with the Radeon graphics card [1002:6611]
 
Hello,
Since I upgraded to Debian 11 yesterday, a few errors appear in dmesg, and the graphics card no longer sends a signal straight to my screen through the HDMI switch (I now have to toggle the switch off and on).

If anybody has an idea what to do, let me know. This is a small issue: after toggling the HDMI switch off and on, I see Debian 11 on the screen.


lspci -nn
...
Quote:

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland [Radeon HD 8570 / R5 430 OEM / R7 240/340 / Radeon 520 OEM] [1002:6611] (rev 87)
...

dmesg | grep -i "radeon"
Quote:

[ 1.446411] [drm] radeon kernel modesetting enabled.
[ 1.446455] fb0: switching to radeondrmfb from EFI VGA
[ 1.446575] radeon 0000:01:00.0: vgaarb: deactivate vga console
[ 1.446870] radeon 0000:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)
[ 1.446871] radeon 0000:01:00.0: GTT: 2048M 0x0000000080000000 - 0x00000000FFFFFFFF
[ 1.446967] [drm] radeon: 2048M of VRAM memory ready
[ 1.446967] [drm] radeon: 2048M of GTT memory ready.
[ 1.446985] radeon 0000:01:00.0: firmware: direct-loading firmware radeon/oland_pfp.bin
[ 1.446993] radeon 0000:01:00.0: firmware: direct-loading firmware radeon/oland_me.bin
[ 1.447001] radeon 0000:01:00.0: firmware: direct-loading firmware radeon/oland_ce.bin
[ 1.447008] radeon 0000:01:00.0: firmware: direct-loading firmware radeon/oland_rlc.bin
[ 1.447018] radeon 0000:01:00.0: firmware: direct-loading firmware radeon/si58_mc.bin
[ 1.447033] radeon 0000:01:00.0: firmware: direct-loading firmware radeon/oland_smc.bin
[ 1.453194] [drm] radeon: dpm initialized
[ 1.453236] radeon 0000:01:00.0: firmware: direct-loading firmware radeon/TAHITI_uvd.bin
[ 1.453260] radeon 0000:01:00.0: firmware: direct-loading firmware radeon/TAHITI_vce.bin
[ 1.504506] radeon 0000:01:00.0: WB enabled
[ 1.504507] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00
[ 1.504508] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04
[ 1.504509] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08
[ 1.504509] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c
[ 1.504510] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10
[ 1.504828] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18
[ 1.533145] radeon 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[ 1.605397] radeon 0000:01:00.0: failed VCE resume (-110).
[ 1.605427] radeon 0000:01:00.0: radeon: MSI limited to 32-bit
[ 1.605455] radeon 0000:01:00.0: radeon: using MSI.
[ 1.605471] [drm] radeon: irq initialized.
[ 2.617879] [drm] Radeon Display Connectors
[ 2.723579] fbcon: radeondrmfb (fb0) is primary device
[ 2.814890] radeon 0000:01:00.0: [drm] fb0: radeondrmfb frame buffer device
[ 2.833132] [drm] Initialized radeon 2.50.0 20080528 for 0000:01:00.0 on minor 0
[ 2.846166] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery reached max voltage
[ 2.846204] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery failed
[ 20.824797] radeon_dp_aux_transfer_native: 74 callbacks suppressed
[ 30.336785] radeon_dp_aux_transfer_native: 116 callbacks suppressed
[ 44.640777] radeon_dp_aux_transfer_native: 32 callbacks suppressed
[ 58.844770] radeon_dp_aux_transfer_native: 74 callbacks suppressed
[ 104.730197] radeon_dp_aux_transfer_native: 158 callbacks suppressed
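To pull only the failure lines out of a long log like the one above, a small filter can help. This is a minimal sketch: the function name radeon_problems and the match patterns are assumptions chosen to fit the messages quoted in this post, not an official list of radeon error strings.

```shell
# Hypothetical helper: read kernel log text on stdin and keep only lines
# that mention radeon AND look like a failure. The patterns are assumptions
# based on the messages quoted in this thread.
radeon_problems() {
    grep -i 'radeon' | grep -E '\*ERROR\*|failed'
}

# Usage on a live system (reading the kernel log may require root):
#   dmesg | radeon_problems
```

Applied to the log above, this should keep the "failed VCE resume" line and the two "clock recovery" lines and drop the rest.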

HappyTux 08-23-2021 11:00 AM

Not much can be done, I would think. The new driver does not like the switch connection, so you need to make it connect again. The only route to a resolution I can see is to get in touch with the developers and see whether they want to address the new behaviour by debugging it with you.

floppy_stuttgart 08-23-2021 12:08 PM

Thanks. It is not a big problem. However, I see a difference: on Debian 11 the radeon error comes straight after the i915 initialization.

Old DMESG: (Debian10)
Quote:

[ 2.942663] fbcon: radeondrmfb (fb0) is primary device
[ 2.967836] usb 1-13: new low-speed USB device number 5 using xhci_hcd
[ 3.090718] Console: switching to colour frame buffer device 240x67
[ 3.093490] radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
[ 3.108330] [drm] Initialized radeon 2.50.0 20080528 for 0000:01:00.0 on minor 0
[ 3.125261] usb 1-13: New USB device found, idVendor=03f0, idProduct=354a, bcdDevice= 1.22
[ 3.125262] usb 1-13: New USB device strings: Mfr=0, Product=1, SerialNumber=0
[ 3.125262] usb 1-13: Product: HP USB Slim Keyboard
[ 3.128461] input: HP USB Slim Keyboard as /devices/pci0000:00/0000:00:14.0/usb1/1-13/1-13:1.0/0003:03F0:354A.0005/input/input11
[ 3.187865] hid-generic 0003:03F0:354A.0005: input,hidraw4: USB HID v1.10 Keyboard [HP USB Slim Keyboard] on usb-0000:00:14.0-13/input0
[ 3.190664] input: HP USB Slim Keyboard Consumer Control as /devices/pci0000:00/0000:00:14.0/usb1/1-13/1-13:1.1/0003:03F0:354A.0006/input/input12
[ 3.194942] [drm] amdgpu kernel modesetting enabled.
[ 3.247809] input: HP USB Slim Keyboard System Control as /devices/pci0000:00/0000:00:14.0/usb1/1-13/1-13:1.1/0003:03F0:354A.0006/input/input13
[ 3.247824] input: HP USB Slim Keyboard as /devices/pci0000:00/0000:00:14.0/usb1/1-13/1-13:1.1/0003:03F0:354A.0006/input/input14
[ 3.247854] hid-generic 0003:03F0:354A.0006: input,hiddev2,hidraw5: USB HID v1.10 Device [HP USB Slim Keyboard] on usb-0000:00:14.0-13/input1
[ 3.379790] usb 1-14: new full-speed USB device number 6 using xhci_hcd
[ 3.532745] usb 1-14: New USB device found, idVendor=8087, idProduct=0aaa, bcdDevice= 0.02
[ 3.532746] usb 1-14: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 3.647562] [drm] Initialized i915 1.6.0 20180719 for 0000:00:02.0 on minor 1
[ 3.647663] ACPI: Video Device [PEGP] (multi-head: yes rom: no post: no)
[ 3.647825] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:06/LNXVIDEO:00/input/input15
[ 3.649096] ACPI: Video Device [GFX0] (multi-head: yes rom: no post: no)
[ 3.649151] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:01/input/input16
[ 3.677486] [drm] Cannot find any crtc or sizes
[ 3.708458] [drm] Cannot find any crtc or sizes
[ 3.738503] [drm] Cannot find any crtc or sizes
New DMESG: (Debian11)
Quote:

[ 2.721528] fbcon: radeondrmfb (fb0) is primary device
[ 2.785100] usb 1-13: new low-speed USB device number 6 using xhci_hcd
[ 2.808753] Console: switching to colour frame buffer device 240x67
[ 2.812297] radeon 0000:01:00.0: [drm] fb0: radeondrmfb frame buffer device
[ 2.825391] [drm] Initialized radeon 2.50.0 20080528 for 0000:01:00.0 on minor 0
[ 2.825950] [drm] Initialized i915 1.6.0 20200917 for 0000:00:02.0 on minor 1
[ 2.826120] ACPI: Video Device [PEGP] (multi-head: yes rom: no post: no)
[ 2.826374] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:06/LNXVIDEO:00/input/input10
[ 2.828387] ACPI: Video Device [GFX0] (multi-head: yes rom: no post: no)
[ 2.828491] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:01/input/input11
[ 2.856402] i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
[ 2.859611] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery tried 5 times
[ 2.859792] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery failed
[ 2.885312] i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
[ 2.912178] i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
Could it be that the onboard i915 graphics and the radeon card interfere with each other?
Why did the kernel try to set up a non-primary device (i915)?
Perhaps the i915 module should be blacklisted. I will search further.
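Comparing two boot logs by eye is easier once the timestamps are removed, because they differ on every boot. The following is only a sketch: the function name strip_stamp and the file names in the usage comment are placeholders.

```shell
# Hypothetical helper: drop the leading "[ seconds.micros]" stamp from each
# kernel log line so that two boots can be compared with diff.
strip_stamp() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] //'
}

# Usage sketch (file names are placeholders):
#   strip_stamp < dmesg-debian10.txt > old.txt
#   strip_stamp < dmesg-debian11.txt > new.txt
#   diff old.txt new.txt
```

The diff will still be noisy (device numbers and ordering change between boots), but new messages such as the clock recovery errors should stand out.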

HappyTux 08-23-2021 12:29 PM

Quote:

Originally Posted by floppy_stuttgart (Post 6277857)

Could it be the i915 onboard grafic and the radeon would disturb together?
Why a non primary device (i915) was tried to get a setup?
i915 grafic perhaps should be blacklisted. I will further search.

The blacklist idea is worth a try. The module is not being used, so there is no sense in it even loading.

floppy_stuttgart 08-24-2021 10:21 AM

Would removing i915 be a problem if the radeon card were to fail?
https://wiki.debian.org/KernelModuleBlacklisting
I would not like to mess up my system with an initramfs command that I cannot undo.
A further finding: it looks like i915 in Debian 11 has a more extended configuration; see below.
Comments and remarks are welcome.

Debian10 kernel
grep -i i915 /boot/config-4.19.0-17-amd64
Quote:

CONFIG_DRM_I915=m
# CONFIG_DRM_I915_ALPHA_SUPPORT is not set
CONFIG_DRM_I915_CAPTURE_ERROR=y
CONFIG_DRM_I915_COMPRESS_ERROR=y
CONFIG_DRM_I915_USERPTR=y
# CONFIG_DRM_I915_GVT is not set
# drm/i915 Debugging
# CONFIG_DRM_I915_WERROR is not set
# CONFIG_DRM_I915_DEBUG is not set
# CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS is not set
# CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is not set
# CONFIG_DRM_I915_DEBUG_GUC is not set
# CONFIG_DRM_I915_SELFTEST is not set
# CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS is not set
# CONFIG_DRM_I915_DEBUG_VBLANK_EVADE is not set
CONFIG_SND_HDA_I915=y
Debian11 kernel
grep -i i915 /boot/config-5.10.0-8-amd64
Quote:

CONFIG_DRM_I915=m
CONFIG_DRM_I915_FORCE_PROBE=""
CONFIG_DRM_I915_CAPTURE_ERROR=y
CONFIG_DRM_I915_COMPRESS_ERROR=y
CONFIG_DRM_I915_USERPTR=y
CONFIG_DRM_I915_GVT=y
CONFIG_DRM_I915_GVT_KVMGT=m
# drm/i915 Debugging
# CONFIG_DRM_I915_WERROR is not set
# CONFIG_DRM_I915_DEBUG is not set
# CONFIG_DRM_I915_DEBUG_MMIO is not set
# CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS is not set
# CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is not set
# CONFIG_DRM_I915_DEBUG_GUC is not set
# CONFIG_DRM_I915_SELFTEST is not set
# CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS is not set
# CONFIG_DRM_I915_DEBUG_VBLANK_EVADE is not set
# CONFIG_DRM_I915_DEBUG_RUNTIME_PM is not set
# end of drm/i915 Debugging
# drm/i915 Profile Guided Optimisation
CONFIG_DRM_I915_FENCE_TIMEOUT=10000
CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND=250
CONFIG_DRM_I915_HEARTBEAT_INTERVAL=2500
CONFIG_DRM_I915_PREEMPT_TIMEOUT=640
CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT=8000
CONFIG_DRM_I915_STOP_TIMEOUT=100
CONFIG_DRM_I915_TIMESLICE_DURATION=1
# end of drm/i915 Profile Guided Optimisation
CONFIG_SND_HDA_I915=y
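Rather than comparing the two listings by eye, the lines that appear only in the newer config can be printed directly. This is a rough sketch; the function name i915_only_in_new is made up for illustration, and it compares whole lines only.

```shell
# Hypothetical helper: print i915-related lines that are present in the
# new kernel config but not (verbatim) in the old one.
i915_only_in_new() {
    old_cfg=$1
    new_cfg=$2
    tmp=$(mktemp) || return 1
    # Collect the old i915 lines, then filter them out of the new ones
    # (-F fixed strings, -x whole-line match, -v invert, -f patterns from file).
    grep -i 'i915' "$old_cfg" > "$tmp"
    grep -i 'i915' "$new_cfg" | grep -F -x -v -f "$tmp"
    rm -f "$tmp"
}

# Usage with the two files from this post:
#   i915_only_in_new /boot/config-4.19.0-17-amd64 /boot/config-5.10.0-8-amd64
```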

HappyTux 08-24-2021 07:00 PM

Quote:

Originally Posted by floppy_stuttgart (Post 6278198)
Taking i915 out would be an issue if the radeon grafic would scratch?
https://wiki.debian.org/KernelModuleBlacklisting
I would not like to mess up my system with an initramfs command where I could not go back.
Further finding; It looks like the i915 in Debian11 has a more extended setup; see below.
Comments/remarks are welcome.

Debian10 kernel
grep -i i915 /boot/config-4.19.0-17-amd64


Debian11 kernel
grep -i i915 /boot/config-5.10.0-8-amd64

That is to be expected: it is many kernel versions later, so there is bound to be some progress on the driver. It is up to you whether to try, but there is not much to lose in a one-time boot with the module blacklisted. You do not use the onboard graphics, so the system will still use the add-in card. To reverse it, delete the blacklist file, run the depmod command again, and update the initramfs once more. Then reboot, and everything is back to where it was before the blacklisting.
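The blacklist and its reversal could look roughly like the sketch below. The file name i915-blacklist.conf and both function names are placeholders, and the directory is a parameter so the functions can be tried on a scratch directory before touching /etc/modprobe.d.

```shell
# Sketch only. i915-blacklist.conf is a placeholder file name.
blacklist_i915() {
    dir=${1:-/etc/modprobe.d}
    printf 'blacklist i915\n' > "$dir/i915-blacklist.conf"
}

# Undo: remove the file again.
unblacklist_i915() {
    dir=${1:-/etc/modprobe.d}
    rm -f "$dir/i915-blacklist.conf"
}

# On the real system, run these as root after either function, then reboot:
#   depmod -a
#   update-initramfs -u
```

Because the change is a single file, undoing it is the same procedure in reverse, which is why a one-time test boot carries little risk.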

floppy_stuttgart 08-25-2021 10:46 AM

I had to blacklist the module and add a fake install entry. i915 is gone, but the issue is still there and the old errors are still there. One new error has appeared (no effect noticed on the sound).
I still have to toggle my HDMI switch (this wakes the card up; otherwise the screen goes into sleep mode). A bit annoying.
It looks like this is not an interaction between i915 and radeon but a radeon issue on Debian 11. I will try to capture the dmesg output again when I toggle the HDMI switch after a while.

Quote:

[ 1.613602] radeon 0000:01:00.0: failed VCE resume (-110).
[ 2.894117] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery tried 5 times
[ 2.894270] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery failed
Quote:

[ 77.090436] hdaudio hdaudioC0D2: Unable to bind the codec

floppy_stuttgart 08-29-2021 05:27 AM

Update.

I booted the PC this morning after it had been off overnight, so this was a cold boot. The screen came up after a while (a bit too long in my opinion, but that is acceptable).

a) The errors below are gone:

[ 2.894117] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery tried 5 times
[ 2.894270] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery failed

b) Yesterday I did the following (possibly with a positive effect on a cold boot?):

apt-get update
apt-get dist-upgrade          # one lib was updated (cannot remember which one)
pip3 install amdgpu-fan       # see https://wiki.debian.org/AtiHowTo
apt-get install radeontop     # see https://wiki.debian.org/AtiHowTo

After a reboot, the issue is back again.

So there is a difference between a cold boot and a warm boot. I had a similar issue some time ago (with sound, not graphics). I will see whether a similar approach solves this one: https://www.linuxquestions.org/quest...ve-4175534057/

This PC dual-boots Windows 10. There are no issues at all under Windows 10.

Any advice/help is welcome.

floppy_stuttgart 09-01-2021 08:34 AM

Adding
Quote:

radeon.pcie_gen2=0
as a boot parameter did the trick (found at https://wiki.archlinux.org/title/ATI).

cat /proc/cmdline
Quote:

BOOT_IMAGE=/boot/vmlinuz-5.10.0-8-amd64 root=UUID=300f2560-ac9d-4eb9-8d18-c155909766c7 ro quiet 8250.nr_uarts=10 radeon.pcie_gen2=0
Thread closed.
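To confirm after a reboot that a parameter really reached the kernel, the command line can be checked with a small function. This is a sketch; the name has_param is invented for illustration, and it only checks for a whole-word match on the command line, not whether the driver honoured the value.

```shell
# Hypothetical helper: succeed if parameter $1 appears as a whole word
# in the kernel command line text given as $2.
has_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the running system:
#   has_param radeon.pcie_gen2=0 "$(cat /proc/cmdline)" && echo "parameter is set"
```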

HappyTux 09-01-2021 12:50 PM

Quote:

Originally Posted by floppy_stuttgart (Post 6280530)
Adding as boot parameter made it https://wiki.archlinux.org/title/ATI

cat /proc/cmdline


Thread closed.

And the Arch Linux wiki strikes again. They have some good pages there, and I have used it a few times as well. Good to hear you got it going, and nice to see the solution posted.

floppy_stuttgart 09-02-2021 02:31 AM

Quote:

Originally Posted by HappyTux (Post 6280616)
And the Arch Linux wiki strikes again, they have some good pages on that thing, I have used it a few times as well. Good you hear you got it going and nice to see the solution posted.

Yes, the Arch Wiki is a source of good things, and I think this is not the first time I have found a solution there. After your post I tried radeon.pcm=0, which worked, but by the third boot it no longer did. I then searched further for radeon.xxx=1/0 parameters. Given the observed difference between cold boot and warm boot, it seemed plausible that an instability in the hardware had to be compensated for, so a PCI boot parameter made sense to me. I tried it. It worked five times, but no longer. The situation is still unstable.

floppy_stuttgart 09-02-2021 06:23 AM

Closed. Done.

lspci -nn
Quote:

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland [Radeon HD 8570 / R5 430 OEM / R7 240/340 / Radeon 520 OEM] [1002:6611] (rev 87)
Radeon HD 8570 is "Sea Islands" https://en.wikipedia.org/wiki/Radeon#Sea_Islands
Quote:

Sea Islands
Main article: Radeon HD 8000 series
The "Sea Islands" were OEM rebadges of the 7000 series, with only three products, code named Oland, available for general retail. The series, just like the "Southern Islands", used a mixture of VLIW5 models and GCN models for its desktop products.
Card advice https://wiki.debian.org/AtiHowTo#gcn1011
Quote:

Within the quotes on the line that starts with GRUB_CMDLINE_LINUX_DEFAULT, add the options radeon.si_support=0 amdgpu.si_support=1 for Southern Islands (GCN 1.0) cards, or radeon.cik_support=0 amdgpu.cik_support=1 for Sea Islands (GCN 1.1) cards.
I implemented it as a boot parameter (see below).
cat /proc/cmdline
Quote:

BOOT_IMAGE=/boot/vmlinuz-5.10.0-8-amd64 root=UUID=300f2560-ac9d-4eb9-8d18-c155909766c7 ro quiet 8250.nr_uarts=10 radeon.cik_support=0 amdgpu.cik_support=1
The system is OK. Why was this not necessary on Debian 10? I don't know.
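Making such options permanent means editing the GRUB_CMDLINE_LINUX_DEFAULT line described in the wiki quote above. The sketch below appends one option to that line. The function name add_grub_param is hypothetical, it assumes the value is enclosed in double quotes on a single line, and the file path is a parameter so it can be tested on a copy first.

```shell
# Hypothetical helper: append parameter $1 inside the quotes of the
# GRUB_CMDLINE_LINUX_DEFAULT line in file $2 (default /etc/default/grub).
add_grub_param() {
    param=$1
    file=${2:-/etc/default/grub}
    # Skip if the parameter is already on that line.
    line=$(grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file")
    case "$line" in
        *"$param"*) return 0 ;;
    esac
    tmp=$(mktemp) || return 1
    # Insert " param" just before the closing double quote.
    sed "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $param\"|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage as root, followed by regenerating the GRUB config and rebooting:
#   add_grub_param radeon.cik_support=0
#   add_grub_param amdgpu.cik_support=1
#   update-grub
```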

HappyTux 09-02-2021 07:46 AM

Quote:

Originally Posted by floppy_stuttgart (Post 6280786)


System is ok. Why it was not necessary for Debian10? I dont know.

You now boot a newer kernel with Debian 11; unless you had backports enabled in Debian 10 and went out of your way to install a newer kernel, that would be the difference. I had to do this to get ZFS working in version 10 when it stopped working after the upgrade from 9. Luckily I had waited long enough before upgrading that the bugs had already been filed, so I found the solution quickly. Then, once I moved to 11, I forgot that the backports entry was in a file in the sources.list.d directory, and it broke again because of the old files installed from there.

floppy_stuttgart 09-03-2021 06:49 AM

The issue came back today even though it worked yesterday. Let's keep this thread closed. I will come back here once I have found a solution that has stayed stable for a few weeks.

floppy_stuttgart 09-20-2021 12:58 PM

I am still battling with my card after the upgrade. Two screens can be attached, but I have not found the exact sequence that makes this work consistently. Any advice or idea is welcome.

The following error now appears:

Quote:

[ 1.609962] kfd kfd: OLAND not supported in kfd
I am not sure what it means.

dmesg | grep -i "error"
Quote:

[ 0.292129] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.PEG0.HDAU._STA.M097], AE_NOT_FOUND (20200925/psargs-330)
[ 0.292136] ACPI Error: Aborting method \_SB.PCI0.PEG0.HDAU._STA due to previous error (AE_NOT_FOUND) (20200925/psparse-529)
[ 0.349177] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.PEG0.HDAU._STA.M097], AE_NOT_FOUND (20200925/psargs-330)
[ 0.349183] ACPI Error: Aborting method \_SB.PCI0.PEG0.HDAU._STA due to previous error (AE_NOT_FOUND) (20200925/psparse-529)
[ 0.379951] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.PEG0.HDAU._STA.M097], AE_NOT_FOUND (20200925/psargs-330)
[ 0.379951] ACPI Error: Aborting method \_SB.PCI0.PEG0.HDAU._STA due to previous error (AE_NOT_FOUND) (20200925/psparse-529)
[ 1.172543] pcieport 0000:00:1b.0: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
[ 1.172870] pcieport 0000:00:1d.0: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
[ 12.644585] EXT4-fs (sda2): re-mounted. Opts: errors=remount-ro
[ 15.879956] hp_wmi: query 0x4 returned error 0x5
[ 15.881006] hp_wmi: query 0xd returned error 0x5
[ 15.883392] hp_wmi: query 0x1b returned error 0x5
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.10.0-8-amd64 root=UUID=300f2560-ac9d-4eb9-8d18-c155909766c7 ro quiet 8250.nr_uarts=6 radeon.pcie_gen2=0 radeon.dpm=0 radeon.cik_support=0 amdgpu.cik_support=1

I will observe how stable it is over the next few days. So far it is OK. Re-reading the Arch Wiki was a good thing.
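Since the command line above combines radeon and amdgpu options, it may be worth confirming which driver actually bound to the card. A minimal sketch follows; the function name driver_in_use is made up, and it only parses the "Kernel driver in use" line that lspci -k prints.

```shell
# Hypothetical helper: read `lspci -k` output on stdin and print the
# name of the kernel driver bound to the device.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use: //p'
}

# Usage for the card at 01:00.0 from this thread:
#   lspci -k -s 01:00.0 | driver_in_use
```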


All times are GMT -5.