LinuxQuestions.org
Old 08-06-2023, 06:00 PM   #1
selfprogrammed
Member
 
Registered: Jan 2010
Location: Minnesota, USA
Distribution: Slackware 13.37, 14.2, 15.0
Posts: 638

Rep: Reputation: 155
Enabling Kernel edac AMD64 causes fuse segfaults


Been working on bringing up this custom kernel for Slackware 15, kernel 5.15.
I have a huge kernel, from the Slackware distribution.
I have a custom kernel, based initially off my previous custom kernel from Linux 4.4.
I have done this many times so I know how to migrate to a new kernel. I have been running Slackware and making custom kernels since 1997.

The last problem was that I needed to get a new webcam installed for a meeting.
Because of that I had to put Video4Linux (V4L) support into the custom kernel.
At the same time I looked around for other things, and noticed EDAC.
I don't have ECC memory, but it appeared to support AMD64 (which I do have) and to offer other features that it was not specific about, though it mentions MCE registers, which I do have.
So I enabled it. It was enabled in the huge kernel and was not causing any problems there.
I only enabled the EDAC that was AMD and made them modules, just like in the huge kernel from Slackware.

Been getting errors in dmesg and syslog ever since.
I run the old Linux 4.4 until I get this running right, and there are no errors over there, so it is NOT an actual memory problem. I run for hours and hours and have NOT seen new nor random faults.
These errors also do not affect this 5.15 system, even after running it for hours and hours.

The huge kernel does not keep module edac_amd64 loaded, but keeps the edac_amd_mce module loaded.

The edac_amd64 module loads with missing symbol error messages (edac_get_owner is one).
These would be satisfied by functions in edac_mc.c. The huge kernel does not get these error messages.
I cannot find where edac_mc.c code would end up.
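One way to find out where a symbol such as edac_get_owner ends up is to look it up in the Module.symvers file that the kernel build writes at the top of the build tree. The sketch below is only an illustration: symbol_owner is a made-up helper name, and it assumes the usual tab-separated layout with the symbol name in the second field and the exporting object in the third. If the lookup names a module, that module has to be built and installed alongside edac_amd64. If it prints nothing, this configuration did not build the code that exports the symbol.

```shell
#!/bin/sh
# symbol_owner SYMBOL [SYMVERS_FILE]
# Hypothetical helper: print which object (vmlinux or a module path) exports
# SYMBOL according to a Module.symvers file.
# Assumed layout: tab-separated fields, CRC first, symbol name second,
# exporting object third.
symbol_owner() {
    awk -F '\t' -v sym="$1" '$2 == sym { print $3 }' "${2:-Module.symvers}"
}

# Example (run from the top of the kernel build tree):
#   symbol_owner edac_get_owner
```

Running the same lookup in the build tree of a kernel that loads the module cleanly gives something to compare against.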

I am configuring the kernel using menuconfig, and selecting normal EDAC selections.
Unless I am doing something wrong, there is a BUG in the kernel EDAC.

The story continues thus:
So, since the huge kernel does not keep edac_amd64, and after two more compile attempts I cannot fix this,
I decide to turn off the edac_amd64, and turn off the edac_amd76 module too.

NOW:
I notice in the logs that fuse has been having a NULL ptr and has been OOPS-ing.
I can see in the syslog, that this started exactly with the custom kernel with edac.

I am looking at dmesg now and see these messages.

Selected portions of DMESG.
Code:
[    0.000000] Linux version 5.15.19-smp-W27 (root@darkstar) (gcc (GCC) 11.2.0, GNU ld version 2.37-slack15) #3 SMP Wed Apr 5 11:14:12 CDT 2023
[    0.000000] KERNEL supported cpus:
[    0.000000]   AMD AuthenticAMD
[    0.000000] x86/fpu: x87 FPU will use FXSAVE
[    0.000000] signal: max sigframe size: 1440
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009f7ff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009f800-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000cfdeffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000cfdf0000-0x00000000cfdf0fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000cfdf1000-0x00000000cfdfffff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000cfe00000-0x00000000cfefffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000020fffffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] SMBIOS 2.4 present.
[    0.000000] DMI: Gigabyte Technology Co., Ltd. GA-880GA-UD3H/GA-880GA-UD3H, BIOS F4 07/28/2010
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 3013.345 MHz processor
[    0.003390] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.003394] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.003399] last_pfn = 0x210000 max_arch_pfn = 0x1000000
[    0.003513] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT  
[    0.003691] e820: update [mem 0xcfe00000-0xffffffff] usable ==> reserved
[    0.003707] initial memory mapped: [mem 0x00000000-0x01dfffff]
[    0.003752] ACPI: Early table checksum verification disabled
Code:
[    0.402563] ACPI: 2 ACPI AML tables successfully acquired and loaded
[    0.402750] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKC (20210730/dspkginit-438)
[    0.402759] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKD (20210730/dspkginit-438)
[    0.402766] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKA (20210730/dspkginit-438)
[    0.402773] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKB (20210730/dspkginit-438)
[    0.402785] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKD (20210730/dspkginit-438)
[    0.402792] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKA (20210730/dspkginit-438)
[    0.402799] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKB (20210730/dspkginit-438)
[    0.402806] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKC (20210730/dspkginit-438)
[    0.402828] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKA (20210730/dspkginit-438)
[    0.402838] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKB (20210730/dspkginit-438)
[    0.402845] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKC (20210730/dspkginit-438)
[    0.402852] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKD (20210730/dspkginit-438)
[    0.402863] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKB (20210730/dspkginit-438)
[    0.402870] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKC (20210730/dspkginit-438)
[    0.402876] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKD (20210730/dspkginit-438)
[    0.402883] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKA (20210730/dspkginit-438)
[    0.402894] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKC (20210730/dspkginit-438)
[    0.402901] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKD (20210730/dspkginit-438)
[    0.402908] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKA (20210730/dspkginit-438)
[    0.402914] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKB (20210730/dspkginit-438)
[    0.402925] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKD (20210730/dspkginit-438)
[    0.402932] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKA (20210730/dspkginit-438)
[    0.402939] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKB (20210730/dspkginit-438)
[    0.402945] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKC (20210730/dspkginit-438)
[    0.402956] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKB (20210730/dspkginit-438)
[    0.402963] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKC (20210730/dspkginit-438)
[    0.402970] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKD (20210730/dspkginit-438)
[    0.402977] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKA (20210730/dspkginit-438)
[    0.402988] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKC (20210730/dspkginit-438)
[    0.402995] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKD (20210730/dspkginit-438)
[    0.403002] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKA (20210730/dspkginit-438)
[    0.403009] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKB (20210730/dspkginit-438)
[    0.403020] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKD (20210730/dspkginit-438)
[    0.403027] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKA (20210730/dspkginit-438)
[    0.403033] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKB (20210730/dspkginit-438)
[    0.403040] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKC (20210730/dspkginit-438)
[    0.403051] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKA (20210730/dspkginit-438)
[    0.403058] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKB (20210730/dspkginit-438)
[    0.403064] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKC (20210730/dspkginit-438)
[    0.403071] ACPI Error: AE_NOT_FOUND, While resolving a named reference package element - LNKD (20210730/dspkginit-438)
[    0.404151] ACPI: Interpreter enabled
[    0.404178] ACPI: PM: (supports S0 S3 S5)
[    0.404182] ACPI: Using IOAPIC for interrupt routing
However it successfully uses those LNKA to LNKD interrupts later.

Code:
[    5.940387] ACPI Warning: SystemIO range 0x0000000000000B00-0x0000000000000B08 conflicts with OpRegion 0x0000000000000B00-0x0000000000000B0F (\SOR1) (20210730/utaddress-204)
Code:
[   11.503885] Adding 2046972k swap on /dev/sda10.  Priority:-2 extents:1 across:2046972k 
[   12.262110] EXT4-fs (sda9): re-mounted. Opts: (null). Quota mode: disabled.
[   19.360174] EXT4-fs (sda8): mounted filesystem with ordered data mode. Opts: (null). Quota mode: disabled.
[   19.465154] EXT4-fs (sda11): mounted filesystem with ordered data mode. Opts: (null). Quota mode: disabled.
[   20.420001] NET: Registered PF_INET6 protocol family
[   20.421272] Segment Routing with IPv6
[   20.421294] In-situ OAM (IOAM) with IPv6
[   32.633951] Generic RTL PHY on Gigabyte r8169-0-400:00: attached PHY driver (mii_bus:phy_addr=r8169-0-400:00, irq=MAC)
[   32.821644] r8169 0000:04:00.0 eth0: Link is Down
[   38.666748] timidity[1025]: segfault at 9800000 ip b79955e7 sp bfa8bad0 error 4 in libc-2.33.so[b791d000+156000]
[   38.666765] Code: 08 00 76 3f 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00 90 89 d0 e8 99 a2 ff ff 65 89 2f eb b2 8d 74 26 00 89 d0 25 00 00 f0 ff <8b> 00 eb 99 8d 74 26 00 90 89 34 24 8b 54 24 3c 89 54 24 04 ff d0
[  214.449660] fuse: init (API version 7.34)
[  214.481647] BUG: kernel NULL pointer dereference, address: 00000000
[  214.481664] #PF: supervisor read access in kernel mode
[  214.481672] #PF: error_code(0x0000) - not-present page
[  214.481678] *pdpt = 0000000008ce3001 *pde = 0000000000000000 
[  214.481691] Oops: 0000 [#1] SMP NOPTI
[  214.481702] CPU: 1 PID: 1152 Comm: fusermount3 Not tainted 5.15.19-smp-W27 #3
[  214.481715] Hardware name: Gigabyte Technology Co., Ltd. GA-880GA-UD3H/GA-880GA-UD3H, BIOS F4 07/28/2010
[  214.481721] EIP: fuse_kill_sb_anon+0x67/0xa0 [fuse]
[  214.481744] Code: 0c 8d 57 08 8b 04 24 89 4d 04 89 29 89 57 08 89 57 0c 8b 16 39 f2 74 33 e8 b6 7f fc c8 89 d8 e8 2f 3f 10 c9 8b 9b 08 02 00 00 <8b> 03 e8 32 fc ff ff 89 d8 8b 74 24 08 8b 5c 24 04 8b 7c 24 0c 8b
[  214.481754] EAX: c18736f8 EBX: 00000000 ECX: 00000001 EDX: 00000206
[  214.481762] ESI: f80c9040 EDI: ffffffea EBP: 00000000 ESP: c8e29ed8
[  214.481769] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010286
[  214.481778] CR0: 80050033 CR2: 00000000 CR3: 02aec280 CR4: 000006f0
[  214.481785] Call Trace:
[  214.481792]  ? deactivate_locked_super+0x2a/0x90
[  214.481806]  ? get_tree_nodev+0x87/0xa0
[  214.481814]  ? fuse_get_tree+0xa8/0x180 [fuse]
[  214.481834]  ? vfs_get_tree+0x20/0xa0
[  214.481842]  ? ns_capable+0x2d/0x50
[  214.481853]  ? path_mount+0x3ab/0x910
[  214.481865]  ? user_path_at_empty+0x45/0x60
[  214.481874]  ? __ia32_sys_mount+0x13a/0x1a0
[  214.481884]  ? do_int80_syscall_32+0x33/0x80
[  214.481896]  ? entry_INT80_32+0xed/0xed
[  214.481908] Modules linked in: fuse ipv6 hid_generic usbhid hid rtl8192ee btcoexist rtl_pci radeon rtlwifi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi mac80211 drm_ttm_helper snd_hda_intel ttm r8169 snd_intel_dspcfg snd_hda_codec snd_hwdep drm_kms_helper snd_hda_core cfg80211 snd_pcm syscopyarea sysfillrect sysimgblt serio_raw snd_timer ohci_pci rfkill snd realtek soundcore fb_sys_fops ohci_hcd mdio_devres drm xhci_pci libphy ehci_pci drm_panel_orientation_quirks firewire_ohci i2c_algo_bit xhci_hcd ehci_hcd firewire_core i2c_piix4 evdev pata_acpi floppy loop
[  214.482021] CR2: 0000000000000000
[  214.482056] ---[ end trace 53bef2cf1a56c3e6 ]---
[  214.482063] EIP: fuse_kill_sb_anon+0x67/0xa0 [fuse]
[  214.482082] Code: 0c 8d 57 08 8b 04 24 89 4d 04 89 29 89 57 08 89 57 0c 8b 16 39 f2 74 33 e8 b6 7f fc c8 89 d8 e8 2f 3f 10 c9 8b 9b 08 02 00 00 <8b> 03 e8 32 fc ff ff 89 d8 8b 74 24 08 8b 5c 24 04 8b 7c 24 0c 8b
[  214.482090] EAX: c18736f8 EBX: 00000000 ECX: 00000001 EDX: 00000206
[  214.482097] ESI: f80c9040 EDI: ffffffea EBP: 00000000 ESP: c8e29ed8
[  214.482104] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010286
[  214.482112] CR0: 80050033 CR2: 00000000 CR3: 02aec280 CR4: 000006f0
[  214.485997] BUG: kernel NULL pointer dereference, address: 00000000
[  214.486014] #PF: supervisor read access in kernel mode
[  214.486021] #PF: error_code(0x0000) - not-present page
[  214.486027] *pdpt = 0000000008ce4001 *pde = 0000000000000000 
[  214.486039] Oops: 0000 [#2] SMP NOPTI
[  214.486049] CPU: 1 PID: 1156 Comm: fusermount3 Tainted: G      D           5.15.19-smp-W27 #3
[  214.486060] Hardware name: Gigabyte Technology Co., Ltd. GA-880GA-UD3H/GA-880GA-UD3H, BIOS F4 07/28/2010
[  214.486066] EIP: fuse_kill_sb_anon+0x67/0xa0 [fuse]
[  214.486088] Code: 0c 8d 57 08 8b 04 24 89 4d 04 89 29 89 57 08 89 57 0c 8b 16 39 f2 74 33 e8 b6 7f fc c8 89 d8 e8 2f 3f 10 c9 8b 9b 08 02 00 00 <8b> 03 e8 32 fc ff ff 89 d8 8b 74 24 08 8b 5c 24 04 8b 7c 24 0c 8b
[  214.486097] EAX: c18736f8 EBX: 00000000 ECX: 00000001 EDX: 00000206
[  214.486105] ESI: f80c9040 EDI: ffffffea EBP: 00000000 ESP: c8e31ed8
[  214.486112] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010286
[  214.486120] CR0: 80050033 CR2: 00000000 CR3: 02aec280 CR4: 000006f0
[  214.486127] Call Trace:
[  214.486134]  ? deactivate_locked_super+0x2a/0x90
[  214.486147]  ? get_tree_nodev+0x87/0xa0
[  214.486155]  ? fuse_get_tree+0xa8/0x180 [fuse]
[  214.486173]  ? vfs_get_tree+0x20/0xa0
[  214.486181]  ? ns_capable+0x2d/0x50
[  214.486193]  ? path_mount+0x3ab/0x910
[  214.486204]  ? user_path_at_empty+0x45/0x60
[  214.486212]  ? __ia32_sys_mount+0x13a/0x1a0
[  214.486223]  ? do_int80_syscall_32+0x33/0x80
[  214.486234]  ? entry_INT80_32+0xed/0xed
[  214.486245] Modules linked in: fuse ipv6 hid_generic usbhid hid rtl8192ee btcoexist rtl_pci radeon rtlwifi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi mac80211 drm_ttm_helper snd_hda_intel ttm r8169 snd_intel_dspcfg snd_hda_codec snd_hwdep drm_kms_helper snd_hda_core cfg80211 snd_pcm syscopyarea sysfillrect sysimgblt serio_raw snd_timer ohci_pci rfkill snd realtek soundcore fb_sys_fops ohci_hcd mdio_devres drm xhci_pci libphy ehci_pci drm_panel_orientation_quirks firewire_ohci i2c_algo_bit xhci_hcd ehci_hcd firewire_core i2c_piix4 evdev pata_acpi floppy loop
[  214.486357] CR2: 0000000000000000
[  214.486363] ---[ end trace 53bef2cf1a56c3e7 ]---
[  214.486369] EIP: fuse_kill_sb_anon+0x67/0xa0 [fuse]
[  214.486387] Code: 0c 8d 57 08 8b 04 24 89 4d 04 89 29 89 57 08 89 57 0c 8b 16 39 f2 74 33 e8 b6 7f fc c8 89 d8 e8 2f 3f 10 c9 8b 9b 08 02 00 00 <8b> 03 e8 32 fc ff ff 89 d8 8b 74 24 08 8b 5c 24 04 8b 7c 24 0c 8b
[  214.486396] EAX: c18736f8 EBX: 00000000 ECX: 00000001 EDX: 00000206
[  214.486403] ESI: f80c9040 EDI: ffffffea EBP: 00000000 ESP: c8e29ed8
[  214.486409] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010286
[  214.486417] CR0: 80050033 CR2: 00000000 CR3: 02aec280 CR4: 000006f0
[ 1239.898257] BUG: kernel NULL pointer dereference, address: 00000000
[ 1239.898273] #PF: supervisor read access in kernel mode
[ 1239.898282] #PF: error_code(0x0000) - not-present page
[ 1239.898288] *pdpt = 000000000b346001 *pde = 0000000000000000 
[ 1239.898301] Oops: 0000 [#3] SMP NOPTI
[ 1239.898312] CPU: 1 PID: 1727 Comm: fusermount Tainted: G      D           5.15.19-smp-W27 #3
[ 1239.898324] Hardware name: Gigabyte Technology Co., Ltd. GA-880GA-UD3H/GA-880GA-UD3H, BIOS F4 07/28/2010
[ 1239.898331] EIP: fuse_kill_sb_anon+0x67/0xa0 [fuse]
[ 1239.898355] Code: 0c 8d 57 08 8b 04 24 89 4d 04 89 29 89 57 08 89 57 0c 8b 16 39 f2 74 33 e8 b6 7f fc c8 89 d8 e8 2f 3f 10 c9 8b 9b 08 02 00 00 <8b> 03 e8 32 fc ff ff 89 d8 8b 74 24 08 8b 5c 24 04 8b 7c 24 0c 8b
[ 1239.898365] EAX: c18736f8 EBX: 00000000 ECX: 00000001 EDX: 00000206
[ 1239.898373] ESI: f80c9040 EDI: ffffffea EBP: 00000000 ESP: cb559ed8
[ 1239.898380] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010286
[ 1239.898388] CR0: 80050033 CR2: 00000000 CR3: 02aecfa0 CR4: 000006f0
[ 1239.898395] Call Trace:
[ 1239.898403]  ? deactivate_locked_super+0x2a/0x90
[ 1239.898416]  ? get_tree_nodev+0x87/0xa0
[ 1239.898425]  ? fuse_get_tree+0xa8/0x180 [fuse]
[ 1239.898444]  ? vfs_get_tree+0x20/0xa0
[ 1239.898452]  ? ns_capable+0x2d/0x50
[ 1239.898464]  ? path_mount+0x3ab/0x910
[ 1239.898475]  ? user_path_at_empty+0x45/0x60
[ 1239.898484]  ? __ia32_sys_mount+0x13a/0x1a0
[ 1239.898494]  ? do_int80_syscall_32+0x33/0x80
[ 1239.898505]  ? entry_INT80_32+0xed/0xed
[ 1239.898517] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device fuse ipv6 hid_generic usbhid hid rtl8192ee btcoexist rtl_pci radeon rtlwifi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi mac80211 drm_ttm_helper snd_hda_intel ttm r8169 snd_intel_dspcfg snd_hda_codec snd_hwdep drm_kms_helper snd_hda_core cfg80211 snd_pcm syscopyarea sysfillrect sysimgblt serio_raw snd_timer ohci_pci rfkill snd realtek soundcore fb_sys_fops ohci_hcd mdio_devres drm xhci_pci libphy ehci_pci drm_panel_orientation_quirks firewire_ohci i2c_algo_bit xhci_hcd ehci_hcd firewire_core i2c_piix4 evdev pata_acpi floppy loop
[ 1239.898643] CR2: 0000000000000000
[ 1239.898650] ---[ end trace 53bef2cf1a56c3e8 ]---
[ 1239.898656] EIP: fuse_kill_sb_anon+0x67/0xa0 [fuse]
[ 1239.898674] Code: 0c 8d 57 08 8b 04 24 89 4d 04 89 29 89 57 08 89 57 0c 8b 16 39 f2 74 33 e8 b6 7f fc c8 89 d8 e8 2f 3f 10 c9 8b 9b 08 02 00 00 <8b> 03 e8 32 fc ff ff 89 d8 8b 74 24 08 8b 5c 24 04 8b 7c 24 0c 8b
[ 1239.898683] EAX: c18736f8 EBX: 00000000 ECX: 00000001 EDX: 00000206
[ 1239.898689] ESI: f80c9040 EDI: ffffffea EBP: 00000000 ESP: c8e29ed8
[ 1239.898696] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010286
[ 1239.898704] CR0: 80050033 CR2: 00000000 CR3: 02aecfa0 CR4: 000006f0
[ 1239.901496] BUG: kernel NULL pointer dereference, address: 00000000
[ 1239.901505] #PF: supervisor read access in kernel mode
[ 1239.901509] #PF: error_code(0x0000) - not-present page
[ 1239.901512] *pdpt = 000000000b309001 *pde = 0000000000000000 
[ 1239.901518] Oops: 0000 [#4] SMP NOPTI
[ 1239.901523] CPU: 2 PID: 1728 Comm: fusermount Tainted: G      D           5.15.19-smp-W27 #3
[ 1239.901529] Hardware name: Gigabyte Technology Co., Ltd. GA-880GA-UD3H/GA-880GA-UD3H, BIOS F4 07/28/2010
[ 1239.901532] EIP: fuse_kill_sb_anon+0x67/0xa0 [fuse]
[ 1239.901543] Code: 0c 8d 57 08 8b 04 24 89 4d 04 89 29 89 57 08 89 57 0c 8b 16 39 f2 74 33 e8 b6 7f fc c8 89 d8 e8 2f 3f 10 c9 8b 9b 08 02 00 00 <8b> 03 e8 32 fc ff ff 89 d8 8b 74 24 08 8b 5c 24 04 8b 7c 24 0c 8b
[ 1239.901548] EAX: c18736f8 EBX: 00000000 ECX: 00000001 EDX: 00000206
[ 1239.901551] ESI: f80c9040 EDI: ffffffea EBP: 00000000 ESP: cb7eded8
[ 1239.901555] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010286
[ 1239.901558] CR0: 80050033 CR2: 00000000 CR3: 01ef0c60 CR4: 000006f0
[ 1239.901562] Call Trace:
[ 1239.901565]  ? deactivate_locked_super+0x2a/0x90
[ 1239.901572]  ? get_tree_nodev+0x87/0xa0
[ 1239.901576]  ? fuse_get_tree+0xa8/0x180 [fuse]
[ 1239.901585]  ? vfs_get_tree+0x20/0xa0
[ 1239.901588]  ? ns_capable+0x2d/0x50
[ 1239.901594]  ? path_mount+0x3ab/0x910
[ 1239.901599]  ? user_path_at_empty+0x45/0x60
[ 1239.901603]  ? __ia32_sys_mount+0x13a/0x1a0
[ 1239.901608]  ? do_int80_syscall_32+0x33/0x80
[ 1239.901614]  ? entry_INT80_32+0xed/0xed
[ 1239.901619] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device fuse ipv6 hid_generic usbhid hid rtl8192ee btcoexist rtl_pci radeon rtlwifi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi mac80211 drm_ttm_helper snd_hda_intel ttm r8169 snd_intel_dspcfg snd_hda_codec snd_hwdep drm_kms_helper snd_hda_core cfg80211 snd_pcm syscopyarea sysfillrect sysimgblt serio_raw snd_timer ohci_pci rfkill snd realtek soundcore fb_sys_fops ohci_hcd mdio_devres drm xhci_pci libphy ehci_pci drm_panel_orientation_quirks firewire_ohci i2c_algo_bit xhci_hcd ehci_hcd firewire_core i2c_piix4 evdev pata_acpi floppy loop
[ 1239.901679] CR2: 0000000000000000
[ 1239.901682] ---[ end trace 53bef2cf1a56c3e9 ]---
[ 1239.901685] EIP: fuse_kill_sb_anon+0x67/0xa0 [fuse]
[ 1239.901693] Code: 0c 8d 57 08 8b 04 24 89 4d 04 89 29 89 57 08 89 57 0c 8b 16 39 f2 74 33 e8 b6 7f fc c8 89 d8 e8 2f 3f 10 c9 8b 9b 08 02 00 00 <8b> 03 e8 32 fc ff ff 89 d8 8b 74 24 08 8b 5c 24 04 8b 7c 24 0c 8b
[ 1239.901698] EAX: c18736f8 EBX: 00000000 ECX: 00000001 EDX: 00000206
[ 1239.901701] ESI: f80c9040 EDI: ffffffea EBP: 00000000 ESP: c8e29ed8
[ 1239.901704] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010286
[ 1239.901707] CR0: 80050033 CR2: 00000000 CR3: 01ef0c60 CR4: 000006f0
[ 1248.338539] wlan0: authenticate with 04:95:e6:17:3c:78
[ 1248.349114] wlan0: send auth to 04:95:e6:17:3c:78 (try 1/3)
[ 1248.364682] wlan0: authenticated
[ 1248.366754] wlan0: associate with 04:95:e6:17:3c:78 (try 1/3)
[ 1248.385027] wlan0: RX AssocResp from 04:95:e6:17:3c:78 (capab=0x411 status=0 aid=6)
[ 1248.390454] wlan0: associated
[ 1248.428846] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[ 2127.369126] EXT4-fs (sda7): mounted filesystem with ordered data mode. Opts: (null). Quota mode: disabled.
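The four oopses in this excerpt all report the same faulting location, fuse_kill_sb_anon+0x67/0xa0, and every call trace passes through the mount path (path_mount, __ia32_sys_mount). For a long log, a short filter makes that kind of summary easier to see. This is a minimal sketch: oops_count and oops_sites are made-up helper names, and it assumes a 32-bit x86 log where the faulting location is printed on an "EIP:" line and that grep supports -o.

```shell
#!/bin/sh
# oops_count: read kernel log text on stdin and print how many oopses it holds.
oops_count() {
    grep -c 'Oops: '
}

# oops_sites: read kernel log text on stdin and print each distinct faulting
# location once (the function+offset that follows "EIP:").
oops_sites() {
    grep -o 'EIP: [^ ]*' | sort -u
}

# Example:
#   dmesg | oops_count
#   dmesg | oops_sites
```

A single line from oops_sites means every oops in the log hit the same instruction, which points at one repeatable failure rather than random memory corruption.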
I only have the EDAC for AMD mce registers enabled now.
I could disable the EDAC entirely.

Either there is something wrong with the kernel compiling, or there is a kernel BUG in EDAC.
I don't know how I could configure a kernel using menuconfig and get this.

I am not looking for someone to tell me "just use that" to avoid the issue. I can get a kernel working.
This is about, what the hell is this, and what is going on.

There is the consideration that I also enabled Video4Linux in the same kernel compile. But that has not been causing problems. The webcam has not even been plugged into the USB for these EDAC tests.

Does anyone recognize anything?

Do I keep fiddling with the kernel options, or upgrade the kernel to a newer version?
Did not want to be changing too much at one time.

Last edited by selfprogrammed; 08-08-2023 at 06:17 AM.
 
Old 08-09-2023, 12:28 PM   #2
selfprogrammed
Member
 
Registered: Jan 2010
Location: Minnesota, USA
Distribution: Slackware 13.37, 14.2, 15.0
Posts: 638

Original Poster
Rep: Reputation: 155
This is going to be a lesson on paying attention to error messages that seem to be unrelated.
We have too many tools and situations where the developers have been lazy and left "normal" error messages, that you are expected to not get too concerned about.
I have been trying to bring up this Slack15 for 9 months or more, and have seen endless error messages, that you are supposed to ignore.

To test out if updating kernels might affect this problem, or changing kernel options again, I installed the 5.15.117 kernel source, and recompiled it with the same kernel options I have been using.
I also applied the latest Slack15 patches.
Installed the 5.15.117 kernel into the LILO boot.

The 5.15.117 kernel would not show up in the LILO boot.
There WAS an error line from LILO about an MSDOS partition that I had not got around to fixing. I was too busy trying to fix other problems to deal with that now.

Now that I thought about it, maybe I should deal with that. I commented out the parts of the lilo config for those partitions.
Ran LILO, and now the new kernel shows up in the LILO choices.

This would mean that LILO has not been writing a new boot block for however many of the last kernel changes I did.
All the kernels on this system, including the old Linux 4.4, have been booting well enough that this situation was almost unnoticeable.
This includes compiling changes to one kernel and installing it and booting it.
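A wrapper that refuses to let a failed boot loader run pass quietly would have caught this earlier. The following is a rough sketch only: checked_install is a made-up name, and it assumes the installer signals an unrecoverable problem either with a non-zero exit status or with an output line beginning "Fatal:". Whether the MSDOS partition message seen here took that form is not known.

```shell
#!/bin/sh
# checked_install CMD [ARGS...]
# Hypothetical wrapper: run a boot loader installer, show its output, and
# return non-zero if the command fails or prints a line starting with "Fatal:".
# Assumption: the installer reports unrecoverable problems in one of those ways.
checked_install() {
    if out=$("$@" 2>&1); then status=0; else status=$?; fi
    if [ -n "$out" ]; then
        printf '%s\n' "$out"
    fi
    if [ "$status" -ne 0 ] || printf '%s\n' "$out" | grep -q '^Fatal:'; then
        echo "boot loader was NOT updated; fix the error above and rerun" >&2
        return 1
    fi
    echo "boot loader updated"
}

# Example:
#   checked_install lilo -v
```

Running lilo with -t first performs a test pass without writing a new boot sector or map file, which is a cheap way to see the errors before committing.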

What is strange is that instead of crashing in some spectacular way, it only manifested seemingly unrelated small problems.
 
Old 08-09-2023, 03:12 PM   #3
onebuck
Moderator
 
Registered: Jan 2005
Location: Central Florida 20 minutes from Disney World
Distribution: Slackware®
Posts: 13,927
Blog Entries: 45

Rep: Reputation: 3159
Moderator Response

Moved: This thread is more suitable in <Slackware> and has been moved accordingly to help your thread/question get the exposure it deserves.
 
Old 08-10-2023, 12:43 AM   #4
henca
Senior Member
 
Registered: Aug 2007
Location: Linköping, Sweden
Distribution: Slackware
Posts: 1,012

Rep: Reputation: 678
Quote:
Originally Posted by selfprogrammed
This would mean that LILO has not been writing a new boot block for however many of the last kernel changes I did.
After rebooting with an updated kernel it might be a good idea to do:

Code:
cat /proc/version
...just to see that you get a version and time stamp that you would expect.
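The same check can be scripted so that a mismatch is hard to overlook. This is a minimal sketch: kernel_is is a made-up helper name, and the expected string is whatever release or build stamp you noted when compiling.

```shell
#!/bin/sh
# kernel_is EXPECTED [VERSION_FILE]
# Hypothetical helper: succeed only if the running kernel's version line
# contains EXPECTED (for example a release string or a build timestamp).
kernel_is() {
    grep -qF -- "$1" "${2:-/proc/version}"
}

# Example, using the build stamp from the dmesg excerpt in the first post:
#   kernel_is '#3 SMP Wed Apr 5 11:14:12 CDT 2023' || echo "not the kernel you just built"
```

Matching on the build number and timestamp rather than only the release string distinguishes two rebuilds of the same kernel version.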

regards Henrik
 
Old 08-12-2023, 03:41 PM   #5
selfprogrammed
Member
 
Registered: Jan 2010
Location: Minnesota, USA
Distribution: Slackware 13.37, 14.2, 15.0
Posts: 638

Original Poster
Rep: Reputation: 155
Oh yes, I was checking several ways that the changes to the kernel that I had made were there.
It was booting, almost perfectly normal.
It was unlikely for the previous kernel to survive as it was overwritten in the filesystem by the new kernel.
On my system, there is a /boot on a separate small partition, to isolate that from daily work.
So after compiling the kernel, I would mount the boot partition and copy the kernel files there. The modules are allowed to go to /lib/modules normally, as they are not sensitive to being moved.
Because this was a fix of an existing kernel, and I already had 3 other bootable kernels, I just overwrote that particular kernel.
Then I would run lilo.
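One cheap sanity check for that sequence is to compare timestamps: after a successful lilo run, the map file it writes should be newer than the kernel image that was just copied in. The sketch below assumes the map file is at LILO's default location, /boot/map, which may not hold for a lilo.conf that sets a different path, and stale_map is a made-up helper name.

```shell
#!/bin/sh
# stale_map IMAGE MAP
# Hypothetical helper: succeed (exit 0) if the kernel IMAGE has been modified
# more recently than the boot loader MAP file, i.e. the boot loader has not
# been rewritten successfully since the image was installed.
stale_map() {
    [ -n "$(find "$1" -prune -newer "$2")" ]
}

# Example (paths are assumptions; adjust to the actual boot partition):
#   stale_map /boot/vmlinuz /boot/map && echo "lilo has not been rerun since this kernel was copied"
```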

Upon booting, there was no change to the lilo menu because I had NOT changed lilo, only changed the kernel binary.
Upon booting the kernel that was re-compiled, the differences were visible in dmesg and other places.

I find it difficult to actually believe that the binary for the recompiled kernel was landing in that boot partition in such a position that it would actually boot using the old lilo boot record.
But, at the moment, that is what it looks like happened. But, I still do not believe it.

The only thing that I have to go on is that error message from LILO when it reached the last two entries, which were for the two DOS partitions that had not been fixed since the hard drive was replaced. The original lilo.conf and LILO boot record must have been written from the installation. At some time I must have updated it to use the lilo.conf on the boot partition. The old system and this new system both have to have the same LILO setup so the same LILO boot could be written from either. For that reason the lilo.conf actually resides on the boot partition, so there is only one copy. This is what happens when you immediately have 20 problems to solve and have to pick which one to fix first.
Makes me wonder now how I got LILO to work with the new drive in the first place. I would have had to write the LILO boot from a rescue boot.

It had turned out that it was not the hard drive that had failed, but the IDE interface. I ended up buying a new motherboard and then reconfiguring everything for the new motherboard, which was now a quad-cpu. Another situation of 20 things to fix immediately, pick one to fix first. Spent weeks going over the old drive trying to rescue data.
That new drive got partitioned at least 4 times over 8 months, trying to get it organized for all the unpredictable future needs. It has 6 bootable partitions for Linux now, 14 Linux data partitions, 4 foreign partitions, and swap. With all those partition changes, that LILO boot must have been getting written from some temporary lilo.conf, is all I can figure.

Last edited by selfprogrammed; 08-12-2023 at 04:13 PM.
 
  

