LinuxQuestions.org

Gigabyte Ethernet RTL8168 broken on new Kernel releases (https://www.linuxquestions.org/questions/slackware-14/gigabyte-ethernet-rtl8168-broken-on-new-kernel-releases-4175734884/)

the3dfxdude 03-15-2024 10:09 AM

Don't misunderstand. The probing change fixed some boards and broke his. We want to keep the probing change for the other boards. The problem is that the kernel dev doesn't want to address the fact that they broke his chip completely. They fixed some and broke some, so it is still a regression, and they should handle it properly instead. Going back to my earlier story about a broken BIOS and the kernel devs helping to fix that: they introduced a switch, because there is always a chance of getting probing wrong. I ended up not using the switch because I didn't need it (I found the BIOS problem and got a BIOS fix during the troubleshooting). Others still needed the switch because there was no BIOS fix. So the fix should be to add a switch that lets selfprogrammed set the phy_id for his chip.
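
To make the switch idea concrete, here is a minimal Python sketch. It is a model only, not kernel code. It assumes probing boils down to comparing a masked PHY ID against a table of known IDs; `KNOWN_PHYS`, `match_phy`, `select_phy` and `forced_id` are made-up names, and nothing in this thread says such an option exists in r8169 today.

```python
# Illustrative model of mask-based PHY ID matching with a manual override.
# The table and the override are assumptions for this sketch, not r8169 code.

EXACT_MASK = 0xFFFFFFFF  # every bit of the ID must match

# Entries are (phy_id, mask, label). 0x001cc800 is the ID reported in this
# thread; the label is invented for the example.
KNOWN_PHYS = [
    (0x001CC800, EXACT_MASK, "realtek-generic (hypothetical label)"),
]


def match_phy(phy_id, table=KNOWN_PHYS):
    """Return the label of the first entry whose masked ID equals phy_id's."""
    for known_id, mask, label in table:
        if (phy_id & mask) == (known_id & mask):
            return label
    return None


def select_phy(probed_id, forced_id=None, table=KNOWN_PHYS):
    """Prefer a user-forced ID (the proposed switch) over the probed one."""
    effective_id = probed_id if forced_id is None else forced_id
    return effective_id, match_phy(effective_id, table)
```

With an ID the table does not know, `select_phy(0x12345678)` finds no match, while `select_phy(0x12345678, forced_id=0x001CC800)` falls back to the entry the user asked for. Boards that already probe correctly never pass `forced_id`, so their behaviour is unchanged.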

elcore 03-15-2024 10:42 AM

The default r8169 works fine here, on all 5.15.x kernels, with all 6 bios upgrade images from gigabyte.
Mine is (0x001cc800) and I never had to patch the thing, it does use libphy unlike the older realtek chip but works okay.
@OP This patch should be done by kernel maintainer, simply to avoid breaking unaffected realtek chips in the process.
Also, please note which "new kernel versions" are affected, which old kernel versions are not, and which realtek chip is broken etc.
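
A small Python sketch of how that information could be collected for a report. It assumes the usual sysfs layout, where `/sys/class/net/<iface>/device/driver` is a symlink to the bound driver; the interface name `eth0` and the function name `nic_report` are placeholders, not anything the OP posted.

```python
import os
import platform


def nic_report(iface, sysfs_root="/sys/class/net"):
    """Collect the kernel release and the driver bound to a network interface."""
    driver_link = os.path.join(sysfs_root, iface, "device", "driver")
    driver = None
    if os.path.exists(driver_link):
        # The entry is a symlink; its last path component is the driver name.
        driver = os.path.basename(os.path.realpath(driver_link))
    return {
        "kernel": platform.release(),
        "interface": iface,
        "driver": driver,
    }
```

Running `print(nic_report("eth0"))` on a working and on a broken kernel gives two lines that can be pasted straight into a bug report. A `driver` of `None` means no driver is bound to the interface, or the interface does not exist.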

hazel 03-15-2024 10:54 AM

Quote:

Originally Posted by elcore (Post 6489907)
The default r8169 works fine here, on all 5.15.x kernels, with all 6 bios upgrade images from gigabyte.
Mine is (0x001cc800)

Same ID for me too and it works. My BIOS/UEFI is probably the original version, 'cos I never upgraded it.

elcore 03-15-2024 11:06 AM

Quote:

Originally Posted by hazel (Post 6489910)
Same ID for me too and it works. My BIOS/UEFI is probably the original version, 'cos I never upgraded it.

Okay hazel, it's probably fine to keep the old bios if nothing seems obviously broken.
I upgrade mine because gigabyte often say the AMD vulnerability has been fixed, i.e. they're shipping AMD microcode in these images.

nons1 03-15-2024 03:38 PM

Quote:

Originally Posted by the3dfxdude (Post 6489901)
Don't misunderstand. The probing change fixed some boards and broke his. We want to keep the probing change for the other boards. The problem is that the kernel dev doesn't want to address the fact that they broke his chip completely. They fixed some and broke some, so it is still a regression, and they should handle it properly instead. Going back to my earlier story about a broken BIOS and the kernel devs helping to fix that: they introduced a switch, because there is always a chance of getting probing wrong. I ended up not using the switch because I didn't need it (I found the BIOS problem and got a BIOS fix during the troubleshooting). Others still needed the switch because there was no BIOS fix. So the fix should be to add a switch that lets selfprogrammed set the phy_id for his chip.

Switching to phylib isn't just about probing. It's a major architectural change in the driver, separating handling of MAC and PHY layer.
Therefore the switch idea is appealing, but not that easy to implement.

Coming back to the OP, what we still don't know:
- What is the exact board type, revision, and BIOS version?
- Did he upgrade the BIOS to the latest one?
- Did he try setting BIOS option "NETWORK BOOT ROM"?

rkelsen 03-16-2024 09:09 PM

Quote:

Originally Posted by selfprogrammed (Post 6489583)
Can the Slackware release of the kernel include a patch so those of us with Gigabyte motherboards can have working Ethernet.

Speaking for myself, I'd rather that Patrick didn't patch the kernel.
Quote:

Originally Posted by selfprogrammed (Post 6489583)
It cost me 8 hours of trying to figure out why the Ethernet was not working.

Quote:

Originally Posted by hazel (Post 6489709)
I'm puzzled. I'm using kernel 5.15.145 with that driver and my ethernet works OK.

Yes, I've also found that Realtek NIC support can be flaky. Same chip, same kernel, different motherboard... different results!

The easiest solution is to download the source code from RealTek & re-compile it whenever you update your kernel:

https://www.realtek.com/en/component...press-software

Keep it under /usr/src with the NVidia driver & Virtualbox module sources... it's only a couple of hundred Kb and a few seconds to re-compile.
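
Since the re-compile is easy to forget after a kernel update, a check along these lines might help. This is a rough Python sketch, assuming the vendor module is named `r8168` and is installed somewhere under `/lib/modules/<release>`; `find_module` is a made-up helper name.

```python
import os

# Module files may be stored uncompressed or compressed.
MODULE_SUFFIXES = (".ko", ".ko.gz", ".ko.xz", ".ko.zst")


def find_module(name, release, modules_root="/lib/modules"):
    """Return paths of module files called `name` installed for `release`."""
    hits = []
    for dirpath, _dirs, files in os.walk(os.path.join(modules_root, release)):
        for fname in files:
            if any(fname == name + suffix for suffix in MODULE_SUFFIXES):
                hits.append(os.path.join(dirpath, fname))
    return sorted(hits)
```

Calling `find_module("r8168", os.uname().release)` returns an empty list when the module has not been built for the running kernel, which is the cue to re-compile.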

henca 03-17-2024 06:17 AM

Quote:

Originally Posted by rkelsen (Post 6490148)
The easiest solution is to download the source code from RealTek & re-compile it whenever you update your kernel

An even easier solution, at the cost of some money, might be to buy a NIC, maybe with an intel e1000e chipset and put it in a free PCI slot. The choice of solution depends upon how much value you put in your time. It is easy to buy hardware that works fine with Linux, but some hardware could be really tricky to get to work with Linux.

regards Henrik

business_kid 03-17-2024 07:22 AM

Personally, as an ex-hardware guy, I have never seen a Realtek 8111/8211/8168/8411 IC. All of those devices I have owned (3-5) have been IP cores and a small part of a larger chip. If the Gigabyte board he has is arranged that way, I'd like to know what the chip is. It's probably the large SoC on the motherboard.

Unless the board is brand new, I'm amazed this issue hasn't surfaced before, as Gigabyte boards are common enough.

mw.decavia 03-17-2024 12:11 PM

I have a Startech pci ethernet card (st1000bt32) with a realtek rtl8110sc chip for both mac and phy. I can't upload a picture right now. It is in a desktop system I have in storage. Sometimes it identified to the OS as a 8111.

I think realtek is a good choice. I think the Intel e1000 gives the possibility of strangers wanting to remote manage your system via AMT. And Broadcom nics have IPMI to do the same.

Quote:

Originally Posted by business_kid (Post 6490213)
Personally, as an ex-hardware guy, I have never seen a Realtek 8111/8211/8168/8411 IC. All of those devices I have owned (3-5) have been IP cores and a small part of a larger chip. If the Gigabyte board he has is arranged that way, I'd like to know what the chip is. It's probably the large SoC on the motherboard.

Unless the board is brand new, I'm amazed this issue hasn't surfaced before, as Gigabyte boards are common enough.


business_kid 03-17-2024 01:07 PM

Quote:

Originally Posted by mw.decavia
I have a Startech pci ethernet card (st1000bt32) with a realtek rtl8110sc chip for both mac and phy. I can't upload a picture right now. It is in a desktop system I have in storage. Sometimes it identified to the OS as a 8111.

And I presume, the r8169.ko module works for you??

Whoever controls device numbering at Realtek clearly needs a long period in drug rehab :D.

henca 03-17-2024 01:25 PM

Quote:

Originally Posted by mw.decavia (Post 6490241)
I think the Intel e1000 gives the possibility of strangers wanting to remote manage your system via AMT. And Broadcom nics have IPMI to do the same.

That kind of functionality is (if present) part of the motherboard; no separate NICs that I am aware of add such functionality.

But I have seen some 10 Gb/s Intel NICs which, with default settings, add their own LLDP messages to the network, messing things up when the operating system wants to send its own LLDP messages.

regards Henrik

mw.decavia 03-18-2024 06:28 PM

Quote:

Originally Posted by business_kid (Post 6490248)
And I presume, the r8169.ko module works for you??

Whoever controls device numbering at Realtek clearly needs a long period in drug rehab :D.

On the Startech st1000bt32, the r8169.ko module worked if the card's boot rom was enabled. Even if you had it set to not actually boot. Apparently the boot rom initialized card-specific registers so that the phy could be identified. Disabling the rom prevented the linux driver from using the phy.

On my diy router sff pc, there is a rtl8168 rev.1 which always works perfectly with the r8169.ko module. In theory I can open it to look at the chip, but that is a complicated job to close it all back up again correctly.

For my laptop, I have a rtl8168 rev.6 expresscard. It has a lot of problems, and is getting worse. But I don't know if the failure is with the card or with the slot. The only way to tell what chip markings are inside would be to open it up and ruin it altogether.

I personally would not criticize Realtek, to design even one chip and bring it to market is something I could never do. I would honor them for doing so much.

business_kid 03-19-2024 08:09 AM

As I said, I never saw the chip, but I have numerous copies of the IP core, which works well. My suspicion is that somebody split the thing, and added their own PHY or something, and that's the issue. It would save a few cents, but has added trouble.

selfprogrammed 03-20-2024 05:49 AM

I have hand patched kernels 5.15.19 and 5.15.117. I am currently running 5.15.117.
This is a GA-880GA-UD3H motherboard, which I used to replace the previous motherboard when its hard drive interface failed. This is a full size tower (5 bay + 2 bay + 3 hd) that I built from selected parts that I ordered.
The internet worked using this hardware, with no patches needed, on 4.4 kernels.

The lspci output comes from another user, who submitted the kernel bug report.


The driver change broke the driver for existing hardware, as mentioned in the existing kernel bug report for this problem.
The change added a list of accepted PHY codes, which the previous 4.4 driver did not have.
Once that driver is patched to recognize the PHY, the driver works again. I just used it to configure another piece of hardware over Ethernet. Now, I will not likely use it again for months.
I have no need to throw money or hardware at this problem. Once I debug the problem enough to "re-discover" the patch I have sitting in the /usr/src directory, it does not cost much extra time.
The pain is in the Slackware kernel upgrade breaking it again, and that it may be months before I discover that the Ethernet is broken, again.
I am getting too old to remember this among all the other things that constantly need fixing.
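
One way to avoid discovering the breakage months later would be a check that runs at every boot. Below is a minimal Python sketch, assuming the standard sysfs entries under `/sys/class/net`; the function name `link_status` and the interface name `eth0` are placeholders, and which of these symptoms the failed PHY probe actually produces is not stated in this thread.

```python
import os


def link_status(iface, sysfs_root="/sys/class/net"):
    """Return 'missing', 'no-driver', or the interface's operstate string."""
    base = os.path.join(sysfs_root, iface)
    if not os.path.isdir(base):
        # The kernel never created the interface.
        return "missing"
    if not os.path.exists(os.path.join(base, "device", "driver")):
        # The interface exists but no driver is bound to the device.
        return "no-driver"
    with open(os.path.join(base, "operstate")) as fh:
        return fh.read().strip()
```

Called from something like `/etc/rc.d/rc.local`, a result of `missing` or `no-driver` from `link_status("eth0")` after a kernel upgrade would be the reminder that the patch in /usr/src needs to be applied again.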

Most traffic goes through a WIFI internet board. The Ethernet is necessary for Lab work, modems, and anything that needs a hard connection for setup.


Just how do you go about escalating these things? I seem not to have the right connections, as I always have trouble getting an account set up just to access the bug report system.

The kernel has many fixes for hardware, where the user selects the kernel option that supports their hardware.
There would be no problem if there was a kernel option that selected a patch to this driver.
It is not necessary for this patch to be in the pre-built Slackware kernels.

Thank you for all the comments.

I think that the kernel should handle configuring, in whatever way it needs to, for any hardware it can reasonably handle.
BIOS has always been suspect and has been considered buggy in the past, more than a few times.
It is possible to recognize the situation, or ask the user to enable a specific alternative behavior.
I use Slackware because it is best for making any hardware that I may have work. I wish to keep it that way.

Changing to other hardware is not acceptable, for many reasons, and I see no reason to even consider that as an answer for the kernel problem as stated.
Trying to avoid this just for my hardware, is not the stated problem.

I am not comfortable with upgrading the BIOS for many reasons. That is a good way to make your motherboard unusable, and this machine is critical. It is so critical that I must always maintain the means to back out of any hardware or software change. Every time I upgrade something, especially hardware, my usage of this machine is compromised. The kernel upgrades have been good at breaking things that I have been using for years and cannot easily stop using. I have been unable to keep up with the broken software caused by kernel upgrades. Even after months of work trying to restore working utilities, I am realizing that I am losing capability with every kernel upgrade. I wish to hold on to an existing kernel capability, rather than have to launch into another round of new hardware, new software drivers, new setup, and whatever else might happen down the line (because something is always incompatible with the change).

business_kid 03-20-2024 07:37 AM

Well, if the kernel bug report has gone in, why not make sure your missing phy is available to add to it? Then make a mental note to stick to current, or compile your own more recent kernel?


All times are GMT -5.