The solution
Preface
I noticed a problem on a ThinkPad T60 with an e1000 Ethernet card, running Slackware 12.1 with the generic 2.6.24.5-smp kernel. Sometimes the system recognized the eth0 interface, but sometimes it displayed errors like the following...
# ifconfig eth0 192.168.1.3 broadcast 192.168.1.255 netmask 255.255.255.0
SIOCSIADDR: No such device
eth0: ERROR while getting interface flags: No such device
SIOCSIBRDADDR: No such device
eth0: ERROR while getting interface flags: No such device
SIOCSIFNETMASK: No such device
# ifconfig eth0
eth0: error fetching interface information: Device not found
# ifconfig
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:56 errors:0 dropped:0 overruns:0 frame:0
TX packets:56 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2960 (2.8 KiB) TX bytes:2960 (2.8 KiB)
The problem appeared only after some reboots of the machine. In other words, it was intermittent: sometimes it occurred and sometimes it didn't...
Investigations
I waited until the problem appeared again and then started to check various system information. I found something interesting in dmesg...
# dmesg | grep e1000
e1000: 0000:02:00.0: e1000_probe: The EEPROM Checksum Is Not Valid
ACPI: PCI interrupt for device 0000:02:00.0 disabled
e1000: probe of 0000:02:00.0 failed with error -5
I googled the phrase ``The EEPROM Checksum Is Not Valid'' and found
http://www.thinkwiki.org/wiki/Proble...m_Is_Not_Valid, which has some interesting information:
Quote:
On certain ThinkPads, e1000 driver for Intel Gigabit controller fails to load with the following error message in /var/log/messages:
e1000: 0000:02:00.0: e1000_probe: The EEPROM Checksum Is Not Valid
e1000: probe of 0000:02:00.0 failed with error -5
The problem is caused by a power savings feature obstructing normal operation, and causes the first bytes read from the EEPROM to be corrupt, resulting in a random or invalid MAC address (but no other data corruption). The EEPROM checksum test traps the problem and the driver refuses to load.
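To make the checksum test in the quote more concrete: as far as I know, the e1000 driver adds up the first 64 16-bit EEPROM words (the last one being the checksum word) and expects the low 16 bits of the sum to be 0xBABA. Below is a minimal sketch of that arithmetic, assuming input in the `ethtool -e` dump format. The function name `eeprom_word_sum` and the sample rows are made up for illustration; they are not real EEPROM contents.

```shell
#!/bin/sh
# Sum the 16-bit little-endian words of an `ethtool -e` style dump read from
# stdin and print the low 16 bits of the sum.
# Hypothetical helper, not part of ethtool or the driver.
eeprom_word_sum() (
    sum=0
    while read -r offset bytes; do
        case $offset in
            0x*) ;;          # data row, e.g. "0x0010" or "0x0010:"
            *) continue ;;   # header or separator line
        esac
        set -- $bytes
        while [ $# -ge 2 ]; do
            # Bytes are dumped low byte first, so the word is <high><low>.
            sum=$(( ($sum + 0x$2$1) & 0xFFFF ))
            shift 2
        done
    done
    printf '0x%04x\n' "$sum"
)

# Made-up sample rows whose words add up to 0xbaba.
printf '%s\n' 'Offset Values' '------ ------' \
    '0x0000 01 00 02 00' \
    '0x0010 b7 ba 00 00' | eeprom_word_sum

# Real use (needs a working interface and an unmasked dump):
#   ethtool -e eth0 | eeprom_word_sum
```

A dump with the MAC bytes masked as ``xx'' cannot be fed to this function, because ``xx'' is not a hex value.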
That page lists four possible solutions to the problem. I looked into three of them...
Unsuccessful attempt (1)
The ``From Lenovo'' section mentions Lenovo's
vidalia-eeprom-mod-script, available at
http://www-307.ibm.com/pc/support/si...cid=MIGR-67166. It may be harmless, but so far I haven't dared to run it.
I only tried one safe, read-only command from that script...
When the eth0 interface doesn't work, that command just reports an error.
# ethtool -e eth0
Cannot get driver information: No such device
When the eth0 interface works, the same command dumps the EEPROM contents.
# ethtool -e eth0
Offset Values
------ ------
0x0000 xx xx xx xx xx xx 30 0b b2 ff 51 00 ff ff ff ff
0x0010 53 00 03 01 6b 02 01 20 aa 17 9a 10 86 80 df 80
0x0020 00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 27
0x0030 c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f
0x0040 08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff
0x0050 14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00
0x0060 00 01 00 40 32 12 07 40 ff ff ff ff ff ff ff ff
0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 8f a2
(I masked the MAC address of my Ethernet card with ``xx''.)
For your convenience, here is that script...
vidalia-eeprom-mod-script:
Code:
#!/bin/bash

# Require the interface name as the only argument.
if [ -z "$1" ]; then
    echo "Usage: $0 \<interface\>"
    echo " i.e. $0 eth0"
    exit 1
fi

# Give up if the interface does not exist.
if ! ifconfig $1 > /dev/null; then
    exit 1
fi

# Build the ethtool "magic" value from the device and vendor ID bytes
# stored in EEPROM row 0x0010.
dev=$(ethtool -e $1 | grep 0x0010 | awk '{print "0x"$13$12$15$14}')

case $dev in
    0x109a8086)
        echo "$1: is a \"82573L Gigabit Ethernet Controller\""
        ;;
    *)
        echo "No appropriate hardware found for this fixup"
        exit 1
        ;;
esac

echo "This fixup is applicable to your hardware"

# Read the byte at offset 0x2f (last byte of row 0x0020) and remap its
# first hex digit.
var=$(ethtool -e $1 | grep 0x0020 | awk '{print $17}')
new1=$(echo ${var:0:1} | tr '2367abef' '014589cd')
new2=$(echo ${var:1})
new=$new1$new2

# Nothing to do if the byte is unchanged by the remapping.
if [ ${var:0:1}${var:1} == $new ]; then
    echo "Your eeprom is up to date, no changes were made"
    exit 2
fi

# Write the modified byte back to the EEPROM.
echo "executing command: ethtool -E $1 magic $dev offset 0x2f value 0x$new"
ethtool -E $1 magic $dev offset 0x2f value 0x$new

echo "Please reboot for the change to take effect."
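To see what the script would actually write without touching the EEPROM, the digit remapping can be tried on its own. This is only a sketch of the arithmetic: the `tr` mapping is copied from the script, while the helper name `fix_byte` is my own. Every pair in the mapping (2 to 0, 3 to 1, 6 to 4 and so on) clears the same bit of the first hex digit, which is bit 5 (0x20) of the byte at offset 0x2f.

```shell
#!/bin/sh
# Apply the script's digit remapping to one hex byte given as two characters.
# Hypothetical helper; it only prints the result and never calls ethtool.
fix_byte() (
    high=$(echo "$1" | cut -c1 | tr '2367abef' '014589cd')
    low=$(echo "$1" | cut -c2-)
    printf '%s%s\n' "$high" "$low"
)

fix_byte 27   # the byte at offset 0x2f in the dump shown earlier
fix_byte 07   # a byte that already has the bit cleared stays unchanged
```

For the dump shown earlier, the byte at offset 0x2f is 27, so the script would try to write 07 there.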
After the weekend I will phone Lenovo support and ask whether that script is safe to run on my machine.
Unsuccessful attempt (2)
First I tried the ``Via module parameter'' method...
# echo "options e1000 eeprom_bad_csum_allow=1" > /etc/modprobe.d/eth0
# modprobe e1000
FATAL: Error inserting e1000 (/lib/modules/2.6.24.5-smp/kernel/drivers/net/e1000/e1000.ko): Unknown symbol in module, or unknown parameter (see dmesg)
# dmesg | grep e1000
e1000: Unknown parameter `eeprom_bad_csum_allow'
# rm /etc/modprobe.d/eth0
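In hindsight, it is possible to check whether a module knows a parameter before writing anything to /etc/modprobe.d. The sketch below assumes the `name:description` line format printed by `modinfo -p`. The helper name `has_module_param` and the two sample lines are only illustrative.

```shell
#!/bin/sh
# Succeed if parameter $1 is listed in `modinfo -p` style output on stdin.
# Hypothetical helper name.
has_module_param() {
    grep -q "^$1:"
}

# Illustrative sample of `modinfo -p` output (not captured from my machine).
sample='TxDescriptors:Number of transmit descriptors (array of int)
Speed:Speed setting (array of int)'

if printf '%s\n' "$sample" | has_module_param eeprom_bad_csum_allow; then
    echo "parameter supported"
else
    echo "parameter not supported by this module build"
fi

# Real use:
#   modinfo -p e1000 | has_module_param eeprom_bad_csum_allow
```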
Successful attempt
Then I tried the ``Use e1000e -- Kernel Patch'' method...
That page gives two links to Auke Kok's patches for the e1000 and e1000e kernel modules:
http://kerneltrap.org/mailarchive/li...7/10/31/374579 and
http://kerneltrap.org/mailarchive/li...7/10/31/374573.
I saved these patches as files patch1 and patch2. The file paths in both patches carry a/ and b/ prefixes, which don't match the paths under /usr/src/linux when the patches are applied with -p0, so I stripped them:
# cat patch1 | sed 's/[ab]\/drivers\/net/drivers\/net/g' > patch_e1000-e1000e
# cat patch2 | sed 's/[ab]\/drivers\/net/drivers\/net/g' > patch_e1000e
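To illustrate what that sed expression does, here it is applied to a few typical diff header lines. The sample lines are written in the usual git diff form and the function name `strip_ab_prefix` is only for this demonstration.

```shell
#!/bin/sh
# Remove the a/ and b/ prefixes in front of drivers/net, as in the commands above.
strip_ab_prefix() {
    sed 's/[ab]\/drivers\/net/drivers\/net/g'
}

# Sample header lines in the form git produces them.
printf '%s\n' \
    'diff --git a/drivers/net/e1000e/82571.c b/drivers/net/e1000e/82571.c' \
    '--- a/drivers/net/e1000e/82571.c' \
    '+++ b/drivers/net/e1000e/82571.c' | strip_ab_prefix
```

Leaving the patches unmodified and applying them with `patch -p1`, which strips the first path component, would presumably have had the same effect, but I did not try that.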
For your convenience, here are both modified patches...
patch_e1000-e1000e:
Code:
diff --git drivers/net/e1000/e1000_main.c drivers/net/e1000/e1000_main.c
index 72deff0..d1b88e4 100644
--- drivers/net/e1000/e1000_main.c
+++ drivers/net/e1000/e1000_main.c
@@ -73,14 +73,6 @@ static struct pci_device_id e1000_pci_tbl[] = {
INTEL_E1000_ETHERNET_DEVICE(0x1026),
INTEL_E1000_ETHERNET_DEVICE(0x1027),
INTEL_E1000_ETHERNET_DEVICE(0x1028),
- INTEL_E1000_ETHERNET_DEVICE(0x1049),
- INTEL_E1000_ETHERNET_DEVICE(0x104A),
- INTEL_E1000_ETHERNET_DEVICE(0x104B),
- INTEL_E1000_ETHERNET_DEVICE(0x104C),
- INTEL_E1000_ETHERNET_DEVICE(0x104D),
- INTEL_E1000_ETHERNET_DEVICE(0x105E),
- INTEL_E1000_ETHERNET_DEVICE(0x105F),
- INTEL_E1000_ETHERNET_DEVICE(0x1060),
INTEL_E1000_ETHERNET_DEVICE(0x1075),
INTEL_E1000_ETHERNET_DEVICE(0x1076),
INTEL_E1000_ETHERNET_DEVICE(0x1077),
@@ -89,28 +81,9 @@ static struct pci_device_id e1000_pci_tbl[] = {
INTEL_E1000_ETHERNET_DEVICE(0x107A),
INTEL_E1000_ETHERNET_DEVICE(0x107B),
INTEL_E1000_ETHERNET_DEVICE(0x107C),
- INTEL_E1000_ETHERNET_DEVICE(0x107D),
- INTEL_E1000_ETHERNET_DEVICE(0x107E),
- INTEL_E1000_ETHERNET_DEVICE(0x107F),
INTEL_E1000_ETHERNET_DEVICE(0x108A),
- INTEL_E1000_ETHERNET_DEVICE(0x108B),
- INTEL_E1000_ETHERNET_DEVICE(0x108C),
- INTEL_E1000_ETHERNET_DEVICE(0x1096),
- INTEL_E1000_ETHERNET_DEVICE(0x1098),
INTEL_E1000_ETHERNET_DEVICE(0x1099),
- INTEL_E1000_ETHERNET_DEVICE(0x109A),
- INTEL_E1000_ETHERNET_DEVICE(0x10A4),
- INTEL_E1000_ETHERNET_DEVICE(0x10A5),
INTEL_E1000_ETHERNET_DEVICE(0x10B5),
- INTEL_E1000_ETHERNET_DEVICE(0x10B9),
- INTEL_E1000_ETHERNET_DEVICE(0x10BA),
- INTEL_E1000_ETHERNET_DEVICE(0x10BB),
- INTEL_E1000_ETHERNET_DEVICE(0x10BC),
- INTEL_E1000_ETHERNET_DEVICE(0x10C4),
- INTEL_E1000_ETHERNET_DEVICE(0x10C5),
- INTEL_E1000_ETHERNET_DEVICE(0x10D5),
- INTEL_E1000_ETHERNET_DEVICE(0x10D9),
- INTEL_E1000_ETHERNET_DEVICE(0x10DA),
/* required last entry */
{0,}
};
diff --git drivers/net/e1000e/82571.c drivers/net/e1000e/82571.c
index 45f5ee2..3beace5 100644
--- drivers/net/e1000e/82571.c
+++ drivers/net/e1000e/82571.c
@@ -194,6 +194,8 @@ static s32 e1000_init_mac_params_82571(struct e1000_adapter *adapter)
break;
case E1000_DEV_ID_82571EB_SERDES:
case E1000_DEV_ID_82572EI_SERDES:
+ case E1000_DEV_ID_82571EB_SERDES_DUAL:
+ case E1000_DEV_ID_82571EB_SERDES_QUAD:
hw->media_type = e1000_media_type_internal_serdes;
break;
default:
@@ -260,6 +262,7 @@ static s32 e1000_get_invariants_82571(struct e1000_adapter *adapter)
case E1000_DEV_ID_82571EB_QUAD_COPPER:
case E1000_DEV_ID_82571EB_QUAD_FIBER:
case E1000_DEV_ID_82571EB_QUAD_COPPER_LP:
+ case E1000_DEV_ID_82571PT_QUAD_COPPER:
adapter->flags |= FLAG_IS_QUAD_PORT;
/* mark the first port */
if (global_quad_port_a == 0)
@@ -285,6 +288,9 @@ static s32 e1000_get_invariants_82571(struct e1000_adapter *adapter)
if (adapter->flags & FLAG_IS_QUAD_PORT &&
(!(adapter->flags & FLAG_IS_QUAD_PORT_A)))
adapter->flags &= ~FLAG_HAS_WOL;
+ /* Does not support WoL on any port */
+ if (pdev->device == E1000_DEV_ID_82571EB_SERDES_QUAD)
+ adapter->flags &= ~FLAG_HAS_WOL;
break;
case e1000_82573:
diff --git drivers/net/e1000e/hw.h drivers/net/e1000e/hw.h
index 1bb2052..71f93ce 100644
--- drivers/net/e1000e/hw.h
+++ drivers/net/e1000e/hw.h
@@ -303,8 +303,11 @@ enum e1e_registers {
#define E1000_DEV_ID_82571EB_FIBER 0x105F
#define E1000_DEV_ID_82571EB_SERDES 0x1060
#define E1000_DEV_ID_82571EB_QUAD_COPPER 0x10A4
+#define E1000_DEV_ID_82571PT_QUAD_COPPER 0x10D5
#define E1000_DEV_ID_82571EB_QUAD_FIBER 0x10A5
#define E1000_DEV_ID_82571EB_QUAD_COPPER_LP 0x10BC
+#define E1000_DEV_ID_82571EB_SERDES_DUAL 0x10D9
+#define E1000_DEV_ID_82571EB_SERDES_QUAD 0x10DA
#define E1000_DEV_ID_82572EI_COPPER 0x107D
#define E1000_DEV_ID_82572EI_FIBER 0x107E
#define E1000_DEV_ID_82572EI_SERDES 0x107F
diff --git drivers/net/e1000e/netdev.c drivers/net/e1000e/netdev.c
index ec427e2..a271112 100644
--- drivers/net/e1000e/netdev.c
+++ drivers/net/e1000e/netdev.c
@@ -4088,16 +4088,15 @@ static struct pci_error_handlers e1000_err_handler = {
};
static struct pci_device_id e1000_pci_tbl[] = {
- /*
- * Support for 82571/2/3, es2lan and ich8 will be phased in
- * stepwise.
-
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_82571EB_COPPER), board_82571 },
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_82571EB_FIBER), board_82571 },
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_82571EB_QUAD_COPPER), board_82571 },
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_82571EB_QUAD_COPPER_LP), board_82571 },
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_82571EB_QUAD_FIBER), board_82571 },
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_82571EB_SERDES), board_82571 },
+ { PCI_VDEVICE(INTEL, E1000_DEV_ID_82571EB_SERDES_DUAL), board_82571 },
+ { PCI_VDEVICE(INTEL, E1000_DEV_ID_82571EB_SERDES_QUAD), board_82571 },
+ { PCI_VDEVICE(INTEL, E1000_DEV_ID_82571PT_QUAD_COPPER), board_82571 },
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_82572EI), board_82572 },
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_82572EI_COPPER), board_82572 },
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_82572EI_FIBER), board_82572 },
@@ -4120,8 +4119,6 @@ static struct pci_device_id e1000_pci_tbl[] = {
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_ICH8_IGP_C), board_ich8lan },
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_ICH8_IGP_M), board_ich8lan },
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_ICH8_IGP_M_AMT), board_ich8lan },
- */
-
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_ICH9_IFE), board_ich9lan },
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_ICH9_IFE_G), board_ich9lan },
{ PCI_VDEVICE(INTEL, E1000_DEV_ID_ICH9_IFE_GT), board_ich9lan },
-
patch_e1000e:
Code:
diff --git drivers/net/e1000e/82571.c drivers/net/e1000e/82571.c
index b6401ab..45f5ee2 100644
--- drivers/net/e1000e/82571.c
+++ drivers/net/e1000e/82571.c
@@ -1343,7 +1343,6 @@ struct e1000_info e1000_82573_info = {
| FLAG_HAS_STATS_ICR_ICT
| FLAG_HAS_SMART_POWER_DOWN
| FLAG_HAS_AMT
- | FLAG_HAS_ASPM
| FLAG_HAS_ERT
| FLAG_HAS_SWSM_ON_LOAD,
.pba = 20,
diff --git drivers/net/e1000e/e1000.h drivers/net/e1000e/e1000.h
index 473f78d..8b88c22 100644
--- drivers/net/e1000e/e1000.h
+++ drivers/net/e1000e/e1000.h
@@ -288,7 +288,6 @@ struct e1000_info {
#define FLAG_HAS_CTRLEXT_ON_LOAD (1 << 5)
#define FLAG_HAS_SWSM_ON_LOAD (1 << 6)
#define FLAG_HAS_JUMBO_FRAMES (1 << 7)
-#define FLAG_HAS_ASPM (1 << 8)
#define FLAG_HAS_STATS_ICR_ICT (1 << 9)
#define FLAG_HAS_STATS_PTC_PRC (1 << 10)
#define FLAG_HAS_SMART_POWER_DOWN (1 << 11)
diff --git drivers/net/e1000e/netdev.c drivers/net/e1000e/netdev.c
index 4fd2e23..ec427e2 100644
--- drivers/net/e1000e/netdev.c
+++ drivers/net/e1000e/netdev.c
@@ -3511,6 +3511,33 @@ static int e1000_suspend(struct pci_dev *pdev, pm_message_t state)
return 0;
}
+static void e1000e_disable_l1aspm(struct pci_dev *pdev)
+{
+ int pos;
+ u32 cap;
+ u16 val;
+
+ /*
+ * 82573 workaround - disable L1 ASPM on mobile chipsets
+ *
+ * L1 ASPM on various mobile (ich7) chipsets do not behave properly
+ * resulting in lost data or garbage information on the pci-e link
+ * level. This could result in (false) bad EEPROM checksum errors,
+ * long ping times (up to 2s) or even a system freeze/hang.
+ *
+ * Unfortunately this feature saves about 1W power consumption when
+ * active.
+ */
+ pos = pci_find_capability(pdev, PCI_CAP_ID_EXP);
+ pci_read_config_dword(pdev, pos + PCI_EXP_LNKCAP, &cap);
+ pci_read_config_word(pdev, pos + PCI_EXP_LNKCTL, &val);
+ if (val & 0x2) {
+ dev_warn(&pdev->dev, "Disabling L1 ASPM\n");
+ val &= ~0x2;
+ pci_write_config_word(pdev, pos + PCI_EXP_LNKCTL, val);
+ }
+}
+
#ifdef CONFIG_PM
static int e1000_resume(struct pci_dev *pdev)
{
@@ -3521,6 +3548,7 @@ static int e1000_resume(struct pci_dev *pdev)
pci_set_power_state(pdev, PCI_D0);
pci_restore_state(pdev);
+ e1000e_disable_l1aspm(pdev);
err = pci_enable_device(pdev);
if (err) {
dev_err(&pdev->dev,
@@ -3621,6 +3649,7 @@ static pci_ers_result_t e1000_io_slot_reset(struct pci_dev *pdev)
struct e1000_adapter *adapter = netdev_priv(netdev);
struct e1000_hw *hw = &adapter->hw;
+ e1000e_disable_l1aspm(pdev);
if (pci_enable_device(pdev)) {
dev_err(&pdev->dev,
"Cannot re-enable PCI device after reset.\n");
@@ -3722,6 +3751,7 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
u16 eeprom_data = 0;
u16 eeprom_apme_mask = E1000_EEPROM_APME;
+ e1000e_disable_l1aspm(pdev);
err = pci_enable_device(pdev);
if (err)
return err;
diff --git drivers/net/e1000e/param.c drivers/net/e1000e/param.c
index 3327892..df266c3 100644
--- drivers/net/e1000e/param.c
+++ drivers/net/e1000e/param.c
@@ -262,13 +262,6 @@ void __devinit e1000e_check_options(struct e1000_adapter *adapter)
.max = MAX_RXDELAY } }
};
- /* modify min and default if 82573 for slow ping w/a,
- * a value greater than 8 needs to be set for RDTR */
- if (adapter->flags & FLAG_HAS_ASPM) {
- opt.def = 32;
- opt.arg.r.min = 8;
- }
-
if (num_RxIntDelay > bd) {
adapter->rx_int_delay = RxIntDelay[bd];
e1000_validate_option(&adapter->rx_int_delay, &opt,
-
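The core of the second patch is the `val & 0x2` test in e1000e_disable_l1aspm(): bit 1 of the PCI Express Link Control word is the L1 ASPM enable bit, and the patch clears it. Here is a small sketch of the same bit arithmetic in shell. The function names and the sample values are mine; on real hardware the word could presumably be read with something like `setpci -s 02:00.0 CAP_EXP+10.w`, which I have not verified on my machine.

```shell
#!/bin/sh
# Succeed if bit 1 (L1 ASPM enable) is set in a Link Control value.
# Hypothetical helper names; the values below are made-up samples.
l1_aspm_enabled() {
    [ $(( $1 & 0x2 )) -ne 0 ]
}

# Print the Link Control value with bit 1 cleared, as the patch does.
clear_l1_aspm() (
    val=$(( $1 ))
    printf '0x%04x\n' $(( $val & ~0x2 ))
)

for lnkctl in 0x0043 0x0041; do
    if l1_aspm_enabled "$lnkctl"; then
        echo "$lnkctl: L1 ASPM enabled, would write $(clear_l1_aspm "$lnkctl")"
    else
        echo "$lnkctl: L1 ASPM already disabled"
    fi
done
```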
Then I applied the modified patches to the kernel source...
# cp patch* /usr/src/linux
# cd /usr/src/linux
# patch -p0 < patch_e1000-e1000e
patching file drivers/net/e1000/e1000_main.c
patching file drivers/net/e1000e/82571.c
patching file drivers/net/e1000e/hw.h
patching file drivers/net/e1000e/netdev.c
Hunk #1 succeeded at 4056 (offset -32 lines).
# patch -p0 < patch_e1000e
patching file drivers/net/e1000e/82571.c
Hunk #1 succeeded at 1345 (offset 2 lines).
patching file drivers/net/e1000e/e1000.h
patching file drivers/net/e1000e/netdev.c
Hunk #1 succeeded at 3509 (offset -2 lines).
Hunk #3 succeeded at 3647 (offset -2 lines).
patching file drivers/net/e1000e/param.c
Finally I compiled the kernel and installed it together with its modules. Remember to build support for your root file system into the kernel, or to prepare an appropriate initrd, and to update the LILO configuration before rebooting.
After rebooting, the eth0 interface started to work properly. I rebooted the machine several times in a row, and so far everything works well.
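Since the first patch removes device ID 0x109A from the e1000 table and enables it in e1000e, I would expect eth0 to be handled by the e1000e module now, provided e1000e is enabled in the kernel configuration. One way to check is the driver line of `ethtool -i`. The sketch below only parses that line. The helper name `driver_of` and the sample output are illustrative, not captured from my machine.

```shell
#!/bin/sh
# Print the driver name from `ethtool -i` style output read from stdin.
# Hypothetical helper name.
driver_of() {
    awk -F': ' '$1 == "driver" { print $2 }'
}

# Illustrative sample output.
printf '%s\n' 'driver: e1000e' 'bus-info: 0000:02:00.0' | driver_of

# Real use:
#   ethtool -i eth0 | driver_of
```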
Conclusion
I hope both patches will be integrated into future versions of the 2.6.x kernel. I wouldn't like to see a future kernel that is incompatible with them, or a future Slackware release shipping such a kernel, which I couldn't use on my ThinkPad T60 with its e1000 Ethernet card.
Have a nice day...