Old 07-06-2015, 01:37 AM   #1
Red Squirrel
Senior Member
 
Registered: Dec 2003
Distribution: Mint 20.1 on workstation, Debian 11 on servers
Posts: 1,336

Rep: Reputation: 54
Is there a way to stop Linux from choking up and often crashing if disk I/O is slow?


One issue I've noticed with Linux that always aggravates me is that in any situation where I/O is slow, such as when backup jobs are running, things start to choke up and crash. Example output from dmesg:

Code:
sd 2:0:0:0: [sda] task abort on host 2, ffff880037410380
sd 2:0:0:0: [sda] task abort on host 2, ffff880037410980
sd 2:0:0:0: [sda] Failed to abort cmd ffff880037410980
sd 2:0:0:0: [sda] task abort on host 2, ffff88003758d880
sd 2:0:0:0: [sda] Failed to abort cmd ffff88003758d880
sd 2:0:0:0: [sda] task abort on host 2, ffff88003758db80
sd 2:0:0:0: [sda] Failed to abort cmd ffff88003758db80
udev: starting version 147
type=1305 audit(1436149718.420:11170): audit_pid=0 old=1285 auid=4294967295 ses=4294967295 res=1
type=1305 audit(1436149718.531:11171): audit_enabled=0 old=1 auid=0 ses=1844 res=1
type=1305 audit(1436149718.539:11172): audit_pid=20173 old=0 auid=0 ses=1844 res=1

My setup consists of a CentOS box with 3 mdadm RAID arrays; all my data and VMs (ESXi on another server) live on it and are accessed via NFS. On most of my VMs I constantly get I/O-related errors in dmesg, especially during heavy I/O usage. How do I make Linux more tolerant of slowdowns? Another one I used to get a lot is "task blocked for more than 120 seconds". That one would end up crashing most things and I'd have to reboot. I don't seem to get it as often anymore.

Overall it's not one specific error; it's random and depends on the VM. Is there some kind of timeout value I can adjust somewhere to make it more tolerant of slow I/O? Most of my OSes are CentOS 6.5 or thereabouts.
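
For context, these are the sort of knobs that usually get suggested for this kind of problem (a sketch only; the device name and values below are examples, not recommendations):

Code:
# SCSI command timeout for the virtual disk, in seconds (default is usually 30)
cat /sys/block/sda/device/timeout
echo 180 > /sys/block/sda/device/timeout

# Threshold for the "task blocked for more than 120 seconds" warning (0 disables it)
sysctl -w kernel.hung_task_timeout_secs=300

# Keep less dirty data in RAM so writeback bursts are smaller
sysctl -w vm.dirty_background_ratio=5
sysctl -w vm.dirty_ratio=10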

Also, would I be better off going with ZFS instead of mdadm RAID? Two of my arrays are RAID 10 and one is RAID 5; the RAID 5 one is mostly just used for backups right now while the heavier loads are on RAID 10. Would I be better off having one large RAID 10 or multiple small ones?
 
Old 07-06-2015, 01:51 AM   #2
Keruskerfuerst
Senior Member
 
Registered: Oct 2005
Location: Horgau, Germany
Distribution: Manjaro KDE, Win 10
Posts: 2,199

Rep: Reputation: 164
Please give more precise information on:
1. Hardware (HDDs)
2. OS version
 
Old 07-06-2015, 02:12 AM   #3
Red Squirrel
Senior Member
 
Registered: Dec 2003
Distribution: Mint 20.1 on workstation, Debian 11 on servers
Posts: 1,336

Original Poster
Rep: Reputation: 54
That particular system is CentOS 6.5. Hard drives:

http://gal.redsquirrel.me/images/oth...6_03_06_52.png



This is what my RAID configs look like:

Code:
[root@isengard ~]# mdadm --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Thu Sep  5 00:19:01 2013
     Raid Level : raid10
     Array Size : 5860270080 (5588.79 GiB 6000.92 GB)
  Used Dev Size : 2930135040 (2794.39 GiB 3000.46 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Mon Jul  6 03:03:11 2015
          State : active 
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

           Name : isengard.loc:0  (local to host isengard.loc)
           UUID : 2e257e19:33dab86c:2e112e06:b386598e
         Events : 404

    Number   Major   Minor   RaidDevice State
       0       8      112        0      active sync set-A   /dev/sdh
       1       8      144        1      active sync set-B   /dev/sdj
       2       8      160        2      active sync set-A   /dev/sdk
       3       8      128        3      active sync set-B   /dev/sdi
[root@isengard ~]# 
[root@isengard ~]# 
[root@isengard ~]# mdadm --detail /dev/md1
/dev/md1:
        Version : 0.90
  Creation Time : Sat Sep 20 02:15:28 2008
     Raid Level : raid5
     Array Size : 6837319552 (6520.58 GiB 7001.42 GB)
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Mon Jul  6 03:03:12 2015
          State : clean 
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 11f961e7:0e37ba39:2c8a1552:76dd72ee
         Events : 0.2094122

    Number   Major   Minor   RaidDevice State
       0       8       32        0      active sync   /dev/sdc
       1       8      192        1      active sync   /dev/sdm
       2       8       48        2      active sync   /dev/sdd
       3       8       16        3      active sync   /dev/sdb
       4       8       96        4      active sync   /dev/sdg
       5       8        0        5      active sync   /dev/sda
       6       8       80        6      active sync   /dev/sdf
       7       8       64        7      active sync   /dev/sde
[root@isengard ~]# 
[root@isengard ~]# 
[root@isengard ~]# mdadm --detail /dev/md3
/dev/md3:
        Version : 1.2
  Creation Time : Mon Jul 28 23:31:31 2014
     Raid Level : raid10
     Array Size : 7813531648 (7451.56 GiB 8001.06 GB)
  Used Dev Size : 1953382912 (1862.89 GiB 2000.26 GB)
   Raid Devices : 8
  Total Devices : 8
    Persistence : Superblock is persistent

    Update Time : Mon Jul  6 03:03:14 2015
          State : clean 
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

           Name : isengard.loc:3  (local to host isengard.loc)
           UUID : 99f0389f:dbf75cb3:c841340e:33f62841
         Events : 185

    Number   Major   Minor   RaidDevice State
       0      65       48        0      active sync set-A   /dev/sdt
       1      65       64        1      active sync set-B   /dev/sdu
       2      65       80        2      active sync set-A   /dev/sdv
       3      65       96        3      active sync set-B   /dev/sdw
       4      65      128        4      active sync set-A   /dev/sdy
       5      65      144        5      active sync set-B   /dev/sdz
       6      65      160        6      active sync set-A   /dev/sdaa
       7      65      176        7      active sync set-B   /dev/sdab
[root@isengard ~]#

I don't think it's actually a hardware issue at the storage server level, as the storage server rarely has I/O-related errors, and when it does they're about a RAID array (md2) that refused to shut down properly and has no drives left in it. I couldn't manage to kill it, and the system still thinks it's there and occasionally tries to access it. Though I guess I can't rule out that some drives are hitting bad sectors and, instead of being failed out of the array, are just timing out forever until they give up? Would ZFS solve this?
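
For what it's worth, a quick way to check for failing sectors and to make md verify the arrays is something like this (the device and array names are just examples):

Code:
# SMART health and error counters for one member disk (smartmontools package)
smartctl -H -A /dev/sdh

# Ask md to read-verify a whole array in the background; progress shows in /proc/mdstat
echo check > /sys/block/md0/md/sync_action
cat /proc/mdstat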

It's the VMs that depend on the storage that seem to choke up a lot. The storage server is CentOS 6.6 and most VMs are 6.5 or 6.6, depending on their age and what the latest OS version was at the time of install. I occasionally go through and do yum update so they are up to date for their respective versions.


Oh, and both the storage server and the VM server are Supermicro-based systems with ECC RAM, if that helps.

Last edited by Red Squirrel; 07-06-2015 at 02:13 AM.
 
Old 07-06-2015, 02:52 AM   #4
Keruskerfuerst
Senior Member
 
Registered: Oct 2005
Location: Horgau, Germany
Distribution: Manjaro KDE, Win 10
Posts: 2,199

Rep: Reputation: 164
I had one WD Caviar Black 4TB FEAX drive in my system. It failed and I had to replace it; the SMART status was bad and I/O errors occurred.
 
Old 07-06-2015, 03:38 AM   #5
Red Squirrel
Senior Member
 
Registered: Dec 2003
Distribution: Mint 20.1 on workstation, Debian 11 on servers
Posts: 1,336

Original Poster
Rep: Reputation: 54
But why would one drive cause I/O errors on the guests? The guests don't even see individual drives; they just see their own virtual disk. If it were a failing drive, shouldn't mdadm push it out of the array and fail it?
 
Old 07-06-2015, 04:02 AM   #6
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Are all the VMs trying to access the same filesystem through NFS?

How many VMs are there? By default there are only 8 NFS server processes to handle the traffic. I have seen tremendous throughput improvements by going to 16 or even 24, depending on how many processors you have on the NFS server... I ran one 8-core server with 16 without penalty. Going to 24 didn't improve response much, but it did help some.
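
On CentOS 6 the thread count is normally raised in the nfs init script's config file (the path and value here are from memory, so verify them on your system):

Code:
# /etc/sysconfig/nfs on the storage server
RPCNFSDCOUNT=16

# apply it and confirm the number of nfsd threads
service nfs restart
ps ax | grep -c '\[nfsd\]'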

And if the disks are only being accessed to supply virtual disks, why not switch to iSCSI from the server instead? Another alternative is Gluster, which would also let you spread the load among a couple of servers. Even with one server, I understand that Gluster provides better buffer handling (more of it is in memory rather than on disk, bypassing a physical I/O, and the disks get updated via a write-through operation).
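
As a rough sketch of the iSCSI route on CentOS 6 (using scsi-target-utils; the IQN, backing device, and subnet below are made up):

Code:
# yum install scsi-target-utils, then in /etc/tgt/targets.conf:
<target iqn.2015-07.loc.isengard:vmstore>
    backing-store /dev/md3            # or a dedicated LV/file for VM disks
    initiator-address 192.168.1.0/24
</target>

# service tgtd start && chkconfig tgtd on
# tgt-admin --show     (verify the LUN is exported)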
 
Old 07-06-2015, 04:47 AM   #7
Red Squirrel
Senior Member
 
Registered: Dec 2003
Distribution: Mint 20.1 on workstation, Debian 11 on servers
Posts: 1,336

Original Poster
Rep: Reputation: 54
There are 8 VMs at the moment, but I'm often adding more. I chose NFS as it's friendlier for being accessed from more than one server in case I ever add more hosts, and I use it directly as well. I know iSCSI can still be set up so more than one server accesses the same file system, but I think that requires a proprietary file system and VM solution such as VMware with VMFS. I'm using VMware now but don't want to lock myself into that platform; I eventually want to switch to KVM once I can liberate my old server for testing. When I originally tried KVM I found it required way too much work to use, so I'll want to set up something to make it easier, such as a custom front end.

I like the idea of Gluster, though; I'd heard about it before but wasn't sure how mature it was. Is it production worthy? I only have one file server now but might eventually add more, as some kind of redundancy would be nice. Right now I have all my eggs in one basket: if my file server fails I'm pretty much dead in the water. It does have dual PSUs and 4 hours of battery backup, though.
 
Old 07-06-2015, 05:28 AM   #8
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
For 8 VMs I would want 16 NFS service daemons (they all run in the kernel). One thing I noticed is that each actively used open file tends to tie up one service daemon, so having multiple files open starts to bog down the NFS server. More daemons ease this load, but can introduce some thrashing.

At the time (early 2.6 kernels) a single NFS server with 8 processor cores could handle as many as 64 NFS service daemons. It depends on what is going on: I had lots of directory searches running against a file system with 50 million files, with only a few file copies being made. The extra server daemons permitted multiple parallel directory searches to run on the clients. The "few file copies" were throttled so as not to saturate the source of the data, as it was also being used for general library access at the same time. The extra server daemons allowed a parallel directory search to complete in 45 minutes without noticeably impacting general use.

BTW, it just occurred to me: are the VMs running on the NFS server itself? (That too would cause bottlenecks, as CPU utilization could hit 100%.)

With NFS you can extend various timeout intervals on the client mounts: timeo is the time, in tenths of a second, before a retry, and retrans changes the number of retries (the default is 3; you might try 7, and if that helps but failures still occur, try 11 - I like prime numbers, as they seem to keep things from coinciding and thrashing). A hard failure shouldn't happen until the product of timeo and retrans has elapsed.

You can also play with rsize/wsize to try to match I/O sizes between client and server. These often default to 4k (the underlying filesystem block size); if you have a RAID, though, you would want to determine the RAID chunk size instead (it might be 16k). This reduces the number of NFS I/O requests from the client, which drops the load on the server. It also works better over a TCP connection (it eliminates the UDP fragmentation requirement).
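
Something along these lines on the client side, for example (the export path and sizes are placeholders; pick values that match your setup):

Code:
# Example NFS client mount in /etc/fstab
isengard.loc:/data/vms  /mnt/vms  nfs  proto=tcp,hard,timeo=600,retrans=7,rsize=32768,wsize=32768  0 0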

Red Hat is using Gluster for its clustering services, so it appears to be in production use already.
 
Old 07-06-2015, 10:26 AM   #9
Slax-Dude
Member
 
Registered: Mar 2006
Location: Valadares, V.N.Gaia, Portugal
Distribution: Slackware
Posts: 528

Rep: Reputation: 272
Just out of curiosity:
  • is disk cache active on the guest, the host, or both?
  • what is the network bandwidth on the file server?
  • does the file server only serve VM virtual disks, or other data as well?
 
Old 07-06-2015, 06:06 PM   #10
Red Squirrel
Senior Member
 
Registered: Dec 2003
Distribution: Mint 20.1 on workstation, Debian 11 on servers
Posts: 1,336

Original Poster
Rep: Reputation: 54
Where would I go to modify those NFS settings? Since this file server is dedicated to just that, is there any harm in setting a super high value like 128 service daemons? Do they mostly just sleep until there is a call for I/O?

No idea about disk cache; how would I check that? The network is all gigabit. The file server serves mostly just VMs, but also raw data. A lot of the VMs also access that raw data, so a VM's own VMDK is on the NFS server but the files it accesses are on the same server. Am I better off dedicating the server to just VMs and having the VMs store their data in their own virtual file systems?


If I were to switch to KVM for my virtual environment and add more than one VM host, could I use iSCSI? E.g., does Linux have a file system that supports shared iSCSI storage across multiple machines, like VMFS? Is this something Gluster would do? Maybe I would indeed get better performance with iSCSI. At some point I could team the NICs and add another gigabit link too; I don't think the network is really the bottleneck, tbh, but it couldn't hurt to team another link. I have a gigabit managed switch, so I could do it.
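
For reference, the NIC-teaming piece on CentOS 6 is just a couple of ifcfg files (a sketch; the addresses and bonding mode below are placeholders):

Code:
# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
BONDING_OPTS="mode=802.3ad miimon=100"
IPADDR=192.168.1.10
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-eth0 (and likewise for eth1)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none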

The hard part is that I can't really make any drastic changes at this point since it's production, but at least it's stuff I can look at migrating towards - probably whenever I decide to switch to KVM, since I'd need to rebuild all my VMs anyway and could deal with the downtime then.
 
Old 07-06-2015, 08:02 PM   #11
Red Squirrel
Senior Member
 
Registered: Dec 2003
Distribution: Mint 20.1 on workstation, Debian 11 on servers
Posts: 1,336

Original Poster
Rep: Reputation: 54
Another of my VMs just started to puke.

Quote:
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad380
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032780
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032780
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3add80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3add80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad680
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad680
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad380
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad380
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032480
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032480
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032980
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032980
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001c7dbec0
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001c7dbec0
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032580
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032580
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3add80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3add80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032680
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032680
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad880
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad880
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032480
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032480
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032180
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032180
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad480
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad480
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad980
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad980
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032380
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032380
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032780
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032780
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032c80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032c80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad480
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad480
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032880
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032880
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad380
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad380
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad080
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad080
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032a80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032a80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032c80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032c80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032280
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032280
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad280
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad280
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3adb80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3adb80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032380
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032380
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad480
sd 2:0:0:0: [sda] Failed to abort cmd ffff88001f3ad480
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad080
sd 2:0:0:0: [sda] Failed to abort cmd ffff88001f3ad080
sd 2:0:0:0: [sda] task abort on host 2, ffff8800018a3ec0
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff8800018a3ec0
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad380
sd 2:0:0:0: [sda] Failed to abort cmd ffff88001f3ad380
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad880
sd 2:0:0:0: [sda] Failed to abort cmd ffff88001f3ad880
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad080
sd 2:0:0:0: [sda] Failed to abort cmd ffff88001f3ad080
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032580
sd 2:0:0:0: [sda] Failed to abort cmd ffff88001f032580
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032280
sd 2:0:0:0: [sda] Failed to abort cmd ffff88001f032280
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032680
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032680
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032780
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032780
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032380
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032380
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032a80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032a80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad180
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad180
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88000ab7cb80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88000ab7cb80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3adb80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3adb80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3add80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3add80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad580
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad580
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad680
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad680
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3adc80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3adc80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032280
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032280
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032880
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032880
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad380
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad380
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032180
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032180
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad980
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad980
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad780
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad780
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad380
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad380
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad080
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad080
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88000a7498c0
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88000a7498c0
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad680
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad680
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff880005b34c80
sd 2:0:0:0: [sda] Failed to abort cmd ffff880005b34c80
sd 2:0:0:0: [sda] task abort on host 2, ffff880005b34280
sd 2:0:0:0: [sda] Failed to abort cmd ffff880005b34280
sd 2:0:0:0: [sda] task abort on host 2, ffff880005b34180
sd 2:0:0:0: [sda] Failed to abort cmd ffff880005b34180
sd 2:0:0:0: [sda] task abort on host 2, ffff880005b34080
sd 2:0:0:0: [sda] Failed to abort cmd ffff880005b34080
sd 2:0:0:0: [sda] task abort on host 2, ffff880007ca6dc0
sd 2:0:0:0: [sda] Failed to abort cmd ffff880007ca6dc0
sd 2:0:0:0: [sda] task abort on host 2, ffff880007ca6bc0
sd 2:0:0:0: [sda] Failed to abort cmd ffff880007ca6bc0
sd 2:0:0:0: [sda] task abort on host 2, ffff880007ca67c0
sd 2:0:0:0: [sda] Failed to abort cmd ffff880007ca67c0
sd 2:0:0:0: [sda] task abort on host 2, ffff880007ca66c0
sd 2:0:0:0: [sda] Failed to abort cmd ffff880007ca66c0
sd 2:0:0:0: [sda] task abort on host 2, ffff880007ca65c0
sd 2:0:0:0: [sda] Failed to abort cmd ffff880007ca65c0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032a80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032a80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032a80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032a80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001d08ce80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001d08ce80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad380
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad380
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001d13fa80
sd 2:0:0:0: [sda] Failed to abort cmd ffff88001d13fa80
sd 2:0:0:0: [sda] task abort on host 2, ffff88001d13f980
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001d13f980
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032980
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032980
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad480
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad480
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032180
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032180
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032c80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032c80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032c80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032c80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad180
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad180
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad780
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad780
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad280
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad280
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032080
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032080
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f032280
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f032280
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3ad680
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3ad680
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
sd 2:0:0:0: [sda] task abort on host 2, ffff88001f3adc80
sd 2:0:0:0: [sda] Failed to get completion for aborted cmd ffff88001f3adc80
sd 2:0:0:0: [sda] SCSI device reset on scsi2:0
Torrents seem to trigger this a lot; this is the torrent VM. Keep in mind this is a VM, so even if there were a drive failure on the file server the VM shouldn't even care, because the logical volume hosting the VMDK is still up.
 
Old 07-07-2015, 05:25 AM   #12
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by Red Squirrel View Post
Where would I go to modify those NFS settings? Since this file server is dedicated to just that, is there any harm in setting a super high value like 128 service daemons? Do they mostly just sleep until there is a call for I/O?
All of those timeout settings apply to the NFS mounts; I would expect them to go in /etc/fstab on the clients.
Quote:

No idea about disk cache; how would I check that? The network is all gigabit. The file server serves mostly just VMs, but also raw data. A lot of the VMs also access that raw data, so a VM's own VMDK is on the NFS server but the files it accesses are on the same server. Am I better off dedicating the server to just VMs and having the VMs store their data in their own virtual file systems?
Disk caching occurs everywhere. By default Linux will use all available memory for cache to reduce overhead; as processes start and need memory, cache gets released so it can be reused for applications.
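
You can see how much memory is currently page cache, and how much of that is dirty and waiting to be written out, with the usual tools:

Code:
# "cached" is the page cache; Dirty/Writeback is data not yet flushed to disk
free -m
grep -E 'Dirty|Writeback' /proc/meminfo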
Quote:


If I were to switch to KVM for my virtual environment and add more than one VM host, could I use iSCSI? E.g., does Linux have a file system that supports shared iSCSI storage across multiple machines, like VMFS? Is this something Gluster would do? Maybe I would indeed get better performance with iSCSI. At some point I could team the NICs and add another gigabit link too; I don't think the network is really the bottleneck, tbh, but it couldn't hurt to team another link. I have a gigabit managed switch, so I could do it.
https://access.redhat.com/documentat...sk_images.html

So, yes, multiple targets are possible.
Quote:

The hard part is that I can't really make any drastic changes at this point since it's production, but at least it's stuff I can look at migrating towards - probably whenever I decide to switch to KVM, since I'd need to rebuild all my VMs anyway and could deal with the downtime then.
The NFS mount changes can be done one at a time. That is actually the best way, as it lets you verify each improvement.
 
Old 07-07-2015, 09:53 AM   #13
Slax-Dude
Member
 
Registered: Mar 2006
Location: Valadares, V.N.Gaia, Portugal
Distribution: Slackware
Posts: 528

Rep: Reputation: 272
Quote:
Originally Posted by Red Squirrel View Post
No idea about disk cache, how would I check that?
Check if the host is caching. If it is (and it probably is), then disable caching on the guest.
Having both the host and the guest operating systems cache disk I/O means a greater chance of data loss if something crashes. Performance usually suffers a bit as well.
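
Inside a Linux guest you can at least see whether the virtual disk advertises a write cache (example device; whether the virtual controller honours a change is another matter):

Code:
hdparm -W /dev/sda       # show the current write-cache setting
hdparm -W0 /dev/sda      # turn write caching off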

Quote:
Originally Posted by Red Squirrel View Post
Network is all gigabit.
One gigabit NIC serving virtual drives for 8 VMs? That is not nearly enough.
If you count 1 gigabit/sec as 125 megabytes/sec (a theoretical peak, not sustained real-world performance) and divide it by your 8 VMs, you get about 15.6 megabytes/sec each; factor in network protocol overhead and at best you'll get 10 to 13 megabytes/sec per VM.
Also, if one VM is doing heavy disk I/O and another VM starts some heavy disk I/O of its own, it may take a few seconds for the host to even things out, so the second VM will lag a bit (which is what I think you are experiencing).
And if this is the only NIC on the host, it is also handling network I/O for the host itself...
It all adds up, you see

Quote:
Originally Posted by Red Squirrel View Post
The file server serves mostly just VMs, but also raw data. A lot of the VMs also access that raw data, so a VM's own VMDK is on the NFS server but the files it accesses are on the same server. Am I better off dedicating the server to just VMs and having the VMs store their data in their own virtual file systems?
Yes, you would. That would mean fewer simultaneous disk I/O requests to the file server.

Quote:
Originally Posted by Red Squirrel View Post
If I were to switch to KVM for my virtual environment and add more than one VM host, could I use iSCSI? E.g., does Linux have a file system that supports shared iSCSI storage across multiple machines, like VMFS? Is this something Gluster would do? Maybe I would indeed get better performance with iSCSI.
I think you are confusing protocols with file systems...
iSCSI is just the way the hard drive is presented to the operating system. Just like SATA, or ATA, or SCSI... it is a protocol, nothing more.
Instead of having the .vmdk files made visible to the VMware host by NFS, they would be made visible by iSCSI.
The difference is that NFS is a file system, and iSCSI is a hard drive protocol (a way for the computer to send commands to the HD hardware).
 
Old 07-07-2015, 10:20 AM   #14
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by Slax-Dude View Post
Check if the host is caching. If it is (and it probably is), then disable caching on the guest.
Having both the host and the guest operating systems cache disk I/O means a greater chance of data loss if something crashes. Performance usually suffers a bit as well.

One gigabit NIC serving virtual drives for 8 VMs? That is not nearly enough.
If you count 1 gigabit/sec as 125 megabytes/sec (a theoretical peak, not sustained real-world performance) and divide it by your 8 VMs, you get about 15.6 megabytes/sec each; factor in network protocol overhead and at best you'll get 10 to 13 megabytes/sec per VM.
Also, if one VM is doing heavy disk I/O and another VM starts some heavy disk I/O of its own, it may take a few seconds for the host to even things out, so the second VM will lag a bit (which is what I think you are experiencing).
And if this is the only NIC on the host, it is also handling network I/O for the host itself...
It all adds up, you see

Yes, you would. That would mean fewer simultaneous disk I/O requests to the file server.

I think you are confusing protocols with file systems...
iSCSI is just the way the hard drive is presented to the operating system. Just like SATA, or ATA, or SCSI... it is a protocol, nothing more.
Instead of having the .vmdk files made visible to the VMware host by NFS, they would be made visible by iSCSI.
The difference is that NFS is a file system, and iSCSI is a hard drive protocol (a way for the computer to send commands to the HD hardware).
One advantage iSCSI has over NFS for the host providing the virtual disk is that it bypasses the two-level I/O. In the NFS model, a VM's I/O request to its virtual disk first gets translated to a host reference - then the host translates that to an NFS reference... then the NFS server translates that to a disk reference.

With iSCSI, the VM makes an I/O request to the virtual device, which gets sent to the iSCSI target, which then translates it to a disk block and a disk reference.

This spares the VM host a lot of excess work - including buffer management, which can add latency on top of the usual NFS delays from both the file server and the VM host.

Note: the iSCSI target does not have to be a hardware unit - it CAN be, but it isn't required.

Last edited by jpollard; 07-07-2015 at 10:22 AM.
 
Old 07-07-2015, 06:15 PM   #15
Red Squirrel
Senior Member
 
Registered: Dec 2003
Distribution: Mint 20.1 on workstation, Debian 11 on servers
Posts: 1,336

Original Poster
Rep: Reputation: 54
Quote:
Originally Posted by Slax-Dude View Post
Check if the host is caching. If it is (and it probably is), then disable caching on the guest.
Having both the host and the guest operating systems cache disk I/O means a greater chance of data loss if something crashes. Performance usually suffers a bit as well.
Where do I go for that?

Quote:
One gigabit NIC serving virtual drives for 8 VMs? That is not nearly enough.
If you count 1 gigabit/sec as 125 megabytes/sec (a theoretical peak, not sustained real-world performance) and divide it by your 8 VMs, you get about 15.6 megabytes/sec each; factor in network protocol overhead and at best you'll get 10 to 13 megabytes/sec per VM.
Also, if one VM is doing heavy disk I/O and another VM starts some heavy disk I/O of its own, it may take a few seconds for the host to even things out, so the second VM will lag a bit (which is what I think you are experiencing).
And if this is the only NIC on the host, it is also handling network I/O for the host itself...
It all adds up, you see
I worked in an environment that had fifty-something VMs on 4 NICs and it was fine, though they were spread across 6 hosts, if that makes a difference. The SAN had 6, I think. It was all gigabit, not 10Gb.

Quote:
I think you are confusing protocols with file systems...
iSCSI is just the way the hard drive is presented to the operating system. Just like SATA, or ATA, or SCSI... it is a protocol, nothing more.
Instead of having the .vmdk files made visible to the VMware host by NFS, they would be made visible by iSCSI.
The difference is that NFS is a file system, and iSCSI is a hard drive protocol (a way for the computer to send commands to the HD hardware).

Yeah, but you can't just put any file system on an iSCSI target if more than one host is accessing the same target, which is what you want for proper HA. The file system needs to be cluster-aware and handle that properly. I know VMFS does this; what I'm asking is whether there is a file system in Linux that also does this, as I eventually want to go to a full Linux VM solution such as KVM. VMware is kind of a temporary solution for me right now because it was plug and play, while KVM requires a lot more work to use and I want to code a front end for it. I am working on decommissioning my main server and virtualizing everything it does into a VM, so once I liberate that server I can start using it for testing and developing a front end.
 
  

