Linux - General
This Linux forum is for general Linux questions and discussion. If it is Linux-related and doesn't seem to fit in any other forum, this is the place.
That's a fair amount of stuff.
Did you run the untar/uncompress command directly against the USB drive?
Depending on the internals of how tar works, it might(?) be quicker overall to just stream the file onto the local hard disk first, and then run the uncompress/untar from there (especially if it's an SSD).
It's got to be worth a try, just to see...
It amounts to 8 MB/s compressed - which is well below the USB 2.0 limit - and 32 MB/s uncompressed, which by modern standards is pathetic. Of course it depends on what's being restored and where - it could be OK if you are restoring a gazillion small files or writing to an SMR spinner, and there are many reasons for poor SSD write performance, usually not happy ones.
Quote:
Originally Posted by chrism01
Depending on the internals of how tar works
tar stands for Tape ARchive; it is specifically designed to work with sequential streams.
there can be other factors too, like free memory, fragmentation, filesystem type, ... but in general you can measure it with:
Code:
time cat /mnt/external/backup.bz2 > /dev/null
to check the read speed. Additionally you can try the following:
Code:
time pbzip2 -dc /mnt/external/backup.bz2 > /dev/null
time lbzip2 -dc /mnt/external/backup.bz2 > /dev/null
to check the decompression speed.
lbzip2 and pbzip2 are both faster than the original bzip2.
And finally you can also check the write speed of your disk.
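For that last write-speed check, a minimal sketch with dd (the path and size are illustrative; point it at a file on the target disk):

```shell
# Write 1 GiB of zeros to the target disk; conv=fsync forces the data out
# to the device before dd reports throughput, so the page cache doesn't
# inflate the number.
dd if=/dev/zero of=/mnt/raid/ddtest bs=1M count=1024 conv=fsync
# dd prints bytes copied, elapsed time, and MB/s on completion.
rm /mnt/raid/ddtest   # clean up the test file
```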
Quote:
Originally Posted by chrism01
That's a fair amount of stuff.
Did you run the untar/uncompress command directly against the USB drive?
Yes, directly against the USB drive.
Quote:
Depending on the internals of how tar works, it might(?) be quicker overall to just stream the file onto the local hard disk first, and then run the uncompress/untar from there (especially if it's an SSD).
It's got to be worth a try, just to see...
I'm going to try this idea of copying the backup to the hard drive first ... as soon as I set up a machine with a similar RAID-1.
Quote:
Originally Posted by lvm_
It amounts to 8 MB/s compressed - which is well below the USB 2.0 limit - and 32 MB/s uncompressed, which by modern standards is pathetic. Of course it depends on what's being restored and where - it could be OK if you are restoring a gazillion small files or writing to an SMR spinner, and there are many reasons for poor SSD write performance, usually not happy ones.
Not a gazillion small files. There are half-a-dozen small files and one really large 144G (uncompressed) virtual machine .vdi image.
Quote:
tar stands for Tape ARchive; it is specifically designed to work with sequential streams.
These tarfiles are intended for long-term backup and seem like the right choice for such backups. Better/faster "boutique" programs might not work years from now, depending on what flavors of Linux evolve into. tar ships with all distros and likely will forever. But I'm open to suggestions on alternatives.
Quote:
Originally Posted by pan64
there can be other factors too, like free memory, fragmentation, filesystem type, ... but in general you can measure it with:
Code:
time cat /mnt/external/backup.bz2 > /dev/null
to check the read speed. Additionally you can try the following:
Code:
time pbzip2 -dc /mnt/external/backup.bz2 > /dev/null
time lbzip2 -dc /mnt/external/backup.bz2 > /dev/null
to check the decompression speed.
Excellent suggestions. I might give those a try after trying chrism01's suggestion.
Quote:
lbzip2 and pbzip2 are both faster than the original bzip2.
And finally you can also check the write speed of your disk.
bzip2 is specified in tar by -j. I don't see options (in my distro's tar) for lbzip2 or pbzip2 - is there such an option? I have lbzip2, but not pbzip2. Without a tar option for lbzip2, I'd have to compress through lbzip2 as an extra step, which might confuse future sysadmins who don't know how the tarfile was compressed. See the "boutique" comment above. 'tar -x' figures out bzip2 compression (and the other tar-native compressions), so you don't have to specify 'tar -xj'. lbzip2/pbzip2 would have to offer significant speed improvements to make the effort worthwhile, and more "sophisticated" compression algorithms might actually take longer to uncompress.
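For what it's worth, GNU tar does have a general hook for this: the -I (long form --use-compress-program) option pipes the archive through any compressor you name. Since lbzip2 reads and writes ordinary bzip2 streams, the result is still a plain .tar.bz2 that a future 'tar -xj' can unpack. A sketch, assuming GNU tar and illustrative paths:

```shell
# Create the archive through lbzip2 (parallel bzip2); the output is a
# standard bzip2 stream, indistinguishable from one made with 'tar -cj'.
tar -I lbzip2 -cf backup.tar.bz2 dir/

# Extract through lbzip2; a plain 'tar -xjf backup.tar.bz2' also works,
# just single-threaded.
tar -I lbzip2 -xf backup.tar.bz2 -C /restore/target
```

So the "future sysadmin" concern mostly goes away: only the creation/extraction speed changes, not the on-disk format.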
I just finished restoring these same files to a new rig with a newly created/assembled RAID-1. The .bz2 is 118G and the uncompressed files use 450G. I first copied the .bz2 tarfile from the USB drive to the target hard drive, then untarred from there. It took 2 hours and 51 minutes to restore - much better than the 4+ hours described in post #1 (which was not a RAID). And this restore was to a slower computer with less memory, though I don't really know how the SATA transfer rates compare.
So the conclusion is to copy the tarfile from the USB to the hard drive first, then untar.
I'm going to try these restores again, possibly this evening, to the original machine.
you don't really need to copy; it is unnecessary. You can untar/uncompress directly from the USB drive too. At least you can try it and compare the speeds, if you wish.
Quote:
you don't really need to copy; it is unnecessary. You can untar/uncompress directly from the USB drive too. At least you can try it and compare the speeds, if you wish.
If you look at my original post, that's exactly what I did, and it took 4+ hours to restore the same tarfile. chrism01 suggested copying to the hard drive first in post #2, and that did prove faster - probably because the USB drive is slower than the hard drive.
you would need to sum up all the times (copy, uncompress, untar, whatever), from the start (you have a backup on a USB drive) to the end (the restore is done). Then you will see which one is faster.
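One way to capture that end-to-end number in a single figure (paths illustrative) is to time the whole sequence as one compound command:

```shell
# Total wall-clock time for copy-then-restore, measured as one unit:
time ( cp /mnt/external/backup.tar.bz2 /data/ \
       && tar -xjf /data/backup.tar.bz2 -C /data/restore )

# versus restoring straight from the USB drive:
time tar -xjf /mnt/external/backup.tar.bz2 -C /data/restore
```

The parentheses matter: they make bash time the copy and the untar together, so neither approach gets to hide part of its cost.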
if i had a case like this and the spare time to explore where performance might be limited:
1. i would check to see what file system type the 118G backup.tar.bz2 is on. is it formatted as ext4? some other file system type? none at all (reading a raw partition)?
2. i would try to see just how fast a process can read the backup.tar.bz2 stream, not decompressing it while watching the drive activity light(s) to see if it stays solidly busy or if not, just how busy it is. of course i would time this activity.
3. i would also be running the top command to see how CPU busy this is.
4. i would do #2 and #3 again but this time with decompression active to see how much this slows it down.
5. i would do a regular restore into a freshly formatted file system as a base reference. i would be sure not to have both the source backup.tar.bz2 and the target partition on the same device if it is an older spinning-platter device, to avoid head-movement contention. i have not studied how things like this work on SSDs, but given how they spend time clearing space to make way for new bits, i'd still use different devices just to be sure. i'd still prefer SSD over rotators for the speed. if i was really limited to one device (such as having only one connection), i'd go with SSD.
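Step 1 above can be answered quickly; a sketch with an illustrative mount point:

```shell
# Filesystem type of the mount holding the backup (ext4, vfat, ntfs, ...):
df -T /mnt/external

# Filesystem types for every block device, including unmounted ones:
lsblk -f
```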
Quote:
Originally Posted by Skaperen
if i had a case like this and the spare time to explore where performance might be limited:
1. i would [... various procedures]
Those are all good ideas, but I don't know if they're cost-effective time-wise, since at best I think we're talking a half hour or so among the more optimal techniques, and this isn't something I need to do routinely.
Quote:
Originally Posted by pan64
you would need to sum up all the times (copy, uncompress, untar, whatever). From the start (you have a backup on an usb drive) to the end (restore is done). And you will see which one is faster.
Right. In my 2-hour-51-minute restore time I did not include the time to copy from the USB drive to the local hard drive before untarring. I timed that with a USB 2.0 port, but not on the same target machine (it wasn't available). The copy took a little over 16 minutes, both with rsync and with cp. So the total time for copy-from-USB then restore-from-disk would be 3 hours, 7 minutes - still much better than uncompressing/restoring from USB.
Skaperen's proposed set of tests would probably shed more light on this, but I think I'm satisfied that copying from USB to hard drive first, then uncompressing/restoring, is about 25% faster than uncompressing/restoring the USB-resident tarfile.