LinuxQuestions.org
Forums > Linux Forums > Linux - General
Old 05-13-2024, 06:23 PM   #1
mfoley
Senior Member
 
Registered: Oct 2008
Location: Columbus, Ohio USA
Distribution: Slackware
Posts: 2,612

Rep: Reputation: 180
How to speed up tar restores


I'm looking for suggestions -- maybe not doable.

I have a 118G bz2 compressed tarfile on an external USB 2.0 drive. The uncompressed files use 450G. This took 4 hours to restore from the USB drive.

Does that seem reasonable?
 
Old 05-13-2024, 11:28 PM   #2
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,369

Rep: Reputation: 2753
That's a fair amount of stuff.
Did you run the untar/uncompress command directly against the USB drive?

Depending on the internals of how tar works, it might be quicker overall to stream the file onto the local hard disk first and then run the uncompress/untar from there (especially if it's an SSD).
It's got to be worth a try, just to see...
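A minimal sketch of that two-step approach. The data and paths here are throwaway placeholders so the commands can actually run; in practice the source directory would be the USB mount and the staging copy would go on the local hard disk or SSD:

```shell
#!/bin/sh
set -e
# Demo of copy-first-then-untar with throwaway data; in a real restore,
# "src" would be the USB drive and "stage"/"dest" the local disk.
WORK=$(mktemp -d)
mkdir -p "$WORK/src" "$WORK/stage" "$WORK/dest"
echo "payload" > "$WORK/src/file.txt"
tar -cjf "$WORK/src/backup.tar.bz2" -C "$WORK/src" file.txt

cp "$WORK/src/backup.tar.bz2" "$WORK/stage/"            # step 1: stream the tarfile off the "USB"
tar -xjpf "$WORK/stage/backup.tar.bz2" -C "$WORK/dest"  # step 2: untar from the local copy
RESULT=$(cat "$WORK/dest/file.txt")
echo "$RESULT"
rm -rf "$WORK"
```

The win, if any, comes from turning one drive's mixed read/write workload into two sequential passes.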
 
1 member found this post helpful.
Old 05-14-2024, 03:29 AM   #3
lvm_
Member
 
Registered: Jul 2020
Posts: 983

Rep: Reputation: 348
It amounts to 8 MB/sec compressed, which is well below the USB 2.0 limit, and 32 MB/sec uncompressed, which by modern standards is pathetic. Of course it depends on what's being restored and where: it could be OK if you are restoring a gazillion small files or writing to an SMR spinner, and there are many reasons for poor SSD write performance, usually not happy ones.
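Those rates follow from the numbers in post #1 (118G compressed, 450G uncompressed, 4 hours), as a bit of shell arithmetic confirms:

```shell
#!/bin/sh
# Back-of-the-envelope check of the throughput figures quoted above.
SECS=$((4 * 3600))               # 4-hour restore
COMP=$((118 * 1024 / SECS))      # 118G compressed read in  -> ~8 MiB/s
UNCOMP=$((450 * 1024 / SECS))    # 450G uncompressed written -> ~32 MiB/s
echo "compressed: ${COMP} MiB/s, uncompressed: ${UNCOMP} MiB/s"
```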


Quote:
Originally Posted by chrism01 View Post
Depending on the internals of how tar works
tar stands for Tape ARchive; it is specifically designed to work with sequential streams.
 
Old 05-14-2024, 04:09 AM   #4
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,041

Rep: Reputation: 7348
There can be other factors too, like free memory, fragmentation, or the type of filesystem, but in general you can measure it with:
Code:
time cat /mnt/external/backup.bz2 > /dev/null
to check the read speed. Additionally you can try the following:
Code:
time pbzip2 -dc /mnt/external/backup.bz2 > /dev/null
time lbzip2 -dc /mnt/external/backup.bz2 > /dev/null
to check the decompression speed.
lbzip2 and pbzip2 are both faster than the original bzip2.
And finally you can also check the write speed of your disk.
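For that last step, a rough write-speed probe with dd. The 64 MiB size and the temp-file location are arbitrary choices for this sketch; point the output file at the disk you actually restore onto:

```shell
#!/bin/sh
set -e
# Rough write-speed probe: write 64 MiB of zeros and fsync, so the
# reported rate reflects the disk rather than the page cache.
# mktemp picks a path under /tmp here; aim it at the target disk instead.
TARGET=$(mktemp)
SPEED=$(dd if=/dev/zero of="$TARGET" bs=1M count=64 conv=fsync 2>&1 | tail -n 1)
echo "$SPEED"    # GNU dd prints e.g. "67108864 bytes ... copied, 0.9 s, 74 MB/s"
rm -f "$TARGET"
```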
 
Old 05-15-2024, 12:00 AM   #5
mfoley
Senior Member
 
Registered: Oct 2008
Location: Columbus, Ohio USA
Distribution: Slackware
Posts: 2,612

Original Poster
Rep: Reputation: 180
Quote:
Originally Posted by chrism01 View Post
That's a fair amount of stuff.
Did you run the untar/uncompress command directly against the USB drive?
Yes, directly against the USB drive.
Quote:
Depending on the internals of how tar works, it might be quicker overall to stream the file onto the local hard disk first and then run the uncompress/untar from there (especially if it's an SSD).
It's got to be worth a try, just to see...
I'm going to try this idea of copying the backup to the hard drive first ... as soon as I set up a machine with a similar RAID-1.
Quote:
Originally Posted by lvm_ View Post
It amounts to 8 MB/sec compressed, which is well below the USB 2.0 limit, and 32 MB/sec uncompressed, which by modern standards is pathetic. Of course it depends on what's being restored and where: it could be OK if you are restoring a gazillion small files or writing to an SMR spinner, and there are many reasons for poor SSD write performance, usually not happy ones.
Not a gazillion small files. There are half-a-dozen small files and one really large 144G (uncompressed) virtual machine .vdi image.
Quote:
tar stands for Tape ARchive; it is specifically designed to work with sequential streams.
These tarfiles are intended for long-term backup and seem like the right choice for that. Better/faster "boutique" programs might not work years from now, depending on how Linux evolves; tar ships with all distros and likely will forever. But I'm open to suggestions on alternatives.
Quote:
Originally Posted by pan64 View Post
There can be other factors too, like free memory, fragmentation, or the type of filesystem, but in general you can measure it with:
Code:
time cat /mnt/external/backup.bz2 > /dev/null
to check the read speed. Additionally you can try the following:
Code:
time pbzip2 -dc /mnt/external/backup.bz2 > /dev/null
time lbzip2 -dc /mnt/external/backup.bz2 > /dev/null
to check the decompression speed.
Excellent suggestions. I might give those a try after trying chrism01's suggestion.
Quote:
lbzip2 and pbzip2 are both faster than the original bzip2.
And finally you can also check the write speed of your disk.
bzip2 is specified in tar by -j. I don't see options (in my distro's tar) for lbzip2 or pbzip2. Is there such an option? I have lbzip2, but not pbzip2. Without a tar option for lbzip2 I'd have to compress through lbzip2 as an extra step, which might confuse future sysadmins who don't know how the tarfile was compressed; see the "boutique" comment above. 'tar -x' figures out bzip2 compression (and the other tar-native compressions), so you don't have to specify 'tar -xj'. lbzip2/pbzip2 would have to offer significant speed improvements to make the effort worthwhile, and more "sophisticated" compression algorithms might actually take longer to uncompress.

Last edited by mfoley; 05-15-2024 at 12:17 AM.
 
Old 05-15-2024, 12:54 AM   #6
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,041

Rep: Reputation: 7348
You need to use tar -I <program> instead of tar -j. Check the man page.
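A runnable sketch of GNU tar's -I (--use-compress-program) option. bzip2 is used as the filter here only so the demo runs anywhere; swap in lbzip2 or pbzip2 for the parallel versions. The output is an ordinary .tar.bz2 either way, so a future plain 'tar -xf' still auto-detects it:

```shell
#!/bin/sh
set -e
# -I names the compression filter explicitly; the archive format is the
# same as with -j, so no "boutique" tool is needed to read it later.
WORK=$(mktemp -d)
echo "hello" > "$WORK/file.txt"
tar -I bzip2 -cf "$WORK/backup.tar.bz2" -C "$WORK" file.txt  # use lbzip2/pbzip2 here for parallelism
mkdir "$WORK/out"
tar -xf "$WORK/backup.tar.bz2" -C "$WORK/out"                # plain tar auto-detects bzip2
RESTORED=$(cat "$WORK/out/file.txt")
echo "$RESTORED"
rm -rf "$WORK"
```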
 
1 member found this post helpful.
Old 05-15-2024, 03:22 PM   #7
jefro
Moderator
 
Registered: Mar 2008
Posts: 22,020

Rep: Reputation: 3630
Some of the extreme compression levels might do that. Some compression types need hardware support for best performance.

Seems a bit long. In theory a tar is files stacked one after another, so the read ought to be fairly fast.

If the host needs to both read and write to the USB drive, then you are looking at the 4 hours.
 
Old 05-15-2024, 11:40 PM   #8
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,369

Rep: Reputation: 2753
Quote:
tar stands for Tape ARchive; it is specifically designed to work with sequential streams.
Yes, I know ... That wasn't my point ...
 
Old 05-19-2024, 02:14 PM   #9
mfoley
Senior Member
 
Registered: Oct 2008
Location: Columbus, Ohio USA
Distribution: Slackware
Posts: 2,612

Original Poster
Rep: Reputation: 180
I just finished restoring these same files to a new rig with a newly created/assembled RAID-1. The .bz2 is 118G and the uncompressed files use 450G. I first copied the .bz2 tarfile from the USB drive to the target hard drive, then untarred from there. It took 2 hours and 51 minutes to restore, much better than the 4+ hours described in post #1 (which was not a RAID). And this restore was to a slower computer with less memory, though I don't really know how the SATA transfer rates compare.

So the conclusion is to copy the tarfile from the USB to the hard drive first, then untar.

I'm going to try these restores again, possibly this evening, to the original machine.

Last edited by mfoley; 05-19-2024 at 02:17 PM.
 
Old 05-20-2024, 01:59 AM   #10
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,041

Rep: Reputation: 7348
You don't really need to copy; it's unnecessary. You can untar/uncompress directly from the USB drive too. At least try it and compare the speeds, if you wish.
 
Old 05-20-2024, 01:20 PM   #11
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,832

Rep: Reputation: 1219
Quote:
So the conclusion is to copy the tarfile from the USB to the hard drive first, then untar.
That does not make sense.
Untarring from another physical drive is faster: less stress on the target drive.
And a RAID-1 speeds up reading but not writing.

Code:
cd /destpath/on/hd/
tar xpf /sourcepath/on/usb/tarfile.bz2
 
Old 05-21-2024, 06:47 AM   #12
mfoley
Senior Member
 
Registered: Oct 2008
Location: Columbus, Ohio USA
Distribution: Slackware
Posts: 2,612

Original Poster
Rep: Reputation: 180
Quote:
Originally Posted by pan64 View Post
You don't really need to copy; it's unnecessary. You can untar/uncompress directly from the USB drive too. At least try it and compare the speeds, if you wish.
If you notice from my original post, that's exactly what I did, and it took 4+ hours to restore the same tarfile. chrism01 suggested copying to the hard drive first in post #2, and that did prove faster, probably because the USB is slower than the hard drive.

Last edited by mfoley; 05-21-2024 at 06:53 AM.
 
Old 05-21-2024, 07:01 AM   #13
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,041

Rep: Reputation: 7348
You would need to sum up all the times (copy, uncompress, untar, whatever), from the start (you have a backup on a USB drive) to the end (the restore is done). Then you will see which one is faster.
 
Old 05-21-2024, 06:44 PM   #14
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,689
Blog Entries: 31

Rep: Reputation: 176
If I had a case like this and the spare time to explore where performance might be limited:

1. I would check what filesystem type the 118G backup.tar.bz2 is on. Is it formatted as ext4? Some other filesystem type? None at all (reading a raw partition)?

2. I would see just how fast a process can read the backup.tar.bz2 stream without decompressing it, while watching the drive activity light(s) to see whether it stays solidly busy, and if not, just how busy it is. Of course I would time this activity.

3. I would also run the top command to see how CPU-busy this is.

4. I would do #2 and #3 again, this time with decompression active, to see how much that slows things down.

5. I would do a regular restore into a freshly formatted filesystem as a base reference. I would be sure not to have both the source backup.tar.bz2 and the target partition on the same device if it is an older spinning-platter drive, to avoid head-movement contention. I have not studied how this plays out on SSDs, but given how they spend time clearing space to make way for new bits, I'd still use different devices just to be sure. I'd still prefer SSD over rotators for the speed, and if I were really limited to one device (say, because there is only one connection), I'd go with SSD.
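Steps 1-4 above could be scripted roughly like this. A throwaway archive is generated so the sketch is runnable; substitute the real backup.tar.bz2 path in ARCHIVE to measure the actual setup:

```shell
#!/bin/sh
set -e
# Timed read vs. read+decompress on a throwaway archive; point ARCHIVE
# at the real backup.tar.bz2 on the USB drive for a meaningful number.
ARCHIVE=$(mktemp)
echo "data" | bzip2 > "$ARCHIVE"
df -T "$(dirname "$ARCHIVE")" | tail -n 1   # step 1: filesystem type holding the file
T0=$(date +%s)
cat "$ARCHIVE" > /dev/null                  # step 2: raw read, no decompression
T1=$(date +%s)
bzip2 -dc "$ARCHIVE" > /dev/null            # step 4: read plus decompress
T2=$(date +%s)
# step 3: watch 'top' in another terminal while the timed commands run
echo "read: $((T1 - T0))s, read+decompress: $((T2 - T1))s"
rm -f "$ARCHIVE"
```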
 
Old 05-28-2024, 03:06 PM   #15
mfoley
Senior Member
 
Registered: Oct 2008
Location: Columbus, Ohio USA
Distribution: Slackware
Posts: 2,612

Original Poster
Rep: Reputation: 180
Quote:
Originally Posted by Skaperen View Post
If I had a case like this and the spare time to explore where performance might be limited:

1. I would [... various procedures]
Those are all good ideas, but I don't know if they're cost-effective time-wise, since at best I think we're talking about a half hour's difference among the more optimal techniques, and this is not something I need to do routinely.
Quote:
Originally Posted by pan64 View Post
You would need to sum up all the times (copy, uncompress, untar, whatever), from the start (you have a backup on a USB drive) to the end (the restore is done). Then you will see which one is faster.
Right. The 2-hour-51-minute restore time did not include the time to copy from the USB to the local hard drive before untarring. I timed that with a USB 2.0 port, but not to the same target machine (not available). It took a little over 16 minutes, for both rsync and cp. So the total time for copy-from-USB/restore-from-disk would be 3 hours, 7 minutes, still much better than uncompressing/restoring from USB.

Skaperen's proposed set of tests would probably shed more light on this, but I'm satisfied that copying from USB to hard drive first, then uncompressing/restoring, is about 25% faster than uncompressing/restoring the USB-resident tarfile.

Last edited by mfoley; 05-28-2024 at 03:09 PM.
 
Tags
restore, tar