File system optimized for very large file parallel HDD read
I copied the files to an empty HDD so they start from the beginning of the disk and are saved 100% contiguously, one after another. Read-ahead helped me a lot, so I can say it definitely works. I might test it in real use next week with a 64MB buffer and compare the speed against 32MB.
I have already tried cluster sizes of 32 and 64MB, but that has not helped me at all. It will stay as it is, though, since it's done.
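For what it's worth, read-ahead behavior can also be hinted per file from user space with posix_fadvise(2), in addition to the device-wide setting. A minimal sketch, assuming Linux; the file name and 1 MiB size are placeholders, not the real data set:

```python
import os

# Hypothetical example file, created here only so the sketch is runnable;
# in real use this would be one of the large files on the HDD.
path = "bigfile.bin"
with open(path, "wb") as f:
    f.write(b"\0" * (1024 * 1024))  # 1 MiB placeholder

fd = os.open(path, os.O_RDONLY)
try:
    # Hint that this descriptor will be read sequentially from start to end
    # (offset=0, length=0 means the whole file), so the kernel can grow
    # read-ahead for it.
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
    total = 0
    while True:
        chunk = os.read(fd, 64 * 1024)
        if not chunk:
            break
        total += len(chunk)
finally:
    os.close(fd)
os.remove(path)
print(total)  # prints 1048576
```

The hint only affects the one descriptor, so it is easy to A/B against the unhinted case on the same file.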
Yes, if you look at how ext4 allocates blocks, you will be surprised. Otherwise it is optimized for a multi-user, multi-tasking environment (parallel access to lots of small files), not for single-file access. https://www.kernel.org/doc/html/late...xt4/index.html
I wouldn't bet on that; the disk allocation strategy may be weird. But you can check this with 'hdparm --fibmap'.
It's quite fine: I monitor disk reads, and the HDD is 16TB with only ~3.5TB stored, and it gets maximum read speed for that part. It does range from 210 to 250MB/s at 7TB used, but the speed at offsets of 8TB+ is under 200MB/s on the third HDD.
one file:
filesystem blocksize 4096, begins at LBA 2048; assuming 512 byte sectors.
 byte_offset   begin_LBA     end_LBA  sectors
           0  4310435840  4310697983   262144
   134217728  4310697984  4318824447  8126464
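A quick sanity check on output like that: consecutive extents are contiguous on disk exactly when each extent's begin_LBA is the previous extent's end_LBA + 1. A small sketch using the numbers from the listing above:

```python
# Extents copied from the hdparm --fibmap output above:
# (byte_offset, begin_LBA, end_LBA, sectors)
extents = [
    (0,         4310435840, 4310697983,  262144),
    (134217728, 4310697984, 4318824447, 8126464),
]

def is_contiguous(extents):
    """True if every extent starts right after the previous one ends."""
    return all(prev[2] + 1 == cur[1]
               for prev, cur in zip(extents, extents[1:]))

print(is_contiguous(extents))  # True: the two extents form one run on disk

# Total file size, assuming 512-byte sectors as the header line says.
total_bytes = sum(e[3] for e in extents) * 512
print(total_bytes)  # 4294967296, i.e. exactly 4 GiB
```

So although hdparm reports two extents, they are back to back, which is the best case for a long sequential read.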
At first I had the data saved over the network in multiple tasks, and then it was a mess (reads would take over 50% longer); files were at the end of the disk even though less than 50% of the space was used. But cp or mv from disk1 to disk2 helped, at least once I formatted the 2nd HDD to be sure.
I just don't understand: you have tried a lot of things, but why don't you use an SSD? You could get much better results.
Great point.
The thing is that you cannot optimize a file system for parallel operation, because that is not how the hardware works. You can tune for best performance OF THE SYSTEM for your use case, but the real optimization is to make the hardware fit the intended function. To optimize for parallel I/O you need a storage controller and channels that can operate in parallel, or multiple controllers: and in any case accessing multiple storage devices that can be read independently and in parallel.
If you are not willing to modify the hardware, you are only tuning the software and system for the optimal performance THAT HARDWARE can achieve under that kind of use given your restrictions. (Which is still not a bad thing to do and may be sufficient for your needs. One can hope.)
Getting access that APPEARS parallel is achieved by loading as much as possible from the slow (nonparallel rotating rust) storage into faster and more parallel (RAM) storage. That is not achieved by changes directly to the file system, although a file system with good performance certainly helps. Changes to how you load from storage into RAM cache and buffers make a bigger difference, and of course there must be more than adequate RAM to hold all of the data you need to access in parallel.
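One concrete way to do that loading from user space is to ask the kernel to pull each file into the page cache ahead of time with POSIX_FADV_WILLNEED, before the parallel readers start. A sketch, assuming Linux; the temp files here stand in for the real data set:

```python
import os
import tempfile

def prefetch(paths):
    """Ask the kernel to start reading these files into the page cache."""
    for path in paths:
        fd = os.open(path, os.O_RDONLY)
        try:
            # offset=0, length=0 means "the whole file"; the kernel queues
            # the read-in asynchronously, so this loop returns quickly.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
        finally:
            os.close(fd)

# Demo on throwaway temp files standing in for the real large files.
paths = []
for _ in range(3):
    f = tempfile.NamedTemporaryFile(delete=False)
    f.write(b"x" * 4096)
    f.close()
    paths.append(f.name)

prefetch(paths)  # later reads of these files should hit RAM, not the disk
```

Note the hint is advisory: with inadequate RAM the kernel will simply evict the pages again, which is why the memory sizing below matters.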
Were I building a system for this kind of operation, it would involve a stack of 7 to 12 SSD devices on multiple channels in a RAID-5 array, because that would give you the fastest parallel performance. I would have to sit down and do the math on the memory, but twice what it would take to hold all of the buffered data would be a base, then operational memory atop that, and about 20% spare to reduce swapping. A smart engineer would then double that. (Compared to the production impact, even expensive RAM is cheap!)
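That sizing rule of thumb can be written out as a quick calculation; the input figures below are made-up placeholders for illustration, not recommendations:

```python
def ram_sizing_gib(buffered_data_gib, operational_gib):
    """Rule of thumb from above: base = 2x the buffered data,
    plus operational memory, plus 20% spare to reduce swapping,
    then double the whole thing."""
    base = 2 * buffered_data_gib
    with_spare = (base + operational_gib) * 1.20
    return 2 * with_spare

# Hypothetical workload: 32 GiB of buffered data, 8 GiB operational memory.
# (2*32 + 8) * 1.2 * 2 = roughly 172.8 GiB
print(ram_sizing_gib(32, 8))
```

The point of the worked number is just that the multipliers compound quickly, so the RAM budget ends up several times the raw data size.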