Programming
This forum is for all programming questions. The question does not have to be directly related to Linux, and any language is fair game.
I am generating an output file from a simulation that is several gigabytes in size. The output file is ASCII text and I can generate it from a command line option.
{command line option} > output_file.txt
How do I start writing to a new output file once the current output file reaches a reasonable size?
I tried to check the size of the output file and, once it approached a certain size, redirect the output to a new file ... only it did not quite work out, since the shell sets up the redirection once, when the command starts, and a later change never reaches the running process.
{command line option} > output_file-$i.txt
Here, I incremented the variable i once the file size was above a certain threshold.
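(As a sketch of why that approach fails, assuming the attempt looked roughly like the lines below: the shell opens the redirection target once, when the command starts, so incrementing $i afterwards has no effect on the already-running process, which keeps writing to the file it originally opened.)
Code:
i=0
{command line option} > output_file-$i.txt &
# watching the file size and then bumping i does nothing here:
# the running command still holds the original output file open
i=$((i+1))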
Since I only really know Perl, you could maybe use something like this:
Code:
#!/usr/bin/perl -w
use strict;

my $base     = "output_file";   # base name for the rotated files
my $max_size = 1000000;         # start a new file after roughly this many bytes
my $count    = 0;
my $bytes    = 0;

open OUT, ">", "$base-$count.txt" or die "Cannot open: $!";
while (my $line = <STDIN>) {
    print OUT $line;
    $bytes += length($line);
    if ($bytes > $max_size) {   # current chunk is full: rotate
        close OUT;
        $count++;
        $bytes = 0;
        open OUT, ">", "$base-$count.txt" or die "Cannot open: $!";
    }
}
close OUT;
The -s operator in Perl will return you the size of a file in bytes. (The version above counts the bytes it has written instead, since -s can lag behind while output is still sitting in the stdio buffer.)
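(For reference, a sketch of how a script like that would sit in the pipeline, assuming it is saved as rotate.pl and reads the simulation's output on standard input, as the version above does:)
Code:
{command line option} | perl rotate.pl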
The problem here is that you're wanting the shell's > redirection to let you monitor the output file's size, which I don't think is possible. I would suggest that you consider writing a wrapper in Perl or some such for this. You can fork off the program and, instead of dumping the output to a file, have your script handle the output. In Perl, it would be something like this (for 5.6.1 or better):
Code:
open FORKIN, "-|", "/path/to/your/binary", "cmd", "line", "opts"
    or die "Cannot fork: $!\n";
Then, you'd simply read each line from the program, as though it were a file:
Code:
my $line;
while ($line = <FORKIN>) {
    # handle each line of the program's output here
}
In the loop, you can count the number of chars per line, or simply count each line, and use that to determine when to open a new output file.
Code:
my $line;
my $char_count     = 0;
my $max_char_count = 20000;   # bytes per output file
my $file_num       = 0;

# "out.N" is just a placeholder name; use whatever suits you
open OUT, ">out.$file_num" or die "Cannot open: $!";
while ($line = <FORKIN>) {
    $char_count += length($line);
    if ($char_count > $max_char_count) {
        # close old output file and open new one
        close OUT;
        $file_num++;
        open OUT, ">out.$file_num" or die "Cannot open: $!";
        $char_count = 0;
    }
    print OUT $line;
}
close OUT;
close FORKIN;
Something like that, anyway... I haven't tested this, but with some modification, it should work.
If I'm not mistaken, piping the output to split would still cause it to dump all its data to one file first, and then split it. I could be wrong on that though, since I don't know the specifics of how split works. (=
As far as rewriting the program to do the splitting goes... that may not be an option. If it can be done, that would eliminate the need for a wrapper.
Quote:
piping the output to split would still cause it to dump all its data to one file first
Which file would it dump it to, and why?
I don't know the innards of split myself, but I'm pretty sure it just waits until a buffer's full, then writes out a file.
I think simulation_command | split -l (num lines per file) should work fine for psingh.
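For example (a sketch; simulation_command stands in for the real program, and 100000 is an arbitrary line count):
Code:
simulation_command | split -l 100000 - output_file-
The trailing - tells split to read from standard input, and output_file- is the prefix for the chunks it writes (output_file-aa, output_file-ab, and so on).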
kev, you're probably right. There always seems to be a command-line way to do just about everything one could need to do. I just don't know about a lot of them, so I tend to write my own Perl versions. (=
Thanks folks. By the time I had gotten a reply, I found that I could redirect the output via the split command and control the size of the files. Each file is about 200 megabytes.
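(For anyone searching later: the size-based form would be something like the line below, assuming GNU split, whose -b option takes a byte count and accepts m for megabytes.)
Code:
simulation_command | split -b 200m - output_file-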
I am running this on Red Hat 8. I now find that the processing is "extremely slow". For the first few files, the system was processing about 1 MB per minute (until about 8 files). Since then, it has been processing about 1 MB every 10 minutes. I am not sure what is slowing the system down.
I made sure I gzipped the output files as they were created. There is sufficient space on the system (less than 63% utilized). Any thoughts will be greatly appreciated.
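(One rough way to compress finished chunks while split is still running, a sketch only and not necessarily what was done here: periodically gzip every chunk except the newest one, which split may still be writing. The final chunk needs one last gzip by hand after the job exits.)
Code:
# every minute, compress all chunks except the newest uncompressed one;
# already-compressed .gz files are filtered out; stop with Ctrl-C
while sleep 60; do
    ls -t output_file-* 2>/dev/null | grep -v '\.gz$' | tail -n +2 | xargs -r gzip
done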
Quote:
I now find that the processing is "extremely slow"
Compared to what? Did it run much faster before you piped it to split? What's your load average while the program's running?
Quote:
I opted to use the split command since I am a newbie to perl. However, if perl is my only option, I can redo the work.
I don't get what you're asking here. If the simulation's finished, why run it again? Why would Perl be your only option?
I don't know Perl, but from what I've heard it's pretty good for knocking up a quick solution and therefore would be fine to use here. My personal choice, however, would be to rewrite the output code of the simulation so it creates the separate files itself.
The simulation starts off extremely fast (essentially running tcpdump on a large data file).
Memory: 433M used, 438M available
Swap: 500M used, 1G available
CPU usage is rather erratic (1% to 98%). I have nothing else running.
After creating 8 files, it has slowed down considerably (compared to the speed when it started off).
It hasn't finished yet. It is about half done and processing at 1/10th the initial rate. Could it be due to the fact that I am writing to just one output buffer and all split is doing is sending it to a file?
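(A standard diagnostic, not something from this thread: given the 500M of swap in use reported above, it is worth watching vmstat while the job runs to see whether the box is swapping.)
Code:
# print memory and I/O activity every 5 seconds; sustained non-zero
# si/so columns mean the machine is swapping, which would explain the slowdown
vmstat 5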