LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
Old 07-31-2003, 07:03 AM   #1
psingh
LQ Newbie
 
Registered: Jul 2003
Posts: 10

Rep: Reputation: 0
need help with a script


I need some help.

I am generating an output file from a simulation that is several gigabytes in size. The output file is ASCII text, and I can generate it from a command line option:
{command line option} > output_file.txt
How do I start writing to a new output file once the current one reaches a reasonable size? I tried checking the size of the output file and, once it approached a certain threshold, redirecting the output to a new file, but it did not quite work out:
{command line option} > output_file-$i.txt
Here, I incremented the variable i once the file size went above the threshold.

Any thoughts?
 
Old 07-31-2003, 08:28 AM   #2
sk8guitar
Member
 
Registered: Jul 2003
Location: DC
Distribution: mandrake 9.1
Posts: 415

Rep: Reputation: 30
Since I only really know Perl, maybe you could use something like this:

Code:
#!/usr/bin/perl -w
use strict;

my $base_name = "output_file";
my $max_size  = 1000000;   # bytes
my $count     = 0;
my $file_write_to = $base_name . ".txt";

# keep bumping the suffix until we find a file under the size limit
while (-e $file_write_to && (-s $file_write_to) > $max_size) {
    $count++;
    $file_write_to = $base_name . "-" . $count . ".txt";
}

open FILEHAND, ">>", $file_write_to or die "Cannot open $file_write_to: $!\n";
select FILEHAND;
print "whatever you are outputting\n";
The -s operator in Perl returns the size of a file in bytes.
 
Old 07-31-2003, 05:02 PM   #3
TheLinuxDuck
Member
 
Registered: Sep 2002
Location: Tulsa, OK
Distribution: Slack, baby!
Posts: 349

Rep: Reputation: 33
The problem here is that you want the shell's > redirection to let you monitor the output file's size, which I don't think is possible. I would suggest writing a wrapper in Perl or some such for this. You can fork the program, and instead of dumping the output to a file, your script handles the output. In Perl, it would be something like this (for 5.6.1 or better):
Code:
  open FORKIN, "-|", "/path/to/your/binary", "cmd","line","opts"
    or die "Cannot fork: $!\n";
Then, you'd simply read each line from the program, as though it were a file:
Code:
  my($line);
  while($line = <FORKIN>) {
  }
In the loop, you can count the number of characters per line, or simply count each line, and use that to determine when to open a new output file:
Code:
  my($line);
  my($char_count) = 0;
  my($file_num) = 0;
  my($max_char_count) = 20000; # bytes
  open OUT, ">output_file-$file_num.txt" or die "Cannot open: $!\n";
  while($line = <FORKIN>) {
    $char_count += length($line);
    if($char_count > $max_char_count) {
      # close the old output file and open a new one
      close OUT;
      $file_num++;
      open OUT, ">output_file-$file_num.txt" or die "Cannot open: $!\n";
      $char_count = 0;
    }
    print OUT $line;
  }
  close OUT;
  close FORKIN;
Something like that, anyway... I haven't tested this, but with some modification it should work.
 
Old 07-31-2003, 05:17 PM   #4
kev82
Senior Member
 
Registered: Apr 2003
Location: Lancaster, England
Distribution: Debian Etch, OS X 10.4
Posts: 1,263

Rep: Reputation: 51
Wouldn't the easiest way be to rewrite the output part of the simulation so it outputs to a new file every n lines, and pass n as a command line argument?

Or pipe the output to split.
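The every-n-lines rotation could be sketched like this with awk (a sketch only: seq stands in for the real simulation command here, and the output_file- prefix is made up):

```shell
# seq stands in for the real simulation command; n is lines per output file.
# awk opens a new file each time the line counter crosses a multiple of n.
seq 1 1000 | awk -v n=250 '{ print > ("output_file-" int((NR-1)/n) ".txt") }'
# yields output_file-0.txt .. output_file-3.txt, 250 lines each
```

Note awk keeps each output file open until the script exits, which is fine for a modest number of pieces.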

Last edited by kev82; 07-31-2003 at 05:19 PM.
 
Old 08-01-2003, 08:09 AM   #5
TheLinuxDuck
Member
 
Registered: Sep 2002
Location: Tulsa, OK
Distribution: Slack, baby!
Posts: 349

Rep: Reputation: 33
If I'm not mistaken, piping the output to split would still cause it to dump all it's data to one file first, and then split it. I could be wrong on that though, since I don't know the specifics of how split works. (=

AFA rewriting the prog to do the splitting.. that may no be an option. If it can be done, that would eliminate the need for a wrapper..
 
Old 08-01-2003, 08:44 AM   #6
kev82
Senior Member
 
Registered: Apr 2003
Location: Lancaster, England
Distribution: Debian Etch, OS X 10.4
Posts: 1,263

Rep: Reputation: 51
Quote:
piping the output to split would still cause it to dump all it's data to one file first
Which file would it dump it to, and why?
I don't know the innards of split myself, but I'm pretty sure it just waits until a buffer is full and then writes out a file.

I think simulation_command | split -l <lines per file> should work fine for psingh.
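As a concrete sketch of that pipeline (seq stands in for the simulation command, and the line count and output_file- prefix are just examples):

```shell
# split reads from stdin ("-") and writes fixed-size pieces with the given prefix
seq 1 1000 | split -l 250 - output_file-
ls output_file-*   # output_file-aa output_file-ab output_file-ac output_file-ad
```

split picks the alphabetic suffixes (aa, ab, ...) itself; only the prefix is up to you.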
 
Old 08-01-2003, 11:00 AM   #7
TheLinuxDuck
Member
 
Registered: Sep 2002
Location: Tulsa, OK
Distribution: Slack, baby!
Posts: 349

Rep: Reputation: 33
kev, you're probably right. There always seems to be a command-line way to do just about everything one could need to do. I just don't know a lot of them, so I tend to write my own Perl versions. (=
 
Old 08-05-2003, 04:23 AM   #8
psingh
LQ Newbie
 
Registered: Jul 2003
Posts: 10

Original Poster
Rep: Reputation: 0
Thanks, folks. By the time I got a reply, I had found that I could redirect the output via the split command and control the size of each file. Each file is about 200 megabytes.
I am running this on Red Hat 8. I now find that the processing is "extremely slow". For the first few files, the system was processing about 1 MB/minute (until about 8 files). Since then, it has been processing about 1 MB every 10 minutes. I am not sure what is slowing the system down.
I made sure I gzipped the output files as they were created. There is sufficient space on the system (less than 63% utilized). Any thoughts would be greatly appreciated.
 
Old 08-05-2003, 04:33 AM   #9
psingh
LQ Newbie
 
Registered: Jul 2003
Posts: 10

Original Poster
Rep: Reputation: 0
I opted to use the split command since I am a newbie to Perl. However, if Perl is my only option, I can redo the work. What say you, TheLinuxDuck / kev82?
 
Old 08-05-2003, 05:43 AM   #10
kev82
Senior Member
 
Registered: Apr 2003
Location: Lancaster, England
Distribution: Debian Etch, OS X 10.4
Posts: 1,263

Rep: Reputation: 51
Quote:
I now find that the processing is "extremely slow"
Compared to what? Did it run much faster before you piped it to split? What's your load average while the program is running?
Quote:
I opted to use the split command since I am a newbie to perl. However, if perl is my only option, I can redo the work.
I don't get what you're asking here. If the simulation has finished, why run it again? And why would Perl be your only option?
I don't know Perl, but from what I've heard it's pretty good for knocking up a quick solution, so it would be fine to use here. My personal choice, however, would be to rewrite the output code of the simulation so it creates the separate files itself.
 
Old 08-05-2003, 06:02 AM   #11
psingh
LQ Newbie
 
Registered: Jul 2003
Posts: 10

Original Poster
Rep: Reputation: 0
The simulation starts off extremely fast (essentially running tcpdump on a large data file).
Memory: 433 MB used, 438 MB available
Swap: 500 MB used, 1 GB available
CPU usage is rather erratic (1% to 98%). I have nothing else running.
After creating 8 files, it has slowed down considerably compared to the speed at which it started off.
It hasn't finished yet. It is about half done and processing at 1/10th the initial rate. Could it be because I am writing to just one output buffer and all split is doing is sending it to a file?
 
  

