reading a tagged log
I have a log file with simple (no metacharacters) text lines. Each entry starts with a + followed by the name of the person who added the entry to the log. So for example:
+Alan
line1
line2
more lines
some blanks lines
+David
line
more lines
still more lines
+Alan
...
...
+Chris
..
..
What is a good way to filter out all the entries by, say, Alan? Blank lines must be maintained as part of the entry. |
I don't really know the desired output, but probably awk with a well defined record separator (RS="+" probably) can do the job.
|
Yes, awk is good for this.
RS works best when the separator comes at the end of a record. But here the + is at the beginning, so perhaps a state variable is simpler: set the state when a line starting with + is met, and print the current line while the state is good. Waiting for some attempt from the OP... |
My first goal was to find a list of all unique names. I can do that with:
awk '/^+/' logfile | awk '!a[$0]++'
But I am still struggling with filtering out all entries of the same person. Ideally I could call a function with any of the names as a parameter and have all his/her entries printed. |
My approach would be to read the file line-by-line (you don't say how big these files are), and look for the + sign, then compare the name to what you're looking for...if you find it, all other lines would be pushed into an array, until you hit the NEXT line with a + at the beginning. Name match? Keep shoving data out to the array. Doesn't match? keep reading. When you're done, you'll have an array with all of Alan's data in it, and you can output to screen/file/whatever. There are also approximately 10,000 other ways to do this, but for quick-and-dirty (this sounds like homework, honestly), that'd be my approach. |
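For what it's worth, the collect-into-an-array idea above could be sketched in awk roughly like this. The sample log, the function name collect_entries and the variable names are all made up for illustration, and I have written the + anchor as [+] only because a bare + right after ^ is not well defined in every regex flavour. Treat it as a sketch under those assumptions, not a finished tool.

```shell
# Hypothetical sample log; the names and lines are invented for illustration.
log=$(mktemp)
printf '%s\n' '+Alan' 'line1' '' 'line2' '+David' 'other' '+Alan' 'line3' > "$log"

# Collect every line of the wanted entries in an array, then print at EOF.
# $1 = name to look for, $2 = log file (both hypothetical parameters).
collect_entries() {
    awk -v tag="$1" '
        /^[+]/ { keep = ($1 == "+" tag) }   # header line: does the name match?
        keep   { buf[++n] = $0 }            # store header and body lines, blanks included
        END    { for (i = 1; i <= n; i++) print buf[i] }
    ' "$2"
}

collect_entries Alan "$log"
# remove "$log" when you are done with it
```

Holding everything in an array only pays off if you want to do something with the whole set at the end (count, sort, reformat). For plain printing, a flag that prints as it goes needs no storage at all.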
I can't imagine any circumstance where you should need to pipe awk to awk - it has all the conditionals needed, and the END block for tidying up after the input has reached EOF, if needed.
The typical solution for this sort of thing is to search for your key and set a flag - print while the flag is true. Turn the flag off at the next non-key. You can pass the key in by a bash variable. Pretty straightforward. |
Code:
awk '/^+/ && !a[$0]++' logfile
You can pass a parameter like this
Code:
awk -v tag=Alan '...'
or like this
Code:
awk '...' tag=Alan |
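The two ways of passing a parameter differ in when the assignment happens, which matters if the program has a BEGIN block. A small sketch to show it (the function names show_v and show_operand are just for this demonstration):

```shell
# -v assigns before the program starts, so BEGIN already sees the value.
show_v() {
    awk -v tag=Alan 'BEGIN { print "BEGIN sees: [" tag "]" }'
}

# A trailing assignment is an operand: awk applies it when it reaches it in
# the argument list, which is after BEGIN has run but before input is read.
show_operand() {
    printf '+Alan\n' |
        awk 'BEGIN { print "BEGIN sees: [" tag "]" } { print "main sees: [" tag "]" }' tag=Alan
}

show_v
show_operand
```

With the trailing form the variable is still empty inside BEGIN, so if the program never touches the tag in BEGIN either form will do.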
Thanks for the improvements. Yes, the single awk is neat and concise. This log has just over 2000 lines. It was used by several engineers as they installed a refrigeration plant. It all went well, but I am now preparing an activity report on each engineer's contribution.
I think I may have found a way to pull all blocks of text for each person (tag) with this command:
awk '/^+/{f=(/tag/)} f' logfile # e.g. tag=Alan
No failures so far. I don't like the fact that the tag is hard-coded in the command. I guess I could use something like:
<logfile awk '/^+/{f=(/tag/)} f' tag=Alan
If there is a more efficient way I would be grateful to hear. |
Yes that's the efficient way.
But / / encloses a regex literal, so /tag/ matches the text "tag", not the value of the variable tag. Because /regex/ on its own is short for $0 ~ /regex/, you can write $0 ~ variable instead:
Code:
<logfile awk '/^+/{f=($0 ~ tag)} f' tag=Alan
The following are more precise:
Code:
<logfile awk '/^+/{f=($0 ~ ("^+" tag " *$"))} f' tag=Alan
Code:
<logfile awk '/^+/{f=($1 == ("+" tag))} f' tag=Alan |
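Since the OP asked for a function that takes a name, here is one way the pieces might be wrapped up in the shell. The function names (list_names, entries_by, entries_matching) and the sample log are hypothetical, and the sketch assumes each name is a single word right after the +. The anchor is written [+] to stay clear of any doubt about a bare + after ^.

```shell
# Hypothetical sample log, with one name ("Al") that is a prefix of another.
log=$(mktemp)
printf '%s\n' '+Alan' 'compressor fitted' '' 'pressure test ok' \
              '+Al' 'valves checked' '+Alan' 'final sign-off' > "$log"

# List each distinct name once, without the leading +.  $1 = log file.
list_names() {
    awk '/^[+]/ && !seen[$1]++ { print substr($1, 2) }' "$1"
}

# Print all entries for one name, comparing the first field exactly.
# $1 = name, $2 = log file.
entries_by() {
    awk -v tag="$1" '/^[+]/ { f = ($1 == "+" tag) } f' "$2"
}

# Looser variant for comparison: regex match anywhere in the header line.
entries_matching() {
    awk -v tag="$1" '/^[+]/ { f = ($0 ~ tag) } f' "$2"
}

entries_by Al "$log"         # only the +Al entry
entries_matching Al "$log"   # also picks up +Alan, because "Alan" contains "Al"
# remove "$log" when you are done with it
```

For the per-engineer report, something like `for n in $(list_names logfile); do entries_by "$n" logfile > "$n.txt"; done` would probably do, again assuming the names contain no spaces.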
I guess something like this may work:
Code:
awk 'BEGIN{RS="+"} /^Alan/ {next};1' |