[SOLVED] Deleting n number of consecutive occurrences of a pattern
When the input line matches the pattern, remember the line in an array and count down. If the counter reaches 0, throw the array away and set the counter back to N.
When it doesn't match the pattern:
- if the array isn't empty, fewer than N patterns were in a row, so write the array out. Clear the array. Set the counter back to N.
- print the current line
Nope, we are not going to write it for you.
You have been given some hints - incorporate them in your code. The countdown is a good idea, use it to also test if the current record is equal to the previous.
Sorry I couldn't resist the itch and ended up writing it. Why not share it then:
Code:
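#!/usr/bin/awk -f
BEGIN { N = 5; PAT = "0000"; ix = 0 }

# Matching line: buffer it and count down.
$0 == PAT {
    saved[ix] = $0; ix++
    N--
    if (N == 0) { delete saved; N = 5 }   # N in a row: drop the buffer
    next
}

# Non-matching line: write out any buffered (shorter) run, then the line itself.
{
    for (i in saved) print saved[i]
    delete saved
    N = 5
    print
}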
Adding this condition is left as an exercise:
Quote:
Originally Posted by Thirumala!
It should not replace if occurrences are more than n. And it should replace only if next set of occurrences are n again.
By the way, I now notice that I forgot to reset the index variable ix. Thanks to the associative nature of awk arrays, this doesn't seem to be a problem.
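For what it's worth, here is one possible reading of the quoted condition, as a rough sketch only: a run is dropped when it is exactly N lines long, and both shorter and longer runs are printed unchanged. The wrapper name drop_exact_runs and the -v parameters are made up for this illustration, and the second sentence of the quote ("only if next set of occurrences are n again") is not covered because its meaning is unclear to me.

```shell
#!/bin/sh
# Sketch under assumptions: drop_exact_runs is a hypothetical helper name.
# Usage: drop_exact_runs [N] [PATTERN] < file   (defaults: 5 and "0000")
drop_exact_runs() {
    awk -v N="${1:-5}" -v PAT="${2:-0000}" '
    # Called when a run ends: print it unless it is exactly N lines long.
    function flush(   i) {
        if (ix != N)
            for (i = 0; i < ix; i++) print saved[i]
        ix = 0
    }
    BEGIN { ix = 0 }
    $0 == PAT { saved[ix++] = $0; next }   # buffer the whole run, decide later
    { flush(); print }                     # run ended by a non-matching line
    END { flush() }                        # run ended by end of input
    '
}
```

The decision is postponed until the run ends, because only then is it known whether the run was exactly N long. The price is that the whole run sits in memory.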
@berndbausch - just remember that now this user may expect to be told answers without doing any work in the future too
But, as you have let the cat out of the bag, here are 2 points of interest:
1. What happens if the last 3 entries in the file are the pattern?
2. If you rethink your use of N, you could reduce it to only being needed once outside the definition (hint: consider ix values)
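Both points can be sketched roughly as follows. This is not berndbausch's script, just a variation on it: an END block writes out a run that is still buffered when the file ends, and the index ix doubles as the counter so that N appears only once in the rules. The wrapper name drop_runs and the -v parameters are assumptions made for the example.

```shell
#!/bin/sh
# Sketch under assumptions: drop_runs is a hypothetical helper name.
# Usage: drop_runs [N] [PATTERN] < file   (defaults: 5 and "0000")
drop_runs() {
    awk -v N="${1:-5}" -v PAT="${2:-0000}" '
    # Print the buffered lines in insertion order, then empty the buffer.
    function flush(   i) {
        for (i = 0; i < ix; i++) print saved[i]
        ix = 0
    }
    BEGIN { ix = 0 }
    $0 == PAT {
        saved[ix++] = $0          # buffer the matching line
        if (ix == N) ix = 0       # N in a row: discard the buffer
        next
    }
    { flush(); print }            # shorter run followed by other text
    END { flush() }               # point 1: run still pending at end of file
    '
}
```

Like the original, this removes the first N lines of a longer run and keeps the remainder, so a run of 7 leaves 2 lines behind.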
Polishing is an exercise for the reader, and if somebody has wrong expectations, they can be reset quickly.
Well, when I have a little more time I may do the polishing just to prove my value
in the correct order.
Because the order is to be kept, we can store it in a string as well
Code:
awk '
{ buf=buf sep $0; sep=RS } # add sep and $0 to buf; undefined variables are "" in string context; RS is newline
$0!="0000" { print buf; f=0; buf=sep=""; next } # print and clear buffer; "next" skips the following code
++f==5 { f=0; buf=sep="" } # if 5 found then clear buffer; an undefined variable is 0 in number context
END {if (f>0) print buf} # print a remaining buffer
' temp
Last edited by MadeInGermany; 11-19-2015 at 07:02 AM.
I think some of you might be getting a little too carried away with the order stuff. Try to remember what is being stored in the array, i.e. it is only ever the same pattern (0000), so really order here is pretty irrelevant.
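Taking that observation one step further, a minimal sketch: since every buffered line is identical to the pattern, a plain counter can stand in for the array or string buffer. The wrapper name drop_runs_counting and the -v parameters are hypothetical, and this only works for an exact string match as in the thread. With a regex match the buffered lines could differ and would have to be stored.

```shell
#!/bin/sh
# Sketch under assumptions: drop_runs_counting is a hypothetical helper name.
# Usage: drop_runs_counting [N] [PATTERN] < file   (defaults: 5 and "0000")
drop_runs_counting() {
    awk -v N="${1:-5}" -v PAT="${2:-0000}" '
    # Re-create the pending lines from the counter; they all equal PAT.
    function flush() {
        while (cnt > 0) { print PAT; cnt-- }
    }
    $0 == PAT { if (++cnt == N) cnt = 0; next }   # N in a row: forget them
    { flush(); print }                            # shorter run, then other text
    END { flush() }                               # run pending at end of file
    '
}
```

For example, `drop_runs_counting 5 0000 < temp` would process the same temp file used earlier in the thread.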