Deleting a line matching two or more regexps in bash, sed maybe?

patolfo · 05-18-2010, 12:34 PM

Hi guys, I want to delete the lines in a file that match two regexps, using sed or some other one-line command.

Any ideas?

AlucardZero · 05-18-2010, 01:50 PM

You can pass multiple -e options to sed
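
A minimal sketch of what multiple -e expressions do, on made-up sample lines (foo and bar are placeholder patterns, not from this thread). Note that each -e delete is applied independently, so this removes lines matching either pattern, not only lines matching both:

```shell
# Sample input is piped in, so no file is modified.
either_deleted=$(printf '%s\n' 'foo only' 'bar only' 'foo and bar' 'neither' |
  sed -e '/foo/d' -e '/bar/d')

# Only the line that matches neither pattern is left.
printf '%s\n' "$either_deleted"
```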

pixellany · 05-18-2010, 01:53 PM

Show an example of the patterns you want to match, and show the before and after states of the file.

patolfo · 05-18-2010, 02:52 PM

Yep, the problem is how to combine the two regexps. This is what I tried:

Code:

grep regex1 file | sed '/regex2/d' -i file

Quote:

Originally Posted by AlucardZero

You can pass multiple -e options to sed

g0su · 05-18-2010, 05:00 PM

cat file | sed -e 's/pattern//g' -e 's/pattern//g' > newFile;
mv newFile file;

ntubski · 05-18-2010, 05:03 PM

Code:

sed -i '/regex1/{/regex2/d}' file

Andrew Benton · 05-18-2010, 05:22 PM

I like ntubski's solution (though I would wrap it in ''). But just to be different, I thought I'd say how I'd do it:

Code:

sed -i '/regex1.*regex2/d' file
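
One difference between the two forms, sketched on made-up input (foo and bar are placeholder patterns): the nested form deletes a line when both patterns occur in any order, while the single-regex form only matches when regex1 appears before regex2 on the line.

```shell
# Made-up sample lines; nothing is read from or written to a file.
sample() { printf '%s\n' 'foo then bar' 'bar then foo' 'foo alone'; }

# Nested form: the order of the two patterns on the line does not matter.
# The ; before } is harmless in GNU sed and expected by some other seds.
nested=$(sample | sed '/foo/{/bar/d;}')

# Single-regex form: only deletes lines where foo comes before bar.
ordered=$(sample | sed '/foo.*bar/d')

printf 'nested:\n%s\n\nordered:\n%s\n' "$nested" "$ordered"
```

If the order of the patterns on the line is not known in advance, the nested form is the safer choice.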

patolfo · 05-19-2010, 10:16 AM

I tried that too; the problem was that some lines contain the first pattern but not the second, and they got affected as well.

Quote:

Originally Posted by g0su

cat file | sed -e 's/pattern//g' -e 's/pattern//g' > newFile;
mv newFile file;

grail · 05-19-2010, 10:28 AM

Did you try ntubski's? Worked like a charm for me

patolfo · 05-19-2010, 03:48 PM

Quote:

Originally Posted by grail

Did you try ntubski's? Worked like a charm for me

Let me try it again, I was doing other stuff at the same time.

...

patolfo · 05-19-2010, 04:11 PM

Quote:

Originally Posted by ntubski

Code:

sed -i '/regex1/{/regex2/d}' file

Andrew, I actually like your piece of code. But ntubski (if that is a name, where does it come from?), I find yours quite interesting; the {} grouping is a first for me, I must say...

Please do not start throwing rotten tomatoes yet, but if I remember correctly, sed works from left to right, right?

So if I get it straight, this command looks for regex1, and to those lines it applies the second command, /regex2/ delete line,

which could be another command, not necessarily a delete?

So the {} can be used to put a command inside a command. Now a far-fetched question: how many levels of {} can be nested inside sed, one or more?

By the way, thanks to both of you for your code. I completely forgot about making composite regexps; I was thinking in terms of isolated terms instead of seeing the text line as a whole.

patolfo · 05-19-2010, 04:14 PM

Code:

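#!/bin/bash
# Delete lines that contain both "aunque" and "tengo", in any order.
sed -i '/aunque/{/tengo/d}' "$1"
# Alternative: only matches when "aunque" comes before "tengo".
#sed -i '/aunque.*tengo/d' "$1"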

input

Code:

ENGANCHADO A TI
(bunbury) 
aunque me haga daño
aunque sea extraño
aunque cuando no te tengo
aunque me hayas capturado
aunque me confundes
aunque me transformes
aunque sea un mr. high encantador

output

Code:

ENGANCHADO A TI
(bunbury) 
aunque me haga daño
aunque sea extraño
aunque me hayas capturado
aunque me confundes
aunque me transformes
aunque sea un mr. high encantador

Both roads take us to Rome ...
Thanks

grail · 05-19-2010, 06:46 PM

You can think of the braces as working like the ones in awk: if the preceding condition is true, the commands inside the braces are run.
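
A small sketch of that idea on made-up input (foo, bar and baz are placeholder patterns): the braces can hold other sed commands, not just d, and blocks can be nested to require more patterns. The awk line is an assumed equivalent of the two-pattern delete, added only for comparison.

```shell
# Made-up sample lines; nothing is read from or written to a file.
sample() { printf '%s\n' 'foo bar baz' 'foo bar' 'foo' 'bar'; }

# A different command inside the block: substitute only on lines matching foo.
substituted=$(sample | sed '/foo/{s/bar/BAR/;}')

# Nested blocks: delete only lines matching foo AND bar AND baz.
nested=$(sample | sed '/foo/{/bar/{/baz/d;};}')

# awk version of "delete lines matching both foo and bar".
awk_both=$(sample | awk '!(/foo/ && /bar/)')

printf '%s\n' "$substituted" '---' "$nested" '---' "$awk_both"
```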

ta0kira · 05-19-2010, 07:13 PM

Does that mean you meant "matching two of two" instead of "matching two or more" regexes? The solutions for "two or more" are different than what you ended up with. Here is one, just because I think the title of the post is more interesting.

Code:

#!/bin/bash

max_matches=1               #max number of pattern matches allowed
patterns=('aunque' 'tengo') #the patterns to match (you can use as many as you want)

file="$1"

counts="$( eval echo -n {1..$(($max_matches+1))} | tr ' ' '|' )"

{ for pattern in "${patterns[@]}"; do
  egrep -n "$pattern" "$file"
done; grep -n '' "$file"; } | sort -n | uniq -c | egrep "^ *($counts) " | sed -r 's/^[^:]+://'

Kevin Barry

PS On FreeBSD, use -E instead of -r for sed.
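
For comparison, a minimal single-pass sketch of the same idea (keep only the lines that match at most max_matches of the patterns) in awk. This is not the script above, just an assumed alternative; the function name keep_up_to is made up for illustration, and the patterns are treated as awk extended regexps.

```shell
# keep_up_to MAX PATTERN...
# Reads stdin and prints the lines that match at most MAX of the patterns.
keep_up_to() {
  local max=$1
  shift
  awk -v max="$max" '
    BEGIN {
      # Copy the patterns out of ARGV, then set ARGC to 1 so that awk
      # reads stdin instead of treating the patterns as file names.
      n = ARGC - 1
      for (i = 1; i <= n; i++) pat[i] = ARGV[i]
      ARGC = 1
    }
    {
      hits = 0
      for (i = 1; i <= n; i++) if ($0 ~ pat[i]) hits++
    }
    hits <= max
  ' "$@"
}

# Usage on a few lines from the sample input in this thread.
kept=$(printf '%s\n' 'ENGANCHADO A TI' 'aunque me confundes' 'aunque cuando no te tengo' |
  keep_up_to 1 aunque tengo)
printf '%s\n' "$kept"
```

With a limit of 1 and two patterns, only the line containing both aunque and tengo is dropped.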

patolfo · 05-19-2010, 07:32 PM

Yes, the idea is two or more. That is why ntubski's code piqued my interest: there I can nest regexps, using variables like '"$regex_n"' inside the braces.

Thanks for pointing that out. I marked the thread solved because of the possibility of nesting.

Your code, on the other hand, does what I wanted without nesting all the regexes.
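
A rough sketch of the nesting idea mentioned above, building the nested sed script from a list of patterns. The function name all_match_delete_script is made up for illustration, and the sketch assumes the patterns contain no / characters (they would need escaping).

```shell
# all_match_delete_script PATTERN...
# Prints a sed script that deletes lines matching ALL of the patterns.
all_match_delete_script() {
  local open='' close='' p
  for p in "$@"; do
    open="$open/$p/{"   # one opening block per pattern
    close="$close;}"    # and the matching closing brace
  done
  printf '%s\n' "${open}d${close}"
}

script=$(all_match_delete_script aunque no tengo)
printf '%s\n' "$script"

# Apply the generated script to a few sample lines.
filtered=$(printf '%s\n' 'aunque me confundes' 'aunque cuando no te tengo' 'no tengo' |
  sed "$script")
printf '%s\n' "$filtered"
```

Only the line containing all three patterns is removed; the other two lines are printed unchanged.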

P.S. By the way, this is where I got the data, the lyrics, I mean... the text lines.

Code:

http://www.youtube.com/watch?v=ufoANAKh-GE