printing pattern match and not whole line that matches pattern
Hi all.
I've been jumping between the manuals of grep, awk and sed to find a way to print the match of a pattern. Grep seems able to print the entire line that matches the regular expression, but I want to print only the string that matches the regular expression. I could not find anything in the awk or sed manuals. For example, I have an HTML file that has many links in it. I want to output the location of the links to a plain text file. So I would need to make a regular expression similar to the following:
Code:
href="[^"\r\n]*"
I could output this to a file and then remove the href part. What tool should I be using to do this? Thanks in advance. |
Hi,
Something like this maybe:
$ echo '<A HREF="xdpyinfo.1.html">xdpyinfo(1)</A>' | sed 's/.*HREF="\(.*\)".*/\1/'
xdpyinfo.1.html
The \( , \) and \1 are the key: \1 stands for whatever was matched between \( and \) in the search string, so the substitution replaces the whole line with just that part. Hope this helps. |
That's really cool.
I've gotta get into a sed manual/tutorial one of these days :-) Thanks |
If you're using grep,
grep -o PATTERN The -o option tells it to output only the matching part of the string. Check out man grep for more info. |
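For the original question (link locations from an HTML file), a minimal sketch along these lines might do. It assumes a grep that supports -o and -i (GNU and BSD grep do; -o is not in POSIX), and the function name extract_links is just made up for illustration:

```shell
# Hypothetical helper: print only the href values found on standard input.
extract_links() {
  # -o prints each match on its own line, -i accepts href or HREF.
  # The sed step then drops everything up to the opening quote
  # and the closing quote itself.
  grep -io 'href="[^"]*"' | sed 's/^[^"]*"//; s/"$//'
}

# Example: two links on one input line give two output lines.
printf '%s\n' '<a href="a.html">A</a> <A HREF="b.html">B</A>' | extract_links
```

Because grep -o reports every match separately, several links on the same line are all picked up, which the single sed substitution above does not do.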
Handy one-liners for sed is a nice reference, too. I use it a lot :)
|
Hi all,
Quote:
grep -o is nice but doesn't offer the flexibility of using \( \) which allows you to match something bigger but print only part of it. Thanks in advance! Dirk |
Try
Code:
sed -n 's/.*HREF="\(.*\)".*/\1/p' |
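To see what -n and the trailing p do together, here is a small sketch. The function name print_href_only and the sample input are made up, and the capture group uses [^"]* instead of .* so it stops at the first closing quote (a small variation on the command above, not the same command):

```shell
# Hypothetical helper: print the HREF value of matching lines only.
print_href_only() {
  # -n turns off sed's automatic printing of every line;
  # the p flag prints a line only if the substitution succeeded.
  sed -n 's/.*HREF="\([^"]*\)".*/\1/p'
}

# The first line has no HREF, so it is not printed at all.
printf '%s\n' 'plain text line' '<A HREF="xdpyinfo.1.html">xdpyinfo(1)</A>' | print_href_only
```

Without -n and p, sed would also print "plain text line" unchanged. Note that the leading .* is greedy, so on a line with several HREFs this prints only the last one.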
My favorite sed and awk tutorials here: http://www.grymoire.com/Unix/
|
Hi,
The sed part used is just a search and print, and is indeed done on all lines in a file. It's not entirely clear to me what you want to match and what you do not want to match, but the following example should get you going again: Code:
$ cat sed.infile |
awk:
Code:
awk -F'"' 'NR>1&&$0=$2' RS='HREF=' file |
Thanks guys! Problem solved: -n in combination with /p. Will have a look at those tutorials... looking good!
Dirk |
Similar problems
Please help with this:
$ cat aa
<a href=#Say,123> >>Hi<< <a href=#Say,234> >>Hello<< <a href=#Say,345> >>World<<
Code:
$ cat aa | sed -n 's/.*href=#Say,\(.*\)>.*/\1/p'
345> >
What is the sed or awk command to get this instead:
123 234 345
If the following works, that is fine too, but the above is preferred:
$ cat bb
<a href=#Say,123> >>Hi<< <a href=#Say,234> >>Hello<<
to:
123 234 |
Using GNU sed
Code:
# Patterns such as [^<]*< limit "greedy matching" |
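The idea of limiting greedy matching with a negated character class can be sketched as follows. This is not the GNU sed command from the post above; it is a separate illustration that assumes grep -o is available, and the function name extract_say_ids is made up:

```shell
# Hypothetical helper: print the number after each "href=#Say,".
extract_say_ids() {
  # [^>]* cannot run past the closing ">", so each match stops at
  # its own tag instead of swallowing the rest of the line.
  grep -o 'href=#Say,[^>]*' | sed 's/^href=#Say,//'
}

# The sample line from the question, all on one line.
printf '%s\n' '<a href=#Say,123> >>Hi<< <a href=#Say,234> >>Hello<< <a href=#Say,345> >>World<<' | extract_say_ids
```

This prints 123, 234 and 345 on separate lines. The original sed attempt returned "345> >" because both .* parts are greedy: the first one runs to the last href on the line and the second one runs to the last ">".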
It works
It works, but then I ran into other problems, so I wrote it in awk instead. Thanks
|