LinuxQuestions.org
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

10-27-2005, 06:31 PM   #1
Dave Kelly
Member
 
Registered: Aug 2004
Location: Todd Mission Texas
Distribution: Linspire
Posts: 215

Rep: Reputation: 31
information extraction


As usual, I have jumped in the water over my head. I have read through the 'Advanced Bash-Scripting Guide' on the Linux Documentation Project and the book 'Shell Scripting Recipes' by Chris F. A. Johnson, and I do not find the script code snippet I need. It is also possible that I don't know what I am looking for.

I have a very large file with lines of this flavor:
Code:
<li><font face="Arial, Helvetica, sans-serif"><a href="http://tldp.mirrors.dumbdave.com.au">http://tldp.mirrors.dumbdave.com.au</a><br>
<!-- Dumb Dave <tech.dumbdave.com.au> -->
I would like to wind up with a file that takes this format.
Code:
http://tldp.mirrors.dumbdave.com.au; Dumb Dave <tech.dumbdave.com.au
I did read in the book that a good application to search and extract information from a line was either 'sed' or 'awk'.

Is there a tutorial that can quickly teach me or show me how to do this? I don't use either of these in my everyday life. At my age I'm doing good to hold on to what bash I know.

Thanks
Dave
 
10-27-2005, 07:18 PM   #2
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
Pardon my regex, but maybe:
Code:
cat "very large file with lines of this flavor" | sed -e "s/<[/?a-z ].*\">//g" -e "s/<\/.*>/;/g" -e "s/<\!\-\{2\}.//g" -e "s/.-\{2\}>//g" | sed '$!N;s/\n/ /'
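
For anyone reading along, here is a rough sketch of the same idea written out with comments. It is not the one-liner above, just one possible way to do the job with a single sed call. It assumes every link line is immediately followed by its comment line, exactly as in the sample from the question. The function name extract_mirrors is made up for illustration. It also keeps the closing > of the address, which the sample output in the question leaves off.

```shell
#!/bin/sh
# Sketch only: extract_mirrors is a hypothetical helper name, and the input
# is assumed to follow the two-line layout shown in the question.
# Reads the named files, or standard input when no file is given.
extract_mirrors() {
    sed -n '
        # Only act on lines that contain a link.
        /<a href="/ {
            # Append the next line (the HTML comment) to the pattern space.
            N
            # Group 1: the URL inside href="...".
            # Group 2: the text between <!-- and -->, without padding spaces.
            s/.*<a href="\([^"]*\)".*\n<!-- *\(.*[^ ]\) *-->.*/\1; \2/p
        }
    ' "$@"
}

# Try it on the sample from the question.
extract_mirrors <<'EOF'
<li><font face="Arial, Helvetica, sans-serif"><a href="http://tldp.mirrors.dumbdave.com.au">http://tldp.mirrors.dumbdave.com.au</a><br>
<!-- Dumb Dave <tech.dumbdave.com.au> -->
EOF
```

Because of -n and the p flag, lines that do not fit the pattern are silently dropped rather than passed through, which may or may not be what you want for a very large file with mixed content.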

Last edited by unSpawn; 10-27-2005 at 08:07 PM.
 
10-27-2005, 08:56 PM   #3
Dave Kelly
Member
 
Registered: Aug 2004
Location: Todd Mission Texas
Distribution: Linspire
Posts: 215

Original Poster
Rep: Reputation: 31
Quote:
Originally posted by unSpawn
Pardon my regex,
I shall not pardon you! It is most welcome. 'regex' is something I cannot bend my mind around. I don't understand.
To me a regular expression should be 'search for this' not /\?.,??
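
For what it's worth, a regular expression can be exactly 'search for this'. The small sketch below, using one line shaped like the sample in the question, tries to show that the punctuation only comes in for the parts of the text that vary. It assumes a grep that supports -o (GNU and BSD grep both do, but it is not in POSIX).

```shell
#!/bin/sh
# One line shaped like the sample in the question (for illustration only).
line='<li><font face="Arial, Helvetica, sans-serif"><a href="http://tldp.mirrors.dumbdave.com.au">http://tldp.mirrors.dumbdave.com.au</a><br>'

# 1. Plain "search for this": -F makes grep treat the pattern as literal text.
#    -c prints the number of matching lines.
printf '%s\n' "$line" | grep -c -F 'a href='

# 2. The same search as a regex: ordinary characters still match themselves.
printf '%s\n' "$line" | grep -c 'a href='

# 3. The odd-looking part describes what varies:
#    [^"]* means "any run of characters that are not a double quote".
#    -o prints only the matched text instead of the whole line.
printf '%s\n' "$line" | grep -o 'href="[^"]*"'
```

The first two commands count the same line; the third prints just the href attribute with its URL.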

I will give your suggestion a try in a few minutes and let you know how it works.

Thanks for the reply.
Dave
 