I am a newbie in shell scripting and would appreciate help with this question. Many thanks in advance, and apologies for the big input file.
I have an .xml file that is a concatenation of multiple RSS files. The requirement is to filter out all the extra content in the file and keep only the actual items.
e.g.:
<?xml version="1.0" encoding="iso-8859-1"?>
<rss>
...........
some text here
<channel>
..........
some more tags here
<item>
<title>Item Example 1</title>
<link>http://www.domain.com/link1.htm</link>
</item>
<item>
<title>Item Example 2</title>
<link>http://www.domain.com/link2.htm</link>
</item>
</channel>
</rss>
<rss>
.....
some other tags
......
<item>
<title>Item Example 3</title>
<link>http://www.domain.com/link3.htm</link>
</item>
.......
more tags
.......
<item>
<title>Item Example 4</title>
<link>http://www.domain.com/link4.htm</link>
</item>
<item>
<title>Item Example 5</title>
<link>http://www.domain.com/link5.htm</link>
</item>
</rss>
Note: an item can have more tags or attributes than shown here.
The output should be:
<item>
<title>Item Example 1</title>
<link>http://www.domain.com/link1.htm</link>
</item>
and so on for the other items.
grep -v rss ?
I think if the format of your RSS is static (one tag per line), it's quite simple to remove the unwanted tags with grep. If the format changes to, say, everything on a single line, you will need either harder regexes or an external programming language.
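For the one-tag-per-line case, a line-range filter may be easier than listing every unwanted tag for grep. The following is only a minimal sketch, assuming that each <item> and </item> tag sits on its own line as in the sample above. The function name extract_items and the file names feeds.xml and items.xml are made up for illustration.

```shell
#!/bin/sh
# Sketch: print only the <item>...</item> blocks of a file.
# Assumption: the opening and closing item tags are each on their own line.
# extract_items is a hypothetical helper name, not an existing tool.
extract_items() {
    # -n turns off sed's default printing. The address range selects every
    # line from one containing "<item>" or "<item " (an item tag with
    # attributes) through the next line containing "</item>", and p prints
    # those lines. Everything outside the ranges is dropped.
    sed -n '/<item[ >]/,/<\/item>/p' "$1"
}

# Example call (feeds.xml and items.xml are placeholder names):
# extract_items feeds.xml > items.xml
```

This would not work as intended if a whole item, or several items, were on a single line, because sed only looks for the closing tag on the lines after the one that opened the range. In that case a real XML parser is probably the safer choice.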