LinuxQuestions.org
12-03-2008, 10:25 PM   #1
Shobhna
LQ Newbie
 
Registered: Dec 2008
Posts: 6

Rep: Reputation: Disabled
How to compare two lines and delete the duplicate line from a file?


Hi Friends,

I have a file with contents as given below.

1111|9999|||1|WHI1|Name1||0|0||
1111|9999|1111CS|9999|2|WHI1|Name1||0|0||
1111|55555|||1|MER|Name2||0|0||
1111|55555|22222|55555|2|MER|Name2||0|0||

I want to compare the first two fields (separated by the "|" symbol) of each line and delete the entire line if it duplicates an earlier one.

Example:

1111|9999|||1|WHI1|Name1||0|0||
1111|9999|1111CS|9999|2|WHI1|Name1||0|0||

In these two lines, the value "1111|9999" matches, so the second line (the duplicate) should be removed from the file.

The search should continue to the end of the file so that all duplicates are removed.

Can you please help me with this?

Thanks in advance.

Have a great day!!!
 
12-04-2008, 12:53 AM   #2
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Hi,

And welcome to LQ!

You're positive that the second line with the same
matching field is always the duplicate that needs
to be removed?
Code:
sort -t\| -k 1,2 -u dupes.txt 
1111|55555|||1|MER|Name2||0|0||
1111|9999|||1|WHI1|Name1||0|0||
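
A side note on the command above: `sort -u` returns the lines in sorted order, and POSIX does not say which of several equal-key lines survives (GNU sort keeps the first one). If the original line order matters, a minimal awk sketch along the following lines may be closer to the request. It assumes the key is exactly fields 1 and 2; `dedupe_first_two` and `seen` are made-up names used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep only the first line for each (field 1, field 2)
# pair, preserving input order. Reads the files given as arguments, or stdin.
dedupe_first_two() {
    # seen[key]++ evaluates to 0 the first time a key appears, so
    # !seen[key]++ is true only for the first occurrence and awk prints it.
    awk -F'|' '!seen[$1 FS $2]++' "$@"
}

# Usage with the sample data from the first post:
dedupe_first_two <<'EOF'
1111|9999|||1|WHI1|Name1||0|0||
1111|9999|1111CS|9999|2|WHI1|Name1||0|0||
1111|55555|||1|MER|Name2||0|0||
1111|55555|22222|55555|2|MER|Name2||0|0||
EOF
# prints:
# 1111|9999|||1|WHI1|Name1||0|0||
# 1111|55555|||1|MER|Name2||0|0||
```

Unlike the sort version, this keeps the 9999 line ahead of the 55555 line, because nothing is reordered.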
 
12-04-2008, 01:14 AM   #3
Shobhna
LQ Newbie
 
Registered: Dec 2008
Posts: 6

Original Poster
Rep: Reputation: Disabled
Yes, the second line will always be the duplicate if it matches.
Thanks a lot for your help

Last edited by Shobhna; 12-04-2008 at 01:16 AM.
 
12-04-2008, 03:03 AM   #4
Shobhna
LQ Newbie
 
Registered: Dec 2008
Posts: 6

Original Poster
Rep: Reputation: Disabled
Hi,

From the above example file, is it possible to export the fields separated by "|" to an Excel file?

Can you please let me know?

Thanks..
 
12-04-2008, 03:09 AM   #5
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Well, not strictly speaking an Excel file, but it's trivial
to convert it to CSV, which Excel will happily open
directly:
Code:
sort -t\| -k 1,2 -u dupes.txt | sed -e 's/|/","/g' -e 's/^/"/' -e 's/$/"/' > no_dupes.csv
*if* this is what you mean by "exporting to an excel file".
If it's not - please explain in more detail what you're
trying to achieve.

Last edited by Tinkster; 12-04-2008 at 03:10 AM.
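
The sed command above wraps every field in double quotes. If a field could itself contain a double quote, the usual CSV convention is to double it first. The following is only a sketch of that variation: it assumes no field contains a newline, and `pipe_to_csv` is a hypothetical name, not something from the thread.

```shell
#!/bin/sh
# Hypothetical helper: convert pipe-delimited lines to fully quoted CSV.
# Order of the sed expressions matters:
#   1. double any existing double quotes (CSV escaping),
#   2. turn each "|" into a closing quote, comma, opening quote,
#   3. add the opening quote at the start of the line,
#   4. add the closing quote at the end of the line.
pipe_to_csv() {
    sed -e 's/"/""/g' \
        -e 's/|/","/g' \
        -e 's/^/"/' \
        -e 's/$/"/' "$@"
}

# Usage with one line of the sample data:
printf '%s\n' '1111|9999|||1|WHI1|Name1||0|0||' | pipe_to_csv
# prints: "1111","9999","","","1","WHI1","Name1","","0","0","",""
```

For the sample data, which has no embedded quotes, the result is the same as with the original three-expression sed.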
 
12-04-2008, 03:23 AM   #6
Shobhna
LQ Newbie
 
Registered: Dec 2008
Posts: 6

Original Poster
Rep: Reputation: Disabled
Thanks a lot..
I'll try this.
 
12-04-2008, 04:07 AM   #7
skob
LQ Newbie
 
Registered: Dec 2008
Posts: 6

Rep: Reputation: 0
In Excel, try File -> Import; there you can set the delimiter character to '|' or whatever you need.
 
12-04-2008, 07:40 AM   #8
Shobhna
LQ Newbie
 
Registered: Dec 2008
Posts: 6

Original Poster
Rep: Reputation: Disabled
Hi,

Can somebody help me with this?

How can I add a new field to the end of each line in a file, based on a condition?

For example:

If the 6th field is "MER", append "NEW1" and "Mercury" to the end of the line; if it is "WHI1", append "NEW2" and "White". The appended values are separated by "|", as shown below.

Original File:
1111|55555|||1|MER|Name2||0|0||
1111|9999|||1|WHI1|Name1||0|0||

New File:
1111|55555|||1|MER|Name2||0|0||NEW1|Mercury
1111|9999|||1|WHI1|Name1||0|0||NEW2|White

Thank you.
 
12-04-2008, 11:13 AM   #9
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Code:
awk -F\| '$6 == "MER" {print $0"NEW1|Mercury"} $6 == "WHI1" {print $0"NEW2|White"}' dupes.txt
1111|9999|||1|WHI1|Name1||0|0||NEW2|White
1111|55555|||1|MER|Name2||0|0||NEW1|Mercury
Cheers,
Tink
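
The one-liner above prints only the lines whose 6th field matches one of the two values. If other lines should pass through unchanged, or more values get added later, a lookup table may be easier to maintain. This is a rough sketch under those assumptions; `append_by_field6` and `extra` are hypothetical names, and the mapping is the one given in the request.

```shell
#!/bin/sh
# Hypothetical helper: append "CODE|Label" to lines whose 6th field is a
# known key; every other line is printed unchanged.
append_by_field6() {
    awk -F'|' '
        BEGIN {
            # Mapping taken from the request in this thread.
            extra["MER"]  = "NEW1|Mercury"
            extra["WHI1"] = "NEW2|White"
        }
        # The sample lines already end in "|", so no extra separator is added.
        ($6 in extra) { $0 = $0 extra[$6] }
        { print }
    ' "$@"
}

# Usage with the two-line file from the request:
append_by_field6 <<'EOF'
1111|55555|||1|MER|Name2||0|0||
1111|9999|||1|WHI1|Name1||0|0||
EOF
# prints:
# 1111|55555|||1|MER|Name2||0|0||NEW1|Mercury
# 1111|9999|||1|WHI1|Name1||0|0||NEW2|White
```

The comparison is an exact match on field 6 (array lookup), so a value such as "MERCK" would not be treated as "MER".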
 
12-04-2008, 09:53 PM   #10
Shobhna
LQ Newbie
 
Registered: Dec 2008
Posts: 6

Original Poster
Rep: Reputation: Disabled
Thank you :-)
 
12-05-2008, 01:08 PM   #11
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
welcome ... ;}
 
  
All times are GMT -5.