LinuxQuestions.org
Old 12-03-2005, 07:18 PM   #1
Wire323
LQ Newbie
 
Registered: Dec 2005
Posts: 3

Rep: Reputation: 0
Bash - Deleting duplicate records


I have a text file full of user-submitted email addresses. I want to remove the duplicate records, but it isn't as simple as using "uniq." When I find a dupe I want to remove both of them, not just one. If it's possible I'd also like to create a text file containing all of the email addresses that had duplicates.

Is this possible?

Thanks
 
Old 12-03-2005, 08:50 PM   #2
Wire323
LQ Newbie
 
Registered: Dec 2005
Posts: 3

Original Poster
Rep: Reputation: 0
I've changed things slightly. Instead of removing them completely, I'd like to leave one copy and only take the dupes out. I know I can do that with uniq, but how would I know which ones were taken out so I can write them to a file?
 
Old 12-03-2005, 10:14 PM   #3
paulsm4
LQ Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
Try this:
Code:
vi x
aaa
bbb
aaa
ccc
aaa

sort x|uniq -d
aaa
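
To tie the one-liner above back to the two requests in this thread, here is a minimal sketch of the three variants. The helper names (`dedupe`, `dupes_only`, `singles_only`) are made up for illustration, and each one reads lines on standard input. Lines are compared as exact strings, so addresses that differ only in case or trailing spaces would count as different.

```shell
#!/bin/sh
# Hypothetical helper names; each reads lines on stdin.

# One copy of every address (same result as `sort | uniq`).
dedupe() { sort -u; }

# Each address that occurred more than once, listed once.
dupes_only() { sort | uniq -d; }

# Only addresses that occurred exactly once, i.e. every copy of a
# duplicated address is dropped (the request in post #1).
singles_only() { sort | uniq -u; }

# Example with the sample data from the post above:
printf '%s\n' aaa bbb aaa ccc aaa | dupes_only
```

With a file, that would be something like `dedupe < x > cleaned` and `dupes_only < x > dupes`, where `cleaned` and `dupes` are placeholder output names.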
 
Old 12-03-2005, 10:57 PM   #4
Wire323
LQ Newbie
 
Registered: Dec 2005
Posts: 3

Original Poster
Rep: Reputation: 0
Thanks for the reply.

I don't know if this was the best way, but I was able to do it like this:

sort participants | uniq > temp1
sort participants > temp2
comm -1 -3 temp1 temp2 > temp3
sort temp3 | uniq > outputfile
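
For anyone following along, the four steps above can be wrapped up roughly like this. It is only a sketch: the function name `dupes_via_comm` and the use of a `mktemp -d` scratch directory are assumptions, not part of the original commands. The comments describe what each intermediate file holds.

```shell
#!/bin/sh
# dupes_via_comm FILE
# Hypothetical wrapper around the four commands above; prints each line
# that appears more than once in FILE.
dupes_via_comm() {
    input=$1
    tmp=$(mktemp -d) || return 1

    sort "$input" | uniq > "$tmp/temp1"   # one copy of each line
    sort "$input" > "$tmp/temp2"          # every copy, sorted

    # comm -1 -3 suppresses lines found only in temp1 and lines common to
    # both files, which leaves the surplus copies found only in temp2.
    comm -1 -3 "$tmp/temp1" "$tmp/temp2" > "$tmp/temp3"

    # A line repeated three times leaves two surplus copies, so collapse them.
    sort "$tmp/temp3" | uniq

    rm -rf "$tmp"
}

# Usage (file names as in the post above):
#   dupes_via_comm participants > outputfile
```

On small test inputs this prints the same lines as `sort participants | uniq -d`, so the temporary files are probably not needed unless you want to inspect the intermediate steps.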
 
Old 12-03-2005, 11:39 PM   #5
paulsm4
LQ Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
Try "sort participants|uniq -d"

I suspect you'll probably get the same result (but I confess - I don't know for sure!)

Anyway, glad you got it working!

Your .. PSM
 
Old 12-04-2005, 08:51 AM   #6
eddiebaby1023
Member
 
Registered: May 2005
Posts: 378

Rep: Reputation: 33
You can use "uniq -c" on the sorted file, which prefixes each line with the number of times it occurred; any line with a count greater than 1 had duplicates.
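
A rough sketch of that idea, filtering the counts with awk. The function name `count_dupes` is made up, and the awk step assumes each line is a single word with no spaces, which should hold for a list of email addresses.

```shell
#!/bin/sh
# count_dupes: read lines on stdin and print "COUNT LINE" for every line
# that occurs more than once. Hypothetical helper name.
# Assumes lines contain no whitespace, so awk's $2 is the whole line.
count_dupes() {
    sort | uniq -c | awk '$1 > 1 { print $1, $2 }'
}

# Example with the sample data from post #3:
printf '%s\n' aaa bbb aaa ccc aaa | count_dupes
```

Unlike "uniq -d", this also tells you how many times each duplicated address was submitted.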
 
  


All times are GMT -5.