LinuxQuestions.org


Wire323 12-03-2005 07:18 PM

Bash - Deleting duplicate records
 
I have a text file full of user-submitted email addresses. I want to remove the duplicate records, but it isn't as simple as using "uniq." When I find a dupe I want to remove both of them, not just one. If it's possible I'd also like to create a text file containing all of the email addresses that had duplicates.

Is this possible?

Thanks

Wire323 12-03-2005 08:50 PM

I've changed things slightly. Instead of removing them completely, I'd like to leave one copy and only take the extra dupes out. I know I can do that with uniq, but how would I know which ones were taken out so I can write them to a file?

paulsm4 12-03-2005 10:14 PM

Try this:
Code:

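cat > x <<'EOF'
aaa
bbb
aaa
ccc
aaa
EOF

# sort groups identical lines together; uniq -d prints one copy of each repeated line
sort x | uniq -d
# prints: aaa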
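
For anyone following along, here is a rough sketch of how both versions of the question could be handled with sort and uniq. The file names (emails.txt, dupes.txt, no_dupes.txt, one_each.txt) and the sample addresses are made up for illustration; substitute your own file.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the real address list.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' a@example.com b@example.com a@example.com \
    c@example.com a@example.com > emails.txt

# Addresses that occur more than once, one copy of each.
sort emails.txt | uniq -d > dupes.txt

# First version of the question: drop every copy of a duplicated address.
sort emails.txt | uniq -u > no_dupes.txt

# Revised version: keep exactly one copy of each address.
sort -u emails.txt > one_each.txt

cat dupes.txt
```

uniq only compares adjacent lines, which is why every pipeline sorts first.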


Wire323 12-03-2005 10:57 PM

Thanks for the reply.

I don't know if this was the best way, but I was able to do it like this:

sort participants | uniq > temp1
sort participants > temp2
comm -1 -3 temp1 temp2 > temp3
sort temp3 | uniq > outputfile

paulsm4 12-03-2005 11:39 PM

Try "sort participants|uniq -d"

I suspect you'll probably get the same result (but I confess - I don't know for sure!)
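
One way to check that guess is to run both approaches on the same input and compare the output files. This is only a sketch on made-up sample data (the contents of participants below and the name outputfile2 are assumptions), so it shows agreement for this input rather than proving it in general.

```shell
#!/bin/sh
# Hypothetical sample input; the real participants file will differ.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' aaa bbb aaa ccc aaa bbb > participants

# Four-step version from the earlier post.
sort participants | uniq > temp1
sort participants > temp2
comm -1 -3 temp1 temp2 > temp3
sort temp3 | uniq > outputfile

# Single-pipeline version.
sort participants | uniq -d > outputfile2

# cmp -s prints nothing and exits 0 only when the files are identical.
if cmp -s outputfile outputfile2; then
    echo "same result"
else
    echo "results differ"
fi
```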

Anyway, glad you got it working!

Your .. PSM

eddiebaby1023 12-04-2005 08:51 AM

You can use "uniq -c" on the sorted file, which prefixes each line with the number of times it occurred; any line with a count greater than 1 was a duplicate.
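
A minimal sketch of the counting approach, using the same sample file x as earlier in the thread. The awk filter and the output name dupes.txt are additions for illustration, and printing the second field assumes the lines contain no whitespace, which should hold for email addresses.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' aaa bbb aaa ccc aaa > x

# uniq only counts adjacent repeats, so sort first.
# awk keeps rows whose count (first field) is greater than 1
# and prints only the line itself (second field).
sort x | uniq -c | awk '$1 > 1 { print $2 }' > dupes.txt

cat dupes.txt
```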


All times are GMT -5.