uniq -u : does not seem to remove duplicate lines
I am trying to comb through my local.cf file and remove all the duplicate blacklist_from entries. I ran:

uniq -u local.cf output.cf

It did trim about 45 lines out of the file, but there are still many, many duplicate lines. I thought maybe they were different somehow, in a way not visible to the eye, but then I ran:

sort output.cf | uniq -dc

This gave me a line count for all the dups, and there are still many, many, as you can see below. HELP :)

root@LINUX03:/home/backups# sort output.cf | uniq -dc
3
11 #
12 blacklist_from 1800FLOWERS@e.1800flowers.com
2 blacklist_from acrane@amgacademy.com
2 blacklist_from alejmagna@hotmail.com
10 blacklist_from alerts@personals.yahoo.com
3 blacklist_from Allen_Brothers@mail.vresp.com
8 blacklist_from Borders@e.borders.com
2 blacklist_from buy.com_offers@enews.buy.com
5 blacklist_from capitalone@email.capitalone.com
2 blacklist_from customerservice@duebrightlive.info
2 blacklist_from customerservice@ehealthinsurance.com
2 blacklist_from customerservice@mymorepayhomeonline.info
2 blacklist_from customerservice@youreraseduelive.info
2 blacklist_from directv@customerinfo.directv.com
3 blacklist_from email@email.creditreport.com
8 blacklist_from email@email.hotels.com
4 blacklist_from etrade@email.etradefinancial.com
30 blacklist_from group-digests@linkedin.com
4 blacklist_from HHonors@h3.hilton.com
4 blacklist_from info@aiueducationonline.com
4 blacklist_from info@birdiebug.com
2 blacklist_from info@promo-em.jetblue.com
2 blacklist_from info@samstailor.com
2 blacklist_from invite@naymz.com
6 blacklist_from iprint@specials.iprint.com
12 blacklist_from JobAlerts@CyberCoders.com
2 blacklist_from lilly@sportsub.com
16 blacklist_from listmaster@thegolfchannel.com
7 blacklist_from mail@netapp.com
2 blacklist_from mail@news.beachcamera.com
2 blacklist_from microsoft@reply.digitalriver.com
2 blacklist_from mike.moreno_at_mbofpleasanton.com@mmserver.com
2 blacklist_from Mimosa_Systems@mail.vresp.com
6 blacklist_from movies@news.fandango.com
2 blacklist_from mwilkinson@serrahs.com
2 blacklist_from nancyp@saintmatthew.org
2 blacklist_from newsletter@reply.ticketmaster.com
2 blacklist_from notifications@email.etradefinancial.com
6 blacklist_from NutriSystem@news.nutrisystem.com
4 blacklist_from paypal@email.paypal.com
2 blacklist_from PGATOUR@pgatouremail.com
2 blacklist_from PGATOUR@weic11.com
4 blacklist_from radioshack@em.radioshack.com
2 blacklist_from Rebecca_Salie@mail.vresp.com
4 blacklist_from replies@oracle-mail.com
10 blacklist_from reply@igmemail.com
2 blacklist_from rexspelling@resumespider.com
28 blacklist_from rushinahurry@rushlimbaugh.com
2 blacklist_from sanjoseexecutives@gmail.com
2 blacklist_from store-news@amazon.com
4 blacklist_from Store-News@ShopAETV.p0.com
2 blacklist_from support@myremoveliability.info
2 blacklist_from TheHartford@weic11.com
4 blacklist_from updates@linkedin.com
4 blacklist_from update@stubhub-mail.com
2 blacklist_from ups@upsemail.com
2 blacklist_from vmwareteam@connect.vmware.com
2 blacklist_from voyages@viator.messages1.com
2 blacklist_from WebEx@weic11.com |
You must supply uniq with sorted data, because it only compares adjacent lines. Try 'sort local.cf | uniq -c'
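
To see why the input has to be sorted, here is a minimal sketch. The sample file and its addresses are made up for illustration and are not taken from the poster's local.cf.

```shell
# Hypothetical sample data standing in for a real local.cf.
tmp=$(mktemp -d)
printf '%s\n' \
  'blacklist_from a@example.com' \
  'blacklist_from b@example.com' \
  'blacklist_from a@example.com' \
  'blacklist_from b@example.com' > "$tmp/sample.cf"

# Unsorted: no two identical lines are adjacent, so uniq removes nothing.
unsorted_count=$(uniq "$tmp/sample.cf" | wc -l | tr -d ' ')

# Sorted first: identical lines become adjacent and collapse into one.
sorted_count=$(sort "$tmp/sample.cf" | uniq | wc -l | tr -d ' ')

echo "uniq alone:  $unsorted_count lines"
echo "sort | uniq: $sorted_count lines"
rm -rf "$tmp"
```

With this sample, uniq on its own should leave all four lines in place, while sort | uniq should reduce them to two.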
|
The same goes for the "comm" command.
comm -3 <(sort list1) <(sort list2) |
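
A small sketch of the comm case, assuming two made-up lists (list1 and list2 here are hypothetical files, not anything from the thread). The command above uses bash process substitution; this version writes the sorted copies to temporary files instead so it also runs in a plain sh.

```shell
# Use one collation order for both sort and comm.
export LC_ALL=C
tmp=$(mktemp -d)

# Hypothetical, deliberately unsorted input lists.
printf '%s\n' cherry apple banana > "$tmp/list1"
printf '%s\n' banana date apple   > "$tmp/list2"

# comm expects both inputs to be sorted.
sort "$tmp/list1" > "$tmp/list1.sorted"
sort "$tmp/list2" > "$tmp/list2.sorted"

# -3 suppresses the third column (lines common to both files), leaving
# lines only in list1 (no indent) and lines only in list2 (one tab).
only_diff=$(comm -3 "$tmp/list1.sorted" "$tmp/list2.sorted")
printf '%s\n' "$only_diff"
rm -rf "$tmp"
```

Here the output should be "cherry" followed by a tab-indented "date", since apple and banana appear in both lists.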
Why use two programs?
sort -u local.cf
sort has a unique option (-u/--unique). |
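
As a quick check that the one-program form gives the same result, the sketch below compares the two outputs on a made-up sample file. It assumes the C locale and no sort key options; with key or case-folding options the two forms can differ, because sort -u then treats lines with equal keys as duplicates.

```shell
export LC_ALL=C
tmp=$(mktemp -d)

# Hypothetical sample data standing in for local.cf.
printf '%s\n' \
  'blacklist_from b@example.com' \
  'blacklist_from a@example.com' \
  'blacklist_from b@example.com' > "$tmp/sample.cf"

sort "$tmp/sample.cf" | uniq > "$tmp/two_programs"
sort -u "$tmp/sample.cf"     > "$tmp/one_program"

# cmp -s is silent and only sets the exit status.
if cmp -s "$tmp/two_programs" "$tmp/one_program"; then
  same=yes
else
  same=no
fi
echo "identical output: $same"
rm -rf "$tmp"
```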
Quote:
Look at that file:
$ cat file
Code:
one
$ uniq -u file
Code:
one
$ uniq -dc file
Code:
2 three
$ sort file | uniq -c
Code:
4 four
$ sort file | uniq -c | sort -nr
Code:
4 four
$ sort -u file
Code:
four |
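
The difference between these flags can be sketched on a tiny made-up file (not the file from the post above). The point relevant to the original question is that uniq -u prints only the lines that are never repeated, so it throws away every copy of a duplicated line rather than keeping one.

```shell
export LC_ALL=C
tmp=$(mktemp -d)

# Hypothetical file: "a" and "b" appear twice, "c" once.
printf '%s\n' b a a c b > "$tmp/file"

# paste -s joins the output lines with spaces for compact display.
only_unrepeated=$(sort "$tmp/file" | uniq -u | paste -s -d ' ' -)
only_repeated=$(sort "$tmp/file" | uniq -d | paste -s -d ' ' -)
deduped=$(sort -u "$tmp/file" | paste -s -d ' ' -)

echo "uniq -u (never repeated):      $only_unrepeated"   # c
echo "uniq -d (repeated, one each):  $only_repeated"     # a b
echo "sort -u (each line once):      $deduped"           # a b c
rm -rf "$tmp"
```

Only the last form keeps exactly one copy of every distinct line, which appears to be what the original poster was after.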
I know that.
I was just trying to say that sort file | uniq is longer than it needs to be; use sort -u file instead. |
I quoted your comment in my post, but the post as a whole was directed at boxb29 rather than at you. I'm not sure what he wants to achieve, but I suppose he wants a "clean" file containing each distinct line only once. If so, your advice is the best solution to that problem.
|
perfect...
Code:
sort -u local.cf > new.file |
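
If the goal is to update the file itself rather than write a second one, a possible sketch is below. It assumes the order of lines in the file does not matter, since sorting will also move comments and blank lines. The file here is a made-up stand-in created in a temporary directory. Note that redirecting with > onto the input file would truncate it before sort reads it, whereas sort's -o option may safely name the input file.

```shell
tmp=$(mktemp -d)

# Hypothetical stand-in for the real local.cf.
printf '%s\n' \
  'blacklist_from x@example.com' \
  'blacklist_from y@example.com' \
  'blacklist_from x@example.com' > "$tmp/local.cf"

# Keep a backup before changing anything.
cp "$tmp/local.cf" "$tmp/local.cf.bak"

# sort reads all of its input before writing, so -o can point at the input.
sort -u -o "$tmp/local.cf" "$tmp/local.cf"

line_count=$(wc -l < "$tmp/local.cf" | tr -d ' ')
backup_count=$(wc -l < "$tmp/local.cf.bak" | tr -d ' ')
echo "lines after: $line_count (backup has $backup_count)"
rm -rf "$tmp"
```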