Quote:
Originally Posted by raj000
This may be a better description of what you want to do, assuming you mean "email addresses" when you say mails and that you provided a representative sample of lines from each file.
Since you are comparing entire lines, you want to end up with a file that contains the lines of file1 that do not appear in file2.
Code:
comm -23 <(sort file1) <(sort file2) >file3
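The "comm" program prints 3 columns: lines unique to file1, lines unique to file2, and lines common to both. The -23 option suppresses the second and third columns, leaving only the lines unique to file1. comm expects sorted input, which is why both files are passed through sort.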
The "comm" program is one of the programs supplied by the coreutils package so you should have it.
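To see what the command keeps, here is a small sketch on made-up data. The addresses and the temporary directory are invented for illustration, and the sorted copies are written to intermediate files (file1.sorted, file2.sorted, hypothetical names) instead of using <(sort ...), only so the sketch also runs in shells that lack process substitution.

```shell
# Hypothetical sample data; these addresses are invented for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' carol@example.com alice@example.com bob@example.com > file1
printf '%s\n' bob@example.com dave@example.com > file2

# comm requires sorted input; the intermediate files stand in for <(sort ...).
sort file1 > file1.sorted
sort file2 > file2.sorted

# -2 drops lines found only in file2, -3 drops lines common to both,
# leaving column 1: lines found only in file1.
comm -23 file1.sorted file2.sorted > file3

cat file3
```

With this data, file3 should hold alice@example.com and carol@example.com; bob@example.com is dropped because it also appears in file2.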
--
Another way of doing this is to use grep with the -f option combined with the -v option. This removes the lines of one file that match any of the patterns listed in a second file.
Code:
grep -vf <(sort file2 | uniq) file1 >file3
Sorting file2 isn't necessary for grep itself, but `uniq' only removes adjacent duplicate lines, so sorting first lets it eliminate every duplicate, which can save time if file2 contains a lot of repetition.
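One caveat with the grep approach, sketched below on invented data: with plain -f, each line of file2 is treated as a regular expression and may match anywhere in a line, so a short address can also knock out a longer one that contains it. Adding -F (fixed strings) and -x (whole-line match) is one way to get exact line comparison. The file names file3_regex and file3_exact are hypothetical, chosen only to show the two results side by side.

```shell
# Hypothetical sample data; these addresses are invented for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' ann@example.com joann@example.com bob@example.com > file1
printf '%s\n' ann@example.com > file2

# Plain -f: patterns are regular expressions and match substrings,
# so joann@example.com is removed along with ann@example.com.
grep -vf file2 file1 > file3_regex

# -F treats patterns as fixed strings and -x requires a whole-line match,
# so only the exact line ann@example.com is removed.
grep -vFxf file2 file1 > file3_exact

cat file3_regex
cat file3_exact
```

Here file3_regex should contain only bob@example.com, while file3_exact keeps both joann@example.com and bob@example.com. Unlike comm, grep leaves the surviving lines in their original file1 order.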