[SOLVED] how can I ignore or remove lines with 2 or more identical numbers in the same line?
This is called grouping and backreference. You need to create a group (this is what you are looking for) and use a backreference to specify repetition of the same string.
sed is probably the easiest tool for deleting lines based on content. It accepts the regex constructs you have already been directed to in prior threads.
Read the doco.
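As a rough illustration of grouping plus backreference, here is a minimal sketch in Python rather than sed. It assumes every number is exactly two digits and separated by spaces; the names REPEATED and drop_lines_with_repeats are made up for this example.

```python
import re

# Assumption: every number is exactly two digits, separated by spaces.
# (\d\d) captures one number as group 1; \1 must match that same text again.
REPEATED = re.compile(r"(\d\d).*\1")

def drop_lines_with_repeats(lines):
    """Return only the lines in which no two-digit number occurs twice."""
    return [line for line in lines if not REPEATED.search(line)]

sample = ["11 22 33", "11 22 11", "45 45 67"]
print(drop_lines_with_repeats(sample))  # ['11 22 33']
```

The group does the "remember this" part and the backreference does the "same thing again" part; the pattern is not safe for numbers of other lengths.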
Some man pages are easy to decipher, but the man pages for sed, awk and grep can be confusing, especially where regex is concerned. I know only the very basics of these commands. Sometimes it's hard to know when to use grouping and how to group properly. I need to study regex as much as possible.
...how can I ignore or remove lines with 2 or more identical numbers
in the same line.
His example InFile contained two-digit numbers and your solution produced a correct OutFile for this limited case. I tried to extend your solution to numbers of various lengths and was not successful. Please teach us how this is done. You might like to use this sample InFile...
Is there a clear delimitation between each number?
You can trial this with python3.
Split up string into elements of list -> 'a a b c' into ['a', 'a', 'b', 'c'] and remove extra characters like newlines
Put copy of list into set. ['a', 'a', 'b', 'c'] into {'b', 'a', 'c'} (Sets are unordered and can only contain unique values)
Check the number of elements in the list (['a', 'a', 'b', 'c'] = 4) and set ({'b', 'a', 'c'} = 3). If they are equal, print the original string since no duplicates were detected.
Code:
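#!/usr/bin/env python3
import fileinput

dlm = ' '  # delimiter between numbers

for line in fileinput.input():
    # Split the line into its numbers, dropping the trailing newline
    dlm_line = line.strip().split(dlm)
    # A set keeps only unique values, so equal sizes mean no duplicates
    if len(set(dlm_line)) == len(dlm_line):
        print(line, end='')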
For variable sized numbers, you'd need to add word boundaries before and after the group as part of the pattern. The notation is different for the different styles of regular expression:
But we also need to specify a delimiter (to avoid matching 234 inside 123456), so you need to specify zero-length boundaries: http://perldoc.perl.org/perlrebacksl...%7b%7d%2c-%5cB
It is not trivial (it looks like a zero-length pattern cannot be backreferenced), so:
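One possible reading of the word-boundary idea, sketched with Python's re module instead of sed or perl. This is not the poster's original command. The boundary \b is written out again around the backreference rather than captured in the group, and the names DUPLICATE_NUMBER and keep_lines_without_repeats are invented for this example.

```python
import re

# \b(\d+)\b captures one whole number of any length as group 1.
# \b\1\b then requires the same digits to appear again as a whole number,
# so 234 does not count as a repeat when it only occurs inside 123456.
DUPLICATE_NUMBER = re.compile(r"\b(\d+)\b.*\b\1\b")

def keep_lines_without_repeats(lines):
    """Return only the lines in which no whole number occurs twice."""
    return [line for line in lines if not DUPLICATE_NUMBER.search(line)]

sample = ["234 123456", "12 7 12", "1 11 111", "5 50 5"]
print(keep_lines_without_repeats(sample))  # ['234 123456', '1 11 111']
```

Because \b is zero-length it needs no delimiter character of its own, so the same pattern works whether the numbers are separated by spaces, commas or tabs.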
I think the answer to that question depends on you. But I'll ramble since you ask. I myself find perl much, much easier and quite fun, but part of that is that there are some key characteristics of python that I do not like at all, and I'm not able to get past that distaste. That said, there was also a big push for a long time to disparage perl. I think it was backed by M$ in an attempt to push one of their failures, but instead most people just pivoted to python and (ugh) PHP. perl has much more flexible syntax, a proven, mature catalog of modules, and more powerful regular expressions. However, most regex work can still be done in python. In favor of python is that it has been adopted by a great many successful training programmes and initiatives as a training language. The flip side of that is that it strikes me as a training language and may end up haunting us 20 years from now in bad ways, like BASIC once did. Python enjoys a certain trendiness at the moment. I also suspect, but don't fully have the skill to assess, that perl has been put together better from a CS standpoint.
One assumes all those comments pertain to perl 5. Only.
The schism in perl is no more attractive than that in python. The user has been the victim of the developers once again.
I keep trying to get into python, but it just hasn't happened.
One assumes all those comments pertain to perl 5. Only.
Yes. Perl 6 is a totally different language despite the name and the development team. I have not gotten around to looking carefully at Perl 6, it might be good it might not be. However, it is not ubiquitous like Perl 5 is, and has been for decades.