[SOLVED] Replace pattern in specific lines and column with AWK
Hi everyone,
I'm trying to replace a value in a specific column on specific lines of a file whose third column contains the same string on every line.
Code:
Name 1|Last name 1|Normal player
Name 2|Last name 2|Normal player
Name 3|Last name 3|Normal player
Name 4|Last name 4|Normal player
Name 5|Last name 5|Normal player
Name 6|Last name 6|Normal player
Name 7|Last name 7|Normal player
My goal is to replace only the first and last occurrences of "Normal player", giving the following desired output:
Code:
Name 1|Last name 1|Captain
Name 2|Last name 2|Normal player
Name 3|Last name 3|Normal player
Name 4|Last name 4|Normal player
Name 5|Last name 5|Normal player
Name 6|Last name 6|Normal player
Name 7|Last name 7|Coach
I'm not sure how to use the "IF" and "AND" conditions together. I've tried the code below, but the script replaces the string on every line.
Code:
awk 'BEGIN {OFS=FS="|"; IGNORECASE=1} {if($3 && NR=1) sub(/Normal player/,"Captain",$3); print}' inputfile
Name 1|Last name 1|Captain
Name 2|Last name 2|Captain
Name 3|Last name 3|Captain
Name 4|Last name 4|Captain
Name 5|Last name 5|Captain
Name 6|Last name 6|Captain
Name 7|Last name 7|Captain
How can I replace values in a specific column on the first and last lines within the same AWK script, without using data in other columns as a reference?
You may need to provide some more details about the data, ie is the pattern likely to repeat?
So if I use a shorter list could it look like:
Code:
Name 1|Last name 1|Normal player
Name 2|Last name 2|Normal player
Name 3|Last name 3|Normal player
Name 1|Last name 1|Normal player
Name 2|Last name 2|Normal player
Name 3|Last name 3|Normal player
So here you would need "Captain" and "Coach" on a few different lines.
If not, and your data only looks like the above, you should also check out the END and NR constructs/variables.
Thanks syg00, your suggestion is correct: I changed "=" to "=="
and now it works. grail, thanks for your help so far.
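The difference between the two operators can be seen in a small sketch. The temporary sample file is an assumption for illustration, the gawk-only IGNORECASE setting is left out, and the assignment is parenthesized so the intent is explicit:

```shell
# Sample data in a temporary file (stand-in for inputfile).
tmp=$(mktemp)
for i in 1 2 3; do
  printf 'Name %s|Last name %s|Normal player\n' "$i" "$i"
done > "$tmp"

# "NR=1" is an assignment: it resets NR to 1 on every record and yields 1 (true),
# so the substitution runs on every line.
assigned=$(awk 'BEGIN {OFS=FS="|"} {if ($3 && (NR=1)) sub(/Normal player/, "Captain", $3); print}' "$tmp")

# "NR==1" is a comparison: it is true only while reading the first record.
compared=$(awk 'BEGIN {OFS=FS="|"} {if ($3 && NR==1) sub(/Normal player/, "Captain", $3); print}' "$tmp")

printf '%s\n\n%s\n' "$assigned" "$compared"
rm -f "$tmp"
```

The first result should show "Captain" on all three lines, the second only on the first line.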
Following syg00's advice, I now have two different scripts that work separately: one replaces the pattern in the first line, third column, and the other replaces it in the last line, third column.
Code:
# To search and replace the pattern in "first line/3rd column"
awk 'BEGIN {OFS=FS="|"; IGNORECASE=1} {if($3 && NR==1) sub(/normal player/,"captain"); print}' inputfile
# To search and replace the pattern in "last line/3rd column" (in this example I know the last line is 7, but it varies)
awk 'BEGIN {OFS=FS="|"; IGNORECASE=1} {if($3 && NR==7) sub(/normal player/,"coach"); print}' inputfile
And besides this I know how to get the total number of lines in a file using:
Code:
LastRecord=$(awk 'END{print NR}' inputfile)
Now, to finish my script, I would like to combine these two scripts and the "LastRecord" variable into one single AWK script. The variable part should be something like NR==LastRecord instead of NR==7.
I've been trying to join the first two scripts as can be seen below, adding an END{} block, but it only shows me the last line.
You've got almost all of it already. You need to print each line. Remove the 3 characters "END" and see what happens.
I find you learn most by "playing around" - with a reference handy usually helps.
This works only if you know prior to running how many lines there are.
Also, as you know the format of each line, i.e. that the third column is the one needing changing and that you are changing the whole field,
you can dispense with the sub and just use: $<field_number>="<new_text>"
As syg00 has said, you already have the solution, but I thought I would give you an alternative to think about. Let me know if you need help with what is happening:
I only deleted "END" and the script prints all records.
Study + Test + Error + Test + Error + Test + ... + Great Help = Success!
My working version of the script after the changes:
awk 'BEGIN {OFS=FS="|"; IGNORECASE=1}
{if($3 && NR==1) sub(/normal player/,"captain");
if($3 && NR==7) sub(/normal player/,"coach")} {print}' inputfile
syg00, finally, with the above script, is it possible to use the variable that stores the number of the last line, "LastRecord=$(awk 'END{print NR}' inputfile)", in something like NR==LastRecord instead of NR==7?
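One common way to do this is awk's standard -v option, which hands a shell value to the awk program as an awk variable. A minimal sketch, assuming a temporary sample file in place of inputfile and the hypothetical awk variable name "last":

```shell
# Sample data in a temporary file (stand-in for inputfile).
tmp=$(mktemp)
for i in 1 2 3 4; do
  printf 'Name %s|Last name %s|Normal player\n' "$i" "$i"
done > "$tmp"

# First pass: count the records, as in the thread.
LastRecord=$(awk 'END {print NR}' "$tmp")

# Second pass: -v copies the shell value into the awk variable "last"
# (a name chosen only for this sketch).
result=$(awk -v last="$LastRecord" 'BEGIN {OFS=FS="|"}
  NR==1    {sub(/Normal player/, "Captain", $3)}
  NR==last {sub(/Normal player/, "Coach", $3)}
  {print}' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Note that the file is read twice this way: once to count the lines and once to edit them.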
grail,
Excellent! Many thanks to you too. Your latest code works precisely, even though I don't understand some parts of it. For this reason I've followed your suggestions and built something similar to your script using what I know so far:
I removed the "sub" function and assign a new value to $3 directly, so the script is simpler now, but it doesn't work correctly. That doesn't matter; it's only to compare with yours and try to understand it.
Code:
# I'm not sure if it works, because it only prints the last line.
awk 'BEGIN {OFS=FS="|"; IGNORECASE=1} NR==1{$3="Captain"} END{$3="Coach";print}' file
Looking at the different parts of your script, how it is structured and how it works compared with the one I wrote, I think that:
1-) ";f=0" --> It is like a control variable function and could take values like 0, 1, ...n depending on needs.
2-) "f{print x}" --> Evaluates the function to print variable x, but I'm not sure what it really is or what it does.
3-) "{x=$0;f=1}" --> Obviously assigns all fields to x, and f takes the value 1, as if saying that a new process will execute. I think this part stores all records to avoid showing only the last line when END{$3="coach";print} executes, as happened to me in the previous script.
4-) "NR==1{$3="captain"}" and "END{$3="coach"" --> I understand this better; I think it is a direct assignment to a specific column on a specific line.
Well, this is how I interpret your script. Could you please help me by explaining why each part is present, how each part works, what each part does, and how the script knows which is the last line?
I hope not to overwhelm you with so many questions.
1) f=0 - this is simply assigning a value to f, and you are correct that it could be any number, but for our purposes you need to think of it as a boolean, i.e. only ever 1 or 0.
2) f{print x} - f is from above and acts like an if at the front. In this case 0 is false and 1 (or any other number really) is true. So this says only print x when f is true (i.e. not 0).
3) {x=$0;f=1} - Yes to assigning the entire line to x to be printed. We are now setting f to 1 (true) so that next time through, the option in number "2)" above will work.
4) NR==1{$3="captain"} and END{$3="coach" - Change the third field to "captain" when in the first record and to "coach" when processing the last record.
Essentially everything prior to END will print all lines except the last, with the third field replaced by "captain" on the first line.
The END then allows you to print the final line with the third field as "coach".
The nice thing about doing it this way is that it doesn't matter if the file has 7 lines or 700, it will still work.
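Putting the four pieces together, a script of the kind described above might look like the following. This is a sketch reassembled from the parts quoted in this thread, not necessarily the exact script as posted. The temporary sample file is an assumption, and the "if (f)" guard and the explicit $0=x in END are additions made here so the sketch handles an empty file and does not depend on $0 still holding the last record inside END:

```shell
# Sample data in a temporary file (stand-in for the real input file).
tmp=$(mktemp)
for i in 1 2 3 4; do
  printf 'Name %s|Last name %s|Normal player\n' "$i" "$i"
done > "$tmp"

result=$(awk 'BEGIN {OFS=FS="|"; f=0}
  f     {print x}           # print the previously saved line, one record late
  NR==1 {$3="Captain"}      # first record: overwrite the whole third field
        {x=$0; f=1}         # save the current line; from now on f is true
  END   {if (f) {$0=x; $3="Coach"; print}}' "$tmp")   # the saved line is the last one

printf '%s\n' "$result"
rm -f "$tmp"
```

Because every line is printed one record late, the last line is still unprinted when END runs, which is what lets END change it without knowing the line count in advance.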
grail, I understand much, much better with your explanation. I've learned a lot in this thread, and it will help a lot in future scripts. A good example of how to use a boolean controller to print as needed.
I've put your script's mode of operation and the logic it uses, as I understand it, in graphic form in a Script Image.
There is no "last record" indicator I'm aware of. END enables you to process after you've read past the last record (i.e. reached eof). Useful for printing totals or footer lines.
Else you need something like grail offered.
Take note of 2) and 3) above - it introduces a nuance that may not be obvious.
And note the comment above re "f{print x}" - it is *not* a function, although it looks (too much) like it if you are used to other languages.
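To make the point concrete: in awk, "pattern {action}" runs the action only for records where the pattern is true, so the two programs below should behave identically. This is a small sketch with made-up input, not code from the thread:

```shell
input='one
two
three'

# Pattern form: the bare expression f guards the action.
as_pattern=$(printf '%s\n' "$input" | awk 'f {print x} {x=$0; f=1}')

# Explicit form: an if statement inside an unconditional action.
as_if=$(printf '%s\n' "$input" | awk '{if (f) print x} {x=$0; f=1}')

printf '%s\n' "$as_pattern"
```

Both print "one" and "two" only. The last line is saved in x but never printed, because neither program has an END block to flush it.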
Thanks for explaining the use of END to me; I'm now clearer about its use (in which cases, when and when not).
Regarding f{print x}, I've changed my idea that it could be like a function and now think of it as acting like an IF statement. Thanks to you both, I understand several things about the awk language at a better level.
Your priceless help is very much appreciated. I looked for help, and I received a lot.