LinuxQuestions.org
Old 10-23-2022, 09:18 PM   #1
sysmicuser
Member
 
Registered: Mar 2010
Posts: 458

Rep: Reputation: 0
Extract first word after a match


Hey Guys,


I have a file like this:

cat extract_data.text
Code:
Delete Unattached Managed Standard SSD Volume pvc-1566b063-fcc6-45a0-a95d-d16e01408807 from PZI-AU-SANDBOX-SUB001
Delete Unattached Managed Standard SSD Volume pvc-edfd546d-646a-4fb8-99a4-d400cbeb608c from PZI-AU-SANDBOX-SUB001
Delete Unattached Managed Standard SSD Volume kubernetes-dynamic-pvc-404fd0bd-10ac-4ca8-bc14-2d292632f59a from PZI-AU-SANDBOX-SUB001
Delete Unattached Managed Standard SSD Volume kubernetes-dynamic-pvc-8893f6fe-d06b-4e59-a22e-9271785f7b94 from PZI-AU-SANDBOX-SUB001
Delete Unattached Managed Standard HDD Volume kubernetes-dynamic-pvc-68a671fb-b250-42df-b335-2a4c0a375262 from PZI-AU-SANDBOX-SUB001

All I want is the resource name which comes as First word AFTER Volume. So the sample output I want is:

Code:
pvc-1566b063-fcc6-45a0-a95d-d16e01408807
pvc-edfd546d-646a-4fb8-99a4-d400cbeb608c
kubernetes-dynamic-pvc-404fd0bd-10ac-4ca8-bc14-2d292632f59a
kubernetes-dynamic-pvc-68a671fb-b250-42df-b335-2a4c0a375262
And I did try something like this:


sed -nr "s/.*Volume (\w+).*/\1/p" extract_data.text

Code:
pvc
pvc
kubernetes
kubernetes
pvc
pvc
pvc
pvc
pvc
kubernetes
kubernetes
kubernetes
kubernetes
kubernetes
kubernetes
kubernetes
kubernetes
kubernetes
kubernetes
kubernetes
kubernetes
kubernetes
kubernetes
pvc
pvc
pvc
pvc
pvc
kubernetes
kubernetes
kubernetes
kubernetes
kubernetes
kubernetes
Second try:

grep "Volume" extract_data.text |cut -f 3
Code:
Delete Unattached Managed Standard SSD Volume pvc-1566b063-fcc6-45a0-a95d-d16e01408807 from PZI-AU-SANDBOX-SUB001
Delete Unattached Managed Standard SSD Volume pvc-edfd546d-646a-4fb8-99a4-d400cbeb608c from PZI-AU-SANDBOX-SUB001
Delete Unattached Managed Standard SSD Volume kubernetes-dynamic-pvc-404fd0bd-10ac-4ca8-bc14-2d292632f59a from PZI-AU-SANDBOX-SUB001
Delete Unattached Managed Standard SSD Volume kubernetes-dynamic-pvc-8893f6fe-d06b-4e59-a22e-9271785f7b94 from PZI-AU-SANDBOX-SUB001
Delete Unattached Managed Standard HDD Volume kubernetes-dynamic-pvc-68a671fb-b250-42df-b335-2a4c0a375262 from PZI-AU-SANDBOX-SUB001
The third one is EPIC, but I am not an awk expert. It is something like the one from this post: https://stackoverflow.com/questions/...d-after-match

awk 'BEGIN{FS="Volume"} {printf ("%s:%d:%s\n", extract_data.txt, NR, $2)}'
Code:
awk: cmd. line:1: BEGIN{FS="Volume"} {printf ("%s:%d:%s\n", extract_data.txt, NR, $2)}
awk: cmd. line:1:                                                       ^ syntax error

Last edited by sysmicuser; 10-23-2022 at 09:28 PM. Reason: Providing relevant examples.
 
Old 10-23-2022, 10:31 PM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,152

Rep: Reputation: 4125
Use sed. The word class (\w) matches only letters, digits and the underscore, so it stops at the first dash. You could create a character group that includes the dash, but I'd likely use "not space", like this (untested):
Code:
sed -nr "s/.*Volume ([^[:space:]]+).*/\1/p" extract_data.text
awk can be made much simpler if the data are all that well structured.
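To make the difference concrete, here is a small sketch run against one line copied from the first post. It assumes GNU sed (for -r and \w); the middle command is just one possible character group that includes the dash, not something anyone in the thread posted.

```shell
# One sample line from the first post, used as input.
line='Delete Unattached Managed Standard SSD Volume pvc-1566b063-fcc6-45a0-a95d-d16e01408807 from PZI-AU-SANDBOX-SUB001'

# \w+ stops at the first dash, so only the leading "pvc" is captured.
printf '%s\n' "$line" | sed -nr 's/.*Volume (\w+).*/\1/p'

# A character group that adds the dash to the word characters.
printf '%s\n' "$line" | sed -nr 's/.*Volume ([[:alnum:]_-]+).*/\1/p'

# "Not space": captures everything up to the next whitespace.
printf '%s\n' "$line" | sed -nr 's/.*Volume ([^[:space:]]+).*/\1/p'
```

The first command prints only pvc; the other two print the full resource name.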
 
Old 10-23-2022, 11:27 PM   #3
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,039

Rep: Reputation: 7347
Code:
awk -F' ' '{ print $7 }' file
or something similar (not tested)
 
Old 10-23-2022, 11:37 PM   #4
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,357
Blog Entries: 3

Rep: Reputation: 3767
sed and AWK will do the job as will Perl:

Code:
perl -n -e 'm/(?<=Volume )(\S+)/ && print $1,"\n"' extract_data.text
The advantage of that approach is that the pattern matching can be quite powerful. I used Perl for everything for a very long time before learning sed and AWK.
 
Old 10-24-2022, 03:15 AM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,011

Rep: Reputation: 3194
awk just likes encouragement
Code:
awk 'n{print;n=0}/Volume/{n++}' RS=' ' file
 
Old 10-24-2022, 03:29 AM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,152

Rep: Reputation: 4125
What's wrong with something much simpler (presuming well-formed data)?
Code:
awk '/Volume/ {print $7}' file
 
Old 10-24-2022, 06:46 AM   #7
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,627

Rep: Reputation: 2556
Quote:
Originally Posted by sysmicuser View Post
All I want is the resource name which comes as First word AFTER Volume.

If the data is guaranteed well-formed, there's no need for a test - either of these would do:
Code:
awk '{print $7}' extract_data.txt
Code:
cut -d' ' -f7 extract_data.txt
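For comparison with the "cut -f 3" attempt in the first post: cut splits on TAB by default, and a line that contains no delimiter is printed unchanged (unless -s is given), which is why the whole lines came back. A minimal sketch using one line from the sample data:

```shell
# One sample line from the first post; it contains spaces but no TAB.
line='Delete Unattached Managed Standard SSD Volume pvc-edfd546d-646a-4fb8-99a4-d400cbeb608c from PZI-AU-SANDBOX-SUB001'

# Default delimiter is TAB; the line has none, so cut prints it whole.
printf '%s\n' "$line" | cut -f 3

# With a single space as the delimiter, field 7 is the resource name.
printf '%s\n' "$line" | cut -d' ' -f7
```

Note that cut treats every single space as a delimiter, so this only works when the words are separated by exactly one space.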

Otherwise, I might use slight variations on the various examples already provided...


If you need to exclude certain rows, you can say "when column 6 is Volume, print column 7". This is more restrictive than just checking that "Volume" exists somewhere within the line:
Code:
awk '$6=="Volume" {print $7}' extract_data.txt
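A rough illustration of why the stricter test matters. The second input line below is hypothetical (it is not in the original data) and only contains "Volume" as part of a longer word in column 6:

```shell
# First line is from the thread; the second is a made-up line for illustration.
data='Delete Unattached Managed Standard HDD Volume kubernetes-dynamic-pvc-68a671fb-b250-42df-b335-2a4c0a375262 from PZI-AU-SANDBOX-SUB001
Keep Attached Managed Standard SSD VolumeSnapshot snap-0001 from EXAMPLE-SUB'

# Loose test: any line containing "Volume" prints its 7th field (both lines).
printf '%s\n' "$data" | awk '/Volume/ {print $7}'

# Strict test: only lines whose 6th field is exactly "Volume" (first line only).
printf '%s\n' "$data" | awk '$6=="Volume" {print $7}'
```

The loose version also prints snap-0001 from the made-up line; the strict version prints only the kubernetes-dynamic-pvc name.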

If "Volume" is at an unknown/changing position, you can use a tweaked version of the answer grail provided:
Code:
awk -vRS='\\s' 'found {print;found=0}  /\<Volume\>/ {found=1}' extract_data.txt
The main changes are adding "\<" and "\>", which ensure "Volume" is matched as a distinct word, and renaming the variable to "found" to make it obvious what it is doing.
Using "\s" for the record separator provides more predictable behaviour at the end of lines (though not necessarily correct behaviour).
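As far as I know, "\s", "\<" and "\>" are GNU awk extensions, so other awk implementations may not accept them. A different technique that avoids them is to leave the record separator alone and loop over the fields of each line instead. This is only a sketch, and the second input line is hypothetical, added to show "Volume" at a different position:

```shell
# First line is from the thread; the second is a made-up line for illustration.
data='Delete Unattached Managed Standard SSD Volume pvc-1566b063-fcc6-45a0-a95d-d16e01408807 from PZI-AU-SANDBOX-SUB001
Detach Volume vol-example-01 from EXAMPLE-SUB'

# For each line, find the first field that is exactly "Volume"
# and print the field that follows it.
printf '%s\n' "$data" | awk '{
    for (i = 1; i < NF; i++)
        if ($i == "Volume") { print $(i + 1); break }
}'
```

Comparing whole fields with == gives the same "distinct word" guarantee as the word boundaries, and working line by line avoids the end-of-line question entirely.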


Turbocapitalist's Perl also may need a word boundary at the start, and can use "$&" instead of the capturing group.
Code:
perl -n -e 'm/(?<=\bVolume )\S+/ && print $&,"\n"' extract_data.txt
Also, because it's Perl, there's a slightly simpler version available that uses "\K" instead of the lookbehind. "\K" resets the start of the matched text, but not the match position.
Code:
perl -n -e 'm/\bVolume \K\S+/ && print $&,"\n"' extract_data.txt
And whilst Perl is very powerful, we don't need all that power here, and can just use grep with -P flag:
Code:
grep -oP '\bVolume \K\S+' extract_data.txt
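To show what the word boundary buys you, here is a small sketch. It assumes GNU grep built with PCRE support (needed for -P), and the second input line is hypothetical, added only to contain "Volume" inside a longer word:

```shell
# First line is from the thread; the second is a made-up line for illustration.
data='Delete Unattached Managed Standard SSD Volume pvc-1566b063-fcc6-45a0-a95d-d16e01408807 from PZI-AU-SANDBOX-SUB001
Mount SubVolume data-01 on EXAMPLE-HOST'

# Without \b, "Volume " also matches inside "SubVolume ", so both lines match.
printf '%s\n' "$data" | grep -oP 'Volume \K\S+'

# With \b, only the standalone word "Volume" matches (first line only).
printf '%s\n' "$data" | grep -oP '\bVolume \K\S+'
```

The first command also prints data-01; the second prints only the pvc name.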
 