LinuxQuestions.org
Old 03-09-2015, 03:31 PM   #1
rbees
Member
 
Registered: Mar 2004
Location: northern michigan usa
Distribution: Debian Squeeze, Whezzy, Jessie
Posts: 921

Rep: Reputation: 46
after all day I still sed it wrong


Ladies & Gents,

Thanks again for all the great help here.

I have been at it all day trying to figure out how to do what should be a simple find & replace with sed. Sed is winning. I could have edited a couple of the files by hand in this much time, but there are some 900 of them, so...

The goal: take a list of antiquated words and phrases and replace them with modern equivalents. The list (75 entries) is in a two-column CSV file. I have managed to get the entries passed to sed, but sed is returning an error.

The code
Code:
#!/bin/bash

# Script to change antiquated words to more modern language

 set -x


# path to word list file
DPATH="$HOME/bin/scripting/JPSupdate.csv"
# Path to files to be edited
BPATH="$HOME/bin/shabbat/tmp/*"

while read line; do
IFS=, read -a arr <<< "$line"
#  echo "${arr[0]}"	# old word
#  echo "${arr[1]}"	# new word
  sed -i.orig "s/"${arr[0]}"/"${arr[1]}"/gi" $BPATH
#  echo "${arr[0]}-${arr[1]} &" >> "$HOME/bin/output.jps"

done < "$DPATH"
The debug
Code:
+ read line
+ IFS=,
+ read -a arr
+ sed -i.orig s/doth know/knows/gi /home/kingbee/bin/shabbat/tmp/et0105.htm /home/kingbee/bin/shabbat/tmp/jpsdata
sed: -e expression #1, char 6: unterminated `s' command
+ read line
+ IFS=,
+ read -a arr
+ sed -i.orig s/hearken unto/listen to/gi /home/kingbee/bin/shabbat/tmp/et0105.htm /home/kingbee/bin/shabbat/tmp/jpsdata
sed: -e expression #1, char 9: unterminated `s' command
+ read line
It seems to me that sed may be having issues with the whitespace in the array elements. Later in the run, once it gets past the multi-word phrases, the error stops.
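The trace is consistent with that suspicion: the reported positions (char 6, char 9) are exactly the lengths of `s/doth` and `s/hearken`, which suggests sed received only the first word as its script. In `"s/"${arr[0]}"/..."` the inner quotes close the string, so the expansion is left unquoted and the shell splits it at the space. A minimal sketch of the effect, where `count_args` is a hypothetical helper and the plain variables `old` and `new` stand in for the array elements:

```shell
# count_args is a hypothetical helper: it prints how many arguments it got.
count_args() { echo "$#"; }

old="doth know"   # stands in for ${arr[0]}
new="knows"       # stands in for ${arr[1]}

# Quoting as in the script: the inner quotes END the string, so $old is
# expanded unquoted and split into two words.
broken=$(count_args "s/"$old"/"$new"/gi")

# One pair of quotes around the whole expression keeps it a single argument.
fixed=$(count_args "s/$old/$new/gi")

echo "original quoting: $broken argument(s)"
echo "single quoting:   $fixed argument(s)"
```

With the original quoting sed would see `s/doth` as its script and `know/knows/gi` as a file name, which matches the "unterminated `s' command" message.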

It is also not replacing the data correctly, so I disabled the sed command and redirected the variables to a file to check that they are being filled correctly, and it seems they are. In that output the echo writes a - where the CSV has a comma.

The redirect
Code:
and called his name-and named him &
art thou-are you &
doth know-knows &
hearken unto-listen to &
I did eat-I ate &
I do bring-I will bring &
I know not-I don't know &
is come-is &
It does one other thing that needs to be fixed: it generates a new file on every pass of the loop over the CSV file. I assume that has something to do with the -i.orig in the sed line. But getting it to process correctly first is key.
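One plausible reading of the extra files: `-i.orig` writes a backup on every sed call, and because the `*` is expanded again on each pass, the second pass also picks up the `.orig` files and backs those up in turn. A small sketch under that assumption, using a throwaway directory and a made-up file name:

```shell
dir=$(mktemp -d)                 # throwaway directory for the demonstration
printf 'art thou\n' > "$dir/sample.htm"

# Two passes, as the loop would make for two rows of the word list.
# The glob is re-expanded on every pass, so pass 2 also sees sample.htm.orig.
sed -i.orig 's/art/are/g'  "$dir"/*
sed -i.orig 's/thou/you/g' "$dir"/*

ls "$dir"                        # lists the files left behind
count=$(ls "$dir" | wc -l)
rm -rf "$dir"
```

If that is what is happening, the backups also stop holding the untouched text after the first pass, so copying the files aside once before the loop and using plain `-i` may be the safer arrangement.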


Thanks
 
Old 03-09-2015, 05:21 PM   #2
rbees
Member
 
Registered: Mar 2004
Location: northern michigan usa
Distribution: Debian Squeeze, Whezzy, Jessie
Posts: 921

Original Poster
Rep: Reputation: 46
So I found a reference saying I need to add an '&' to make sed replace the whole string, but found no joy.
Code:
sed -i.orig "s/"${arr[0]}"/"${arr[1]}" &/gi" $BPATH
As can be seen in the debug output, the variables are not being processed correctly:
Code:
+ DPATH=/home/kingbee/bin/scripting/JPSupdate.csv
+ BPATH='/home/kingbee/bin/shabbat/tmp/*'
+ read line
+ IFS=,
+ read -a arr
+ sed -i.orig s/and called his name/and named 'him &/gi' /home/kingbee/bin/shabbat/tmp/et0104.htm /home/kingbee/bin/shabbat/tmp/jpsdata
sed: -e expression #1, char 5: unterminated `s' command
+ echo 'and called his name,and named him'
+ read line
+ IFS=,
+ read -a arr
+ sed -i.orig s/are art you You thou/are 'you &/gi' /home/kingbee/bin/shabbat/tmp/et0104.htm /home/kingbee/bin/shabbat/tmp/jpsdata
sed: -e expression #1, char 5: unterminated `s' command
+ echo 'are art you You thou,are you'
+ read line
+ IFS=,
+ read -a arr
+ sed -i.orig s/doth 'know/knows &/gi' /home/kingbee/bin/shabbat/tmp/et0104.htm /home/kingbee/bin/shabbat/tmp/jpsdata
sed: -e expression #1, char 6: unterminated `s' command
+ echo 'doth know,knows'
+ read line
+ IFS=,
+ read -a arr
+ sed -i.orig s/hearken unto/listen 'to &/gi' /home/kingbee/bin/shabbat/tmp/et0104.htm /home/kingbee/bin/shabbat/tmp/jpsdata
sed: -e expression #1, char 9: unterminated `s' command
+ echo 'hearken unto,listen to'
The variables redirected to a file seem to be correct. (partial list):
Code:
and called his name,and named him
art thou,are you
doth know,knows
hearken unto,listen to
I did eat,I ate
I do bring,I will bring
But the output file has not been altered correctly as can be seen in these excerpts:
Code:
 ...... help of the look HaShem LORD.' .... unto the look HaShem LORD. 4 And .... 'Why are art you You thou angry wroth? ...... If you You thou do doest well, will shall it..... if you You thou do doest not well
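The doubled words in that excerpt ("are art", "you You thou") look like the effect of the added `&`: in the replacement part of an `s` command, an unescaped `&` stands for the text that was matched, so the old word is written back after the new one. A small sketch with a made-up input line:

```shell
line="Why art thou angry?"   # made-up sample text, not from the real files

# With '&' in the replacement, the matched text is re-inserted.
with_amp=$(printf '%s\n' "$line" | sed 's/art thou/are you &/')

# Without '&', the match is simply replaced.
without_amp=$(printf '%s\n' "$line" | sed 's/art thou/are you/')

echo "$with_amp"      # Why are you art thou angry?
echo "$without_amp"   # Why are you angry?
```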
So I am still searching Google for some direction.

Thanks again.
 
Old 03-09-2015, 05:36 PM   #3
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Build a command file with "old word or phrase" and "new word or phrase."

With this CmdFile file ...
Code:
s/dreary/dark and rainy/g
s/curious volume/obscure tome/g
s/napping/snoozing/g
... and this InFile ...
Code:
Once upon a midnight dreary, while I pondered weak and weary,
Over many a quaint and curious volume of forgotten lore,
While I nodded, nearly napping, suddenly there came a tapping,
As of some one gently rapping, rapping at my chamber door.
''Tis some visitor,' I muttered, 'tapping at my chamber door -
Only this, and nothing more.'
... this sed ...
Code:
sed -f "$CmdFile" "$InFile" > "$OutFile"
... produced this OutFile ...
Code:
Once upon a midnight dark and rainy, while I pondered weak and weary,
Over many a quaint and obscure tome of forgotten lore,
While I nodded, nearly snoozing, suddenly there came a tapping,
As of some one gently rapping, rapping at my chamber door.
''Tis some visitor,' I muttered, 'tapping at my chamber door -
Only this, and nothing more.'
Daniel B. Martin
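Since the word list in this thread already lives in a two-column CSV, the command file could be generated rather than typed. A rough sketch, assuming no field contains a comma, `/`, `&`, or a backslash (those would need escaping); the temporary files are placeholders for the real word list and command file:

```shell
# Placeholder files for illustration only.
csv=$(mktemp)
cmdfile=$(mktemp)

cat > "$csv" <<'EOF'
doth know,knows
hearken unto,listen to
EOF

# Read the two comma-separated columns and emit one s command per row.
while IFS=, read -r old new; do
    [ -n "$old" ] || continue          # skip blank lines
    printf 's/%s/%s/g\n' "$old" "$new"
done < "$csv" > "$cmdfile"

cat "$cmdfile"
result=$(printf 'He doth know; hearken unto me.\n' | sed -f "$cmdfile")
echo "$result"
rm -f "$csv" "$cmdfile"
```

Running every rule in a single `sed -f` call also means each file is read once instead of once per word.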
 
Old 03-09-2015, 06:28 PM   #4
rbees
Member
 
Registered: Mar 2004
Location: northern michigan usa
Distribution: Debian Squeeze, Whezzy, Jessie
Posts: 921

Original Poster
Rep: Reputation: 46
thanks danielbmartin,

That helped a lot, but now I have a different problem: it is finding partial words and replacing that part. For instance, one of the words searched for is "art", as in "wherefore art thou". But "earth" has "art" in it, so sed replaces that "art" with "are" and I end up with output like "field was yout in the eareh,". How do I make it skip partial words?

I had seen the use of a command file and had considered it but not gotten around to trying it.

Thanks again.
 
Old 03-09-2015, 07:14 PM   #5
rbees
Member
 
Registered: Mar 2004
Location: northern michigan usa
Distribution: Debian Squeeze, Whezzy, Jessie
Posts: 921

Original Poster
Rep: Reputation: 46
We have joy.

Thanks again for all the help.

Google revealed the solution to my new problem, and I didn't have to search for two days to find it. I only had to wrap the search text in \b markers (word boundaries), as in \bsome text\b, in the command file to get it to process correctly.
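To illustrate the difference, a short sketch assuming GNU sed (`\b` is a GNU extension and may not be available in other sed implementations); the sample sentence is made up:

```shell
text="where art thou on the earth"

# Without word boundaries, "art" inside "earth" is also replaced.
loose=$(printf '%s\n' "$text" | sed 's/art/are/g')

# With \b on both sides, only the standalone word matches (GNU sed).
strict=$(printf '%s\n' "$text" | sed 's/\bart\b/are/g')

echo "$loose"    # where are thou on the eareh
echo "$strict"   # where are thou on the earth
```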

The code:
Code:
#!/bin/bash

# Script to change antiquated words to more modern language

# set -x

# Build the command file like
# s/\band called his name\b/and named him/g
# s/\bart thou\b/are you/g

# The \b on each side of the search text (\bsome text\b) marks a word
# boundary, so partial words are not changed (beware of extra spaces).

# path to command file
DPATH="$HOME/bin/scripting/JPSupdateCMD"

# Path to files to be edited
BPATH="$HOME/bin/shabbat/tmp/*"


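#sed -f $CmdFile $InFile >$OutFile
# find supplies the file names through {}; passing $BPATH to sed as well
# would make sed process every file a second time.
find $BPATH -type f -exec \
    sed -i -f "$DPATH" {} +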

Last edited by rbees; 03-09-2015 at 07:15 PM.
 
Old 03-09-2015, 08:09 PM   #6
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,011

Rep: Reputation: 3194
Quick few things about the bash and sed original solution:

1. BPATH="$HOME/bin/shabbat/tmp/*" - In the original script this makes sense, as you want sed to work on all files under the path. In the latest script you can remove the *, as find will already look recursively for all files under the path.

2. I think I have mentioned this previously, but you do not need to read a line from a file into a variable and then split it afterwards. Just read straight into the variables on the while line:
Code:
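# original: read the line, then split it
while read line; do
IFS=, read -a arr <<< "$line"

# simpler: split while reading
while IFS=, read -r -a arr; do

# or even clearer name wise (no -a here: with -a every field would go
# into the first name and the second would stay empty)
while IFS=, read -r old_words new_words; do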
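3. As for the original sed error: it most likely comes from all the superfluous quoting. The inner quotes close the string, so the array expansions are left unquoted and get split at the spaces. You could try: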
Code:
while IFS=, read -r old_words new_words; do
  sed -i.orig "s/$old_words/$new_words/gi" $BPATH
done < "$DPATH"
Of course you would still want to place the \b's around the old words to make sure you get the correct ones

Another potential issue you may face with the above is once the * is expanded, sed may not be able to handle all the files provided.
You may then have to resort to find and xargs to help with this or you could do an outer loop for the file names and use the inner while loop for the changes:
Code:
for bfile in $BPATH; do
  while IFS=, read -r old_words new_words; do
    sed -i.orig "s/$old_words/$new_words/gi" "$bfile"
  done < "$DPATH"
done
(Something like that ... untested)
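A sketch of the find-and-xargs route mentioned above, applying a command file to the files in batches. It assumes GNU sed (for `-i` without a suffix and for `\b`) and a find/xargs pair that supports `-print0`/`-0`; the directory and file names are throwaway placeholders:

```shell
workdir=$(mktemp -d)            # placeholder for the directory of .htm files
cmdfile="$workdir.cmd"          # placeholder command file, kept outside workdir

printf '%s\n' 's/\bart thou\b/are you/g' > "$cmdfile"
printf 'Why art thou wroth?\n' > "$workdir/a.htm"
printf 'Where art thou?\n'     > "$workdir/b.htm"

# find lists the files NUL-separated; xargs batches them into as few
# sed invocations as the argument-length limit allows.
find "$workdir" -type f -print0 | xargs -0 sed -i -f "$cmdfile"

a=$(cat "$workdir/a.htm")
b=$(cat "$workdir/b.htm")
echo "$a"
echo "$b"
rm -rf "$workdir" "$cmdfile"
```

Passing the directory itself to find, rather than an expanded `*`, keeps the shell from ever building one very long argument list.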
 
Old 03-09-2015, 10:04 PM   #7
rbees
Member
 
Registered: Mar 2004
Location: northern michigan usa
Distribution: Debian Squeeze, Whezzy, Jessie
Posts: 921

Original Poster
Rep: Reputation: 46
thanks grail,

I will look at that tomorrow
 
  

