I have been at it all day trying to figure out how to do what should be a simple find & replace with sed. Sed is winning. I could have edited a couple of the files by hand in this much time, but there are some 900 of them, so...
The goal: take a list of antiquated words and phrases and replace them with modern equivalents. The list (75 entries) is in a two-column CSV file. I have managed to get those passed to sed, but sed is returning an error.
The code
Code:
#!/bin/bash
# Script to change antiquated words to more modern language
set -x
# path to word list file
DPATH="$HOME/bin/scripting/JPSupdate.csv"
# Path to files to be edited
BPATH="$HOME/bin/shabbat/tmp/*"
while read line; do
IFS=, read -a arr <<< "$line"
# echo "${arr[0]}" # old word
# echo "${arr[1]}" # new word
sed -i.orig "s/"${arr[0]}"/"${arr[1]}"/gi" $BPATH
# echo "${arr[0]}-${arr[1]} &" >> "$HOME/bin/output.jps"
done < "$DPATH"
The debug
Code:
+ read line
+ IFS=,
+ read -a arr
+ sed -i.orig s/doth know/knows/gi /home/kingbee/bin/shabbat/tmp/et0105.htm /home/kingbee/bin/shabbat/tmp/jpsdata
sed: -e expression #1, char 6: unterminated `s' command
+ read line
+ IFS=,
+ read -a arr
+ sed -i.orig s/hearken unto/listen to/gi /home/kingbee/bin/shabbat/tmp/et0105.htm /home/kingbee/bin/shabbat/tmp/jpsdata
sed: -e expression #1, char 9: unterminated `s' command
+ read line
It seems to me that sed may be having issues with there being whitespace in the array. Later in the script when it gets past the phrases it seems to stop this behavior.
It is also not replacing the data correctly so I redirected the output, without the sed command active, to a file to check that the variables are being filled correctly and it seems so. The , in the csv has been replaced by a -.
The redirect
Code:
and called his name-and named him &
art thou-are you &
doth know-knows &
hearken unto-listen to &
I did eat-I ate &
I do bring-I will bring &
I know not-I don't know &
is come-is &
It does one other thing that needs to be fixed. It is generating a new file for every loop in the csv file. I assume that has something to do with the -i.orig in the sed line. But getting it to process correctly first is key.
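A guess at where the extra files come from, not confirmed in this thread: the glob in BPATH is expanded again for every line of the word list, so the second pass also matches the .orig backups left by the first pass and backs those up in turn. A throwaway sketch, assuming GNU sed; the directory and file name are made up:

```shell
#!/bin/bash
# Sketch only: assumes GNU sed and works in a temporary directory.
demo_dir=$(mktemp -d)
echo "thou art" > "$demo_dir/sample.htm"

# Two passes over the same glob, as two lines of the word list would cause.
sed -i.orig 's/thou/you/g' "$demo_dir"/*
sed -i.orig 's/art/are/g' "$demo_dir"/*

# The second pass also matched sample.htm.orig, so a backup of the
# backup (sample.htm.orig.orig) now exists next to it.
ls "$demo_dir"
```

If that is what is happening, the first backup is also overwritten on every pass, so the .orig file no longer holds the untouched original.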
So I found reference to needing to add a '&' to cause sed to replace the whole string but found no joy.
Code:
sed -i.orig "s/"${arr[0]}"/"${arr[1]}" &/gi" $BPATH
As can be seen in the output, the variables are not being processed correctly:
Code:
+ DPATH=/home/kingbee/bin/scripting/JPSupdate.csv
+ BPATH='/home/kingbee/bin/shabbat/tmp/*'
+ read line
+ IFS=,
+ read -a arr
+ sed -i.orig s/and called his name/and named 'him &/gi' /home/kingbee/bin/shabbat/tmp/et0104.htm /home/kingbee/bin/shabbat/tmp/jpsdata
sed: -e expression #1, char 5: unterminated `s' command
+ echo 'and called his name,and named him'
+ read line
+ IFS=,
+ read -a arr
+ sed -i.orig s/are art you You thou/are 'you &/gi' /home/kingbee/bin/shabbat/tmp/et0104.htm /home/kingbee/bin/shabbat/tmp/jpsdata
sed: -e expression #1, char 5: unterminated `s' command
+ echo 'are art you You thou,are you'
+ read line
+ IFS=,
+ read -a arr
+ sed -i.orig s/doth 'know/knows &/gi' /home/kingbee/bin/shabbat/tmp/et0104.htm /home/kingbee/bin/shabbat/tmp/jpsdata
sed: -e expression #1, char 6: unterminated `s' command
+ echo 'doth know,knows'
+ read line
+ IFS=,
+ read -a arr
+ sed -i.orig s/hearken unto/listen 'to &/gi' /home/kingbee/bin/shabbat/tmp/et0104.htm /home/kingbee/bin/shabbat/tmp/jpsdata
sed: -e expression #1, char 9: unterminated `s' command
+ echo 'hearken unto,listen to'
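The trace shows sed receiving the expression in pieces (note the stray quotes around 'him &/gi'). A minimal sketch of why that happens, assuming the default IFS; count_args is a hypothetical helper and not part of the script above:

```shell
#!/bin/bash
# count_args is a hypothetical helper that reports how many arguments it got.
count_args() { echo "$#"; }

arr=("doth know" "knows")

# Quoted as in the script: the inner quotes end the string, so the array
# expansions are unquoted and "doth know" is split at the space.
count_args "s/"${arr[0]}"/"${arr[1]}"/gi"     # prints 2

# One pair of quotes around the whole expression keeps it in one piece.
count_args "s/${arr[0]}/${arr[1]}/gi"         # prints 1
```

With two arguments, sed takes the first one (s/doth) as its whole script, which fits the "char 6: unterminated `s' command" message.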
The variables redirected to a file seem to be correct. (partial list):
Code:
and called his name,and named him
art thou,are you
doth know,knows
hearken unto,listen to
I did eat,I ate
I do bring,I will bring
But the output file has not been altered correctly as can be seen in these excerpts:
Code:
...... help of the look HaShem LORD.' .... unto the look HaShem LORD. 4 And .... 'Why are art you You thou angry wroth? ...... If you You thou do doest well, will shall it..... if you You thou do doest not well
Build a command file with "old word or phrase" and "new word or phrase."
With this CmdFile file ...
Code:
s/dreary/dark and rainy/g
s/curious volume/obscure tome/g
s/napping/snoozing/g
... and this InFile ...
Code:
Once upon a midnight dreary, while I pondered weak and weary,
Over many a quaint and curious volume of forgotten lore,
While I nodded, nearly napping, suddenly there came a tapping,
As of some one gently rapping, rapping at my chamber door.
''Tis some visitor,' I muttered, 'tapping at my chamber door -
Only this, and nothing more.'
... this sed ...
Code:
sed -f $CmdFile $InFile >$OutFile
... produced this OutFile ...
Code:
Once upon a midnight dark and rainy, while I pondered weak and weary,
Over many a quaint and obscure tome of forgotten lore,
While I nodded, nearly snoozing, suddenly there came a tapping,
As of some one gently rapping, rapping at my chamber door.
''Tis some visitor,' I muttered, 'tapping at my chamber door -
Only this, and nothing more.'
That helped a lot. But now I have a different problem: it is finding partial words and replacing that part. For instance, one of the words searched for is art, as in "wherefore art thou". But earth has art in it, so it replaces art with are and I end up with output like "field was yout in the eareh". How do I make it skip partial words?
I had seen the use of a command file and had considered it but not gotten around to trying it.
Google revealed the solution to my new problem, and I didn't have to search for two days to find it. I only had to wrap each search string in the command file in word boundaries, as in \bsome text\b, to get it to process correctly.
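To illustrate the difference the word boundaries make, here is a small sketch. It assumes GNU sed (\b is a GNU extension, not POSIX), and the sample sentence is made up:

```shell
#!/bin/bash
# Assumes GNU sed: \b (word boundary) is a GNU extension.
text="where art thou in the earth"

# Without boundaries, the "art" inside "earth" is rewritten as well.
plain=$(printf '%s\n' "$text" | sed 's/art/are/g')

# With \b on both sides, only the standalone word matches.
bounded=$(printf '%s\n' "$text" | sed 's/\bart\b/are/g')

printf '%s\n%s\n' "$plain" "$bounded"
# where are thou in the eareh
# where are thou in the earth
```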
The code:
Code:
#!/bin/bash
# Script to change antiquated words to more modern language
# set -x
# Build the command file like
# s/\band called his name\b/and named him/g
# s/\bart thou\b/are you/g
# The \b on each side of the search text is a word boundary, so the whole
# string must match and partial words are not changed (beware of extra spaces).
# path to command file
DPATH="$HOME/bin/scripting/JPSupdateCMD"
# Path to files to be edited
BPATH="$HOME/bin/shabbat/tmp/*"
#sed -f $CmdFile $InFile >$OutFile
# find supplies the file names through {}, so $BPATH is not repeated for sed
find $BPATH -type f -exec \
sed -i -f "$DPATH" {} +
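The command file does not have to be written by hand. A rough sketch of generating it from the original two-column CSV follows. It assumes no entry contains a comma, slash, ampersand or other character that is special to sed, and csv_to_sed is a hypothetical name:

```shell
#!/bin/bash
# csv_to_sed is a hypothetical helper: it reads "old,new" lines on stdin
# and prints one sed substitution per line, with \b around the old text.
# Assumption: the entries contain no characters that are special to sed.
csv_to_sed() {
    local old new
    while IFS=, read -r old new; do
        if [ -n "$old" ]; then
            # \\b in the printf format prints a literal \b
            printf 's/\\b%s\\b/%s/g\n' "$old" "$new"
        fi
    done
}

printf 'art thou,are you\ndoth know,knows\n' | csv_to_sed
# s/\bart thou\b/are you/g
# s/\bdoth know\b/knows/g
```

With the paths from this thread it would be called as csv_to_sed < "$HOME/bin/scripting/JPSupdate.csv" > "$HOME/bin/scripting/JPSupdateCMD".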
A few quick things about the original bash and sed solution:
1. BPATH="$HOME/bin/shabbat/tmp/*" - In the original script this makes sense, as you want sed to work on all files under the path. In the latest script you can remove the *, as find already looks recursively for all files under the path.
2. I think I have mentioned this previously, but you do not need to read a line from a file into a variable and then split it afterwards. Just read into variables on the while line:
Code:
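# instead of:
while read line; do
IFS=, read -a arr <<< "$line"
# read straight into the array:
while IFS=, read -r -a arr; do
# or, clearer name-wise, into two variables (no -a here: with -a, read ignores the second name):
while IFS=, read -r old_words new_words; do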
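3. As for the original sed error, it comes from the superfluous quoting. The inner quotes end the string, so ${arr[0]} and ${arr[1]} are expanded unquoted, and any phrase containing a space is split into separate arguments; sed then receives only s/doth as its script. You could try: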
Code:
while IFS=, read -r old_words new_words; do
sed -i.orig "s/$old_words/$new_words/gi" $BPATH
done < "$DPATH"
Of course you would still want to place the \b's around the old words to make sure you get the correct ones
Another potential issue you may face with the above: once the * is expanded, the list of roughly 900 file names may be too long to pass to a single sed command, since the system limits the total size of a command line.
You may then have to resort to find and xargs to help with this or you could do an outer loop for the file names and use the inner while loop for the changes:
Code:
for bfile in $BPATH; do
while IFS=, read -r old_words new_words; do
sed -i.orig "s/$old_words/$new_words/gi" "$bfile"
done < "$DPATH"
done
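The find and xargs route mentioned above could look roughly like this. It is a sketch assuming GNU find, xargs and sed, and a command file of s/\b...\b/.../g lines like the one shown earlier in the thread; apply_cmdfile is a hypothetical name:

```shell
#!/bin/bash
# Sketch only: assumes GNU find, xargs and sed.
# apply_cmdfile is a hypothetical helper that runs one sed command file
# over every regular file below a directory.
apply_cmdfile() {
    local cmdfile=$1 dir=$2
    # -print0 / -0 keep file names with spaces intact;
    # -r skips running sed when find returns nothing.
    find "$dir" -type f -print0 | xargs -0 -r sed -i -f "$cmdfile"
}

# Example call with the paths used in this thread:
# apply_cmdfile "$HOME/bin/scripting/JPSupdateCMD" "$HOME/bin/shabbat/tmp"
```

xargs splits the file list into as many sed invocations as needed, so the length of the expanded list stops being a concern, and each file is rewritten once instead of once per word.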