LinuxQuestions.org


tifu 03-18-2005 10:36 AM

sed question
 
I have a file (file01) with several words that I need to output to another file (file02). I need file02 to be formatted as a single column, with all non-alphabet characters removed, except for dash (-) and apostrophe (') characters.

When I use the following, it formats correctly but removes all non-alphabet characters (including ' and -)

tr ' ' '\n' <$FILEDIR/file01 | sed -e 's/[^a-z A-Z]//;/^$/ d' >$FILEDIR/file02


When I try to specify the exclusion of ' and - characters as in below, the command fails. What am I doing wrong?

tr ' ' '\n' <$FILEDIR/file01 | sed -e 's/[^a-z A-Z]//;s/[\']//; s/[\-];/^$/ d' >$FILEDIR/file02

Thanks

Omar

TheLinuxDuck 03-18-2005 11:17 AM

Note that the sed expression uses single quotes (') to delimit its script, so you'll need to escape the single quote in the body of the sed block. That's probably the error.
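
For readers following along: inside a single-quoted shell string a backslash has no special meaning, so \' cannot be used there. One common workaround is to close the quote, add an escaped quote, and reopen it ('\''). The sketch below assumes a POSIX shell and sed; the function name and sample text are hypothetical, since the thread never shows the contents of file01.

```shell
# Hypothetical sample input; the thread's file01 is not shown.
sample="don't stop-now, ok?"

# '\'' = close the single-quoted string, emit an escaped quote, reopen it.
# sed therefore receives the script: s/[^a-zA-Z'-]//g
keep_quote_single() {
    printf '%s\n' "$1" | sed -e 's/[^a-zA-Z'\''-]//g'
}

keep_quote_single "$sample"   # prints: don'tstop-nowok
```

The dash is placed last in the bracket expression so that sed reads it as a literal character rather than as a range.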

TheLinuxDuck 03-18-2005 11:23 AM

Ok, after playing, it looks like you'll have to change the sed block characters from a single quote(') to a double quote("), which will then allow you to escape the single quote in the body:
Code:

cat filename.txt | sed -e "s/[^a-zA-Z\\'-]//g;/^$/ d"
This worked as expected.

ahh 03-18-2005 11:28 AM

I was playing around with this too, and came to the same conclusion.

However, I also noted that the single quote doesn't have to be escaped when enclosed in double quotes.
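
A small sketch of that observation, assuming a POSIX shell and sed (the function names and sample input are made up for illustration). In a double-quoted script the single quote is passed through untouched, while a \\ reaches sed as a single backslash, which then becomes an ordinary member of the bracket expression and is kept in the output.

```shell
# Double-quoted sed script: the single quote needs no escaping.
strip_plain() {
    printf '%s\n' "$1" | sed -e "s/[^a-zA-Z'-]//g"
}

# Same script with \\ added: the shell hands sed one backslash, and inside
# [...] that backslash is a literal character, so backslashes survive.
strip_with_backslash() {
    printf '%s\n' "$1" | sed -e "s/[^a-zA-Z\\'-]//g"
}

strip_plain "it's a\\b"            # prints: it'sab
strip_with_backslash "it's a\\b"   # prints: it'sa\b
```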

TheLinuxDuck 03-18-2005 11:32 AM

Quote:

Originally posted by ahh
However, I also noted that the single quote doesn't have to be escaped when enclosed in double quotes.
(= I thought I had tried it without the escape and it didn't work, but I just tried it again and sure enough, it works. OK, at any rate, you should be good to go.

tifu 03-18-2005 12:02 PM

Thank you both. I ended up using the command below. Note that the \ in the suggested [^a-zA-Z\'-] meant that the \ character was kept as well. I removed it and all is well in my universe.

Thanks again

tr ' ' '\n' <$FILEDIR/file01 | sed -e "s/[^a-zA-Z'-]//;/^$/ d" >$FILEDIR/file02
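
One caveat for anyone reusing this command: without the g flag, s/// removes only the first unwanted character on each line, so a word such as (really)! would come out as really)!. A minimal sketch with the g flag added is shown below; it assumes a POSIX shell, tr and sed, and uses a hypothetical function name and sample text in place of file01.

```shell
# Reads text on stdin, puts one word per line, strips everything except
# letters, apostrophes and dashes, then drops lines that end up empty.
clean_words() {
    tr ' ' '\n' | sed -e "s/[^a-zA-Z'-]//g;/^$/ d"
}

# Hypothetical sample standing in for the contents of file01.
printf '%s\n' "it's 42 well-known, (really)!" | clean_words
# prints:
# it's
# well-known
# really
```

To match the original usage, the function could be called as clean_words <$FILEDIR/file01 >$FILEDIR/file02.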


All times are GMT -5.