LinuxQuestions.org
02-11-2009, 10:35 PM   #1
azdruid
LQ Newbie
 
Registered: Jan 2005
Posts: 14

Rep: Reputation: 0
"find" piped to a file mangles Unicode characters


Hi folks, I'm having some trouble with "find" in a bash script. The command I am trying to run is:
Code:
find /Users/andrew/Music/ -iname "*.mp3" -or -iname "*.flac"
Some of the filenames contain Unicode characters. These characters print out perfectly fine in the Terminal, like so:
Code:
/Users/andrew/Music//Brazilian Girls/Brazilian Girls/04 - Sirènes de la Fête.flac
/Users/andrew/Music//Brazilian Girls/Brazilian Girls/05 - Corner Store.flac
However, when I redirect this output to a file, like so:
Code:
find /Users/andrew/Music/ -iname "*.mp3" -or -iname "*.flac" > mkplaylist
the characters are mangled. When I open the file in TextEdit (I am on a Mac), I get this:
Code:
/Users/andrew/Music//Brazilian Girls/Brazilian Girls/04 - SireÃÄnes de la FeÃÇte.flac
How can I make bash print these characters properly? Thanks very much in advance!
 
02-12-2009, 06:55 AM   #2
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
I assume your language environment is set to UTF-8? If so, what encoding does "file filename" report for the output file? It might be getting saved as, say, ISO-8859 or something else instead. Also, are you sure TextEdit is displaying the file with the right encoding? It may not be detecting the file's encoding automatically.
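
A rough sketch of those checks follows. The sample file and its contents are stand-ins, not the poster's actual data, and it assumes the accent is stored in decomposed form (a plain "e" followed by a combining accent), which is how HFS+ on a Mac normally stores filenames. It prints the active locale, what file(1) guesses, and the raw bytes:

```shell
# Stand-in for the redirected find output; the temp file is hypothetical.
sample=$(mktemp)

# "Sirenes" with e + combining grave accent (U+0300, UTF-8 bytes cc 80).
# Octal escapes are used because they work in any POSIX printf.
printf 'Sire\314\200nes\n' > "$sample"

# Which character encoding does the shell environment claim?
if command -v locale >/dev/null 2>&1; then
    locale
fi

# What does file(1) guess? Skipped quietly if file is not installed.
if command -v file >/dev/null 2>&1; then
    file "$sample"
fi

# Dump the bytes in hex so the encoding can be read off directly.
hex=$(od -An -tx1 "$sample" | tr -d ' \n')
echo "$hex"

rm -f "$sample"
```

In the hex dump, the pair cc 80 right after 65 ("e") is the UTF-8 encoding of the combining grave accent. Seeing that in the real mkplaylist file would suggest the data is already UTF-8 and that the problem lies in how it is being displayed.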

You might try piping the find output, or the file, through iconv to ensure that the data is in the encoding that you want.
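
As a minimal sketch of using iconv here (the sample line is a stand-in for one line of the find output, and it assumes the accents are stored in decomposed form, as is usual for filenames on a Mac): the first conversion deliberately misreads the UTF-8 bytes as MacRoman to see whether that reproduces the mangling, and the second simply checks that the bytes are well-formed UTF-8. The encoding name MACINTOSH is accepted by both GNU iconv and the Mac's iconv, as far as I know.

```shell
# Stand-in for the mkplaylist file; contents are a hypothetical sample line.
list=$(mktemp)

# e + combining grave (cc 80) and e + combining circumflex (cc 82),
# written with octal escapes so any POSIX printf handles them.
printf '04 - Sire\314\200nes de la Fe\314\202te.flac\n' > "$list"

# Misread the UTF-8 bytes as MacRoman and re-encode as UTF-8.
# This mimics an editor guessing the wrong encoding for the file.
mangled=$(iconv -f MACINTOSH -t UTF-8 "$list")
echo "$mangled"

# Check that the file is well-formed UTF-8: iconv exits non-zero
# if it hits a byte sequence that is invalid in the source encoding.
if iconv -f UTF-8 -t UTF-8 "$list" >/dev/null; then
    status=valid
else
    status=invalid
fi
echo "$status"

rm -f "$list"
```

If the first command prints the same mangled text shown in the first post, the file is most likely fine UTF-8 and the viewer is just guessing MacRoman. In that case converting the file would not help; telling the editor to open it as UTF-8 would.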

I myself try to avoid filenames with non-ASCII characters, or awkward ones like spaces, in them. It saves a lot of hassle in situations like this.
 
  



All times are GMT -5.
