Strange "characters" appearing in auto "created" man pages
Hello people
If I use:
Code:
man aptitude
I see what I am supposed to see, for example:
Code:
install
Install one or more packages. The packages should be listed after
the “install” command; if a package name contains a tilde character
(“~”) or a question mark (“?”), it will be treated as a search
pattern and every package matching the pattern will be installed
(see the section “Search Patterns” in the aptitude reference
manual).
Now if I create a text file with:
Code:
man aptitude>aptitude.txt
and then look at it, I see:
Code:
install
Install one or more packages. The packages should be listed after
the “install” command; if a package name contains a tilde character
(“~”) or a question mark (“?”), it will be treated as a search
pattern and every package matching the pattern will be installed
(see the section “Search Patterns” in the aptitude reference
manual).
Garbled symbols are a sure sign of a conflict in character encodings. Either the file is being created in an encoding that can't represent those characters, or it's being created correctly and the display program is reading it with the wrong encoding. Do you get the same effect no matter what text reader or editor you use? If not, then my first guess is that the file is being created as UTF-8, but the text display is trying to use something else, such as Western European (ISO-8859-1).
If all programs show the same problem, then the source is likely the encoding used when the file is created; in which case I couldn't off-hand tell you why it's doing that exactly or how to fix it. The same command works just fine for me.
Please run the "locale" command and post the results, so we can see what encoding your shell is set to.
Last edited by David the H.; 02-26-2010 at 09:58 AM.
Reason: correction and cleanup
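To illustrate the second case described above (correct UTF-8 bytes read with the wrong encoding), here is a minimal sketch. It assumes iconv is installed and uses CP1252 (Windows-1252) as the "wrong" encoding purely for illustration, because it maps the affected bytes to visible characters; nothing in this thread establishes which encoding the editor actually picked.

```shell
# The left curly quote (U+201C) is three bytes in UTF-8: E2 80 9C
# (written in octal below so any POSIX printf accepts it).
# Assumption for this demo: the reader decodes the bytes as CP1252.
# Each byte then becomes its own character, so one quote turns into three symbols.
printf '\342\200\234' | iconv -f CP1252 -t UTF-8
echo
```

The result is the familiar three-character sequence â€œ in place of a single opening quote.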
That is what the man command does: it converts troff documents to text in your terminal. Changing the encoding of your terminal to UTF-8 would resolve strange characters when reading a man page.
You may have a document that is intended to be printed instead of viewed in the terminal. But this wouldn't be the case for man pages.
It may be better to do something like this:
man --pager=cat --encoding=utf8 ><topic>.txt <topic>
You could use an alias or create a one-liner script in ~/bin/:
alias man2txt='man --pager=cat --encoding=utf8'
man2txt smb.conf
#!/bin/bash
topic="$1"
man --pager=cat --encoding=utf8 "$topic" >"${topic}.txt"
p.s. No, I didn't change my signature just for this post. I had it previously.
Please run the "locale" command and post the results, so we can see what encoding your shell is set to.
Code:
Sun Feb 28, 12:47 $ locale
LANG=en_GB.UTF-8
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER="en_GB.UTF-8"
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT="en_GB.UTF-8"
LC_IDENTIFICATION="en_GB.UTF-8"
LC_ALL=
Sun Feb 28, 12:47 $
Hi David the H., Thanks for the response.
I see UTF-8 in there, but Geany tells me the file is ISO-8859-1.
The file I created by "copying" the terminal output into a text file, however, is UTF-8 (without BOM). Both gedit and Geany show the strange characters in the first file.
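One way to check what is actually in the file, independent of any editor's guess, is to ask iconv whether the bytes decode cleanly as UTF-8. A minimal sketch, assuming iconv is available; is_utf8 is a hypothetical helper name, not a standard command.

```shell
# is_utf8 FILE (hypothetical helper): succeeds if FILE is well-formed UTF-8.
# iconv exits with a non-zero status when it meets a byte sequence
# that is invalid in the source encoding.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}
```

For example: is_utf8 aptitude.txt && echo "valid UTF-8". This only shows that the bytes are well-formed UTF-8; text that was converted twice would still pass.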
Hi knudfl, thanks for responding, I'll check those links out.
I had to reinstall lately and was looking at the aptitude man pages when I decided that I wanted it as a text file, and that's the result. Strange thing is, I get these strange characters about 50% of the time.
They are usually a single quote ( ' ), a double quote (not the straight "text" kind ( " ) but the curly ones that look like a ( 66 ) and ( 99 ), if you get my drift), and the hyphen ( - ).
A search and replace fixes it but it is a "process" I could do without.
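The search and replace can be scripted. A minimal sketch with sed, assuming the file is UTF-8 and that the characters involved are the curly quotes (U+2018, U+2019, U+201C, U+201D), the hyphen (U+2010) and the minus sign (U+2212); ascii_punct is a hypothetical name, and the character list may need adjusting for other pages.

```shell
# ascii_punct (hypothetical helper): read text on stdin and replace
# typographic punctuation with plain ASCII equivalents.
# Assumption: the input is UTF-8. Each character is matched as a literal
# string (no bracket expressions), so the result does not depend on sed's locale.
ascii_punct() {
    sed -e 's/“/"/g' -e 's/”/"/g' \
        -e "s/‘/'/g" -e "s/’/'/g" \
        -e 's/‐/-/g' -e 's/−/-/g'
}
```

Usage: ascii_punct < aptitude.txt > aptitude-ascii.txt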
Hi jschiwal,
Terminator is configured to use UTF-8, and I've never had the problem when "reading" in a terminal, only when reading the text file created with:
Code:
man program_name > program_name.txt
I tried your man2txt above and ended up with the same strange characters.
p.s. No, I didn't change my signature just for this post. I had it previously.
Cute, I'm doing 110 things at once and just saw what you were talking about. Works nice except I don't use KDE. Evince can read the .ps file but not like you have in your sig.
I'm going to play with that though. I would much rather have text files; I can read them more easily and edit things (add notes etc., for personal use).