LinuxQuestions.org
Old 02-26-2010, 09:01 AM   #1
Sector11
Member
 
Registered: Feb 2010
Distribution: BunsenLabs (Debian Stable)
Posts: 132

Rep: Reputation: Disabled
Strange "characters" appearing in auto "created" man pages


Hello people

If I use:
Code:
man aptitude
I see what I am supposed to see, for example:
Code:
       install
           Install one or more packages. The packages should be listed after
           the “install” command; if a package name contains a tilde character
           (“~”) or a question mark (“?”), it will be treated as a search
           pattern and every package matching the pattern will be installed
           (see the section “Search Patterns” in the aptitude reference
           manual).
Now if I create a text file with:
Code:
man aptitude>aptitude.txt
and then look at it, I see:
Code:
       install
           Install one or more packages. The packages should be listed after
           the “install” command; if a package name contains a tilde character
           (“~”) or a question mark (“?”), it will be treated as a search
           pattern and every package matching the pattern will be installed
           (see the section “Search Patterns” in the aptitude reference
           manual).
Does anyone know why and is there a fix?
 
Old 02-26-2010, 09:55 AM   #2
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Garbled symbols are a sure sign of a conflict in character encodings. The file is probably either being created in an encoding that can't handle those characters, or it's being created correctly and the display program is set to use the wrong encoding. Do you get the same effect no matter what text reader or editor you use? If not, then my first guess is that the file is being created using utf-8, but the text display is trying to use something else, such as Western European (iso-8859-1).

If all programs show the same problem, then the source is likely the encoding used when the file is created; in which case I couldn't off-hand tell you why it's doing that exactly or how to fix it. The same command works just fine for me.

Please run the "locale" command and post the results, so we can see what encoding your shell is set to.
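To illustrate the kind of encoding conflict described above, here is a minimal sketch (not taken from the thread) that takes the three UTF-8 bytes of a left curly quote and treats each byte as a separate ISO-8859-1 character, which is roughly what a viewer set to the wrong encoding does. The function name is made up for this example, and it assumes iconv, od and tr are available.

```shell
#!/bin/sh
# Hypothetical helper: shows what happens to a UTF-8 left curly quote
# (U+201C, bytes E2 80 9C) when every byte is misread as ISO-8859-1.
misread_as_latin1() {
    # printf emits the three raw bytes (octal escapes are portable);
    # iconv treats each byte as one Latin-1 character and re-encodes it as UTF-8;
    # od + tr print the resulting bytes as a compact hex string.
    printf '\342\200\234' \
        | iconv -f ISO-8859-1 -t UTF-8 \
        | od -An -tx1 \
        | tr -d ' \n'
    echo
}

misread_as_latin1   # prints c3a2c280c29c
```

One character has turned into three (an "â" followed by two control characters), which is the typical look of UTF-8 text shown as iso-8859-1.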

Last edited by David the H.; 02-26-2010 at 09:58 AM. Reason: correction and cleanup
 
Old 02-26-2010, 02:39 PM   #3
knudfl
LQ 5k Club
 
Registered: Jan 2008
Location: Copenhagen DK
Distribution: PCLinuxOS2023 Fedora38 + 50+ other Linux OS, for test only.
Posts: 17,517

Rep: Reputation: 3641
Are you sure that a "troff" document can be converted to
text just like that? I don't think so.

http://heirloom.sourceforge.net/doctools/troff.1b.html

http://vmlinux.org/cgi-bin/dwww?type...cation=TROFF/1
.....
 
Old 02-26-2010, 02:44 PM   #4
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
That is what the man command does: it converts troff documents to text in your terminal. Changing the encoding of your terminal to utf8 would resolve strange characters when reading a manpage.

You may have a document that is intended to be printed instead of viewed in the terminal. But this wouldn't be the case for man pages.

It may be better to do something like this:
man --pager=cat --encoding=utf8 ><topic>.txt <topic>

You could create a one-liner script in ~/bin/ or use an alias:

alias man2txt='man --pager=cat --encoding=utf8'

man2txt smb.conf

#!/bin/bash
topic="$1"
man --pager=cat --encoding=utf8 "$topic" >"${topic}.txt"

p.s. No, I didn't change my signature just for this post. I had it previously.

Last edited by jschiwal; 02-26-2010 at 02:55 PM.
 
Old 02-28-2010, 09:56 AM   #5
Sector11
Member
 
Registered: Feb 2010
Distribution: BunsenLabs (Debian Stable)
Posts: 132

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by David the H. View Post
Please run the "locale" command and post the results, so we can see what encoding your shell is set to.
Code:
Sun Feb 28, 12:47 $ locale
LANG=en_GB.UTF-8
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER="en_GB.UTF-8"
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT="en_GB.UTF-8"
LC_IDENTIFICATION="en_GB.UTF-8"
LC_ALL=
Sun Feb 28, 12:47 $
Hi David the H., Thanks for the response.

I see UTF-8 in there, but Geany tells me the file is ISO-8859-1.

However, the file I created by "copying" the terminal output to a text file is UTF-8 (without BOM), and both gedit and Geany show the strange characters in the first file.
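An editor's encoding label is a guess, so it can help to check the bytes directly. The following is only a sketch under assumptions: the helper name is invented for this example, and it relies on iconv exiting with a non-zero status when the input is not valid UTF-8 (as the glibc iconv does).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file decodes cleanly as UTF-8.
# Assumption: iconv returns non-zero on an illegal input sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Usage (file name taken from the first post):
#   if is_valid_utf8 aptitude.txt; then
#       echo "bytes are valid UTF-8; the viewer is probably guessing wrong"
#   else
#       echo "not valid UTF-8; the file was written in another encoding"
#   fi
```

If the file passes this check but still looks garbled, the problem is more likely on the display side than in how the file was created.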
 
Old 02-28-2010, 10:13 AM   #6
Sector11
Member
 
Registered: Feb 2010
Distribution: BunsenLabs (Debian Stable)
Posts: 132

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by knudfl View Post
Are you sure, that a "troff" document can be converted to
text just like that. I don't think so.

http://heirloom.sourceforge.net/doctools/troff.1b.html

http://vmlinux.org/cgi-bin/dwww?type...cation=TROFF/1
.....
Hi knudfl, thanks for responding, I'll check those links out.

I had to reinstall lately and was looking at the aptitude man pages when I decided that I wanted it as a text file, and that's the result. The strange thing is that I get these strange characters about 50% of the time.

They are usually the single quote ( ' ), the double quotes (not the straight "text" ones ( " ), but the ones that look like a ( 66 ) and a ( 99 ), if you get my drift), and the hyphen ( - ).

A search and replace fixes it but it is a "process" I could do without.
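For what it's worth, the search and replace could be scripted with something like the sketch below. It assumes the text is UTF-8 and that the characters involved are the typographic quotes, hyphen (U+2010) and minus sign (U+2212); the function name is made up for this example.

```shell
#!/bin/sh
# Hypothetical filter: replaces typographic quotes and hyphens with their
# plain ASCII equivalents. Assumption: the input is UTF-8.
ascii_quotes() {
    sed -e 's/“/"/g' -e 's/”/"/g' \
        -e "s/‘/'/g" -e "s/’/'/g" \
        -e 's/‐/-/g' -e 's/−/-/g'
}

# Usage:
#   man aptitude | ascii_quotes > aptitude.txt
```

This only covers the characters listed; anything else non-ASCII in the page passes through unchanged.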
 
Old 02-28-2010, 10:31 AM   #7
Sector11
Member
 
Registered: Feb 2010
Distribution: BunsenLabs (Debian Stable)
Posts: 132

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by jschiwal View Post
That is what the man command does. Convert troff documents to text in your terminal. Changing the encoding of your terminal to utf8 would resolve strange characters when reading a manpage.

You may have a document that is intended to be printed instead of viewed in the terminal. But this wouldn't be the case for man pages.

It may be better to do something like this:
man --pager=cat --encoding=utf8 ><topic>.txt <topic>

You could create a oneliner in ~/bin/ or use an alias

alias man2txt='man --pager=cat --encoding=utf8'

man2txt smb.conf

#!/bin/bash
topic="$1"
man --pager=cat --encoding=utf8 $topic >${topic}.txt

p.s. No, I didn't change my signature just for this post. I had it previously.
Hi jschiwal,

Terminator is configured to use UTF-8 and I've never had the problem when "reading" in a terminal, only when reading the text file created with:

Code:
man program_name > program_name.txt
I tried your man2txt above and ended up with the same strange characters.
 
Old 02-28-2010, 11:05 AM   #8
Sector11
Member
 
Registered: Feb 2010
Distribution: BunsenLabs (Debian Stable)
Posts: 132

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by jschiwal View Post
p.s. No, I didn't change my signature just for this post. I had it previously.
Cute, I'm doing 110 things at once and just saw what you were talking about. Works nice except I don't use KDE. Evince can read the .ps file but not like you have in your sig.

I'm going to play with that though. I would much rather have text files; I can read them more easily and edit things (add notes etc., for personal use).

Another thanks for you.
 
  

