LinuxQuestions.org
08-06-2003, 08:49 AM   #1
lackluster
Member
 
Registered: Apr 2002
Location: D.C - USA
Distribution: slackware-current
Posts: 488

Rep: Reputation: 30
HTML::Parser behaviour


Consider the following code:

PHP Code:
#!/perl/bin/perl

use strict;
use HTML::Parser;

my $parser = HTML::Parser->new( api_version => 3 );
$parser->handler( start => sub { print "$_[0] started\n"; }, 'tag' );
$parser->handler( end => sub { print "$_[0] ended\n"; }, 'tag' );
$parser->parse_file('test.html');
the simple HTML file:

PHP Code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
    <head>
        <meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type">
        <meta content="Microsoft FrontPage 4.0" name="GENERATOR">
        <style type="text/css">
            <!--
            .navigation {  font: normal 8pt Arial, Helvetica, sans-serif; text-decoration: none}
            -->
            <!--
            .topic {  font: normal 10pt Arial, Helvetica, sans-serif; text-decoration: none}
            -->
        </style>
        <title>Federal Motor Carrier Safety Administration About Us</title>
    </head>
    <body>
        <p>Body</p>
        <p>Content</p>
        <Table border = "1">
            <tr>
                <td valign = "top">Body</td>
                <td valign = "top" align = "right">Content</td>
            </tr>
        </tablE>
    </body>
</html>
and the output:

Quote:
html started
head started
meta started
meta started
style started
/style ended
title started
/title ended
/head ended
body started
p started
/p ended
p started
/p ended
table started
tr started
td started
/td ended
td started
/td ended
/tr ended
/table ended
/body ended
/html ended
Because the meta tags are never closed, no end event ever fires for them. I really, REALLY need these end events to fire. Is there either:

1.) a way to get HTML::Parser to work like this

2.) a decent theory on how I could make it behave like this?

thanks!
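
One possible approach to option 2, offered as a sketch rather than a built-in HTML::Parser feature: keep a table of the elements that HTML 4 declares EMPTY (meta, br, img and so on) and fire the end callback yourself from the start handler whenever one of them opens. The helper name `parse_events` and the `%void` table are made up for this example, and the handlers use the `tagname` argspec (no leading slash) so the "/" is added by hand. It parses an inline string instead of test.html so it can be run on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# Elements declared EMPTY in HTML 4: they never take an end tag.
# Assumption: this list covers the tags you care about; extend it if not.
my %void = map { $_ => 1 } qw(
    area base basefont br col frame hr img input isindex link meta param
);

# Hypothetical helper: parse an HTML string and return the events as strings.
sub parse_events {
    my ($html) = @_;
    my @events;

    # The "real" end logic lives here so both handlers can share it.
    my $on_end = sub {
        my ($tagname) = @_;
        push @events, "/$tagname ended";
    };

    my $on_start = sub {
        my ($tagname) = @_;
        push @events, "$tagname started";
        # A void element has no end tag, so synthesize the end event now.
        $on_end->($tagname) if $void{$tagname};
    };

    my $parser = HTML::Parser->new( api_version => 3 );
    $parser->handler( start => $on_start, 'tagname' );
    # Skip a stray explicit end tag on a void element (e.g. </br>),
    # because its end event was already synthesized in the start handler.
    $parser->handler(
        end => sub { my ($tagname) = @_; $on_end->($tagname) unless $void{$tagname} },
        'tagname',
    );
    $parser->parse($html);
    $parser->eof;
    return @events;
}

print "$_\n" for parse_events('<head><meta name="GENERATOR"><title>t</title></head>');
```

With this in place, every "meta started" should be followed immediately by a "/meta ended". The trade-off is that the parser itself still knows nothing about void elements, so the list has to be maintained by hand.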
 
  


All times are GMT -5. The time now is 02:38 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration