Consider the following code:
#!/perl/bin/perl
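use strict;
use warnings;
use HTML::Parser;
my $parser = HTML::Parser->new( api_version => 3 );
# The 'tag' argspec passes the tag name; end tags arrive with a leading '/'.
$parser->handler( start => sub { print "$_[0] started\n"; }, 'tag' );
$parser->handler( end => sub { print "$_[0] ended\n"; }, 'tag' );
# parse_file returns undef (and sets $!) if the file cannot be opened.
$parser->parse_file('test.html') or die "Cannot open test.html: $!";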
The simple HTML file:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type">
<meta content="Microsoft FrontPage 4.0" name="GENERATOR">
<style type="text/css">
<!--
.navigation { font: normal 8pt Arial, Helvetica, sans-serif; text-decoration: none}
-->
<!--
.topic { font: normal 10pt Arial, Helvetica, sans-serif; text-decoration: none}
-->
</style>
<title>Federal Motor Carrier Safety Administration - About Us</title>
</head>
<body>
<p>Body</p>
<p>Content</p>
<Table border = "1">
<tr>
<td valign = "top">Body</td>
<td valign = "top" align = "right">Content</td>
</tr>
</tablE>
</body>
</html>
And the output:
html started
head started
meta started
meta started
style started
/style ended
title started
/title ended
/head ended
body started
p started
/p ended
p started
/p ended
table started
tr started
td started
/td ended
td started
/td ended
/tr ended
/table ended
/body ended
/html ended
Because the meta tags are never closed, no end event ever fires for them. I really, REALLY need these end events to fire. Is there either
1.) a way to get HTML::Parser to work like this, or
2.) a decent theory on how I could make it behave like this?
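For what it's worth, here is a rough sketch of what option 2 might look like. It assumes the only tags missing an end event are the HTML 4 elements that never take an end tag (meta, br, img and so on). The %empty_element list and the inline sample string are my own assumptions, not anything HTML::Parser provides. The start handler simply fakes an end event for those tags:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# Assumption: the HTML 4 elements declared EMPTY. They never have an end
# tag, so the parser never reports an end event for them.
my %empty_element = map { $_ => 1 }
    qw(area base basefont br col frame hr img input isindex link meta param);

my @events;    # collected in an array so the result is easy to inspect

my $parser = HTML::Parser->new( api_version => 3 );

$parser->handler(
    start => sub {
        my ($tagname) = @_;
        push @events, "$tagname started";
        # Fake the missing end event right after the start event.
        push @events, "/$tagname ended" if $empty_element{$tagname};
    },
    'tagname'    # tag name, lowercased by default
);
$parser->handler( end => sub { push @events, "$_[0] ended" }, 'tag' );

# Hypothetical inline sample instead of test.html, to keep this self-contained.
$parser->parse('<head><meta name="GENERATOR"><title>T</title></head>');
$parser->eof;

print "$_\n" for @events;
```

This would not help with tags whose end tag is merely optional (an unclosed <p>, for example). Those would need some kind of stack of open elements, which this sketch does not attempt.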
Thanks!