HTML::Parser and entities

Is there a way to get HTML::Parser to leave entities in text alone?
There is the attr_encode() method, but that only appears to affect
attributes. Basically I have code that wants to selectively remove
some tags but leave others and entities intact. I could convert
back to entities using HTML::Entities, but the text I have in some
cases mixes numeric and named entities and I need to maintain what
the original was.

--
Steve Sapovits steves06 [at] comcast.net
steves06 [ Mon, 24 January 2005 23:27 ] [ ID #605268 ]

Re: HTML::Parser and entities

Steve Sapovits <steves06 [at] comcast.net> writes:

> Is there a way to get HTML::Parser to leave entities in text alone?

Just use 'text' argspec and you get the text exactly as it is.
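A minimal sketch of the difference between the 'text' and 'dtext' argspecs, assuming the version 3 handler API. The sample input and the variable names are made up for illustration and are not from the original message.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical input mixing a named and a numeric entity.
my $html = '<p>Fish &amp; chips &#169; 2005</p>';

my (@raw, @decoded);

my $p = HTML::Parser->new(
    api_version => 3,
    # 'text' hands over the source text untouched;
    # 'dtext' hands over the same text with entities decoded.
    text_h => [
        sub {
            my ($text, $dtext) = @_;
            push @raw,     $text;
            push @decoded, $dtext;
        },
        'text, dtext',
    ],
);
$p->parse($html);
$p->eof;

# Text can arrive in several chunks, so join before use.
my $raw     = join '', @raw;
my $decoded = join '', @decoded;

print "raw: $raw\n";
```

With 'text', $raw should keep both entities exactly as written in the input, so there is no need to re-encode with HTML::Entities afterwards.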

> There is the attr_encode() method, but that only appears to affect
> attributes. Basically I have code that wants to selectively remove
> some tags but leave others and entities intact.

The hstrip example does exactly this.

http://search.cpan.org/src/GAAS/HTML-Parser-3.45/eg/hstrip
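The same idea in a much smaller form than hstrip: a rough sketch, not the hstrip script itself. It assumes that a default handler with the 'text' argspec plus ignore_tags() is enough for the use case; the tag list and sample input are hypothetical.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical input: <font> and <b> should go, <p> and the entities stay.
my $html = '<p><font color="red">Fish &amp; chips</font> <b>&#169;</b> 2005</p>';

my $out = '';

my $p = HTML::Parser->new(
    api_version => 3,
    # The default handler sees every event that has no handler of its own.
    # With 'text' it receives the original source, markup included.
    default_h => [
        sub { $out .= $_[0] if defined $_[0] },
        'text',
    ],
);

# Start and end tags named here are dropped; their content is kept.
$p->ignore_tags(qw(font b));

$p->parse($html);
$p->eof;

print "$out\n";
```

To drop an element together with its content (script or style, for example), ignore_elements() is the counterpart to ignore_tags().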

Regards,
Gisle
gisle [ Mon, 24 January 2005 23:44 ] [ ID #605269 ]