Re: [Maypole-dev] Strange automatic character conversion effect in Maypole

From: Simon Flack (sf at flacks.net)
Date: Tue Dec 14 2004 - 23:15:53 GMT


Kester Habermann wrote:
> Hi,
>
> I ran into a strange character encoding effect when using Maypole. As it
> took me quite a while to figure out what's causing the problem and how
> to get rid of it, I'll post it here. Maybe it will save someone else's
> time. I'm also curious to find out one last detail about the problem.
>
[snip]

> After many hours I managed to track down the cause of this automatic
> conversion from iso-8859-15 to utf-8. I store the uri base (and other
> stuff) in an XML file and retrieve and set it like this:
>
> $config = XMLin("conf.xml");
> __PACKAGE__->config->uri_base($config->{uri_base});
>
> XMLin sets the utf-8 flag on all strings, even though the file
> contains only ascii and the encoding of the file is set to iso-8859-15
> (this behavior is documented for XML::Simple).
>
> My hack to solve this problem was to clear the utf-8 flag on the
> strings I got from XMLin using Encode::_utf8_off().
>
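
For what it's worth, here is a minimal sketch of that workaround applied to a whole config structure rather than one string at a time. The helper name clear_utf8_flags and the sample hash are made up for illustration, and I have swapped Encode::_utf8_off() for utf8::downgrade(), which gives the same result for ascii-only strings but leaves a string alone if it holds characters that don't fit in one byte:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: walk a nested structure (such as the one XMLin
# returns) and downgrade every plain string in place.
sub clear_utf8_flags {
    my $type = ref $_[0];
    if    ($type eq 'HASH')  { clear_utf8_flags($_) for values %{ $_[0] } }
    elsif ($type eq 'ARRAY') { clear_utf8_flags($_) for @{ $_[0] } }
    elsif (!$type && defined $_[0]) {
        # Second argument true: return false instead of dying when the
        # string contains characters above 0xFF.
        utf8::downgrade($_[0], 1);
    }
    return;
}

# Stand-in for the result of XMLin("conf.xml"): flagged, ascii-only data.
my $config = { uri_base => "http://localhost/beerdb/", hosts => ["localhost"] };
Encode::_utf8_on($config->{uri_base});
Encode::_utf8_on($config->{hosts}[0]);

clear_utf8_flags($config);

printf "uri_base flagged: %s\n", utf8::is_utf8($config->{uri_base}) ? "yes" : "no";
```

Running it on the config once, before anything is handed to Maypole, keeps the fix in one place.
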
> At the moment I have no idea where this magical conversion takes
> place. I know that concatenating a string with cleared utf-8 flag and
> a string with set utf-8-flag will result in a string with utf-8-flag
> set, but I can't see where the actual conversion takes place.
>
> If you want to try this, take the beerdb example and add
>
> use Encode;
>
> and change the uri-base to
>
> my $uri_base = "http://localhost/beerdb/";
> Encode::_utf8_on($uri_base);
> BeerDB->config->uri_base($uri_base);
>
> Then add some non ASCII characters to the frontpage template. I added
>
> iso-8859-15 "ä" (0xE4 LATIN SMALL LETTER A WITH DIAERESIS)
>
> Leaving the maypole default set to utf-8 will give a perfectly valid
> utf-8-page where my umlaut has been converted to the utf-8 two-byte
> coding Ã¤ (0xC3 0xA4) and will show correctly in the browser.
>
> Does anyone have an idea where this conversion takes place? Maybe in
> mod_perl, perl-file-io, apache, template toolkit or maypole?

I've experienced this before with perl5.6.1. I /think/ it's XML::Parser
that does that conversion.
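
The upgrade-on-concatenation you describe can be seen in isolation, without Maypole. A rough sketch follows; the variable names are made up, and it only shows what perl holds internally after the concatenation, not which layer ends up writing those bytes out:

```perl
use strict;
use warnings;
use Encode ();

# Flagged, ascii-only string, as in the beerdb test case quoted above.
my $uri_base = "http://localhost/beerdb/";
Encode::_utf8_on($uri_base);

# One unflagged byte: 0xE4, a-diaeresis in iso-8859-15 (and in iso-8859-1).
my $template = "\xE4";

# Concatenating flagged and unflagged strings makes perl upgrade the
# unflagged one, treating its bytes as iso-8859-1 characters.
my $page = $uri_base . $template;

# Copy the result and drop the flag to expose the internal buffer as bytes.
my $internal = $page;
Encode::_utf8_off($internal);

printf "page flagged: %s\n", utf8::is_utf8($page) ? "yes" : "no";
printf "last two internal bytes: %vX\n", substr($internal, -2);   # C3.A4
```

Any code that reads the string's internal buffer directly will see those two bytes in place of the original 0xE4. Note that perl assumes iso-8859-1 here, so the few characters that differ in iso-8859-15 (the euro sign, for instance) would come out wrong.
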

> Btw, I use additional_data to set the document encoding back to the old
> default:
>
> sub additional_data {
>     my $r = shift;
>
>     $r->{document_encoding} = "iso-8859-15";
> }
>
> Is that the way it is supposed to be done or is there a better way?

Yes. Or using the accessor: $r->document_encoding('iso-8859-15')

--simonflk

