[erlang-questions] Leex And Character Encodings
Gordon Guthrie
gordon@REDACTED
Mon Aug 23 10:13:11 CEST 2010
Richard
> 3. In any case, if you are going to hack it, you should make
> the 16#C2,16#20 sequence equivalent to ONE space, not two.
I am getting 16#C2, 16#A0 rather than 16#C2, 16#20, which I think is right
for non-breaking spaces: U+00A0 encodes as the two bytes C2 A0 in UTF-8.
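
A quick way to check the bytes, and to see where the "Â" comes from, is sketched below. It is written in Python rather than Erlang purely for illustration, and the helper names (utf8_bytes, misread_as_latin1, normalise_nbsp) are made up for this example; they are not part of leex or of our code. The last helper is only one possible reading of Richard's point 3, applied to the raw bytes before they reach the lexer.

```python
NBSP = "\u00a0"  # the Latin-1 / Unicode no-break space


def utf8_bytes(text: str) -> bytes:
    """Encode text as UTF-8, as a browser would typically send it."""
    return text.encode("utf-8")


def misread_as_latin1(raw: bytes) -> str:
    """Decode UTF-8 bytes as if they were Latin-1 (the suspected mangling)."""
    return raw.decode("latin-1")


def normalise_nbsp(raw: bytes) -> bytes:
    """Hypothetical pre-lexing step: each UTF-8 NBSP (C2 A0) becomes ONE space."""
    return raw.replace(b"\xc2\xa0", b" ")


# An expression with one NBSP after "=" and four NBSPs before "2".
raw = utf8_bytes("=" + NBSP + "1 +" + NBSP * 4 + "2")

print(utf8_bytes(NBSP))              # the two bytes C2 A0
print(repr(misread_as_latin1(raw)))  # each NBSP shows up as "Â" plus NBSP
print(normalise_nbsp(raw))           # plain ASCII spaces, one per NBSP
```

In valid UTF-8 the byte C2 can only be a lead byte, so replacing the pair C2 A0 cannot accidentally split some other character.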
Gordon
On 22 August 2010 23:41, Richard O'Keefe <ok@REDACTED> wrote:
>
> On Aug 21, 2010, at 8:37 PM, Gordon Guthrie wrote:
> > The problem comes when I put spaces in the white space:
> > = 1 + 2 "=Â 1 +Â Â Â Â 2" = 1 +
> > 2 #ERROR!
> >
> > The expression round trips fine but (unlike the previous examples) the
> > server-side expression returns an error for the value because the
> > expression doesn't match any valid syntax.
> >
> > Tabs are expanded to white spaces so the only problem (I think) is with
> > multiple white spaces - which is why I think just adding a lexical token
> > to make  the same as 2 spaces would work.
>
> It's not clear to me what precisely is mangling the spaces.
> What _is_ clear is that "Â " is precisely what you see when
> the Latin-1 No-Break-Space is first converted to UTF-8 and
> then displayed by something expecting Latin-1.
>
> 1. How do no-break-space characters turn up?
> 2. What is it that is rendering them as if they were encoded
> in Latin-1 rather than UTF-8?
> 3. In any case, if you are going to hack it, you should make
> the 16#C2,16#20 sequence equivalent to ONE space, not two.
>
>
--
Gordon Guthrie
CEO hypernumbers
http://hypernumbers.com
t: hypernumbers
+44 7776 251669