[erlang-questions] Character encodings and lager
Roger Lipscombe
roger@REDACTED
Mon Aug 3 17:12:31 CEST 2015
I'm tracking down a crash in one of our custom lager backends. The
relevant piece of code is, cut down, something like the following:
Message = lager_msg:message(Msg),
JSON = mochijson2:encode({struct, [{"msg", list_to_binary(Message)}]}).
When I call it with the following...
lager:log(info, self(), "~p", [<<178, 179>>]).
...it crashes with an exception: {ucs,{bad_utf8_character_code}}
Now, I know that's not valid UTF-8: the bytes are superscript-2
and superscript-3, as encoded in Latin-1.
Cutting this down further, I get:
Message = [60,60,178,179,62,62].
mochijson2:encode({struct, [{"msg", list_to_binary(Message)}]}).
** exception exit: {ucs,{bad_utf8_character_code}}
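The exit reason suggests that mochijson2 decodes binaries as UTF-8, while list_to_binary/1 copies each integer into one raw byte. A minimal sketch for checking whether a binary is valid UTF-8 before it is encoded, assuming only the standard unicode module (the module name utf8_check and the function is_valid_utf8/1 are made up for illustration):

```erlang
-module(utf8_check).
-export([is_valid_utf8/1]).

%% Hypothetical helper, not part of lager or mochijson2.
%% unicode:characters_to_binary/3 returns a binary on success and an
%% {error, _, _} or {incomplete, _, _} tuple when the input is not
%% well-formed UTF-8.
is_valid_utf8(Bin) when is_binary(Bin) ->
    case unicode:characters_to_binary(Bin, utf8, utf8) of
        Out when is_binary(Out) -> true;
        _ -> false
    end.
```

With this, utf8_check:is_valid_utf8(list_to_binary([60,60,178,179,62,62])) is false, because byte 178 cannot start a UTF-8 sequence, while a plain ASCII binary such as <<"ABC123">> passes.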
So, my question would -- usually -- be: "how do I convert the Latin-1
string to UTF-8?".
However, the binary isn't supposed to contain anything outside the
32-127 ASCII range. In fact, it should be an uppercase hexadecimal
string: [A-F0-9] in ASCII.
Note: In the original crash, the string was sent from an embedded
device, and it appears to have garbage in it because of some kind of
corruption in configuration NVRAM.
So, I have an actual *binary*, which usually only contains valid hex
characters (in ASCII), but occasionally has bytes outside this range.
How do I get that into mochijson2, via lager, without anything
crashing?
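One possible approach, sketched under assumptions rather than taken from lager or mochijson2, is to sanitize the device value before it is logged: pass it through when it is the expected uppercase hex string, and otherwise log a hex dump of the raw bytes. The module name safe_msg, the function to_json_safe/1 and the "hex:" prefix are all invented for this example:

```erlang
-module(safe_msg).
-export([to_json_safe/1]).

%% Hypothetical helper: always returns a pure-ASCII binary, which is
%% therefore also valid UTF-8.
to_json_safe(Bin) when is_binary(Bin) ->
    case is_upper_hex(Bin) of
        true  -> Bin;               % expected case: [A-F0-9] only
        false -> hex_escape(Bin)    % corrupted input: dump the bytes as hex
    end.

%% True when every byte is one of 0-9 or A-F.
is_upper_hex(Bin) ->
    lists:all(fun(C) ->
                      (C >= $0 andalso C =< $9) orelse
                      (C >= $A andalso C =< $F)
              end, binary_to_list(Bin)).

%% Render each byte as two uppercase hex digits, with a marker prefix.
hex_escape(Bin) ->
    Hex = << <<(hex_digit(B bsr 4)), (hex_digit(B band 15))>> || <<B>> <= Bin >>,
    <<"hex:", Hex/binary>>.

hex_digit(N) when N < 10 -> $0 + N;
hex_digit(N) -> $A + N - 10.
```

The call would then look something like lager:log(info, self(), "~s", [safe_msg:to_json_safe(Value)]), where Value stands for whatever the device sent. The corrupted bytes stay visible in the log, and the backend only ever sees ASCII.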
I tried the following:
mochijson2:encode({struct, [{"msg", unicode:characters_to_binary(Message)}]}).
...which works, but am I going to get burnt if I start using UTF-8 in
my logging once we move to Erlang 17 or 18?
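For comparison, a sketch that makes the Latin-1 assumption explicit instead of relying on the default input encoding. unicode:characters_to_binary/1 treats integers in a list as code points but treats any binaries in the input as UTF-8, so it returns an error tuple rather than a binary if the message contains a raw binary such as <<178,179>>. The version below assumes Message is an iolist of Latin-1 bytes; the module name log_utf8 and the function to_utf8/1 are made up:

```erlang
-module(log_utf8).
-export([to_utf8/1]).

%% Hypothetical helper. Assumes Message is an iolist whose integers are
%% all in 0..255, i.e. Latin-1 bytes. iolist_to_binary/1 raises badarg
%% if the list contains code points above 255.
to_utf8(Message) ->
    Bytes = iolist_to_binary(Message),
    %% Every byte is a valid Latin-1 character, so this conversion
    %% cannot fail on the grounds of encoding.
    unicode:characters_to_binary(Bytes, latin1, utf8).
```

Here log_utf8:to_utf8([60,60,178,179,62,62]) gives <<60,60,194,178,194,179,62,62>>. The caveat is the one raised above: if the message is ever already UTF-8 (or a list with code points above 255), treating it as Latin-1 will double-encode it or crash, so the assumption would need revisiting at that point.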
How do others deal with this kind of thing in Erlang?
Regards,
Roger.