[erlang-questions] Unicode, decoding the content from a HTML page (www_tools and rfc4627)
Berlin Brown
berlin.brown@REDACTED
Sun Feb 17 23:05:07 CET 2008
I was using the rfc4627 library, which I think uses
"xmerl_ucs:from_utf8(Str)" internally. I am trying to decode a web
page: take its binary contents and convert them into UTF-8.
E.g. I did the following:
-module(socialstats).
-export([start_social/0]).
-import(url, [test/0, raw_get_url/2]).
-import(rfc4627, [unicode_decode/1]).

start_social() ->
    io:format("*** Running social statistics~n"),
    case url:raw_get_url("http://botnode.com/", 60000) of
        {ok, Data} ->
            %% Val = list_to_binary(xmerl_ucs:from_utf8([Data])),
            %% Variables must start with an uppercase letter; a lowercase
            %% `val` is an atom, so matching against it would fail.
            Val = rfc4627:unicode_decode(Data),
            %% io:format/2 expects its arguments as a list.
            io:format("Data is valid: ~p ~n", [Val]),
            {ok, Data};
        {error, What} ->
            io:format("ERR:~p ~n", [What]),
            {error, What}
    end,
    io:format("*** Done [!]~n").
--
Berlin Brown
http://botspiritcompany.com/botlist/spring/help/about.html