[erlang-questions] Unicode, decoding the content from a HTML page (www_tools and rfc4627)

Berlin Brown berlin.brown@REDACTED
Sun Feb 17 23:05:07 CET 2008

I was using the rfc4627 library, which I think uses
"xmerl_ucs:from_utf8(Str)" internally. I am trying to fetch a web
page and decode its binary contents as UTF-8.

E.g. I did the following:



-import(url, [test/0, raw_get_url/2]).
-import(rfc4627, [unicode_decode/1]).

start_social() ->
	io:format("*** Running social statistics~n"),
	case url:raw_get_url("http://botnode.com/", 60000) of
		{ok, Data} ->
			%% Val = list_to_binary(xmerl_ucs:from_utf8([Data])),
			%% Variables start with an uppercase letter; `val` would be an atom.
			Val = rfc4627:unicode_decode(Data),
			%% io:format/2 takes its arguments as a list.
			io:format("Data is valid: ~p ~n", [Val]),
			{ok, Data};
		{error, What} ->
			io:format("ERR:~p ~n", [What]),
			{error, What}
	end,
	io:format("*** Done [!]~n").
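
For comparison, here is a minimal sketch of the same decoding step using the
standard "unicode" module (available in later OTP releases) instead of
rfc4627. It assumes the page really is UTF-8 encoded and that the HTTP call
returns the body as a binary or a list of bytes. The module name
decode_sketch and the function decode_utf8/1 are hypothetical names used
only for illustration.

```erlang
-module(decode_sketch).
-export([decode_utf8/1]).

%% Hypothetical helper: decode a UTF-8 encoded body into a list of
%% Unicode code points. Assumes the input is a binary or a list of bytes.
decode_utf8(Body) when is_list(Body) ->
    %% list_to_binary/1 fails with badarg if the list holds values > 255.
    decode_utf8(list_to_binary(Body));
decode_utf8(Body) when is_binary(Body) ->
    case unicode:characters_to_list(Body, utf8) of
        Chars when is_list(Chars) ->
            {ok, Chars};
        {error, _Decoded, Rest} ->
            %% Rest begins at the first byte that is not valid UTF-8.
            {error, {invalid_utf8, Rest}};
        {incomplete, _Decoded, Rest} ->
            %% The input ended in the middle of a multi-byte sequence.
            {error, {truncated_utf8, Rest}}
    end.
```

Returning tagged tuples rather than exiting lets the caller decide what to do
with a page that is not valid UTF-8, for example falling back to Latin-1.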

Berlin Brown
