xmerl simple out of CDATA blocks
Anthony Molinaro
anthonym@REDACTED
Thu Nov 18 01:00:35 CET 2010
Hi,
I noticed after some searching that while you can read CDATA blocks
with xmerl, you can't seem to write them back out. For instance:
Erlang R14B (erts-5.8.1) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false]
Eshell V5.8.1 (abort with ^G)
1> I = "<HTMLResource><![CDATA[\n<html>\n <body>\n <h1>Hello World</h1>\n <a href=\"http://www.example.com/?foo=bar&baz=bob\">Bye</a>\n </body>\n</html>\n]]></HTMLResource>".
"<HTMLResource><![CDATA[\n<html>\n <body>\n <h1>Hello World</h1>\n <a href=\"http://www.example.com/?foo=bar&baz=bob\">Bye</a>\n </body>\n</html>\n]]></HTMLResource>"
2> {X,_} = xmerl_scan:string (I).
{{xmlElement,'HTMLResource','HTMLResource',[],
{xmlNamespace,[],[]},
[],1,[],
[{xmlText,[{'HTMLResource',1}],
1,[],
"\n<html>\n <body>\n <h1>Hello World</h1>\n <a href=\"http://www.example.com/?foo=bar&baz=bob\">Bye</a>\n </body>\n</html>\n",
cdata}],
[],"/home/molinaro/tmp",undeclared},
[]}
3> O = lists:flatten (xmerl:export_simple_content ([X], xmerl_xml)).
"<HTMLResource>\n<html>\n <body>\n <h1>Hello World</h1>\n <a href=\"http://www.example.com/?foo=bar&baz=bob\">Bye</a>\n </body>\n</html>\n</HTMLResource>"
4> I = O.
** exception error: no match of right hand side value "<HTMLResource>\n<html>\n <body>\n <h1>Hello World</h1>\n <a href=\"http://www.example.com/?foo=bar&baz=bob\">Bye</a>\n </body>\n</html>\n</HTMLResource>"
5>
However, with the following patch
--- a/xmerl.erl 2010-09-29 11:13:00.000000000 -0700
+++ b/xmerl.erl 2010-11-01 11:23:54.000000000 -0700
@@ -185,6 +185,8 @@
%% Content = [Element]
%% Callback = [atom()]
%% @doc Exports normal XML content directly, without further context.
+export_content([#xmlText{value = Text, type = cdata} | Es], Callbacks) ->
+ [ "<![CDATA[", Text, "]]>" | export_content(Es, Callbacks) ];
export_content([#xmlText{value = Text} | Es], Callbacks) ->
[apply_text_cb(Callbacks, Text) | export_content(Es, Callbacks)];
export_content([#xmlPI{} | Es], Callbacks) ->
I get the behavior I want.
Erlang R14B (erts-5.8.1) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false]
Eshell V5.8.1 (abort with ^G)
1> I = "<HTMLResource><![CDATA[\n<html>\n <body>\n <h1>Hello World</h1>\n <a href=\"http://www.example.com/?foo=bar&baz=bob\">Bye</a>\n </body>\n</html>\n]]></HTMLResource>".
"<HTMLResource><![CDATA[\n<html>\n <body>\n <h1>Hello World</h1>\n <a href=\"http://www.example.com/?foo=bar&baz=bob\">Bye</a>\n </body>\n</html>\n]]></HTMLResource>"
2> {X,_} = xmerl_scan:string (I).
{{xmlElement,'HTMLResource','HTMLResource',[],
{xmlNamespace,[],[]},
[],1,[],
[{xmlText,[{'HTMLResource',1}],
1,[],
"\n<html>\n <body>\n <h1>Hello World</h1>\n <a href=\"http://www.example.com/?foo=bar&baz=bob\">Bye</a>\n </body>\n</html>\n",
cdata}],
[],"/home/molinaro",undeclared},
[]}
3> O = lists:flatten (xmerl:export_simple_content ([X], xmerl_xml)).
"<HTMLResource><![CDATA[\n<html>\n <body>\n <h1>Hello World</h1>\n <a href=\"http://www.example.com/?foo=bar&baz=bob\">Bye</a>\n </body>\n</html>\n]]></HTMLResource>"
4> I = O.
"<HTMLResource><![CDATA[\n<html>\n <body>\n <h1>Hello World</h1>\n <a href=\"http://www.example.com/?foo=bar&baz=bob\">Bye</a>\n </body>\n</html>\n]]></HTMLResource>"
That said, this sort of patch is probably unacceptable as is, since it may
be invalid for certain formatters (i.e., it assumes CDATA is always rendered
the same way). I don't see any way to do this with the current code, though,
as there is only a '#text#' callback, which receives the text from an #xmlText
element without its type. This means you can't have different rules based
on the type.
So I'm basically looking for a little guidance before I start hacking a
larger patch. What sort of behavior would be acceptable? Maybe a new
callback '#cdata#' which defaults to using '#text#'? Or a '#text#'/2
function which takes the type? Other ideas? Or maybe this patch is
fine?
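
To make the first option concrete, here is a rough sketch of what I have in
mind. It is not xmerl code: the module name cdata_cb_sketch and the functions
export_text/3 and find_cb/3 are made up for illustration, and it ignores the
'#xml-inheritance#' chain that the real callback lookup follows. It only shows
the dispatch rule: try '#cdata#'/1 for cdata nodes, and fall back to '#text#'/1
when no callback module provides it.

```erlang
-module(cdata_cb_sketch).
-export([export_text/3]).

%% Hypothetical helper, not part of xmerl.
%% Type is the `type' field of an #xmlText{} record (text | cdata),
%% Text is its `value' field, and Callbacks is a list of callback modules.
export_text(cdata, Text, Callbacks) ->
    case find_cb(Callbacks, '#cdata#', 1) of
        {ok, M} -> M:'#cdata#'(Text);
        %% No '#cdata#'/1 anywhere: behave exactly like a plain text node.
        none    -> export_text(text, Text, Callbacks)
    end;
export_text(_Type, Text, Callbacks) ->
    case find_cb(Callbacks, '#text#', 1) of
        {ok, M} -> M:'#text#'(Text);
        %% No '#text#'/1 either: emit the text unchanged.
        none    -> Text
    end.

%% Return the first module in the list that exports F/Arity.
%% This is a simplification of the real lookup (assumption).
find_cb([M | Ms], F, Arity) ->
    _ = code:ensure_loaded(M),
    case erlang:function_exported(M, F, Arity) of
        true  -> {ok, M};
        false -> find_cb(Ms, F, Arity)
    end;
find_cb([], _F, _Arity) ->
    none.
```

With this rule, a callback module that does not export '#cdata#'/1 keeps its
current output, while an XML formatter could export something like
'#cdata#'(Text) -> ["<![CDATA[", Text, "]]>"]. A real implementation would
also have to deal with text that itself contains "]]>", which cannot appear
inside a single CDATA section.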
Thanks,
-Anthony
--
------------------------------------------------------------------------
Anthony Molinaro <anthonym@REDACTED>