xmerl newlines

Ulf Wiger etxuwig@REDACTED
Tue Jun 17 17:34:36 CEST 2003


On Tue, 17 Jun 2003, Erik Reitsma (ETM) wrote:

>> I'm trying to output an XML file, and with xmerl 0.15
>> I seem to recall newlines were put in for me.  Now I
>> upgraded to 0.18 and the XML is one long string --
>> fine for a word-wrap editor like emacs, otherwise a bit
>> hard to use.
>
>On the other hand, these newlines and tabs and spaces also
>appear in parsed XML. Therefore this is indeed more
>symmetrical.

There are many different requirements... (:

I understand the problem as being that you've built a
structure of tuples (or #xmlElement{}) in Erlang and want
them exported with some pretty-printing. Is that right?

A less than perfect pretty-print hack to xmerl_xml.erl
illustrates how it could be done. xmerl_xml.erl is quite a
small module, and easily modified into a local version of
export that fits your needs perfectly:

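%% Start each element on a new line, indented by its nesting depth.
'#element#'(Tag, [], Attrs, Parents, _E) ->
    [pp(length(Parents)), empty_tag(Tag, Attrs)];
'#element#'(Tag, Data, Attrs, Parents, _E) ->
    [pp(length(Parents)), markup(Tag, Attrs, Data)].

%% A newline followed by four spaces per nesting level.
pp(N) ->
    ["\n", lists:duplicate(N*4, $\s)].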


This has the disadvantage of not putting the end tags where
you'd expect them to be. To fix that, you have to
copy xmerl_lib:markup/3 and modify it -- not that difficult
perhaps.
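
A rough sketch of that end-tag fix, written as a separate callback
module rather than a patched xmerl_xml.erl. The module name xml_pp and
the helper has_child_elements/1 are made up for illustration. It
assumes that xmerl_lib exports start_tag/2, end_tag/1, empty_tag/2 and
markup/3, and that listing xmerl_xml in '#xml-inheritance#'/0 makes
xmerl fall back to it for '#root#'/4. The end tag only gets its own
line when the element has child elements, so that text-only elements
do not pick up extra whitespace:

```erlang
-module(xml_pp).  % hypothetical module name
-export(['#xml-inheritance#'/0, '#element#'/5, '#text#'/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% Anything not defined here ('#root#'/4) is taken from xmerl_xml.
'#xml-inheritance#'() -> [xmerl_xml].

'#element#'(Tag, [], Attrs, Parents, _E) ->
    [pp(length(Parents)), xmerl_lib:empty_tag(Tag, Attrs)];
'#element#'(Tag, Data, Attrs, Parents, E) ->
    Indent = pp(length(Parents)),
    case has_child_elements(E) of
        true ->
            %% Put the end tag on its own line, aligned with the start tag.
            [Indent, xmerl_lib:start_tag(Tag, Attrs), Data,
             Indent, xmerl_lib:end_tag(Tag)];
        false ->
            %% Text-only content: keep start tag, text and end tag together.
            [Indent, xmerl_lib:markup(Tag, Attrs, Data)]
    end.

%% Drop whitespace-only text so it does not fight with the indentation.
'#text#'(Text) ->
    case is_whitespace(Text) of
        true  -> [];
        false -> xmerl_lib:export_text(Text)
    end.

%% A newline followed by four spaces per nesting level.
pp(N) ->
    ["\n", lists:duplicate(N * 4, $\s)].

%% True if at least one direct child is an element (hypothetical helper).
has_child_elements(#xmlElement{content = Content}) ->
    lists:any(fun(#xmlElement{}) -> true; (_) -> false end, Content).

is_whitespace(" " ++ T)  -> is_whitespace(T);
is_whitespace("\n" ++ T) -> is_whitespace(T);
is_whitespace("\t" ++ T) -> is_whitespace(T);
is_whitespace([_|_])     -> false;
is_whitespace([])        -> true.
```

It could then be tried from the shell with something like
io:put_chars(xmerl:export_simple([{doc, [], [{title, [], ["Hello"]}]}], xml_pp)).
Mixed content (text and elements side by side) will still come out
with whitespace added around the child elements.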

To accompany the above hack, you may want to remove any
whitespace already in the structure:

'#text#'(Text) ->
    case is_whitespace(Text) of
        true ->
            [];
        false ->
            export_text(Text)
    end.

is_whitespace(" " ++ T) ->  is_whitespace(T);
is_whitespace("\n" ++ T) -> is_whitespace(T);
is_whitespace("\t" ++ T) -> is_whitespace(T);
is_whitespace([_|_]) -> false;
is_whitespace([]) ->
    true.
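
The same predicate can also be used in a separate pass over the tree
before exporting, instead of inside the '#text#' callback. A minimal
sketch, assuming the tree consists of the records from xmerl.hrl and
that text values are flat strings; the module name strip_ws and the
function names are made up for illustration:

```erlang
-module(strip_ws).  % hypothetical module name
-export([strip/1, is_whitespace/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% Remove every whitespace-only #xmlText{} below the given element.
%% The pos fields of the remaining nodes are left as they were.
strip(#xmlElement{content = Content} = E) ->
    Kept = [strip(C) || C <- Content, not is_ws_text(C)],
    E#xmlElement{content = Kept};
strip(Other) ->
    Other.

is_ws_text(#xmlText{value = V}) -> is_whitespace(V);
is_ws_text(_)                   -> false.

is_whitespace(" " ++ T)  -> is_whitespace(T);
is_whitespace("\n" ++ T) -> is_whitespace(T);
is_whitespace("\t" ++ T) -> is_whitespace(T);
is_whitespace([_|_])     -> false;
is_whitespace([])        -> true.
```

For example, {Doc, _} = xmerl_scan:string(Xml) followed by
xmerl:export([strip_ws:strip(Doc)], xmerl_xml) should export the
document without the whitespace-only text nodes.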

Or scan with the {space, normalize} option (see below).


>It would be nice if the newlines and tabs and spaces would
>be removed from the parsed XML (if it is not already
>supported and I missed some option). Now I do a pass
>through the parsed XML to remove all xmlText records that
>contain nothing but whitespace.

I had implemented the {space,...} option wrongly in
xmerl-0.15 and was set straight by those who use and
understand XML. (: However, the option {space,normalize}
_should_ do almost what you want.  It will accumulate
consecutive whitespace and replace it with one space. Close
enough?
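
To see what the option does to a given document, something along these
lines can be used. It is only a sketch: the module name
space_normalize_demo and the function text_values/1 are made up for
illustration. It parses a string with {space, normalize} and returns
the values of all text nodes in document order:

```erlang
-module(space_normalize_demo).  % hypothetical module name
-export([text_values/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% Parse Xml with {space, normalize} and list the text node values.
text_values(Xml) ->
    {Doc, _Rest} = xmerl_scan:string(Xml, [{space, normalize}]),
    collect(Doc).

%% Walk the tree depth-first, collecting #xmlText{} values.
collect(#xmlElement{content = Content}) ->
    lists:append([collect(C) || C <- Content]);
collect(#xmlText{value = V}) ->
    [V];
collect(_Other) ->
    [].
```

Calling space_normalize_demo:text_values("<a>\n  <b>x \n y</b>\n</a>")
shows which text nodes remain. Whitespace-only nodes between elements
are not removed by the option, only collapsed, so a filter like
is_whitespace/1 above is still needed to get rid of them completely.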

/Uffe
-- 
Ulf Wiger, Senior Specialist,
   / / /   Architecture & Design of Carrier-Class Software
  / / /    Strategic Product & System Management
 / / /     Ericsson AB, Connectivity and Control Nodes



