[erlang-questions] Rant: I hate parsing XML with Erlang

Torbjorn Tornkvist tobbe@REDACTED
Tue Oct 23 17:57:05 CEST 2007


Hakan Mattsson wrote:
> On Tue, 23 Oct 2007, Joel Reymont wrote:
> 
> JR> Take a look at the following [1] and try to visualize an  
> JR> implementation in Erlang. More thoughts after the example.
> JR> 
> JR> The data:
> JR> 
> JR> <Export>
> JR>    <Product>
> JR>      <SKU>403276</SKU>
> JR>      <ItemName>Trivet</ItemName>
> JR>      <CollectionNo>0</CollectionNo>
> JR>      <Pages>0</Pages>
> JR>    </Product>
> JR> </Export>
> JR> 
> JR> The Ruby hPricot code:
> JR> 
> JR> FIELDS = %w[SKU ItemName CollectionNo Pages]
> JR> 
> JR> doc = Hpricot.parse(File.read("my.xml"))
> JR> (doc/:product).each do |xml_product|
> JR>    product = Product.new
> JR>    for field in FIELDS
> JR>      product[field] = (xml_product/field.intern).first.innerHTML
> JR>    end
> JR>    product.save
> JR> end
> 
> At a first glance your Ruby code looks impressively
> compact.  But the corresponding implementation in
> Erlang is about the same size. What's the point in
> adding some syntactic sugar in order to make it even
> more compact? It is just a matter of taste.
> 
>     % cat product.erl
>     -module(product).
>     -compile(export_all).
>     -include_lib("xmerl/include/xmerl.hrl").
>     
>      parse(File) ->
>         {#xmlElement{content = Exports}, _} = xmerl_scan:file(File),
>         [{Tag, Val} || #xmlElement{content = Products} <- Exports,
>                        #xmlElement{content = Fields} <- Products,
>                        #xmlText{parents = [{Tag, _} | _], value = Val}  <- Fields].
>     
>     % erl
>     Erlang (BEAM) emulator version 5.5.5 [source] [async-threads:0] [kernel-poll:false]
>     
>     Eshell V5.5.5  (abort with ^G)
>     1> product:parse("my.xml").
>     [{'SKU',"403276"},{'ItemName',"Trivet"},{'CollectionNo',"0"},{'Pages',"0"}]
>     2> 
> 
> /Håkan
> 

Well done Håkan ;-)

Here is another solution (not as nice as yours), which is
nonetheless rather fun, making use of XPath:
--------------------------------------------
-module(xp).
-export([go/0, go/1]).

-include_lib("xmerl/include/xmerl.hrl").

-define(Val(X),
        (fun() ->
                 [#xmlElement{name = N,
                              content = [#xmlText{value = V}|_]}] = X,
                 {N,V} end)()).

go() ->
    go("/home/tobbe/hej.xml").

go(File) ->
    {Xml, _} = xmerl_scan:file(File),
    [?Val(xmerl_xpath:string("//SKU", Xml)),
     ?Val(xmerl_xpath:string("//ItemName", Xml)),
     ?Val(xmerl_xpath:string("//CollectionNo", Xml)),
     ?Val(xmerl_xpath:string("//Pages", Xml))].
-------------------------------------------------

5> xp:go().
[{'SKU',"403276"},{'ItemName',"Trivet"},{'CollectionNo',"0"},{'Pages',"0"}]
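
Note that ?Val matches on a one-element list, so go/1 above crashes
with a badmatch as soon as the file holds more than one Product (each
"//Field" query then returns several nodes). A minimal sketch of one
way around that, assuming every Product carries each field exactly once
with non-empty text: select the Product elements first, then run a
relative XPath query per field against each of them. The module name
xp_fields, the FIELDS macro and the functions parse/1, parse_string/1,
products/1 and field/2 are made up for illustration; only the
xmerl_scan and xmerl_xpath calls come from xmerl.

```erlang
-module(xp_fields).
-export([parse/1, parse_string/1]).

-include_lib("xmerl/include/xmerl.hrl").

%% Field names to pull out of each <Product>; same list as the Ruby FIELDS.
-define(FIELDS, ["SKU", "ItemName", "CollectionNo", "Pages"]).

%% Parse a file; returns one list of {Tag, Value} pairs per <Product>.
parse(File) ->
    {Xml, _} = xmerl_scan:file(File),
    products(Xml).

%% Same, but for XML held in a string (handy in the shell).
parse_string(Str) ->
    {Xml, _} = xmerl_scan:string(Str),
    products(Xml).

products(Xml) ->
    [[field(Name, P) || Name <- ?FIELDS]
     || P <- xmerl_xpath:string("//Product", Xml)].

%% Relative XPath: select the child element Name of this Product.
%% Crashes with a badmatch if the field is missing, repeated or empty.
field(Name, Product) ->
    [#xmlElement{name = N, content = [#xmlText{value = V} | _]}] =
        xmerl_xpath:string(Name, Product),
    {N, V}.
```

Calling xp_fields:parse("my.xml") should then give a list with one
inner list of {Tag, Value} pairs per Product, in FIELDS order.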


Cheers, Tobbe






