[erlang-questions] Rant: I hate parsing XML with Erlang
Torbjorn Tornkvist
tobbe@REDACTED
Tue Oct 23 17:57:05 CEST 2007
Hakan Mattsson wrote:
> On Tue, 23 Oct 2007, Joel Reymont wrote:
>
> JR> Take a look at the following [1] and try to visualize an
> JR> implementation in Erlang. More thoughts after the example.
> JR>
> JR> The data:
> JR>
> JR> <Export>
> JR>   <Product>
> JR>     <SKU>403276</SKU>
> JR>     <ItemName>Trivet</ItemName>
> JR>     <CollectionNo>0</CollectionNo>
> JR>     <Pages>0</Pages>
> JR>   </Product>
> JR> </Export>
> JR>
> JR> The Ruby Hpricot code:
> JR>
> JR> FIELDS = %w[SKU ItemName CollectionNo Pages]
> JR>
> JR> doc = Hpricot.parse(File.read("my.xml"))
> JR> (doc/:product).each do |xml_product|
> JR>   product = Product.new
> JR>   for field in FIELDS
> JR>     product[field] = (xml_product/field.intern).first.innerHTML
> JR>   end
> JR>   product.save
> JR> end
>
> At first glance your Ruby code looks impressively
> compact. But the corresponding implementation in
> Erlang is about the same size. What's the point in
> adding some syntactic sugar in order to make it even
> more compact? It is just a matter of taste.
>
> % cat product.erl
> -module(product).
> -compile(export_all).
> -include_lib("xmerl/include/xmerl.hrl").
>
> parse(File) ->
>     {#xmlElement{content = Exports}, _} = xmerl_scan:file(File),
>     [{Tag, Val} || #xmlElement{content = Products} <- Exports,
>                    #xmlElement{content = Fields} <- Products,
>                    #xmlText{parents = [{Tag, _} | _], value = Val} <- Fields].
>
> % erl
> Erlang (BEAM) emulator version 5.5.5 [source] [async-threads:0] [kernel-poll:false]
>
> Eshell V5.5.5 (abort with ^G)
> 1> product:parse("my.xml").
> [{'SKU',"403276"},{'ItemName',"Trivet"},{'CollectionNo',"0"},{'Pages',"0"}]
> 2>
>
> /Håkan
>
Well done Håkan ;-)
Here is another solution (not as nice as yours), which is nonetheless
rather fun, since it makes use of XPath:
--------------------------------------------
-module(xp).
-export([go/0, go/1]).
-include_lib("xmerl/include/xmerl.hrl").

-define(Val(X),
        (fun() ->
             [#xmlElement{name = N, content = [#xmlText{value = V}|_]}] = X,
             {N,V}
         end)()).

go() ->
    go("/home/tobbe/hej.xml").

go(File) ->
    {Xml, _} = xmerl_scan:file(File),
    [?Val(xmerl_xpath:string("//SKU", Xml)),
     ?Val(xmerl_xpath:string("//ItemName", Xml)),
     ?Val(xmerl_xpath:string("//CollectionNo", Xml)),
     ?Val(xmerl_xpath:string("//Pages", Xml))].
-------------------------------------------------
5> xp:go().
[{'SKU',"403276"},{'ItemName',"Trivet"},{'CollectionNo',"0"},{'Pages',"0"}]
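
The four near-identical XPath calls could also be driven from a list of
field names, closer in spirit to the FIELDS array in the Ruby version.
What follows is only a minimal sketch, not part of the code above: the
module name xp_fields, the FIELDS macro, and the helpers from_string/1,
fields/1 and val/1 are made up for illustration. It assumes, like the
Val macro, that each query matches exactly one element whose first child
is a text node.

```erlang
-module(xp_fields).
-export([go/1, from_string/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% Field names taken from the sample data in this thread (assumption:
%% these are the only fields of interest).
-define(FIELDS, ["SKU", "ItemName", "CollectionNo", "Pages"]).

%% Parse a file, as xp:go/1 does.
go(File) ->
    {Xml, _Rest} = xmerl_scan:file(File),
    fields(Xml).

%% Same, but for an XML document held in a string (handy in the shell).
from_string(String) ->
    {Xml, _Rest} = xmerl_scan:string(String),
    fields(Xml).

%% Run one XPath query per field name.
fields(Xml) ->
    [val(xmerl_xpath:string("//" ++ Field, Xml)) || Field <- ?FIELDS].

%% Function version of the Val macro: expects exactly one matching
%% element whose first child is a text node. Any other result fails
%% with a function_clause error.
val([#xmlElement{name = Name, content = [#xmlText{value = Value} | _]}]) ->
    {Name, Value}.
```

Note that with more than one Product element in the file, "//SKU" would
match several nodes and val/1 would crash, just as the macro would, so
this only suits the single-product sample shown earlier.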
Cheers, Tobbe