[erlang-questions] xmerl parsing of DTD

Haobo Yu <>
Thu Jul 19 23:10:52 CEST 2007


Hi,

I ran into a problem parsing XML files with an external DTD.  For
example, take the following XML file:

<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "t2.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

When parsed by xmerl 1.1.4, it gives an error (I have t2.dtd in the same
directory):

** exited: {{case_clause,{error,enoent}},
             [{xmerl_scan,fetch_DTD,2},
              {xmerl_scan,scan_doctype2,3},
              {xmerl_scan,scan_prolog,3},
              {xmerl_scan,scan_document,2},
              {xmerl_scan,string,2},
              {erl_eval,do_apply,5},
              {shell,exprs,6},
              {shell,eval_loop,3}]} **

The test code is here:

test_check_dtd() ->
     %% Adjacent string literals are concatenated by the compiler.
     Xml = "<?xml version=\"1.0\"?>\n<!DOCTYPE note SYSTEM \"t2.dtd\">\n"
           "<note>\n<to>Tove</to>\n<from>Jani</from>\n"
           "<heading>Reminder</heading>\n"
           "<body>Don't forget me this weekend!</body>\n</note>\n",
     xmerl_scan:string(Xml, [{validation, off}]).
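
A possible alternative to patching, sketched here as an assumption and untested against xmerl 1.1.4: the xmerl_scan documentation lists a {fetch_path, PathList} option naming directories to search when fetching external files such as a DTD. The module name dtd_path_sketch, the function scan_with_path/2 and the parameter Dir are hypothetical names introduced only for this illustration.

```erlang
-module(dtd_path_sketch).
-export([scan_with_path/2]).

%% Xml is the document as a string; Dir is a hypothetical parameter
%% naming the directory that holds t2.dtd.
%% Assumption: the scanner consults fetch_path when it resolves the
%% SYSTEM identifier, so the DTD no longer has to sit in the shell's
%% current working directory.
scan_with_path(Xml, Dir) ->
    xmerl_scan:string(Xml, [{validation, off}, {fetch_path, [Dir]}]).
```

If the enoent comes from the shell's working directory differing from the directory that contains t2.dtd, a call such as dtd_path_sketch:scan_with_path(Xml, "/path/to/dtd/dir") might avoid the failed fetch without modifying xmerl_scan.erl.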

I found that if I apply the attached patch to xmerl_scan.erl, it works
fine.  The patch restores two commented-out lines (with validation=false
changed to validation=off) so that the DTD is not loaded when validation
is off.  However, I don't know enough to tell what the right patch is,
and I would appreciate suggestions.
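
The effect of the restored clause can be illustrated with a minimal, self-contained sketch. It is not the xmerl source: the record scanner, the function fetch_dtd/2 and the module name fetch_sketch are hypothetical stand-ins, used only to show how a clause matching validation=off, placed before the general clause, returns the state without ever touching the file system.

```erlang
-module(fetch_sketch).
-export([fetch_dtd/2, demo/0]).

%% Hypothetical stand-in for #xmerl_scanner{}; only the fields used here.
-record(scanner, {validation = off, fetched = none}).

%% No DTD named in the document: nothing to do.
fetch_dtd(undefined, S) ->
    S;
%% Validation is off: skip the fetch and return the state unchanged.
%% This clause must come before the general one to take effect.
fetch_dtd(_Spec, S = #scanner{validation = off}) ->
    S;
%% General case: read the DTD; a missing file raises an error.
fetch_dtd({system, File}, S) ->
    case file:read_file(File) of
        {ok, Bin}       -> S#scanner{fetched = Bin};
        {error, Reason} -> erlang:error({dtd_fetch_failed, Reason})
    end.

%% With validation off, a nonexistent DTD file is never opened.
demo() ->
    S = fetch_dtd({system, "no_such_file.dtd"}, #scanner{validation = off}),
    none = S#scanner.fetched,
    ok.
```

Clauses are tried top to bottom, so moving the validation=off clause below the general one would bring back the failure for a missing file.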

Thanks.

- Haobo

--- /opt/local/lib/erlang/lib/xmerl-1.1.4/src/xmerl_scan.erl     2007-04-12 07:28:36.000000000 -0700
+++ xmerl_scan.erl      2007-07-19 11:21:11.000000000 -0700
@@ -715,6 +715,7 @@
      ?dbg("prolog(\"<\")~n", []),

      %% Here we consider the DTD provided by doctype_DTD option,
+    io:format("~p~n", [S0]),
      S1 =
         case S0 of
             #xmerl_scanner{validation=dtd,doctype_DTD=DTD} when list(DTD) ->
@@ -1202,8 +1203,8 @@
      fetch_DTD({system,URI},S#xmerl_scanner{doctype_DTD=option_provided});
fetch_DTD(undefined, S) ->
      S;
-% fetch_DTD(_,S=#xmerl_scanner{validation=false}) ->
-%     S;
+fetch_DTD(_,S=#xmerl_scanner{validation=off}) ->
+    S;
fetch_DTD(DTDSpec, S)->
      case fetch_and_parse(DTDSpec,S,[{text_decl,true},
                                     {environment, {external,subset}}]) of



