[erlang-questions] json to map

Roelof Wobben r.wobben@REDACTED
Fri Aug 28 09:45:18 CEST 2015


I will take the challenge, but I'm stuck at the types part.

So far I have this:

-module(time_parser).

-export([]).

-type token :: tInt()
              | tWord()
              | tSlash()
              | tDash()
              | tComma().

-type tint()   :: integer().
-type tword()  :: binary().
-type tSlash() :: binary().
-type tDash()  :: binary().
-type tComma() :: binary().


and I see this error: time_parser.erl:5: bad type declaration

Roelof


On 28-8-2015 at 07:21, Richard A. O'Keefe wrote:
> On 27/08/2015, at 11:04 pm, Roelof Wobben <r.wobben@REDACTED> wrote:
>
>> Thanks,
>>
>> Can this be a way to solve the challenge : http://www.evanmiller.org/write-a-template-compiler-for-erlang.html
> That link starts by making three claims:
>
>   • Erlang is hard to refactor
>
>     I don't find manual refactoring harder in Erlang than in
>     any other language (not excluding Java and Smalltalk).
>     I haven't tried Wrangler or RefactorErl (see
>     http://plc.inf.elte.hu/erlang/) yet, but they look good.
>
>   • There is no built-in syntax for hash maps
>
>     This is no longer true.
>
>   • String manipulation is hard
>
>     That's a puzzler.  I've found string manipulation using
>     lists *easier* in Erlang than in almost anything but SNOBOL
>     or Prolog.  I would certainly ***MUCH*** rather write a
>     string -> JSON parser in Erlang than in say Java or even
>     Lisp.  (Of course the Bigloo implementation of Scheme has
>     special support for lexers and parsers built in, which does
>     change the picture.)
>
>     The question is always "compared with WHAT?"  In many cases
>     the key trick for manipulating strings is DON'T.  My JSON
>     parser in Smalltalk, for example, is only concerned with
>     strings to the extent that they are a nasty problem posed
>     by JSON that it has to solve; they are not something that
>     it uses for its own purposes.  The tokeniser converts a
>     stream of characters to a stream of tokens, and the parser
>     works with tokens, not characters.  (Yes, I know about
>     scannerless parsers, but the factoring has always helped me
>     to get a parser working.  A separate tokeniser is something
>     that I can *TEST* without having to have the rest of the
>     parser working.)
>
> Then it turns out that the web page is really about writing
> a compiler from "Django Template Language" to Erlang.
> "It helps to get a hold of a language specification if there
> is one. I am implementing the Django Template Language.  There's
> not really a spec, but there is an official implementation in Python,"
>
> OUCH!  What *IS* it about this industry?  Why do we get notations
> that become popular where there is no spec (like Markdown,
> originally, or JSON, ditto -- it had syntax but no semantics)
> or the spec is confused (like XML, where they muddled up
> syntax and semantics so that we ended up with several different
> semantics for XML, or the first version of RDF, where they
> meant to define it in terms of XML semantics, but there wasn't
> really one, so they defined it in terms of XML syntax *by mistake*).
>
> That page talks about writing a scanner with an argument to
> say what the state is.  This is almost always a bad idea.
> Each state should be modelled by a separate Erlang function.
>
> Let's see an example of this.
> Let's consider dates written in one of four ways:
>      dd/mm/yyyy
>      dd MON yyyy
>      MON dd[,] yyyy
>      yyyy-mm-dd
>
> (By the way, we give matching and cleaning up data that's just
> a little bit more complex than this as an exercise to 3rd year
> students.  Thinking in Java makes it *impossible* for them to
> get something like this right in a 2-hour lab session.
> Regular expressions are a royal road to ruin.)
>
> I'll do this in Haskell.
>
> import Data.Char (isAlpha, isDigit, isSpace, ord, toLower)
>
> data Token
>     = TInt Int
>     | TWord String
>     | TSlash
>     | TDash
>     | TComma
>
> tokens :: [Char] -> [Token]
>
> tokens [] = []
> tokens (c:cs) | isSpace c = tokens cs
> tokens (c:cs) | isDigit c = digits cs (ord c - ord '0')
> tokens (c:cs) | isAlpha c = word   cs [c]
> tokens ('/':cs) = TSlash : tokens cs
> tokens ('-':cs) = TDash  : tokens cs
> tokens (',':cs) = TComma : tokens cs
> -- anything else will crash
>
> digits (c:cs) n | isDigit c = digits cs (n*10 + ord c - ord '0')
> digits cs     n             = TInt n : tokens cs
>
> word (c:cs) w | isAlpha c = word cs (toLower c : w)
> word cs     w             = TWord (reverse w) : tokens cs
>
> Converting the tokeniser to Erlang is a trivial exercise for
> the reader.
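>
> For concreteness, here is roughly how that conversion might come out
> (a sketch only; I have chosen tagged tuples {int,N} and {word,W} and
> the atoms slash, dash, comma as the token representation, and treated
> only blanks and tabs as white space):
>
> tokens([]) -> [];
> tokens([C|Cs]) when C =:= $\s; C =:= $\t -> tokens(Cs);
> tokens([C|Cs]) when $0 =< C, C =< $9 -> digits(Cs, C - $0);
> tokens([C|Cs]) when $a =< C, C =< $z; $A =< C, C =< $Z -> word(Cs, [C]);
> tokens([$/|Cs]) -> [slash | tokens(Cs)];
> tokens([$-|Cs]) -> [dash  | tokens(Cs)];
> tokens([$,|Cs]) -> [comma | tokens(Cs)].
> %% anything else will crash, as in the Haskell version
>
> digits([C|Cs], N) when $0 =< C, C =< $9 -> digits(Cs, N*10 + C - $0);
> digits(Cs, N) -> [{int,N} | tokens(Cs)].
>
> word([C|Cs], W) when $a =< C, C =< $z -> word(Cs, [C|W]);
> word([C|Cs], W) when $A =< C, C =< $Z -> word(Cs, [C - $A + $a | W]);
> word(Cs, W) -> [{word, lists:reverse(W)} | tokens(Cs)].
>
> Now tokens("28 aug 2015") gives [{int,28},{word,"aug"},{int,2015}],
> and that is something you can test without any parser at all.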
>
> valid_month :: String -> Int
> valid_month "jan"      = 1
> valid_month "january"  = 1
> ...
> valid_month "december" = 12
> -- anything else will crash
>
> string_to_date :: [Char] -> (Int,Int,Int)
>
> string_to_date cs =
>    case tokens cs of
>      [TInt d,TSlash,TInt m,TSlash,TInt y] -> check y m d
>      [TInt y,TDash, TInt m,TDash, TInt d] -> check y m d
>      [TInt d,TWord m,TInt y]              -> check y (valid_month m) d
>      [TWord m,TInt d,TComma,TInt y]       -> check y (valid_month m) d
>      [TWord m,TInt d,       TInt y]       -> check y (valid_month m) d
> -- anything else will crash
>
> check :: Int -> Int -> Int -> (Int,Int,Int)
> -- left as a boring exercise for the reader.
>
> Converting this to Erlang is also a trivial exercise for the reader.
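>
> For instance (again a sketch, using the token representation from the
> tokeniser sketch above):
>
> string_to_date(Cs) ->
>     case tokens(Cs) of
>         [{int,D}, slash, {int,M}, slash, {int,Y}] -> check(Y, M, D);
>         [{int,Y}, dash,  {int,M}, dash,  {int,D}] -> check(Y, M, D);
>         [{int,D}, {word,M}, {int,Y}]              -> check(Y, valid_month(M), D);
>         [{word,M}, {int,D}, comma, {int,Y}]       -> check(Y, valid_month(M), D);
>         [{word,M}, {int,D},        {int,Y}]       -> check(Y, valid_month(M), D)
>     end.
> %% anything else will crash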
>
> You will notice that there are multiple scanning functions and
> no 'what state am I in?' parameter.  Your scanner should KNOW
> what state it is in because it knows what function is running.
>
> Yecc is a great tool, but for something like this there's no
> real point in it, and even for something like JSON I would
> rather not use it.
>
> One thing that Leex and Yecc can do for you
> is to help you track source position for reporting
> errors.  For a configuration file, it may be sufficient to
> just say "Can't parse configuration file X as JSON."
>
> OK, the technique I used above is "recursive descent",
> which works brilliantly for LL(k) languages with small k.
> But you knew that.
> Oh yes, this does mean that writing a parser is just like
> writing a lexical analyser, except that you get to use
> general recursion.  Again, you typically have (at least)
> one function per non-terminal symbol, plus (if your
> original specification used extended BNF) one function
> per repetition.
>
> Heck.
> s expression
>    = word
>    | "(", [s expression+, [".", s expression]], ")".
>
> data SExpr
>     = Word String
>     | Cons SExpr SExpr
>     | Nil
>
> s_expression :: [Token] -> (SExpr, [Token])
>
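> -- (The token type assumed here has TLp, TRp, TDot for "(", ")", ".",
> -- plus TWord as before.)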
> s_expression (TWord w : ts) = (Word w, ts)
> s_expression (TLp : TRp : ts) = (Nil, ts)
> s_expression (TLp : ts) = s_expr_body ts
>
> s_expr_body (TRp : ts) = (Nil, ts)
> s_expr_body (TDot : ts) =
>     let (e, TRp : ts') = s_expression ts
>      in (e, ts')
> s_expr_body ts =
>     let (f, ts')  = s_expression ts
>         (r, ts'') = s_expr_body ts'
>      in (Cons f r, ts'')
>
> This is so close to JSON that handling JSON without
> "objects" should now be straightforward.  And it makes
> a good development step.
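>
> To make that concrete, here is the same shape applied to JSON arrays
> without objects, sketched in Erlang (the token names {int,N},
> {string,S}, '[', ']', ',' are again of my own choosing):
>
> value([{int,N} | Ts])    -> {N, Ts};
> value([{string,S} | Ts]) -> {S, Ts};
> value(['[', ']' | Ts])   -> {[], Ts};
> value(['[' | Ts])        -> array_body(Ts).
>
> array_body(Ts) ->
>     {E, Ts1} = value(Ts),
>     case Ts1 of
>         [',' | Ts2] -> {Es, Ts3} = array_body(Ts2),
>                        {[E|Es], Ts3};
>         [']' | Ts2] -> {[E], Ts2}
>     end.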
>



