[erlang-questions] json to map
Roelof Wobben
r.wobben@REDACTED
Fri Aug 28 09:45:18 CEST 2015
I will take the challenge, but I'm stuck at the types part.
So far I have this:
-module(time_parser).
-export([]).
-type token :: tInt()
| tWord()
| tSlash()
| tDash()
| tComma().
-type tint() :: integer().
-type tword() :: binary().
-type tSlash() :: binary().
-type tDash() :: binary().
-type tComma() :: binary().
and I see this error: time_parser.erl:5: bad type declaration
Roelof
On 28-8-2015 at 07:21, Richard A. O'Keefe wrote:
> On 27/08/2015, at 11:04 pm, Roelof Wobben <r.wobben@REDACTED> wrote:
>
>> Thanks,
>>
>> Can this be a way to solve the challenge : http://www.evanmiller.org/write-a-template-compiler-for-erlang.html
> That link starts by making three claims:
>
> • Erlang is hard to refactor
>
> I don't find manual refactoring harder in Erlang than in
> any other language (not excluding Java and Smalltalk).
> I haven't tried Wrangler or RefactorErl (see
> http://plc.inf.elte.hu/erlang/) yet, but they look good.
>
> • There is no built-in syntax for hash maps
>
> This is no longer true.
>
> • String manipulation is hard
>
> That's a puzzler. I've found string manipulation using
> lists *easier* in Erlang than in almost anything but SNOBOL
> or Prolog. I would certainly ***MUCH*** rather write a
> string -> JSON parser in Erlang than in say Java or even
> Lisp. (Of course the Bigloo implementation of Scheme has
> special support for lexers and parsers built in, which does
> change the picture.)
>
> The question is always "compared with WHAT?" In many cases
> the key trick for manipulating strings is DON'T. My JSON
> parser in Smalltalk, for example, is only concerned with
> strings to the extent that they are a nasty problem posed
> by JSON that it has to solve; they are not something that
> it uses for its own purposes. The tokeniser converts a
> stream of characters to a stream of tokens, and the parser
> works with tokens, not characters. (Yes, I know about
> scannerless parsers, but the factoring has always helped me
> to get a parser working. A separate tokeniser is something
> that I can *TEST* without having to have the rest of the
> parser working.)
>
> Then it turns out that the web page is really about writing
> a compiler from "Django Template Language" to Erlang.
> "It helps to get a hold of a language specification if there
> is one. I am implementing the Django Template Language. There's
> not really a spec, but there is an official implementation in Python,"
>
> OUCH! What *IS* it about this industry? Why do we get notations
> that become popular where there is no spec (like Markdown,
> originally, or JSON, ditto -- it had syntax but no semantics)
> or the spec is confused (like XML, where they muddled up
> syntax and semantics so that we ended up with several different
> semantics for XML, or the first version of RDF, where they
> meant to define it in terms of XML semantics, but there wasn't
> really one, so they defined it in terms of XML syntax *by mistake*)?
>
> That page talks about writing a scanner with an argument to
> say what the state is. This is almost always a bad idea.
> Each state should be modelled by a separate Erlang function.
>
> Let's see an example of this.
> Let's consider dates written in one of four ways:
> dd/mm/yyyy
> dd MON yyyy
> MON dd[,] yyyy
> yyyy-mm-dd
>
> (By the way, we give matching and cleaning up data that's just
> a little bit more complex than this as an exercise to 3rd year
> students. Thinking in Java makes it *impossible* for them to
> get something like this right in a 2-hour lab session.
> Regular expressions are a royal road to ruin.)
>
> I'll do this in Haskell.
>
> import Data.Char (isAlpha, isDigit, isSpace, ord, toLower)
>
> data Token
>   = TInt Int
>   | TWord String
>   | TSlash
>   | TDash
>   | TComma
>
> tokens :: [Char] -> [Token]
>
> tokens [] = []
> tokens (c:cs) | isSpace c = tokens cs
> tokens (c:cs) | isDigit c = digits cs (ord c - ord '0')
> tokens (c:cs) | isAlpha c = word cs [toLower c]
> tokens ('/':cs) = TSlash : tokens cs
> tokens ('-':cs) = TDash : tokens cs
> tokens (',':cs) = TComma : tokens cs
> -- anything else will crash
>
> -- accumulate a decimal number, then emit it as one TInt token
> digits (c:cs) n | isDigit c = digits cs (ord c - ord '0' + n*10)
> digits cs n = TInt n : tokens cs
>
> -- accumulate a lower-cased word in reverse, then emit it as one TWord token
> word (c:cs) w | isAlpha c = word cs (toLower c : w)
> word cs w = TWord (reverse w) : tokens cs
>
> Converting the tokeniser to Erlang is a trivial exercise for
> the reader.
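A minimal sketch of that conversion, assuming tagged tuples and atoms for the tokens. The module name `date_tokens` and the token shapes `{int, N}`, `{word, W}`, `slash`, `dash` and `comma` are assumptions for illustration, not something fixed by the Haskell version. It only handles ASCII letters and digits, where the Haskell `isAlpha` is Unicode-aware. Note that a type declaration needs parentheses after the type name, as in `-type token() :: ...`.

```erlang
-module(date_tokens).
-export([tokens/1]).
-export_type([token/0]).

%% Hypothetical token representation: tagged tuples for tokens that
%% carry a value, plain atoms for punctuation.
-type token() :: {int, non_neg_integer()}
               | {word, string()}
               | slash
               | dash
               | comma.

%% Start state: skip white space and dispatch on the first character.
-spec tokens(string()) -> [token()].
tokens([]) -> [];
tokens([C | Cs]) when C =:= $\s; C =:= $\t; C =:= $\n; C =:= $\r -> tokens(Cs);
tokens([C | Cs]) when C >= $0, C =< $9 -> digits(Cs, C - $0);
tokens([C | Cs]) when C >= $a, C =< $z -> word(Cs, [C]);
tokens([C | Cs]) when C >= $A, C =< $Z -> word(Cs, [C + 32]);
tokens([$/ | Cs]) -> [slash | tokens(Cs)];
tokens([$- | Cs]) -> [dash | tokens(Cs)];
tokens([$, | Cs]) -> [comma | tokens(Cs)].
%% anything else crashes with function_clause

%% "In a number" state: accumulate decimal digits.
digits([C | Cs], N) when C >= $0, C =< $9 -> digits(Cs, N * 10 + C - $0);
digits(Cs, N) -> [{int, N} | tokens(Cs)].

%% "In a word" state: accumulate lower-cased ASCII letters in reverse.
word([C | Cs], W) when C >= $a, C =< $z -> word(Cs, [C | W]);
word([C | Cs], W) when C >= $A, C =< $Z -> word(Cs, [C + 32 | W]);
word(Cs, W) -> [{word, lists:reverse(W)} | tokens(Cs)].
```

With this sketch, `date_tokens:tokens("Aug 28, 2015")` gives `[{word,"aug"},{int,28},comma,{int,2015}]`. Each scanner state is its own function, so there is no state argument to pass around.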
>
> valid_month :: String -> Int
> valid_month "jan" = 1
> valid_month "january" = 1
> ...
> valid_month "december" = 12
> -- anything else will crash
>
> string_to_date :: [Char] -> (Int,Int,Int)
>
> string_to_date cs =
>   case tokens cs of
>     [TInt d,TSlash,TInt m,TSlash,TInt y] -> check y m d
>     [TInt y,TDash, TInt m,TDash, TInt d] -> check y m d
>     [TInt d,TWord m,TInt y] -> check y (valid_month m) d
>     [TWord m,TInt d,TComma,TInt y] -> check y (valid_month m) d
>     [TWord m,TInt d, TInt y] -> check y (valid_month m) d
>     -- anything else will crash
>
> check :: Int -> Int -> Int -> (Int,Int,Int)
> -- left as a boring exercise for the reader.
>
> Converting this to Erlang is also a trivial exercise for the reader.
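A possible Erlang rendering of the date matcher, working on a token list rather than on characters. Everything named here is an assumption for illustration: the module name `date_parse`, the token shapes `{int, N}`, `{word, W}`, `slash`, `dash` and `comma`, the rule that a month is either its full English name or its first three letters, and the use of `calendar:valid_date/3` as the check.

```erlang
-module(date_parse).
-export([tokens_to_date/1]).

%% Assumes tokens shaped as {int, N}, {word, W}, slash, dash and comma
%% (a hypothetical representation, not fixed by the Haskell sketch).
-spec tokens_to_date([term()]) -> {integer(), integer(), integer()}.
tokens_to_date([{int, D}, slash, {int, M}, slash, {int, Y}]) -> check(Y, M, D);
tokens_to_date([{int, Y}, dash, {int, M}, dash, {int, D}]) -> check(Y, M, D);
tokens_to_date([{int, D}, {word, M}, {int, Y}]) -> check(Y, valid_month(M), D);
tokens_to_date([{word, M}, {int, D}, comma, {int, Y}]) -> check(Y, valid_month(M), D);
tokens_to_date([{word, M}, {int, D}, {int, Y}]) -> check(Y, valid_month(M), D).
%% anything else crashes with function_clause

%% Assumption: accept the full English month name or its first three letters.
valid_month(Word) ->
    Names = ["january", "february", "march", "april", "may", "june", "july",
             "august", "september", "october", "november", "december"],
    month_number(Word, Names, 1).

month_number(Word, [Name | Rest], N) ->
    case Word =:= Name orelse Word =:= lists:sublist(Name, 3) of
        true -> N;
        false -> month_number(Word, Rest, N + 1)
    end.
%% running out of names crashes with function_clause

%% One possible check: defer to calendar:valid_date/3 from the standard library.
check(Y, M, D) ->
    true = calendar:valid_date(Y, M, D),
    {Y, M, D}.
```

The five clause heads are the five token patterns from the `case` above, so the shape of each accepted date format stays visible in the code.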
>
> You will notice that there are multiple scanning functions and
> no 'what state am I in?' parameter. Your scanner should KNOW
> what state it is in because it knows what function is running.
>
> Yecc is a great tool, but for something like this there's no
> real point in it, and even for something like JSON I would
> rather not use it.
>
> One thing that Leex and Yecc can do for you
> is to help you track source position for reporting
> errors. For a configuration file, it may be sufficient to
> just say "Can't parse configuration file X as JSON."
>
> OK, the technique I used above is "recursive descent",
> which works brilliantly for LL(k) languages with small k.
> But you knew that.
> Oh yes, this does mean that writing a parser is just like
> writing a lexical analyser, except that you get to use
> general recursion. Again, you typically have (at least)
> one function per non-terminal symbol, plus (if your
> original specification used extended BNF) one function
> per repetition.
>
> Heck.
> s expression
>   = word
>   | "(", [s expression+, [".", s expression]], ")".
>
> -- assumes Token is extended with TLp, TRp and TDot
> data SExpr
>   = Word String
>   | Cons SExpr SExpr
>   | Nil
>
> s_expression :: [Token] -> (SExpr, [Token])
>
> s_expression (TWord w : ts) = (Word w, ts)
> s_expression (TLp : TRp : ts) = (Nil, ts)
> s_expression (TLp : ts) = s_expr_body ts
>
> s_expr_body (TRp : ts) = (Nil, ts)
> s_expr_body (TDot : ts) =
>   let (e, TRp : ts') = s_expression ts
>   in (e, ts')
> s_expr_body ts =
>   let (f, ts') = s_expression ts
>       (r, ts'') = s_expr_body ts'
>   in (Cons f r, ts'')
>
> This is so close to JSON that handling JSON without
> "objects" should now be straightforward. And it makes
> a good development step.
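The same recursive-descent shape carries over to Erlang almost line for line. This is only a sketch: the module name `sexpr`, the tokens `{word, W}`, `lp`, `rp` and `dot`, and the result terms `{word, W}`, `{cons, Head, Tail}` and `nil` are all assumptions chosen for illustration.

```erlang
-module(sexpr).
-export([s_expression/1]).

%% Hypothetical token and result representations.
-type token() :: {word, string()} | lp | rp | dot.
-type sexpr() :: {word, string()} | {cons, sexpr(), sexpr()} | nil.

%% Parse one s-expression; return it together with the unused tokens.
-spec s_expression([token()]) -> {sexpr(), [token()]}.
s_expression([{word, W} | Ts]) -> {{word, W}, Ts};
s_expression([lp | Ts]) -> s_expr_body(Ts).
%% anything else crashes with function_clause

%% Parse what follows an opening parenthesis, up to and including
%% the matching closing parenthesis.
s_expr_body([rp | Ts]) -> {nil, Ts};
s_expr_body([dot | Ts]) ->
    {E, [rp | Ts1]} = s_expression(Ts),
    {E, Ts1};
s_expr_body(Ts) ->
    {F, Ts1} = s_expression(Ts),
    {R, Ts2} = s_expr_body(Ts1),
    {{cons, F, R}, Ts2}.
```

Each function returns the parsed term plus the remaining tokens, which is what lets the caller continue where the callee stopped.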
>