[erlang-questions] io_lib format inconsistencies

Wed Sep 19 15:38:42 CEST 2012

On Fri, Sep 07, 2012 at 03:03:29PM -0400, Andrew Thompson wrote:
> I wrote a quickcheck test for lager to compare its formatting to io_lib,
> and I found a bunch of dissimilarities. However, a couple of the things
> I found didn't make a lot of sense, so I figured I'd ask here for an
> explanation.
> 
> First one:
> 
> 1> io:format("~P", [<<>>, 1]).
> <<>>ok
> 2> io:format("~W", [<<>>, 1]).
> <<...>>ok
> 3> io:format("~P", [<<1>>, 1]).
> <<...>>ok
> 4> io:format("~W", [<<1>>, 1]).
> <<...>>ok
> 
> As you can see, ~P and ~W behave differently on the empty binary, but
> behave the same when the binary isn't empty. Seems a bit weird.
> 
> The other one that bothers me:
> 
> 16> io:format("~P", ['hello world', 2]).
> 'hello world'ok
> 17> io:format("~P", ["hello world", 2]).
> "hello world"ok
> 18> io:format("~P", [<<"hello world">>, 2]).
> <<"hell"...>>ok
> 19> io:format("~W", ['hello world', 2]).
> 'hello world'ok
> 20> io:format("~W", ["hello world", 2]).
> [104|...]ok
> 21> io:format("~W", [<<"hello world">>, 2]).
> <<104,...>>ok
> 
> Why does a 'depth' of 2 mean we print the <<>> and the first *four*
> characters of a binary? ~W behaves much more logically. Also why does
> depth not apply to printable lists, but it does to binaries?
> 
> I've got a branch now for lager to be compatible with this madness, but
> I wondered if anyone knows *why* things are like this?

I guess the awkward answer is that the different routines were implemented
by different people at different times, without a firm specification.

Examples 1> and 2> look like an off-by-one bug. 2> must be the wrong one,
since printing <<...>> for an empty binary is misleading.

Example 17> shows that ~P treats a printable string as a single item, which
hence has depth 1. This is dangerous, since a very long string can hog the
I/O server. There is, however, some sense in regarding a printable string
as having no depth (depth one).

Example 18> shows that the author of the binary pretty-printer thought that
a very long binary, even one holding a printable string, should be subject
to depth so that it does not hog an I/O server. A value was chosen so that
4 characters count as depth 1. That looks like an arbitrary practical
compromise; 4 characters are 1 word.

These are the obvious flaws I see:

* Something like what ~P does for printable binaries is needed for strings
too, but the value of 4 characters per depth 1 seems low. An atom has a
maximum length of 255 characters (I think), and it counts as depth 1.
~P should treat printable binaries and strings the same way.

* The discrepancy for ~P vs. ~W on <<>> should be fixed.
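
To check whether a given OTP release still shows these discrepancies, a
small helper along the following lines could be used. This is only a
sketch: the module name depth_check and the function compare/2 are made up
for illustration and are not part of io_lib or lager. Only io_lib:format/2
and lists:flatten/1 are standard library calls.

```erlang
-module(depth_check).
-export([compare/2]).

%% Hypothetical helper: format Term with both ~P and ~W at the given
%% Depth and return the two results as flat strings, together with a
%% flag telling whether they are identical.
compare(Term, Depth) ->
    P = lists:flatten(io_lib:format("~P", [Term, Depth])),
    W = lists:flatten(io_lib:format("~W", [Term, Depth])),
    {P, W, P =:= W}.
```

Calling depth_check:compare(<<>>, 1) corresponds to examples 1> and 2>,
and depth_check:compare(<<"hello world">>, 2) to examples 18> and 21>.
The results may differ from the transcript above on other OTP releases.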

> 
> https://github.com/basho/lager/pull/77
> 
> Andrew

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB