[erlang-questions] behavior of lists:append/1

Thu Sep 19 03:33:28 CEST 2019

I'm sorry, but what O(1) runtime check are we talking about here?
I don't know of any O(1) way to test "is this a proper list" in
Erlang.

A little bit of history here.
Erlang has roots in Strand88 which has roots in Prolog which (in its
current form) is an "Edinburgh" language which has roots in Pop-2 and
Lisp.

In Prolog, you can construct a term [H|T] before you know what H and/or
T are.  Indeed, [1,2,...,1000000|T] is possible.  When eventually T is
bound, the runtime system has no way of telling that it is used as the
final tail of an incomplete list.

In Lisp, (cons H T) must know what H and T are.  So
   (set! X (cons 1 (cons 2 (cons 3 nil))))
   (list? X)
is possible.  But it's pretty much useless, because
the fact that X is a list *now* doesn't mean it will *always*
be a list.  Things like
   (set-cdr! X 42)
are possible, as indeed are things like
   (set-cdr! X X)
I used Scheme syntax here because Scheme has list? and Lisp doesn't.

Pop-2, the rival language where Edinburgh Prolog was developed,
is an interesting case.  It's a strict imperative language that also
has lazy lists.  So
function islist x;
   if x.ispair then
      x.back -> x;
      x.ispair or x = nil or x.isfunc
   else
      x = nil
   close
end;
takes O(1) time, recognises cons(1,[]) as a list, rejects cons(1,2),
but given what *might* be a lazy list guesses that it is, rather than
expand what might well be an infinite sequence.  Thanks to lazy
lists, x.islist might be true *now*, but after evaluating it one
step, it might be false quite soon.

So in the 1960s and 1970s, programmers using Lisp, Pop-2, and Prolog
got used to the idea that "is this a well-formed list" was somewhere
between impossible to implement, usable only at the instant of testing,
or dangerously expensive.  Lisp and Pop-2 settled on having LISTP or
islist tests that are fast but unreliable.  Prolog settled on using
pattern matching.  And Erlang followed Lisp.

Returning now to Erlang.
Historically, just like Lisp, Pop-2, and Prolog,
Erlang uses the *same* data type for arbitrary pairs and for
nodes in a list.  [1|2] is legal Erlang data.

This means that we can do

fakeout(0) -> 0;
fakeout(N) -> [N | fakeout(N-1)].

...
   [a,b] ++ fakeout(1000000)

Because Erlang
 * uses only strict evaluation, so all data are known in full
 * does not allow data to be changed
it _would_ be possible to change the data representation used
by Erlang to distinguish between
 - a pair whose tl is a proper list
 - a pair whose tl is not a proper list.
This would allow an "is_proper_list/1" test that is not only
constant time but fast, at the price of slowing down every
constructive use of [H|T].

Now no-one who is unwilling to make a speed-for-safety tradeoff
would be using Erlang in the first place.  So the community may
well decide that this major overhaul to BEAm is worth while.

But until such a change is made, there is no O(1) test that we
can use in (++) that will guarantee the result is well formed.

Now I am a bear of very little brain, and I haven't been getting
as much sleep as I should lately, so it would be the very reverse
of surprising if I was wrong about this.  But if I am, it would
be very much appreciated if people saying that we should use an
O(1) test in appending would actually say what test they are
talking about.  (Hint: is_list/1 is *not* that test.)

On Thu, 19 Sep 2019 at 09:54, zxq9 <zxq9@REDACTED> wrote:

> On 2019/09/18 21:00, Raimo Niskanen wrote:
> > On Tue, Sep 17, 2019 at 11:25:11PM +0900, zxq9 wrote:
> >> On 2019/09/17 22:13, Raimo Niskanen wrote:
> >>> On Mon, Sep 16, 2019 at 10:53:57PM +0900, zxq9 wrote:
> >>> No.
> >>>
> >>> lists:append/2 can be used and *is* used by existing code to construct
> >>> improper lists.  It would break backwards compatibility.
> >
> > ...to construct improper lists even as in appending a tail element (that
> is
> > need not be a list) to a list...
>
> At least here we can put down a line and say "that's a violation of the
> typespec", because it is:
> http://erlang.org/doc/man/lists.html#append-2
>
> So if we want to continue to use append/2 to *create* improper lists
> (which I've never once encountered anywhere in the last decade, but
> maybe you've got some examples of super wizard code where this is
> necessary?) then the typespec should be changed, otherwise I think it
> isn't unreasonable to do at +O(1) cost at runtime what Dialyzer would
> do, or just leave things as they are and actively discourage deliberate
> construction of improper lists via append/2.
>
> I'm not particularly opposed to changing the typespec to say "List2 can
> be whatever because who cares whether lists are lists or not? Let's
> ditch type sanity because a handful of people feel like it! WOOO!" but I
> *do* believe it is inconsistent to the extreme to say "Well, we're going
> to have Dialyzer (and all users who read the documentation... har har
> har! Joke's on them!) believe that both arguments must be lists!" and
> then in unofficial -- but heavily referenced -- discussion have an OTP
> developer say "well, append/2 is used by some people to deliberately
> construct improper lists, so let's just ignore the discussion entirely
> even though there is actually a way to efficiently check this at runtime".
>
> -Craig
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20190919/f6f13d42/attachment.htm>