Implementing tables - advice wanted

Tue Jun 13 14:47:33 CEST 2006

Joe Armstrong (AL/EAB) wrote:
> Second try, my last mail got sent prematurely ...
> 
> I disagree about the syntax.
> 
>      Try this:
> 
> 	X = @{name="fred", age=12, footsize=8}
> 	foo(X)
> 
>       foo(@{name=Name}=X) ->
> 	    ...
>       foo(Other) ->
> 
> How does this look with dicts?
> 
> 	X = dict:from_list([{name,"fred"},{age,12},{footsize,8}]),
>       foo(X).
> 
>       foo(X) ->
>          case dict:find(name, X) of
> 	      {ok, Name} ->
> 			...
> 	      error ->
> 			...
> 	   end
> 
> Which is much more verbose.

More verbose, yes. Much more - no, not really; only by about 50%.
But yes, such notation would be convenient - in particular if you
could use it in pattern matching, as in your example.
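
For comparison, here is a minimal runnable sketch of the dict version
quoted above. The module name person_dict, the helper new/0 and the
return values {found, Name} and not_found are made up for illustration;
only dict:from_list/1 and dict:find/2 are real library calls.

```erlang
-module(person_dict).
-export([new/0, foo/1]).

%% Build the record-like dictionary from the quoted example.
new() ->
    dict:from_list([{name, "fred"}, {age, 12}, {footsize, 8}]).

%% Look up the name. Name is a variable, so it binds whatever value
%% is stored under the key; a lowercase atom in that position would
%% only match that literal atom.
foo(X) ->
    case dict:find(name, X) of
        {ok, Name} -> {found, Name};   % hypothetical return shape
        error      -> not_found        % hypothetical return shape
    end.
```

Calling person_dict:foo(person_dict:new()) takes the first branch, and
person_dict:foo(dict:new()) takes the second.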

My point is that I find the above verbosity a much smaller
inconvenience than the careful thinking about how to separate and
lay out the data, plus the wrappers that hide the actual calls to
ets, that I have to go through each time I really need some fast
indexed tables.
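
To give an idea of the kind of wrapper meant here, this is a rough
sketch, not code from any real project. The module person_table and
its functions new/0, store/3 and find/2 are hypothetical names; the
ets calls (ets:new/2, ets:insert/2, ets:lookup/2) are the standard
ones. It assumes a plain set table holding {Key, Value} pairs.

```erlang
-module(person_table).
-export([new/0, store/3, find/2]).

%% Create a private set table; callers never see the ets options.
new() ->
    ets:new(person_table, [set, private]).

%% Insert or overwrite the value stored under Key.
store(Tab, Key, Value) ->
    true = ets:insert(Tab, {Key, Value}),
    ok.

%% Mirror the dict:find/2 return convention: {ok, Value} | error.
find(Tab, Key) ->
    case ets:lookup(Tab, Key) of
        [{Key, Value}] -> {ok, Value};
        []             -> error
    end.
```

Even this small case already forces decisions about table type, access
mode and tuple layout, and the table is owned by the creating process,
which is the sort of up-front thinking a dict does not require.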

That's not to say that Erlang shouldn't have both, if possible.

> I recently attended a Dagstuhl workshop where one of the
> topics was "tables as universal data structures" - some of the people
> there thought that "the Python data structures were just hash tables and
> this was why Python was popular".

It is. It's also why Python is slow, but that's mainly a language
implementation issue. It shows that it can be better to use a
generic, though less efficient, mechanism to base your constructs
on if you want to get stuff done, rather than spend ages trying to
find the best trade-off between power and speed (a virtual function
table? - it's just another hash table, mapping names to functions).
However, when the tool starts to be used for really heavy tasks,
you need to start optimizing internally, and I think that's where
Python is currently lacking most. (Yes, I have a little experience
in that area.)
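
As an illustration of the parenthetical remark, a method table viewed
as a plain mapping from names to functions might look like the sketch
below. The module dispatch, the method names area and perimeter, and
the error tuple are all invented for the example; it only relies on
dict:from_list/1 and dict:find/2.

```erlang
-module(dispatch).
-export([new/0, call/3]).

%% A "method table": a dict mapping names (atoms) to funs.
new() ->
    dict:from_list([{area,      fun({W, H}) -> W * H end},
                    {perimeter, fun({W, H}) -> 2 * (W + H) end}]).

%% Look the name up at call time and apply the fun if one is found.
call(Table, Name, Arg) ->
    case dict:find(Name, Table) of
        {ok, Fun} -> {ok, Fun(Arg)};
        error     -> {error, {undefined_method, Name}}
    end.
```

Every call pays for a lookup, which is the generic-but-slower trade-off
described above; a compiler that resolves the slot ahead of time avoids
that cost at the price of more machinery.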

>> If I remember correctly, the experiment with a "vector" data 
>> type (which used destructive update internally, with some 
>> penalty for accessing older versions of the data) was killed 
>> by bad interaction with the garbage collector, leading to 
>> rotten performance. Have things changed enough in the GC by 
>> now for this to become worth a new attempt?
> 
> Strange - I can't understand what "interaction with GC" has to do with
> matters. 

Ah. I remember when I too, once, was blissfully unaware of the
many ways the current GC implementation can get in the way of other,
supposedly orthogonal issues. Alas, my mind is now forever tainted.
But ask Björn if you want the details.

	/Richard