[erlang-questions] towards a unified dict api

Richard Carlsson carlsson.richard@REDACTED
Thu Dec 29 18:35:46 CET 2011


On 2011-12-29 10:39, Witold Baryluk wrote:
> I agree the dict API is the way to go, because it is so widely used.
> However, I would be against introducing any synonyms; it may become
> confusing.

The synonyms I was referring to are only the old names for certain 
functions in the dict module. For backwards compatibility reasons, they 
cannot simply be removed from dict.erl, but new names are needed for the 
unified API to avoid collisions with functions in other modules (or, in 
a few cases, to allow better names and more consistent argument order). 
The suggested unified dict API does not in itself contain more than one 
name for any function.

> What I do not like in the dict API is how the update/4 function behaves.
> It is called like update(Key, Fun, Initial, Dict) ->  Dict.
> The problem is that if Initial is some sort of complex and costly
> to compute initial value, then most of the time it will be
> wasted, because it will not be needed.

This could be a good point. The function update/4 is part of the 
existing dict module and cannot be changed, but my suggested new 
function map/4 could be made to take an initialization function (Key) -> 
term() instead of just a term to be stored.
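To illustrate the difference, here is a minimal sketch of an update that 
takes an initialization function instead of a ready-made term. The module 
name lazy_update and the function update_lazy/4 are hypothetical; they are 
not part of dict or of the suggested API, and only the existing dict:find/2 
and dict:store/3 are used.

```erlang
-module(lazy_update).
-export([update_lazy/4]).

%% Hypothetical helper (an assumption for illustration, not part of dict).
%% Behaves like dict:update/4, except that the initial value is computed
%% by InitFun(Key) only when Key is not already present.
update_lazy(Key, Fun, InitFun, Dict) ->
    case dict:find(Key, Dict) of
        {ok, Value} ->
            %% Key exists: apply Fun to the old value; InitFun is never called.
            dict:store(Key, Fun(Value), Dict);
        error ->
            %% Key is missing: only now pay for the initial value.
            dict:store(Key, InitFun(Key), Dict)
    end.
```

With dict:update/4 the Initial argument is evaluated at every call site, 
whether or not it ends up being stored; here the cost is only paid on the 
first insertion of a key.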

> I'm not really sure if integrating abstract collections into the existing
> dict module is a good idea. It should not break compatibility, but it will
> for sure bring some performance hit (due to pattern matching in the dict
> module and delegation to other modules).

And this is definitely not the intention. The only delegation that is 
done is in order to save the user from knowing about the gb_trees 
module. The dict module becomes more like ets, in that you can choose 
between a hashed and an ordered dictionary, but you shouldn't have to 
know anything about the implementation. Other kinds of dictionaries such 
as ets and orddict only implement the same interface; you are not 
supposed to be able to use the dict module to manipulate an orddict or 
an ets table. This is not object oriented programming. If you want to do 
that kind of wrapper where the data structure knows its implementation 
and does its own dispatch, you can build that yourself on top of this API.
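As a rough sketch of what such a self-dispatching wrapper could look like, 
the value below carries the name of its implementation module along with 
the data. The module name any_dict is hypothetical, and the sketch assumes 
the callback module exports new/0, store/3 and find/2 with the same 
behaviour as dict and orddict do today.

```erlang
-module(any_dict).
-export([new/1, store/3, find/2]).

%% Hypothetical wrapper (an assumption for illustration): the dictionary
%% is a pair {Mod, Data}, so every operation can dispatch on Mod.
%% Mod must export new/0, store/3 and find/2, as dict and orddict do.
new(Mod) ->
    {Mod, Mod:new()}.

%% Store through the implementation module and keep the tag.
store(Key, Value, {Mod, Data}) ->
    {Mod, Mod:store(Key, Value, Data)}.

%% Returns {ok, Value} | error, as both dict:find/2 and orddict:find/2 do.
find(Key, {Mod, Data}) ->
    Mod:find(Key, Data).
```

Every call goes through one extra indirection, which is exactly the kind of 
cost that the plain unified API avoids.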

> I think the dict module should be left as is, and a new module should be
> introduced, like gen_dict. Sure, in some sense it is easier to just
> find all dict:new() using a simple grep, and change it to dict:new([...])
> where appropriate without worrying about other call sites, but if for some
> reason one changes dict:new([dict]) to something else, some functions
> may subtly change how they work, so I think it should not be messed with
> too much.

You should not have to change anything in your code. Existing calls to 
dict:new() work as they are, since the default is still to create a 
plain old hashed dictionary. Only if you want to create an ordered 
dictionary do you need to call dict:new([ordered_set]). You do not have 
to specify a callback module as in your example dict:new([dict]).
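A small sketch of how an option-driven new/1 might pick the representation 
while new/0 keeps its current meaning. Everything here is an assumption for 
illustration: the module name udict_sketch, the explicit hashed/ordered 
tags, and the choice of functions shown. How the real dict module would 
tell the two representations apart is not specified above.

```erlang
-module(udict_sketch).
-export([new/0, new/1, store/3, find/2, to_list/1]).

%% Hypothetical sketch, not the actual proposal: new/0 still gives a
%% plain hashed dictionary, new([ordered_set]) gives an ordered one
%% backed by gb_trees, without the caller ever naming gb_trees.
new() ->
    new([]).

new(Opts) ->
    case lists:member(ordered_set, Opts) of
        true  -> {ordered, gb_trees:empty()};
        false -> {hashed, dict:new()}
    end.

store(Key, Value, {hashed, D}) ->
    {hashed, dict:store(Key, Value, D)};
store(Key, Value, {ordered, T}) ->
    %% gb_trees:enter/3 inserts the key or replaces an existing value.
    {ordered, gb_trees:enter(Key, Value, T)}.

%% Both clauses return {ok, Value} | error, like dict:find/2.
find(Key, {hashed, D}) ->
    dict:find(Key, D);
find(Key, {ordered, T}) ->
    case gb_trees:lookup(Key, T) of
        {value, Value} -> {ok, Value};
        none           -> error
    end.

%% For the ordered variant the list is sorted by key; for the hashed
%% variant the order is unspecified.
to_list({hashed, D})  -> dict:to_list(D);
to_list({ordered, T}) -> gb_trees:to_list(T).
```

Code that only ever calls new/0 never sees the ordered variant, which is 
the sense in which existing callers are unaffected.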

I agree that if you want to make a more object oriented implementation 
with method dispatch, another module name should be used for that. But 
what I have suggested is just a unified API for the existing dictionary 
data types.

    /Richard


