[erlang-questions] Why do we need modules at all?

Wed May 25 05:34:17 CEST 2011

Hello Joe,

I've personally wanted an easy way to load modules from different places.

If you started by making modules easy to load from different places,
and later added support for individual functions, I imagine this would be
easier to implement right away and easier for people to use right away
... where "people" refers to me :)

I've got two suggestions (mostly inspired by the Mozilla Framework)...

1. How about letting the programmer choose where the functions come from  
(kv store, fs, web-server, etc)...

For instance, a module could start...

-import(["file://misc/collect_int", "http://erlang.org/lists/*",  
"mnesia://db1/misc/merge_kv"]).

By default, the code loader would know what to do when it encounters
"file://", "http://", "https://", "ftp://", "sftp://" and "mnesia://" URI
schemes. Programmers could be allowed to define their own URI schemes by
providing callbacks for loading code. For instance, one would be able to
define "ubf://" and have the code loader call his/her code when it
encounters "ubf://foo/bar".
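
The callback idea could be sketched roughly as follows. This is only an
illustration under assumptions: the module name uri_loader, its functions
and the shape of the registry are all hypothetical, nothing like this
exists in OTP, and it assumes a modern release with maps and
string:split/2. A callback takes the part of the URI after "://" and
returns {ok, Binary} or {error, Reason}.

```erlang
%% Hypothetical sketch: uri_loader is not part of OTP.
-module(uri_loader).
-export([new/0, register_scheme/3, fetch/2]).

%% The registry maps a scheme string to a callback fun of arity 1.
%% Only "file" is pre-registered here; the other default schemes from
%% the text (http, mnesia, ...) are left out to keep the sketch short.
new() ->
    #{"file" => fun(Path) -> file:read_file(Path) end}.

%% Let the programmer plug in a loader for a new scheme such as "ubf".
register_scheme(Scheme, Callback, Registry) when is_function(Callback, 1) ->
    Registry#{Scheme => Callback}.

%% Split "scheme://rest" and hand "rest" to the registered callback.
fetch(Uri, Registry) ->
    case string:split(Uri, "://") of
        [Scheme, Rest] ->
            case maps:find(Scheme, Registry) of
                {ok, Callback} -> Callback(Rest);
                error -> {error, {unknown_scheme, Scheme}}
            end;
        _ ->
            {error, {bad_uri, Uri}}
    end.
```

A "ubf" loader would be added with
uri_loader:register_scheme("ubf", MyFun, uri_loader:new()). The binary
that comes back could then be handed to code:load_binary/3, which is
the existing OTP call for loading object code from memory.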

2. Expanding on this: letting the programmer centralise the possible  
locations of modules/functions and use this to define namespaces...

One possible way of doing this is to steal from the "chrome registration"  
system that the Mozilla Framework uses. A programmer could define  
one-level deep namespaces in a configuration file which points to targets  
that could come from anywhere, then use that in her modules. A module  
could then start...

-import(["erlang://misc/collect_int", "erlang://couch/mapreduce"]).

Where "misc" and "couch" are namespaces and "erlang://" is a special URI  
scheme telling the code loader to refer to the configuration file  
"erlang.manifest" which would contain...

[{"misc", "file://./ebin/misc.beam"},
  {"couch", "http://localhost:7777/ebin/"}]

Where "couch" points to a yaws server serving couch beams, or something
along those lines. Then she can easily switch between different code sources
by editing one file.
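
Resolving an "erlang://" name against such a manifest might look roughly
like this. It is a sketch under assumptions: the module manifest_demo and
the function resolve/2 are made up for illustration, and the manifest is
passed in as an already-parsed list of {Namespace, Target} pairs rather
than read from "erlang.manifest".

```erlang
%% Hypothetical sketch: manifest_demo is not part of OTP.
-module(manifest_demo).
-export([resolve/2]).

%% Turn "erlang://Namespace/Name" into the target registered for
%% Namespace plus the requested Name. Fetching from the target
%% (file, http, ...) is a separate step not shown here.
resolve("erlang://" ++ Rest, Manifest) ->
    case string:split(Rest, "/") of
        [Namespace, Name] ->
            case lists:keyfind(Namespace, 1, Manifest) of
                {Namespace, Target} -> {ok, Target, Name};
                false -> {error, {unknown_namespace, Namespace}}
            end;
        _ ->
            {error, {bad_uri, Rest}}
    end;
resolve(Uri, _Manifest) ->
    {error, {not_a_manifest_uri, Uri}}.
```

With the two-entry manifest above, "erlang://misc/collect_int" resolves
to the "file://./ebin/misc.beam" target, and switching code sources means
changing only the manifest entry.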

A code-signing system could also be introduced.

- Edmond -

On Tue, 24 May 2011 18:06:19 +1000, Joe Armstrong <erlang@REDACTED> wrote:

> Why do we need modules at all?
>
> This is a brain-dump-stream-of-consciousness-thing. I've been
> thinking about this for a while.
>
> I'm proposing a slightly different way of programming here
> The basic idea is
>
>     - do away with modules
>     - all functions have unique distinct names
>     - all functions have (lots of) meta data
>     - all functions go into a global (searchable) Key-value database
>     - we need letrec
>     - contribution to open source can be as simple as
>       contributing a single function
>     - there are no "open source projects" - only "the open source
>       Key-Value database of all functions"
>     - Content is peer reviewed
>
> These are discussed in no particular order below:
>
> Why does Erlang have modules?
>
> There's a good and a bad side to modules:
>
> Good: They provide a unit of compilation, a unit of code
> distribution, and a unit of code replacement.
>
> Bad: It's very difficult to decide which module to put an individual
> function in. They break encapsulation (see later).
>
> Aside: lib_misc.erl
>
> When I'm programming I often get to the point where I say there should
> be a function foo/2 in lists.erl but there isn't. There should be but
> there isn't - foo/2 is a small self-contained thing. Why should it be
> in lists.erl? Because it "feels right".
>
> Strings are lists, so why do we have two modules, lists.erl and
> string.erl? How should I decide in which module my new string/list
> processing function should go?
>
> To avoid all mental anguish when I need a small function that
> should be somewhere else and isn't I stick it in
> a module elib1_misc.erl.
>
> My elib1_misc exports the following:
>
> added_files/2                 make_challenge/0
> as_bits/1                     make_response/2
> as_bits_test/0                make_response_test/0
> bdump/2                       make_test_strings/1
> bin2hex/1                     make_test_strings_test/0
> bin2hex_test/0                make_tmp_filename/2
> check_io_list/1               merge_kv/1
> collect_atom/1                merge_kv_test/0
> collect_atom_test/0           mini_shell/0
> collect_int/1                 module_info/0
> collect_int_test/0            module_info/1
> collect_string/1              ndots/1
> collect_string_test/0         nibble_to_hex_char/1
> collect_word/1                nibble_to_hex_char_test/0
> complete/2                    odd/1
> complete_test/0               on_exit/2
> dos2unix/1                    out_of_date/2
> downcase_char/1               outfile/2
> dump/2                        padd/2
> dump_tmp/2                    perms/1
> duplicates/1                  perms_test/0
> ensure_started/2              pmap/2
> eval_file/1                   pmap1/2
> eval_file_test/0              pmap1_test/0
> eval_string/1                 pmap_test/0
> eval_string_test/0            priority_receive/0
> every/3                       random_seed/0
> expand_env_vars/1             random_string/1
> expand_file_template/3        random_string/2
> expand_string_template/2      read_at_most_n_lines/2
> expand_tabs/1                 read_at_most_n_lines_test/0
> expand_tabs_test/0            remove_duplicates/1
> expand_template/2             remove_duplicates_test/0
> extract_attribute/2           remove_leading_and_trailing_whitespace/1
> extract_attribute_test/0      remove_leading_and_trailing_whitespace_test/0
> extract_prefix/2              remove_leading_whitespace/1
> fetch/2                       remove_prefix/2
> fetch_test/0                  remove_prefix_test/0
> file2lines/1                  remove_trailing_whitespace/1
> file2lines_test/0             replace/3
> file2md5/1                    root_dir/0
> file2numberedlines/1          rpc/2
> file2numberedlines_test/0     safe/1
> file2paras/1                  show_loaded/1
> file2stream/1                 signed_byte_to_hex_string/1
> file2string/1                 signed_byte_to_hex_string_test/0
> file2template/1               skip_blanks/1
> file2term/1                   skip_blanks_test/0
> file_size_and_type/1          skip_to_nl/1
> find_src/1                    skip_to_nl_test/0
> first/1                       sleep/1
> flatten_io_list/1             spawn_monitor/3
> flush_buffer/0                split_at_char/2
> for/3                         split_at_char_test/0
> force/1                       split_list/2
> foreach_chunk_in_file/3       split_list_test/0
> foreach_word_in_file/2        string2exprs/1
> foreach_word_in_string/2      string2exprs_test/0
> forever/0                     string2html/1
> get_erl_section/2             string2latex/1
> get_line/1                    string2lines/1
> get_line/2                    string2lines_test/0
> have_common_prefix/1          string2stream/1
> have_common_prefix_test/0     string2stream_test/0
> hex2bin/1                     string2template/1
> hex2bin_test/0                string2template_test/0
> hex2list/1                    string2term/1
> hex2list_test/0               string2term_test/0
> hex_nibble2int/1              string2toks/1
> hex_nibble2int_test/0         string2toks_test/0
> id/1                          sub_binary/3
> include_dir/0                 template2file/3
> include_file/1                term2file/2
> interleave/2                  term2string/1
> is_alphanum/1                 test/0
> is_blank_line/1               test1_test/0
> is_prefix/2                   test_function_over_substrings/2
> is_prefix_test/0              tex2pdf/1
> is_response_correct/3         time_fun/2
> keep_alive/2                  time_stamp/0
> lines2para/1                  to_lower/1
> list2frequency_distribution/1 to_lower_test/0
> list2frequency_distribution_test/0 trim/1
> longest_common_prefix/1       trim_test/0
> longest_common_prefix_test/0  unconsult/2
> lookup/2                      unsigned_byte_to_hex_string/1
> lorem/1                       unsigned_byte_to_hex_string_test/0
> ls/1                          which/1
>                               which_added/1
>
> Now I find this very convenient: when I write a new small utility function
> I stick it in elib1_misc.erl - no mental anguish in choosing a module
> name is involved.
>
> The observation that I find this very convenient is telling me something
> about modules - I like my elib1_misc, it feels right.
>
> (aside - It seems many development projects have their own private
> lib_miscs ...)
>
> Which brings me to the point of my question.
>
> Do we need modules at all? Erlang programs are composed of lots of small
> functions, the only place where modules seem useful is to hide a letrec.
>
> The classic example is fibonacci. We want to expose fib/1 but hide the
> helper function fib/3. Using modules we say
>
> -module(math).
> -export([fib/1]).
>
> fib(N) ->
>     fib(N, 1, 0).
>
> fib(N, A, _B) when N < 2 -> A;
> fib(N, A, B) -> fib(N-1, A+B, A).
>
> The downside is we have had to *invent* one module name math - whose
> *only* purpose is to hide the definition of fib/3 which we don't want
> to be made callable.
>
> If we put a second function into the module math, then this second
> function could call fib/3 which breaks the encapsulation of fib/3.
>
> We could say:
>
> let fib = fun(N) -> fib(N, 1, 0) end
> in
>    fib(N, A, B) when N < 2 -> A;
>    fib(N, A, B) -> fib(N-1, A+B, A).
> end.
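
For comparison, present-day Erlang can already get close to this with
named funs (available since OTP 17), which let a fun refer to itself by
name. The following is only a sketch: letrec_demo and make_fib/0 are
hypothetical names, and a module is still needed to compile it, although
the same fun expression can be typed directly into the shell.

```erlang
%% Hypothetical sketch: hides the three-argument helper inside a fun.
-module(letrec_demo).
-export([make_fib/0]).

%% Returns a one-argument fib fun. The helper Go/3 is visible only
%% inside the closure, so no other function can call it.
make_fib() ->
    fun(N) ->
        Loop = fun Go(K, A, _B) when K < 2 -> A;
                   Go(K, A, B) -> Go(K - 1, A + B, A)
               end,
        Loop(N, 1, 0)
    end.
```

Calling Fib = letrec_demo:make_fib() and then Fib(10) computes the same
value as math:fib(10) in the module version above, but the helper is not
reachable from any other function.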
>
> I hardly dare suggest a syntax for this since I've been following
> another thread in this forum where syntax discussions seem to encourage
> much comment.
>
> ** Please do suggest alternative syntaxes here - but do not comment on
> other people's suggestions ...
>
> I would like to just talk about why we have modules.
>
> Another question:
>
> Does the idea of a module come from the idea that functions have to be
> stored somewhere, so we store them in a file, and we slurp the
> file (as a unit) into the system, so the file becomes a module?
>
> If all the files were stored by themselves in a database, would this
> change things?
>
> I am thinking more and more that it would be nice to have *all*
> functions in a key_value database with unique names.
>
> lookup(foo,2) would get the definition of foo/2 from a database.
>
> The unique names bit is interesting - is this a good idea? Qualified
> names (ie names like xxx:foo/2) or (a.b.c.foo/2) sound like a good
> idea but when I'm programming I have to invent the xxx or the
> a.b.c which is very difficult. It also involves the "decision problem":
> if the namespaces xxx and a.b.c already exist I have to *choose* which
> to put my new function in.
>
> I think there might be a case for aliases here: joe:foo/2 could be used
> while developing, and "joe" would expand to a horrible random local string,
> the real name being ab123aZwerasch123123_foo/2, but I would not be able to
> publish my code or make it available to a third party before I had
> chosen a sensible name.
>
> (( managing namespaces seems really tricky, a lot of people seem
> to think that the problem goes away by adding "."s to the name,
> but managing a namespace with names like foo.bar.baz.z is just as
> complex as managing a namespace with names like foo_bar_baz_z or
> names like 0x3af312a78a3f1ae123 - the problem is that we have to go
> from a symbolic name like www.a.b to a reference like 123.45.23.12 -
> but how do we discover the initial name www.a.b? - there are two
> answers - a) we are given the name (ie we click on a link) - b) we do
> not know the name but we search for it ))
>
>
> When programs are small we can live with "just the code" in "a few
> modules" the ratio of code to meta data is high.
>
> When programs are large we need a lot of meta-data to understand them.
>
> I would like to see all functions with all meta-data in a data base.
>
> I'd like to say:
>
>    lookup(foo,2,Attribute) when Attribute =
>
>       code|source|documentation|type signatures|revision history|authors|...
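
A local, in-memory version of such a lookup can be sketched with an ETS
table keyed on {Name, Arity}. Everything here is an assumption made for
illustration: the module fun_db, its function names, and the choice to
keep the attributes in a map. A global, shared database would need far
more than this.

```erlang
%% Hypothetical sketch: fun_db is not part of OTP.
-module(fun_db).
-export([new/0, store/4, lookup/3, lookup/4]).

%% One table row per function: {{Name, Arity}, AttributeMap}.
new() ->
    ets:new(fun_db, [set, public]).

%% Attrs is a map such as #{code => Fun, documentation => "...",
%% authors => [...]}. Storing the same key again replaces the row.
store(Db, Name, Arity, Attrs)
  when is_atom(Name), is_integer(Arity), is_map(Attrs) ->
    true = ets:insert(Db, {{Name, Arity}, Attrs}),
    ok.

%% lookup(Db, foo, 2) fetches the code, as in lookup(foo,2) above.
lookup(Db, Name, Arity) ->
    lookup(Db, Name, Arity, code).

%% lookup(Db, foo, 2, Attribute) fetches any single attribute.
lookup(Db, Name, Arity, Attribute) ->
    case ets:lookup(Db, {Name, Arity}) of
        [{_Key, Attrs}] ->
            case maps:find(Attribute, Attrs) of
                {ok, Value} -> {ok, Value};
                error -> {error, {no_attribute, Attribute}}
            end;
        [] ->
            {error, {undefined_function, {Name, Arity}}}
    end.
```

Because the key is {Name, Arity}, names must be unique across the whole
table, which is exactly the constraint discussed above.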
>
> The more I think about it the more I think program development should
> be viewed as changing the state of a Key-Value database.
>
> So I imagine:
>
>     1) all functions have unique names
>     2) there are no modules
>     3) we discover the name of a function by searching metadata
>        describing the function in a database
>     4) all public functions (think open source) are in the same
>        database
>
> We could make a system to do this.
>
> I think this would make open-source projects easier, since the
> granularity of contribution goes down. You could contribute
> a single function - not an entire application.
>
> (( A problem with GUT style open source projects is there is
>    not one database of functions, I often want one function from
>    this project, another function from another project -- the
>    granularity of reusable parts should be the individual function.
>
>    functions are really easy to reuse
>    modules are more difficult to reuse
>    entire applications are very difficult to reuse
>      (Unless they are isolated through a communication channel) ))
>
> Possible extensions.
>
>     1) Voting for promotion
>     2) A review process
>
> Given a raw database with *all* functions in it - we could derive an
> "approved" functions database.
>
> Popular functions could be moved to the approved database - the
> review process would need to be discussed - some kind of peer-review/wiki
> stuff.
>
> Comments?
>
> Volunteers?
>
> /Joe
