efficient line handling

Ulf Wiger etxuwig@REDACTED
Fri Apr 14 17:39:08 CEST 2000

Since the traffic in this forum is not exceptionally high, I thought
I'd pitch for my latest contrib: lines-1.1.

This is one of those toy problems that I find oddly relaxing.

I've given some thought off and on to how one would represent lines of
text in Erlang, e.g. when implementing a text editor. Obviously a
simple list would be pretty inefficient...

So, I implemented a module called lines.erl, which mimics a
dynamic array, in which you can easily insert and delete lines
anywhere, and maintain a sequential ordering. Access times are close
to those of ordered set tables. The implementation is a crudely
balanced B-tree, where each node contains a list of elements
(length(List) =< 10.) 

I thought performance was quite good when I tested it with 100,000
lines (mean: 30 usec (append), 16 usec (nth)).

Conceptually, the API works like the following implementation
(using a list):

append(Line, Lines) -> 
   Lines ++ [Line].

nth(Pos, Lines) ->
   lists:nth(Pos, Lines).

count(Lines) ->

insert(1, Lines, Line) ->
insert(N, [H|T], Line) when N > 1 ->
    [H|insert(N-1, T, Line)].

replace(1, [H|T], X) ->
replace(N, [H|T], X) when N > 0 ->
   [H|replace(N-1, T, X)].

delete(1, [H|T]) ->
delete(N, [H|T]) when N > 0 ->
   [H|delete(N-1, T)].

When comparing my implementation with the above (naiive) one,
my Btree version beats the list version on 
- nth/2 on a 20-line structure
- delete/2 on a 20-line structure
- append/2 on a 300-line structure

(which makes sense, since ++ is a BIF.)

On 1,000 lines, the Btree version is:
- 20x faster on nth/2
- 25x faster on delete/2
- 4x faster on append/2
- well... 30x slower on insert(1,...)

Obviously, lines.erl could be used for something other than lines
of text... can't think of anything right now, though -- except
perhaps that a more sophisticated word processor might handle
paragraphs and page objects rather than lines of text.

So, now, anyone who wants to write an emacs clone in Erlang, will
at least have a reasonably efficient data structure for line
handling. (:

Ulf Wiger, Chief Designer AXD 301         <ulf.wiger@REDACTED>
Ericsson Telecom AB                          tfn: +46  8 719 81 95
Varuvägen 9, Älvsjö                          mob: +46 70 519 81 95
S-126 25 Stockholm, Sweden                   fax: +46  8 719 43 44

More information about the erlang-questions mailing list