[Ericsson AB]

3 Common Caveats

Here we list a few modules and BIFs to watch out for, and not only from a performance point of view.

3.1 The regexp module

The regular expression functions in the regexp module are written in Erlang, not in C, and were meant for occasional use on small amounts of data, for instance for validation of configuration files when starting an application.

Don't use the regexp module in time-critical code.

There will probably be a faster regular expression library in a future release of Erlang/OTP. Until then, you can do one of the following:

3.2 The timer module

Creating timers using erlang:send_after/3 and erlang:start_timer/3 is much more efficient than using the timers provided by the timer module. The timer module uses a separate process to manage the timers, and that process can easily become overloaded if many processes create and cancel timers frequently (especially when using the SMP emulator).

The functions in the timer module that do not manage timers (such as timer:tc/3 or timer:sleep/1) do not call the timer-server process and are therefore harmless.
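As an illustration, the two BIFs mentioned above could be used roughly as follows. This is only a sketch: the module name timer_sketch, the function demo/0, and the messages ping, pong, and never are made up for the example.

```erlang
-module(timer_sketch).
-export([demo/0]).

%% Sketch: use the timer BIFs directly instead of the timer module.
demo() ->
    %% erlang:send_after/3 delivers the message itself after 10 ms.
    _Ref1 = erlang:send_after(10, self(), ping),
    receive ping -> ok end,

    %% erlang:start_timer/3 delivers {timeout, Ref, Msg}, so the
    %% receiver can match on the reference of the timer it started.
    Ref2 = erlang:start_timer(10, self(), pong),
    receive {timeout, Ref2, pong} -> ok end,

    %% A timer that is no longer needed can be cancelled. The result is
    %% the remaining time in milliseconds, or false if the timer has
    %% already expired.
    Ref3 = erlang:send_after(60000, self(), never),
    Left = erlang:cancel_timer(Ref3),
    true = is_integer(Left),
    ok.
```

Both BIFs return a timer reference, so no message has to be sent to a separate timer-server process to create or cancel the timer.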

3.3 list_to_atom/1

Atoms are not garbage-collected. Once an atom is created, it will never be removed. The emulator will terminate if the limit for the number of atoms (1048576) is reached.

Therefore, converting arbitrary input strings to atoms could be dangerous in a system that will run continuously. If only certain well-defined atoms are allowed as input, you can use list_to_existing_atom/1 to guard against a denial-of-service attack. (All atoms that are allowed must have been created earlier, for instance by simply using all of them in a module and loading that module.)
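A minimal sketch of this approach is shown here. The module name atom_sketch, the function to_command/1, and the atoms start, stop, and restart are hypothetical; the allowed atoms exist because they are written literally in the module.

```erlang
-module(atom_sketch).
-export([to_command/1]).

%% The allowed atoms appear literally in this module, so they are
%% created when the module is loaded.
allowed() -> [start, stop, restart].

%% Convert an input string to an atom only if it is an allowed one.
to_command(String) when is_list(String) ->
    try list_to_existing_atom(String) of
        Atom ->
            %% The atom exists, but it may belong to some other module.
            case lists:member(Atom, allowed()) of
                true  -> {ok, Atom};
                false -> {error, not_allowed}
            end
    catch
        error:badarg ->
            %% The atom does not exist, and no new atom was created.
            {error, not_allowed}
    end.
```

The extra lists:member/2 check is a design choice in this sketch: list_to_existing_atom/1 succeeds for any atom already known to the system, not only for the ones this module intends to accept.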

Using list_to_atom/1 to construct an atom that is passed to apply/3 like this

apply(list_to_atom("some_prefix"++Var), foo, Args)

is quite expensive and is not recommended in time-critical code.

3.4 length/1

The time for calculating the length of a list is proportional to the length of the list, as opposed to tuple_size/1, byte_size/1, and bit_size/1, which all execute in constant time.

Normally you don't have to worry about the speed of length/1, because it is efficiently implemented in C. In time-critical code, though, you might want to avoid it if the input list could potentially be very long.

Some uses of length/1 can be replaced by matching. For instance, this code

foo(L) when length(L) >= 3 ->
    ...

can be rewritten to

foo([_,_,_|_]=L) ->
    ...

(One slight difference is that length(L) will fail if L is an improper list, while the pattern in the second code fragment will accept an improper list.)
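The difference can be seen in a small sketch that puts the two variants side by side. The module name length_sketch and the function names with_guard/1 and with_pattern/1 are made up for the example.

```erlang
-module(length_sketch).
-export([with_guard/1, with_pattern/1]).

%% Guard version: length/1 traverses the whole list. If L is an
%% improper list, length/1 fails, so the guard fails and the second
%% clause is chosen.
with_guard(L) when length(L) >= 3 -> at_least_three;
with_guard(_) -> other.

%% Pattern version: only the first three list cells are inspected,
%% so an improper tail is never looked at.
with_pattern([_,_,_|_]) -> at_least_three;
with_pattern(_) -> other.
```

For the improper list [1,2,3|x], with_guard/1 returns other, while with_pattern/1 returns at_least_three.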

3.5 setelement/3

setelement/3 copies the tuple it modifies. Therefore, updating a tuple in a loop using setelement/3 will create a new copy of the tuple every time.

There is one exception to the rule that the tuple is copied. If the compiler can clearly see that destructively updating the tuple would give exactly the same result as if the tuple had been copied, the call to setelement/3 will be replaced with a special destructive setelement instruction. In the following code sequence

multiple_setelement(T0) ->
    T1 = setelement(9, T0, bar),
    T2 = setelement(7, T1, foobar),
    setelement(5, T2, new_value).

the first setelement/3 call will copy the tuple and modify the ninth element. The two following setelement/3 calls will modify the tuple in place.

For the optimization to be applied, all of the following conditions must be true:

If it is not possible to structure the code as in the multiple_setelement/1 example, the best way to modify multiple elements in a large tuple is to convert the tuple to a list, modify the list, and convert the list back to a tuple.
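The conversion approach could look roughly like the following sketch. The module name tuple_sketch and the function set_many/2 are hypothetical, and the code assumes that the indices are distinct and within the size of the tuple.

```erlang
-module(tuple_sketch).
-export([set_many/2]).

%% Updates is a list of {Index, Value} pairs with 1-based indices.
%% Assumption: indices are distinct and within range; otherwise the
%% call fails with a function_clause error.
set_many(Tuple, Updates) when is_tuple(Tuple) ->
    Sorted = lists:keysort(1, Updates),
    list_to_tuple(set_list(tuple_to_list(Tuple), 1, Sorted)).

%% Walk the element list once, replacing the elements whose position
%% matches the next pending update.
set_list(Elems, _Pos, []) ->
    Elems;
set_list([_Old|T], Pos, [{Pos,Val}|Rest]) ->
    [Val|set_list(T, Pos+1, Rest)];
set_list([H|T], Pos, Updates) ->
    [H|set_list(T, Pos+1, Updates)].
```

For instance, tuple_sketch:set_many({a,b,c,d}, [{4,y},{2,x}]) returns {a,x,c,y}. The tuple is converted once in each direction, regardless of how many elements are updated.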

3.6 size/1

size/1 returns the size for both tuples and binaries.

Using the new BIFs tuple_size/1 and byte_size/1 introduced in R12B gives the compiler and run-time system more opportunities for optimization. A further advantage is that the new BIFs could help Dialyzer find more bugs in your program.
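As a small sketch, the type-specific BIFs can be used in guards where size/1 might otherwise have been used. The module name size_sketch and the function describe/1 are made up for the example.

```erlang
-module(size_sketch).
-export([describe/1]).

%% tuple_size/1 only succeeds for tuples and byte_size/1 only for
%% binaries (and bitstrings), so each guard also documents the
%% expected type. A guard that fails simply selects the next clause.
describe(T) when tuple_size(T) =:= 2 -> pair;
describe(B) when byte_size(B) > 4 -> long_binary;
describe(_) -> other.
```

With size/1 in both guards, the first clause would say nothing about whether its argument is a tuple or a binary.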

3.7 split_binary/2

It is usually more efficient to split a binary using matching instead of calling the split_binary/2 function. Furthermore, mixing bit syntax matching and split_binary/2 may prevent some optimizations of bit syntax matching.

DO

        <<Bin1:Num/binary,Bin2/binary>> = Bin,

DON'T

        {Bin1,Bin2} = split_binary(Bin, Num),
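The same matching style can be used in a loop. The following is only a sketch of cutting a binary into fixed-size pieces by matching; the module name split_sketch and the function chunks/2 are hypothetical.

```erlang
-module(split_sketch).
-export([chunks/2]).

%% Cut Bin into pieces of N bytes each. The last piece may be shorter.
chunks(Bin, N) when is_binary(Bin), is_integer(N), N > 0 ->
    chunks(Bin, N, []).

chunks(Bin, N, Acc) ->
    case Bin of
        <<Chunk:N/binary,Rest/binary>> ->
            %% At least N bytes left: take one chunk and continue.
            chunks(Rest, N, [Chunk|Acc]);
        <<>> ->
            lists:reverse(Acc);
        Last ->
            %% Fewer than N bytes left: keep them as the final piece.
            lists:reverse([Last|Acc])
    end.
```

For instance, split_sketch:chunks(<<"abcde">>, 2) returns [<<"ab">>,<<"cd">>,<<"e">>].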
        

Copyright © 1991-2008 Ericsson AB