send/receive faster than ets:lookup?

Tue Mar 29 15:24:00 CEST 2005

Rather than spending the next hour or so meditating over the C 
code in BEAM, I thought I'd ask the list:

I slapped together a small benchmark to show that sending/receiving 
a data object in Erlang is no more expensive than reading the same 
object from an ETS table (I know: an 'apples and oranges' 
comparison). The result surprised me somewhat. It appears as if 
a client-server call is _much_ more efficient than ets:lookup() 
for large data.

The OTP version is R10B-2:

First runs with payload being lists:seq(1,10).

55> eptest:run(1000).
[{proc,{2081,2047,2045}},{ets,{1987,1049,1376}},{empty,107}]
56> eptest:run(1000).
[{proc,{2098,2064,8537}},{ets,{1935,1050,1384}},{empty,107}]
57> eptest:run(1000).
[{proc,{2103,2070,2068}},{ets,{1978,1028,1375}},{empty,110}]
58> eptest:run(1000).
[{proc,{2083,2048,2125}},{ets,{1821,1042,1389}},{empty,112}]
59> eptest:run(1000).
[{proc,{2114,2083,2080}},{ets,{1886,1046,1382}},{empty,110}]

Explanation:
- The 'proc' pass runs the same test three times: one process 
  sends a {self(), get} message to a server process, and 
  receives {Server, Data} back; this is repeated 1000 times.
  I.e. 1000 client-server calls took 2.064 ms.
- The 'ets' pass writes Data to an initially empty ets table,
  using the iteration counter as key (the first number); the 
  second number is for writing the same data again, overwriting 
  the old -- thus, no re-hashing; the third number is for 
  ets:lookup(). I.e. 1000 ets:lookup() calls took 1.38 ms. It 
  isn't too surprising that ets:lookup() beats the client-server 
  call for a small payload.
- The 'empty' pass just runs through an empty loop 1000 times.

Next, the same test but with the payload being lists:seq(1,1000).

61> eptest:run(1000).
[{proc,{26812,32757,27390}},{ets,{60121,37657,47838}},{empty,101}]
62> eptest:run(1000).
[{proc,{27575,28201,26403}},{ets,{47367,37911,50910}},{empty,100}]
63> eptest:run(1000).
[{proc,{26866,27328,26295}},{ets,{51932,38646,48290}},{empty,103}]
64> eptest:run(1000).
[{proc,{26635,31830,26444}},{ets,{44664,40211,47878}},{empty,110}]
65> eptest:run(1000).
[{proc,{27073,38103,29029}},{ets,{41096,37969,48585}},{empty,104}]

Now, what happened here? Why did ets:lookup() get so expensive 
all of a sudden? About 26 ms for 1000 client-server calls vs. 
about 48 ms for 1000 ets:lookup() calls. The ratio is 1.8x. For 
lists:seq(1,10000), the ratio is the same.
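
As a rough check of that ratio, the figures from the five large-payload runs above can be fed through a small helper. This is only a sketch: the module name ratio_sketch and its functions are made up for illustration, the numbers are assumed to be microseconds per 1000 operations, and the median is used to damp outliers such as the 38103 reading.

```erlang
%% Hypothetical helper; figures copied from the lists:seq(1,1000) runs.
-module(ratio_sketch).
-export([ratio/0, median/1]).

%% All 15 client-server timings (3 per run, 5 runs).
proc_us() ->
    [26812, 32757, 27390, 27575, 28201, 26403, 26866, 27328,
     26295, 26635, 31830, 26444, 27073, 38103, 29029].

%% The third 'ets' number (ets:lookup()) from each of the 5 runs.
lookup_us() ->
    [47838, 50910, 48290, 47878, 48585].

%% Middle element of the sorted list (lower median for even lengths).
median(L) ->
    lists:nth((length(L) + 1) div 2, lists:sort(L)).

%% ets:lookup() time relative to the client-server call time.
ratio() ->
    median(lookup_us()) / median(proc_us()).
```

Calling ratio_sketch:ratio() gives a value of roughly 1.8, in line with the figure quoted above.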

I checked the key distribution in the ets table. For 1000 keys,
sequential 1-1000, the average bucket length was 3.9, and the 
biggest bucket contained 12 objects. 9 out of 256 buckets were 
empty. But the difference seemed the same even for much smaller
N. Is the term copy between process heaps more efficient than 
the copy from ets to process heap?

Benchmark source attached.

/Uffe

-------------- next part --------------
A non-text attachment was scrubbed...
Name: eptest.erl
Type: application/octet-stream
Size: 2490 bytes
Desc: eptest.erl
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20050329/466c61dd/attachment.obj>