Reducing number of copies of bulk network data

Jim Larson jim@REDACTED
Fri Jan 28 03:09:14 CET 2000


Hello!

I'm working on an Erlang application that needs to handle a high
volume of network data, multiplexing it among a variety of other
(UNIX) processes attached over TCP connections on the loopback
device.

We use binary sockets to get the bulk data into Erlang binary form,
thus avoiding copies during intra-Erlang message-passing.

When analyzing the performance of the application, I've noticed
that the current Internet socket driver
(erts/emulator/drivers/common/inet_drv.c) copies the data once
during reception and once during sending.

UDP reception works fine, receiving data directly into its Binary
buffer.

TCP reception, at least when packetizing is turned on, makes one
copy of the data.  Is there any way to eliminate this copy?

In our application, the binary received from the network is split
and reassembled into a new packet, which is then sent back out over
another interface.  On the sending side, however, for anything but
a single binary, the runtime system copies the I/O list into a
contiguous buffer, then passes that buffer to the driver.  Is there
any way to eliminate this copy?

I've noticed an "outputv" driver entry point which seems to accept
an iovec argument, instead of a simple pointer and length.  This
would help our application tremendously.  Is this entry point
well-supported in the runtime system?  Is it mature enough for a
driver to use it?
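
To make the idea concrete, here is a rough sketch of what a gather
write buys at the system-call level. It uses plain POSIX writev(), not
the Erlang driver API; send_fragments() is a hypothetical helper and
is not taken from inet_drv.c. The point is only that separately
stored fragments can go out in one call without first being copied
into a contiguous buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not part of inet_drv.c): send a header and two
 * payload fragments that live in separate buffers with a single
 * writev() call, so no contiguous copy is ever built.
 * Returns the number of bytes written, or -1 on error. A real driver
 * would also have to handle short writes on non-blocking sockets. */
ssize_t send_fragments(int fd,
                       const void *hdr, size_t hdr_len,
                       const void *part1, size_t part1_len,
                       const void *part2, size_t part2_len)
{
    struct iovec iov[3];

    assert(fd >= 0);

    /* Each entry only records a pointer and a length. */
    iov[0].iov_base = (void *)hdr;
    iov[0].iov_len  = hdr_len;
    iov[1].iov_base = (void *)part1;
    iov[1].iov_len  = part1_len;
    iov[2].iov_base = (void *)part2;
    iov[2].iov_len  = part2_len;

    return writev(fd, iov, 3);
}
```

Called on a pipe or socket with a 2-byte header and two 3-byte
fragments, this hands all 8 bytes to the kernel in one call, in order.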

We'd eventually like to use shared memory to communicate with the
clients of our Erlang application that are running on the same
machine.  To get true zero-copy, we'd need to be able to:

	- incorporate buffers from an mmap()'ed file as Erlang binary
	  objects;

	- receive data from a network socket directly into a buffer
	  in mmap()'ed space;

	- allocate all new binary objects, or buffers created by
	  the runtime system as concatenations of byte lists, into
	  buffers in mmap()'ed space.
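
As an illustration of the second point, a minimal C sketch of
receiving straight into a file-backed shared mapping might look like
the following. Both function names are made up for this example, and
nothing here touches the Erlang runtime; it only shows that recv()
can deposit data directly in mmap()'ed space.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create (or open) a file of `len` bytes and map
 * it shared, so that other processes mapping the same file see the
 * same bytes. Returns NULL on failure. */
void *map_shared_file(const char *path, size_t len)
{
    void *p;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping remains valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: receive from a socket directly into the shared
 * region at `offset`, with no intermediate buffer in user space.
 * Returns the number of bytes received, or -1 on error. */
ssize_t recv_into_region(int sock, void *region, size_t region_len,
                         size_t offset)
{
    assert(offset <= region_len);
    return recv(sock, (char *)region + offset, region_len - offset, 0);
}
```

The hard part, which this sketch does not address, is getting the
runtime system to treat such a region as an Erlang binary.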

The easiest way to do this seems to be to:

	- modify the runtime system's malloc() wrappers to call
	  malloc() replacements that use shared memory (note that
	  this puts the entire Erlang heap into shared memory,
	  which may be extreme);

	- create a new driver which can incorporate shared memory
	  buffers as new Erlang binaries.
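
For the first of these, the simplest possible malloc() replacement
would be a bump allocator over a region that has already been mapped
shared. The sketch below is only that: shm_arena, shm_arena_init() and
shm_alloc() are hypothetical names, there is no free(), no locking,
and no claim about how the runtime system's real wrappers work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical arena over a region assumed to be mmap()'ed shared.
 * The base address is assumed to be at least 16-byte aligned, which
 * holds for page-aligned mappings. */
typedef struct {
    unsigned char *base;  /* start of the region */
    size_t size;          /* total bytes available */
    size_t used;          /* bytes handed out so far */
} shm_arena;

void shm_arena_init(shm_arena *a, void *base, size_t size)
{
    assert(a != NULL && base != NULL);
    a->base = base;
    a->size = size;
    a->used = 0;
}

/* Return `n` bytes from the arena at a 16-byte aligned offset, or
 * NULL if the region is exhausted. Nothing is ever freed. */
void *shm_alloc(shm_arena *a, size_t n)
{
    size_t start = (a->used + 15) & ~(size_t)15;

    if (start > a->size || n > a->size - start)
        return NULL;
    a->used = start + n;
    return a->base + start;
}
```

A real replacement would need free(), reuse of blocks, and some way
for the other processes to agree on offsets within the region.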

Are there any other ideas on how to do this?

Lastly, will the upcoming binary syntax bring along an iovec-style
internal representation of binaries?  This would allow us to
concatenate binaries with zero copies.
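
By "iovec-style" I mean something along these lines. This is purely
an illustration of the idea in C; seg_binary, seg_from() and
seg_concat() are invented names and say nothing about how the runtime
system represents binaries today or will in the future.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

#define SEG_MAX 8  /* arbitrary limit chosen for this sketch */

/* Hypothetical segmented binary: a short list of (pointer, length)
 * pairs that refer to data stored elsewhere. */
typedef struct {
    struct iovec seg[SEG_MAX];
    int nseg;
    size_t total;  /* sum of all segment lengths */
} seg_binary;

/* Wrap one existing buffer without copying it. */
seg_binary seg_from(const void *data, size_t len)
{
    seg_binary b;

    b.seg[0].iov_base = (void *)data;
    b.seg[0].iov_len  = len;
    b.nseg  = 1;
    b.total = len;
    return b;
}

/* Concatenate a and b into *out by copying only the segment
 * descriptors, never the payload bytes. Returns 0 on success, or -1
 * if the result would need more than SEG_MAX segments. */
int seg_concat(seg_binary *out, const seg_binary *a, const seg_binary *b)
{
    seg_binary tmp;
    int i;

    assert(out != NULL && a != NULL && b != NULL);
    if (a->nseg + b->nseg > SEG_MAX)
        return -1;

    tmp = *a;  /* build in a temporary so out may alias a or b */
    for (i = 0; i < b->nseg; i++)
        tmp.seg[tmp.nseg++] = b->seg[i];
    tmp.total += b->total;

    *out = tmp;
    return 0;
}
```

The resulting seg array could be handed directly to writev(). What the
sketch leaves out is buffer lifetime: a real runtime would need
reference counts on the underlying data.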

Thanks,

Jim Larson
jim@REDACTED


