Untimely garbage collection

Sun Jul 14 15:57:36 CEST 2002

>>>>> "slf" == Scott Lystig Fritchie <fritchie@REDACTED> scrawled:
slf> >>>>> "sp" == Shawn Pearce <spearce@REDACTED> writes:
slf> sp> When the second gen_server gets the binaries, it sends them to the
slf> sp> port using Port ! {self(), {command, List}}, where List is the
slf> sp> List of ErlDrvBinary objects given to Erts by the bt848 driver.
slf>
slf> The docs & erts code seem to imply that erlang:port_command/2 is the
slf> preferred way of doing that.  {shrug}

Learn something new every day.  Thanks, I'll update my code.

slf> Would I be correct to guess that your XVideo driver defines the
slf> 'outputv' method and that its 'outputv' handler accesses the pointers
slf> inside of the ErlIOVec directly (to avoid unnecessary data copies)?

Yes.  Because the binaries are much larger than 4*ERL_ONHEAP_BIN_LIMIT
(256 bytes in R8B-1), Erts won't combine them into a single binary.
Instead I get the group of them as an ErlIOVec, and I just take the
binaries out of that.

However, my code is "poor" in that it assumes the data starts at the
first byte of each ErlDrvBinary.  In reality, an ErlIOVec entry may point
to a byte within the binary, not at the start.  This can occur if the
binary actually came from another driver that used an offset to skip some
leading bytes, or if Erts "splits" the binary into two sub-binaries
without copying the data.

I guess that would be a 'bug' that I should address at some point.  Right
now the only binaries I am dealing with are controlled by other drivers
I've also written.
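
To make the offset issue concrete, here is a minimal sketch of what an
offset-aware handler would look at.  The Mock* types are hypothetical
stand-ins for ErlDrvBinary, SysIOVec and ErlIOVec, reduced to the fields
that matter, so that the snippet compiles without erl_driver.h.  It
assumes each iov_base points somewhere inside its backing binary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the erl_driver.h types, reduced to the
 * fields used here.  A real driver would use ErlDrvBinary, SysIOVec
 * and ErlIOVec instead. */
typedef struct {
    long orig_size;        /* total bytes in the binary */
    char orig_bytes[64];   /* payload (open-ended in the real struct) */
} MockBinary;

typedef struct {
    char  *iov_base;       /* first byte this entry refers to */
    size_t iov_len;        /* number of bytes in this entry */
} MockIOVec;

typedef struct {
    int          vsize;    /* number of entries */
    MockIOVec   *iov;      /* data pointer and length per entry */
    MockBinary **binv;     /* backing binary per entry, or NULL */
} MockErlIOVec;

/* Offset of entry i inside its backing binary, or -1 if the entry has
 * no backing binary.  Assumes iov_base points into bin->orig_bytes. */
long entry_offset(const MockErlIOVec *ev, int i)
{
    const MockBinary *bin;

    assert(i >= 0 && i < ev->vsize);
    bin = ev->binv[i];
    if (bin == NULL)
        return -1;
    return (long)(ev->iov[i].iov_base - bin->orig_bytes);
}
```

The point is that the payload of entry i is iov[i].iov_base for
iov[i].iov_len bytes, not orig_bytes for orig_size bytes; the two only
coincide when the offset is 0.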

slf> sp> Initial testing showed that allocating ErlDrvBinary objects for
slf> sp> each video frame was far too costly in CPU time.  The allocator is
slf> sp> just too slow.
slf> 
slf> Really?  You must really be moving a *lot* of data through those drivers.
slf> Or is it that, after allocating an ErlDrvBinary, you don't have enough
slf> time to copy the frame into the new ErlDrvBinary without dropping some data?
slf>
slf> Perhaps this strategy would be useful?  Have the bt848 driver allocate
slf> a single (or a small number of) ErlDrvBinary large enough to hold
slf> several frames worth of data.  The driver can choose the offset in a
slf> ErlDrvBinary to deposit the next frame's data.  Hrm ... it isn't
slf> obvious if this would lower your overhead or not.

Eh.  It's digital video.  Frames of digital video are not exactly what I'd
call small.  Plus I have digital audio too, but those buffers are quite
small compared to the digital video frames.

It was "initial" testing.  My test consisted of a very small Erlang
driver and module, run several times over a couple of hours, and that
was that.  I may very well have done something in that test that wasn't
close enough to real life, giving me bad results.

Part of the problem with allocating a frame (or even a group of frames)
every time I need one, rather than reusing ErlDrvBinarys, is that I could
potentially blow up my memory use quite dramatically.  If a video
compressor gets behind, I'll still be capturing video frames at "wire
speed".  I'll never get any back pressure from the compressor to slow
down the capture engine and force it to just skip capturing frames.

I could set up my own counters and have the video compressor send a
message to the capture driver when it starts to see that its queue is
getting long, I guess...
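
For comparison, a rough sketch of that counter approach, with high and
low water marks so the compressor doesn't flood the capture driver with
messages.  All names here (QueueMonitor, monitor_enqueue, Q_PAUSE, ...)
are made up for illustration; how the signal actually reaches the
capture driver is left out.

```c
#include <assert.h>

/* Hypothetical queue monitor kept on the compressor side. */
typedef struct {
    int depth;    /* frames currently queued */
    int high;     /* tell the capture driver to pause at this depth */
    int low;      /* tell it to resume once depth falls to this */
    int paused;   /* non-zero while the capture driver is paused */
} QueueMonitor;

typedef enum { Q_NONE, Q_PAUSE, Q_RESUME } QueueSignal;

/* Record one frame entering the queue; returns the signal to send. */
QueueSignal monitor_enqueue(QueueMonitor *m)
{
    assert(m->low < m->high);
    m->depth++;
    if (!m->paused && m->depth >= m->high) {
        m->paused = 1;
        return Q_PAUSE;
    }
    return Q_NONE;
}

/* Record one frame leaving the queue; returns the signal to send. */
QueueSignal monitor_dequeue(QueueMonitor *m)
{
    assert(m->depth > 0);
    m->depth--;
    if (m->paused && m->depth <= m->low) {
        m->paused = 0;
        return Q_RESUME;
    }
    return Q_NONE;
}
```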

It just seemed so much more convenient to let the driver "own" an
ErlDrvBinary, and when that binary's refc == 1, reuse it.  With a fixed
number of binaries, it's possible for the driver to feel the back pressure
very quickly, as the video compressor won't be "releasing" binaries by
letting their refc drop back to 1.
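
Roughly, the fixed-pool idea looks like the sketch below.  PoolBinary is
a hypothetical stand-in for ErlDrvBinary that keeps only the reference
count, and pool_release() models a consumer dropping its reference; this
is not my actual driver code.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

/* Hypothetical stand-in for ErlDrvBinary.  refc == 1 means the driver
 * holds the only reference, so the buffer may be overwritten. */
typedef struct {
    long refc;
    char data[16];
} PoolBinary;

typedef struct {
    PoolBinary slots[POOL_SIZE];
} FramePool;

void pool_init(FramePool *p)
{
    int i;
    for (i = 0; i < POOL_SIZE; i++)
        p->slots[i].refc = 1;   /* owned by the driver only */
}

/* Return a buffer nobody else references, or NULL if every buffer is
 * still in use downstream.  NULL is the back pressure: skip the frame. */
PoolBinary *pool_acquire(FramePool *p)
{
    int i;
    for (i = 0; i < POOL_SIZE; i++) {
        if (p->slots[i].refc == 1) {
            p->slots[i].refc++;   /* models handing the binary onward */
            return &p->slots[i];
        }
    }
    return NULL;
}

/* Models a consumer dropping its reference, e.g. after a GC. */
void pool_release(PoolBinary *b)
{
    assert(b->refc > 1);
    b->refc--;
}
```

With POOL_SIZE buffers outstanding, the next pool_acquire() fails until
the compressor lets go of one, so memory use stays bounded.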

slf> sp> From the perspective of my application, it would be ok
slf> sp> for my Erlang servers to notify the C drivers when they are done
slf> sp> with the binary so it can rewrite it, regardless of the refc.
slf> 
slf> I had a brainstorm yesterday on this topic.  Consider this
slf> example from the SWIG (http://www.swig.org/) documentation:
slf> 
slf> 	# Copy a file
slf> 	def filecopy(source,target):
slf> 		f1 = fopen(source, "r")
slf> 		f2 = fopen(target, "w")
slf> 		buffer = malloc(8192)
slf> 		nbytes = fread(buffer,8192,1,f1)
slf> 		while (nbytes > 0):
slf> 			fwrite(buffer,8192,1,f2)
slf> 			nbytes = fread(buffer,8192,1,f1)
slf> 		free(buffer)
slf> 
slf> An Erlang driver cannot implement malloc and fread in this manner
slf> because of its assumption of multiple assignment.
slf> 
slf> But, what if the local Erlang process knew that certain binaries were
slf> multiple-assignment-capable?  Then it could safely work like filecopy
slf> above *if* it were written carefully.  This might be viable if two
slf> things were added:
slf> 
slf> 	1. If the owner process of such a binary were to send it
slf> 	to another process, the ErlDrvBinary data would be _copied_ so
slf> 	that the multiple-assignment-ignorant receiver could
slf> 	blissfully assume single-assignment semantics?
slf> 
slf> 	2. The driver implemented a copy method so that the owner
slf> 	process could make a single-assignment "snapshot" of the
slf> 	multiple-assignment binary for long-term keeping.
slf> 
slf> Is this a good idea?  {shrug}

I don't like this idea that much.  What I was thinking about instead
was a way for a driver to "register" interest in an ErlDrvBinary's
refc mutation.  For example:

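void
my_watcher(ErlDrvData ref0, ErlDrvBinary* theBinary)
{
	/* Would be called by Erts whenever theBinary's refc changes. */
	fprintf(stderr, "%p now has %ld users\r\n",
		(void*) theBinary, (long) theBinary->refc);
}

	/* driver_watch_binary() is the proposed call; it does not exist. */
	ErlDrvBinary* b = driver_alloc_binary(8192);
	driver_watch_binary(ref0, b, my_watcher);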

Then, when Erts decrements refc during a GC in some process, the driver
can see it, because Erts calls the my_watcher function with the driver's
own data object and the binary in question.
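
To show the intended behaviour without touching Erts, here is a small
self-contained mock.  Everything in it is hypothetical: WatchedBinary
stands in for ErlDrvBinary, mock_watch_binary() for the proposed
driver_watch_binary(), and mock_decref() for whatever the emulator does
when a process drops its reference.

```c
#include <assert.h>
#include <stddef.h>

typedef struct WatchedBinary WatchedBinary;
typedef void (*WatchFn)(void *drv_data, WatchedBinary *bin);

/* Hypothetical stand-in for ErlDrvBinary plus the watch registration. */
struct WatchedBinary {
    long    refc;
    WatchFn watcher;    /* NULL if nobody registered interest */
    void   *drv_data;   /* passed back to the watcher */
};

/* Mock of the proposed driver_watch_binary(): remember the callback. */
void mock_watch_binary(void *drv_data, WatchedBinary *bin, WatchFn fn)
{
    bin->watcher = fn;
    bin->drv_data = drv_data;
}

/* Mock of the emulator dropping one reference and notifying the driver. */
void mock_decref(WatchedBinary *bin)
{
    assert(bin->refc > 0);
    bin->refc--;
    if (bin->watcher != NULL)
        bin->watcher(bin->drv_data, bin);
}

/* Example watcher: count binaries that became reusable (refc back to 1).
 * drv_data is assumed to point at an int counter. */
void count_reusable(void *drv_data, WatchedBinary *bin)
{
    if (bin->refc == 1)
        (*(int *) drv_data)++;
}
```

A real watcher would put the binary back on the driver's free list
instead of bumping a counter.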

Then the Erlang code could do:

	% Copy a file
	filecopy(Source, Target) ->
		{ok, F1} = file:open(Source, [read, raw, binary]),
		{ok, F2} = file:open(Target, [write, raw, binary]),
		filecopy2(F1, F2).

	filecopy2(F1, F2) ->
		% Drop this process's reference to the previous Buffer.
		erlang:garbage_collect(),
		filecopy2(F1, F2, file:read(F1, 8192)).

	filecopy2(F1, F2, {ok, Buffer}) ->
		file:write(F2, Buffer),
		filecopy2(F1, F2);
	filecopy2(F1, F2, eof) ->
		file:close(F1),
		file:close(F2),
		ok;
	filecopy2(F1, F2, {error, Info}) ->
		file:close(F1),
		file:close(F2),
		exit({error, Info}).

Initially, the F1 driver would check during the call to file:read
to see if a buffer of that size has been allocated.  If it has not,
the driver would allocate one; if it has, the driver would reuse that
buffer, so long as refc == 1.

The erlang:garbage_collect() call would be to force the process to
release its reference to the binary so that the driver's my_watcher
would be called, allowing the driver to see the refc decrease and
put that binary back into its list of usable buffers.

This is admittedly a stupid example, as the driver could have just
checked for refc == 1 during file:read.  This works today in R8B-1,
and I use that to some extent.

--
Shawn.

Why do I like Perl?  Because ``in accordance with Unix tradition Perl
gives you enough rope to hang yourself with.''

Why do I dislike Java? Because ``the class ROPE that should contain the
method HANG to do the hanging doesn't exist because there is too much
'security' built into the base language.''