FW: pg2 is broken R13B04

Tue May 4 16:25:25 CEST 2010

Hi,

I sent this to erlang-bugs too, but for some reason it didn't get in the list?

....

So after more tests I have seen that pg2 is definitely not working as intended.

It appears that the root problem is how a new instantiation of pg2 within a cluster of Erlang nodes gets its data.

The following sequences of events occur.

1) All nodes do a net_kernel:monitor_nodes(true) in the init function of pg2.

2) The new instance of pg2 will send {new_pg2, node()} to all other nodes in the pool.

3) The new instance of pg2 will send {nodeup, Node} to itself (where nodes is a list of nodes()).

What it appears is that when only 2 nodes are in the pool things are generally ok. However, the synchronization process gets muddied when there are many members.

The process of updating the nodes is that upon receipt of {new_pg2,Node} or {nodeup,Node} to literally go through the table of pids in the ets pg2_table and build a list similar to:

[proxy_micro_cache,[<6325.319.0>,<6324.324.0>]]]

This is dispatched to the new pg2 instance. The problem is every node does that, so the new pg2 instance will end up with a table like:

[proxy_micro_cache,
  [<6437.319.0>,<6437.319.0>,<6437.319.0>,<6437.319.0>,
   <6437.319.0>,<6437.319.0>,<6436.324.0>,<6436.324.0>,
   <6436.324.0>,<6436.324.0>,<6436.324.0>,<6436.324.0>]]]

Where there are many instances for each Pid since each node has sent its copy of the data, causing that process to be replicated many times (i.e. the call to ets:update_counter in pg2:join_group)!!!

This problem is compounded further on nodes that join later, or when a VM stops and is restarted.

An additional problem arises when another process in the new group sends pg2:join on its own. In that case there is a timing window whereby the new instance could get that new entry more than once.

My recommendation is:

1) Have a new pg2:join function called pg2:join_once. In this case a process will never be permitted to have more than 1 join.

2) When a new node joins one could either select only one node to get its data from, or have all nodes in the system send the result of ets:tab2list(pg2_table) to the new node, then have that data inserted directly into its local ets table, as opposed to going through the process of join_group (possibly with the additional step erlang:monitor/2). In this way a process that has been registered more than once would be inserted into the local ets table as a single operation as opposed to many times.

3) Possibly defer new requests to pg2:join on the new instance until synchronization is complete.

I understand that gproc is on the way, but I suspect that pg2 does need fixing.

Regards

Matt