[erlang-questions] Trouble with JInterface

jim rosenblum jim.rosenblum@REDACTED
Tue Jan 13 17:20:12 CET 2015


It happens for all mailboxes. Outside of the failover scenario, everything
functions perfectly; the problem affects every mailbox that is in its
receive loop when the OtpErlangExit is received.
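
For anyone following along, here is a rough sketch of the per-mailbox
failover loop described in the original message quoted below. It is only an
illustration of the five steps, not our production code and not a fix. The
Mailbox interface, PeerExitException, ScriptedMailbox and the "register"
message are hypothetical stand-ins so the example runs without a live node;
they are not the JInterface API (the real code uses the JInterface mailbox
and OtpErlangExit).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.function.Supplier;

// Hypothetical stand-in for a mailbox; NOT the real JInterface API.
interface Mailbox {
    // Returns the next message, or null if the timeout elapsed.
    String receive(long timeoutMillis) throws PeerExitException;
    void send(String node, String message);
    void close();
}

// Hypothetical stand-in for the exit signal (OtpErlangExit in this thread).
class PeerExitException extends Exception {
    PeerExitException(String reason) { super(reason); }
}

// In-memory fake: replays a script of messages; "EXIT" simulates node loss.
class ScriptedMailbox implements Mailbox {
    private final Queue<String> script;
    final List<String> sent = new ArrayList<>();
    boolean closed = false;

    ScriptedMailbox(String... events) {
        script = new ArrayDeque<>(Arrays.asList(events));
    }

    public String receive(long timeoutMillis) throws PeerExitException {
        String next = script.poll();
        if ("EXIT".equals(next)) throw new PeerExitException("node down");
        return next; // null once the script is exhausted, like a timeout
    }

    public void send(String node, String message) { sent.add(node + ":" + message); }

    public void close() { closed = true; }
}

class FailoverLoop {
    // Runs a bounded number of receive iterations and returns handled messages.
    static List<String> run(Supplier<Mailbox> factory, String failoverNode, int maxIterations) {
        List<String> handled = new ArrayList<>();
        Mailbox mbox = factory.get();
        for (int i = 0; i < maxIterations; i++) {
            try {
                String msg = mbox.receive(500); // short timeout, as in our workaround
                if (msg != null) handled.add(msg);
            } catch (PeerExitException e) {          // step 1: exit received
                mbox.close();                        // step 2: close existing mailbox
                mbox = factory.get();                // step 3: create a new mailbox
                mbox.send(failoverNode, "register"); // step 4: hypothetical message
            }                                        // step 5: loop to the receive
        }
        mbox.close();
        return handled;
    }
}
```

The loop is bounded by maxIterations only so the sketch terminates; the real
loop runs until the application shuts down.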

We will try the tracing with the -DOtpConnection.trace parameter as you
suggested - we haven't done that yet.
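
Since -D flags reach the JVM as system properties, a small check like the
one below could confirm that the trace level was actually picked up before
we start reading the output. This is a sketch under that assumption; the
TraceLevel class is a hypothetical helper and is not part of JInterface.

```java
// Hypothetical helper; not part of JInterface.
class TraceLevel {
    // Reads the value passed as -DOtpConnection.trace=N.
    // Returns 0 if the property is absent or is not an integer.
    static int configured() {
        return Integer.getInteger("OtpConnection.trace", 0);
    }
}
```

Logging TraceLevel.configured() at startup would show whether the flag made
it through any launcher scripts.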

thanks

On Tue, Jan 13, 2015 at 10:01 AM, Vlad Dumitrescu <vladdu55@REDACTED>
wrote:

> Hi Jim!
>
> Does this happen every time, for all mailboxes?
>
> Did you try to trace the Java code, to see if there is more information to
> get?
> Use -DOtpConnection.trace=3 and you will get all the activity on the
> connection from the Java side (it's global, so it will be for _everything_,
> you might want to try starting with a value of 1 at first and increase if
> it's not enough).
>
> regards,
> Vlad
>
>
> On Tue, Jan 13, 2015 at 3:39 PM, jim rosenblum <jim.rosenblum@REDACTED>
> wrote:
>
>> Folks,
>>
>> We are having a problem with Jinterface and Erlang 17.2. Essentially we
>> have a Java/Clojure application that is using a home-brewed, Erlang,
>> two-node, in-memory cache. We think the problem is either a bug with
>> Jinterface or a misunderstanding of how to use it.
>>
>> On the Java side, we create a node and 1 to 12 unnamed mailboxes. We have
>> a receive loop on each mailbox: receiving messages from Erlang, doing the
>> right things with those messages, catching any errors and looping back to
>> the receive. In general, everything works fine.
>>
>> When one of the Erlang cache-nodes goes down, the application recreates
>> its mailboxes, connects to the “other” Erlang node via the new mailboxes
>> and attempts to carry on. Here is where we have the problem:
>>
>> The Failover Protocol:
>> For each mailbox
>> 1. Receive loop gets OtpErlangExit
>> 2. Existing mailbox is closed
>> 3. New unnamed mailbox created
>> 4. Application-appropriate message is sent to the failover node via this
>> new mailbox
>> 5. Loop to the receive
>>
>> Symptom:
>> If the receive loop has a timeout, no message is received by the new
>> mailbox until AFTER the timeout fires and the loop iterates back to the
>> receive. If no timeout is specified, the receive hangs (seemingly)
>> forever. We have confirmed that the failover Erlang node received the
>> message (step 4) sent via the new mailbox, and that the failover Erlang
>> node is sending messages to the PID representing the new mailbox:
>> inspecting my server using Observer, I can see that it is sending the
>> expected message to the correct PID.
>>
>> Our workaround is to use a short timeout in our mailbox receive loop so
>> that we don’t hang for too long, but we are troubled by the thought that we
>> might not understand JInterface best-practices as we are skeptical that the
>> JInterface code is anything but perfect :)
>>
>> Is this a known issue, or does someone have any advice on how to proceed?
>>
>>
>> Thanks,
>>
>> Jr0
>>