[erlang-questions] Is ei_xreceive_msg() thread safe?

Peter Membrey peter@REDACTED
Wed Aug 28 14:17:49 CEST 2013


Hi Sverker, 

Coincidentally, with some help from a colleague and friend, I have implemented your suggested solution. It seems like a good halfway house and means I don't need to put a sleep in there and poll the socket directly. 

I am a bit worried about the receive though. The timeout is necessary (although hopefully rarely required) but if it means I could only get a partial message, that would obviously be very bad. On the other hand, ending up blocked inside the mutex would also be very bad. 

I did come up with one potential solution for this problem: I could modify ei_xreceive() to accept a mutex as an argument. Then, if a heartbeat request is received, the mutex can be locked just before the reply is written to the socket and released just after. This would mean that the lock is only taken when it is actually needed, and I wouldn't have to use the _tmo version (which could corrupt data): even if the receive blocks, it won't be inside a critical section and so won't block the senders. 
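
To make the idea concrete, here is a minimal sketch of a receive step that takes the lock only around the heartbeat reply. It is not the ei implementation: receive_frame() and read_full() are hypothetical names, and the framing (a 4-byte big-endian length prefix, with a zero-length frame acting as the tick that is answered with four zero bytes) is my assumption about the distribution protocol and should be checked against the ei source.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly n bytes; returns 0 on success, -1 on EOF or error. */
static int read_full(int fd, void *buf, size_t n)
{
    unsigned char *p = buf;
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0)
            return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Hypothetical receive step (not an ei function).
 * Returns 0 for a tick (reply written under the lock),
 * the payload length for a message, or -1 on error. */
static long receive_frame(int fd, pthread_mutex_t *write_lock,
                          unsigned char *buf, size_t cap)
{
    static const unsigned char tock[4] = {0, 0, 0, 0};
    unsigned char hdr[4];
    uint32_t len;

    assert(write_lock != NULL);

    /* Reading is done without the lock: only this thread reads. */
    if (read_full(fd, hdr, sizeof hdr) != 0)
        return -1;
    len = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
          ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];

    if (len == 0) {
        /* Assumed tick: the reply is the only write made here, so
         * this is the only place the shared lock is needed. */
        ssize_t w;
        pthread_mutex_lock(write_lock);
        w = write(fd, tock, sizeof tock);
        pthread_mutex_unlock(write_lock);
        return w == (ssize_t)sizeof tock ? 0 : -1;
    }

    if (len > cap)
        return -1;
    if (read_full(fd, buf, len) != 0)
        return -1;
    return (long)len;
}
```

The sending threads would lock the same mutex around their own writes, so a blocked read never holds them up and the reply can never land in the middle of one of their messages.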

The change would add a new function, but wouldn't alter the behaviour of existing functions. It seems a simple and elegant solution, but I'm not sure there'd be much support in accepting such a change upstream :) 

Cheers, 

Pete 

----- Original Message -----

From: "Sverker Eriksson" <sverker.eriksson@REDACTED> 
To: "Peter Membrey" <peter@REDACTED> 
Cc: "Serge Aleynikov" <serge@REDACTED>, "Erlang Questions" <erlang-questions@REDACTED> 
Sent: Wednesday, August 28, 2013 6:04:04 PM 
Subject: Re: [erlang-questions] Is ei_xreceive_msg() thread safe? 

On 08/27/2013 05:27 PM, Peter Membrey wrote: 



Hi Serge, 

Thank you very much for clarifying this! At least I know where everything stands now. 

Does the scenario I'm talking about make sense? Could a heartbeat in the current design end up interleaved with other data? I suspect it would be pretty rare, but we pump 10 million or more messages through this app per day, so even if the chance was a million to one, we'd end up seeing it ten times a day on average... 


Yes, it makes sense. The ei_send/receive API is not thread safe for the same file descriptor. 


<blockquote>

I'm thinking for now that I'll give Robert's suggestion a try. If I call ei_xreceive() with a timeout and wrap a mutex around it, would that then protect the socket? I'm assuming (probably a bad thing) that it doesn't reply in a different thread or something... 

I guess it would be a bit wasteful to effectively poll the socket every millisecond or so (I can't really block the other threads from writing longer than that) but it seems to be a potential way to resolve this issue in the short term while I come up with a better fix. 

Does that sound like a reasonable plan? 

</blockquote>
You could use select() to wait for a message to arrive on the socket. 
Then lock your mutex and call ei_xreceive(). 
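
A rough sketch of that pattern, under some assumptions: wait_then_receive() and read_one_byte() are hypothetical names, and the receive call is passed in as a function pointer so the example builds without the ei headers. In real code the callback would wrap ei_xreceive() or ei_xreceive_msg().

```c
#include <assert.h>
#include <pthread.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for the real receive call. */
typedef int (*recv_fn)(int fd, void *ctx);

/* Demo callback: reads a single byte into ctx, returns read()'s result. */
static int read_one_byte(int fd, void *ctx)
{
    return (int)read(fd, ctx, 1);
}

/* Wait until fd is readable WITHOUT holding the lock, then hold the
 * shared socket lock only for the duration of the receive call.
 * Returns the callback's result, or -1 if select() failed. */
static int wait_then_receive(int fd, pthread_mutex_t *sock_lock,
                             recv_fn do_receive, void *ctx)
{
    fd_set readable;
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&readable);
    FD_SET(fd, &readable);

    /* Senders keep running while we sleep here. */
    if (select(fd + 1, &readable, NULL, NULL, NULL) < 0)
        return -1;

    pthread_mutex_lock(sock_lock);
    /* Can still block if only part of a message has arrived so far. */
    rc = do_receive(fd, ctx);
    pthread_mutex_unlock(sock_lock);
    return rc;
}
```

Note that select() only reports that some bytes are readable, not that a whole message has arrived, so the receive call can still block while the lock is held.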

It might not be such a good idea to poll by repeated calls to ei_xreceive() with a short timeout. You can get situations where the timeout triggers after receiving half a message. That will make all following calls to ei_xreceive() get out of sync and probably fail. I guess the probability of this scenario depends on the message sizes and the speed and reliability of your connection. 
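
For illustration only, this is roughly what a timeout-safe read would have to do: keep the bytes that have already arrived so the next call can resume instead of starting again in the middle of a message. The struct and function names are hypothetical and this is not how ei_xreceive() behaves.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* State carried across calls so a timeout never discards data. */
struct partial_read {
    unsigned char buf[64];
    size_t have;            /* bytes of the current message received so far */
};

/* Try to complete a message of `need` bytes.
 * Returns 1 when complete, 0 on timeout (partial data is kept in pr),
 * -1 on EOF or error. The timeout applies to each wait, not the total. */
static int read_resumable(int fd, struct partial_read *pr,
                          size_t need, int timeout_ms)
{
    assert(need <= sizeof pr->buf);

    while (pr->have < need) {
        struct pollfd p = { .fd = fd, .events = POLLIN };
        int ready = poll(&p, 1, timeout_ms);
        ssize_t r;

        if (ready < 0)
            return -1;
        if (ready == 0)
            return 0;       /* timed out: pr->have bytes stay buffered */

        r = read(fd, pr->buf + pr->have, need - pr->have);
        if (r <= 0)
            return -1;
        pr->have += (size_t)r;
    }
    return 1;
}
```

The caller resets pr->have to 0 once it has consumed a complete message.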

On the other hand, calling ei_xreceive() without a timeout, even with my select() suggestion, means that a hiccup on the network could freeze all your sending threads while they wait for the mutex. 

/Sverker 




<blockquote>

Thanks again! 

Kind Regards, 

Peter Membrey 

----- Original Message -----

From: "Serge Aleynikov" <serge@REDACTED> 
To: "Peter Membrey" <peter@REDACTED> 
Cc: "Erlang Questions" <erlang-questions@REDACTED> 
Sent: Tuesday, August 27, 2013 10:59:27 PM 
Subject: Re: [erlang-questions] Is ei_xreceive_msg() thread safe? 

ei_send/receive family of functions are not thread safe, and receive functions do handle heartbeats internally. The way the ei_{connect,send,receive} functions are written, it's not possible to do any non-blocking I/O with them or use them in multi-threaded code in the manner you described without modifying the functions. 

You can take a look at the alternative C++ library that doesn't have such limitations: https://github.com/saleyn/eixx , and offers almost all functionality that ei has. 

Regards, 

Serge 


On Tue, Aug 27, 2013 at 3:30 AM, Peter Membrey < peter@REDACTED > wrote: 


Hi guys, 

I've got a fairly basic C Node set up where I have the main thread running in a loop with ei_xreceive_msg() and a number of "callback" threads that execute functions and write data using ei_send() to the shared socket (connecting to the Erlang node). 

Originally I had a lot of data corruption (the Erlang node crashing due to corrupt data) because of incorrect locking on socket writes. I added mutexes to the ei_send() calls and this problem seemed to go away. 
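
The locking described here looks roughly like the sketch below. The names are hypothetical: raw_send() stands in for ei_send() so that the example builds without the ei headers, and a single global lock per shared socket is assumed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* One lock for the shared socket; every writer must go through it. */
static pthread_mutex_t send_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for ei_send(): write the whole buffer or fail. */
static int raw_send(int fd, const void *data, size_t len)
{
    const unsigned char *p = data;
    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w <= 0)
            return -1;
        p += w;
        len -= (size_t)w;
    }
    return 0;
}

/* Hold the lock across the whole message so that two callback
 * threads can never interleave their partial writes. */
static int locked_send(int fd, const void *data, size_t len)
{
    int rc;

    assert(data != NULL || len == 0);
    pthread_mutex_lock(&send_lock);
    rc = raw_send(fd, data, len);
    pthread_mutex_unlock(&send_lock);
    return rc;
}
```

This only protects writes that go through locked_send(); anything the receive call writes to the same socket on its own bypasses the lock.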

However, I've had a couple of occasions where the system went from quiet to suddenly busy and corrupt data was still sent to the Erlang node. I'm positive all the places where I do ei_send() are protected, but that got me wondering about ei_xreceive_msg(). 

From what I can find, ei_xreceive_msg() automatically handles heartbeats for you and I guess that means it will send some sort of reply on that socket. If the heartbeat reply is being sent at the same time as some other thread tries to write to the socket, is it possible that the two could get interleaved or something? I would honestly have thought it unlikely but I'm running out of ideas. 

Assuming it's possible, how could I add a mutex in this case? The call itself blocks, so I can't wrap the whole call in the mutex else nothing else will be able to send data, and there's no way to pass a mutex into the call itself. So as far as I can tell, there's no way to protect these writes and prevent them from getting mixed up with other writes on that socket. 

Does anyone have any ideas? I'm quite willing to accept I could be doing something pretty stupid, but I'm really out of ideas as to what that might be... 

Thanks in advance! 

Kind Regards, 

Peter Membrey 
_______________________________________________ 
erlang-questions mailing list erlang-questions@REDACTED http://erlang.org/mailman/listinfo/erlang-questions 



</blockquote>

