Troubleshooting a high-load scenario
Joel Reymont
joelr1@REDACTED
Tue Jan 17 12:33:32 CET 2006
Folks,
I have a test harness that launches poker bots against a poker
server. The harness is written in Erlang but the poker server is C++
on Windows. The poker server uses completion ports and async IO.
I'm running into trouble with just 500 bots playing on the server,
launched from the same VM. It appears that the bots get their
commands up to 1 minute late. I'm trying to troubleshoot this and I'm
looking for ideas. I would like to believe that it's not Erlang
running out of steam but the C++ server :-).
I read the packets manually since the length prefix is little-endian
and counts the whole packet, including the 4-byte header itself. I
enabled {nodelay, true} on the socket since I always write complete
packets to the socket.
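For the curious, the manual framing looks roughly like this (a sketch; the module and function names are mine, and it assumes a binary-mode, {active, false} socket). I can't use {packet, 4} because Erlang's built-in framing expects a big-endian length that excludes the header:

```erlang
-module(read_sketch).
-export([read_packet/1, decode/1]).

%% Read one packet from a binary-mode, {active, false} socket.
%% The 4-byte little-endian length counts the whole packet,
%% header included, so the payload is Len - 4 bytes.
read_packet(Sock) ->
    {ok, <<Len:32/little>>} = gen_tcp:recv(Sock, 4),
    gen_tcp:recv(Sock, Len - 4).

%% The same framing as a pure function over an already-received
%% buffer: returns the first complete packet plus leftover bytes,
%% or {more, Buffer} if a full packet hasn't arrived yet.
decode(<<Len:32/little, Rest/binary>>)
  when Len >= 4, byte_size(Rest) >= Len - 4 ->
    PayloadLen = Len - 4,
    <<Payload:PayloadLen/binary, Tail/binary>> = Rest,
    {ok, Payload, Tail};
decode(Buffer) ->
    {more, Buffer}.
```

For example, decode(<<8:32/little, "abcd", "xy">>) returns {ok, <<"abcd">>, <<"xy">>} since the length 8 covers the header plus a 4-byte payload.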
I use selective receive and used to have no flow control in my socket
reader. It would just read the packet length, read the packet and
send the whole thing to its parent. Message queues were filling up
when I was doing that, so now I only read the next network message
once the current one has been processed.
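The one-packet-in-flight flow control above can be sketched like this (names are mine; here the "network" is just a list of packets, where in the real harness each one would come from gen_tcp:recv/2):

```erlang
-module(flow_sketch).
-export([reader/2]).

%% Reader with one-packet-in-flight flow control: after handing a
%% packet to the parent it blocks until the parent acknowledges,
%% so the parent's mailbox never piles up with unread packets.
reader([], Parent) ->
    Parent ! {done, self()};
reader([Packet | Rest], Parent) ->
    Parent ! {packet, self(), Packet},
    receive
        {ack, Parent} -> reader(Rest, Parent)
    end.
```

The parent handles each {packet, Pid, P} and replies Pid ! {ack, self()} only when it has finished processing, which is what throttles the reader.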
I'm using the default socket buffer size for sending and receiving.
I'm not sure what the default buffer size is, as it's not stated in
'man inet'. I do not have the source code for the poker server, and
I'm not sure what IOCP does when I delay reading from the socket on
my end. I'm being told by the client's techs that I could
be getting the command 1 minute late because I'm reading it from the
socket 1 minute late and the command sits in the network buffers all
the while.
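'man inet' doesn't list the defaults, but a socket can be asked directly (a quick shell sketch; the numbers are OS-dependent):

```erlang
%% Open a throwaway listen socket and query its buffer options.
{ok, Sock} = gen_tcp:listen(0, []),
{ok, Opts} = inet:getopts(Sock, [sndbuf, recbuf, buffer]),
io:format("~p~n", [Opts]),
gen_tcp:close(Sock).
```

Note that sndbuf/recbuf are the kernel socket buffers, while buffer is the Erlang driver's user-space buffer.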
How do I troubleshoot this scenario? The bots don't do much
processing themselves, basically make a decision and shoot a command
back. They don't even react to all commands. The server spits out
packets all the time, though, since all bots in the game get game
notifications and table state updates from the lobby.
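One way to narrow it down from the Erlang side (a sketch, names mine): if bot mailboxes are backing up, the minute is being lost inside the VM; if the mailboxes are empty while packets still arrive stale, the data was sitting in the socket or network buffers. The longest mailboxes on the node can be listed like this:

```erlang
-module(diag_sketch).
-export([worst_mailboxes/1]).

%% Return up to N processes with the longest message queues,
%% as {QueueLength, Pid} pairs, longest first. Processes that
%% die mid-scan (process_info -> undefined) are skipped by the
%% comprehension filter.
worst_mailboxes(N) ->
    Sizes = [{Len, Pid}
             || Pid <- processes(),
                {message_queue_len, Len} <-
                    [process_info(Pid, message_queue_len)]],
    lists:sublist(lists:reverse(lists:sort(Sizes)), N).
```

Running diag_sketch:worst_mailboxes(10) from the shell while the 500 bots are playing would show immediately whether any process is sitting on a backlog.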
Thanks, Joel
--
http://wagerlabs.com/