[erlang-questions] Issues with the inets http client

Fri May 15 01:11:40 CEST 2009

Hi All,

I have some experience of trying to use the inets HTTP client in 
production code that heavily relies on XML over HTTP APIs to external 
systems. Among other client libraries we've tried to use the inets HTTP 
client and have found some issues with it.

1) Synchronous requests between the httpc_manager and httpc_handler
When a new connection needs to be spawned, the httpc_manager calls 
start_handler/2, which in turn either calls 
httpc_handler_sup:start_child/1 or httpc_handler:start_link/3. 
httpc_handler is a gen_server, which will do way too much work in its 
init function. It will call send_first_request/3 (unless it returns a 
https_through_proxy_is_not_currently_supported error), which will try to 
establish a connection to a remote server, and send the request. This 
might sound like the sane thing to do, but establishing a connection can 
take a very long time, and it will actually use the timeout for the 
complete request in the call to http_transport:connect/4. Now, it has 
been pointed out that the TCP stack might time out earlier, but this is 
still potentially a very long time. The problem here is that the process 
that called start_link on a gen_server will be blocked until the call to 
init/1  has returned. So, the manager, responsible for relaying requests 
and responses for every request in the system is blocked while some 
request is trying to connect and send its data. On a congested network 
this can be even worse, since the call to httpc_request:send/3, which 
will eventually call gen_tcp:send/2 or ssl:send/2, might also block, 
this time for an unspecified time. During this time, no other requests 
or responses can be handled by the manager.

In our case this caused the manager to have several thousands of http 
requests queued up in its message queue, using quite a lot of memory, 
and making issue number 2 even worse.
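
To illustrate what a non-blocking start could look like, here is a minimal 
sketch. It is not the inets code: the module name lazy_conn, the state 
record and the 5000 ms connect timeout are all made up for the example. 
init/1 returns at once with a zero timeout, and the slow connect is done 
in handle_info/2, so the process that called start_link/2 is not held up.

```erlang
-module(lazy_conn).
-behaviour(gen_server).

-export([start_link/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% Hypothetical state; socket stays undefined until we have connected.
-record(state, {host, port, socket}).

start_link(Host, Port) ->
    gen_server:start_link(?MODULE, {Host, Port}, []).

%% Only record where to connect. The zero timeout makes gen_server
%% deliver 'timeout' as soon as the mailbox is empty, so start_link/2
%% returns to the caller before any network work is done.
init({Host, Port}) ->
    {ok, #state{host = Host, port = Port}, 0}.

%% The potentially slow connect happens here, in the handler's own
%% time. The 5000 ms limit is an assumption for this sketch.
handle_info(timeout, #state{socket = undefined} = S) ->
    #state{host = Host, port = Port} = S,
    case gen_tcp:connect(Host, Port, [binary, {active, once}], 5000) of
        {ok, Socket}    -> {noreply, S#state{socket = Socket}};
        {error, Reason} -> {stop, {connect_failed, Reason}, S}
    end;
handle_info(_Other, S) ->
    {noreply, S}.

handle_call(_Req, _From, S) -> {reply, {error, not_implemented}, S}.

handle_cast(_Msg, S) -> {noreply, S}.

terminate(_Reason, #state{socket = undefined}) -> ok;
terminate(_Reason, #state{socket = Socket}) -> gen_tcp:close(Socket).

code_change(_OldVsn, S, _Extra) -> {ok, S}.
```

With this shape a failed connect shows up as an exit of the handler 
rather than as an error from start_link/2, so whoever starts the handler 
has to monitor or link to it to find out.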

2) Timeout handling
The http:request/X function will call httpc_manager:request/2, which will 
return a request id. It will then call handle_answer/2. If the request 
is a synchronous request it will wait (without a timeout) for a response 
from the manager. The manager in question can (as pointed out above) be 
busy with any other request and this request will be somewhere in its 
message queue. When the manager however manages to deal with the request 
it can either spawn a new httpc_handler, or reuse one. In the first 
case, as pointed out above, the timeout is used to connect (with ssl 
this gets even worse, since the same timeout is reused in several calls 
without subtracting the time used, and in some cases, an infinite 
timeout is used to resolve the host name). In the second case, 
httpc_handler:send/2 is called. This is a gen_server call which would 
time out in 5 seconds (hard coded). Again, this calls 
httpc_request:send, which might block forever. After (possibly 
connecting and) sending the request, which might or might not have taken 
a long time, activate_request_timeout/1 is called, with the originally 
specified timeout. This starts a timer, using erlang:send_after/3, which 
is the request timeout. (Just as a side note, this does a call back to 
httpc_manager, to add a session in some ets table...)

In our case, we use SSL. We did get hit by the infinite timeout for DNS 
lookups when our network wasn't performing as it should, and we had lots 
of hanging processes. This means that combined with the first case, you 
can hang the httpc_manager forever. When processes didn't hang, the 
requests would time out long after they "should have" according to what 
I would expect from passing a timeout to an API call.
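
What I would expect instead is that one time budget covers the whole 
request. A minimal sketch of that idea follows, assuming a plain TCP 
connection; the module name deadline and the function request/4 are made 
up for the example and are not part of inets. The timeout is turned into 
an absolute deadline once, and every later call only gets the time that 
is left.

```erlang
-module(deadline).
-export([new/1, remaining/1, request/4]).

%% Turn a relative timeout (ms or infinity) into an absolute deadline
%% on the monotonic clock.
new(infinity) -> infinity;
new(TimeoutMs) when is_integer(TimeoutMs), TimeoutMs >= 0 ->
    erlang:monotonic_time(millisecond) + TimeoutMs.

%% Milliseconds left until the deadline, never negative.
remaining(infinity) -> infinity;
remaining(Deadline) ->
    max(0, Deadline - erlang:monotonic_time(millisecond)).

%% Connect, send and read one reply under a single budget. The time
%% spent connecting is subtracted from what recv/3 may use.
request(Host, Port, Data, TimeoutMs) ->
    Deadline = new(TimeoutMs),
    %% send_timeout is only a coarse upper bound on a blocking send;
    %% it is not adjusted for the time already spent.
    Opts = [binary, {active, false}, {send_timeout, TimeoutMs}],
    case gen_tcp:connect(Host, Port, Opts, remaining(Deadline)) of
        {ok, Sock} ->
            Result = case gen_tcp:send(Sock, Data) of
                         ok -> gen_tcp:recv(Sock, 0, remaining(Deadline));
                         {error, _} = SendError -> SendError
                     end,
            ok = gen_tcp:close(Sock),
            Result;
        {error, _} = ConnectError ->
            ConnectError
    end.
```

The same pattern would apply to each step of an SSL connect: compute 
remaining/1 before every call instead of passing the original timeout 
again.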

3) Copying of data between processes
The http module will send the request to the httpc_manager, which will 
either dispatch it to an existing httpc_handler process, or will spawn a 
new one and send it to that. This means that all data in the request 
will be copied from the requesting process to the manager and then to 
the httpc_handler process. The result of the request is copied from the 
httpc_handler 
process to the httpc_manager and then to the requesting process. The 
httpc_handler process will hang around until a timeout is reached or the 
socket is closed remotely. Before that, a gc has hopefully taken place. 
The manager will never exit, but will hopefully also gc at some point. 
The requesting process is the only one that needs to keep the data; 
whatever it does with it is outside of inets.

4) Reading of responses
The httpc_handler module will read HTTP responses solely relying on 
{active, once}, using some kind of continuation style parser. This parser 
generally reads one octet at a time from the binary received from the 
socket, storing it in an accumulator and continuing with the next. Some 
experiments have shown that collecting large bodies with {active, once} 
is quite expensive (CPU-wise) 
(http://code.hellstrom.st/hg.cgi/gen_httpd/file/tip/test/micro_benchmark.erl). 
But of course, if this process is supposed to answer to other pipelined 
requests, it can't use passive receive from the socket.
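
As an illustration of avoiding the octet-at-a-time walk, here is a 
minimal sketch that finds the end of the header block in one search over 
the buffered binary. The module name header_split is made up, and it 
assumes the binary module from stdlib is available; this is not how 
httpc_handler is written.

```erlang
-module(header_split).
-export([split/1]).

%% Look for the blank line (CRLF CRLF) that ends the header block.
%% Returns {ok, Headers, Rest} when it is present in the buffer, or
%% 'more' when the caller has to wait for more data from the socket.
split(Buffer) when is_binary(Buffer) ->
    case binary:match(Buffer, <<"\r\n\r\n">>) of
        {Pos, 4} ->
            <<Headers:Pos/binary, _:4/binary, Rest/binary>> = Buffer,
            {ok, Headers, Rest};
        nomatch ->
            more
    end.
```

On 'more' the caller would append the next packet to the buffer and try 
again, which still works with {active, once}.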

I remember there being other issues with the client, such as not 
persisting connections in case of POST requests, since the concepts of 
pipelining and persistent connections were mixed up. I don't know if this 
has been fixed, and I'm too lazy to look in the source for this. Other 
people have also complained about session_remotely_closed errors 
(http://www.erlang.org/pipermail/erlang-bugs/2009-April/001269.html)

I hope this is useful feedback, as the OTP team asked for feedback on 
the inets http client earlier 
(http://erlang.org/pipermail/erlang-bugs/2009-March/001227.html, 
http://erlang.org/pipermail/erlang-questions/2009-February/041813.html).

Best regards
-- 
Oscar Hellström, oscar@REDACTED
Office: +44 20 7655 0337
Mobile: +44 798 45 44 773
Erlang Training and Consulting Ltd
http://www.erlang-consulting.com/