[erlang-questions] gen_tcp very slow to fetch data

Tony Rogvall tony@REDACTED
Tue Nov 17 16:03:43 CET 2009


Do not forget about {active, once} mode.
With {active,once} the socket delivers one message (how much data it carries
depends on buffer size etc.), then it switches to passive mode. To get the next
message, call inet:setopts(Socket, [{active,once}]) to activate it again. This
mode lets you use a selective receive while still getting flow control.
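
The re-arming loop described above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the module name active_once_sketch and the function recv_all/2 are made up for the example, and it assumes the socket was opened with [binary, {active, false}] and that the caller is the socket's controlling process.

```erlang
-module(active_once_sketch).
-export([recv_all/2]).

%% Hypothetical helper: collect everything the peer sends until it closes
%% the connection, or give up if nothing arrives within Timeout ms.
recv_all(Socket, Timeout) ->
    recv_all(Socket, Timeout, []).

recv_all(Socket, Timeout, Acc) ->
    %% Ask for exactly one message; after delivering it the socket
    %% falls back to passive mode, so the sender cannot flood us.
    ok = inet:setopts(Socket, [{active, once}]),
    receive
        {tcp, Socket, Data} ->
            recv_all(Socket, Timeout, [Data | Acc]);
        {tcp_closed, Socket} ->
            {ok, iolist_to_binary(lists:reverse(Acc))};
        {tcp_error, Socket, Reason} ->
            {error, {socket, Reason}}
    after Timeout ->
        {error, {socket, timeout}}
    end.
```

Because every clause matches on Socket, unrelated messages stay in the mailbox instead of being treated as errors, which is the selective receive part.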

/Tony


On 17 nov 2009, at 01.51, Ngoc Dao wrote:

> From inet's doc:
> http://www1.erlang.org/documentation/doc-4.9.1/lib/kernel-2.4.1/doc/html/inet.html
> 
> If the active option is true, which is the default, everything
> received from the socket will be sent as messages to the receiving
> process. If the active option is set to false (passive mode), the
> process must explicitly receive incoming data by calling
> gen_tcp:recv/N or gen_udp:recv/N  (depending on the type of socket).
> Note: Passive mode provides flow control; the other side will not be
> able to send faster than the receiver can read. Active mode provides no
> flow control; a fast sender could easily overflow the receiver with
> incoming messages. Use active mode only if your high-level protocol
> provides its own flow control (for instance, acknowledging received
> messages) or the amount of data exchanged is small.
> 
> 
> On Tue, Nov 17, 2009 at 2:59 AM, ERLANG <erlangy@REDACTED> wrote:
>> Hi Chandru !
>> 
>> That fixed my problem. Thanks.
>> While googling a bit, I found two ways to read from the Socket:
>> 
>> % version 1 with active-mode messages
>> recv(Socket, Bin) ->
>>    receive
>>        {tcp, Socket, B} ->
>>            io:format(".", []),
>>            % append with bit syntax; concat_binary/1 is obsolete
>>            recv(Socket, <<Bin/binary, B/binary>>);
>>        {tcp_closed, Socket} ->
>>            {ok, Bin};
>>        Other ->
>>            {error, {socket, Other}}
>>    after ?TIMEOUT ->
>>        {error, {socket, timeout}}
>>    end.
>> 
>> % version 2 with "gen_tcp:recv" (the socket must be opened with {active, false})
>> recv2(Socket, Bin) ->
>>    case gen_tcp:recv(Socket, 0, ?TIMEOUT) of
>>        {ok, B} ->
>>            io:format(".", []),
>>            % recurse into recv2, not the active-mode recv/2
>>            recv2(Socket, <<Bin/binary, B/binary>>);
>>        {error, closed} ->
>>            {ok, Bin};
>>        {error, timeout} ->
>>            {error, {socket, timeout}};
>>        Other ->
>>            {error, {socket, Other}}
>>    end.
>> 
>> 
>> Which one is the best in my case (see below: fetch.erl)?
>> 
>> Regards
>> Zabrane
>> 
>> On 16 Nov 09, at 18:53, Chandru wrote:
>> 
>>> You are expecting the server to indicate end of response by closing the
>>> connection, but because you specify HTTP/1.1 in the request, the server
>>> keeps the connection open (HTTP/1.1 connections are persistent by default),
>>> and you time out. Try replacing HTTP/1.1 with HTTP/1.0 in your request, or
>>> parse the response to detect the end of the response.
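
As a rough illustration of the suggestion above, here are two ways to build a request whose response ends when the server closes the socket. The module and function names (request_sketch, http10/2, http11_close/2) are invented for this sketch. The second variant is not mentioned in the thread: it relies on the standard "Connection: close" header to ask an HTTP/1.1 server not to keep the connection open.

```erlang
-module(request_sketch).
-export([http10/2, http11_close/2]).

%% Hypothetical helper: HTTP/1.0 request. Servers normally close the
%% connection after answering a 1.0 request without keep-alive.
http10(Host, Path) ->
    iolist_to_binary(["GET ", Path, " HTTP/1.0\r\n",
                      "Host: ", Host, "\r\n",
                      "\r\n"]).

%% Hypothetical helper: HTTP/1.1 request that explicitly asks the
%% server to close the connection once the response is sent.
http11_close(Host, Path) ->
    iolist_to_binary(["GET ", Path, " HTTP/1.1\r\n",
                      "Host: ", Host, "\r\n",
                      "Connection: close\r\n",
                      "\r\n"]).
```

For example, request_sketch:http10("www.google.com", "/") could be passed straight to gen_tcp:send/2. With either form, a receive loop that waits for tcp_closed sees the end of the response instead of running into the timeout.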
>>> 
>>> cheers
>>> Chandru
>>> 
>>> 2009/11/16 zabrane Mikael <zabrane3@REDACTED>
>>> 
>>>> Hi List !
>>>> 
>>>> New to Erlang, I'm trying to implement a simple URL fetcher.
>>>> Here's my code (please feel free to correct it if you find any bug or know
>>>> a better approach):
>>>> 
>>>> 
>>>> 
>>>> 8-----8-----8-----8-----8-----8-----8-----8-----8-----8-----8-----8-----8----
>>>> -module(fetch).
>>>> 
>>>> -export([url/1]).
>>>> 
>>>> -define(TIMEOUT,    7000).
>>>> -define(TCP_OPTS,   [binary, {packet, raw}, {nodelay, true},
>>>>                   {active, true}]).
>>>> 
>>>> url(Url) ->
>>>>  {ok, _Tag, Host, Port} = split_url(Url),
>>>> 
>>>>  Hdrs = [],
>>>>  Request = ["GET ", Url, " HTTP/1.1\r\n", Hdrs, "\r\n\r\n"],
>>>> 
>>>>  case catch gen_tcp:connect(Host, Port, ?TCP_OPTS) of
>>>>      {'EXIT', Why} ->
>>>>          {error, {socket_exit, Why}};
>>>>      {error, Why} ->
>>>>          {error, {socket_error, Why}};
>>>>      {ok, Socket} ->
>>>>          gen_tcp:send(Socket, list_to_binary(Request)),
>>>>          recv(Socket, list_to_binary([]))
>>>>  end.
>>>> 
>>>> recv(Socket, Bin) ->
>>>>  receive
>>>>      {tcp, Socket, B} ->
>>>>          io:format(".", []),
>>>>          % append with bit syntax; concat_binary/1 is obsolete
>>>>          recv(Socket, <<Bin/binary, B/binary>>);
>>>>      {tcp_closed, Socket} ->
>>>>          {ok, Bin};
>>>>      Other ->
>>>>          {error, {socket, Other}}
>>>>  after ?TIMEOUT ->
>>>>      {error, {socket, timeout}}
>>>>  end.
>>>> 
>>>> 
>>>> split_url([$h,$t,$t,$p,$:,$/,$/|T]) ->  split_url(http, T);
>>>> split_url(_X)                       ->  {error, split_url}.
>>>> 
>>>> split_url(Tag, X) ->
>>>>  case string:chr(X, $:) of
>>>>      0 ->
>>>>          Port = 80,
>>>>          case string:chr(X,$/) of
>>>>              0 ->
>>>>                  {ok, Tag, X, Port};
>>>>              N ->
>>>>                  Site = string:substr(X,1,N-1),
>>>>                  {ok, Tag, Site, Port}
>>>>          end;
>>>>      N1 ->
>>>>          case string:chr(X,$/) of
>>>>              0 ->
>>>>                  error;
>>>>              N2 ->
>>>>                  PortStr = string:substr(X,N1+1, N2-N1-1),
>>>>                  case catch list_to_integer(PortStr) of
>>>>                      {'EXIT', _} ->
>>>>                          {error, port_number};
>>>>                      Port ->
>>>>                          Site = string:substr(X,1,N1-1),
>>>>                          {ok, Tag, Site, Port}
>>>>                  end
>>>>          end
>>>>  end.
>>>> 
>>>> 
>>>> 
>>>> 8-----8-----8-----8-----8-----8-----8-----8-----8-----8-----8-----8-----8------
>>>> 
>>>> When testing it, receiving from the socket is very, very slow:
>>>> $ erl
>>>> 1> c(fetch).
>>>> 2> Bin = fetch:url("http://www.google.com").
>>>> ......{error,{socket,timeout}}
>>>> 
>>>> Am I missing something?
>>>> What I'd like to end up with is a very fast fetcher. Any hints?
>>>> 
>>>> Regards
>>>> Zabrane
>>>> 
>> 
>> 
>> ________________________________________________________________
>> erlang-questions mailing list. See http://www.erlang.org/faq.html
>> erlang-questions (at) erlang.org
>> 
>> 