[erlang-questions] Need help with async disk IO and thread pool on many devices (more than 1 Gbit/s )
Patrik Nyblom
pan@REDACTED
Wed Apr 3 11:19:14 CEST 2013
On 04/03/2013 10:56 AM, Max Lapshin wrote:
> You are trying to solve another problem.
>
> Hardware can _always_ be bad. But whole system must not go down if one
> disk is slow.
>
> Currently, whole erlang is becoming useless if only one hard drive
> from 16 doesn't respond.
>
> This is the problem.
Wouldn't you rather need to have one async pool per device? Or the
option to bind certain files to specific async threads? The thing with
dynamic async threads is that the implementation relies on requests for
the same file descriptor being routed to the same thread for all of the
descriptor's lifespan. If the async thread hangs, other file descriptors
already routed to that thread have to wait. You would need to have the
"good" mapping between threads and devices from the beginning.
Something like:
    %% Note: async_pool_id is a proposed option, it does not exist in file:open/2 today.
    open_a_file(Name) ->
        PoolId = get_pool_id_from_name(Name),
        {ok, FD} = file:open(Name, [{async_pool_id, PoolId}]),
        FD.
and get_pool_id_from_name/1 will look at the file name, figure out the
device and return a pool id that is only used for that particular
device? It could even be a specific thread id - you could always find
out how many async threads you have, know how many devices you have and
round robin files between the threads that should handle that particular
device. In that case you would only need to add an interface where you
can specify the thread when opening the file (and remember which thread
is connected to each port). Sounds like a fairly feasible hack...
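To make the round-robin idea concrete, here is a minimal sketch of the mapping part only. It assumes devices can be recognised by a mount-point prefix and that the async threads are split into equal groups, one group per device. The module name pool_map, both function names and the mount paths are made up for illustration. Nothing here touches the file driver, and the real thread count would come from erlang:system_info(thread_pool_size).

```erlang
-module(pool_map).
-export([device_index/2, thread_for/4]).

%% Return the 0-based index of the first mount point that is a prefix of
%% Name, or 'unknown' when none matches. Mounts is a hypothetical list such
%% as ["/mnt/disk0/", "/mnt/disk1/"].
device_index(Name, Mounts) ->
    device_index(Name, Mounts, 0).

device_index(_Name, [], _I) ->
    unknown;
device_index(Name, [Mount | Rest], I) ->
    case lists:prefix(Mount, Name) of
        true  -> I;
        false -> device_index(Name, Rest, I + 1)
    end.

%% Pick an async thread for the Seq-th file opened on device DevIdx.
%% The NumThreads threads are split into NumDevices contiguous groups of
%% equal size, and files rotate round robin inside their device's group,
%% so a hung device can only block threads in its own group.
thread_for(DevIdx, Seq, NumDevices, NumThreads)
  when NumDevices > 0, NumThreads >= NumDevices ->
    PerDev = NumThreads div NumDevices,
    DevIdx * PerDev + (Seq rem PerDev).
```

With 16 devices and 64 async threads each device gets 4 threads, so the sixth file (Seq = 5) on device 1 would land on thread 5.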
On a side note, are you using sendfile? That might at least make the
ongoing connections run smooth while another device is blocked... Not
that it addresses the real problem, but just wondering if you have tried
using it - it might be good for performance in your application.
Another, less compelling, solution would be to route only the writes
through another process, so that you open a small "writer program" and
pump the data to it through the spawn driver. Not a general solution
either, but much simpler than to hack the file driver...
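A rough sketch of the "writer program" idea, under a few assumptions: a POSIX sh and cat are available, and plain cat stands in for whatever small writer program you would really use. The module name port_writer and its functions are made up for illustration. The point is that a slow disk should then stall only the external program and the process feeding it, not an async thread shared with other files.

```erlang
-module(port_writer).
-export([open/1, write/2, close/1]).

%% Start an external writer for Path. Assumption: "sh" and "cat" exist on
%% this system; a real writer program would replace "cat".
open(Path) ->
    Sh = os:find_executable("sh"),
    %% Path is passed as "$0", so the shell never parses the file name.
    open_port({spawn_executable, Sh},
              [{args, ["-c", "exec cat > \"$0\"", Path]}, binary, out]).

%% Hand one chunk to the writer. If the writer is stuck on its disk the
%% pipe fills up, the port becomes busy and the caller is suspended.
write(Port, Data) ->
    true = port_command(Port, Data),
    ok.

%% Closing the port closes the writer's stdin, so "cat" sees EOF and exits.
close(Port) ->
    true = port_close(Port),
    ok.
```

Running one such port per device (or per file) keeps the blast radius of a hung disk down to the processes that write to it.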
The async thread pool leaves a lot to be desired - we have planned to
replace it with something called "dirty schedulers" (internal name...),
which is on our roadmap for R17. In the end they should solve a lot of
the problems with the current implementation. Not that that helps you
now, but it makes me wonder whether a "dirty" solution might be the right
way to solve it for the moment, as another solution is in the pipe...
Cheers,
/Patrik