[erlang-questions] [ANN] Asynchronous PostgreSQL driver, second release

Fri Feb 24 20:50:25 CET 2012

Good point. The same is true in JDBC - not all drivers provide all
features. There is an API to ask the driver what features it does
support, and a well defined behaviour when you try to invoke one that
is unsupported.

Also I'm pretty sure that the mysql driver *would* be able to support
this if it was written to do so. Most database APIs have support for
batch operations (certainly JDBC drivers for MySQL support this) and
most of them give you access to each raw 'Row Object' that is returned
from a query, so you could replicate what Anton has done for other
databases.

If you're in a driver, then you have to use the async thread pool (or
a private thread pool) or the new NIF process stuff when it comes in,
and then use driver_send_term to send the rows back to the proxy
process (between you and the client) in whatever fashion you like -
you don't have to wait for the result set to be fully available before
returning data to the client. The MySQL native (C/C++) API must
support batch write operations in order for the JDBC libraries built
on top of it to support it.

Same applies if you're using a native (network) protocol - it's how
you design the interaction with the database that determines whether
or not you can do this or not.

CAVEAT: I've never actually used MySQL so I could be wrong about this,
but I'm pretty sure I've seen JDBC batch operations in sample code
that hits MySQL.

On 24 February 2012 16:58, Tom Burdick <thomas.burdick@REDACTED> wrote:
> I was under the impression mysql and sqlite wouldn't be able to
> provide all the features listed here?
>
> -Tom
>
> On Fri, Feb 24, 2012 at 3:13 AM, Tim Watson <watson.timothy@REDACTED> wrote:
>> Cool that sounds excellent. I'm also very much interested in the read
>> performance with the ipgsql call. Because you've got so many new API
>> features, I think this might help inform what a generic DB access API
>> might look like in the future.
>>
>> Very cool stuff.
>>
>> On 24 February 2012 05:18, Anton Lebedevich <mabrek@REDACTED> wrote:
>>> Hello!
>>>
>>> Publicly available benchmarks are in my todo list, haven't done them yet.
>>> At my workplace I have write intensive application that does about 7
>>> inserts/updates per client action and can tolerate some latency (seconds).
>>> My first attempt was to parse all statements and bind/execute them. It
>>> was terribly slow (~20 blocking network roundtrips per client action).
>>> Second attempt was to store all incoming action into long list of
>>> semicolon separated queries while waiting for previous query to finish
>>> then run squery. Performance was acceptable (less than 1 non-blocking
>>> network roundtrip per client action) but there was a room for
>>> improvement (backend parsed same queries over and over).
>>> Third attempt was to parse all statements and run apgsql:execute_batch
>>> per client action. It was 2 times faster than squery approach.
>>>
>>> Regards,
>>> Anton Lebedevich.
>>>
>>> On 02/24/2012 03:41 AM, Tim Watson wrote:
>>>> Anton, this is very cool nice work.
>>>>
>>>> Have you done much benchmarking with it? I have some reasonably sized
>>>> schemas so I'll go play when I get a bit of time, but I can't publish
>>>> them externally. I've been trying to figure out a good way to get hold
>>>> of publicly available data sets in order to automate this, but just
>>>> haven't got around to it.
>>>>
>>>> Anyway I will definitely go play with the ipgsql API. If someone ever
>>>> gets around to writing an EDBC API (which I've been dying to get
>>>> around to but am just too busy for) then streaming delivery of rows is
>>>> a must! :)
>>>>
>>>> Cheers,
>>>>
>>>> Tim
>>>>
>>>> On 23 February 2012 13:28, Anton Lebedevich <mabrek@REDACTED> wrote:
>>>>> Hello!
>>>>>
>>>>> https://github.com/mabrek/epgsql
>>>>> branch named 'async'
>>>>>
>>>>> It's a fork of Will Glozer's epgsql.
>>>>>
>>>>> * Motivation
>>>>>
>>>>>  When you need to execute several queries it involves several network
>>>>>  round-trips between your application and database.
>>>>>  PostgreSQL frontend/backend protocol supports request pipelining.
>>>>>  It means that you don't need to wait for previous command to finish
>>>>>  before sending next command. This version of driver makes full use
>>>>>  of the protocol feature allowing faster execution.
>>>>>
>>>>>
>>>>> * Difference highlights
>>>>>
>>>>>  + 3 API sets: pgsql, apgsql and ipgsql:
>>>>>    pgsql maintains backwards compatibility with original driver API,
>>>>>    apgsql delivers complete results as regular erlang messages,
>>>>>    ipgsql delivers results as messages incrementally (row by row)
>>>>>  + internal queue of client requests, so you don't need to wait for
>>>>> response to send next request
>>>>>  + single process to hold driver state and receive socket data
>>>>>  + execute several prepared statements as a batch
>>>>>  + bind timestamps in erlang:now() format
>>>>>  see CHANGES for full list.
>>>>>
>>>>>
>>>>> Regards,
>>>>> Anton Lebedevich.
>>>>> _______________________________________________
>>>>> erlang-questions mailing list
>>>>> erlang-questions@REDACTED
>>>>> http://erlang.org/mailman/listinfo/erlang-questions
>>>
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions