[erlang-questions] [ANN] Asynchronous PostgreSQL driver, second release

Tim Watson watson.timothy@REDACTED
Mon Feb 27 11:44:21 CET 2012


Thanks Anton, that's very useful to know. I guess that it would be worth my investigating the protocol level support for this if I were planning on using MySQL in anger. I do find it rather entertaining that the default approach to this appears to be gluing together the statements in a semicolon delimited list. :)

On 27 Feb 2012, at 07:31, Anton Lebedevich wrote:

> Yes, MySQL JDBC driver supports executeBatch().
> Depending on configuration it will either rewrite query into single one
> by joining with semicolon and send it as one packet or just execute
> batch as several statements one by one serially.
> I haven't found if current MySQL protocol does support sending several
> command packets without waiting for previous one to execute or not.
> I've heard that MySQL fork Drizzle improved protocol to support
> asynchronous execution.
> 
> On 02/24/2012 11:50 PM, Tim Watson wrote:
>> Good point. The same is true in JDBC - not all drivers provide all
>> features. There is an API to ask the driver what features it does
>> support, and a well defined behaviour when you try to invoke one that
>> is unsupported.
>> 
>> Also I'm pretty sure that the mysql driver *would* be able to support
>> this if it was written to do so. Most database APIs have support for
>> batch operations (certainly JDBC drivers for MySQL support this) and
>> most of them give you access to each raw 'Row Object' that is returned
>> from a query, so you could replicate what Anton has done for other
>> databases.
>> 
>> If you're in a driver, then you have to use the async thread pool (or
>> a private thread pool) or the new NIF process stuff when it comes in,
>> and then use driver_send_term to send the rows back to the proxy
>> process (between you and the client) in whatever fashion you like -
>> you don't have to wait for the result set to be fully available before
>> returning data to the client. The MySQL native (C/C++) API must
>> support batch write operations in order for the JDBC libraries built
>> on top of it to support it.
>> 
>> Same applies if you're using a native (network) protocol - it's how
>> you design the interaction with the database that determines whether
>> or not you can do this or not.
>> 
>> CAVEAT: I've never actually used MySQL so I could be wrong about this,
>> but I'm pretty sure I've seen JDBC batch operations in sample code
>> that hits MySQL.
>> 
>> On 24 February 2012 16:58, Tom Burdick <thomas.burdick@REDACTED> wrote:
>>> I was under the impression mysql and sqlite wouldn't be able to
>>> provide all the features listed here?
>>> 
>>> -Tom
>>> 
>>> On Fri, Feb 24, 2012 at 3:13 AM, Tim Watson <watson.timothy@REDACTED> wrote:
>>>> Cool that sounds excellent. I'm also very much interested in the read
>>>> performance with the ipgsql call. Because you've got so many new API
>>>> features, I think this might help inform what a generic DB access API
>>>> might look like in the future.
>>>> 
>>>> Very cool stuff.
>>>> 
>>>> On 24 February 2012 05:18, Anton Lebedevich <mabrek@REDACTED> wrote:
>>>>> Hello!
>>>>> 
>>>>> Publicly available benchmarks are in my todo list, haven't done them yet.
>>>>> At my workplace I have write intensive application that does about 7
>>>>> inserts/updates per client action and can tolerate some latency (seconds).
>>>>> My first attempt was to parse all statements and bind/execute them. It
>>>>> was terribly slow (~20 blocking network roundtrips per client action).
>>>>> Second attempt was to store all incoming action into long list of
>>>>> semicolon separated queries while waiting for previous query to finish
>>>>> then run squery. Performance was acceptable (less than 1 non-blocking
>>>>> network roundtrip per client action) but there was a room for
>>>>> improvement (backend parsed same queries over and over).
>>>>> Third attempt was to parse all statements and run apgsql:execute_batch
>>>>> per client action. It was 2 times faster than squery approach.
>>>>> 
>>>>> Regards,
>>>>> Anton Lebedevich.
>>>>> 
>>>>> On 02/24/2012 03:41 AM, Tim Watson wrote:
>>>>>> Anton, this is very cool nice work.
>>>>>> 
>>>>>> Have you done much benchmarking with it? I have some reasonably sized
>>>>>> schemas so I'll go play when I get a bit of time, but I can't publish
>>>>>> them externally. I've been trying to figure out a good way to get hold
>>>>>> of publicly available data sets in order to automate this, but just
>>>>>> haven't got around to it.
>>>>>> 
>>>>>> Anyway I will definitely go play with the ipgsql API. If someone ever
>>>>>> gets around to writing an EDBC API (which I've been dying to get
>>>>>> around to but am just too busy for) then streaming delivery of rows is
>>>>>> a must! :)
>>>>>> 
>>>>>> Cheers,
>>>>>> 
>>>>>> Tim
>>>>>> 
>>>>>> On 23 February 2012 13:28, Anton Lebedevich <mabrek@REDACTED> wrote:
>>>>>>> Hello!
>>>>>>> 
>>>>>>> https://github.com/mabrek/epgsql
>>>>>>> branch named 'async'
>>>>>>> 
>>>>>>> It's a fork of Will Glozer's epgsql.
>>>>>>> 
>>>>>>> * Motivation
>>>>>>> 
>>>>>>> When you need to execute several queries it involves several network
>>>>>>> round-trips between your application and database.
>>>>>>> PostgreSQL frontend/backend protocol supports request pipelining.
>>>>>>> It means that you don't need to wait for previous command to finish
>>>>>>> before sending next command. This version of driver makes full use
>>>>>>> of the protocol feature allowing faster execution.
>>>>>>> 
>>>>>>> 
>>>>>>> * Difference highlights
>>>>>>> 
>>>>>>> + 3 API sets: pgsql, apgsql and ipgsql:
>>>>>>>   pgsql maintains backwards compatibility with original driver API,
>>>>>>>   apgsql delivers complete results as regular erlang messages,
>>>>>>>   ipgsql delivers results as messages incrementally (row by row)
>>>>>>> + internal queue of client requests, so you don't need to wait for
>>>>>>> response to send next request
>>>>>>> + single process to hold driver state and receive socket data
>>>>>>> + execute several prepared statements as a batch
>>>>>>> + bind timestamps in erlang:now() format
>>>>>>> see CHANGES for full list.
>>>>>>> 
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Anton Lebedevich.
>>>>>>> _______________________________________________
>>>>>>> erlang-questions mailing list
>>>>>>> erlang-questions@REDACTED
>>>>>>> http://erlang.org/mailman/listinfo/erlang-questions
>>>>> 
>>>> _______________________________________________
>>>> erlang-questions mailing list
>>>> erlang-questions@REDACTED
>>>> http://erlang.org/mailman/listinfo/erlang-questions
> 




More information about the erlang-questions mailing list