exception exit: timeout in gen_server:call

Tue Dec 3 20:46:41 CET 2019

>
> If it transfers 2 gigabyte of data, then this single postgres query is
> going to take some time.
>

Of course, but that is not the case here. The data is a very small packet.

> If someone is doing updates which require a full table lock on the users
> table, this query is going to take some time.
>

No locks, read only: "select * from users where id = '123'". Writes happen
only on user registration, so they are irrelevant here.

> Other tricks:
>
> * If your initial intuitive drill down into the system bears no fruit,
> start caring about facts.
> * Measure the maximal latency over the equery call you've made for a 10-15
> second period. Plot it.
> * We are interested in microstutters in the pacing. If they are present,
> it is likely there is some problem which then suddenly tips the system
> over. If not, then it is more likely that it is something we don't know.
> * The database might be fast, but there is still latency to the first
> byte, and there is the transfer time to the last byte. If a query is 50ms,
> say, then you are only going to run 20 of those per connection.
> * Pipeline the queries. A query which waits for an answer affects every
> sibling query as well.
>
> Down the line:
>
> * Postgres can log slow queries. Turn that on.
> * Postgres can log whenever it holds a lock for more than a certain time
> window. Turn that on.
>
> Narrow down where the problem can occur by having systems provide facts to
> you. Don't go for "what is wrong?" Go for "What would I like to know?".
> This helps observability (In the Control Theory / Charity Majors sense).
>

I did most of those. There are no slow queries; the database is practically
idle. The issue is the latency of the responses and the variance of that
latency.
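
For reference, a minimal sketch of the measurement suggested above. The
module name latency_probe and its functions are hypothetical; Fun stands in
for whatever wraps the real equery call. It times one call, takes the
maximum over a window of samples, and gives the upper bound on sequential
queries per second (the "50ms means 20 per connection" arithmetic).

```erlang
-module(latency_probe).
-export([timed/1, max_latency/1, max_qps/2]).

%% Run Fun and return {ElapsedMicros, Result}.
%% Fun is a placeholder for the real database call (assumption: a
%% zero-arity fun wrapping the query on a pooled connection).
timed(Fun) when is_function(Fun, 0) ->
    T0 = erlang:monotonic_time(microsecond),
    Result = Fun(),
    T1 = erlang:monotonic_time(microsecond),
    {T1 - T0, Result}.

%% Largest sample (in microseconds) collected during one window,
%% e.g. 10-15 seconds. Returns undefined for an empty window.
max_latency([]) -> undefined;
max_latency(Samples) -> lists:max(Samples).

%% Upper bound on queries per second when each connection runs one
%% query at a time and every query takes LatencyMs milliseconds.
%% max_qps(1, 50) =:= 20.0
max_qps(Connections, LatencyMs) when LatencyMs > 0 ->
    Connections * 1000 / LatencyMs.
```

Collecting the elapsed times from timed/1 per window and plotting
max_latency/1 for each window should make any microstutter visible.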

You nailed it at first: it's a matter of flow control. Simply put, my HTTP
server is faster than the queries to the database (due to the
extra latency), even though prepared statements are used. This never occurs
on other installations where databases are local, so I simply
underestimated this aspect.

As per my previous statement, I'm going to add a flow restriction by which
if there are no available database workers in the pool, I'll reject the
HTTP call (I guess with a 503 or similar).
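
A rough sketch of what I have in mind, under assumptions: db_gate is a
hypothetical standalone module, not the API of the pool I actually use, and
Max would be set to the number of database workers. It admits at most Max
concurrent calls and never blocks; when all slots are taken the caller gets
{error, 503} right away and the HTTP handler can turn that into the
response.

```erlang
-module(db_gate).
-export([new/1, try_acquire/1, release/1, with_worker/2]).

%% Create a gate that admits at most Max concurrent database calls.
%% The counter lives in an atomics array (requires OTP 21.2 or later).
new(Max) when is_integer(Max), Max > 0 ->
    Ref = atomics:new(1, [{signed, true}]),
    {Ref, Max}.

%% Non-blocking: ok if a slot was free, busy otherwise.
try_acquire({Ref, Max}) ->
    case atomics:add_get(Ref, 1, 1) of
        N when N =< Max ->
            ok;
        _ ->
            %% Over the limit: undo the increment and refuse.
            atomics:sub(Ref, 1, 1),
            busy
    end.

%% Give the slot back.
release({Ref, _Max}) ->
    atomics:sub(Ref, 1, 1),
    ok.

%% Run Fun if a slot is free, releasing the slot even if Fun crashes.
%% Otherwise reject immediately instead of queueing the request.
with_worker(Gate, Fun) when is_function(Fun, 0) ->
    case try_acquire(Gate) of
        ok ->
            try {ok, Fun()} after release(Gate) end;
        busy ->
            {error, 503}
    end.
```

The point of rejecting instead of queueing is that callers no longer pile
up behind gen_server:call until it times out; the overload is pushed back
to the HTTP client.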

Thank you for your help,
r.
