sob wrote:
I'm logging queries > 20 ms in PostgreSQL, and the queries that PHP logs as >200ms (for instance) don't even show up as >20ms in the PostgreSQL logs. That points toward contention for host I/O, right?
Actually, no, that argues for no contention, unless your PHP script is also doing a lot of I/O. In other words, the database was able to do all of its work (including I/O) for every query made while that logging threshold was in place without exceeding the threshold, which means that of those 200ms, less than 10% was spent in the database and disk I/O.
I suppose if I were you I might very briefly drop the threshold a bit further - to something that some queries are bound to exceed - just to make sure the logging is in fact working.
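For reference, the logging threshold above corresponds to a postgresql.conf setting along these lines (a sketch; the exact values are whatever you have configured):

```
# postgresql.conf -- log any statement that runs longer than 20 ms
log_min_duration_statement = 20    # milliseconds; 0 logs everything, -1 disables

# To verify logging is working at all, temporarily drop it to something
# almost everything will exceed, e.g.:
# log_min_duration_statement = 1
```

A minute or two at the tiny value should confirm that statements are actually reaching the log, after which you can put the real threshold back.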
What sort of processing is the PHP code doing - are there any potential bottlenecks or ways it could get delayed?
Quote:
I'm not sure what to make of such debug:
Code:
2011-05-09 16:30:00 EDT LOG: checkpoint complete: wrote 194 buffers (6.3%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=38.671 s, sync=0.002 s, total=38.675 s
I have 3 segments per checkpoint, of 16 MB each. So does this mean the 2*16MB write was spread over 38s?
Not quite. You get a checkpoint when either checkpoint_segments WAL segments have been written or checkpoint_timeout has elapsed (whichever comes first). But how much data needs to be forced out at checkpoint time depends on how well the background writer has been keeping up with the actual changes.
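Those two triggers correspond to two postgresql.conf settings. As a sketch, the segment count below matches the 3-segment setup you quoted, while the 5min timeout is PostgreSQL's default and an assumption about your installation:

```
# postgresql.conf -- a checkpoint starts when EITHER condition is met
checkpoint_segments = 3       # after 3 x 16 MB WAL segments have been filled
checkpoint_timeout = 5min     # or after this much time since the last checkpoint
```

So on a quiet system you'll see time-triggered checkpoints with very little to write; on a busy one, segment-triggered checkpoints.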
See
http://developer.postgresql.org/pgdocs/ ... ation.html for some details (adjust as needed for your PostgreSQL release), and
http://www.westnet.com/~gsmith/content/ ... bgw-83.htm has some useful stuff from when the newer checkpoint logic was being released.
At the point of a checkpoint, any remaining data that hasn't made it to the actual table/index files is flushed, at which point the WAL records are no longer needed for recovery (a checkpoint record is written to the WAL at that point). It is also at that point that earlier WAL segment files are permitted to be re-used.
In terms of the checkpoint, the log line indicates how many remaining buffers were dirty and needed to be written, and the overall time taken to write them. The write phase aims to spread those writes over a fraction of the checkpoint interval given by the configured completion target. This helps spread the I/O overhead.
So in your case, at the checkpoint time you had 194 remaining buffers to flush. I think a buffer is 8K, so that's about 1.5MB of data. The I/O was spread over 38 seconds. And the final fsync to ensure it was on the disk took almost no time. That latter point in particular is a good indication that there is little contention, since otherwise you'd block there waiting for the disk to fully acknowledge the sync.
If you wanted to shrink the checkpoint spike even further you could increase the checkpoint frequency (by dropping either the timeout or the segment limit), so each checkpoint would most likely have less dirty data to flush. But this is already a pretty tiny amount of data.
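As a sketch, more frequent checkpoints plus a spread target would look something like this. The specific values are illustrative only, not recommendations, and checkpoint_completion_target only exists in 8.3 and later:

```
# postgresql.conf -- hypothetical: checkpoint more often, spread the writes
checkpoint_timeout = 2min               # down from the default 5min
checkpoint_completion_target = 0.9      # finish writes within ~90% of the interval
```

The trade-off is more total WAL/checkpoint overhead in exchange for smaller individual I/O bursts; with only ~1.5MB per checkpoint, there's not much to gain here.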
To be honest, if the database isn't logging any queries beyond 20ms, I'm not sure your issue is at the database level at all. You might need to work your way up the stack, perhaps adding some intermediate logging in your PHP code to determine where the remaining 180+ms is going in those longer queries. To keep the monitoring unintrusive, you could just store a timestamp at various points in the code path and then have a single top-level handler dump all the timestamps/deltas as part of some final output/log.
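The timestamp idea could be sketched roughly like this in PHP. The checkpoint labels and the stand-in work are made up for illustration; in a real app you'd call mark() around your actual query and processing steps and send the output to your log:

```php
<?php
// Minimal timing helper: record named checkpoints as (label, time) pairs,
// then dump all the deltas once at the end of the request.
$timings = array();

function mark($label) {
    global $timings;
    $timings[] = array($label, microtime(true));
}

function dump_timings() {
    global $timings;
    $lines = array();
    for ($i = 1; $i < count($timings); $i++) {
        $delta_ms = ($timings[$i][1] - $timings[$i - 1][1]) * 1000.0;
        $lines[] = sprintf("%s -> %s: %.1f ms",
            $timings[$i - 1][0], $timings[$i][0], $delta_ms);
    }
    // In a real app this would go to your log, not stdout.
    echo implode("\n", $lines), "\n";
}

mark('start');
// ... run the query here ...
usleep(5000);          // stand-in for real work (~5 ms)
mark('after_query');
// ... post-processing here ...
mark('done');

dump_timings();
```

Comparing the query delta here against PostgreSQL's own log times would show directly whether the missing 180+ms is spent in PHP, in the driver, or in transit.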
This assumes that you've definitely caught one of those 1000ms+ queries that started the thread. If not, then I suppose those extreme cases might still be contention within the database layer.
-- David