Performance: postgresql, slow update, slowdown, trigger

Recently, a customer asked me why a bulk load into a PostgreSQL table kept slowing down as time went by. They saw that it was a trigger on the table that took longer and longer, and in that trigger, the updates were getting slower all the time. Now slow updates are a frequent problem, so I thought it would be interesting to explain in detail what is going on here.

An example that demonstrates updates getting slower in PostgreSQL

Object definitions

Our example scenario is a table that receives bulk loads. Because we frequently need to query the sum of the values in the table, we keep a summary table that is maintained by a trigger (see this article for an explanation):

    CREATE FUNCTION add_to_sum() RETURNS trigger
        ...
        UPDATE sum SET total = total + NEW.value;
        ...

We are using a deferred constraint trigger here, so that the update takes place at the very end of the transaction. Then the row in the sum table is not locked any longer than absolutely necessary, and we get the best possible concurrency. Also, we are careful not to place any index on sum (an index would be tempting as a way to enforce that there is only a single row), so that we get fast HOT updates.

We will bulk load the table with a script that mixes a COPY statement with the data, which is why we need to execute it with psql.

On my machine, this load script takes 60 seconds, which is a terribly long time to load 100000 rows. If I drop the trigger on item, the same script runs in less than 70 milliseconds.

How to prove that the problem is updates getting slower?

The problem clearly is the trigger, and the trigger function contains only a single UPDATE statement. Why does that statement take so long? We can use the auto_explain extension to see details of the execution of the individual UPDATE statements. For that, we add the following lines to postgresql.conf:

    shared_preload_libraries = 'auto_explain'
    auto_explain.log_analyze = on

We need to restart PostgreSQL for the changed shared_preload_libraries setting to take effect.

Note that these are not settings I would want to use on a busy server. For one, they cause all statements to be logged, which can be more than your disk can handle. Also, auto_explain.log_analyze = on will slow down all statements considerably, since PostgreSQL then measures statement execution in detail.

However, if we start the experiment over, we get each of the 100000 UPDATE statements from the trigger logged, and the result is enlightening. Every log entry shows the same query text:

    Query Text: UPDATE sum SET total = total + NEW.value

Then execution gets slower and slower, until the 100000th UPDATE reads:

    Seq Scan on sum  (cost=0.00..38.25 rows=2260 width=14) (actual time=1.637..1.637 rows=1 loops=1)

Explaining the cause of updates getting slower

Row versions keep accumulating

The time is clearly spent in a sequential scan of sum, which must have grown to a size of 443 8kB pages. To understand that, remember that an UPDATE in PostgreSQL does not overwrite the existing row, but adds a new row version to the table.
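Each of the 100000 triggered UPDATEs therefore leaves behind another dead version of the single row in sum, so the table keeps growing and every later sequential scan has more pages to read. This effect can be observed directly; the following query is not from the original article, just a small sketch (assuming the sum table from the example) that shows the table's size and the dead row versions it has accumulated after the load:

    SELECT pg_relation_size('sum') / 8192 AS pages,   -- 8kB pages (443 in the run described above)
           n_live_tup,
           n_dead_tup                                  -- dead row versions left behind by the UPDATEs
    FROM pg_stat_user_tables
    WHERE relname = 'sum';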
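The code listings of the example survive only as fragments above. For readers who want to reproduce the experiment end to end, here is a minimal sketch; it is a reconstruction under assumptions, not the original code: the column layout of item, the generate_series load in place of the COPY-based script, and the extra auto_explain parameters are all guesses that merely fit the description.

    -- Assumed object definitions (a reconstruction, not the original listing)
    CREATE TABLE item (
        value bigint NOT NULL
    );

    CREATE TABLE sum (
        total bigint NOT NULL   -- a single row, deliberately without any index, so updates can be HOT
    );
    INSERT INTO sum (total) VALUES (0);

    CREATE FUNCTION add_to_sum() RETURNS trigger
        LANGUAGE plpgsql AS
    $$BEGIN
        UPDATE sum SET total = total + NEW.value;
        RETURN NEW;
    END;$$;

    -- Deferred constraint trigger: the summary updates run at the very end of the transaction
    CREATE CONSTRAINT TRIGGER add_to_sum
        AFTER INSERT ON item
        DEFERRABLE INITIALLY DEFERRED
        FOR EACH ROW
        EXECUTE FUNCTION add_to_sum();

    -- Assumed auto_explain settings in postgresql.conf (the first line requires a restart):
    --   shared_preload_libraries = 'auto_explain'
    --   auto_explain.log_min_duration = 0          # log every statement
    --   auto_explain.log_analyze = on              # include actual execution times
    --   auto_explain.log_nested_statements = on    # also log the UPDATEs run inside the trigger

    -- Stand-in for the bulk load (the original mixes COPY with inline data and is run through psql)
    BEGIN;
    INSERT INTO item (value)
    SELECT i FROM generate_series(1, 100000) AS i;
    COMMIT;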
Using Postgres.app v2.5.4 and PostgreSQL 14.4. In this database, timestamps shall be set server-side.

The table has a column 'modified_ts' defined as:

    -- Column: modified_ts
    ALTER TABLE IF EXISTS public.people DROP COLUMN IF EXISTS modified_ts;
    ALTER TABLE IF EXISTS public.people ADD COLUMN modified_ts timestamp without time zone;

The trigger function 'update_modified_ts()' is defined as:

    -- FUNCTION: public.update_modified_ts()
    DROP FUNCTION IF EXISTS public.update_modified_ts();
    CREATE OR REPLACE FUNCTION public.update_modified_ts()
        ...
    ALTER FUNCTION public.update_modified_ts()
        ...

The trigger on table 'people' is defined as:

    -- Trigger: update_ts
    DROP TRIGGER IF EXISTS update_ts ON public.people;
    CREATE TRIGGER update_ts
        ...
        EXECUTE FUNCTION public.update_modified_ts();

I don't see what is wrong with the above code. A SQL statement like

    UPDATE people SET lastname = 'Meier' WHERE lastname = 'Mueller';

executes and replaces Mueller by Meier, but the trigger does not fire and 'modified_ts' is not set.
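The excerpts above do not include the body of update_modified_ts() or the trigger's timing clause, so the question cannot be answered from the post alone. For comparison, a commonly used pattern that does set a server-side modification timestamp looks like the following sketch (it is not the poster's code; the BEFORE UPDATE ... FOR EACH ROW timing and the assignment to NEW are the parts that matter):

    -- A typical working pattern for a server-side "last modified" timestamp
    CREATE OR REPLACE FUNCTION public.update_modified_ts() RETURNS trigger
        LANGUAGE plpgsql AS
    $$BEGIN
        NEW.modified_ts := now();   -- change the row version that is about to be written
        RETURN NEW;                 -- returning NULL here would silently skip the UPDATE
    END;$$;

    DROP TRIGGER IF EXISTS update_ts ON public.people;

    CREATE TRIGGER update_ts
        BEFORE UPDATE ON public.people   -- must be BEFORE ... FOR EACH ROW for the change to NEW to take effect
        FOR EACH ROW
        EXECUTE FUNCTION public.update_modified_ts();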