I have a routine table and a table-creation script that I put in /tmp/try.sql. I am using Postgres on OSX 10.10.2.
Here is the table:
cow_dev=# \d carc
                                    Table "public.carc"
    Column     |     Type     |                         Modifiers
---------------+--------------+------------------------------------------------------------
 internal_id   | integer      | not null default nextval('carc_internal_id_seq'::regclass)
 acct_nbr      | character(6) |
 birth_date    | date         |
 anm_key       | integer      |
 slghtr_dt     | date         |
 sire_assoc_id | integer      |
 sire_reg      | text         |
 dam_assoc_id  | integer      |
 dam_reg       | text         |
 sex           | text         |
 carc_kphf_pct | real         |
 carc_wt       | integer      |
 marbling      | integer      |
 ribeye_area   | real         |
 usda_qlty_grd | smallint     |
 act_fat_thick | real         |
 carcass_group | text         |
 maturity      | integer      |
Indexes:
    "idx_a649568319866892fcdd3742289e8294" PRIMARY KEY, btree (internal_id)
    "idx_d4659e9f750ff68c75e11e89a950e386" btree (anm_key)
    "idx_d57025002b9ce54cf590b76e87c10cac" btree (acct_nbr)
Foreign-key constraints:
    "carc_acct_nbr__acct_acct_nbr_fk" FOREIGN KEY (acct_nbr) REFERENCES acct(acct_nbr)
    "carc_anm_key__anm_anm_key_fk" FOREIGN KEY (anm_key) REFERENCES anm(anm_key)
Here is the script:
create temp table carc_out as
with assoc_map as
    (select
        a.assoc_id,
        c.code_3 || a.brd_cd_id mb_name
    from
        assoc a
        join country_code c on c.code_2 = a.country_code_2)
select
    c.internal_id,
    n.mb_assoc_reg anm_nbr,
    c.act_fat_thick,
    c.carc_kphf_pct,
    c.carcass_group,
    c.carc_wt,
    null::float carc_yld,
    c.marbling,
    c.maturity,
    c.ribeye_area,
    to_char(c.slghtr_dt, 'mmddyyyy') slghtr_dt,
    c.usda_qlty_grd
from
    carc c
    left join z2_raa_numbers n using (anm_key)
    left join assoc_map da on da.assoc_id = coalesce(c.dam_assoc_id, 1)
    left join assoc_map sa on sa.assoc_id = coalesce(c.sire_assoc_id, 1);
drop table carc_out;
Here are the results of running it twice: the first run without c.usda_qlty_grd in the query, the second with it included.
> time psql -U cow_peer -d cow_dev -c '\i /tmp/try.sql'
SELECT 25332
DROP TABLE
real 0m3.041s
user 0m0.004s
sys 0m0.004s
10:57:46 521 0 tom@angus-2 ~
> time psql -U cow_peer -d cow_dev -c '\i /tmp/try.sql'
SELECT 25332
DROP TABLE
real 3m1.958s
user 0m0.004s
sys 0m0.004s
As you can see, one run takes 3 seconds and the other takes 3 minutes. This is repeatable and at least somewhat independent of which additional column you add to the query. Interestingly, the user and system times do not appear to change. The execution plans are identical. I am at a loss.
Here is the query plan without the column:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=1220.51..122913.55 rows=25332 width=54) (actual time=48.925..3254.720 rows=25332 loops=1)
Output: c.internal_id, n.mb_assoc_reg, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, NULL::double precision, c.marbling, c.maturity, c.ribeye_area, to_char((c.slghtr_dt)::timestamp with time zone, 'mmddyyyy'::text)
Hash Cond: (COALESCE(c.sire_assoc_id, 1) = sa.assoc_id)
CTE assoc_map
-> Hash Join (cost=7.54..10.16 rows=52 width=11) (actual time=0.271..0.392 rows=52 loops=1)
Output: a.assoc_id, ((c_1.code_3)::text || (a.brd_cd_id)::text)
Hash Cond: (a.country_code_2 = c_1.code_2)
-> Seq Scan on public.assoc a (cost=0.00..1.52 rows=52 width=10) (actual time=0.003..0.012 rows=52 loops=1)
Output: a.assoc_id, a.assoc_code, a.priority, a.csu_priority, a.mb_priority, a.description, a.country_code_2, a.brd_cd_id
-> Hash (cost=4.46..4.46 rows=246 width=7) (actual time=0.250..0.250 rows=246 loops=1)
Output: c_1.code_3, c_1.code_2
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Seq Scan on public.country_code c_1 (cost=0.00..4.46 rows=246 width=7) (actual time=0.004..0.114 rows=246 loops=1)
Output: c_1.code_3, c_1.code_2
-> Hash Left Join (cost=1208.66..122614.19 rows=25332 width=58) (actual time=48.867..3220.637 rows=25332 loops=1)
Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.sire_assoc_id, n.mb_assoc_reg
Hash Cond: (COALESCE(c.dam_assoc_id, 1) = da.assoc_id)
-> Hash Right Join (cost=1206.97..122451.64 rows=25332 width=62) (actual time=48.380..3204.285 rows=25332 loops=1)
Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.dam_assoc_id, c.sire_assoc_id, n.mb_assoc_reg
Hash Cond: (n.anm_key = c.anm_key)
-> Seq Scan on public.z2_raa_numbers n (cost=0.00..61034.69 rows=3330869 width=19) (actual time=0.057..937.277 rows=3330869 loops=1)
Output: n.mb_assoc_reg, n.anm_key
-> Hash (cost=642.32..642.32 rows=25332 width=51) (actual time=46.444..46.444 rows=25332 loops=1)
Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.anm_key, c.dam_assoc_id, c.sire_assoc_id
Buckets: 2048 Batches: 128 (originally 4) Memory Usage: 1025kB
-> Seq Scan on public.carc c (cost=0.00..642.32 rows=25332 width=51) (actual time=0.007..18.372 rows=25332 loops=1)
Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.anm_key, c.dam_assoc_id, c.sire_assoc_id
-> Hash (cost=1.04..1.04 rows=52 width=4) (actual time=0.461..0.461 rows=52 loops=1)
Output: da.assoc_id
Buckets: 1024 Batches: 1 Memory Usage: 2kB
-> CTE Scan on assoc_map da (cost=0.00..1.04 rows=52 width=4) (actual time=0.275..0.436 rows=52 loops=1)
Output: da.assoc_id
-> Hash (cost=1.04..1.04 rows=52 width=4) (actual time=0.042..0.042 rows=52 loops=1)
Output: sa.assoc_id
Buckets: 1024 Batches: 1 Memory Usage: 2kB
-> CTE Scan on assoc_map sa (cost=0.00..1.04 rows=52 width=4) (actual time=0.001..0.018 rows=52 loops=1)
Output: sa.assoc_id
Total runtime: 3256.486 ms
(38 rows)
Here is the query plan with the column:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=1220.51..122913.55 rows=25332 width=56) (actual time=11588.296..179838.116 rows=25332 loops=1)
Output: c.internal_id, n.mb_assoc_reg, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, NULL::double precision, c.marbling, c.maturity, c.ribeye_area, to_char((c.slghtr_dt)::timestamp with time zone, 'mmddyyyy'::text), c.usda_qlty_grd
Hash Cond: (COALESCE(c.sire_assoc_id, 1) = sa.assoc_id)
CTE assoc_map
-> Hash Join (cost=7.54..10.16 rows=52 width=11) (actual time=0.211..0.314 rows=52 loops=1)
Output: a.assoc_id, ((c_1.code_3)::text || (a.brd_cd_id)::text)
Hash Cond: (a.country_code_2 = c_1.code_2)
-> Seq Scan on public.assoc a (cost=0.00..1.52 rows=52 width=10) (actual time=0.002..0.015 rows=52 loops=1)
Output: a.assoc_id, a.assoc_code, a.priority, a.csu_priority, a.mb_priority, a.description, a.country_code_2, a.brd_cd_id
-> Hash (cost=4.46..4.46 rows=246 width=7) (actual time=0.186..0.186 rows=246 loops=1)
Output: c_1.code_3, c_1.code_2
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Seq Scan on public.country_code c_1 (cost=0.00..4.46 rows=246 width=7) (actual time=0.003..0.065 rows=246 loops=1)
Output: c_1.code_3, c_1.code_2
-> Hash Left Join (cost=1208.66..122614.19 rows=25332 width=60) (actual time=11588.203..179739.516 rows=25332 loops=1)
Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.usda_qlty_grd, c.sire_assoc_id, n.mb_assoc_reg
Hash Cond: (COALESCE(c.dam_assoc_id, 1) = da.assoc_id)
-> Hash Right Join (cost=1206.97..122451.64 rows=25332 width=64) (actual time=11587.812..179701.171 rows=25332 loops=1)
Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.usda_qlty_grd, c.dam_assoc_id, c.sire_assoc_id, n.mb_assoc_reg
Hash Cond: (n.anm_key = c.anm_key)
-> Seq Scan on public.z2_raa_numbers n (cost=0.00..61034.69 rows=3330869 width=19) (actual time=0.170..1052.251 rows=3330869 loops=1)
Output: n.mb_assoc_reg, n.anm_key
-> Hash (cost=642.32..642.32 rows=25332 width=53) (actual time=96.305..96.305 rows=25332 loops=1)
Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.usda_qlty_grd, c.anm_key, c.dam_assoc_id, c.sire_assoc_id
Buckets: 2048 Batches: 65536 (originally 4) Memory Usage: 1028kB
-> Seq Scan on public.carc c (cost=0.00..642.32 rows=25332 width=53) (actual time=0.007..15.858 rows=25332 loops=1)
Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.usda_qlty_grd, c.anm_key, c.dam_assoc_id, c.sire_assoc_id
-> Hash (cost=1.04..1.04 rows=52 width=4) (actual time=0.371..0.371 rows=52 loops=1)
Output: da.assoc_id
Buckets: 1024 Batches: 1 Memory Usage: 2kB
-> CTE Scan on assoc_map da (cost=0.00..1.04 rows=52 width=4) (actual time=0.214..0.354 rows=52 loops=1)
Output: da.assoc_id
-> Hash (cost=1.04..1.04 rows=52 width=4) (actual time=0.045..0.045 rows=52 loops=1)
Output: sa.assoc_id
Buckets: 1024 Batches: 1 Memory Usage: 2kB
-> CTE Scan on assoc_map sa (cost=0.00..1.04 rows=52 width=4) (actual time=0.001..0.018 rows=52 loops=1)
Output: sa.assoc_id
Total runtime: 179844.270 ms
(38 rows)
It is the hash right join that takes so much longer. I wonder whether the number of columns crosses a threshold that causes this. You should probably take this to the Postgres performance mailing list.
work_mem is very limiting in the second case; no idea why (if the planner's tuple width estimate is correct).
Answers:
Short answer: you need a somewhat larger work_mem. Try set work_mem in your session.
Explanation:
Compare these two steps:
Buckets: 2048 Batches: 128 (originally 4) Memory Usage: 1025kB
and
Buckets: 2048 Batches: 65536 (originally 4) Memory Usage: 1028kB
You can see that the slower query uses 65536 batches and the faster one 128. This is because the batch with the additional field does not fit in work_mem, so it has to be split into smaller batches.
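A minimal sketch of trying this in a psql session, for illustration only. The value '64MB' is an assumption chosen for the example, not a figure from this thread; pick something appropriate for your machine. The setting has to be changed in the same session that runs the script, so the commands are shown together:

```sql
-- Show the current per-operation memory limit for this session.
SHOW work_mem;

-- Raise the limit for this session only.
-- '64MB' is an assumed, illustrative value, not one taken from the thread.
SET work_mem = '64MB';

-- Re-run the script from the question (\i is a psql meta-command).
\i /tmp/try.sql

-- Go back to the configured default afterwards.
RESET work_mem;
```

If the larger setting helps, the "Batches" figure on the Hash node over carc should drop well below 65536 when the plan is inspected again.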
work_mem is trivial to change; it can be configured per session. Run explain (analyze,buffers) and you will see what I am saying.
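As a rough sketch of that suggestion, the statement below runs EXPLAIN with the ANALYZE and BUFFERS options on a cut-down version of the query. The reduction to a single join is an assumption made to keep the example short; it may not reproduce the exact plan from the question, so the full select from /tmp/try.sql can be substituted instead.

```sql
-- Cut-down version of the question's query (an illustrative assumption):
-- only the join against z2_raa_numbers, which is the slow hash right join.
EXPLAIN (ANALYZE, BUFFERS)
select
    c.internal_id,
    n.mb_assoc_reg anm_nbr,
    c.usda_qlty_grd
from
    carc c
    left join z2_raa_numbers n using (anm_key);
```

With BUFFERS enabled, each plan node also reports block counts, including temp blocks read and written, which shows how much of the hash join spilled to temporary files.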