阅读须知:
1、本案例是某金融机构的报表系统使用达梦数据库的报表统计SQL,不能拿出真实 SQL + 数据,笔者自己在ORACLE模拟的数据。
2、笔者对达梦数据库的hint不熟悉,该sql案例注重于优化思路和改写思路,如果使用hint干预速度将会更快。
上周六和DM哥们去优化金融项目SQL,遇到个比较有意思的案例,SQL截图如下:

A表和B表关联了8次,每次关联结果不一样,用来聚合运算,A、B表都是千万级别大表,使用了 parallel(4) 都需要跑50多秒。
模拟案例表和数据量:

两张表都是千万级大表,比较符合生产实际情况。
-- 原始SQL: SELECT /*+ gather_plan_statistics */ /*query111*/ s.STU_CLASSID, count(S1.STU_NAME) S1_name , count(S2.STU_NAME) S2_name , count(S3.STU_NAME) S3_name , count(S4.STU_NAME) S4_name , count(S5.STU_NAME) S5_name , count(S6.STU_NAME) S6_name , count(S7.STU_NAME) S7_name , count(S8.STU_NAME) S8_name from STUDENT s LEFT JOIN STUDENT2 S1 on s.STU_NAME = S1.STU_NAME and S1.STU_CLASSID = 1 LEFT JOIN STUDENT2 S2 on s.STU_NAME = S2.STU_NAME and S2.STU_CLASSID = 2 LEFT JOIN STUDENT2 S3 on s.STU_NAME = S3.STU_NAME and S3.STU_CLASSID = 3 LEFT JOIN STUDENT2 S4 on s.STU_NAME = S4.STU_NAME and S4.STU_CLASSID = 4 LEFT JOIN STUDENT2 S5 on s.STU_NAME = S5.STU_NAME and S5.STU_CLASSID = 5 LEFT JOIN STUDENT2 S6 on s.STU_NAME = S6.STU_NAME and S6.STU_CLASSID = 6 LEFT JOIN STUDENT2 S7 on s.STU_NAME = S7.STU_NAME and S7.STU_CLASSID = 7 LEFT JOIN STUDENT2 S8 on s.STU_NAME = S8.STU_NAME and S8.STU_CLASSID = 8 GROUP BY s.STU_CLASSID;
--执行计划: Plan hash value: 42163929 ------------------------------------------------------------------------------------------------------------------------------------------------------------- "| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |" ------------------------------------------------------------------------------------------------------------------------------------------------------------- "| 0 | SELECT STATEMENT | | 1 | | 300 |00:00:57.08 | 866K| 141K| | | |" "| 1 | HASH GROUP BY | | 1 | 300 | 300 |00:00:57.08 | 866K| 141K| 2697M| 38M| 1352K (0)|" "|* 2 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:53.69 | 866K| 141K| 5659K| 2517K| 6411K (0)|" "|* 3 | INDEX FAST FULL SCAN | IDX_STUDENT2_STU_NAME_CLASSID | 1 | 1768 | 100K|00:00:00.66 | 90599 | 0 | | | |" "|* 4 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:47.90 | 775K| 141K| 5659K| 2517K| 5427K (0)|" "|* 5 | INDEX FAST FULL SCAN | IDX_STUDENT2_STU_NAME_CLASSID | 1 | 1768 | 99373 |00:00:00.65 | 90599 | 0 | | | |" "|* 6 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:42.14 | 685K| 141K| 5902K| 2624K| 5432K (0)|" "|* 7 | INDEX FAST FULL SCAN | IDX_STUDENT2_STU_NAME_CLASSID | 1 | 1768 | 99511 |00:00:00.65 | 90599 | 0 | | | |" "|* 8 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:36.40 | 594K| 141K| 5659K| 2517K| 5392K (0)|" "|* 9 | INDEX FAST FULL SCAN | IDX_STUDENT2_STU_NAME_CLASSID | 1 | 1768 | 99782 |00:00:00.66 | 90599 | 0 | | | |" "|* 10 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:30.36 | 504K| 141K| 5659K| 2517K| 13M (0)|" "|* 11 | INDEX FAST FULL SCAN | IDX_STUDENT2_STU_NAME_CLASSID | 1 | 487K| 99809 |00:00:00.66 | 90599 | 0 | | | |" "|* 12 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:22.13 | 413K| 141K| 5659K| 2517K| 14M (0)|" "|* 13 | INDEX FAST FULL SCAN | IDX_STUDENT2_STU_NAME_CLASSID | 1 | 484K| 100K|00:00:00.69 | 90599 | 0 | | | |" "|* 14 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:13.87 | 322K| 141K| 5659K| 2517K| 5411K (0)|" "|* 15 | INDEX FAST FULL SCAN | IDX_STUDENT2_STU_NAME_CLASSID | 1 | 1768 | 100K|00:00:00.67 | 90599 | 0 | | | |" "|* 16 | HASH JOIN RIGHT OUTER| | 1 | 26M| 26M|00:00:07.88 | 232K| 141K| 5738K| 2624K| 5422K (0)|" "|* 17 | INDEX FAST FULL SCAN| IDX_STUDENT2_STU_NAME_CLASSID | 1 | 1768 | 99925 |00:00:00.67 | 90599 | 0 | | | |" "| 18 | TABLE ACCESS FULL | STUDENT | 1 | 26M| 26M|00:00:01.62 | 141K| 141K| | | |" ------------------------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- " 2 - access(""S"".""STU_NAME""=""S8"".""STU_NAME"")" " 3 - filter(""S8"".""STU_CLASSID""=8)" " 4 - access(""S"".""STU_NAME""=""S7"".""STU_NAME"")" " 5 - filter(""S7"".""STU_CLASSID""=7)" " 6 - access(""S"".""STU_NAME""=""S5"".""STU_NAME"")" " 7 - filter(""S5"".""STU_CLASSID""=5)" " 8 - access(""S"".""STU_NAME""=""S3"".""STU_NAME"")" " 9 - filter(""S3"".""STU_CLASSID""=3)" " 10 - access(""S"".""STU_NAME""=""S4"".""STU_NAME"")" " 11 - filter(""S4"".""STU_CLASSID""=4)" " 12 - access(""S"".""STU_NAME""=""S6"".""STU_NAME"")" " 13 - filter(""S6"".""STU_CLASSID""=6)" " 14 - access(""S"".""STU_NAME""=""S2"".""STU_NAME"")" " 15 - filter(""S2"".""STU_CLASSID""=2)" " 16 - access(""S"".""STU_NAME""=""S1"".""STU_NAME"")" " 17 - filter(""S1"".""STU_CLASSID""=1)" Note ----- - dynamic sampling used for this statement (level=2)
跑了57s,慢得一批,如果DM有并行HASH的hint ,本案例将会秒杀。
优化方式1,将 STUDENT2 转换成分区表,利用分区裁剪思路进行优化(需要提前和业务开发沟通,是否会影响其他SQL语句)
-- Create table create table STUDENT_PATITION ( stu_id NUMBER not null, stu_name VARCHAR2(200), stu_sex VARCHAR2(1), stu_age NUMBER, stu_date DATE, stu_classid NUMBER, stu_sal NUMBER ) partition by list (STU_CLASSID) ( partition PARTITION_STUCLASSID2_1 values (1), partition PARTITION_STUCLASSID_2 values (2), partition PARTITION_STUCLASSID_3 values (3), partition PARTITION_STUCLASSID_4 values (4), partition PARTITION_STUCLASSID_5 values (5), partition PARTITION_STUCLASSID_6 values (6), partition PARTITION_STUCLASSID_7 values (7), partition PARTITION_STUCLASSID_8 values (8),
.......
)
使用 STU_CLASSID 作为分区列。
create index IDX_STUDENT3_LOCAL_1_2 on STUDENT_PATITION (STU_CLASSID, STU_NAME) nologging local;
再创建个本地索引。
-- 使用分区表
SELECT /*+ gather_plan_statistics */ /*query222*/ s.STU_CLASSID, count(S1.STU_NAME) S1_name , count(S2.STU_NAME) S2_name , count(S3.STU_NAME) S3_name , count(S4.STU_NAME) S4_name , count(S5.STU_NAME) S5_name , count(S6.STU_NAME) S6_name , count(S7.STU_NAME) S7_name , count(S8.STU_NAME) S8_name from STUDENT s LEFT JOIN STUDENT_PATITION S1 on s.STU_NAME = S1.STU_NAME and S1.STU_CLASSID = 1 LEFT JOIN STUDENT_PATITION S2 on s.STU_NAME = S2.STU_NAME and S2.STU_CLASSID = 2 LEFT JOIN STUDENT_PATITION S3 on s.STU_NAME = S3.STU_NAME and S3.STU_CLASSID = 3 LEFT JOIN STUDENT_PATITION S4 on s.STU_NAME = S4.STU_NAME and S4.STU_CLASSID = 4 LEFT JOIN STUDENT_PATITION S5 on s.STU_NAME = S5.STU_NAME and S5.STU_CLASSID = 5 LEFT JOIN STUDENT_PATITION S6 on s.STU_NAME = S6.STU_NAME and S6.STU_CLASSID = 6 LEFT JOIN STUDENT_PATITION S7 on s.STU_NAME = S7.STU_NAME and S7.STU_CLASSID = 7 LEFT JOIN STUDENT_PATITION S8 on s.STU_NAME = S8.STU_NAME and S8.STU_CLASSID = 8 GROUP BY s.STU_CLASSID
-- 分区表执行计划 Plan hash value: 1458661267 ------------------------------------------------------------------------------------------------------------------------------------------------------- "| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |" ------------------------------------------------------------------------------------------------------------------------------------------------------- "| 0 | SELECT STATEMENT | | 1 | | 300 |00:00:48.33 | 144K| 141K| | | |" "| 1 | HASH GROUP BY | | 1 | 300 | 300 |00:00:48.33 | 144K| 141K| 2697M| 38M| 3855K (0)|" "|* 2 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:45.00 | 144K| 141K| 5659K| 2517K| 6820K (0)|" "| 3 | PARTITION LIST SINGLE | | 1 | 100K| 100K|00:00:00.01 | 302 | 0 | | | |" "| 4 | INDEX FAST FULL SCAN | IDX_STUDENT3_LOCAL_1_2 | 1 | 100K| 100K|00:00:00.01 | 302 | 0 | | | |" "|* 5 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:39.44 | 143K| 141K| 5659K| 2517K| 6830K (0)|" "| 6 | PARTITION LIST SINGLE | | 1 | 100K| 100K|00:00:00.01 | 301 | 0 | | | |" "| 7 | INDEX FAST FULL SCAN | IDX_STUDENT3_LOCAL_1_2 | 1 | 100K| 100K|00:00:00.01 | 301 | 0 | | | |" "|* 8 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:33.87 | 143K| 141K| 5659K| 2517K| 5806K (0)|" "| 9 | PARTITION LIST SINGLE | | 1 | 100K| 100K|00:00:00.01 | 301 | 0 | | | |" "| 10 | INDEX FAST FULL SCAN | IDX_STUDENT3_LOCAL_1_2 | 1 | 100K| 100K|00:00:00.01 | 301 | 0 | | | |" "|* 11 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:28.67 | 143K| 141K| 5659K| 2517K| 5809K (0)|" "| 12 | PARTITION LIST SINGLE | | 1 | 99925 | 99925 |00:00:00.01 | 300 | 0 | | | |" "| 13 | INDEX FAST FULL SCAN | IDX_STUDENT3_LOCAL_1_2 | 1 | 99925 | 99925 |00:00:00.01 | 300 | 0 | | | |" "|* 14 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:23.47 | 142K| 141K| 5659K| 2517K| 5811K (0)|" "| 15 | PARTITION LIST SINGLE | | 1 | 99809 | 99809 |00:00:00.01 | 300 | 0 | | | |" "| 16 | INDEX FAST FULL SCAN | IDX_STUDENT3_LOCAL_1_2 | 1 | 99809 | 99809 |00:00:00.01 | 300 | 0 | | | |" "|* 17 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:18.30 | 142K| 141K| 5659K| 2517K| 5808K (0)|" "| 18 | PARTITION LIST SINGLE | | 1 | 99782 | 99782 |00:00:00.01 | 300 | 0 | | | |" "| 19 | INDEX FAST FULL SCAN | IDX_STUDENT3_LOCAL_1_2 | 1 | 99782 | 99782 |00:00:00.01 | 300 | 0 | | | |" "|* 20 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:12.92 | 142K| 141K| 5902K| 2624K| 5815K (0)|" "| 21 | PARTITION LIST SINGLE | | 1 | 99511 | 99511 |00:00:00.01 | 299 | 0 | | | |" "| 22 | INDEX FAST FULL SCAN | IDX_STUDENT3_LOCAL_1_2 | 1 | 99511 | 99511 |00:00:00.01 | 299 | 0 | | | |" "|* 23 | HASH JOIN RIGHT OUTER | | 1 | 26M| 26M|00:00:07.13 | 141K| 141K| 5574K| 2624K| 5799K (0)|" "| 24 | PARTITION LIST SINGLE| | 1 | 99373 | 99373 |00:00:00.01 | 299 | 0 | | | |" "| 25 | INDEX FAST FULL SCAN| IDX_STUDENT3_LOCAL_1_2 | 1 | 99373 | 99373 |00:00:00.01 | 299 | 0 | | | |" "| 26 | TABLE ACCESS FULL | STUDENT | 1 | 26M| 26M|00:00:01.59 | 141K| 141K| | | |" ------------------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- " 2 - access(""S"".""STU_NAME""=""S8"".""STU_NAME"")" " 5 - access(""S"".""STU_NAME""=""S6"".""STU_NAME"")" " 8 - access(""S"".""STU_NAME""=""S2"".""STU_NAME"")" " 11 - access(""S"".""STU_NAME""=""S1"".""STU_NAME"")" " 14 - access(""S"".""STU_NAME""=""S4"".""STU_NAME"")" " 17 - access(""S"".""STU_NAME""=""S3"".""STU_NAME"")" " 20 - access(""S"".""STU_NAME""=""S5"".""STU_NAME"")" " 23 - access(""S"".""STU_NAME""=""S7"".""STU_NAME"")"
原来57s的SQL现在可以降到48s出结果,可以看到执行计划中有 PARTITION LIST SINGLE ,正是利用了分区裁剪的思路。
如果在加上并行的hint 走分区并行,拆分成每个分区并行运行,那速度将会更快,但是这里不用。
优化方式2,如果业务开发怕影响太大,不允许改表结构,只能对当前SQL进行等价改写。
-- 等价改写: SELECT /*+ gather_plan_statistics */ /*query333*/ s.STU_CLASSID, sum(case when S1.STU_CLASSID = 1 then 1 else 0 end ) S1_name , sum(case when S1.STU_CLASSID = 2 then 1 else 0 end ) S2_name , sum(case when S1.STU_CLASSID = 3 then 1 else 0 end ) S3_name , sum(case when S1.STU_CLASSID = 4 then 1 else 0 end ) S4_name , sum(case when S1.STU_CLASSID = 5 then 1 else 0 end ) S5_name , sum(case when S1.STU_CLASSID = 6 then 1 else 0 end ) S6_name , sum(case when S1.STU_CLASSID = 7 then 1 else 0 end ) S7_name , sum(case when S1.STU_CLASSID = 8 then 1 else 0 end ) S8_name from STUDENT s LEFT JOIN STUDENT2 S1 on s.STU_NAME = S1.STU_NAME GROUP BY s.STU_CLASSID;
-- 改写后执行计划 Plan hash value: 17088642 ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- "| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|" ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- "| 0 | SELECT STATEMENT | | 1 | | 300 |00:00:30.58 | 232K| 272K| 130K| | | | |" "| 1 | HASH GROUP BY | | 1 | 300 | 300 |00:00:30.58 | 232K| 272K| 130K| 1800M| 32M| 1353K (0)| |" "|* 2 | HASH JOIN OUTER | | 1 | 30M| 29M|00:00:23.47 | 232K| 272K| 130K| 1211M| 34M| 29M (1)| 1056K|" "| 3 | TABLE ACCESS FULL | STUDENT | 1 | 26M| 26M|00:00:01.27 | 141K| 141K| 0 | | | | |" "| 4 | INDEX FAST FULL SCAN| IDX_STUDENT2_STU_NAME_CLASSID | 1 | 30M| 29M|00:00:01.29 | 90599 | 0 | 0 | | | | |" ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- " 2 - access(""S"".""STU_NAME""=""S1"".""STU_NAME"")" Note ----- - dynamic sampling used for this statement (level=2)
改写后执行时间直接降到30s就可以出结果,student2表只需要访问一次,对比第一条SQL语句降低了27s,极大提升。
加上相关并行hint 估计10s以内就能把结果跑出来甚至更快,本案例不加任何 hint,只是让大家了解优化思路。
看来还是得花时间好好学习下达梦数据库,毕竟现在信创国产化是一个重要的趋势,为信创行业添砖加瓦。