Baselines vs SYS

July 27, 2016 ❖ Tags: oracle, performance, baselines, bugs

I’ve been working a bit more with 12c lately and ran across something that was a bummer. We’ve been troubleshooting some changes with partition maintenance running longer than it had in 11.2.0.3 and found that 32K seconds were being spent on 2c3tb462qkzr8.

SQL> @ashtop event,sql_id "top_level_sql_id='04fqpz7dx017m'" sysdate-14/24 sysdate

    Total
  Seconds     AAS %This   EVENT                                    SQL_ID        FIRST_SEEN          LAST_SEEN           DIST_SQLEXEC_SEEN
--------- ------- ------- ---------------------------------------- ------------- ------------------- ------------------- -----------------
    32934      .7   61% |                                          2c3tb462qkzr8 2016-07-20 00:00:37 2016-07-20 06:21:35             22606
      882      .0    2% |                                          crds9mkr8v07x 2016-07-20 00:02:21 2016-07-20 06:21:32               882
      600      .0    1% | db file sequential read                  0kcgdb5dmj1km 2016-07-20 00:00:20 2016-07-20 06:21:28               597
      597      .0    1% | db file sequential read                  7sahkyk9d20sq 2016-07-20 00:00:27 2016-07-20 06:21:38               589
      564      .0    1% | row cache lock                           2j4xjxw846717 2016-07-20 00:01:36 2016-07-20 06:19:47               564
      534      .0    1% | row cache lock                           18uzvkt8ka4gx 2016-07-20 00:01:44 2016-07-20 06:20:04               534
      361      .0    1% |                                          g7mt7ptq286u7 2016-07-20 00:00:41 2016-07-20 06:15:49               346
...
20 rows selected.

SQL> @stext
Enter value for sql_id: 2c3tb462qkzr8

(SELECT TP.OBJ# C_KEYOBJ# , BO.SPARE2 C_OBJV#, 2 C_LVLCNT , TP.OBJ# C_LVL1OBJ# , NULL C_LVL2OBJ# , PO.TYPE# C_LVL1TYPE# , NULL C_LVL2TYPE# , PO.SUBNAME C_LVL1NAME , NULL C_LVL2NAME , 0 C_ASSOC# , TP.TS# C_TS# , TS.NAME C_TSNAME FROM SYS.OBJ$ PO, SYS.OBJ$ BO, SYS.TABPART$ TP, SYS.TAB$ T, SYS.TS$ TS WHERE TP.TS# = TS.TS# AND BO.OBJ# = T.OBJ# AND BO.OBJ# = TP.BO# AND PO.OBJ# = TP.OBJ# AND BO.OBJ# = :B1 UNION SELECT CP.OBJ# C_KEYOBJ# , BO.SPARE2 C_OBJV#, 2 C_LVLCNT , CO.OBJ# C_LVL1OBJ# , NULL C_LVL2OBJ# , CO.TYPE# C_LVL1TYPE# , NULL C_LVL2TYPE# , CO.SUBNAME C_LVL1NAME , NULL C_LVL2NAME , 0 C_ASSOC# , NULL C_TS# , NULL C_TSNAME FROM SYS.OBJ$ CO, SYS.OBJ$ BO, SYS.TABCOMPART$ CP, SYS.USER$ U, SYS.TAB$ T WHERE BO.OBJ# = T.OBJ# AND BO.OBJ# = CP.BO# AND CO.OBJ# = CP.OBJ# AND BO.OBJ# = :B1 UNION SELECT SP.OBJ# C_KEYOBJ# , BO.SPARE2 C_OBJV#, 3 C_LVLCNT , CO.OBJ# C_LVL1OBJ# , SO.OBJ# C_LVL2OBJ# , CO.TYPE# C_LVL1TYPE# , SO.TYPE# C_LVL2TYPE# , CO.SUBNAME C_LVL1NAME , SO.SUBNAME C_LVL2NAME , 0 C_ASSO
C

Looks pretty Oracle-ish. In AWR I could see that previously there was a better execution plan so we were directed to create a SQL Baseline for it. Seemed like a good idea since I’d done similar things in 11g when I was in a pinch. We created the baseline only to see that it was still not using the better plan. Ran a 10053 trace and after grep’ing for SPM low and behold:

[andy.klock.MW7RHEPV318NVP] ➤ grep -r "SPM:" /cygdrive/c/partition_maint_12c/*.*
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053.trc:SPM: statement found in SMB
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053.trc:SPM: finding a match for the generated plan, planId = 1189830030
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053.trc:SPM: fixed planId's of plan baseline are: 1756995255
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053.trc:SPM: using qksan to reproduce, cost and select accepted plan, sig = 11918263451633923973
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053.trc:SPM: plan reproducibility round 1 (plan outline + session OFE)
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053.trc:SPM: using qksan to reproduce accepted plan, planId = 1756995255
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053.trc:SPM: planId in plan baseline = 1756995255, planId of reproduced plan = 1756995255
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053.trc:SPM: best cost so far = 12019.439501, current accepted plan cost = 12019.439501
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053.trc:SPM: re-parse to use selected accepted plan, planId = 1756995255
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053.trc:SPM: statement found in SMB
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053.trc:SPM: re-parsing to generate selected accepted plan,  planId = 1756995255
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053.trc:SPM: finding a match for the generated plan, planId = 1756995255
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053.trc:SPM: re-parsed to use selected accepted plan, planId = 1756995255
/cygdrive/c/partition_maint_12c/2c3tb462qkzr8_10053_i1_c0_extract.trc:SPM: disallowed: SQL with a bootstrap object run by SYS

SPM: disallowed: SQL with a bootstrap object run by SYS

That was new to me and was informed that this was introduced via Bug 14029891. Something to consider if you need to change the execution plan for internal SQL. Fortunately for us in this case we were able to disable the feature that was calling this SQL in the first place.

As Cary Millsap says, the fastest way to do something is to not do it at all. Cary also has something to say about mouse balls but that doesn’t necessarily apply here.