The Cost of Adaptive Execution Plans in Oracle (a Small Study) Part 2

November 21, 2013

In my previous post, I tried to find out the performance impact of adaptive execution plans (AEP). During the test, I used the OPTIMIZER_ADAPTIVE_REPORTING_ONLY parameter to switch AEP on and off. It should be noted that the parameter does not control the generation of AEP, but rather the execution of the “adaptive” part. I came to the rather surprising result that turning off adaptive execution plans actually made the query run slower. This behavior seems to be a side effect of frequent shared pool flushing. The original test, i.e. the one that showed turning on adaptive execution plans speeds up a query, is reproducible, however! I ran it at least three times and got similar results.

To confirm the peculiar results, I ran a new test that avoids SQL reuse by generating unique SQL statements. Here is the new table/view setup script. Here is the new test procedure. After running it, we can see the expected outcome – adaptive execution plans do cost something. The core idea of the unique-SQL approach is sketched below.
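Roughly, the unique-SQL idea looks like this (a minimal sketch only – the table names, the iteration count, and the timing bookkeeping are illustrative; the linked test procedure is the authoritative version):

declare
  l_sql   varchar2(4000);
  l_start timestamp;
  l_dummy number;
begin
  for i in 1 .. 1000 loop
    -- a unique comment makes every statement textually different,
    -- so each iteration is hard parsed and no plan (adaptive or not) is reused
    l_sql := 'select /* run_' || i || ' */ count(*) from t1, t2 where t1.id = t2.id';
    l_start := systimestamp;
    execute immediate l_sql into l_dummy;
    -- record the elapsed time for this iteration (e.g. into a tracking table);
    -- here it is simply printed
    dbms_output.put_line(i || ': ' || extract(second from (systimestamp - l_start)));
  end loop;
end;
/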

select avg(r), avg(a) , STATS_T_TEST_PAIRED (R,A) FROM track_v

avg(r) = 2.05312594, avg(a) = 2.07863544, STATS_T_TEST_PAIRED (R,A) /p/ = 0

The difference is quite small – approximately 2% of the overall time, or a fraction of a millisecond.
The following query allows us to drill down and see where the difference comes from:
select c.name, avg(a.val) a, avg(r.val) r, STATS_T_TEST_PAIRED(r.val, a.val)
from stats_a a , stats_r r , v$statname c
where a.id = r.id
and a.stat = r.stat
and a.stat = c.statistic#
group by c.name

The only statistics that are statistically different are:
CCursor + sql area evicted – the SQL with AEP on consumed less than the SQL with AEP off
CPU used by this session – the SQL with AEP on consumed marginally more than the SQL with AEP off
parse time elapsed – the SQL with AEP on consumed more than the SQL with AEP off
sql area evicted – the SQL with AEP on consumed less than the SQL with AEP off


The Cost of Adaptive Execution Plans in Oracle (a Small Study) Part 1

September 26, 2013

Adaptive execution plans are a new Oracle 12c feature, and I consider them one of the most important ones.

Oracle 12c would pick the join method based on the actual number of rows retrieved during the first execution of the query.

There are a few great papers about this topic:
Here is Oracle’s white paper and here is Kerry Osborne’s Hotsos 2013 presentation. There is even a video.

Every time a new feature that fundamentally changes the way Oracle works is introduced, we should ask ourselves what that feature costs in terms of CPU or other resources.
For me, the cost is an issue only when the feature does not change the join type that was initially chosen.
If the feature causes a change of join type, then the benefits of it would be huge, so it is not worth worrying about the cost.

I’ll measure the elapsed times (looking at internals is hard… :) ) of a query with and without adaptive execution plans and then run a paired t-test to see if there is a statistically significant difference. I’ll use OPTIMIZER_ADAPTIVE_REPORTING_ONLY to turn the feature on and off, and I’ll do my best to get rid of any bias in this test.
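In outline, each paired measurement looks roughly like this (a sketch only – here R stands for the reporting-only runs and A for the fully adaptive runs, and the real procedure also has to ensure that no cursor is reused between runs, which this test does by flushing the shared pool):

-- "r" run: reporting-only mode – the adaptive part is computed but not executed
alter session set optimizer_adaptive_reporting_only = true;
-- run the test query and record its elapsed time as R

-- "a" run: adaptive execution plans fully enabled
alter session set optimizer_adaptive_reporting_only = false;
-- run the same query and record its elapsed time as A

-- thousands of such (R, A) pairs are then fed into STATS_T_TEST_PAIRED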

First, let’s see what happens when we have a query that uses a Hash Join. The table/view setup script is here. The test procedure is here. I get the end result using this simple query:
select avg(r), avg(a), STATS_T_TEST_PAIRED (R,A) FROM track_v

For this case, we get avg(r) = 450.6991608, avg(a) = 451.5071414, STATS_T_TEST_PAIRED (R,A) /p/ = 0.48716074654373898

This means that for 10K runs, we are nowhere near statistical significance (p< 0.05). Things were not different for 100K runs:

avg(r) = 336.78178535 , avg(a) = 336.67725615 , STATS_T_TEST_PAIRED (R,A) /p/ = 0.82934281196842896

Therefore, the adaptive execution plan feature does not add any meaningful cost when Hash Joins are used.

Let’s see the behavior for a query that uses nested loops. The table/view setup scripts are here. The test procedure is here. I ran 100K iterations, and this is what I got:
select avg(r), avg(a) , STATS_T_TEST_PAIRED (R,A) FROM track_v

For this case, we get avg(r) = 15.31589686, avg(a) = 15.11440871, STATS_T_TEST_PAIRED (R,A) /p/ = 0.015795071536522352

Very interesting result - using adaptive execution plans is slightly faster even when no runtime change of join type is happening. The result seems to have statistical significance…

I then ran 500K iterations and got an even stronger result:
avg(r) = 20.4530 , avg(a) = 19.982423, STATS_T_TEST_PAIRED (R,A) /p/ = 0.00000320884674226468

The averages went up probably because there was some load on the box, but the statistical significance of the difference is out of the question.

So, it appears that the cost of adaptive execution plans is actually negative when it comes to nested loops (NL).

Updated 11/21/2013:
This result seems to be a side-effect of frequent shared pool flushing. It is still a reproducible result though. Check my new post that uses unique SQLs rather than flushing the shared pool to guarantee that no SQL plans are reused.

Overall, adaptive execution plans could bring huge benefits without any cost. I cannot think of a scenario where we should turn them off…


New York Oracle User Group (NYOUG) Fall 2013 Conference

September 12, 2013

Thanks to all who attended my session at NYOUG. As usual, it was a great experience!

Download the presentation here and the white paper here.

You can also download the spec here and the body here of JUST_STATS package.
You can download the README file here and some examples of gathering stats in a trigger here.


NYOUG Fall General Meeting (2013)

August 30, 2013

I am excited to present at the NYOUG Fall General Meeting (2013). I’ll be speaking about volatile tables in Oracle. They are not common, yet they can wreak havoc on the database performance when present.

The presentation has some improvements over the one I gave at Hotsos a couple of years back. It covers relevant Oracle 12c features as well.
I’ll be happy to see you there!


Oracle 12c Adaptive Execution Plans – Do Collected Stats Still Matter?

July 31, 2013

Adaptive execution plans are one of the most exciting new features in Oracle 12c. The optimizer can now pick the join method based on the actual amount of data processed, rather than the estimates it had at compile time. Does that mean that the estimated cardinalities/set sizes and the table/index statistics they are based upon no longer matter?

Not so! Here is why:

Even though Oracle uses actual information from the STATISTICS COLLECTOR step, it also uses lots of estimated information to decide the join strategy. I am not aware of the specific formula used to select the join type, but I think it is reasonable to assume that Oracle uses the estimated cost of a Nested Loop and the estimated cost of a Hash Join for that formula. Those estimated costs are based on collected data dictionary statistics (DBA_TABLES, DBA_INDEXES,..).

Here is an example:

Table TEST1 has 20 million records. The data dictionary statistics (DBA_TABLES, etc ) for TEST1 represent the data in the table.

Table TEST2 has 2 million records. The data dictionary statistics (DBA_TABLES, etc ) for TEST2 represent the data in the table.

This sample query

SELECT sum(length(TEST1.STR))

FROM TEST1 ,

     TEST2

WHERE

     TEST1.ID = TEST2.ID

generates an adaptive plan, which picks Hash Join, as expected.
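To see what the adaptive plan looks like, the 12c ADAPTIVE format of DBMS_XPLAN shows the STATISTICS COLLECTOR step as well as the inactive (fallback) plan lines; something along these lines (the SQL_ID is a placeholder):

select *
  from table(dbms_xplan.display_cursor(sql_id => '&sql_id', format => '+ADAPTIVE'));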

Let’s see how much data we need to have in TEST2 in order for the plan to switch from Nested Loops to Hash Join.

To do that, we need to truncate TEST2 without re-gathering stats. Then we need a loop that inserts data into TEST2 (without stats gathering), re-runs the query, and saves the execution plan.
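Conceptually, each pass of that loop does something like this (a sketch only; TEST2_BACKUP, the batch size, and the way the plan is captured are illustrative):

truncate table test2;                                  -- the existing TEST2 stats are deliberately kept

insert into test2
  select * from test2_backup where rownum <= 500;      -- grow TEST2 by a small batch, no stats gathering
commit;

select sum(length(test1.str))
  from test1, test2
 where test1.id = test2.id;

-- capture the join method actually used (e.g. from V$SQL_PLAN or
-- DBMS_XPLAN.DISPLAY_CURSOR) and repeat until the plan flips from NL to HJ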

For my case, using this setup, we can see that the switch from NL to HJ happens when TEST2 has approximately 11,500 records.

The switch point depends on the TEST1 stats, not the actual data in TEST1.

We can double the data in TEST1 and we’ll get the same result as long as the stats are not re-gathered.

Conversely, we can delete 70% of the data in TEST1 and we’ll get the same result as long as the stats are not re-gathered.

In short, the cardinality estimates and the stats they are based upon are still very important.

If the actual cardinality of the second set in a join is higher than the estimated cardinality , then for some values Oracle would be using HJ, even though NL would produce better results.

Conversely, if the actual cardinality of the second set in a join is lower than the estimated cardinality, then for some values Oracle would be using NL, even though HJ would produce better results.

P.S.

I was not able to get the Adaptive Execution Plan to work properly from a PL/SQL procedure. I had to write a shell script…


Some more useful undocumented OEM 12c repository tables/views

June 28, 2013

As mentioned in a previous post, MGMT_ECM_HW contains a wealth of information about hosts in OEM.

The problem is that the CLOCK_FREQ_IN_MHZ column in MGMT_ECM_HW is not properly populated. All hosts had a value of 100 there – an obvious mistake.

That prompted me to do a search through the OEM repository objects. This is what I found:

MGMT_ECM_HW_CPU is similar to MGMT_ECM_HW, but its FREQ_IN_MHZ column is properly populated – no need to parse descriptions to get the CPU speed (see the sample query after this list)

MGMT_ECM_HW_NIC – lots of information about network cards

MGMT_ECM_OS – OS related information

MGMT_ECM_HW_IOCARD – information about IO peripheral devices
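For example, a quick way to get CPU speeds out of the repository (a sketch only; these views are undocumented, SYSMAN is assumed to be the repository owner, and column names other than FREQ_IN_MHZ may differ between OEM versions):

select freq_in_mhz, count(*) as cpu_rows
  from sysman.mgmt_ecm_hw_cpu
 group by freq_in_mhz
 order by freq_in_mhz;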



Measuring the Benefits of Database Virtualization/Thin Provisioning Solutions

May 14, 2013

Overview

Database virtualization and thin provisioning are new and powerful ways to share/reuse storage among non-production database copies.

Solutions for reducing the disk footprint of non-production Oracle databases range from complete database virtualization (Delphix, etc.), to solutions built around storage vendors’ features (EMC, NetApp, etc.), to purely software solutions (cloneDB, etc.).

Sharing read-only tablespaces – a “poor man’s” database virtualization – is a lesser-known approach for saving disk space and improving refresh times in non-production environments. Here is a link to the presentation I gave at NYOUG (2012) about this topic.

What are we measuring and why?

In general, most solutions are quite effective, both in terms of storage and refresh time, in delivering a copy of the production database.

Once the copy is delivered, it is opened to users and batch jobs, so it starts changing. Those changes belong to the new database and therefore cannot be shared with the other non-production databases. That means they consume “real” disk space. The more the database changes after the refresh, the lower the benefits of the DB virtualization/thin provisioning solution.

Measuring the amount of change after a refresh is important in understanding how much disk space will be needed for the non-production Oracle environments after DB virtualization/thin provisioning is in place. The number is essential in computing the ROI of the project.

How are we measuring it?

One way to measure the change is to run an incremental RMAN backup and see how big the resulting backup file(s) is. In many cases, however, that is not doable.

The method described here works only on 11gR2. It only requires the SCN of the DB at the time of refresh.

For databases created by RMAN DUPLICATE, the information can be found in V$DATABASE (RESETLOGS_CHANGE#).

If the data was transferred with DataPump, or other similar method, a suitable SCN can be found in V$ARCHIVED_LOG (FIRST_CHANGE#, NEXT_CHANGE#).
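In SQL terms, picking the starting SCN boils down to roughly this:

-- clone created with RMAN DUPLICATE: the SCN the clone was opened (resetlogs) at
select resetlogs_change# from v$database;

-- data loaded with DataPump (or similar): bracket the load window with archived logs
select sequence#, first_change#, next_change#
  from v$archived_log
 order by first_change#;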

The new DBVerify HIGH_SCN parameter allows us to find out how many blocks were changed after a given SCN.

Let’s see how many blocks were modified after SCN 69732272706.

It is really easy:

dbv userid=sys/?????????? file=/tmp/test.dbf HIGH_SCN=69732272706

Page 64635 SCN 1012797077 (16.1012797077) exceeds highest scn to check 1012795970 (16.1012795970)

Page 66065 SCN 1012796687 (16.1012796687) exceeds highest scn to check 1012795970 (16.1012795970)

Page 66187 SCN 1012796687 (16.1012796687) exceeds highest scn to check 1012795970 (16.1012795970)

Page 87759 SCN 1012796692 (16.1012796692) exceeds highest scn to check 1012795970 (16.1012795970)

DBVERIFY - Verification complete

Total Pages Examined : 93440

Total Pages Processed (Data) : 62512

Total Pages Failing (Data) : 0

Total Pages Processed (Index): 9399

Total Pages Failing (Index): 0

Total Pages Processed (Other): 6119

Total Pages Processed (Seg) : 1705

Total Pages Failing (Seg) : 0

Total Pages Empty : 13705

Total Pages Marked Corrupt : 0

Total Pages Influx : 0

Total Pages Encrypted : 0

Total Pages Exceeding Highest Block SCN Specified: 39

Highest block SCN : 1012797131 (16.1012797131)

In this case, only 39 blocks (out of 100K) were modified after SCN 69732272706.
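To translate that into space, multiply the changed-block count by the block size (assuming the common 8 KB block size here, this comes to roughly 312 KB of post-refresh change):

select 39 * value / 1024 as changed_kb
  from v$parameter
 where name = 'db_block_size';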

Please note that even though DBVerify (dbv) can work with open DB files, it will not account for changed blocks that were not yet written to the file.

Also, as per MOS document 985284.1 – ” Why HIGH_SCN Option Doesn’t Work While Running DBV Against ASM Files?” – the HIGH_SCN flag works only for files that are not stored in ASM. Hope they fix that problem soon.

