What to Do about Tools Hoarding Parallel Slaves in Oracle

November 26, 2014

Most GUI tools use cursors with a relatively small fetch size (around 50 rows) to retrieve data from Oracle. They open the cursor, fetch some data, display it, and then wait for user input. All resources related to the connection and the open session are held while the tool waits on the user. While those resources are usually trivial for serial SELECT statements, they can be significant for parallel SELECT statements.

Each parallel statement is assigned a number of parallel slaves, and those slaves are not freed until their cursor is closed, regardless of how much work they do. Since an instance has only a limited number of parallel slaves available (the PARALLEL_MAX_SERVERS init.ora parameter), hoarding them can prevent later statements from being executed in parallel, severely impacting the performance of those statements.
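To see how close an instance is to that limit, the number of parallel servers currently in use can be compared with the configured maximum. The query below is a minimal sketch, assuming the monitoring account can read gv$px_process and gv$parameter:

```sql
-- Parallel servers per instance, grouped by status,
-- shown next to the configured PARALLEL_MAX_SERVERS value.
select p.inst_id,
       p.status,                          -- 'IN USE' or 'AVAILABLE'
       count(*)       as servers,
       max(prm.value) as parallel_max_servers
from   gv$px_process p
join   gv$parameter  prm
       on  prm.inst_id = p.inst_id
       and prm.name    = 'parallel_max_servers'
group  by p.inst_id, p.status
order  by p.inst_id, p.status;
```

A persistently high 'IN USE' count while the database is otherwise quiet is a hint that some sessions are holding slaves without using them.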

The following blog post describes the situation quite well:
http://structureddata.org/2012/02/29/pitfalls-of-using-parallel-execution-with-sql-developer/

Since this behavior is not a bug, i.e. there is never going to be a fix, we need to find a way to manage it.

One solution is to disable parallelism for all sessions coming from GUI tools that use cursors and could therefore cause this problem. This is a radical step that would prevent those users from running anything in parallel.
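For illustration, this first option could be implemented with a logon trigger that turns off parallel query for sessions whose client program matches a known GUI tool. This is only a sketch: the trigger name and the program patterns are hypothetical, and the value a given tool reports in v$session.program should be verified before relying on it.

```sql
-- Hypothetical logon trigger; the trigger name and the program patterns
-- are illustrative assumptions and must be adapted to the tools in use.
-- The trigger owner needs a direct grant of SELECT on sys.v_$session.
create or replace trigger disable_px_for_gui_tools
after logon on database
declare
  l_program v$session.program%type;
begin
  -- Look up the client program of the session that is logging on.
  select program
  into   l_program
  from   v$session
  where  sid = to_number(sys_context('USERENV', 'SID'));

  if upper(l_program) like '%SQL DEVELOPER%'
     or upper(l_program) like '%TOAD%'
  then
    execute immediate 'alter session disable parallel query';
  end if;
exception
  when others then
    null;  -- never block a logon because this check failed
end;
/
```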

The second option is to educate users about the problems associated with open cursors and ask them to close all cursors as soon as possible. This approach is ideal when followed diligently, but in reality not all users comply.

The approach I would like to propose is to allow parallelism for all, but monitor those who do not close their open cursors. Here is the query that I use to monitor:

select count(*) from
  (
      -- all query coordinator (QC) rows
      select *
      from gv$px_session px_QC
      where px_QC.qcinst_id IS NULL
      minus
      -- QCs with at least one session that is active
      -- or has been idle for less than 600 seconds
      select * from gv$px_session px_QC
      where px_QC.qcinst_id IS NULL
      and  exists
            (select *
             from gv$px_session  px_Slaves , gv$session sess
             where px_QC.qcsid = px_Slaves.qcsid
             and px_Slaves.inst_id = sess.inst_id
             and px_Slaves.sid = sess.sid
             and (sess.wait_class <> 'Idle'
                 or ( sess.seconds_in_wait < 600
                      and sess.wait_class = 'Idle'
                     )
                 )
             )
   )

This query returns zero if all allocated parallel slaves are actively used, and a value greater than zero if at least one set of parallel slaves has been idle for 600 seconds or more. The query can be adapted to identify the offending session so that you can “talk” with the end user. It could be integrated with OEM using Metric Extensions, or it could be part of a monitoring script that kills the offending sessions. The possibilities are endless.
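As one possible adaptation, the sketch below lists the query coordinator sessions whose slave sets have been idle for 600 seconds or more, together with a kill statement that a DBA could review before running. The kill_cmd column and the selection of session attributes are illustrative choices, not part of the monitoring query above.

```sql
-- Query coordinators (QCs) whose sessions, slaves included,
-- have all been idle for 600 seconds or more.
select qc.inst_id,
       qc.sid,
       qc.serial#,
       s.username,
       s.program,
       s.machine,
       -- statement to review and, if appropriate, run manually
       'alter system kill session ''' || qc.sid || ',' || qc.serial#
         || ',@' || qc.inst_id || ''' immediate' as kill_cmd
from   gv$px_session qc
join   gv$session    s
       on  s.inst_id = qc.inst_id
       and s.sid     = qc.sid
where  qc.qcinst_id is null              -- QC rows only
and    not exists
       (select 1
        from   gv$px_session sl
        join   gv$session    ss
               on  ss.inst_id = sl.inst_id
               and ss.sid     = sl.sid
        where  sl.qcsid = qc.qcsid
        -- match slaves to the QC's instance; the QC row itself has no qcinst_id
        and    nvl(sl.qcinst_id, sl.inst_id) = qc.inst_id
        -- any active session, or one idle for under 600 seconds, clears the QC
        and    (ss.wait_class <> 'Idle' or ss.seconds_in_wait < 600));
```

Generating the kill statement as text, rather than executing it automatically, leaves room for a human check before a user's session is terminated.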