Abstract:
Recently we encountered a single session cyclic "library cache pin" during upgrade to 11.2.0.3.0 from a lower version, in which the application generated types (in SYS schema) are cleaned out by:
drop type sys.SYS_PLSQL_X_Y_Z force;
Addendum (2015.11.02): Above drop is done by SMON (see Section CLEANUP_NON_EXIST_OBJ Task of follow-up Blog: Oracle 12c single session "library cache lock (cycle)" deadlock).
In order to understand this special deadlock case, run the attached the test code (at the end of this Blog).
which consists of three code parts (SYS_PLSQL_3238_382_1 is a package generated sys type):
create package spec and package body
drop type sys.SYS_PLSQL_3238_382_1 force
alter package suk_lc_pin# compile body
and session is waiting for "library cache pin" for a duration of 15 minutes, which can be observed by:
select * from v$session where event like 'library cache pin';
select * from v$wait_chains;
After 15 minutes, session throws:
ORA-04021: timeout occurred while waiting to lock object
15 minutes timeout is the default value of "_kgl_time_to_wait_for_locks" (time to wait for locks and pins before timing out).
We can break the blocked session and remove all invalids by:
alter package lc_pin# compile;
The attached testcase is based on dba_tables, but it is reproducible with dba_segments, dba_objects, dba_indexes.
Affected Oracle Version:
11.2.0.1.0, 11.2.0.2.0, 11.2.0.3.0, but not 10.2.0.4.0.
Reasoning:
The problem seems caused by the "with" factoring clause (see attached testcase at the end of this Blog).
When Oracle parses "with" factoring clause, it acquires a "library cache pin" in the Share Mode (S) on the dependent objects, in this case, it is "t_vc", then it proceeds to main clause, in which it realizes that the dependent object: "t_dba_row_tab" is invalid. In order to resolve this invalid, Oracle attempts to recompile package spec, which requests Exclusive Mode (X) on the related objects.
Since the already held mode (S) on "t_vc" is not consistent with requesting mode (X), Oracle session spins on the wait event "library cache pin". By default, Oracle throws an error:
ORA-04021: timeout occurred while waiting to lock object
after 15 minutes.
In theory, this single session cyclic lock should be detected as a deadlock, however, Oracle "library cache pin" is not deadlock sensitive. That is probably the reason why the waiting event is controlled by a 15 minute timeout in Oracle implementation.
A further query:
select (select kglnaobj||'('||kglobtyd||')'
from x$kglob v
where kglhdadr = object_handle and rownum=1) kglobj_name
,v.*
from v$libcache_locks v
where v.holding_user_session =
(select saddr from v$session
where event ='library cache pin' and rownum = 1)
and object_handle in (select object_handle from v$libcache_locks where mode_requested !=0)
order by kglobj_name, holding_user_session, type, mode_held, mode_requested;
shows there are two rows on SYS_PLSQL_263602_21_1(TYPE) with TYPE: PIN, in which the HOLDING_USER_SESSION and HOLDING_SESSION are different in the row with
MODE_REQUESTED = 3 (Exclusive mode)
KGLOBJ_NAME |
TYPE |
ADDR |
HOLDING_USER_SESSION |
HOLDING_SESSION |
OBJECT_HANDLE |
LOCK_HELD |
REFCOUNT |
MODE_HELD |
MODE_REQUESTED |
SAVEPOINT_NUMBER |
SYS_PLSQL_263602_21_1(TYPE) |
PIN |
07000000A5C224E0 |
07000000C09CE610 |
07000000C1A399B0 |
07000000A8AEFD20 |
0 |
0 |
0 |
3 |
7504 |
SYS_PLSQL_263602_21_1(TYPE) |
PIN |
07000000A7243838 |
07000000C09CE610 |
07000000C09CE610 |
07000000A8AEFD20 |
07000000A7243938 |
1 |
2 |
0 |
6235 |
From the query result, we can see that HOLDING_USER_SESSION already held a PIN mode of 2(Share mode), but at the same time designates a different recursive session to request a PIN mode of 3(Exclusive mode). The column SAVEPOINT_NUMBER seems recording the sequence of PIN get and request.
This probably explains why Oracle can't detect such deadlock case.
In Oracle, HOLDING_USER_SESSION is the session we see in v$session, whereas HOLDING_SESSION is the recursive session when both are not the same. Normally recursive session is spawned out when HOLDING_USER_SESSION requires "SYS" user privilege to perform certain tasks.
By the way, recursive session is not exported in v$session because of filter predicate:
bitand("s"."ksuseflg",1)<>0
on x$ksuse, where bitand (s.ksuseflg, 19) = 1 for 'USER' session; = 17 for 'BACKGROUND', = 2 for 'RECURSIVE'.
(Tanel has a Blog on recursive session in
Recursive sessions…)
A quick workaround is to recompile the package spec by:
alter package lc_pin# compile;
The new recompiled sources can be listed by:
select * from dba_objects order by last_ddl_time desc;
from which one can see the t_vc is recompiled even it was valid before (this confirms the Exclusive Mode (X) request on t_vc).
Further statistics shows that Oracle is continuously enriching its library cache with the newer versions, for example, v$db_object_cache contains 13 columns in 10.2.0.4.0, 21 in 11.2.0.1.0 and 11.2.0.2.0, and now 23 in 11.2.0.3.0.
Once all are valid, you can run the query:
select * from table(lc_pin#.soo);
Fundamentals:
I would like to share the basics I used to approach this problem.
The Oracle generated representations for package defined types are denoted as:
SYS_PLSQL_263602_9_1 for t_dba_row_tab
SYS_PLSQL_3238_382_1 for sys.dba_tables%rowtype
SYS_PLSQL_263602_31_1 for t_vc_tab
SYS_PLSQL_263602_21_1 for t_vc
where 263602 is the object_id of lc_pin# (package spec, not package body) in dba_objects, 3238 is that of DBA_TABLES.
Additionally a type
SYS_PLSQL_263602_DUMMY_1 as table of number;
is created, probably to index the two PL/SQL nested tables: t_dba_row_tab and t_vc_tab, even though both are not declared as associative array (formerly called PL/SQL table or index-by table). We can also guess that PL/SQL nested table is internally implemented as conventional index-by table.
The generated sources can be listed by:
select * from dba_source where name in (
'SYS_PLSQL_263602_9_1'
,'SYS_PLSQL_3238_382_1'
,'SYS_PLSQL_263602_31_1'
,'SYS_PLSQL_263602_21_1'
,'SYS_PLSQL_263602_DUMMY_1');
One can also notice that after:
drop type sys.SYS_PLSQL_3238_382_1 force;
SYS_PLSQL_3238_382_1 is no more registered in dba_objects, but still retained in sys.obj$. In sys.obj$, however, it is altered from type# 13 (TYPE) to type# 10 object (also named NON-EXISTENT object in Oracle).
v$libcache_locks contains rich details on pin and lock (refcount, mode_held, mode_requested).
After the first publication of this Blog, Tanel's powerful kglpn.sql (
TPT_public.zip) showed me the following output:
PIN_MODE REQ_MODE PINNED_BLOCKS OBJECT_NAME
-------- -------- -------------- -----------------------
Share None 0 6 SYS.SYS_PLSQL_263602_21_1
None Excl SYS.SYS_PLSQL_263602_21_1
This confirms again the self-deadlock caused by:
SYS.SYS_PLSQL_263602_21_1
since its PIN_MODE = Share, and REQ_MODE = Excl.
AIX Trace
On AIX, running trace command with session's PID: 28246034 for 20 seconds by:
trace -a -A 28246034 -o trc_raw; sleep 20; trcstop; trcrpt -O "exec=on,pid=on" trc_raw
we get the output (some details are removed):
ID PID ELAPSED_SEC DELTA_MSEC SYSCALL KERNEL INTERRUPT
001 16318614 0.000000000 0.000000 TRACE ON channel 0
200 28246034 1.243906765 1243.906765 resume oracletestdb iar=DF09C cpuid=FFFFFFFF
104 28246034 1.243909585 0.002820 return from system call
101 28246034 1.243953916 0.044331 _poll LR = 9000000012D75D4
252 28246034 1.243956337 0.002421 SOCK soo_select fp=xx so= corl=0 reqevents= rtneventsp=
252 28246034 1.243956876 0.000539 SOCK return from soo_select fp= so= error=0
104 28246034 1.243957916 0.001040 return from _poll [4 usec]
101 28246034 1.243965021 0.007105 _thread_wait LR = 90000000129C2AC
200 28246034 4.243980259 3000.015238 resume oracletestdb iar=DF09C cpuid=FFFFFFFF
104 28246034 4.243982353 0.002094 return from _thread_wait [3.000017 sec]
101 28246034 4.244031337 0.048984 _thread_wait LR = 90000000129C2AC
200 28246034 7.244053839 3000.022502 resume oracletestdb iar=DF09C cpuid=FFFFFFFF
104 28246034 7.244055822 0.001983 return from _thread_wait [3.000024 sec]
101 28246034 7.244095943 0.040121 _thread_wait LR = 90000000129C2AC
200 28246034 10.244111460 3000.015517 resume oracletestdb iar=DF09C cpuid=FFFFFFFF
it shows that session resumes after each 3 seconds of thread_wait (v$wait_chains is also refreshed each 3 seconds).
------------------------- TestCase -------------------------
-- This test is with dba_tables.
-- It is also reproducible with dba_segments, dba_objects, dba_indexes.
drop package lc_pin#;
-- Create Package and Package Body
create or replace package lc_pin#
as
type t_dba_row_tab is table of sys.dba_tables%rowtype;
type t_vc is record (name varchar2(30));
type t_vc_tab is table of t_vc;
function foo return t_vc_tab pipelined;
function koo return t_dba_row_tab pipelined;
function soo return t_dba_row_tab pipelined;
end lc_pin#;
/
create or replace package body lc_pin#
as
function foo return t_vc_tab pipelined
is
l_result t_vc;
begin
l_result.name := 'lc_test';
pipe row(l_result);
return;
end foo;
function koo return t_dba_row_tab pipelined
is
begin
for c in (select * from dba_tables where rownum = 1) loop
pipe row(c);
end loop;
end koo;
function soo return t_dba_row_tab pipelined
is
begin
for c in (
with sq as (select * from table(foo))
select nt.*
from sq
,(select * from table(koo)) nt
-- following re-write works
-- select nt.* from (select * from table(foo)) sq, (select * from table(koo)) nt
) loop
pipe row(c);
end loop;
end soo;
end lc_pin#;
/
-- Generate "drop type sys.SYS_PLSQL_3238_382_1 force;" and execute
declare
l_stmt varchar2(100);
begin
select 'drop type sys.' || object_name || ' force' drop_stmt
into l_stmt
from dba_objects
where object_name like
(select 'SYS_PLSQL_' || object_id || '%_1'
from dba_objects
where owner = 'SYS' and object_name = 'DBA_TABLES' and object_type = 'VIEW')
and object_name not like '%DUMMY%'
;
dbms_output.put_line('Run l_stmt: ' || l_stmt);
execute immediate l_stmt;
end;
/
-- Compile Package Body, session is in wait event "library cache pin" for 15 minutes,
-- then throws "ORA-04021: timeout occurred while waiting to lock object"
alter package lc_pin# compile body;