Monday, April 10, 2023

Oracle Multi-Consumer AQ dequeue with dbms_aq.listen Wait Events (I)

Wait Events (I)           Performance Model (II)           Performance Analysis and Workaround (III)     


We will compose three Blogs to discuss the performance of Oracle Multi-Consumer AQ dequeue with dbms_aq.listen.

The first Blog (I) runs tests to show the Wait Events "library cache: mutex X" on the queue object and "latch free" on the Latch "channel operations parent latch".

The second Blog (II) builds a Performance Model and reveals implementation problems.

The third Blog (III) traces the dequeue operation with different numbers of concurrent sessions, and concludes with the performance analysis.

Note: tested on Oracle 19.17 on Linux and 19.7 on Solaris


1. Test Setup


We create one multiple-consumer queue, then compare two dequeue methods: listen_deq and subscr_deq.
listen_deq first waits with dbms_aq.listen and then dequeues; subscr_deq dequeues directly as the subscriber.

First, we create 100 subscribers (see the appended Test Code):

   exec create_subscribers('msg_queue_multiple', 100);
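
As a quick check (a sketch; DBA_QUEUE_SUBSCRIBERS is the same view used by remove_subscribers in the Test Code), we can count the created subscribers:

   select count(*) subscriber_cnt
     from dba_queue_subscribers
    where queue_name = 'MSG_QUEUE_MULTIPLE';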


2. Test Run


We run two tests as follows:

   exec test_run(80, 1, 10*60, 1);   -- listen dequeue 
   
   exec test_run(80, 2, 10*60, 1);   -- subscriber dequeue
The first run starts 80 enqueue Jobs and 80 listen_deq Jobs for 10 minutes and takes AWR snapshots.
The second run starts 80 enqueue Jobs and 80 subscr_deq Jobs for 10 minutes and takes AWR snapshots.
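
The two create_snapshot calls in test_run bracket each run; an AWR report for that interval can then be generated, for example as sketched below (:dbid, :inst_num, :begin_snap and :end_snap are placeholders for the values from V$DATABASE, V$INSTANCE and the two snapshot IDs):

   select output
     from table(dbms_workload_repository.awr_report_text(
                  :dbid, :inst_num, :begin_snap, :end_snap));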


2.1 listen_deq


AWR shows heavy "library cache: mutex X" waits on queue object: msg_queue_multiple:

Top 10 Foreground Events by Total Wait Time

  Event                   Waits     Total Wait Time (sec) Avg Wait  % DB time Wait Class
  ----------------------- --------- --------------------- --------- --------- ----------
  library cache: mutex X  4,075,073 9538.6                2.34ms    14.5      Concurrency
  latch free                 68,333 1195.5                17.49ms   1.8       Other

Top Event P1/P2/P3 Values

  Event                   % Event P1,           P2,   P3 Values           % Activity  Parameter 1 Parameter 2 Parameter 3
  ----------------------- ------- ------------- ----- ------------------- ----------- ----------- ----------- -----------
  library cache: mutex X  14.28   "198468696",  "0",  "23028197302599684" 0.47        idn         value       where
  latch free               2.18   "3012157976", "98", "3012157904"        2.16        address     number      why


Note: "library cache: mutex X" on idn=198468696, V$DB_OBJECT_CACHE.name='MSG_QUEUE_MULTIPLE', V$DB_OBJECT_CACHE.hash_value=198468696.

      "latch free" on Latch "channel operations parent latch" (V$LATCH.latch#=98), child: 144 (V$LATCH_CHILDREN.child#=144 for latch#=98).


2.2 subscr_deq


AWR shows less "library cache: mutex X" waits on queue object: msg_queue_multiple,
and no "latch free" on Latch "channel operations parent latch" is listed in Top 10 Events.

Top 10 Foreground Events by Total Wait Time

  Event                   Waits     Total Wait Time (sec) Avg Wait  % DB time Wait Class
  ----------------------- --------- --------------------- --------- --------- ----------
  library cache: mutex X  1,070,942 3235                  3.02ms    3.5       Concurrency

Top Event P1/P2/P3 Values

  Event                   % Event P1,           P2,   P3 Values           % Activity  Parameter 1 Parameter 2 Parameter 3
  ----------------------- ------- ------------- ----- ------------------- ----------- ----------- ----------- -----------
  library cache: mutex X  3.40    "198468696",  "0",  "23028197302599684" 0.26        idn         value       where


3. Test Code



3.1 Queue Table, Queue and Subscribers



--  Oracle Advanced Queuing by Example (https://docs.oracle.com/cd/B10500_01/appdev.920/a96587/apexampl.htm)

execute dbms_aqadm.stop_queue (queue_name         => 'msg_queue_multiple');

execute dbms_aqadm.drop_queue (queue_name         => 'msg_queue_multiple');

execute dbms_aqadm.drop_queue_table (queue_table  => 'MultiConsumerMsgs_qtab', force=> TRUE);

drop type Message_typ force;

/* Creating a message type: */
create or replace noneditionable type message_typ as object (
	subject     varchar2(30),
	text        varchar2(256),
	return_msg  clob);
/   

----- 2. Creating a Multiconsumer Queue Table and Queue

begin
  dbms_aqadm.create_queue_table (
		queue_table        => 'MultiConsumerMsgs_qtab',
		multiple_consumers => TRUE,
		compatible         => '10.0.0',
		sort_list          => 'PRIORITY,ENQ_TIME',
		queue_payload_type => 'Message_typ');
end;
/

begin
  dbms_aqadm.create_queue (
		queue_name         => 'msg_queue_multiple',
		queue_table        => 'MultiConsumerMsgs_qtab',
		max_retries        => 5,
		retry_delay        => 2,
		retention_time     => 86400);
end;
/

-- exec dbms_aqadm.alter_queue('MSG_QUEUE_MULTIPLE', 10, 2, 86400);

begin
  dbms_aqadm.start_queue (
		queue_name         => 'msg_queue_multiple');
end;
/


create or replace procedure purge_queue_table as
	po dbms_aqadm.aq$_purge_options_t;
begin
	dbms_aqadm.purge_queue_table('MultiConsumerMsgs_qtab', null, po);
end;
/

create or replace procedure create_subscribers(p_queue_name varchar2, p_cnt number) as
   l_subscriber sys.aq$_agent;
begin
   for i in 1..p_cnt loop
     begin
		   l_subscriber := sys.aq$_agent('KSUB_'||i, null, null);
		   dbms_aqadm.add_subscriber(queue_name => p_queue_name, subscriber => l_subscriber);
	   exception when others then
	      dbms_output.put_line ('Error '||sqlerrm);
	      commit;
	   end;
   end loop;
end;
/

--  Test
--exec create_subscribers('msg_queue_multiple', 100);

create or replace procedure remove_subscribers(p_queue_name varchar2, p_cnt number) as
   l_subscriber SYS.aq$_agent;
begin
	  for idx in (select consumer_name from dba_queue_subscribers a where a.queue_name = p_queue_name
	                                                                --and consumer_name like 'KSUB%'
	  ) loop
	    l_subscriber := sys.aq$_agent(idx.consumer_name, null, null);
	    dbms_aqadm.remove_subscriber(p_queue_name, l_subscriber);
	  end loop;
end;
/

-- exec remove_subscribers('MSG_QUEUE_MULTIPLE', 100);


3.2 Job clearup Helper and cpu_burner simulator



create or replace procedure clearup_test as
begin
  for c in (select * from dba_scheduler_jobs where job_name like '%TEST_JOB%') loop
    begin
      --set DBA_SCHEDULER_JOBS.enabled=FALSE
	    dbms_scheduler.disable (c.job_name, force => true, commit_semantics =>'ABSORB_ERRORS');
	    --set DBA_SCHEDULER_JOBS.enabled=TRUE, so that it can be scheduled to run (state='RUNNING')
	    --  dbms_scheduler.enable (c.job_name, commit_semantics =>'ABSORB_ERRORS');
	  exception when others then null;
	  end;
	end loop;
	
  for c in (select * from dba_scheduler_running_jobs where job_name like '%TEST_JOB%') loop
    begin
      --If force=FALSE, gracefully stop the job, slave process can update the status of the job in the job queue.
      --If force= TRUE, the Scheduler immediately terminates the job slave.
      --For repeating job with attribute "start_date => systimestamp" and enabled=TRUE, 
      --re-start immediate (state changed from 'SCHEDULED' to 'RUNNING'), DBA_SCHEDULER_JOBS.run_count increases 1.
	    dbms_scheduler.stop_job (c.job_name, force => true, commit_semantics =>'ABSORB_ERRORS');
	  exception when others then null;
	  end;
	end loop;
	
  for c in (select * from dba_scheduler_jobs where job_name like '%TEST_JOB%') loop
    begin
      --If force=TRUE, the Scheduler first attempts to stop the running job instances 
      --(by issuing the STOP_JOB call with the force flag set to false), and then drops the jobs.
	    dbms_scheduler.drop_job (c.job_name, force => true, commit_semantics =>'ABSORB_ERRORS');
	    -- if there is still not-killable jobs, try following
	    --dbms_scheduler.drop_job (c.job_name, force => true);
	  exception when others then null;
	  end;
	end loop;
end;
/

--exec clearup_test;

-- Simulate CPU Load
create or replace procedure cpu_burner (n number) as
  x   number := 0;
begin
  for i in 1 .. n loop
    x := mod (n, 999999) + sqrt (i);
  end loop;
end;
/


3.3 Enqueue, Dequeue (listen_deq and subscr_deq)



create or replace procedure enq(p_sub number) as
   l_enqueue_options     dbms_aq.enqueue_options_t;
   l_message_properties  dbms_aq.message_properties_t;
   l_recipients          dbms_aq.aq$_recipient_list_t;
   l_message_handle      raw(16);
   l_message             message_typ;
   l_return_msg          clob; 
   l_rep_string          varchar2(32000);
   c_queue_name          varchar2(30) := 'K.MSG_QUEUE_MULTIPLE';
   l_name                varchar2(100);
   l_agt_name            varchar2(100);
begin
   --visibility:
     --   dbms_aq.immediate: no commit is required, message is enqueued and committed for dequeue immediately in an autonomous transaction.
     --   dbms_aq.on_commit (default): commit is required, message is visible for dequeue after COMMIT. The enqueue session has an open transaction until the commit.
     
   l_enqueue_options.visibility    := dbms_aq.immediate;
   l_enqueue_options.delivery_mode := dbms_aq.persistent;
   
   l_name := 'Message for KSUB_'||p_sub;
   
   ----- when dbms_lob is used, "library cache: mutex X" on dbms_lob
   -- l_rep_string := rpad('ABC', 4000, 'x');
   -- dbms_lob.createtemporary(l_return_msg, TRUE, dbms_lob.session);    
   -- l_return_msg := l_name;   
   -- write a 300 KB CLOB; it will cause heavy "enq: HW - contention" wait events.
   --  for i in 1..10 loop
   --    dbms_lob.writeappend(l_return_msg, length(l_rep_string), l_rep_string);
   --  end loop;
   
   l_return_msg :=  rpad('ABC', 4000, 'x');
   l_message := message_typ(l_name, to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS')||
                        '. This message is queued for recipient: Subscriber_'||p_sub, l_return_msg);
   l_agt_name := 'KSUB_'||p_sub;    -- subscriber, M_IDEN
	 l_recipients(1) := SYS.aq$_agent(l_agt_name,  NULL, NULL);     --(name, address, protocol)
   l_message_properties.recipient_list := l_recipients;
   l_message_properties.correlation := l_agt_name;

   dbms_aq.enqueue(
     queue_name         => c_queue_name,
     enqueue_options    => l_enqueue_options,
     message_properties => l_message_properties,
     payload            => l_message,
     msgid              => l_message_handle);
   
   --debug('enq '||l_name);
   dbms_output.put_line (to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS')||' --- enqueue '||l_name||
                         ', corr_id=' || l_message_properties.correlation||', text=' || l_message.text);
   commit;    -- enqueue with dbms_aq.immediate, no commit required
end;
/

-- exec enq(1);

-- When using sys.aq$_agent (dbms_aq.listen), heavy "latch free" on "channel operations parent latch" (latch#=98)
--    on a particular child: v$latch_children.CHILD#=144
--    Event: wait for unread message on broadcast channel
create or replace procedure listen_deq(p_sub number, p_wait number := 10) as
   l_dequeue_options     dbms_aq.dequeue_options_t;
   l_message_properties  dbms_aq.message_properties_t;
   l_message_handle      raw(16);
   l_message             message_typ;
   l_listen_list         DBMS_AQ.aq$_agent_list_t;
   l_agent               sys.aq$_agent;
   c_queue_name          varchar2(30) := 'K.MSG_QUEUE_MULTIPLE';
   l_name                varchar2(100);
   l_agt_name            varchar2(100);
   l_start_time          number; 
   l_elapsed_time        pls_integer := 0;
   l_timeout             pls_integer := p_wait *100;   --centisecond
   l_ret                 varchar2(100);
   no_messages           exception;
   pragma exception_init (no_messages, -25228);
begin
    l_agt_name := 'KSUB_'||p_sub;
    l_listen_list(1):= sys.aq$_agent(l_agt_name, c_queue_name, null);
    l_start_time:= dbms_utility.get_time();    -- this has overflow !
    
    while l_ret is null and (dbms_utility.get_time() - l_start_time) < l_timeout loop
	    begin
	      -- dbms_aq.listen can have a false wakeup if two or more sessions are waiting with dbms_aq.listen,
	      -- but only one enqueued message has arrived.

          -- When dequeue_options.visibility = dbms_aq.on_commit (the default; it leaves an open transaction and requires a commit),
          -- the listen call can return a false positive when another session has an uncommitted dequeue on a multi-consumer queue.
          -- The listen call is supposed to return true if it finds one message for a subscriber.
          -- When there is an uncommitted dequeue, the message is still in the queue table, so listen will return true.
          -- It might also be the case that the first session rolls back just before the commit.
 
		    dbms_aq.listen(
		        agent_list => l_listen_list
		       ,wait       => p_wait
		       ,agent      => l_agent
		      );    
		      
		    l_dequeue_options.consumer_name := l_agent.name;
		    l_dequeue_options.visibility    := dbms_aq.immediate;    -- for dequeue, when on_commit, there is an open transaction, requires commit
		    l_dequeue_options.delivery_mode := dbms_aq.persistent;
		    l_dequeue_options.navigation    := dbms_aq.first_message;
		    l_dequeue_options.wait          := DBMS_AQ.NO_WAIT;
		    
			  dbms_aq.dequeue(
			    queue_name         => c_queue_name,
			    dequeue_options    => l_dequeue_options,
			    message_properties => l_message_properties,
			    payload            => l_message,
			    msgid              => l_message_handle);
			  commit;
			  
			  dbms_output.put_line (to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS')||' >>> dequeue ' || l_message.subject||
			                        ', corr_id=' || l_message_properties.correlation||
			                        ', text=' || l_message.text|| ' , return_msg Length=' || length(l_message.return_msg) ||
			                        ' -- dequeue_options.consumer_name for '||l_agt_name);
			  --dequeue_options.navigation := DBMS_AQ.NEXT_MESSAGE;
			  --debug('deq '||l_message.subject);
			  l_ret := l_message.text;
			  exit;  -- dequeue succeeded, exit loop
			  
			  exception 
			    when NO_MESSAGES then
			      dbms_output.put_line (to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS')||' -- Error for '||l_agt_name||', '||SQLERRM);
			      commit;
			    when OTHERS then
			      dbms_output.put_line (to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS')||' -- Error for '||l_agt_name||', '||SQLERRM);
			      l_ret := SQLERRM;
			      commit;
			end;
	end loop;
end;
/

-- exec listen_deq(1, 100);

-- Without sys.aq$_agent (no dbms_aq.listen), avoids "latch free" on "channel operations parent latch" (latch#=98)
--     Event: Streams AQ: waiting for messages in the queue
--     Same Event in non-multi-consumer(i.e, single) queue
create or replace procedure subscr_deq(p_sub number, p_wait number := 10) as
   l_dequeue_options     dbms_aq.dequeue_options_t;
   l_message_properties  dbms_aq.message_properties_t;
   l_message_handle      raw(16);
   l_message             message_typ;
   c_queue_name          varchar2(30) := 'K.MSG_QUEUE_MULTIPLE';
   l_msg_cnt             number := 0;
   l_try_limit           number := 1;      --- only dequeue one message
   l_try_cnt             number := 0;
   l_agt_name            varchar2(100);
   no_messages           exception;
   pragma exception_init (no_messages, -25228);
begin
  l_agt_name :=  'KSUB_'||p_sub;
  l_dequeue_options.visibility    := dbms_aq.immediate;
  l_dequeue_options.delivery_mode := dbms_aq.persistent;
  l_dequeue_options.wait          := p_wait;     --DBMS_AQ.NO_WAIT;   -- wait 10 seconds
  l_dequeue_options.consumer_name := l_agt_name;
	l_dequeue_options.navigation    := dbms_aq.first_message;
	--dequeue_options.navigation    := dbms_aq.next_message;   
	
	loop
	  l_try_cnt := l_try_cnt + 1;
	  begin
	    dbms_aq.dequeue(
	      queue_name         => c_queue_name,
	      dequeue_options    => l_dequeue_options,
	      message_properties => l_message_properties,
	      payload            => l_message,
	      msgid              => l_message_handle);
	    COMMIT;
			dbms_output.put_line (to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS')||' >>> dequeue ' || l_message.subject||
			                        ', corr_id=' || l_message_properties.correlation||
			                        ', text=' || l_message.text|| ' , return_msg Length=' || length(l_message.return_msg)||
			                        ' -- dequeue_options.consumer_name for '||l_agt_name);
			
	    --debug('deq '||l_message.subject);
	    l_msg_cnt := l_msg_cnt + 1;
	    
	    exception when no_messages then
	      dbms_output.put_line (to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS')||' -- Error for '||l_agt_name||', '||SQLERRM);
	      commit;
    end;

	  if l_msg_cnt > 0 or l_try_cnt >= l_try_limit then 
	    exit;
	  end if; 
	end loop;  
end;
/

-- exec subscr_deq(1, 10);


--  create or replace type sys.aq$_agent as object
--  (name     varchar2 (30),     -- M_IDEN, name of a message producer or consumer
--   address  varchar2 (1024),   -- address where message must be sent
--   protocol number             -- protocol for communication, must be 0
--  );
--  
--  alter type aq$_agent modify attribute (name varchar2 (512)) cascade
--  /
-- 
--  DBMS_AQ.LISTEN (
--     agent_list            IN    AQ$_AGENT_LIST_T,
--     wait                  IN    BINARY_INTEGER DEFAULT DBMS_AQ.FOREVER,
--     agent                 OUT   SYS.AQ$_AGENT);


3.4 Test Jobs



-- enqueue loop
-- adjust cpu_burner
create or replace procedure enq_loop(p_sub number, p_dur_seconds number) as
  l_timeout    pls_integer := p_dur_seconds *100;    -- centisecond
  l_start_time number := dbms_utility.get_time();    -- this has overflow !
  l_cnt        number := 0;
begin
  dbms_application_info.set_action('Enq Job '||p_sub);
  while (dbms_utility.get_time() - l_start_time) < l_timeout
  loop 
    cpu_burner(1*1e4);
    enq(p_sub);
    l_cnt := l_cnt + 1;
  end loop;
  dbms_output.put_line('Enq Messages = '||l_cnt||', Duration (seconds) = '||((dbms_utility.get_time() - l_start_time)/100));
end;
/

-- exec enq_loop(1, 10);

-- adjust cpu_burner
create or replace procedure start_enq_jobs(p_cnt number, p_dur_seconds number) as
  l_job_name varchar2(50);
begin
  for i in 1..p_cnt loop
    l_job_name := 'TEST_JOB_ENQ_'||i;
    dbms_scheduler.create_job (
      job_name        => l_job_name,
      job_type        => 'PLSQL_BLOCK',
      job_action      => 
        'begin 
           enq_loop('||i||', '||p_dur_seconds||');
        end;',    
      start_date      => systimestamp,
      --repeat_interval => 'systimestamp',
      auto_drop       => true,
      enabled         => true);
  end loop;
end;
/

-- dequeue loop
-- adjust cpu_burner
create or replace procedure sel_deq_loop(p_sub number, p_sel number := 1, p_dur_seconds number, p_wait number := 10) as
  l_timeout    pls_integer := p_dur_seconds *100;    --centisecond
  l_start_time number := dbms_utility.get_time();    -- this has overflow !
  l_cnt        number := 0;
begin
  dbms_application_info.set_action('Deq Job '||p_sub);
  while (dbms_utility.get_time() - l_start_time) < l_timeout
  loop
    cpu_burner(1*1e2);
    if p_sel = 1 then
      listen_deq(p_sub, p_wait);
    else
      subscr_deq(p_sub, p_wait);
    end if; 
    l_cnt := l_cnt + 1; 
  end loop;
  
  if p_sel = 1 then
    dbms_output.put_line('Listener Deq Execs (SUC and Failed) = '||l_cnt||', Duration (seconds) = '||((dbms_utility.get_time() - l_start_time)/100));
  else
    dbms_output.put_line('Subscriber Deq Execs (SUC and Failed) = '||l_cnt||', Duration (seconds) = '||((dbms_utility.get_time() - l_start_time)/100));
  end if;
end;
/

-- exec sel_deq_loop(1, 1, 10);   -- listen dequeue 
-- exec sel_deq_loop(1, 2, 10);   -- subscriber dequeue

create or replace procedure start_deq_jobs(p_cnt number, p_sel number := 1, p_dur_seconds number) as
  l_job_name varchar2(50);
begin
  for i in 1..p_cnt loop
    l_job_name := 'TEST_JOB_DEQ_'||i;
    dbms_scheduler.create_job (
      job_name        => l_job_name,
      job_type        => 'PLSQL_BLOCK',
      job_action      => 
        'begin 
           sel_deq_loop('||i||','||p_sel||', '||p_dur_seconds||');
        end;',    
      start_date      => systimestamp,
      --repeat_interval => 'systimestamp',
      auto_drop       => true,
      enabled         => true);
  end loop;
end;
/


3.5 Test Run: listen vs. subscriber



create or replace procedure test_run (p_jobs number, p_sel number, p_dur_seconds number, is_awr number := 0) as
begin
  clearup_test;
  purge_queue_table;
  
  if is_awr > 0 then SYS.DBMS_WORKLOAD_REPOSITORY.create_snapshot('ALL'); end if;
  start_enq_jobs(p_jobs, p_dur_seconds);
  start_deq_jobs(p_jobs, p_sel, p_dur_seconds);     -- p_sel = 1: listen dequeue, 2: subscriber dequeue
  
  -- wait for the test duration, then take the second AWR snapshot and stop the Jobs
  dbms_session.sleep(p_dur_seconds);
  
  if is_awr > 0 then SYS.DBMS_WORKLOAD_REPOSITORY.create_snapshot('ALL'); end if;
  clearup_test;
end;
/


-- Start test run for 10 minutes
--   exec test_run(80, 1, 10*60, 1);   -- listen dequeue 
--   exec test_run(80, 2, 10*60, 1);   -- subscriber dequeue

Oracle Multi-Consumer AQ dequeue with dbms_aq.listen Performance Model (II)

Wait Events (I)           Performance Model (II)           Performance Analysis and Workaround (III)     


In the first Blog (I), we ran tests to show the Wait Events "library cache: mutex X" on the queue object and "latch free" on the Latch "channel operations parent latch".

In this second Blog (II), we will run dequeue tests with GDB tracing. Based on the test outcome, we will build a Latch and Mutex model, examine it against real DB stats, and subsequently reveal implementation problems.

Note: tested on Oracle 19.17 on Linux


1. Library Cache Lock/Pin, Mutex Get, Latch Get


First we compose a GDB script to trace dequeue stats (see Section 5, GDB Script).
Then we run tests under different scenarios.


1.1 Case-1: enqueue - dequeue


Open two Sqlplus sessions (SID-1 and SID-2). In SID-2, enqueue one message, run the GDB script against the UNIX process of SID-1 (pid 27039), then dequeue in SID-1.

Here is the output:

(SID-2) > exec enq(1);
  11:16:29 --- enqueue Message for KSUB_1, corr_id=KSUB_1, text=11:16:29.

$ > gdb -x gdb_cmd_lm_script1 -p 27039
	
(SID-1) > exec listen_deq(1, 60);
  11:17:10 >>> dequeue Message for KSUB_1, corr_id=KSUB_1, text=11:16:29.

$ > gdb -x gdb_cmd_lm_script1 -p 27039
  11:17:10
  ===== BreakPoint: 55-Counter: [ LOCK: 2, PIN: 3, LATCH: 3, MUTEX: 18 ] =====
GDB output shows that a single message dequeue needs:

   2   Library Cache Locks (kgllkal)
   3   Library Cache Pins  (kglpin)
   3   Latch Gets          (ksl_get_shared_latch)
   18  Mutex Gets          (kglGetMutex)
   55  Breakpoints         (55-Counter)
For all the output details, see Section 6, GDB Output of Case-1.


1.2 Case-2: dequeue - enqueue


We first start the dequeue in SID-1 and attach GDB to its process, then enqueue one message in SID-2. Since the dequeue starts first, it waits for a message in dbms_aq.listen (with a timeout of 60 seconds).

Here is the test output:

$ > gdb -x gdb_cmd_lm_script1 -p 27039
  
(SID-1) > exec listen_deq(1, 60);
  
$ > gdb -x gdb_cmd_lm_script1 -p 27039
  11:26:18
  ===== BreakPoint: 38-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 12 ] =====

(SID-2) > exec enq(1);
	11:26:42 --- enqueue Message for KSUB_1, corr_id=KSUB_1, text=11:26:42.
	
(SID-1) > exec listen_deq(1, 60);
  11:26:42 >>> dequeue Message for KSUB_1, corr_id=KSUB_1, text=11:26:42.
  
$ > gdb -x gdb_cmd_lm_script1 -p 27039
  11:26:42
  ===== BreakPoint: 103-Counter: [ LOCK: 4, PIN: 5, LATCH: 9, MUTEX: 30 ] =====
GDB output shows the stats at wait start:

   2   Library Cache Locks (kgllkal)
   2   Library Cache Pins  (kglpin)
   1   Latch Gets          (ksl_get_shared_latch)
   12  Mutex Gets          (kglGetMutex)
   38  Breakpoints         (38-Counter)
GDB output shows the stats after the message is dequeued:

   4   Library Cache Locks (kgllkal)
   5   Library Cache Pins  (kglpin)
   9   Latch Gets          (ksl_get_shared_latch)
   30  Mutex Gets          (kglGetMutex)
   103 Breakpoints         (103-Counter)


1.3 Case-3: dequeue - enqueue other corr_id - enqueue


Similar to Case-2, but we make two enqueues: the first is for another corr_id (corr_id=KSUB_31), the second is for the waiting dequeue (corr_id=KSUB_1).

We can see that the first enqueue for corr_id=KSUB_31 also wakes up the dequeue session, and hence the Latch and Mutex stats increase:

$ > gdb -x gdb_cmd_lm_script1 -p 27039

(SID-1) > exec listen_deq(1, 60);

$ > gdb -x gdb_cmd_lm_script1 -p 27039
  11:36:19
  ===== BreakPoint: 38-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 12 ] =====

(SID-2) > exec enq(31);
  11:36:38 --- enqueue Message for KSUB_31, corr_id=KSUB_31, text=11:36:38.

$ > gdb -x gdb_cmd_lm_script1 -p 27039
  11:36:38
  ===== BreakPoint: 86-Counter: [ LOCK: 4, PIN: 4, LATCH: 7, MUTEX: 24 ] =====

(SID-2) > exec enq(1);
  11:37:12 --- enqueue Message for KSUB_1, corr_id=KSUB_1, text=11:37:12.

(SID-1) > exec listen_deq(1, 60);
  11:37:12 >>> dequeue Message for KSUB_1, corr_id=KSUB_1, text=11:37:12.

$ > gdb -x gdb_cmd_lm_script1 -p 27039
  11:37:12
  ===== BreakPoint: 151-Counter: [ LOCK: 6, PIN: 7, LATCH: 15, MUTEX: 42 ] =====
GDB output shows the stats at wait start:

   2   Library Cache Locks (kgllkal)
   2   Library Cache Pins  (kglpin)
   1   Latch Gets          (ksl_get_shared_latch)
   12  Mutex Gets          (kglGetMutex)
   38  Breakpoints         (38-Counter)
GDB output shows the first wake-up (false wake-up) stats:

   4   Library Cache Locks (kgllkal)
   4   Library Cache Pins  (kglpin)
   7   Latch Gets          (ksl_get_shared_latch)
   24  Mutex Gets          (kglGetMutex)
   86  Breakpoints         (86-Counter)
The second wake-up (true wake-up) stats:

   6   Library Cache Locks (kgllkal)
   7   Library Cache Pins  (kglpin)
   15  Latch Gets          (ksl_get_shared_latch)
   42  Mutex Gets          (kglGetMutex)
   151 Breakpoints         (151-Counter)


1.4 Case-4: dequeue with timeout


We start a dequeue with no matching enqueue, and wait for the dequeue to time out:

$ > gdb -x gdb_cmd_lm_script1 -p 27039

(SID-1) > exec listen_deq(1, 60);

$ > gdb -x gdb_cmd_lm_script1 -p 27039
  11:42:30
  ===== BreakPoint: 38-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 12 ] =====

(SID-1) > exec listen_deq(1, 60);
  11:43:30 -- Error for KSUB_1, ORA-25254: time-out in LISTEN while waiting for a message

$ > gdb -x gdb_cmd_lm_script1 -p 27039
  11:43:30
  ===== BreakPoint: 42-Counter: [ LOCK: 2, PIN: 2, LATCH: 3, MUTEX: 12 ] =====
GDB output shows the stats at wait start:

   2   Library Cache Locks (kgllkal)
   2   Library Cache Pins  (kglpin)
   1   Latch Gets          (ksl_get_shared_latch)
   12  Mutex Gets          (kglGetMutex)
   38  Breakpoints         (38-Counter)
GDB output shows the timeout stats:

   2   Library Cache Locks (kgllkal)
   2   Library Cache Pins  (kglpin)
   3   Latch Gets          (ksl_get_shared_latch)
   12  Mutex Gets          (kglGetMutex)
   42  Breakpoints         (42-Counter)


2. Performance Model and Analysis


Based on the above test outcome, we can estimate the number of Latch and Mutex Gets and build a performance analysis model for concurrent parallel sessions.

The numbers used in the model are approximate and based on observations of real DB stats.

We examine the model against the collected real DB stats and show the deviations.

In the next Blog (III), we will run tests with different numbers of parallel sessions and collect exact DB Latch and Mutex stats.


2.1. Mutex and Latch Formulas



Assume N: number of Sessions
       M: number of dequeued messages
       no timeout in dequeue (no repeated get)
       each message is only dequeued once and by one specified target session
        
       MGets: Number of Mutex Gets on Queue Object
       LGets: Number of Latch Gets on LATCH#=98 (channel operations parent latch)

For one dequeue (M=1) and no concurrent session (N=1):
       MGets = 12
       LGets =  6
       
For one dequeue (M=1) and N concurrent sessions:
       MGets = 12 * N
       LGets =  6 * N

For M dequeues and N concurrent sessions:
       MGets = 12 * N * M
       LGets =  6 * N * M


2.2. Concurrency Contentions


When one message is enqueued, all sessions waiting in dbms_aq.listen are woken up simultaneously. They then contend concurrently for the DB-wide single Mutex (on the Queue Object) and the single Child Latch (CHILD#=144 of LATCH#=98; LATCH#=98 has 173 children, but only child 144 is used for dbms_aq.listen on any queue).

Hence, dbms_aq.listen artificially "synchronizes" contention on these DB-wide single resources.
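
A quick way to observe this (a sketch; the wait event name is the one shown later in Section 5.3) is to count the sessions currently parked in dbms_aq.listen:

  select count(*) listen_waiters
    from v$session
   where state = 'WAITING'
     and event = 'wait for unread message on broadcast channel';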


2.3. Examination of Latch Formula on Real DB



2.3.1 Pick the following data from AWR:



      dequeue SQL Query Executions = 96,508
      Enq Msgs                     = 94,551
      Deq Msgs                     = 94,551

      Latch Name                       Get Requests   Misses     Sleeps   Spin Gets
      -------------------------------  -------------  ---------  -------  ---------
      channel operations parent latch  43,999,351     3,234,815  130,568  3,107,867


2.3.2 Compute and Compare:



      Real LGets      (rLGets) = 43,999,351
      Estimated LGets (eLGets) = 6 * N * M
                               = 6 * 64 * 96,508
                               = 37,059,072
Due to Latch Misses and Sleeps, Estimated LGets can be adjusted as:

      Adjusted Estimated LGets = eLGets     + Misses
                               = 37,059,072 + 3,234,815
                               = 40,293,887

      Estimation Error = 43,999,351 - 40,293,887
                       = 3,705,464 (8.42%)
Note that the Latch also has recurrent_sleeps:

      recurrent_sleeps = sleeps  + spin_gets - misses
                       = 130,568 + 3,107,867 - 3,234,815
                       = 3,620
Because of recurrent_sleeps, Estimated LGets can be less than Real LGets.
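
The same figures can be pulled directly from V$LATCH (a sketch; recurrent_sleeps is derived, it is not a V$LATCH column):

  select name, gets, misses, sleeps, spin_gets,
         sleeps + spin_gets - misses as recurrent_sleeps
    from v$latch
   where latch# = 98;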


3. dbms_aq.listen dequeue - child latch 144


For dequeue with dbms_aq.listen, the following test shows that only child# 144 of the Latch "channel operations parent latch" (latch#=98) is used. Each run needs 3 latch GETs.

For dequeue with a subscriber, no such latch is needed.

-----===== enqueue 20 messages for KSUB_1 
begin
  for i in 1..20 loop
    enq(1);
  end loop;
end;
/

-----===== dequeue 10 messages by dbms_aq.listen =====
column latch_gets new_value latch_gets_1
column latch_children_gets new_value latch_children_gets_1

select 
  (select gets from v$latch where latch#=98) latch_gets,
  (select gets from v$latch_children where latch#=98 and child# = 144) latch_children_gets
from dual;

begin
  for i in 1..10 loop
    begin
      listen_deq(1, 10); 
    EXCEPTION WHEN others THEN
	    null;  --DBMS_OUTPUT.PUT_LINE ('---------------- No more messages for KSUB_'||1);
	  end;
  end loop;
end;
/

select 
  (select gets - &latch_gets_1 from v$latch where latch#=98) latch_gets_delta,
  (select gets - &latch_children_gets_1 from v$latch_children where latch#=98 and child# = 144) latch_children_gets_delta
from dual;

LATCH_GETS_DELTA LATCH_CHILDREN_GETS_DELTA
---------------- -------------------------
              31                        30

-----===== dequeue 10 messages by subscriber =====

select 
  (select gets from v$latch where latch#=98) latch_gets,
  (select gets from v$latch_children where latch#=98 and child# = 144) latch_children_gets
from dual;


begin
  for i in 1..10 loop
    begin
      subscr_deq(1, 10); 
    EXCEPTION WHEN others THEN
	    null;  --DBMS_OUTPUT.PUT_LINE ('---------------- No more messages for KSUB_'||1);
	  end;
  end loop;
end;
/

select 
  (select gets - &latch_gets_1 from v$latch where latch#=98) latch_gets_delta,
  (select gets - &latch_children_gets_1 from v$latch_children where latch#=98 and child# = 144) latch_children_gets_delta
from dual;

LATCH_GETS_DELTA LATCH_CHILDREN_GETS_DELTA
---------------- -------------------------
               1                         0


4. dequeue waiting semtimedop


If we use strace to watch a dequeue waiting with dbms_aq.listen, we can see an internal timeout of 1 second ("{1, 0}"):

SQL > exec listen_deq(32, 120);

$ > strace -ttp 19009
  strace: Process 19009 attached
  15:29:36.161830 semtimedop(9928704, [{63, -1, 0}], 1, {1, 0}) = -1 EAGAIN (Resource temporarily unavailable)
  15:29:37.162159 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={2, 747478}, ru_stime={0, 322189}, ...}) = 0
  15:29:37.162333 semtimedop(9928704, [{63, -1, 0}], 1, {1, 0}) = -1 EAGAIN (Resource temporarily unavailable)
  15:29:38.162152 semtimedop(9928704, [{63, -1, 0}], 1, {1, 0}) = -1 EAGAIN (Resource temporarily unavailable)
If we use strace to watch a dequeue waiting with a subscriber, we can see an internal timeout of 3 seconds ("{3, 0}"):

SQL > exec subscr_deq(32, 120);

$ > strace -ttp 19009

  strace: Process 19009 attached
  15:38:37.236174 semtimedop(9928704, [{63, -1, 0}], 1, {4, 641000}) = -1 EAGAIN (Resource temporarily unavailable)
  15:38:41.237194 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={2, 764867}, ru_stime={0, 323688}, ...}) = 0
  15:38:41.237309 semtimedop(9928704, [{63, -1, 0}], 1, {3, 0}) = -1 EAGAIN (Resource temporarily unavailable)
  15:38:44.237145 semtimedop(9928704, [{63, -1, 0}], 1, {3, 0}) = -1 EAGAIN (Resource temporarily unavailable)
For all 4 test cases in Section 1, we can also use strace to monitor semtimedop. In particular, for Case-3 (dequeue - enqueue other corr_id - enqueue), we can see the inter-process communication between the sessions. More info about the semaphores can be shown by:

$ > ipcs -s
  ------ Semaphore Arrays --------
  key        semid      owner      perms      nsems
  0xe4774fb0 9928704    oracle     600        250
  0xe4774fb1 9961473    oracle     600        250
  0xe4774fb2 9994242    oracle     600        250

$ > ipcs -s -i 9928704
$ > ipcs -l


5. GDB Script


To run the appended GDB script, we first need to find the following three addresses.


5.1 Get Queue Object Handle Addr: KGLLKHDL_ADDR



select addr, hash_value idn,  v.* from v$db_object_cache v where name = 'MSG_QUEUE_MULTIPLE'; 
-- 98417B58  198468696

select kgllkhdl, kglhdpar, v.* from sys.x_kgllk v where kglnaobj = 'MSG_QUEUE_MULTIPLE'; 
-- 98417B58  98417B58

select object_handle, mode_held, mode_requested, v.* from v$libcache_locks v where object_handle like '%98417B58';
-- 98417B58

Lock/pin mode held in v$libcache_locks:
    0 - No lock/pin held
    1 - Null mode
    2 - Share mode
    3 - Exclusive mode
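
The mode columns of the query above can be decoded inline (a sketch; TYPE distinguishes LOCK from PIN rows):

select type, object_handle,
       decode(mode_held,      0, 'None', 1, 'Null', 2, 'Share', 3, 'Exclusive') mode_held,
       decode(mode_requested, 0, 'None', 1, 'Null', 2, 'Share', 3, 'Exclusive') mode_requested
  from v$libcache_locks
 where object_handle like '%98417B58';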

Note: we use the Blog
  Tracing Library Cache Locks 3: Object Names (Nenad Noveljic, March 1, 2021)
    (https://nenadnoveljic.com/blog/library-cache-lock-object-name/)
to print out the library object name and schema: MSG_QUEUE_MULTIPLEK (object name MSG_QUEUE_MULTIPLE, schema K)


5.2 Get Queue Object Mutex Addr: KGLHDMTX_ADDR



select kglhdmtx, kglhdadr, v.* from sys.x_kglob v where kglnaobj = 'MSG_QUEUE_MULTIPLE'; 
--KGLHDMTX=98417CA8, KGLHDADR=98417B58

-- Note: each Mutex has a number of different Locations


5.3 Get Latch 'channel operations parent latch' (Latch#=98) Top Child# Addr: LATCHCHD_ADDR


We can get the Top Child# Addr (the child with the most gets) by the query:

select * from (
  select addr addr_144, child# child#_144, v.* from v$latch_children v 
   where latch#=98 or name = 'channel operations parent latch' order by gets desc
) where rownum = 1;  
-- B4C33CF8  144
It can also be found by the following test and query:

Open two Sqlplus sessions.

-- In Session_1, run 

SQL (SID-1) > exec listen_deq(1, 120);

-- Note: For dbms_aq.listen wait, the session is waiting on event: 'wait for unread message on broadcast channel'.
--       For dbms_aq.dequeue wait, the session is waiting on event: 'Streams AQ: waiting for messages in the queue'.

-- In Session_2, run query below to display CHILD#=144:

SQL (SID-2) > with sq as (select program, event, sid, p1text, p1 from v$session where event = 'wait for unread message on broadcast channel')
              select l.addr, child#, s.program, s.sid, s.p1 p1_num, (s.p1+72) p1_num_72, to_number(l.addr, 'XXXXXXXXXXXXXXXX') child_addr_dec, l.* 
               from v$latch_children l, sq s
               where (latch#=98 or name = 'channel operations parent latch')
                 and to_number(l.addr, 'XXXXXXXXXXXXXXXX') = s.p1 + 72;
                               
-- B4C33CF8 144

-- Note: The above query shows that the Oracle background process SVCB is also waiting on Event "wait for unread message on broadcast channel"
--       Probably it also uses dbms_aq.listen (Latch#=98, CHILD#=36)
--       (SVCB is documented only for Oracle RAC, but we observed it also in a non-RAC DB (Oracle Docu error))


-- For ksl_get_shared_latch, see Blog: 
--   Oracle 12 and latches (https://fritshoogland.wordpress.com/2015/07/17/oracle-12-and-latches/):

The kslgetl and ksl_get_shared_latch functions take the following arguments:
    1-latch address
    2-immediate get (0 means yes, 1 means no)
    3-where (X$KSLLW.INDX)
    4-why (X$KSLWSC.INDX)
    5-mode (8=shared,16=exclusive; only for ksl_get_shared_latch function)


5.4 Stats Counter


For each Get operation (kgllkal, kglpin, kglGetMutex, ksl_get_shared_latch), we increase its respective Counter by 1. For each Release operation (kgllkdl, kglpndl, kglReleaseMutex, kslfre), the current value of the respective Counter is reported together with a release sub-counter.

After each operation, we print out all 4 Counter values.


5.5 GDB Script Code


With the three addresses above, we can compose the following GDB script:

------------------------- gdb -x gdb_cmd_lm_script1 -p 3214 -------------------------

set pagination off
set logging file mutex-latch_test1.log
set logging overwrite on
set logging on
set $KGLLKHDL_ADDR=0X98417B58
set $KGLHDMTX_ADDR=0X98417CA8
set $LATCHCHD_ADDR=0XB4C33CF8
set $s = 0
set $k = 0, $kr = 0
set $p = 0, $pr = 0
set $l = 0, $lr = 0
set $m = 0, $mr = 0

define PrtCont
  bt $arg0
  shell date
  printf "===== BreakPoint: %i-Counter: [ LOCK: %i, PIN: %i, LATCH: %i, MUTEX: %i ] =====\n", ++$s, $k, $p, $l, $m
  continue
end

# kgllkal kglLock kglget
break *kgllkal if $rdx==$KGLLKHDL_ADDR
command
printf "===== Lock Get Seq: %i -- kgllkal (kgllkhdl Addr(rdx)=>%X, Mode(rcx)=>%X), kglnaobj_and_schema: %s --\n", ++$k, $rdx, $rcx, ($rdx+0x1c0)
set $kr = 0
PrtCont 10
end

# kgllkdl kglUnLock
break *kgllkdl if $r8==$KGLLKHDL_ADDR
command
printf "===== Lock Rel Seq: %i-%i -- kgllkdl (kgllkhdl Addr(r8)=>%X) ---\n", $k, ++$kr, $r8
PrtCont 3
end

break *kglpin if $rcx==$KGLLKHDL_ADDR
command
printf "===== Pin Get Seq: %i -- kglpin (kgllkhdl Addr(rcx)=>%X, Mode(r8)=>%X, mutex Addr(r11)=>%X)--\n", ++$p, $rcx, $r8, $r11
set $pr = 0
PrtCont 3
end

#kglUnPin
break *kglpndl if $r8==$KGLLKHDL_ADDR
command
printf "===== Pin Rel Seq: %i-%i -- kglpndl(kgllkhdl Addr(r8)=>%X) --\n", $p, ++$pr, $r8
PrtCont 3
end

break *kglGetMutex if $rsi==$KGLHDMTX_ADDR
command
printf "===== Mutext Get Seq: %i -- kglGetMutex(kglhdmtx Addr(rsi)=>%X, Location(r8)=>%i) --\n", ++$m, $rsi, $r8
set $mr = 0
PrtCont 3
end

break *kglReleaseMutex if $rsi==$KGLHDMTX_ADDR
command
printf "===== Mutext Rel Seq: %i-%i -- kglReleaseMutex(kglhdmtx Addr(rsi)=>%X, Location(r8)=>%i) --\n", $m, ++$mr, $rsi, $r8
PrtCont 3
end

break *ksl_get_shared_latch if $rdi==$LATCHCHD_ADDR
command
printf "===== Latch Get Seq: %i -- ksl_get_shared_latch Addr(Latch-98 Child-144 Addr(rdi)=>%X, Mode(r8)=>%i) --\n", ++$l, $rdi, $r8
set $lr = 0
PrtCont 3
end

# kslgetl not used

break *kslfre if $rdi==$LATCHCHD_ADDR
command
printf "===== Latch Fre Seq: %i-%i -- kslfre (Latch-98 Child-144 Addr(rdi)=>%X) --\n", $l, ++$lr, $rdi
PrtCont 3
end


6. GDB Output of Case-1



Note: "kwqilisten" is Oracle subroutine for dbms_aq.LISTEN.


===== Mutext Get Seq: 1 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>1) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130ac47e in kglLock ()
#2  0x00000000130a70c5 in kglget ()
===== BreakPoint: 1-Counter: [ LOCK: 0, PIN: 0, LATCH: 0, MUTEX: 1 ] =====

===== Lock Get Seq: 1 -- kgllkal (kgllkhdl Addr(rdx)=>98417B58, Mode(rcx)=>2), kglnaobj_and_schema: MSG_QUEUE_MULTIPLEK --
#0  0x00000000130b0b40 in kgllkal ()
#1  0x00000000130ac291 in kglLock ()
#2  0x00000000130a70c5 in kglget ()
#3  0x0000000004c04809 in kglgob ()
#4  0x0000000001b1c3ed in kwqicgob ()
#5  0x0000000001b1b7d8 in kwqicrqo0 ()
#6  0x000000000f70f1df in kwqidlqo ()
#7  0x00000000080594f7 in kwqiccllst ()
#8  0x000000000807351a in kwqilintl ()
#9  0x0000000008073c0a in kwqilisten ()
===== BreakPoint: 2-Counter: [ LOCK: 1, PIN: 0, LATCH: 0, MUTEX: 1 ] =====

===== Mutext Rel Seq: 1-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130b21e7 in kglLoadOnLock ()
#2  0x00000000130b0e90 in kgllkal ()
===== BreakPoint: 3-Counter: [ LOCK: 1, PIN: 0, LATCH: 0, MUTEX: 1 ] =====

===== Mutext Rel Seq: 1-2 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130b0ea0 in kgllkal ()
#2  0x00000000130ac291 in kglLock ()
===== BreakPoint: 4-Counter: [ LOCK: 1, PIN: 0, LATCH: 0, MUTEX: 1 ] =====

===== Mutext Rel Seq: 1-3 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130ac35c in kglLock ()
#2  0x00000000130a70c5 in kglget ()
===== BreakPoint: 5-Counter: [ LOCK: 1, PIN: 0, LATCH: 0, MUTEX: 1 ] =====

===== Pin Get Seq: 1 -- kglpin (kgllkhdl Addr(rcx)=>98417B58, Mode(r8)=>2, mutex Addr(r11)=>98417CA8)--
#0  0x00000000130a7e20 in kglpin ()
#1  0x0000000004c048c8 in kglgob ()
#2  0x0000000001b1c3ed in kwqicgob ()
===== BreakPoint: 6-Counter: [ LOCK: 1, PIN: 1, LATCH: 0, MUTEX: 1 ] =====

===== Mutext Get Seq: 2 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130a80dd in kglpin ()
#2  0x0000000004c048c8 in kglgob ()
===== BreakPoint: 7-Counter: [ LOCK: 1, PIN: 1, LATCH: 0, MUTEX: 2 ] =====

===== Mutext Rel Seq: 2-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>0) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130a817e in kglpin ()
#2  0x0000000004c048c8 in kglgob ()
===== BreakPoint: 8-Counter: [ LOCK: 1, PIN: 1, LATCH: 0, MUTEX: 2 ] =====

===== Mutext Get Seq: 3 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>90) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130b4c4f in kglpnal ()
#2  0x00000000130a82b9 in kglpin ()
===== BreakPoint: 9-Counter: [ LOCK: 1, PIN: 1, LATCH: 0, MUTEX: 3 ] =====

===== Mutext Get Seq: 4 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>71) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130afef5 in kglobpn ()
#2  0x00000000130aab2a in kglpim ()
===== BreakPoint: 10-Counter: [ LOCK: 1, PIN: 1, LATCH: 0, MUTEX: 4 ] =====

===== Mutext Rel Seq: 4-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>1) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130b0200 in kglobpn ()
#2  0x00000000130aab2a in kglpim ()
===== BreakPoint: 11-Counter: [ LOCK: 1, PIN: 1, LATCH: 0, MUTEX: 4 ] =====

===== Mutext Rel Seq: 4-2 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>1) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130a858b in kglpin ()
#2  0x0000000004c048c8 in kglgob ()
===== BreakPoint: 12-Counter: [ LOCK: 1, PIN: 1, LATCH: 0, MUTEX: 4 ] =====

===== Pin Rel Seq: 1-1 -- kglpndl(kgllkhdl Addr(r8)=>98417B58) --
#0  0x00000000130b6100 in kglpndl ()
#1  0x00000000130a7756 in kglUnPin ()
#2  0x000000000f70f6bb in kwqidfqo ()
===== BreakPoint: 13-Counter: [ LOCK: 1, PIN: 1, LATCH: 0, MUTEX: 4 ] =====

===== Mutext Get Seq: 5 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>95) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130b63c7 in kglpndl ()
#2  0x00000000130a7756 in kglUnPin ()
===== BreakPoint: 14-Counter: [ LOCK: 1, PIN: 1, LATCH: 0, MUTEX: 5 ] =====

===== Mutext Rel Seq: 5-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>-2022442648) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130b66c9 in kglpndl ()
#2  0x00000000130a7756 in kglUnPin ()
===== BreakPoint: 15-Counter: [ LOCK: 1, PIN: 1, LATCH: 0, MUTEX: 5 ] =====

===== Lock Rel Seq: 1-1 -- kgllkdl (kgllkhdl Addr(r8)=>98417B58) ---
#0  0x00000000130b34d0 in kgllkdl ()
#1  0x00000000130ae53c in kglUnLock ()
#2  0x000000000f70f6c6 in kwqidfqo ()
===== BreakPoint: 16-Counter: [ LOCK: 1, PIN: 1, LATCH: 0, MUTEX: 5 ] =====

===== Mutext Get Seq: 6 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>85) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130b36be in kgllkdl ()
#2  0x00000000130ae53c in kglUnLock ()
===== BreakPoint: 17-Counter: [ LOCK: 1, PIN: 1, LATCH: 0, MUTEX: 6 ] =====

===== Mutext Rel Seq: 6-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130b39f1 in kgllkdl ()
#2  0x00000000130ae53c in kglUnLock ()
===== BreakPoint: 18-Counter: [ LOCK: 1, PIN: 1, LATCH: 0, MUTEX: 6 ] =====

===== Latch Get Seq: 1 -- ksl_get_shared_latch Addr(Latch-98 Child-144 Addr(rdi)=>B4C33CF8, Mode(r8)=>16) --
#0  0x00000000128a3a90 in ksl_get_shared_latch ()
#1  0x000000000128c4e3 in ksrsubscribe ()
#2  0x000000000f71e4fb in kwqidlisten ()
===== BreakPoint: 19-Counter: [ LOCK: 1, PIN: 1, LATCH: 1, MUTEX: 6 ] =====

===== Latch Fre Seq: 1-1 -- kslfre (Latch-98 Child-144 Addr(rdi)=>B4C33CF8) --
#0  0x00000000128aa290 in kslfre ()
#1  0x000000000128c5ae in ksrsubscribe ()
#2  0x000000000f71e4fb in kwqidlisten ()
===== BreakPoint: 20-Counter: [ LOCK: 1, PIN: 1, LATCH: 1, MUTEX: 6 ] =====

===== Mutext Get Seq: 7 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>1) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130ac47e in kglLock ()
#2  0x00000000130a70c5 in kglget ()
===== BreakPoint: 21-Counter: [ LOCK: 1, PIN: 1, LATCH: 1, MUTEX: 7 ] =====

===== Lock Get Seq: 2 -- kgllkal (kgllkhdl Addr(rdx)=>98417B58, Mode(rcx)=>2), kglnaobj_and_schema: MSG_QUEUE_MULTIPLEK --
#0  0x00000000130b0b40 in kgllkal ()
#1  0x00000000130ac291 in kglLock ()
#2  0x00000000130a70c5 in kglget ()
#3  0x0000000004c04809 in kglgob ()
#4  0x0000000001b1c3ed in kwqicgob ()
#5  0x0000000001b1b7d8 in kwqicrqo0 ()
#6  0x000000000f70f1df in kwqidlqo ()
#7  0x000000000f70f938 in kwqidlqowl ()
#8  0x0000000003f5b335 in kwqididqx ()
#9  0x000000000f705bba in kwqididq ()
===== BreakPoint: 22-Counter: [ LOCK: 2, PIN: 1, LATCH: 1, MUTEX: 7 ] =====

===== Mutext Rel Seq: 7-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130b21e7 in kglLoadOnLock ()
#2  0x00000000130b0e90 in kgllkal ()
===== BreakPoint: 23-Counter: [ LOCK: 2, PIN: 1, LATCH: 1, MUTEX: 7 ] =====

===== Mutext Rel Seq: 7-2 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130b0ea0 in kgllkal ()
#2  0x00000000130ac291 in kglLock ()
===== BreakPoint: 24-Counter: [ LOCK: 2, PIN: 1, LATCH: 1, MUTEX: 7 ] =====

===== Mutext Rel Seq: 7-3 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130ac35c in kglLock ()
#2  0x00000000130a70c5 in kglget ()
===== BreakPoint: 25-Counter: [ LOCK: 2, PIN: 1, LATCH: 1, MUTEX: 7 ] =====

===== Pin Get Seq: 2 -- kglpin (kgllkhdl Addr(rcx)=>98417B58, Mode(r8)=>2, mutex Addr(r11)=>98417CA8)--
#0  0x00000000130a7e20 in kglpin ()
#1  0x0000000004c048c8 in kglgob ()
#2  0x0000000001b1c3ed in kwqicgob ()
===== BreakPoint: 26-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 7 ] =====

===== Mutext Get Seq: 8 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130a80dd in kglpin ()
#2  0x0000000004c048c8 in kglgob ()
===== BreakPoint: 27-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 8 ] =====

===== Mutext Rel Seq: 8-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>0) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130a817e in kglpin ()
#2  0x0000000004c048c8 in kglgob ()
===== BreakPoint: 28-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 8 ] =====

===== Mutext Get Seq: 9 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>90) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130b4c4f in kglpnal ()
#2  0x00000000130a82b9 in kglpin ()
===== BreakPoint: 29-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 9 ] =====

===== Mutext Get Seq: 10 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>71) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130afef5 in kglobpn ()
#2  0x00000000130aab2a in kglpim ()
===== BreakPoint: 30-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 10 ] =====

===== Mutext Rel Seq: 10-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>1) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130b0200 in kglobpn ()
#2  0x00000000130aab2a in kglpim ()
===== BreakPoint: 31-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 10 ] =====

===== Mutext Rel Seq: 10-2 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>1) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130a858b in kglpin ()
#2  0x0000000004c048c8 in kglgob ()
===== BreakPoint: 32-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 10 ] =====

===== Pin Rel Seq: 2-1 -- kglpndl(kgllkhdl Addr(r8)=>98417B58) --
#0  0x00000000130b6100 in kglpndl ()
#1  0x00000000130a7756 in kglUnPin ()
#2  0x000000000f70f6bb in kwqidfqo ()
===== BreakPoint: 33-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 10 ] =====

===== Mutext Get Seq: 11 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>95) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130b63c7 in kglpndl ()
#2  0x00000000130a7756 in kglUnPin ()
===== BreakPoint: 34-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 11 ] =====

===== Mutext Rel Seq: 11-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>-2022442648) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130b66c9 in kglpndl ()
#2  0x00000000130a7756 in kglUnPin ()
===== BreakPoint: 35-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 11 ] =====

===== Lock Rel Seq: 2-1 -- kgllkdl (kgllkhdl Addr(r8)=>98417B58) ---
#0  0x00000000130b34d0 in kgllkdl ()
#1  0x00000000130ae53c in kglUnLock ()
#2  0x000000000f70f6c6 in kwqidfqo ()
===== BreakPoint: 36-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 11 ] =====

===== Mutext Get Seq: 12 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>85) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130b36be in kgllkdl ()
#2  0x00000000130ae53c in kglUnLock ()
===== BreakPoint: 37-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 12 ] =====

===== Mutext Rel Seq: 12-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130b39f1 in kgllkdl ()
#2  0x00000000130ae53c in kglUnLock ()
===== BreakPoint: 38-Counter: [ LOCK: 2, PIN: 2, LATCH: 1, MUTEX: 12 ] =====

===== Latch Get Seq: 2 -- ksl_get_shared_latch Addr(Latch-98 Child-144 Addr(rdi)=>B4C33CF8, Mode(r8)=>16) --
#0  0x00000000128a3a90 in ksl_get_shared_latch ()
#1  0x000000000128dc58 in ksrchreset ()
#2  0x000000000f71eae7 in kwqidlisten ()
===== BreakPoint: 39-Counter: [ LOCK: 2, PIN: 2, LATCH: 2, MUTEX: 12 ] =====

===== Latch Fre Seq: 2-1 -- kslfre (Latch-98 Child-144 Addr(rdi)=>B4C33CF8) --
#0  0x00000000128aa290 in kslfre ()
#1  0x000000000128dd92 in ksrchreset ()
#2  0x000000000f71eae7 in kwqidlisten ()
===== BreakPoint: 40-Counter: [ LOCK: 2, PIN: 2, LATCH: 2, MUTEX: 12 ] =====

===== Latch Get Seq: 3 -- ksl_get_shared_latch Addr(Latch-98 Child-144 Addr(rdi)=>B4C33CF8, Mode(r8)=>16) --
#0  0x00000000128a3a90 in ksl_get_shared_latch ()
#1  0x000000000128ddc5 in ksrchreset ()
#2  0x000000000f71eae7 in kwqidlisten ()
===== BreakPoint: 41-Counter: [ LOCK: 2, PIN: 2, LATCH: 3, MUTEX: 12 ] =====

===== Latch Fre Seq: 3-1 -- kslfre (Latch-98 Child-144 Addr(rdi)=>B4C33CF8) --
#0  0x00000000128aa290 in kslfre ()
#1  0x000000000128de19 in ksrchreset ()
#2  0x000000000f71eae7 in kwqidlisten ()
===== BreakPoint: 42-Counter: [ LOCK: 2, PIN: 2, LATCH: 3, MUTEX: 12 ] =====

===== Mutext Get Seq: 13 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>1) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130ac47e in kglLock ()
#2  0x00000000130a70c5 in kglget ()
===== BreakPoint: 43-Counter: [ LOCK: 2, PIN: 2, LATCH: 3, MUTEX: 13 ] =====

===== Mutext Rel Seq: 13-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>626987360) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130ac35c in kglLock ()
#2  0x00000000130a70c5 in kglget ()
===== BreakPoint: 44-Counter: [ LOCK: 2, PIN: 2, LATCH: 3, MUTEX: 13 ] =====

===== Pin Get Seq: 3 -- kglpin (kgllkhdl Addr(rcx)=>98417B58, Mode(r8)=>2, mutex Addr(r11)=>98417B58)--
#0  0x00000000130a7e20 in kglpin ()
#1  0x00000000130a7af7 in kglpnp ()
#2  0x0000000004c062fb in kglgpr ()
===== BreakPoint: 45-Counter: [ LOCK: 2, PIN: 3, LATCH: 3, MUTEX: 13 ] =====

===== Mutext Get Seq: 14 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130a80dd in kglpin ()
#2  0x00000000130a7af7 in kglpnp ()
===== BreakPoint: 46-Counter: [ LOCK: 2, PIN: 3, LATCH: 3, MUTEX: 14 ] =====

===== Mutext Rel Seq: 14-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>0) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130a817e in kglpin ()
#2  0x00000000130a7af7 in kglpnp ()
===== BreakPoint: 47-Counter: [ LOCK: 2, PIN: 3, LATCH: 3, MUTEX: 14 ] =====

===== Mutext Get Seq: 15 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>90) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130b4c4f in kglpnal ()
#2  0x00000000130a82b9 in kglpin ()
===== BreakPoint: 48-Counter: [ LOCK: 2, PIN: 3, LATCH: 3, MUTEX: 15 ] =====

===== Mutext Get Seq: 16 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>71) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130afef5 in kglobpn ()
#2  0x00000000130aab2a in kglpim ()
===== BreakPoint: 49-Counter: [ LOCK: 2, PIN: 3, LATCH: 3, MUTEX: 16 ] =====

===== Mutext Get Seq: 17 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>72) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130b0033 in kglobpn ()
#2  0x00000000130aab2a in kglpim ()
===== BreakPoint: 50-Counter: [ LOCK: 2, PIN: 3, LATCH: 3, MUTEX: 17 ] =====

===== Mutext Rel Seq: 17-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>17) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130b0200 in kglobpn ()
#2  0x00000000130aab2a in kglpim ()
===== BreakPoint: 51-Counter: [ LOCK: 2, PIN: 3, LATCH: 3, MUTEX: 17 ] =====

===== Mutext Rel Seq: 17-2 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>17) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130a858b in kglpin ()
#2  0x00000000130a7af7 in kglpnp ()
===== BreakPoint: 52-Counter: [ LOCK: 2, PIN: 3, LATCH: 3, MUTEX: 17 ] =====

===== Pin Rel Seq: 3-1 -- kglpndl(kgllkhdl Addr(r8)=>98417B58) --
#0  0x00000000130b6100 in kglpndl ()
#1  0x00000000130a7756 in kglUnPin ()
#2  0x0000000008066b4d in kwqidxdeq0 ()
===== BreakPoint: 53-Counter: [ LOCK: 2, PIN: 3, LATCH: 3, MUTEX: 17 ] =====

===== Mutext Get Seq: 18 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>95) --
#0  0x00000000130b9c20 in kglGetMutex ()
#1  0x00000000130b63c7 in kglpndl ()
#2  0x00000000130a7756 in kglUnPin ()
===== BreakPoint: 54-Counter: [ LOCK: 2, PIN: 3, LATCH: 3, MUTEX: 18 ] =====

===== Mutext Rel Seq: 18-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>-2022442648) --
#0  0x00000000130b9430 in kglReleaseMutex ()
#1  0x00000000130b66c9 in kglpndl ()
#2  0x00000000130a7756 in kglUnPin ()
===== BreakPoint: 55-Counter: [ LOCK: 2, PIN: 3, LATCH: 3, MUTEX: 18 ] =====

Sunday, April 9, 2023

Oracle Multi-Consumer AQ dequeue with dbms_aq.listen Performance Analysis and Workaround (III)

Wait Events (I)           Performance Model (II)           Performance Analysis and Workaround (III)     


In the first Blog (I), we ran tests to show the Wait Events "library cache: mutex X" on the queue object and "latch free" on the Latch "channel operations parent latch".

In the second Blog (II), we first performed a dequeue trace with GDB. Based on the test output, we built a Latch and Mutex model, examined it against real DB stats, and revealed implementation problems.

In this third Blog (III), we will trace the dequeue operation with different numbers of parallel sessions using Dtrace, present the performance stats, and evaluate the Blog (II) model against real DB stats.

We conclude the Blog series with a workaround (fix) for the identified "library cache: mutex X" and "latch free" waits.

Note: tested on Oracle 19.7 on Solaris.


1. Single Message Dequeue Latch and Mutex


As in Blog (II), we first get the child latch and mutex addresses as follows:

select addr from v$latch_children where latch#=98 and child#=144;
    -- D371C240

select kglhdmtx, kglhdadr, v.* from  sys.x_KGLOB v where kglnaobj = 'MSG_QUEUE_MULTIPLE';  
    -- 99CF9A20	 99CF98D0
Then we compose the Dtrace script below:

sudo dtrace -n \
'
 BEGIN{
   LATCHCHD_ADDR = 0XD371C240;
   KGLHDMTX_ADDR = 0X99CF9A20;
   l_get_seq = 0; l_rel_seq = 0;
   m_get_seq = 0; m_rel_seq = 0}
   
 pid$target:oracle:ksl_get_shared_latch:entry /arg0 == LATCHCHD_ADDR/  
  {printf("Latch Get Seq: %d --- %s (Addr=>0x%X, Mode=>%d)", ++l_get_seq, probefunc, arg0, arg4)}
  
 pid$target:oracle:kslfre:entry /arg0 == LATCHCHD_ADDR/  
  {printf("Latch Fre Seq: %d-%d --- %s (Addr=>0x%X)", l_get_seq, ++l_rel_seq, probefunc, arg0)}

 pid$target:oracle:kglGetMutex:entry /arg1 == KGLHDMTX_ADDR/ 
  {printf("Mutex Get Seq: %d --- %s (KGLHDMTX_ADDR=>0x%X, Location=>%d, KGLLKHDL_ADDR=>0x%X)", ++m_get_seq, probefunc, arg1, arg4, arg5);}
 
 pid$target:oracle:kglReleaseMutex:entry /arg1 == KGLHDMTX_ADDR/ 
  {printf("Mutex Rel Seq: %d-%d --- %s (KGLHDMTX_ADDR=>0x%X)", m_get_seq, ++m_rel_seq, probefunc, arg1);}

 END{printf("Total Counters: Latch GETs: %d, Latch FREs: %d, Mutex GETs: %d, Mutex RELs: %d", l_get_seq, l_rel_seq, m_get_seq, m_rel_seq)}
' -p 7476
Open one Sqlplus session (UNIX process Pid: 7476) and enqueue one message:

SQL > exec enq(1);
Then start the above Dtrace script on process Pid: 7476.
In the Sqlplus session, run:

SQL > exec listen_deq(1);
Here is the Dtrace output:

  Mutex Get Seq: 1 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>1, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 1-1 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Rel Seq: 1-2 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Rel Seq: 1-3 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Get Seq: 2 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>4, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 2-4 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Get Seq: 3 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>90, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Get Seq: 4 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>71, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 4-5 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Rel Seq: 4-6 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Get Seq: 5 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>95, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 5-7 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Get Seq: 6 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>85, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 6-8 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Latch Get Seq: 1 --- ksl_get_shared_latch (Addr=>0xD371C240, Mode=>16)
  Latch Fre Seq: 1-1 --- kslfre (Addr=>0xD371C240)
  Mutex Get Seq: 7 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>1, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 7-9 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Rel Seq: 7-10 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Rel Seq: 7-11 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Get Seq: 8 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>4, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 8-12 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Get Seq: 9 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>90, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Get Seq: 10 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>71, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 10-13 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Rel Seq: 10-14 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Get Seq: 11 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>95, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 11-15 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Get Seq: 12 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>85, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 12-16 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Latch Get Seq: 2 --- ksl_get_shared_latch (Addr=>0xD371C240, Mode=>16)
  Latch Fre Seq: 2-2 --- kslfre (Addr=>0xD371C240)
  Latch Get Seq: 3 --- ksl_get_shared_latch (Addr=>0xD371C240, Mode=>16)
  Latch Fre Seq: 3-3 --- kslfre (Addr=>0xD371C240)
  Mutex Get Seq: 13 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>1, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 13-17 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Get Seq: 14 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>4, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 14-18 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Get Seq: 15 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>90, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Get Seq: 16 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>71, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Get Seq: 17 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>72, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 17-19 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Rel Seq: 17-20 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Get Seq: 18 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>95, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 18-21 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  
  Total Counters: Latch GETs: 3, Latch FREs: 3, Mutex GETs: 18, Mutex RELs: 21
The above output shows the number of Latch and Mutex GET/FREE (RELEASE) operations for one message dequeue when it runs alone (without any interference):

  Latch GETs: 3
  Latch FREs: 3
  Mutex GETs: 18
  Mutex RELs: 21
We can see that one single message dequeue needs 3 Latch GETs and 3 subsequent Latch FREEs (GETs and FREEs come in matched pairs).

But for the Mutex, there are 18 GETs and 21 RELs. The mismatch comes from consecutive Mutex GETs at different Locations (a GET not followed by its own REL), for example:

  Mutex Get Seq: 3 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>90, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Get Seq: 4 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>71, KGLLKHDL_ADDR=>0x99CF98D0)
and consecutive Mutex RELs in different Locations, for example:

  Mutex Get Seq: 1 --- kglGetMutex (KGLHDMTX_ADDR=>0x99CF9A20, Location=>1, KGLLKHDL_ADDR=>0x99CF98D0)
  Mutex Rel Seq: 1-1 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Rel Seq: 1-2 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
  Mutex Rel Seq: 1-3 --- kglReleaseMutex (KGLHDMTX_ADDR=>0x99CF9A20)
The above Latch and Mutex GET counts match the GDB test outcome in the previous Blog (II) for one single message dequeue:

   3   Latch Gets         (ksl_get_shared_latch)
   18  Mutex Gets         (kglGetMutex)
The Blog (II) GDB trace shows the same Mutex behaviour (consecutive Mutex GETs/RELs):

===== Mutext Get Seq: 3 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>90) --
===== Mutext Get Seq: 4 -- kglGetMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>71) --

===== Lock Get Seq: 1 -- kgllkal (kgllkhdl Addr(rdx)=>98417B58, Mode(rcx)=>2), kglnaobj_and_schema: MSG_QUEUE_MULTIPLEK --
===== Mutext Rel Seq: 1-1 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --
===== Mutext Rel Seq: 1-2 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --
===== Mutext Rel Seq: 1-3 -- kglReleaseMutex(kglhdmtx Addr(rsi)=>98417CA8, Location(r8)=>4) --


2. Parallel Session Test


The above test shows the minimum number of Latch and Mutex GET/FREE (RELEASE) operations for one single message dequeue. We now continue the dequeue test with parallel sessions and monitor the Latch and Mutex behaviour.

First, we compose a Dtrace script to collect test stats (see Section 6. Dtrace Script).

Then we start Dtrace on the Sqlplus session (process Pid: 7476) and run the following test:

----------------- Parallel Sessions = 4 -----------------
begin
  clearup_test;
  purge_queue_table;
  start_enq_jobs(4, 60);
  start_deq_jobs(3, 1, 60);
  sel_deq_loop(4, 1, 60);
end;
/

-- Output
--   Listener Deq Execs (SUC and Failed) = 9420, Duration (seconds) = 60
It starts 4 enqueue Jobs and 3 dequeue Jobs (consuming the messages of the first 3 enqueue sessions) for 60 seconds, while this Sqlplus session itself dequeues the messages of the 4th enqueue session for 60 seconds.

After the test run, we collect dequeue stats with the following query:

-- For each RUN, only select last corr_id, which is the traced session

select corr_id, count(*), min(enq_time), max(enq_time), min(deq_time), max(deq_time),
       (max(enq_time)- min(enq_time))*86400 enq_delta, (max(deq_time) - min(deq_time))*86400 deq_delta
from aq$MultiConsumerMsgs_qtab 
where corr_id = (select 'KSUB_'||count(distinct corr_id) from aq$MultiConsumerMsgs_qtab) 
  and deq_time is not null and msg_state = 'PROCESSED' 
group by corr_id order by 1;

  CORR_ID    COUNT(*)   MIN(ENQ_TIM  MAX(ENQ_TIM  MIN(DEQ_TIM  MAX(DEQ_TIM  ENQ_DELTA   DEQ_DELTA
  ---------- ---------- -----------  -----------  -----------  -----------  ----------  ----------
  KSUB_4     9420       10:56:41     10:57:41     10:56:41     10:57:41     60          60
and extract Latch and Mutex stats from Dtrace output:

  L_CNTGET   ksl_get_shared_latch =       155,862
  L_AVG                average_ns =         2,834
  ALL_MUTEX_GETs      kglGetMutex =       415,825
  ALL_MUTEX_FREs  kglReleaseMutex =       533,074
  M_AVG                average_ns =         4,306
The above output shows that during the 60 seconds, one session (KSUB_4) dequeued 9,420 messages.

  There are 155,862 Latch GETs, much higher than the 28,260 (9,420 * 3) expected if the messages were dequeued without parallel sessions.
  There are 415,825 Mutex GETs, much higher than the 169,560 (9,420 * 18) expected if the messages were dequeued without parallel sessions.
  The average Latch GET takes 2,834 ns, varying between 1,024 and 524,288 ns (see L_QUANT in Section 5.1 Dtrace Output).
  The average Mutex GET takes 4,306 ns, varying between 2,048 and 8,388,608 ns (see M_QUANT in Section 5.1 Dtrace Output).
We run one more test with 64 parallel Sessions and compare the performance:

----------------- Parallel Sessions = 64 -----------------
begin
  clearup_test;
  purge_queue_table;
  start_enq_jobs(64, 60);
  start_deq_jobs(63, 1, 60);
  sel_deq_loop(64, 1, 60);
end;
/

  CORR_ID    COUNT(*)   MIN(ENQ_TIM  MAX(ENQ_TIM  MIN(DEQ_TIM  MAX(DEQ_TIM  ENQ_DELTA   DEQ_DELTA
  ---------- ---------- -----------  -----------  -----------  -----------  ----------  ----------
  KSUB_64    667        11:18:41     11:19:41     11:18:48     11:19:41     60          53

  L_CNTGET  ksl_get_shared_latch =        93,584
  L_AVG               average_ns =        29,496
  ALL_MUTEX_GETs     kglGetMutex =       258,710
  ALL_MUTEX_FREs kglReleaseMutex =       343,015
  M_AVG               average_ns =        40,750  
The above output shows that during the 60 seconds, one session (KSUB_64) dequeued 667 messages.

  There are 93,584 Latch GETs, much higher than the 2,001 (667 * 3) expected if the messages were dequeued without parallel sessions.
  There are 258,710 Mutex GETs, much higher than the 12,006 (667 * 18) expected if the messages were dequeued without parallel sessions.
  The average Latch GET takes 29,496 ns, varying between 1,024 and 268,435,456 ns (see L_QUANT in Section 5.2 Dtrace Output).
  The average Mutex GET takes 40,750 ns, varying between 2,048 and 536,870,912 ns (see M_QUANT in Section 5.2 Dtrace Output).
For the full Dtrace output, see Section 5.1 (4-Parallel Sessions) and Section 5.2 (64-Parallel Sessions).


3. Test Stats for different number of Parallel Sessions


We run the same test for 1, 2, 4, 8, 16, 32, 64 Parallel Sessions. Here are the performance stats.

Dequeue Stats

RUN PX  CORR_ID  COUNT(*)  MIN(ENQ_TIM)  MAX(ENQ_TIM)  MIN(DEQ_TIM)  MAX(DEQ_TIM)  ENQ_DELTA  DEQ_DELTA
--- --  -------  --------  ------------  ------------  ------------  ------------  ---------  ---------
1   1   KSUB_1   11,884    10:49:17      10:50:17      10:49:17      10:50:16       60        59
2   2   KSUB_2   11,640    10:53:24      10:54:24      10:53:24      10:54:24       60        60
3   4   KSUB_4    9,420    10:56:41      10:57:41      10:56:41      10:57:41       60        60
4   8   KSUB_8    6,519    10:59:02      11:00:02      10:59:02      11:00:01       60        59
5   16  KSUB_16   3,181    11:02:08      11:03:08      11:02:08      11:03:07       60        59
6   32  KSUB_32   1,441    11:04:38      11:05:38      11:04:39      11:05:37       60        58
7   64  KSUB_64     667    11:18:41      11:19:41      11:18:48      11:19:41       60        53

Note: the PX column is the number of parallel sessions.

Latch Stats (ns_Per_Latch_Get is Dtrace L_AVG, in nanoseconds)

RUN PX  MSGs    Total_MSGs  Latch_Gets  Total_Latch_Gets  ns_Per_Latch_Get  Latch_Gets_per_deq  Latch_ns_per_Deq
--- --  ------  ----------  ----------  ----------------  ----------------  ------------------  ----------------
1   1   11,884  11,884      108,318     108,318           2,497             9                   22,759   
2   2   11,640  23,280      148,303     296,606           2,562             13                  32,642   
3   4    9,420  37,680      155,862     623,448           2,834             17                  46,891   
4   8    6,519  52,152      243,571     1,948,568         3,969             37                  148,295  
5   16   3,181  50,896      187,259     2,996,144         6,699             59                  394,357  
6   32   1,441  46,112      138,192     4,422,144         10,954            96                  1,050,489
7   64     667  42,688      93,584      5,989,376         29,496            140                 4,138,461

(Latch_ns_per_Deq = ns_Per_Latch_Get * Latch_Gets_per_deq)

Mutex Stats (ns_Per_Mutex_Get is Dtrace M_AVG, in nanoseconds)

RUN PX  MSGs    Total_MSGs  Mutex_Gets  Total_Mutex_Gets  ns_Per_Mutex_Get  Mutex_Gets_per_deq  Mutex_ns_per_Deq     
--- --  ------  ----------  ----------  ----------------  ----------------  ------------------  ----------------
1   1   11,884  11,884      361,105     361,105           3,438             30                  104,466   
2   2   11,640  23,280      429,493     858,986           3,629             37                  133,903   
3   4    9,420  37,680      415,825     1,663,300         4,306             44                  190,079   
4   8    6,519  52,152      630,757     5,046,056         5,337             97                  516,391   
5   16   3,181  50,896      497,035     7,952,560         9,033             156                 1,411,417 
6   32   1,441  46,112      380,492     12,175,744        17,947            264                 4,738,855 
7   64     667  42,688      258,710     16,557,440        40,750            388                 15,805,746

(Mutex_ns_per_Deq = ns_Per_Mutex_Get * Mutex_Gets_per_deq)
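
As a consistency check (a sketch; the raw values are taken from the PX=4 run above, and the column names follow the tables), the derived columns can be reproduced directly in SQL. Note the per-dequeue counts are used unrounded before the multiplication:

with t as (
  select 4 px, 9420 msgs, 155862 latch_gets, 2834 ns_per_latch_get,
         415825 mutex_gets, 4306 ns_per_mutex_get
  from dual)
select px,
       msgs * px                                    total_msgs,          -- 37,680
       latch_gets * px                              total_latch_gets,    -- 623,448
       round(latch_gets / msgs)                     latch_gets_per_deq,  -- 17
       round(ns_per_latch_get * latch_gets / msgs)  latch_ns_per_deq,    -- 46,891
       mutex_gets * px                              total_mutex_gets,    -- 1,663,300
       round(mutex_gets / msgs)                     mutex_gets_per_deq,  -- 44
       round(ns_per_mutex_get * mutex_gets / msgs)  mutex_ns_per_deq     -- 190,079
  from t;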
We can see that the total dequeue throughput (Total_MSGs) increases roughly linearly with the number of Parallel Sessions up to 8 Parallel Sessions, where it reaches its maximum (Total_MSGs = 52,152).

The total numbers of Latch GETs (Total_Latch_Gets) and Mutex GETs (Total_Mutex_Gets) increase almost linearly with the number of Parallel Sessions, and so do the per-dequeue counts (Latch_Gets_per_deq, Mutex_Gets_per_deq).

The Latch and Mutex GET time per dequeue (Latch_ns_per_Deq and Mutex_ns_per_Deq, in nanoseconds) increases much faster once the number of Parallel Sessions exceeds 8.

For example, comparing Mutex_ns_per_Deq for 32 vs. 16 Parallel Sessions: the number of Parallel Sessions doubled, but the time per dequeue increased more than 3 times (4,738,855 / 1,411,417 = 3.36).

Comparing the above DB stats with the Blog (II) model, we can see the model is approximate, but close to the real DB stats.

If we draw a chart of the test stats, we can see a similar trend to the model formulas.
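
For instance, a crude text chart (a sketch; the values are copied from the Mutex Stats table above, one '*' per ~250,000 ns) already shows the steep growth of Mutex_ns_per_Deq:

select px, mutex_ns_per_deq,
       rpad('*', greatest(1, round(mutex_ns_per_deq/250000)), '*') chart
  from (select  1 px,   104466 mutex_ns_per_deq from dual union all
        select  2,      133903 from dual union all
        select  4,      190079 from dual union all
        select  8,      516391 from dual union all
        select 16,     1411417 from dual union all
        select 32,     4738855 from dual union all
        select 64,    15805746 from dual)
 order by px;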


3.1 Test Outputs


Here are all the parallel test runs and their stats in detail.

----------------- 1 Session -----------------
begin
  clearup_test;
  purge_queue_table;
  start_enq_jobs(1, 60);
  sel_deq_loop(1, 1, 60);
end;
/

L_CNTGET   ksl_get_shared_latch =       108318
L_AVG                average_ns =         2497
ALL_MUTEX_GETs      kglGetMutex =       361105
ALL_MUTEX_FREs  kglReleaseMutex =       445357
M_AVG                average_ns =         3438

KSUB_1	11884	10:49:17	10:50:17	10:49:17	10:50:16	60	59

----------------- 2 Parallel Sessions -----------------
begin
  clearup_test;
  purge_queue_table;
  start_enq_jobs(2, 60);
  start_deq_jobs(1, 1, 60);
  sel_deq_loop(2, 1, 60);
end;
/

L_CNTGET  ksl_get_shared_latch =       148303
L_AVG               average_ns =         2562
ALL_MUTEX_GETs     kglGetMutex =       429493
ALL_MUTEX_FREs kglReleaseMutex =       539133
M_AVG               average_ns =         3629

KSUB_2	11640	10:53:24	10:54:24	10:53:24	10:54:24	60	60

----------------- 4 Parallel Sessions -----------------
begin
  clearup_test;
  purge_queue_table;
  start_enq_jobs(4, 60);
  start_deq_jobs(3, 1, 60);
  sel_deq_loop(4, 1, 60);
end;
/

L_CNTGET  ksl_get_shared_latch =       155862
L_AVG               average_ns =         2834
ALL_MUTEX_GETs     kglGetMutex =       415825
ALL_MUTEX_FREs kglReleaseMutex =       533074
M_AVG               average_ns =         4306

KSUB_4	9420	10:56:41	10:57:41	10:56:41	10:57:41	60	60

----------------- Parallel Sessions = 8 -----------------
begin
  clearup_test;
  purge_queue_table;
  start_enq_jobs(8, 60);
  start_deq_jobs(7, 1, 60);
  sel_deq_loop(8, 1, 60);
end;
/

L_CNTGET  ksl_get_shared_latch =       243571
L_AVG               average_ns =         3969
ALL_MUTEX_GETs     kglGetMutex =       630757
ALL_MUTEX_FREs kglReleaseMutex =       822641
M_AVG               average_ns =         5337

KSUB_8	6519	10:59:02	11:00:02	10:59:02	11:00:01	60	59

----------------- 16 Parallel Sessions -----------------
begin
  clearup_test;
  purge_queue_table;
  start_enq_jobs(16, 60);
  start_deq_jobs(15, 1, 60);
  sel_deq_loop(16, 1, 60);
end;
/

L_CNTGET  ksl_get_shared_latch =       187259
L_AVG               average_ns =         6699
ALL_MUTEX_GETs     kglGetMutex =       497035
ALL_MUTEX_FREs kglReleaseMutex =       653138
M_AVG               average_ns =         9033

KSUB_16	3181	11:02:08	11:03:08	11:02:08	11:03:07	60	59

----------------- 32 Parallel Sessions -----------------
begin
  clearup_test;
  purge_queue_table;
  start_enq_jobs(32, 60);
  start_deq_jobs(31, 1, 60);
  sel_deq_loop(32, 1, 60);
end;
/

L_CNTGET  ksl_get_shared_latch =       138192
L_AVG               average_ns =        10954
ALL_MUTEX_GETs     kglGetMutex =       380492
ALL_MUTEX_FREs kglReleaseMutex =       503141
M_AVG               average_ns =        17947

KSUB_32	1426	11:04:38	11:05:38	11:04:39	11:05:37	60	58

----------------- Parallel Sessions = 64 -----------------
begin
  clearup_test;
  purge_queue_table;
  start_enq_jobs(64, 60);
  start_deq_jobs(63, 1, 60);
  sel_deq_loop(64, 1, 60);
end;
/

L_CNTGET  ksl_get_shared_latch =        93584
L_AVG               average_ns =        29496
ALL_MUTEX_GETs     kglGetMutex =       258710
ALL_MUTEX_FREs kglReleaseMutex =       343015
M_AVG               average_ns =        40750

KSUB_63	667	11:18:41	11:19:41	11:18:48	11:19:41	60	53


4. Workaround


To mitigate "library cache: mutex X" on the queue object, we can mark the hot object as follows:

select object_name, namespace from dba_objects where object_name = 'MSG_QUEUE_MULTIPLE';
  -- MSG_QUEUE_MULTIPLE	10

alter system set "_kgl_hot_object_copies"= 254 scope=spfile;

alter system set "_kgl_debug"= "name='MSG_QUEUE_MULTIPLE' schema='K' namespace=10 debug=33554432" scope=spfile;

  -- See Blog: "library cache: mutex X" and Application Context 
  --           (http://ksun-oracle.blogspot.com/2016/11/library-cache-mutex-x-and-application.html)
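
Both parameters are set with scope=spfile, so they only take effect after an instance restart. One way to verify them afterwards (a sketch; run as SYS, since it reads X$ views) is:

select i.ksppinm parameter_name, v.ksppstvl parameter_value
  from x$ksppi i, x$ksppcv v
 where i.indx = v.indx
   and i.ksppinm in ('_kgl_hot_object_copies', '_kgl_debug');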
To work around "latch free" on Latch "channel operations parent latch", we use subscriber-based dequeue instead of dbms_aq.listen. See Blog (I), procedure subscr_deq.

In Blog (I), we created 100 subscribers by "create_subscribers('msg_queue_multiple', 100)". If a subscriber does not exist, Oracle temporarily creates one and uses it for enqueue and dequeue, for example:

begin
  for i in 1..1035 loop
    enq(9990000+i);
    listen_deq(9990000+i, 10); 
  end loop;
end;
/

-- only 1024 rows are listed; not all of the 1035 subscribers are exposed in v$persistent_subscribers.
select count(*) from v$persistent_subscribers where queue_name = 'MSG_QUEUE_MULTIPLE' and subscriber_name like 'KSUB_999%';
  -- 1024

select count(*) from v$persistent_subscribers where queue_name = 'MSG_QUEUE_MULTIPLE' and subscriber_name NOT like 'KSUB_999%';
  -- 100
  
select count(*) from dba_queue_subscribers where queue_name = 'MSG_QUEUE_MULTIPLE' and consumer_name like 'KSUB_999%';
  -- 0

select count(*) from dba_queue_subscribers where queue_name = 'MSG_QUEUE_MULTIPLE' and consumer_name NOT like 'KSUB_999%';
  -- 100
  
-- all messages are stored in queue table
select * from multiconsumermsgs_qtab where corrid like 'KSUB_999%' order by corrid;
Then we can see 1024 new subscribers with names like 'KSUB_999%' (a maximum of 1024 are kept, even though 1035 were created).

The temporarily created subscribers are visible in v$persistent_subscribers (although not all of them are exposed there), but they are not stored in dba_queue_subscribers.

dba_queue_subscribers contains the list of subscribers on all queues. They are created by dbms_aqadm.add_subscriber and can be removed by dbms_aqadm.remove_subscriber. They persist across DB restarts.

The maximum number of subscribers per queue is 1024; exceeding it raises the error:

  maximum 1024:  ORA-24067: exceeded maximum number of subscribers for queue MSG_QUEUE_MULTIPLE

     See Blog: One Test of ORA-00600: [KGL-heap-size-exceeded] With AQ Subscriber 
        (http://ksun-oracle.blogspot.com/2023/03/one-test-of-ora-00600-kgl-heap-size.html)
According to the Oracle documentation, v$persistent_subscribers displays information about all active subscribers of the persistent queues in the database
(the above test shows that this is not entirely the case: only 1024 of the 1035 active subscribers are exposed in v$persistent_subscribers).
There is one row per instance per queue per subscriber. The rows are deleted when the database restarts.

So it is better to use a pre-defined, limited number of subscribers to improve performance.
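
For reference, here is a minimal sketch of the subscriber-based approach (the subscriber name 'KSUB_1' and the payload type msg_type are illustrative assumptions; the actual test procedures are in the Blog (I) test code):

declare
  deq_opt   dbms_aq.dequeue_options_t;
  msg_prop  dbms_aq.message_properties_t;
  msg_id    raw(16);
  payload   msg_type;                       -- hypothetical payload type
begin
  -- register the consumer once; it is then listed in dba_queue_subscribers
  dbms_aqadm.add_subscriber(
      queue_name => 'msg_queue_multiple',
      subscriber => sys.aq$_agent('KSUB_1', null, null));

  -- dequeue as that named subscriber instead of going through dbms_aq.listen
  deq_opt.consumer_name := 'KSUB_1';
  deq_opt.wait          := 10;              -- wait up to 10 seconds for a message
  dbms_aq.dequeue(
      queue_name         => 'msg_queue_multiple',
      dequeue_options    => deq_opt,
      message_properties => msg_prop,
      payload            => payload,
      msgid              => msg_id);
  commit;
end;
/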


5. Dtrace Output for Parallel Sessions



5.1 4-Parallel Sessions



----------------- 4 Parallel Sessions -----------------

BEGIN at 10:56:31
END   at 10:57:49

ELAPSED_Seconds = 77
------------------ Latch Child Stats ------------------
L_GET_CNT  = 155862
L_TOTAL_NS = 441736077
L_CNTGET ksl_get_shared_latch =       155862
L_CNTFRE               kslfre =       155862
L_AVG              average_ns =         2834
L_QUANT          quantize_ns  =
           value  ------------- Distribution ------------- count
             512 |                                         0
            1024 |@@@@@@@@@@@@@@@@@@@@@@@@                 94257
            2048 |@@@@@@@@@@@                              42290
            4096 |@@@@                                     17383
            8192 |                                         1127
           16384 |                                         287
           32768 |                                         158
           65536 |                                         337
          131072 |                                         17
          262144 |                                         2
          524288 |                                         4
         1048576 |                                         0

------------------ Mutex Stats ------------------
ALL_MUTEX_GETs      kglGetMutex =       415825
ALL_MUTEX_FREs      kglReleaseMutex =       533074

M_GET_CNT  = 339405
M_TOTAL_NS = 1461574177
M_CNTGET          kglGetMutex,   21 (Mode) =            1
M_CNTGET          kglGetMutex,  109 (Mode) =            1
M_CNTGET          kglGetMutex,  112 (Mode) =            1
M_CNTGET          kglGetMutex,    6 (Mode) =            2
M_CNTGET          kglGetMutex,   28 (Mode) =            2
M_CNTGET          kglGetMutex,   75 (Mode) =            2
M_CNTGET          kglGetMutex,  106 (Mode) =            2
M_CNTGET          kglGetMutex,  157 (Mode) =            2
M_CNTGET          kglGetMutex,   85 (Mode) =        62185
M_CNTGET          kglGetMutex,    1 (Mode) =        69301
M_CNTGET          kglGetMutex,    4 (Mode) =        69302
M_CNTGET          kglGetMutex,   90 (Mode) =        69302
M_CNTGET          kglGetMutex,   95 (Mode) =        69302
M_CNTFRE      kglReleaseMutex =       339405
M_AVG              average_ns =         4306
M_QUANT           quantize_ns =
           value  ------------- Distribution ------------- count
            1024 |                                         0
            2048 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@             235503
            4096 |@@@@@@@@@@@@                             98902
            8192 |                                         3262
           16384 |                                         947
           32768 |                                         568
           65536 |                                         119
          131072 |                                         44
          262144 |                                         14
          524288 |                                         12
         1048576 |                                         9
         2097152 |                                         6
         4194304 |                                         5
         8388608 |                                         14
        16777216 |                                         0


5.2 64-Parallel Sessions



---------- Parallel Sessions = 64 ----------

BEGIN at 11:18:35
END   at 11:19:53

ELAPSED_Seconds = 77
------------------ Latch Child Stats ------------------
L_GET_CNT  = 93584
L_TOTAL_NS = -1534546885
L_CNTGET ksl_get_shared_latch =        93584
L_CNTFRE               kslfre =        93584
L_AVG              average_ns =        29496
L_QUANT          quantize_ns  =
           value  ------------- Distribution ------------- count
             512 |                                         0
            1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@             64970
            2048 |@@@@@@@@@                                21078
            4096 |@@                                       5382
            8192 |                                         459
           16384 |                                         680
           32768 |                                         483
           65536 |                                         302
          131072 |                                         114
          262144 |                                         26
          524288 |                                         26
         1048576 |                                         11
         2097152 |                                         9
         4194304 |                                         17
         8388608 |                                         10
        16777216 |                                         4
        33554432 |                                         4
        67108864 |                                         4
       134217728 |                                         3
       268435456 |                                         2
       536870912 |                                         0

------------------ Mutex Stats ------------------
ALL_MUTEX_GETs      kglGetMutex =       258710
ALL_MUTEX_FREs      kglReleaseMutex =       343015

M_GET_CNT  = 214951
M_TOTAL_NS = 169358324
M_CNTGET          kglGetMutex,   21 (Mode) =            1
M_CNTGET          kglGetMutex,   75 (Mode) =            1
M_CNTGET          kglGetMutex,  157 (Mode) =            1
M_CNTGET          kglGetMutex,    6 (Mode) =            2
M_CNTGET          kglGetMutex,   28 (Mode) =            2
M_CNTGET          kglGetMutex,  106 (Mode) =            2
M_CNTGET          kglGetMutex,   85 (Mode) =        42475
M_CNTGET          kglGetMutex,    1 (Mode) =        43116
M_CNTGET          kglGetMutex,    4 (Mode) =        43117
M_CNTGET          kglGetMutex,   90 (Mode) =        43117
M_CNTGET          kglGetMutex,   95 (Mode) =        43117
M_CNTFRE      kglReleaseMutex =       214951
M_AVG              average_ns =        40750
M_QUANT           quantize_ns =
           value  ------------- Distribution ------------- count
            1024 |                                         0
            2048 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@          166007
            4096 |@@@@@@@@                                 40643
            8192 |@                                        2751
           16384 |                                         1731
           32768 |                                         1890
           65536 |                                         891
          131072 |                                         405
          262144 |                                         211
          524288 |                                         158
         1048576 |                                         73
         2097152 |                                         31
         4194304 |                                         59
         8388608 |                                         47
        16777216 |                                         17
        33554432 |                                         18
        67108864 |                                         7
       134217728 |                                         5
       268435456 |                                         5
       536870912 |                                         2
      1073741824 |                                         0


6. Dtrace Script: Latch and Mutex Stats



--------- Latch and Mutex (not considering Mutex consecutive Gets and Releases) ---------

sudo dtrace -n \
'
BEGIN{
   START_TS = timestamp;
   LATCHCHD_ADDR = 0XD371C240;
   L_GET_CNT  = 0; L_TOTAL_NS = 0;
   KGLHDMTX_ADDR = 0X99CF9A20;
   M_GET_CNT  = 0; M_TOTAL_NS = 0;
   m_get = 0; m_rel = 1;       /* exclude Mutex consecutive Gets and Release */
   printf("\nBEGIN at %Y ", walltimestamp);}

 pid$target:oracle:ksl_get_shared_latch:entry /arg0 == LATCHCHD_ADDR/  
  {self->L_ts = timestamp; L_GET_CNT = L_GET_CNT + 1;
   @L_CNTGET[probefunc] = count()}
  
 pid$target:oracle:kslfre:entry /arg0 == LATCHCHD_ADDR/  
  {L_TOTAL_NS = L_TOTAL_NS + (timestamp - self->L_ts);
   @L_AVG["average_ns"]    = avg(timestamp - self->L_ts);
   @L_QUANT["quantize_ns"] = quantize(timestamp - self->L_ts);
   @L_CNTFRE[probefunc]    = count()} 

 pid$target:oracle:kglGetMutex:entry /arg1 == KGLHDMTX_ADDR/
  {@M_ALL_CNTGET[probefunc] = count();} 

 pid$target:oracle:kglReleaseMutex:entry /arg1 == KGLHDMTX_ADDR/    
  {@M_ALL_CNTFRE[probefunc] = count();}
  
 pid$target:oracle:kglGetMutex:entry /arg1 == KGLHDMTX_ADDR && m_get == 0 && m_rel == 1/
  {self->M_ts = timestamp; M_GET_CNT = M_GET_CNT + 1;
   @M_CNTGET[probefunc, arg4]        = count();
   m_get = 1; m_rel = 0}
  
 pid$target:oracle:kglReleaseMutex:entry /arg1 == KGLHDMTX_ADDR && m_get == 1 && m_rel == 0/    
  {M_TOTAL_NS = M_TOTAL_NS + (timestamp - self->M_ts); 
   @M_CNTFRE[probefunc]    = count();
   @M_AVG["average_ns"]    = avg(timestamp - self->M_ts);
   @M_QUANT["quantize_ns"] = quantize(timestamp - self->M_ts);
   m_get = 0; m_rel = 1} 
   
END{printf("\nEND at %Y\n", walltimestamp);
    printf("\nELAPSED_Seconds = %i",     (timestamp - START_TS)/1000/1000/1000);
    printf("\n------------------ Latch Child Stats ------------------");
    printf("\nL_GET_CNT  = %i",         L_GET_CNT);
    printf("\nL_TOTAL_NS = %i",         L_TOTAL_NS);
    /* printf("\nL_AVG_GET_NS = %i",       L_TOTAL_NS/L_GET_CNT);  same as L_AVG */
    printa("\nL_CNTGET %20s = %12@d",   @L_CNTGET);
    printa("\nL_CNTFRE %20s = %12@d",   @L_CNTFRE);
    printa("\nL_AVG    %20s = %12@d",       @L_AVG);
    printa("\nL_QUANT %20s  = %12@d",   @L_QUANT);
    
    printf("\n------------------ Mutex Stats ------------------");
    printa("\nALL_MUTEX_GETs %20s = %12@d",       @M_ALL_CNTGET);
    printa("\nALL_MUTEX_FREs %20s = %12@d\n",     @M_ALL_CNTFRE);
    printf("\nM_GET_CNT  = %i",                   M_GET_CNT);
    printf("\nM_TOTAL_NS = %i",                   M_TOTAL_NS);
    /* printf("\nM_AVG_GET_NS = %i",                 M_TOTAL_NS/M_GET_CNT);  same as M_AVG */
    printa("\nM_CNTGET %20s, %4d (Mode) = %12@d", @M_CNTGET);
    printa("\nM_CNTFRE %20s = %12@d",             @M_CNTFRE);
    printa("\nM_AVG    %20s = %12@d",                @M_AVG);
    printa("\nM_QUANT  %20s = %12@d",             @M_QUANT);}
' -p 7476


-- To run it as a script, replace "-p 7476" above with "-p $1", then call for example:
--    ksh ./aq_latch_mutex_dtrace 7476