Tuesday, November 5, 2024

Oracle Self-tune checkpointing lag of log switch

Often DB alert.log shows a lag of 5 minutes from log switch to Completed checkpoint as follows:
(from 11:03:36 to 11:08:39)

2024-11-04T11:03:36.681460+01:00
Beginning log switch checkpoint up to RBA [0x8b.2.10], SCN: 13022649669494
2024-11-04T11:03:36.681565+01:00
Thread 1 advanced to log sequence 139 (LGWR switch),  current SCN: 13022649669494
  Current log# 3 seq# 139 mem# 0: /oratestdb/oradata/testdb/redo03.dbf
2024-11-04T11:08:39.613734+01:00
Completed checkpoint up to RBA [0x8b.2.10], SCN: 13022649669494
is this a LGWR performance problem ?

In this Blog, we will demonstrate those LGWR and CKPT asynchronous write and 3 hidden parameters with Linux bpftrace by following the advanced guidance Let's trace it! A peek into the inner workings of Oracle with bpftrace von Christoph Lutz.

  _selftune_checkpointing_lag		Self-tune checkpointing lag the tail of the redo log   300
  _disable_selftune_checkpointing	Disable self-tune checkpointing                        FALSE
  _selftune_checkpoint_write_pct	Percentage of total physical i/os for self-tune ckpt   3    
Note: Tested in Oracle 19.24.

In Linux window, start bpftrace script below:

BPFTRACE_STRLEN=200 bpftrace -e '
 tracepoint:syscalls:sys_enter_openat /(strncmp("ora_lgwr", comm, 7) == 0 || strncmp("ora_ckpt", comm, 7) == 0) && strncmp("/oradir", str(args->filename), 7) == 0/ 
   {time("====== %H:%M:%S ------ ");
    printf("comm=%s, pid=%d, filename ------\n%8s%s\n", comm, pid, "", str(args->filename));}
 tracepoint:syscalls:sys_enter_write /(strncmp("ora_lgwr", comm, 7) == 0 || strncmp("ora_ckpt", comm, 7) == 0) && strncmp("<",str(args->buf), 1) == 1/ 
   {time("====== %H:%M:%S ------ ");
    printf("comm=%s, pid=%d, fd=%d, BUF ------\n%s\n", comm, pid, args->fd, str(args->buf));}'
In Sqlplus window, run following SQL:

--create table test_tab as select 1 x from dual; 

update tt1 set x = x + 1;
commit;
alter system switch logfile;
Back to Linux Window, we can see following output during around 5 minutes:

Attaching 2 probes...

====== 11:03:36 ------ comm=ora_lgwr_testdb, pid=1926182, filename ------
        /oradir/app/oracle/admin/testdb/diag/rdbms/testdb/testdb/alert/log.xml
====== 11:03:36 ------ comm=ora_lgwr_testdb, pid=1926182, filename ------
        /oradir/app/oracle/admin/testdb/diag/rdbms/testdb/testdb/trace/alert_testdb.log
====== 11:03:36 ------ comm=ora_lgwr_testdb, pid=1926182, fd=8, BUF ------
2024-11-04T11:03:36.681460+01:00
Beginning log switch checkpoint up to RBA [0x8b.2.10], SCN: 13022649669494

====== 11:03:36 ------ comm=ora_lgwr_testdb, pid=1926182, filename ------
        /oradir/app/oracle/admin/testdb/diag/rdbms/testdb/testdb/alert/log.xml
====== 11:03:36 ------ comm=ora_lgwr_testdb, pid=1926182, filename ------
        /oradir/app/oracle/admin/testdb/diag/rdbms/testdb/testdb/trace/alert_testdb.log
====== 11:03:36 ------ comm=ora_lgwr_testdb, pid=1926182, fd=8, BUF ------
2024-11-04T11:03:36.681565+01:00
Thread 1 advanced to log sequence 139 (LGWR switch),  current SCN: 13022649669494

====== 11:03:36 ------ comm=ora_lgwr_testdb, pid=1926182, filename ------
        /oradir/app/oracle/admin/testdb/diag/rdbms/testdb/testdb/alert/log.xml
====== 11:03:36 ------ comm=ora_lgwr_testdb, pid=1926182, filename ------
        /oradir/app/oracle/admin/testdb/diag/rdbms/testdb/testdb/trace/alert_testdb.log
====== 11:03:36 ------ comm=ora_lgwr_testdb, pid=1926182, fd=8, BUF ------
  Current log# 3 seq# 139 mem# 0: /oratestdb/oradata/testdb/redo03.dbf

====== 11:08:39 ------ comm=ora_ckpt_testdb, pid=1926186, filename ------
        /oradir/app/oracle/admin/testdb/diag/rdbms/testdb/testdb/alert/log.xml
====== 11:08:39 ------ comm=ora_ckpt_testdb, pid=1926186, filename ------
        /oradir/app/oracle/admin/testdb/diag/rdbms/testdb/testdb/trace/alert_testdb.log
====== 11:08:39 ------ comm=ora_ckpt_testdb, pid=1926186, fd=10, BUF ------
2024-11-04T11:08:39.613734+01:00
Completed checkpoint up to RBA [0x8b.2.10], SCN: 13022649669494
Both LGWR and CKPT write to DIAG "log.xml" and alert.log (xml content in "log.xml" are filtered out).
LGWR records redo log activities at beginning (11:03:36), and only after about 5 minutes,
CKPT sends "Completed checkpoint" (11:08:39) with same "RBA [0x8b.2.10]".

If we look redo logs immediately after switch logfile, the current one is marked with log switch timestamp (11:03):

oracle@testdb:testdb > ls -ltr /oratestdb/oradata/testdb/redo*
-rw-r----- 1 oracle dba 1073742336 Nov  4 10:19 /oratestdb/oradata/testdb/redo04.dbf
-rw-r----- 1 oracle dba 1073742336 Nov  4 10:21 /oratestdb/oradata/testdb/redo01.dbf
-rw-r----- 1 oracle dba 1073742336 Nov  4 11:03 /oratestdb/oradata/testdb/redo03.dbf
-rw-r----- 1 oracle dba 1073742336 Nov  4 11:03 /oratestdb/oradata/testdb/redo02.dbf
As tested, the 5 minutes is an upper limit and controlled by the default 300 seconds of "_selftune_checkpointing_lag".

If we change it as (require DB restart):

alter system set "_selftune_checkpointing_lag"=60 scope=spfile;
The "Completed checkpoint" is written after 1 minutes.

If we disable the Self-tune by (require DB restart):

alter system set "_disable_selftune_checkpointing"=true scope=spfile;
CKPT will no more write "Completed checkpoint" in alert.log.

Of course, if we want to watch "Completed checkpoint", we can force a checkpoint by:

alter system checkpoint;