For both SQL and Plsql, we show one slow and one fast version.
Then we run tests, compare and investigate their performance difference.
In next Blog: Reading CLOB Line by Line with SQL and Plsql: ChatGPT and Gemini Implementations
we will give a look of SQL and Plsql implementations provided by ChatGPT and Gemini.
Note: Tested on Oracle 19c.
1. Test Setup
1.1 Create Table and Filling Procedure
drop table test_tab_lob cascade constraints;
create table test_tab_lob(id number, mylob clob, note varchar2(100));
create or replace procedure create_test_data(p_id number, p_cnt number, p_last_line_no_LF number := 0) as
LF constant varchar2(1) := chr(10);
l_mylob clob;
begin
insert into test_tab_lob values (p_id, empty_clob(), 'CLOB Line Count = '||p_cnt||', Line_Last_Line_No_LF = '||p_last_line_no_LF)
returning mylob into l_mylob;
for i in 1..p_cnt loop
dbms_lob.append(dest_lob => l_mylob, src_lob => rpad('Line_'||i||'-', least(10+i-1, 1000), 'X')||LF);
end loop;
if p_last_line_no_LF = 1 then
dbms_lob.append(dest_lob => l_mylob, src_lob => rpad('Line_Last_Line_No_LF-', 20, 'Y')); -- add last line without LF
end if;
dbms_output.put_line(length(l_mylob));
commit;
end;
/
1.2. Create Test Data
---- Smoking Test with LF at the end of Last Line
exec create_test_data(1, 5);
---- Smoking Test without LF at the end of Last Line
exec create_test_data(2, 5, 1);
---- CLOB with 1000 Lines
exec create_test_data(3, 1000);
---- CLOB with 10*1000 Lines
exec create_test_data(4, 10*1000);
---- CLOB with 100*1000 Lines
exec create_test_data(5, 100*1000);
1.3. Check Table CLOB Size
select id, length(mylob), note, round(dbms_lob.getlength(mylob)/1024/1024, 2) LOB_MB from test_tab_lob t;
ID LENGTH(MYLOB) NOTE LOB_MB
---- ------------- ------------------------------------------------------------ ----------
1 65 CLOB Line Count = 5, Line_Last_Line_No_LF = 0 0
2 85 CLOB Line Count = 5, Line_Last_Line_No_LF = 1 0
3 510455 CLOB Line Count = 1000, Line_Last_Line_No_LF = 0 .49
4 9519455 CLOB Line Count = 10000, Line_Last_Line_No_LF = 0 9.08
5 99609455 CLOB Line Count = 100000, Line_Last_Line_No_LF = 0 94.99
with sq as (select /*+ materialize */ segment_name, index_name from dba_lobs where owner='K' and table_name='TEST_TAB_LOB')
select t.segment_name, tablespace_name, header_file, relative_fno, header_block, bytes, blocks, extents, round(bytes/1024/1024) mb from dba_segments t, sq
where t.segment_name = sq.segment_name or t.segment_name = sq.index_name;
SEGMENT_NAME TABLESPACE_NAME HEADER_FILE RELATIVE_FNO HEADER_BLOCK BYTES BLOCKS EXTENTS MB
--------------------------------- -------------------- ----------- ------------ ------------ ---------- ---------- ---------- ----------
SYS_IL0005911971C00002$$ U1 22 1024 2830970 10485760 1280 25 10
SYS_LOB0005911971C00002$$ U1 22 1024 2830962 1143996416 139648 201 1091
(see Blog: "One Oracle CLOB Space Usage Test":
"https://ksun-oracle.blogspot.com/2025/02/one-oracle-clob-space-usage-test.html")
2. SQL Recursive Query
2.1 SQL Slow Recursive
In Post: "Reading clob line by line with pl\sql"
(https://stackoverflow.com/questions/11647041/reading-clob-line-by-line-with-pl-sql),
there is one ANSI Standard Recursive Query to read clob line by line:
create or replace procedure SQL_SLOW (p_id number) as
--v_tmp clob := 'aaaa'||chr(10)||'bbb'||chr(10)||'ccccc';
v_tmp clob;
l_cnt number := 0;
l_len number := 0;
begin
select mylob into v_tmp from test_tab_lob where id = p_id;
for rec in (with clob_table(c) as (select v_tmp c from dual)
select regexp_substr(c, '.+', 1, level) text,level line
from clob_table
connect by level <= regexp_count(c, '.+'))
loop
--dbms_output.put_line(rec.text);
l_cnt := l_cnt+1;
l_len := l_len+length(rec.text);
end loop;
dbms_output.put_line('CNT = '||l_cnt ||', Length = '||l_len);
end;
/
-- exec SQL_SLOW(1);
This code has Quadratic Performance (O(n^2)). Each line is read from first CLOB line position, hence n*n/2 scans.Session shows intensive Oracle Wait Event "direct path read".
With perf tool, we observe following top Oracle subroutine calls:
$ > perf top -p 2450579 -d 2
PerfTop: 744 irqs/sec kernel: 0.0% exact: 0.0% lost: 0/0 drop: 0/0 [4000Hz cpu-clock:uhpppH], (target_pid: 2450579)
-------------------------------------------------------------------------------------------------------------------------------------
10.72% oracle [.] lxregmatch
3.88% oracle [.] sxorchk
3.17% oracle [.] lxregmatpush
3.04% oracle [.] lxoCntByte
2.71% oracle [.] lxpoCmpStr
2.2 SQL Fast Recursive
create or replace procedure SQL_FAST (p_id number) as
l_cnt number := 0;
l_len number := 0;
begin
for c in (
with
--lob as (select 1 id, length(mylob)+1 lob_len, mylob||chr(10) mylob from
-- (select to_clob('abc'||chr(10)||'x1'||chr(10)||chr(10)||'x2'||chr(10)||'x3') mylob from dual))
lob as (select id, length(mylob)+1 lob_len, mylob||chr(10) mylob from test_tab_lob where id = p_id)
,recur_tab(id, lob_len, mylob, line, total_len, lf_pos, lf_pos_prior, lvl) as (
select id, lob_len, mylob, dbms_lob.substr(mylob, instr(mylob, chr(10), 1, 1)-1, 1) line
,0 total_len, instr(mylob, chr(10), 1, 1) lf_pos, 0 lf_pos_prior, 0 lvl
from lob
union all
select id, lob_len, mylob, dbms_lob.substr(mylob, lf_pos - 1 - lf_pos_prior, lf_pos_prior+1) line
,total_len + length(line) + 1 total_len
,instr(mylob, chr(10), lf_pos+1, 1) lf_pos, lf_pos lf_pos_prior, lvl+1 lvl
from recur_tab where total_len <= length(mylob))
select t.*, length(line) line_len, dbms_lob.substr(line, length(line), 1) line_text
from recur_tab t where line is not null and lf_pos_prior > 0)
loop
l_cnt := l_cnt+1;
l_len := l_len+length(c.line);
end loop;
dbms_output.put_line('CNT = '||l_cnt ||', Length = '||l_len);
end;
/
-- exec SQL_FAST(1);
The above code has Linear Prformance (O(n)). Each line is read from next CLOB line position, hence n scans.Post: "Oracle: Read from CLOB column line by line and insert to the table"
(https://dba.stackexchange.com/questions/10893/oracle-read-from-clob-column-line-by-line-and-insert-to-the-table)
also has some similar code, which wrote:
"This is not a full answer because it only works if your clob is less than 4000 chars."
3. Plsql Implementations
3.1 Plsql Fast Version
In Post: "Reading clob line by line with pl\sql"
(https://stackoverflow.com/questions/11647041/reading-clob-line-by-line-with-pl-sql),
there is one Plsql code: "procedure parse_clob" as follows:
create or replace procedure PLSQL_FAST (p_id number) is
p_clob clob;
l_offset pls_integer:=1;
l_line varchar2(32767);
l_total_length pls_integer := length(p_clob);
l_line_length pls_integer;
l_cnt number := 0;
l_len number := 0;
begin
select mylob into p_clob from test_tab_lob where id = p_id;
l_total_length := length(p_clob);
dbms_output.put_line('l_total_length = '|| l_total_length);
while l_offset <= l_total_length loop
l_line_length := instr(p_clob, chr(10), l_offset) - l_offset;
if l_line_length < 0 then
l_line_length := l_total_length + 1 - l_offset;
end if;
l_line:=substr(p_clob, l_offset, l_line_length);
--dbms_output.put_line(l_line); --do line processing
l_offset:=l_offset + l_line_length + 1;
l_cnt := l_cnt + 1;
l_len := l_len + l_line_length;
end loop;
dbms_output.put_line('CNT = '||l_cnt ||', Length = '||l_len);
end;
/
-- exec PLSQL_FAST(1);
Each line is read from next CLOB line offset.
3.2. Plsql Slow Version
create or replace procedure PLSQL_SLOW(p_id number) is
p_clob clob;
l_offset pls_integer := 1;
l_line varchar2(32767);
l_total_length pls_integer := length(p_clob);
l_line_length pls_integer;
l_lf_pos pls_integer;
l_cnt number := 0;
l_len number := 0;
begin
select mylob into p_clob from test_tab_lob where id = p_id;
l_total_length := length(p_clob);
dbms_output.put_line('l_total_length = '|| l_total_length);
while l_offset <= l_total_length loop
l_lf_pos := instr(p_clob, chr(10));
l_line_length := l_lf_pos - 1;
--dbms_output.put_line('l_offset = '||l_offset);
--dbms_output.put_line('l_line_length = '||l_line_length);
-- In the case of CLOB last line without CHR(10), l_line_length < 0.
-- instr: If substring not found, INSTR will return 0, hence l_line_length = - l_offset.
if l_line_length < 0 then
l_line_length := l_total_length + 1 - l_offset;
end if;
l_line := substr(p_clob, 1, l_line_length);
--dbms_output.put_line('New l_line_length = '||l_line_length);
--dbms_output.put_line(l_line); --do line processing
l_offset := l_offset + l_line_length + 1;
p_clob := substr(p_clob, l_lf_pos + 1);
l_cnt := l_cnt + 1;
l_len := l_len + l_line_length;
end loop;
dbms_output.put_line('CNT = '||l_cnt ||', Length = '||l_len);
end;
/
-- exec PLSQL_SLOW(1);
CLOB loop calling of substr triggers many dynamic LOB Creations/Destructions in session PL/SQL (Oracle subroutine "__intel_avx_rep_memcpy": UNIX memcpy).
perf tool shows top Oracle calls:
$ > perf top -p 2450579 -d 2
PerfTop: 3878 irqs/sec kernel: 0.0% exact: 0.0% lost: 0/0 drop: 0/37 [4000Hz cpu-clock:uhpppH], (target_pid: 2450579)
-------------------------------------------------------------------------------------------------------------------------------------
26.05% oracle [.] __intel_avx_rep_memcpy
6.36% oracle [.] kcbgcur
4.76% oracle [.] kcbgtcr
4.05% oracle [.] kcbchg1_main
3.55% oracle [.] kcbrls
Post "Testing Oracle's Use of Optane Persistent Memory, Part 1 - Low Latency Commits"(https://tanelpoder.com/posts/testing-oracles-use-of-optane-persistent-memory/)
wrote: "This is the ìmemcpyî (actually __intel_avx_rep_memcpy)"
4. Test Outcome
4.1. SQL Slow
SQL > exec SQL_SLOW(3);
CNT = 1,000, Length = 509,455
Elapsed: 00:04:45.59
SQL > exec SQL_SLOW(4);
CNT = 10,000, Length = 9,509,455
Elapsed: 09:49:48.05
SQL > exec SQL_SLOW(5);
-- Not finished after 10 hours
4.2. SQL Fast
SQL > exec SQL_FAST(3);
CNT = 1,000, Length = 509,455
Elapsed: 00:00:00.67
SQL > exec SQL_FAST(4);
CNT = 10,000, Length = 9,509,455
Elapsed: 00:00:02.78
SQL > exec SQL_FAST(5);
CNT = 100,000, Length = 99,509,455
Elapsed: 00:00:39.74
4.3. Plsql Fast
SQL > exec PLSQL_FAST(3);
CNT = 1,000, Length = 509,455
Elapsed: 00:00:01.20
SQL > exec PLSQL_FAST(4);
CNT = 10,000, Length = 9,509,455
Elapsed: 00:00:04.93
SQL > exec PLSQL_FAST(5);
CNT = 100,000, Length = 99,509,455
Elapsed: 00:00:45.29
4.4. Plsql Slow
SQL > exec PLSQL_SLOW(3);
CNT = 1,000, Length = 509,455
Elapsed: 00:00:01.36
SQL > exec PLSQL_SLOW(4);
CNT = 10,000, Length = 9,509,455
Elapsed: 00:03:22.48
SQL > exec PLSQL_SLOW(5);
CNT = 100,000, Length = 99,509,455
Elapsed: 06:16:49.58
4.5. Performance Summary
CLOB_Lines = 1,000
Code Elapsed
----------- -----------
SQL_SLOW 00:04:45.59
SQL_FAST 00:00:00.67
PLSQL_SLOW 00:00:01.36
PLSQL_FAST 00:00:01.20
CLOB_Lines = 10,000
Code Elapsed
----------- -----------
SQL_SLOW 09:49:48.05 *** 9 hours
SQL_FAST 00:00:02.78
PLSQL_SLOW 00:03:22.48
PLSQL_FAST 00:00:04.93
CLOB_Lines = 100,000
Code Elapsed
----------- -----------
SQL_SLOW > 10 hours *** > 10 hours
SQL_FAST 00:00:39.74
PLSQL_SLOW 06:16:49.58 *** 6 hours
PLSQL_FAST 00:00:45.29