Tuesday, December 16, 2008

When the optimizer goes for a FULL SCAN

It is often asked why a SQL statement goes for a Full Table Scan, and it is also widely considered that a Full Table Scan means BAD performance.

So I will try to explain the possible causes for the CBO deciding to go for a full table scan, and also the idea that a FULL TABLE SCAN IS NOT EVIL.

To do that I have created a small test setup.

I have created a table FULL_TABLE_SCAN_TEST from all_objects.


SQL> create table full_table_scan_test
2 as
3 select *
4 from all_objects
5 /

SQL> select count(1) from full_table_scan_test
2 /

COUNT(1)
----------
930360



I have indexed the column OBJECT_TYPE.

SQL> create index full_table_scan_test_idx on full_table_scan_test(object_type)
2 /

Index created.

SQL> exec dbms_stats.gather_table_stats(user,'FULL_TABLE_SCAN_TEST',cascade=>TRUE)

PL/SQL procedure successfully completed.



CASE 1: The optimizer normally goes for a FULL TABLE SCAN when it has to retrieve a large portion of the data in a table. This is because when accessing a large portion of the data, going through the index can be an overhead: index access fetches the table one row at a time by ROWID, while a full scan can use efficient multiblock reads.

Below is a little demonstration of the same.


The query below shows the count of records for object_type FUNCTION.

SQL> select count(1)
2 from full_table_scan_test
3 where object_type in ('FUNCTION')
4 /

COUNT(1)
----------
2408



This is a very small portion when compared to the total count, which is 930360.

So a condition like this should result in an INDEX RANGE SCAN. Let's try it.

SQL> explain plan
2 for
3 select *
4 from full_table_scan_test
5 where object_type in ('FUNCTION')
6 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------
Plan hash value: 607499257

--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1833 | 161K| 65 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| FULL_TABLE_SCAN_TEST | 1833 | 161K| 65 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | FULL_TABLE_SCAN_TEST_IDX | 1833 | | 7 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------
2 - access("OBJECT_TYPE"='FUNCTION')

14 rows selected.



As expected we got an INDEX RANGE SCAN.

But a filter criterion on the same column that returns a large portion of the records will go for a FULL SCAN. This can easily be illustrated.

A filter for object_type TABLE, INDEX returns a vast portion of the data, as shown below.

SQL> select count(1)
2 from full_table_scan_test
3 where object_type IN ('TABLE','INDEX')
4 /

COUNT(1)
----------
428688



So when filtering on this condition you can see the optimizer going for a FULL TABLE SCAN.

SQL> explain plan
2 for
3 select *
4 from full_table_scan_test
5 where object_type in ('TABLE','INDEX')
6 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------------------------------
Plan hash value: 798910834

------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 437K| 37M| 2796 (3)| 00:00:34 |
|* 1 | TABLE ACCESS FULL| FULL_TABLE_SCAN_TEST | 437K| 37M| 2796 (3)| 00:00:34 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------
1 - filter("OBJECT_TYPE"='INDEX' OR "OBJECT_TYPE"='TABLE')

13 rows selected.



Most of the time people assume that this is bad and try to HINT the optimizer to use the index.

This could turn out to be expensive. Let's hint the above query and see what happens.

SQL> delete from plan_table
2 /

2 rows deleted.

SQL> explain plan
2 for
3 select /*+ INDEX(full_table_scan_test full_table_scan_test_idx) */ *
4 from full_table_scan_test
5 where object_type in ('TABLE','INDEX')
6 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT

---------------------------------------------------------------------------------------------------------
Plan hash value: 279860542
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 437K| 37M| 14723 (1)| 00:02:57 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| FULL_TABLE_SCAN_TEST | 437K| 37M| 14723 (1)| 00:02:57 |
|* 3 | INDEX RANGE SCAN | FULL_TABLE_SCAN_TEST_IDX | 437K| | 1053 (2)| 00:00:13 |
---------------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):

3 - access("OBJECT_TYPE"='INDEX' OR "OBJECT_TYPE"='TABLE')

15 rows selected.


Here you can see a drastic increase in the cost (from 2796 up to 14723).

Conclusion: a FULL TABLE SCAN is often the optimal way to access a large portion of a table's data.

CASE 2: Use of NOT leads to a FULL TABLE SCAN.

Even though the optimizer is a very smart program it has its own assumptions and limitations. One such limitation is that when a NOT operator is used, the optimizer assumes that the query is going to fetch a large portion of the data and goes straight for a FULL TABLE SCAN.

So the use of the NOT operator leads to a full table scan even when the data retrieved is small. See the demonstration below.

The following condition results in a very small portion of the table.

SQL> select count(*)
2 from full_table_scan_test
3 where object_type not in ('TABLE','INDEX','VIEW','SYNONYM',
4 'TYPE','JAVA CLASS','PACKAGE','PACKAGE BODY')
5 /

COUNT(*)
----------
21184



But you can see that the optimizer decides to go for a FULL TABLE SCAN. Check this out.

SQL> explain plan
2 for
3 select *
4 from full_table_scan_test
5 where object_type not in ('TABLE','INDEX','VIEW','SYNONYM',
6 'TYPE','JAVA CLASS','PACKAGE','PACKAGE BODY')
7 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
Plan hash value: 798910834
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 313K| 26M| 2813 (4)| 00:00:34 |
|* 1 | TABLE ACCESS FULL| FULL_TABLE_SCAN_TEST | 313K| 26M| 2813 (4)| 00:00:34 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
1 - filter("OBJECT_TYPE"<>'INDEX' AND "OBJECT_TYPE"<>'TABLE' AND
"OBJECT_TYPE"<>'SYNONYM' AND "OBJECT_TYPE"<>'VIEW' AND "OBJECT_TYPE"<>'JAVA
CLASS' AND "OBJECT_TYPE"<>'TYPE' AND "OBJECT_TYPE"<>'PACKAGE BODY' AND
"OBJECT_TYPE"<>'PACKAGE')

16 rows selected.


Even hinting will lead only to an INDEX FULL SCAN. A RANGE SCAN is never an option with the NOT operator.

Let's hint and see.

SQL> explain plan
2 for
3 select /*+ INDEX(full_table_scan_test full_table_scan_test_idx) */ *
4 from full_table_scan_test
5 where object_type not in ('TABLE','INDEX','VIEW','SYNONYM',
6 'TYPE','JAVA CLASS','PACKAGE','PACKAGE BODY')
7 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
Plan hash value: 2742810082
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 313K| 26M| 12196 (1)| 00:02:27 |
| 1 | TABLE ACCESS BY INDEX ROWID| FULL_TABLE_SCAN_TEST | 313K| 26M| 12196 (1)| 00:02:27 |
|* 2 | INDEX FULL SCAN | FULL_TABLE_SCAN_TEST_IDX | 313K| | 2412 (3)| 00:00:29 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
2 - filter("OBJECT_TYPE"<>'INDEX' AND "OBJECT_TYPE"<>'TABLE' AND "OBJECT_TYPE"<>'SYNONYM'
AND "OBJECT_TYPE"<>'VIEW' AND "OBJECT_TYPE"<>'JAVA CLASS' AND "OBJECT_TYPE"<>'TYPE' AND
"OBJECT_TYPE"<>'PACKAGE BODY' AND "OBJECT_TYPE"<>'PACKAGE')

16 rows selected.



Conclusion: use the NOT operator when you are retrieving a large portion of data. If you are going to retrieve a small portion of data, try to be positive ;) and avoid the NOT operator. One workaround can be the use of > and < together, as sketched below.
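For instance, a minimal sketch under hypothetical names: suppose STATUS_CODE is an indexed numeric column of an ORDERS table and we want every row except status 3. The two range predicates below ask for the same rows as NOT IN (3), but each side is a range the index can scan.

-- instead of: select * from orders where status_code not in (3);
-- which the NOT makes the optimizer cost as a full scan,
-- ask for the same rows as two index-friendly ranges:
select *
from   orders
where  status_code < 3
or     status_code > 3;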

CASE 3: A nullable column is not an option for an INDEX FAST FULL SCAN.

Normally when we do a COUNT on a table that has an index on a column we expect it to use that index. But that is possible only when the indexed column is NOT NULL.

This is simply because the NULL values of a column are not indexed, and hence the optimizer can't scan through the index to give you the exact count of the table.

A little demonstration about this is given below.

The column OBJECT_TYPE is a nullable column. Hence when you do a COUNT on the table it goes for a FULL SCAN.

Check this out.

SQL> explain plan
2 for
3 select count(*)
4 from full_table_scan_test
5 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
Plan hash value: 3323906207
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2755 (2)| 00:00:34 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| FULL_TABLE_SCAN_TEST | 930K| 2755 (2)| 00:00:34 |
-----------------------------------------------------------------------------------

9 rows selected.



So now let's modify the column OBJECT_TYPE to NOT NULL and see what happens.

SQL> alter table full_table_scan_test modify object_type not null
2 /

Table altered.

SQL> exec dbms_stats.gather_table_stats(user,'FULL_TABLE_SCAN_TEST',cascade=>TRUE)

PL/SQL procedure successfully completed.



Now I execute the same query.

SQL> explain plan
2 for
3 select count(*)
4 from full_table_scan_test
5 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
Plan hash value: 2070121381
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 536 (5)| 00:00:07 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| FULL_TABLE_SCAN_TEST_IDX | 937K| 536 (5)| 00:00:07 |
------------------------------------------------------------------------------------------

9 rows selected.



Conclusion: the optimizer can't trust an index on a nullable column to return all the data of that column.
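If the column has to stay nullable, one workaround (my addition, not part of the original test, so verify it on your own version) is to append a constant to the index. A row is left out of a B-tree index only when ALL of its indexed columns are NULL, so indexing (object_type, 0) gives every row an entry and count(*) can again be answered from the index.

-- hypothetical second index: the constant 0 guarantees no entry is all-NULL
create index full_table_scan_test_idx2
    on full_table_scan_test(object_type, 0);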

Wednesday, November 12, 2008

Index is not all GOD

Another case to prove that an index is not all GOD and a FULL TABLE SCAN is not evil.

Today I saw this question on the OTN forum.

Hi All,

I need to execute a query like this :
SELECT * FROM ORDERS WHERE APPROVE_DATE IS NULL

I read anywhere that this will cause unnecessary FTS so that I should create function based index.

I have tried one below , but not sure that this is correct approach :

CREATE INDEX idx_1
ON ORDERS (NVL(APPROVE_DATE, '01-JAN-1900'));

SELECT * FROM ORDERS WHERE NVL(APPROVE_DATE, '01-JAN-1900') = '01-JAN-1900'

Is this a correct approach ?

Thank you,
xtanto


This is what I gave in reply. I prepared a simple test case to show that an index can be USED as well as IGNORED.

A SQL_TRACE output will explain clearly what Justin has stated.

I have created a table T based on all_objects.

SQL> desc t
Name Null? Type
----------------------------------------- -------- -----------------------
OWNER NOT NULL VARCHAR2(30)
OBJECT_NAME NOT NULL VARCHAR2(30)
SUBOBJECT_NAME VARCHAR2(30)
OBJECT_ID NOT NULL NUMBER
DATA_OBJECT_ID NUMBER
OBJECT_TYPE VARCHAR2(19)
CREATED DATE
LAST_DDL_TIME NOT NULL DATE
TIMESTAMP VARCHAR2(19)
STATUS VARCHAR2(7)
TEMPORARY VARCHAR2(1)
GENERATED VARCHAR2(1)
SECONDARY VARCHAR2(1)


CASE I

SQL> select count(1) from t
2 /

COUNT(1)
----------
934320

SQL> select count(1) from t where created is null
2 /

COUNT(1)
----------
2376



The number of null values in the CREATED column is proportionately very small.

Now I execute the query without the function based index.

select *
from t
where created is null

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.09 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 160 0.04 0.10 0 12662 0 2376
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 162 0.04 0.19 0 12662 0 2376

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: ALL_ROWS
2376 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'T' (TABLE)



And here is the query that uses the function based index

select *
from t
where nvl(created,to_date('01-01-1900','DD-MM-YYYY')) = to_date('01-01-1900','DD-MM-YYYY')

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 160 0.01 0.01 0 698 0 2376
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 162 0.03 0.01 0 698 0 2376

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: ALL_ROWS
2376 TABLE ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF 'T' (TABLE)
2376 INDEX GOAL: ANALYZED (RANGE SCAN) OF 'T_FN_IDX' (INDEX)



It is very obvious from the above output that the Function Based Index has improved the performance.

CASE II

SQL> select count(1) from t
2 /

COUNT(1)
----------
934320

SQL> select count(1) from t where created is null
2 /

COUNT(1)
----------
202168



Now the null values in the CREATED column are proportionately larger than in the first test case.

Now let's see without using the function based index.

select *
from t
where created is null

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 13479 0.46 0.71 2 25832 0 202168
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 13481 0.46 0.71 2 25832 0 202168

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: ALL_ROWS
202168 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'T' (TABLE)



Now I am trying to use the function based index.

select *
from t
where nvl(created,to_date('01-01-1900','DD-MM-YYYY')) = to_date('01-01-1900','DD-MM-YYYY')

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 13479 0.54 0.84 0 33826 0 202168
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 13481 0.54 0.84 0 33826 0 202168

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: ALL_ROWS
202168 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'T' (TABLE)



It is obvious from the result that Oracle has decided to go for a FULL TABLE SCAN even when an index was available.

So just having a function based index is not going to increase query performance. There are a lot of other factors to be considered, as shown above.

The original link to the OTN Thread is

http://forums.oracle.com/forums/thread.jspa?threadID=825995&tstart=0

Thanks,
Karthick.

Wednesday, July 2, 2008

Loading XML using an external table

A small example of loading XML using an external table.

SQL> CREATE DIRECTORY my_xml_dir AS 'E:\oracle\Log_files\UTL_AKIVATST'
2 /

Directory created.

SQL> DROP TABLE my_xml_et
2 /

Table dropped.

SQL> CREATE TABLE my_xml_et
2 ( EMPNO NUMBER,
3 EMPNAME VARCHAR2(10),
4 JOB VARCHAR2(10),
5 HIREDATE DATE,
6 SAL NUMBER
7 )
8 ORGANIZATION EXTERNAL
9 (
10 TYPE ORACLE_LOADER
11 DEFAULT DIRECTORY my_xml_dir
12 ACCESS PARAMETERS
13 (
14 records delimited by "</EMP>"
15 badfile my_xml_dir:'empxt%a_%p.bad'
16 logfile my_xml_dir:'empxt%a_%p.log'
17 FIELDS
18 (
19 filler char(2000) terminated by "<EMP>",
20 EMPNO char(2000) enclosed by "<EMPNO>" and "</EMPNO>",
21 EMPNAME char(2000) enclosed by "<ENAME>" and "</ENAME>",
22 JOB char(2000) enclosed by "<JOB>" and "</JOB>",
23 HIREDATE char(2000) enclosed by "<HIREDATE>" and "</HIREDATE>",
24 SAL char(2000) enclosed by "<SAL>" and "</SAL>"
25 )
26 )
27 LOCATION ('emp.xml')
28 )
29 PARALLEL
30 REJECT LIMIT UNLIMITED
31 /

Table created.

SQL> SELECT * FROM my_xml_et
2 /

EMPNO EMPNAME JOB HIREDATE SAL
---------- ---------- ---------- --------- ----------
7369 SMITH CLERK 17-DEC-80 800
7499 ALLEN SALESMAN 20-FEB-81 1600


This is the sample XML that I used.

emp.xml
----------------

<EMPLOYEES>
<EMP>
<EMPNO>7369</EMPNO>
<ENAME>SMITH</ENAME>
<JOB>CLERK</JOB>
<HIREDATE>17-DEC-80</HIREDATE>
<SAL>800</SAL>
</EMP>
<EMP>
<EMPNO>7499</EMPNO>
<ENAME>ALLEN</ENAME>
<JOB>SALESMAN</JOB>
<HIREDATE>20-FEB-81</HIREDATE>
<SAL>1600</SAL>
<COMM>300</COMM>
</EMP>
</EMPLOYEES>

Wednesday, June 18, 2008

Find out Blocking Session

I came across this piece of code on an Oracle forum which helps us find blocking sessions. I thought of sharing it here.


SQL> SET ECHO OFF
SQL> SET LINE 10000
SQL> COLUMN BLOCKER FORMAT A11;
SQL> COLUMN BLOCKEE FORMAT A10;
SQL> COLUMN SID FORMAT 99999;
SQL> COLUMN BLOCKER_MODULE FORMAT A15 TRUNC;
SQL> COLUMN BLOCKEE_MODULE FORMAT A15 TRUNC;
SQL>
SQL> ALTER SESSION SET OPTIMIZER_MODE=RULE;

Session altered.

SQL>
SQL> SELECT a.inst_id,
2 (SELECT username FROM gv$session s WHERE s.inst_id=a.inst_id AND s.sid=a.sid) blocker,
3 a.sid,
4 (SELECT module FROM gv$session s WHERE s.inst_id=a.inst_id AND s.sid=a.sid) blocker_module ,
5 ' is blocking ' "IS BLOCKING",
6 b.inst_id,
7 (SELECT username FROM gv$session s WHERE s.inst_id=b.inst_id AND s.sid=b.sid) blockee,
8 b.sid ,
9 (SELECT module FROM gv$session s WHERE s.inst_id=b.inst_id AND s.sid=b.sid) blockee_module
10 FROM gv$lock a, gv$lock b
11 where a.block <>0
12 AND b.request > 0
13 AND a.id1 = b.id1
14 AND a.id2 = b.id2
15 AND a.sid <> b.sid
16 order by 1, 2
17 /

INST_ID BLOCKER SID BLOCKER_MODULE IS BLOCKING INST_ID BLOCKEE SID BLOCKEE_MODULE
---------- ----------- ------ --------------- ------------- ---------- ---------- ------ ---------------
1 SYSADM 15 SQL*Plus is blocking 1 SYSADM 11 T.O.A.D.

SQL>
SQL> ALTER SESSION SET OPTIMIZER_MODE=CHOOSE;

Session altered.

SQL> BEGIN
2 dbms_output.enable(1000000);
3 FOR do_loop IN (SELECT session_id, a.object_id, xidsqn, oracle_username, b.owner owner,
4 b.object_name object_name, b.object_type object_type
5 FROM v$locked_object a, dba_objects b
6 WHERE xidsqn != 0
7 and b.object_id = a.object_id)
8 LOOP
9 dbms_output.put_line('.');
10 dbms_output.put_line('Blocking Session : '||do_loop.session_id);
11 dbms_output.put_line('Object (Owner/Name): '||do_loop.owner||'.'||do_loop.object_name);
12 dbms_output.put_line('Object Type : '||do_loop.object_type);
13 FOR next_loop IN (SELECT sid
14 FROM v$lock
15 WHERE id2 = do_loop.xidsqn
16 AND sid != do_loop.session_id)
17 LOOP
18 dbms_output.put_line('Sessions being blocked : '||next_loop.sid);
19 END LOOP;
20 END LOOP;
21 END;
22 /
.
Blocking Session : 15
Object (Owner/Name): SYSADM.T
Object Type : TABLE
Sessions being blocked : 11

PL/SQL procedure successfully completed.

SQL>

Monday, June 16, 2008

Lesson for the day

With the cost based optimizer, if you don't collect statistics properly it means that you are screwed.

This is what I did.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.1.0 - Production

I created the following table.

SQL> create table hx_index_test_tbl(
2 no integer not null,
3 name varchar2(50) not null,
4 old_value varchar2(100),
5 new_value varchar2(100));

Table created.

Next I added these two indexes to the table.

SQL> create index hx_index_test_tbl_idx on hx_index_test_tbl(name,no);

Index created.

SQL> create index hx_index_test_tbl_idx_001 on hx_index_test_tbl(name);

Index created.

Then I populated some data.

SQL> insert into hx_index_test_tbl
2 select row_number() over(partition by name order by 1) no, name, rpad('*',100,'*') old_value, rpad('*',100,'*') new_value
3 from (select 'A'||mod(level,100) name from dual connect by level <=1000);

1000 rows created.

SQL> commit;

Commit complete.

Got the statistics gathered.

SQL> exec dbms_stats.gather_table_stats(user,'hx_index_test_tbl',method_opt=>'for all indexed columns');

PL/SQL procedure successfully completed.

SQL> set serveroutput on
SQL> set autotrace traceonly explain
SQL> select * from hx_index_test_tbl where name = 'A0';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=10 Bytes=1000)
1 0 TABLE ACCESS (FULL) OF 'HX_INDEX_TEST_TBL' (Cost=5 Card=10 Bytes=1000)

Oops, it's going for a full scan. It's not considering my index. I want it to pick up the index HX_INDEX_TEST_TBL_IDX_001.

SQL> select * from hx_index_test_tbl where name = 'A0' and no = 1;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=100)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'HX_INDEX_TEST_TBL' (Cost=2 Card=1 Bytes=100)
2 1 INDEX (RANGE SCAN) OF 'HX_INDEX_TEST_TBL_IDX' (NON-UNIQUE) (Cost=1 Card=1)

But when I add NO in the where clause it considers the index. Why is it so???

The reason is very simple. See how I have gathered statistics.

dbms_stats.gather_table_stats(user,'hx_index_test_tbl',method_opt=>'for all indexed columns');

I forgot to gather statistics for the indexes of this table. Now let me gather for the indexes as well.

SQL> EXEC dbms_stats.gather_table_stats(user,'hx_index_test_tbl',method_opt=>'for all indexed columns', cascade=>TRUE);

PL/SQL procedure successfully completed.

Now see what happens...

SQL> select * from hx_index_test_tbl where name = 'A0';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=10 Bytes=1000)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'HX_INDEX_TEST_TBL' (Cost=2 Card=10 Bytes=1000)
2 1 INDEX (RANGE SCAN) OF 'HX_INDEX_TEST_TBL_IDX_001' (NON-UNIQUE) (Cost=1 Card=10)


So if you don't gather stats properly with the cost based optimizer you will get weird execution plans.
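A quick sanity check I find handy (a sketch of my own, not from the original test): the data dictionary shows whether index statistics were actually gathered.

-- if LAST_ANALYZED is null the index has no statistics and the CBO has to guess
select index_name, num_rows, last_analyzed
from   user_indexes
where  table_name = 'HX_INDEX_TEST_TBL';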

Monday, June 9, 2008

String List

Being a PL/SQL developer, many a time I have been in a situation where I get a list of values as input in a string and have to check if they exist in a particular table. The values would be separated by a delimiter. Let's consider the below example.

String Input: 'ORACLE, SQL SERVER, DB2, MYSQL, SYBASE'

I have to use this string to see if these values exist in the table.

One possible way is to build a dynamic SQL: just pad single quotes before and after each value in the string and build a SQL with the IN clause. But I want to do it without dynamic SQL. How is this possible?
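For contrast, here is roughly what that dynamic SQL approach would look like (a minimal sketch; the PRODUCTS table and its NAME column are hypothetical):

declare
    l_list  varchar2(100) := 'ORACLE, SQL SERVER, DB2, MYSQL, SYBASE';
    l_count integer;
begin
    -- pad quotes around each value and glue the list into an IN clause
    execute immediate
        'select count(*) from products where name in ('''
        || replace(l_list, ', ', ''',''')
        || ''')'
        into l_count;
    dbms_output.put_line(l_count);
end;
/

Besides being ugly, every distinct list produces a brand new statement to hard parse, which is one more reason to prefer a static approach.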

I came up with this SQL which splits this string and gives me the values in the string as rows.

Here is the SQL…

SQL> var a varchar2(100)

SQL> exec :a := 'ORACLE, SQL SERVER, DB2, MYSQL, SYBASE'

PL/SQL procedure successfully completed.

SQL> SELECT TRIM(SUBSTR(:a, INSTR(','||:a,',',1,level), INSTR(:a||',',',',1,level) - INSTR(','||:a,',',1,level))) my_list
2 FROM dual, (SELECT length(replace(translate(lower(:a)||',','abcdefghijklmnopqrstuvwxyz1234567890',' '),' ','')) cnt
3 FROM dual)
4 CONNECT BY level <= cnt
5 /

MY_LIST
--------------------------------------------------------------------------------
ORACLE
SQL SERVER
DB2
MYSQL
SYBASE


So what have I done here? Let me split it into steps for your understanding.

1. I am identifying the possible letters that would appear in the values and translating them into blank space. The following command does that for me.

SQL> SELECT translate(lower(:a)||',','abcdefghijklmnopqrstuvwxyz1234567890',' ') val
2 FROM dual;

VAL
--------------------------------------------------------------------------------
, , , , ,



Now you can see I don't have anything other than commas (the delimiter) and blank spaces.

2. Now I am going to remove all the blank space from the output that I got in step 1.

SQL> SELECT replace(translate(lower(:a)||',','abcdefghijklmnopqrstuvwxyz1234567890',' '),' ','') val         
2 FROM dual;

VAL
--------------------------------------------------------------------------------
,,,,,



Now the length of the output will give me the total number of values in the list. Note that I have padded an extra comma to the string, as the number of delimiters is always one less than the number of values.

3. So the third step is obviously to count the number of commas in the above output and determine the total number of values in the string list.

SQL> SELECT length(replace(translate(lower(:a)||',','abcdefghijklmnopqrstuvwxyz1234567890',' '),' ','')) cnt
2 FROM dual
3 /

CNT
----------
5



4. The next step is to generate the string list 5 times as rows using the CONNECT BY clause.

SQL> SELECT level, :a my_list
2 FROM dual, (SELECT length(replace(translate(lower(:a)||',','abcdefghijklmnopqrstuvwxyz1234567890',' '),' ','')) cnt
3 FROM dual)
4 CONNECT BY level <= cnt
5 /

LEVEL MY_LIST
---------- ---------------------------------------------------------------------------------------------------------------
1 ORACLE, SQL SERVER, DB2, MYSQL, SYBASE
2 ORACLE, SQL SERVER, DB2, MYSQL, SYBASE
3 ORACLE, SQL SERVER, DB2, MYSQL, SYBASE
4 ORACLE, SQL SERVER, DB2, MYSQL, SYBASE
5 ORACLE, SQL SERVER, DB2, MYSQL, SYBASE


5. Now it's simple: with the help of SUBSTR and INSTR I can split out each value as shown below.

SQL> SELECT TRIM(SUBSTR(:a, INSTR(','||:a,',',1,level), INSTR(:a||',',',',1,level) - INSTR(','||:a,',',1,level))) my_list
2 FROM dual, (SELECT length(replace(translate(lower(:a)||',','abcdefghijklmnopqrstuvwxyz1234567890',' '),' ','')) cnt
3 FROM dual)
4 CONNECT BY level <= cnt
5 /

MY_LIST
--------------------------------------------------------------------------------
ORACLE
SQL SERVER
DB2
MYSQL
SYBASE


So now I can use this SQL to convert any value list into rows and use it in my SQL statements.

And if you can use Regular Expressions, things become much easier.

Check this out.

SQL>  SELECT trim(regexp_substr(:a,'[^,]+', 1, level)) my_list
2 FROM dual
3 CONNECT BY level <= NVL( LENGTH( REGEXP_REPLACE( :a, '[^,]+', NULL ) ), 0 ) + 1
4 /

MY_LIST
--------------------------------------------------------------------------------
ORACLE
SQL SERVER
DB2
MYSQL
SYBASE


NOTE: This might have to be customized based on the given requirement.

Monday, June 2, 2008

Code reusability in oracle pl/sql

Again today, checking out Tom's blog led me to this post on Tyler Muth's blog. This sounds like an interesting topic. So I thought for a while and this is what came to my mind…

Code reusability in oracle pl/sql

First, to start with, PL/SQL is a procedural language, which is not object oriented. That implies the language is not familiar with the concept of inheritance. I have been using PL/SQL for quite some time. It's a great programming language. It would be even better if it had the class concept in it.

The class concept would come in handy when we want to build a framework which can be utilized over and over.

Currently I am working on a masking utility. We have different masking algorithms say…

Algorithm 1, Algorithm 2, Algorithm 3 etc.

The general framework of this masking is something like this.

1. Get the field from the user which needs to be masked.
2. Get the master table for that field.
3. Load the distinct data for the field from the master table into a staging table.
4. Apply your masking logic on the staging table.
5. Find all the transaction tables that have this field in them.
6. Using the staging table, update all the transaction tables.

This is a very good candidate for the class concept: I could have a parent class holding the entire framework of the six steps given above, and one child class per algorithm in which I just override step four with that algorithm's masking logic. But unfortunately we don't have it. Hence I went for a package, used some CASE statements to hook in my algorithms, and made it as optimal as possible; a sketch of that dispatch is given below. I basically hate writing lots of code. If Oracle introduced OOP (Object Oriented Programming) in PL/SQL it would be of great use and even more flexible.
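A minimal sketch of that CASE dispatch (all names are hypothetical, and the real branches would update the staging table rather than print):

create or replace procedure apply_masking (p_algorithm in varchar2)
as
begin
    -- step 4 of the framework: dispatch to the chosen masking algorithm
    case upper(p_algorithm)
        when 'ALGORITHM1' then dbms_output.put_line('masking with algorithm 1');
        when 'ALGORITHM2' then dbms_output.put_line('masking with algorithm 2');
        when 'ALGORITHM3' then dbms_output.put_line('masking with algorithm 3');
        else raise_application_error(-20001, 'unknown algorithm: ' || p_algorithm);
    end case;
end apply_masking;
/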

Thursday, May 15, 2008

Read Only Table.

I saw this post on Richard Foote's blog which pointed me to this OTN thread.

So I gave it a try. This is what I did.

Say I have a table as given below in production.

SQL> CREATE TABLE hx_my_read_only_table (no INTEGER CHECK(NO>0), name VARCHAR2(100));

Table created.

SQL> INSERT INTO hx_my_read_only_table VALUES(1,'KARTHICK');

1 row created.

SQL> INSERT INTO hx_my_read_only_table VALUES(2,'VIMAL');

1 row created.

SQL> INSERT INTO hx_my_read_only_table VALUES(3,'VIJAY');

1 row created.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM hx_my_read_only_table;

NO NAME
---------- --------------------------------------------------------------------------------
1 KARTHICK
2 VIMAL
3 VIJAY

And I want to make it read only. I will use ACCESS CONTROL to achieve it.

SQL> CREATE CONTEXT hx_my_context USING hx_my_pkg;

Context created.

SQL> CREATE OR REPLACE PACKAGE hx_my_pkg
2 AS
3 PROCEDURE hx_set_context (
4 hx_p_fld VARCHAR2,
5 hx_p_value VARCHAR2);
6
7 FUNCTION hx_get_val (
8 hx_p_fld varchar2) RETURN NUMBER;
9
10 FUNCTION hx_get_key(
11 p_schema in varchar2,
12 p_object in varchar2) RETURN VARCHAR2;
13 END hx_my_pkg;
14 /
Package created.
SQL> CREATE OR REPLACE PACKAGE BODY hx_my_pkg
2 AS
3 PROCEDURE hx_set_context (
4 hx_p_fld VARCHAR2,
5 hx_p_value VARCHAR2)
6 IS
7 BEGIN
8 DBMS_SESSION.SET_CONTEXT('HX_MY_CONTEXT', hx_p_fld, hx_p_value);
9 END;
10
11 FUNCTION hx_get_val (
12 hx_p_fld varchar2) RETURN NUMBER
13 IS
14 BEGIN
15 RETURN SYS_CONTEXT('HX_MY_CONTEXT', hx_p_fld);
16 END;
17
18 FUNCTION hx_get_key(
19 p_schema in varchar2,
20 p_object in varchar2) RETURN VARCHAR2
21 IS
22 lPredicate VARCHAR2 (2000);
23 BEGIN
24 lPredicate := 'NO = SYS_CONTEXT(''HX_MY_CONTEXT'', ''NO'')';
25
26 RETURN lPredicate;
27 END;
28
29 END hx_my_pkg;
30 /

Package body created.

SQL> BEGIN
2 DBMS_RLS.ADD_POLICY (USER, 'HX_MY_READ_ONLY_TABLE', 'HX_READ_ONLY_POLICY',USER, 'hx_my_pkg.hx_get_key', 'INSERT, UPDATE, DELETE' );
3 END;
4 /

PL/SQL procedure successfully completed.

SQL> BEGIN
2 hx_my_pkg.hx_set_context('NO',0);
3 END;
4 /

PL/SQL procedure successfully completed.

SQL> UPDATE hx_my_read_only_table
2 SET NAME = 'RICHARD'
3 WHERE NO = 1;

0 rows updated.

SQL> DELETE FROM hx_my_read_only_table;

0 rows deleted.

SQL> INSERT INTO hx_my_read_only_table VALUES(4,'RICHARD');

1 row created.

I was expecting this one not to insert the record. I have to see what the problem is.
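One likely explanation (my note, not from the original experiment, so worth verifying): DBMS_RLS.ADD_POLICY only checks the policy predicate against newly inserted or updated row values when its update_check argument is TRUE, and it defaults to FALSE. Recreating the policy along these lines should make the INSERT fail as well:

BEGIN
    DBMS_RLS.DROP_POLICY(USER, 'HX_MY_READ_ONLY_TABLE', 'HX_READ_ONLY_POLICY');
    DBMS_RLS.ADD_POLICY(USER, 'HX_MY_READ_ONLY_TABLE', 'HX_READ_ONLY_POLICY',
                        USER, 'hx_my_pkg.hx_get_key',
                        'INSERT, UPDATE, DELETE',
                        update_check => TRUE);
END;
/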

SQL> ROLLBACK;

Rollback complete.

Meanwhile I restricted the insert with this trigger. I personally hate triggers but....

SQL> CREATE OR REPLACE TRIGGER hx_my_trigger BEFORE INSERT ON hx_my_read_only_table FOR EACH ROW
2 BEGIN
3 IF :NEW.NO <> hx_my_pkg.hx_get_val('NO')
4 THEN
5 RAISE_APPLICATION_ERROR(-20001, 'Insert Not possible...');
6 END IF;
7 END;
8 /

Trigger created.

SQL> INSERT INTO hx_my_read_only_table VALUES(4,'RICHARD');

INSERT INTO hx_my_read_only_table VALUES(4,'RICHARD')
*
ERROR at line 1:
ORA-20001: Insert Not possible...
ORA-06512: at "SYSADM.HX_MY_TRIGGER", line 4
ORA-04088: error during execution of trigger 'SYSADM.HX_MY_TRIGGER'

So I have restricted INSERT, UPDATE and DELETE on this table.

SQL> SELECT * FROM HX_MY_READ_ONLY_TABLE;

NO NAME
---------- --------------------------------------------------------------------------------
1 KARTHICK
2 VIMAL
3 VIJAY

Tuesday, May 6, 2008

The bridge is falling down.

A few days back Tom wrote this in his blog. I felt there was a lot of truth in it and was very happy to hear it from him. Today when I came into office, the first mail I came across had these pictures.

This one is an unacceptable construction flaw. The mail said it happened in Korea. Such a thing may be a very, very rare event in the construction business. But here in software we see lots of bridges falling down every day. After seeing this I have started to worry even more about what he said.

DML Error Logging in 10gR2

This is cool stuff in Oracle. I tried it a few days back and was excited, so I thought of sharing it here.

Have you ever been stuck with a bulk insert failing on a check constraint? In such a situation we spend most of our time trying to find the culprit rows. Oracle has come up with a great feature by which such rows can be logged separately without disturbing the valid rows.

To put it straight: using the DML Error Logging feature you can pick out all the invalid rows, set them aside, and let the valid records go into the target table. Following is a small demonstration of this feature.

This is the source table from which the data is going to be selected.

SQL> CREATE TABLE SOURCE(NO NUMBER, NAME VARCHAR2(100))
2 /

Table created.

SQL> INSERT INTO SOURCE SELECT LEVEL, 'KARTHICK' FROM DUAL CONNECT BY LEVEL <=15
2 /

15 rows created.

This is the target table into which the data from the source table is going to be inserted.

SQL> CREATE TABLE TARGET(NO NUMBER, NAME VARCHAR2(100))
2 /

Table created.

I am creating a check constraint as follows

SQL> ALTER TABLE TARGET ADD CONSTRAINT SOURCE_CHK CHECK(NO <10)
2 /

Table altered.

Now let's try to insert the source table data into the target table.

SQL> INSERT INTO TARGET SELECT * FROM SOURCE;
INSERT INTO TARGET SELECT * FROM SOURCE
*
ERROR at line 1:
ORA-02290: check constraint (SYSADM.SOURCE_CHK) violated

As we have a check constraint on column TARGET.NO which says NO should be less than 10, if we have any row which does not satisfy this condition then the entire insert fails. But I want the valid rows to be inserted into my target table and all the invalid rows to be logged separately. I am going to achieve this using the DML Error Logging technique.

To achieve that, first we need to create an Error Log Table. The following command helps us do that.

SQL> BEGIN
2 DBMS_ERRLOG.create_error_log (dml_table_name => 'TARGET');
3 END;
4 /

PL/SQL procedure successfully completed.

This creates a new error log table for our TARGET table. The error log table is described below

SQL> desc err$_TARGET;

Name Null? Type
----------------------------------------- -------- ----------------------------

ORA_ERR_NUMBER$ NUMBER
ORA_ERR_MESG$ VARCHAR2(2000)
ORA_ERR_ROWID$ ROWID
ORA_ERR_OPTYP$ VARCHAR2(2)
ORA_ERR_TAG$ VARCHAR2(2000)
NO VARCHAR2(4000)
NAME VARCHAR2(4000)

Now we need to use the LOG ERRORS INTO syntax in our INSERT statement to log the errors.

SQL> INSERT INTO TARGET SELECT * FROM SOURCE LOG ERRORS INTO err$_TARGET ('INSERT') REJECT LIMIT UNLIMITED
2 /

9 rows created.

Now you can observe that of the 15 rows, the 9 rows that satisfy the check constraint were inserted into the TARGET table. The invalid records were logged into the Error Log table that we created above. Let's have a look into the error log table.

SELECT * FROM ERR$_TARGET

ORA_ERR_NUMBER$ ORA_ERR_MESG$ ORA_ERR_ROWID$ ORA_ERR_OPTYP$ ORA_ERR_TAG$ NO NAME
--------------------------------------------------------------------------------------------------------------------------------------------
2290 ORA-02290: check constraint (SYSADM.SOURCE_CHK) violated I INSERT 10 KARTHICK
2290 ORA-02290: check constraint (SYSADM.SOURCE_CHK) violated I INSERT 11 KARTHICK
2290 ORA-02290: check constraint (SYSADM.SOURCE_CHK) violated I INSERT 12 KARTHICK
2290 ORA-02290: check constraint (SYSADM.SOURCE_CHK) violated I INSERT 13 KARTHICK
2290 ORA-02290: check constraint (SYSADM.SOURCE_CHK) violated I INSERT 14 KARTHICK
2290 ORA-02290: check constraint (SYSADM.SOURCE_CHK) violated I INSERT 15 KARTHICK

Now with this information we can easily start our troubleshooting to figure out why the rows failed the check constraint.

Thursday, May 1, 2008

Do I need to rebel?

I saw this posting by Tom today. I felt like commenting on it. And this is what I wrote.

Most of the time it is not "I don't know the answer" but "I don't know the question".

What I want?

Almost always in the programming world it's not what I want, it's what they want me to want. Unclear requirements: we see them every day, don't we?

Why such questions?

The inability to question authority is what leads to such questions. Some of the questions asked may not be looking for an answer but may come from the phase of trying to understand what is needed. I know it's a wrong (very... very...) approach. But in the practical world that's the way it is.

Don't you hear it?

Do what I say...
Don't do it that way, Do it my way...

Do I need to rebel?

Is it right to say "what you ask for just does not make any sense to me"? If I say that, I know the next question (if the person sitting opposite you is genuine) would be "Then tell me how it should be?" Whether we really have an answer at that moment is what matters. If we have, then we may find a way out. But most of the time we know that something is wrong yet we don't know what is right. To find the right thing we need time, and that is what most of them don't have. For all this, as I said earlier, the person should be genuine. What if he or she is a bossy person? It's simple... "You are screwed up".

So I feel such a question is born not because "I don't know what I want" but because "I don't know what they want me to want".

Thursday, March 20, 2008

Do you prefer to end up this way?

Think about it. I don't want to end up this way. Definitely not....

Few days back we had a discussion on how to build a better team, of which one of the important aspects was communication between the team members. After seeing this article I am getting a strong feeling about it.

There are situations where we don't even know who the person sitting next to us is. Yes, that's true. There could be several reasons for that. He may not be working on the same project you are. Or he may be into a different technology.

So one argument that arose during our discussion was: if the person sitting next to us is not in your team or your project, is it necessary to communicate with him? (This opinion could differ based on our personal interests.) I was taking the "Not necessary" side.

So if the person next to you is not in the same team as you, then the question is: why is he next to you? Again there could be several reasons for that. And we may not have control over those reasons.

After seeing this article I am starting to feel it may be "extremely necessary and important". So I think it's totally up to us to decide what to do. The organization can only create the atmosphere for us to mingle with one another, but it can't make us do it.

So I think we are the ones who need to decide how important it is.

Friday, March 14, 2008

How does it work? Does it matter!!!

OK, as my profile says, I am an Oracle PL/SQL developer!!!

That's what I do for my living. I write code. That is all I do most of the time. I write some SQL or PL/SQL and throw it into the database. I have never cared about what Oracle does with it. Why should I? It's Oracle's headache.

To write a SQL statement all I need to know is the correct syntax. That's all; I don't need anything else. Even for that I have the SQL Reference to help me. Maybe I need the ability to think in sets. So I know SETs... I know what a JOIN and a UNION are.

Why should I care what Oracle does with the SQL that I submitted? Why do I have to care about how it stores my data or how it gets me what I need? Why... we have the DBA guys to worry about that.

Do you feel the same way as I did? Do you? I hope the answer is not YES. If it is, then you are going down, down from day one. Yes, that's right!! No second thoughts, you are going down... Down to hell!!

Have you heard about Bind Variables? If the answer is NO then you are not going down to hell, as you are already living there.

To know the importance of Bind Variables it is essential to know how Oracle works, and what it does with the so-called sophisticated code that you wrote.
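As a tiny, hypothetical illustration of the difference (assuming the classic SCOTT.EMP demo table): the literal version below produces a brand new statement, and a hard parse, for every employee number, while the bind version is parsed once and reused.

declare
    l_ename emp.ename%type;
begin
    -- bad: a distinct SQL text per value, hard parsed every time
    -- execute immediate 'select ename from emp where empno = 7369' into l_ename;

    -- good: one shared statement, the value supplied as a bind
    execute immediate 'select ename from emp where empno = :b1'
        into l_ename using 7369;
    dbms_output.put_line(l_ename);
end;
/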

Not only that, we need to know a lot more: how the data is organized and stored, and how it is retrieved. To put it in one word, know the Oracle database "Architecture".

A nice place to start with would be here.

http://blogs.ittoolbox.com/bi/confessions/archives/post-index-how-oracle-works-10605


I found it easy and interesting to read. The author has written it like a story narration. I like stories, and "How Oracle works" is a great story.

Once you understand all of this you will really come to know how much of your code is actually sophisticated.

So happy reading…

Friday, February 22, 2008

It's the Initiation that matters...

Taking the initiative has always been my biggest problem, because once I get started I put a lot of effort and dedication into getting it done and doing it in a nice way. But starting is the biggest problem. By "it" I mean anything and everything.

Take reading something, maybe a book, or doing a presentation, or getting my ideas into execution: I always find it difficult to get that initial inspiration. Once it is there, it's always like the sky is the limit. It starts to flow and I love it. I guess lots of people go through that state. Not sure if they do.

Tuesday, February 19, 2008

Ask Tom

Ask Tom (http://asktom.oracle.com) is a great place for Oracle developers and DBAs. You have so much in there. I would say it's a repository of knowledge about Oracle and databases in general.

Today I wanted to write a small utility to export data from a table into a csv file. I just searched and got this great stuff from Ask Tom.

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:68212348056

Dump_CSV is all I wanted. I modified it a little bit to suit my requirements, and in no time I was ready.

On Ask Tom you not only get answers to your questions but also a lot of alternatives: the best way to approach the problem, and a lot more.

Tuesday, February 12, 2008

Why do we need to follow coding standards?

Why do we need to follow coding standards? Over a period of time I have been taught to follow certain coding standards. Mainly like

Local variable - Starts with l

Global variables - Starts with g
Parameter - Starts with p

and so on...

Are these standards just there to make our code look beautiful, or are these things our life savers? Let's see a small example....

sql> create table hx_tmp(hx_val INTEGER);
Table created.

sql> insert into hx_tmp values (1);
1 row created.

sql> insert into hx_tmp values (2);
1 row created.

sql> commit;
Commit complete.

sql> create or replace function hx_f (hx_val in integer) return integer
2 as
3 hx_out INTEGER := 0;
4 begin
5 select hx_val into hx_out from hx_tmp where hx_val = hx_val;
6
7 return hx_out;
8 end;
9 /
Function created.

sql> variable a number

sql> exec :a := hx_f(2);
BEGIN :a := hx_f(2); END;
*
ERROR at line 1:
ORA-01422: exact fetch returns more than requested number of rows
ORA-06512: at "SYSADM.HX_F", line 5
ORA-06512: at line 1

Oooooops.....

If you were expecting to get a result, think again. This is all because I didn't follow the coding standard I was supposed to follow. I was expecting the input value 2 to be passed into the SQL, but inside the query the name HX_VAL resolves to the table column rather than the parameter, so the predicate becomes column = column, which is true for every row, and the SELECT ... INTO fetches two rows. It's all because of a bad way of coding. A corrected version is sketched below.
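Here is a sketch of the same function following the naming standard, prefixing the parameter with p and the local variable with l, which removes the ambiguity:

create or replace function hx_f (p_hx_val in integer) return integer
as
    l_hx_out integer := 0;
begin
    -- the parameter can no longer be captured by the column name,
    -- so the predicate now means "column = input value"
    select hx_val into l_hx_out from hx_tmp where hx_val = p_hx_val;
    return l_hx_out;
end;
/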

So, conclusion....

Coding standards are not just there to make your code look beautiful or understandable or maintainable or.... or..... or.... but also to make your code bug free.

Monday, February 11, 2008

Dynamic data model sucks…


I have always felt this. But people keep on asking for this.

Dynamic data model…
Unknown number of columns in a table…
Store rows as columns…
Store columns as rows…
Take transpose…
Do all sorts of nonsense…
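To make concrete what is being asked for, here is a hypothetical sketch of the pattern: one generic attribute table instead of real columns, and a transpose just to get a logical row back out.

create table generic_attrs (
    entity_id  integer,
    attr_name  varchar2(30),
    attr_value varchar2(4000)
);

-- fetching one logical row back out already needs a pivot
select entity_id,
       max(case when attr_name = 'NAME'   then attr_value end) as name,
       max(case when attr_name = 'SALARY' then attr_value end) as salary
from   generic_attrs
group  by entity_id;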

The outcome that I have found because of this is simple.

Headache while coding…
Big headache while debugging…
Bigger headache while tuning for performance…

But people still want it and people still do it…