Thursday, February 12, 2009

IN vs EXISTS - They are all the same today.

When ever it comes to performance of a query, for a very long time there was a
difference noticed between the usage of IN and EXIST.

Back then, when i was using a 8i database an improper use of IN where you could have used EXISTS can
result in a very bad performance of your query.

The execution plan chosen by the optimizer is completely different for both the query.

Check this out.

ORACLE VERSION: Oracle8i Enterprise Edition Release 8.1.7.2.1 - Production

SQL> create table big_table
2 as
3 select *
4 from all_objects
5 /

Table created.

SQL> create table small_table
2 as
3 select distinct object_type
4 from all_objects
5 /

Table created.

SQL> create index big_table_idx on big_table(object_type)
2 /

Index created.

SQL> exec dbms_stats.gather_table_stats(user,'BIG_TABLE',cascade=>true)

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user,'SMALL_TABLE',cascade=>true)

PL/SQL procedure successfully completed.



Now i want to pick up records from my SMALL_TABLE for which object_type exists in my
BIG_TABLE.

First iam using IN for this purpose.

SQL> select *
2 from small_table s
3 where object_type in (select object_type
4 from big_table)
5 /

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=182 Card=15 Bytes=285)
1 0 HASH JOIN (Cost=182 Card=15 Bytes=285)
2 1 VIEW OF 'VW_NSO_1' (Cost=180 Card=15 Bytes=165)
3 2 SORT (UNIQUE) (Cost=180 Card=15 Bytes=90)
4 3 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=55 Card=43794 Bytes=262764)
5 1 TABLE ACCESS (FULL) OF 'SMALL_TABLE' (Cost=1 Card=15 Byt es=120)


The optimizer goes for a full scan on the BIG_TABLE even though i have a index created on it. I can do it
in a better way using EXISTS. See this.

SQL> select *
2 from small_table s
3 where exists(select null
4 from big_table b
5 where b.object_type = s.object_type)
6 /

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=8)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'SMALL_TABLE' (Cost=1 Card=1 Bytes=8)
3 1 INDEX (RANGE SCAN) OF 'BIG_TABLE_IDX' (NON-UNIQUE) (Cost=7 Card=2920 Bytes=17520)


You can notice a nice RANGE SCAN is been done on the BIG_TABLE. So those days it made
a lot of difference. The optimizer could not understand what you are trying to mean. It just gives what you ask for.

But today, things are different. With the smart CBO on place it understands what you want. It rewrites your query for you.

Check this out.

ORACLE VERSION: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod


With both IN and EXIST you don't find any difference. Its both the same. CBO knows exactly what you want. It
knows so well that it even does only a SEMI join. It comes out of your big table once it finds a single match. It
does not go thought the entire table or the index.

See how it works.


SQL> select *
2 from small_table s
3 where exists(select null
4 from big_table b
5 where b.object_type = s.object_type)
6 /

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=34 Card=19 Bytes=323)
1 0 NESTED LOOPS (SEMI) (Cost=34 Card=19 Bytes=323)
2 1 TABLE ACCESS (FULL) OF 'SMALL_TABLE' (TABLE) (Cost=3 Card=31 Bytes=279)
3 1 INDEX (RANGE SCAN) OF 'BIG_TABLE_IDX' (INDEX) (Cost=1 Card=71414 Bytes=571312)

SQL> select *
2 from small_table s
3 where object_type in (select object_type
4 from big_table)
5 /

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=34 Card=19 Bytes=323)
1 0 NESTED LOOPS (SEMI) (Cost=34 Card=19 Bytes=323)
2 1 TABLE ACCESS (FULL) OF 'SMALL_TABLE' (TABLE) (Cost=3 Card=31 Bytes=279)
3 1 INDEX (RANGE SCAN) OF 'BIG_TABLE_IDX' (INDEX) (Cost=1 Card=71414 Bytes=571312)


Isn't that a beauty!!!

Thursday, January 15, 2009

INDEX SKIP SCAN

A new feature called INDEX SKIP SCAN was introduced in Oracle 9i. By which oracle has the ability to skip some column of the index and go with the rest.

For example if you have a index with column (NAME, EMPLID), before 9i when you specify a where condition which just have EMPLID then the index is not considered.

But from 9i the new feature enables the optimizer to skip the NAME field of the index a go for the EMPLID.

A simple test case can explain this. See below.

SQL> create table t
2 as
3 select 'MY_TEST_DATA' obj_desc, object_id, object_name, object_type
4 from all_objects
5 /

Table created.

SQL> create index t_idx on t(obj_desc, object_id)
2 /

Index created.

SQL> exec dbms_stats.gather_table_stats(user, 'T', cascade=>true)

PL/SQL procedure successfully completed.

SQL> select * from t where object_id = 40
2 /

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=4 Card=1 Bytes=45)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=4 Card=1 Bytes=45)
2 1 INDEX (SKIP SCAN) OF 'T_IDX' (INDEX) (Cost=3 Card=1)


SQL> select * from t where obj_desc = 'MY_TEST_DATA' and object_id = 40
2 /

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=4 Card=1 Bytes=45)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=4 Card=1 Bytes=45)
2 1 INDEX (RANGE SCAN) OF 'T_IDX' (INDEX) (Cost=3 Card=1)


But i am not sure if SKIP SCAN is as good as RANGE SCAN. But you can be certain that its better than FULL SCAN.

But again the selection of this option by the optimizer depends on the cardinality of the first field. In the above example we saw that OBJ_DESC had one value and hence the optimizer went for a SKIP SCAN.

Lets take a case where the first column of the index has a high cardinality. Here i am going to create OBJ_DESC field with unique values. lets see what happens.

SQL> drop table t
2 /

Table dropped.

SQL> create table t
2 as
3 select 'MY_TEST_DATA'||rownum obj_desc, object_id, object_name, object_type
4 from all_objects
5 /

Table created.

SQL> create index t_idx on t(obj_desc, object_id)
2 /

Index created.

SQL> exec dbms_stats.gather_table_stats(user, 'T', cascade=>true)

PL/SQL procedure successfully completed.

SQL> select * from t where object_id = 40
2 /

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=209 Card=1 Bytes=50)
1 0 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=209 Card=1 Bytes=50)


SQL> select * from t where obj_desc = 'MY_TEST_DATA' and object_id = 40
2 /

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=4 Card=1 Bytes=50)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=4 Card=1 Bytes=50)
2 1 INDEX (RANGE SCAN) OF 'T_IDX' (INDEX) (Cost=3 Card=1)




Now you can very well see that the Optimizer decides to go for a FULL SCAN.

Tuesday, December 16, 2008

When the optimizer goes for a FULL SCAN

It is often asked why a SQL is going for a Full Table Scan. And also its widely considered that
Full Table Scan means BAD performance.

So i would try to explain the possible caused when a CBO decides to go for a full table
scan. And also the concept of FULL TABLE SCAN IS NOT EVIL.

To do that i have created a small test setup.

I have create a table FULL_TABLE_SCAN_TEST from all_objects.


SQL> create table full_table_scan_test
2 as
3 select *
4 from all_objects
5 /

SQL> select count(1) from full_table_scan_test
2 /

COUNT(1)
----------
930360



I have indexed the column OBJECT_TYPE.

SQL> create index full_table_scan_test_idx on full_table_scan_test(object_type)
2 /

Index created.

SQL> exec dbms_stats.gather_table_stats(user,'FULL_TABLE_SCAN_TEST',cascade=>TRUE)

PL/SQL procedure successfully completed.



CASE 1: The optimizer normally goes for a FULL TABLE SCAN when it has to retreive a
large portion of data from the table. This is because when accessing a large
portion of data, a scan through the index can be a overhead.

Below is a little demonstration of the same.


Now below query shows the count of records for object_type FUNCTION.

SQL> select count(1)
2 from full_table_scan_test
3 where object_type in ('FUNCTION')
4 /

COUNT(1)
----------
2408



This is a very small portion when compared to the total count which is 930360

So a condition like this must result in a INDEX RANGE SCAN. Lets try it.

SQL> explain plan
2 for
3 select *
4 from full_table_scan_test
5 where object_type in ('FUNCTION')
6 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------
Plan hash value: 607499257

--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1833 | 161K| 65 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| FULL_TABLE_SCAN_TEST | 1833 | 161K| 65 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | FULL_TABLE_SCAN_TEST_IDX | 1833 | | 7 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------
2 - access("OBJECT_TYPE"='FUNCTION')

14 rows selected.



As expected we got a INDEX RANGE SCAN.

But now a filter criteria on the same column which will return a large protion of record will go for
a FULL SCAN. This can easily be illustrated.

A filter for object_type TABLE, INDEX returns a vast protion for data as shown below.

SQL> select count(1)
2 from full_table_scan_test
3 where object_type IN ('TABLE','INDEX')
4 /

COUNT(1)
----------
428688



So when a filtering is done for this condition you can see the optimizer going for a FULL TABLE SCAN.

SQL> explain plan
2 for
3 select *
4 from full_table_scan_test
5 where object_type in ('TABLE','INDEX')
6 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------------------------------
Plan hash value: 798910834

------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 437K| 37M| 2796 (3)| 00:00:34 |
|* 1 | TABLE ACCESS FULL| FULL_TABLE_SCAN_TEST | 437K| 37M| 2796 (3)| 00:00:34 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------
1 - filter("OBJECT_TYPE"='INDEX' OR "OBJECT_TYPE"='TABLE')

13 rows selected.



Most of the time people assume that this is bad and try to HINT the optimizer to use the index.

This could turn to be expensive. Lets hint the above query and see what happens.

SQL> delete from plan_table
2 /

2 rows deleted.

SQL> explain plan
2 for
3 select /*+ INDEX(full_table_scan_test full_table_scan_test_idx) */ *
4 from full_table_scan_test
5 where object_type in ('TABLE','INDEX')
6 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT

---------------------------------------------------------------------------------------------------------
Plan hash value: 279860542
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 437K| 37M| 14723 (1)| 00:02:57 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| FULL_TABLE_SCAN_TEST | 437K| 37M| 14723 (1)| 00:02:57 |
|* 3 | INDEX RANGE SCAN | FULL_TABLE_SCAN_TEST_IDX | 437K| | 1053 (2)| 00:00:13 |
---------------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):

3 - access("OBJECT_TYPE"='INDEX' OR "OBJECT_TYPE"='TABLE')

15 rows selected.


Here you can see there is a drastic increase in the CPU Cost.

Conclution: FULL TABLE SCAN is one of the optimal way when trying to access a large portion of data.

CASE 2: Use of NOT leads to a FULL TABLE SCAN.

Even though the optimizer is a very smart program it has its owne assumptions and limitations.
One such thing is, when a NOT operatior is used the optimizer assumes that its going to
fetch a large portion of data and directly goes for a FULL TABLE SCAN.

So the use of NOT operator leads to a full table scan even though the data retrived is
less. See the demonstration below.

The follwing condition shown in the below query results in a very small portion of the table

SQL> select count(*)
2 from full_table_scan_test
3 where object_type not in ('TABLE','INDEX','VIEW','SYNONYM',
4 'TYPE','JAVA CLASS','PACKAGE','PACKAGE BODY')
5 /

COUNT(*)
----------
21184



But you can see that the optimizer decides to go for a FULL TABLE SCAN. Check this out.

SQL> explain plan
2 for
3 select *
4 from full_table_scan_test
5 where object_type not in ('TABLE','INDEX','VIEW','SYNONYM',
6 'TYPE','JAVA CLASS','PACKAGE','PACKAGE BODY')
7 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
Plan hash value: 798910834
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 313K| 26M| 2813 (4)| 00:00:34 |
|* 1 | TABLE ACCESS FULL| FULL_TABLE_SCAN_TEST | 313K| 26M| 2813 (4)| 00:00:34 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
1 - filter("OBJECT_TYPE"<>'INDEX' AND "OBJECT_TYPE"<>'TABLE' AND
"OBJECT_TYPE"<>'SYNONYM' AND "OBJECT_TYPE"<>'VIEW' AND "OBJECT_TYPE"<>'JAVA
CLASS' AND "OBJECT_TYPE"<>'TYPE' AND "OBJECT_TYPE"<>'PACKAGE BODY' AND
"OBJECT_TYPE"<>'PACKAGE')

16 rows selected.


Even hinting will lead only to a INDEX FULL SCAN. A RANGE SCAN is never an option with NOT operator.

Lets hint and see.

SQL> explain plan
2 for
3 select /*+ INDEX(full_table_scan_test full_table_scan_test_idx) */ *
4 from full_table_scan_test
5 where object_type not in ('TABLE','INDEX','VIEW','SYNONYM',
6 'TYPE','JAVA CLASS','PACKAGE','PACKAGE BODY')
7 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
Plan hash value: 2742810082
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 313K| 26M| 12196 (1)| 00:02:27 |
| 1 | TABLE ACCESS BY INDEX ROWID| FULL_TABLE_SCAN_TEST | 313K| 26M| 12196 (1)| 00:02:27 |
|* 2 | INDEX FULL SCAN | FULL_TABLE_SCAN_TEST_IDX | 313K| | 2412 (3)| 00:00:29 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
2 - filter("OBJECT_TYPE"<>'INDEX' AND "OBJECT_TYPE"<>'TABLE' AND "OBJECT_TYPE"<>'SYNONYM'
AND "OBJECT_TYPE"<>'VIEW' AND "OBJECT_TYPE"<>'JAVA CLASS' AND "OBJECT_TYPE"<>'TYPE' AND
"OBJECT_TYPE"<>'PACKAGE BODY' AND "OBJECT_TYPE"<>'PACKAGE')

16 rows selected.



Conclution: Use NOT operation while retreiving a large portion of data. If you are going to retreive
a small portion of data try to be positive ;) and do not use NOT operator. One work around
could be use of > and < together.

CASE 3: A NULL column is not an option for INDEX FAST FULL SCAN.

Normally when we do a COUNT on a table that has a index on a column we expect it to use that
index. But that is possible only when the indexed column is NOT NULL.

This is simply because the NULL values of a column are not indexed. And hence the optimizer
cant scan throught the index to give you the exact count of a table.

A little demonstration about this is given below.

The column OBJECT_TYPE is a NULL'able column. Hence when you do a COUNT on the table it goes for a FULL SCAN.

Check this out.

SQL> explain plan
2 for
3 select count(*)
4 from full_table_scan_test
5 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
Plan hash value: 3323906207
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2755 (2)| 00:00:34 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| FULL_TABLE_SCAN_TEST | 930K| 2755 (2)| 00:00:34 |
-----------------------------------------------------------------------------------

9 rows selected.



So now lets modify the column OBJECT_TYPE as NOT NULL and see what happens.

SQL> alter table full_table_scan_test modify object_type not null
2 /

Table altered.

SQL> exec dbms_stats.gather_table_stats(user,'FULL_TABLE_SCAN_TEST',cascade=>TRUE)

PL/SQL procedure successfully completed.



Now i execute the same query.

SQL> explain plan
2 for
3 select count(*)
4 from full_table_scan_test
5 /

Explained.

SQL> select * from table(dbms_xplan.display)
2 /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
Plan hash value: 2070121381
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 536 (5)| 00:00:07 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| FULL_TABLE_SCAN_TEST_IDX | 937K| 536 (5)| 00:00:07 |
------------------------------------------------------------------------------------------

9 rows selected.



Conclution: Optimizer can't trust a INDEX on a NULL'able column to getting all the data of that column.

Wednesday, November 12, 2008

Index is not all GOD

Another case to prove that index is not all GOD and FULL TABLE SCAN is not evil.

Today i saw this question on the OTN forum.

Hi All,

I need to execute a query like this :
SELECT * FROM ORDERS WHERE APPROVE_DATE IS NULL

I read anywhere that this will cause unnecessary FTS so that I should create function based index.

I have tried one below , but not sure that this is correct approach :

CREATE INDEX idx_1
ON ORDERS (NVL(APPROVE_DATE, '01-JAN-1900'));

SELECT * FROM ORDERS WHERE NVL(APPROVE_DATE, '01-JAN-1900') = '01-JAN-1900'

Is this a correct approach ?

Thank you,
xtanto


This is what i gave in reply. I prepared a simple test case to show that an index be USED as well as IGNORED.

A SQL_TRACE output will explain clearly what Justin has stated.

I have created a table T based on all_objects.

SQL> desc t
Name Null? Type
----------------------------------------- -------- -----------------------
OWNER NOT NULL VARCHAR2(30)
OBJECT_NAME NOT NULL VARCHAR2(30)
SUBOBJECT_NAME VARCHAR2(30)
OBJECT_ID NOT NULL NUMBER
DATA_OBJECT_ID NUMBER
OBJECT_TYPE VARCHAR2(19)
CREATED DATE
LAST_DDL_TIME NOT NULL DATE
TIMESTAMP VARCHAR2(19)
STATUS VARCHAR2(7)
TEMPORARY VARCHAR2(1)
GENERATED VARCHAR2(1)
SECONDARY VARCHAR2(1)


CASE I

SQL> select count(1) from t
2 /

COUNT(1)
----------
934320

SQL> select count(1) from t where created is null
2 /

COUNT(1)
----------
2376



The number of null values in CREATED column is proportionately very small.

Now i execute the query without function based index.

select *
from t
where created is null

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.09 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 160 0.04 0.10 0 12662 0 2376
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 162 0.04 0.19 0 12662 0 2376

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: ALL_ROWS
2376 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'T' (TABLE)



And here is the query that uses the function based index

select *
from t
where nvl(created,to_date('01-01-1900','DD-MM-YYYY')) = to_date('01-01-1900','DD-MM-YYYY')

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 160 0.01 0.01 0 698 0 2376
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 162 0.03 0.01 0 698 0 2376

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: ALL_ROWS
2376 TABLE ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF 'T' (TABLE)
2376 INDEX GOAL: ANALYZED (RANGE SCAN) OF 'T_FN_IDX' (INDEX)



Its very obvious from the above output that the Function Based Index as increased the performance.

CASE II

SQL> select count(1) from t
2 /

COUNT(1)
----------
934320

SQL> select count(1) from t where created is null
2 /

COUNT(1)
----------
202168



Now the null values in the CREATED column is proportionately large than the first test case.

Now lets see without using the function based index

select *
from t
where created is null

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 13479 0.46 0.71 2 25832 0 202168
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 13481 0.46 0.71 2 25832 0 202168

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: ALL_ROWS
202168 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'T' (TABLE)



Now iam trying to use the function based index

select *
from t
where nvl(created,to_date('01-01-1900','DD-MM-YYYY')) = to_date('01-01-1900','DD-MM-YYYY')

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 13479 0.54 0.84 0 33826 0 202168
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 13481 0.54 0.84 0 33826 0 202168

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: ALL_ROWS
202168 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'T' (TABLE)



Its obvious from the result that oracle has decided to go for a FULL TABLE SCAN even when an index was available.

So just having a function based index is not going to increase the query performance. There are lot of other factors to be considered as stated above.

The original link to the OTN Thread is

http://forums.oracle.com/forums/thread.jspa?threadID=825995&tstart=0

Thanks,
Karthick.

Wednesday, July 2, 2008

Loding XML using external table

A small example to load XML using external table.

SQL> CREATE DIRECTORY my_xml_dir AS 'E:\oracle\Log_files\UTL_AKIVATST'
2 /

Directory created.

SQL> DROP TABLE my_xml_et
2 /

Table dropped.

SQL> CREATE TABLE my_xml_et
2 ( EMPNO NUMBER,
3 EMPNAME VARCHAR2(10),
4 JOB VARCHAR2(10),
5 HIREDATE DATE,
6 SAL NUMBER
7 )
8 ORGANIZATION EXTERNAL
9 (
10 TYPE ORACLE_LOADER
11 DEFAULT DIRECTORY my_xml_dir
12 ACCESS PARAMETERS
13 (
14 records delimited by "</EMP>"
15 badfile my_xml_dir:'empxt%a_%p.bad'
16 logfile my_xml_dir:'empxt%a_%p.log'
17 FIELDS
18 (
19 filler char(2000) terminated by "<EMP>",
20 EMPNO char(2000) enclosed by "<EMPNO>" and "</EMPNO>",
21 EMPNAME char(2000) enclosed by "<ENAME>" and "</ENAME>",
22 JOB char(2000) enclosed by "<JOB>" and "</JOB>",
23 HIREDATE char(2000) enclosed by "<HIREDATE>" and "</HIREDATE>",
24 SAL char(2000) enclosed by "<SAL>" and "</SAL>"
25 )
26 )
27 LOCATION ('emp.xml')
28 )
29 PARALLEL
30 REJECT LIMIT UNLIMITED
31 /

Table created.

SQL> SELECT * FROM my_xml_et
2 /

EMPNO EMPNAME JOB HIREDATE SAL
---------- ---------- ---------- --------- ----------
7369 SMITH CLERK 17-DEC-80 800
7499 ALLEN SALESMAN 20-FEB-81 1600


This is the sample XML that i used.

emp.xml
----------------

<EMPLOYEES>
<EMP>
<EMPNO>7369</EMPNO>
<ENAME>SMITH</ENAME>
<JOB>CLERK</JOB>
<HIREDATE>17-DEC-80</HIREDATE>
<SAL>800</SAL>
</EMP>
<EMP>
<EMPNO>7499</EMPNO>
<ENAME>ALLEN</ENAME>
<JOB>SALESMAN</JOB>
<HIREDATE>20-FEB-81</HIREDATE>
<SAL>1600</SAL>
<COMM>300</COMM>
</EMP>
</EMPLOYEES>

Wednesday, June 18, 2008

Find out Blocking Session

I came across this pieace of code in Oracle form which helps us to find the blocking sessions. I tought of sharing it here.


SQL> SET ECHO OFF
SQL> SET LINE 10000
SQL> COLUMN BLOCKER FORMAT A11;
SQL> COLUMN BLOCKEE FORMAT A10;
SQL> COLUMN SID FORMAT 99999;
SQL> COLUMN BLOCKER_MODULE FORMAT A15 TRUNC;
SQL> COLUMN BLOCKEE_MODULE FORMAT A15 TRUNC;
SQL>
SQL> ALTER SESSION SET OPTIMIZER_MODE=RULE;

Session altered.

SQL>
SQL> SELECT a.inst_id,
2 (SELECT username FROM gv$session s WHERE s.inst_id=a.inst_id AND s.sid=a.sid) blocker,
3 a.sid,
4 (SELECT module FROM gv$session s WHERE s.inst_id=a.inst_id AND s.sid=a.sid) blocker_module ,
5 ' is blocking ' "IS BLOCKING",
6 b.inst_id,
7 (SELECT username FROM gv$session s WHERE s.inst_id=b.inst_id AND s.sid=b.sid) blockee,
8 b.sid ,
9 (SELECT module FROM gv$session s WHERE s.inst_id=b.inst_id AND s.sid=b.sid) blockee_module
10 FROM gv$lock a, gv$lock b
11 where a.block <>0
12 AND b.request > 0
13 AND a.id1 = b.id1
14 AND a.id2 = b.id2
15 AND a.sid <> b.sid
16 order by 1, 2
17 /

INST_ID BLOCKER SID BLOCKER_MODULE IS BLOCKING INST_ID BLOCKEE SID BLOCKEE_MODULE
---------- ----------- ------ --------------- ------------- ---------- ---------- ------ ---------------
1 SYSADM 15 SQL*Plus is blocking 1 SYSADM 11 T.O.A.D.

SQL>
SQL> ALTER SESSION SET OPTIMIZER_MODE=CHOOSE;

Session altered.

SQL> BEGIN
2 dbms_output.enable(1000000);
3 FOR do_loop IN (SELECT session_id, a.object_id, xidsqn, oracle_username, b.owner owner,
4 b.object_name object_name, b.object_type object_type
5 FROM v$locked_object a, dba_objects b
6 WHERE xidsqn != 0
7 and b.object_id = a.object_id)
8 LOOP
9 dbms_output.put_line('.');
10 dbms_output.put_line('Blocking Session : '||do_loop.session_id);
11 dbms_output.put_line('Object (Owner/Name): '||do_loop.owner||'.'||do_loop.object_name);
12 dbms_output.put_line('Object Type : '||do_loop.object_type);
13 FOR next_loop IN (SELECT sid
14 FROM v$lock
15 WHERE id2 = do_loop.xidsqn
16 AND sid != do_loop.session_id)
17 LOOP
18 dbms_output.put_line('Sessions being blocked : '||next_loop.sid);
19 END LOOP;
20 END LOOP;
21 END;
22 /
.
Blocking Session : 15
Object (Owner/Name): SYSADM.T
Object Type : TABLE
Sessions being blocked : 11

PL/SQL procedure successfully completed.

SQL>

Monday, June 16, 2008

Lesson for the day

With cost based optimizer if you don’t collect statistics properly then it means that you are screwed.

This is what I did.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.1.0 - Production

I created the following table.

SQL> create table hx_index_test_tbl(
2 no integer not null,
3 name varchar2(50) not null,
4 old_value varchar2(100),
5 new_value varchar2(100));

Table created.

Next I added these two indexes to the table.

SQL> create index hx_index_test_tbl_idx on hx_index_test_tbl(name,no);

Index created.

SQL> create index hx_index_test_tbl_idx_001 on hx_index_test_tbl(name);

Index created.

Then I populated some data.

SQL> insert into hx_index_test_tbl
2 select row_number() over(partition by name order by 1) no, name, rpad('*',100,'*') old_value, rpad('*',100,'*') new_value
3 from (select 'A'mod(level,100) name from dual connect by level <=1000);

1000 rows created.

SQL> commit;

Commit complete.

Got the statistics gathered.

SQL> exec dbms_stats.gather_table_stats(user,'hx_index_test_tbl',method_opt=>'for all indexed columns');

PL/SQL procedure successfully completed.

SQL> set serveroutput on
SQL> set autotrace traceonly explain
SQL> select * from hx_index_test_tbl where name = 'A0';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=10 Bytes=1000)
1 0 TABLE ACCESS (FULL) OF 'HX_INDEX_TEST_TBL' (Cost=5 Card=10 Bytes=1000)

Oops its going for a full scan. It’s not considering my index. I want it to pick up the index HX_INDEX_TEST_TBL_IDX_001.

SQL> select * from hx_index_test_tbl where name = 'A0' and no = 1;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=100)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'HX_INDEX_TEST_TBL' (Cost=2 Card=1 Bytes=100)
2 1 INDEX (RANGE SCAN) OF 'HX_INDEX_TEST_TBL_IDX' (NON-UNIQUE) (Cost=1 Card=1)

But when I add no in the where clause its considering the index why is it so???

The reason is very simple. See how I have gathered statistics.

dbms_stats.gather_table_stats(user,'hx_index_test_tbl',method_opt=>'for all indexed columns');

I forgot to gather for Index of this table. Now let me gather for the index of this table also.

SQL> EXEC dbms_stats.gather_table_stats(user,'hx_index_test_tbl',method_opt=>'for all indexed columns', cascade=>TRUE);

PL/SQL procedure successfully completed.

Now see what happens...

SQL> select * from hx_index_test_tbl where name = 'A0';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=10 Bytes=1000)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'HX_INDEX_TEST_TBL' (Cost=2 Card=10 Bytes=1000)
2 1 INDEX (RANGE SCAN) OF 'HX_INDEX_TEST_TBL_IDX_001' (NON-UNIQUE) (Cost=1 Card=10)


So if you don’t gather stats properly with cost based optimizer you will get weird execution plans.