Category Archives: Databases

Language and Use of Indexes in Oracle

Symptoms

It takes a very long time to load a data warehouse on Oracle database with Kettle when the language in the Windows Region and Language settings is different from English (United States).

Cause

  • Oracle JDBC driver used by Kettle issues command ALTER SESSION SET NLS_LANGUAGE and changes default database language.
  • Changing NLS_LANGUAGE also changes oracle parameter NLS_SORT from BINARY to the specified language.
  • When NLS_COMP is equal to ANSI and the value of NLS_SORT is not BINARY, the linguistic comparison is used for WHERE statements.
  • When the value of NLS_COMP is different from BINARY a full scan is used for VARCHAR2 fields instead of index range scan.

Resolution

Add “-Duser.country=en -Duser.language=en” to java runtime options in Kettle batch files.

Experiment

When NLS_SORT is BINARY, the index is used.

SELECT PARAMETER, VALUE FROM V$NLS_PARAMETERS
WHERE PARAMETER IN ('NLS_LANGUAGE', 'NLS_SORT', 'NLS_COMP');

UPDATE F_JOB_BUDGET_LINE SET ACTUAL_COST=0
WHERE INSTANCE_KEY='JobBudgetLine00000000000004835665';


When the  parameter NLS_LANGUAGE is changed, NLS_SORT is also changed. The index is not used when NLS_SORT=CZECH and NLS_COMP=ANSI.

ALTER SESSION SET NLS_LANGUAGE='CZECH';
SELECT PARAMETER, VALUE FROM V$NLS_PARAMETERS
WHERE PARAMETER IN ('NLS_LANGUAGE', 'NLS_SORT');

UPDATE F_JOB_BUDGET_LINE SET ACTUAL_COST=0
WHERE INSTANCE_KEY='JobBudgetLine00000000000004835665';


We can change NLS_SORT to BINARY to make the index used again. Alternatively we can set NLS_COMP to BINARY, this will have the same effect.

ALTER SESSION SET NLS_SORT='BINARY';
SELECT PARAMETER, VALUE FROM V$NLS_PARAMETERS
WHERE PARAMETER IN ('NLS_LANGUAGE', 'NLS_SORT');

UPDATE F_JOB_BUDGET_LINE SET ACTUAL_COST=0
WHERE INSTANCE_KEY='JobBudgetLine00000000000004835665';

How to Find Running Queries – SQL Server

SELECT
  CN.session_id  AS SPID,
  ST.text        AS SqlStatementText
FROM
  sys.dm_exec_connections AS CN
CROSS APPLY
  sys.dm_exec_sql_text(CN.most_recent_sql_handle) AS ST
ORDER BY
  CN.session_id

If a BusinessObjects document is refreshing for a long time, you might be interested in what query is currently running (to optimize it). The above code can help to figure out this.

Hierarchical (Recursive) Queries in SQL Server

This code sample demonstrates recursive queries in SQL Server.

The query returns paths of Kettle transformations.

with Parent as
(
   select
      ID_DIRECTORY
   ,  ID_DIRECTORY_PARENT
   ,  cast(DIRECTORY_NAME
         as varchar(255)) as DIRECTORY_PATH
   from
      R_DIRECTORY
   where
      ID_DIRECTORY_PARENT = 0
   union all
   select
      TH.ID_DIRECTORY
   ,  TH.ID_DIRECTORY_PARENT
   ,  cast(Parent.DIRECTORY_PATH
         + '/' + DIRECTORY_NAME
         as varchar(255)) as DIRECTORY_PATH
   from
      R_DIRECTORY TH
   inner join
      Parent
   on
      Parent.ID_DIRECTORY = TH.ID_DIRECTORY_PARENT
)
select
   coalesce(dir.DIRECTORY_PATH, '/') DIRECTORY_PATH
,  trans.NAME as TRANSFORMATION_NAME
from
   Parent dir
right join
   R_TRANSFORMATION trans
on
   dir.ID_DIRECTORY = trans.ID_DIRECTORY

Lock information in SQL Server

The script gets miscellaneous information about locks including SQL.

SELECT 
  L.request_session_id              AS SPID, 
  DB_NAME(L.resource_database_id)   AS DatabaseName,
  L.request_mode                    AS LockType,
  L.request_status                  AS RequestStatus,
  L.resource_type                   AS ResourceType,
  L.resource_description            AS ResourceDescription,
  L.resource_associated_entity_id   AS ResourceAssociatedEntryId,
  -- Object
  O.name                            AS ObjectName, 
  O.object_id                       AS ObjectId,
  -- SQL
  ST.text                           AS SqlStatementText,
  CN.auth_scheme                    AS AuthenticationMethod,
  -- Session
  ES.login_name                     AS LoginName,
  ES.host_name                      AS HostName,
  -- Transaction
  TST.is_user_transaction           AS IsUserTransaction,
  AT.name                           AS TransactionName,
  AT.transaction_begin_time         AS TransactionBeginTime
FROM    
  sys.dm_tran_locks L
  -- Object
  LEFT JOIN sys.objects O ON O.object_id = (
    CASE 
      WHEN resource_type IN ('PAGE', 'KEY', 'RID', 'HOBT') 
        THEN (
          SELECT P.object_id 
          FROM sys.partitions P 
          WHERE P.hobt_id = L.resource_associated_entity_id
        ) 
      WHEN resource_type = 'OBJECT' 
        THEN L.resource_associated_entity_id
      WHEN resource_type = 'ALLOCATION_UNIT'     
        THEN (
          SELECT P.object_id FROM sys.partitions P 
          INNER JOIN sys.allocation_units AU
          ON AU.container_id = (
            CASE 
              WHEN AU.type IN (1,3) THEN P.hobt_id
              WHEN AU.type = 2 THEN P.partition_id
            END
          )
          WHERE AU.allocation_unit_id = L.resource_associated_entity_id
        ) 
    END
  )
  -- SQL
  LEFT JOIN sys.dm_exec_connections CN 
  ON CN.session_id = L.request_session_id
  CROSS APPLY sys.dm_exec_sql_text(CN.most_recent_sql_handle) AS ST
  -- Session
  LEFT JOIN sys.dm_exec_sessions ES 
  ON ES.session_id = L.request_session_id
  -- Transaction
  LEFT JOIN sys.dm_tran_session_transactions TST 
  ON TST.session_id = L.request_session_id
  LEFT JOIN sys.dm_tran_active_transactions AT 
  ON AT.transaction_id = TST.transaction_id 
ORDER BY 
  L.request_session_id

Script to kill user sessions in Oracle

Script to kill all sessions of a user in Oracle.

BEGIN
  FOR r IN (select sid,serial# from v$session where username = 'USER')
  LOOP
    EXECUTE IMMEDIATE 'alter system kill session ''' || r.sid 
      || ',' || r.serial# || '''';
  END LOOP;
END;

Manual

Retrieve session identifiers and session serial number (which uniquely identifies a session’s objects):

select sid, serial# from v$session where username = 'USER'

The syntax to kill session is

alter system kill session 'sid,serial#'

Disconnecting session:

alter system disconnect session 'sid,serial#' post_transaction;
alter system disconnect session 'sid,serial#' immediate;

More information

Killing Oracle Sessions on www.oracle-base.com

Comparison of database schemas

Suppose we need to compare two database schemas to find out changes such as new tables, columns, and change of data type.

Test

create user USER1 identified by USER1;
create user USER2 identified by USER2;
create table USER1.T1 (C int);
create table USER2.T1 (C int);
create table USER1.T2 (C int);
create table USER2.T3 (C int);
create table USER1.T4 (C int, D int);
create table USER2.T4 (C int);
create table USER1.T5 (C int, E int);
create table USER2.T5 (C int, E varchar(20));

Result

Schema comparison

Continue reading

How to print explain plan

DELETE PLAN_TABLE;
EXPLAIN PLAN FOR
  SELECT * FROM t WHERE group2=1;
SELECT * FROM TABLE(dbms_xplan.display());

Oracle utilities

The Oracle utility creating PLAN_TABLE:

D:\Oracle\product\11.2.0\dbhome_1\RDBMS\ADMIN\utlxplan.sql

The Oracle utility displays the last explain plan. It displays also parallel query information if the plan happens to run parallel:

D:\Oracle\product\11.2.0\dbhome_1\RDBMS\ADMIN\utlxplp.sql

The utility that does not displays parallel query information (sequential).

D:\Oracle\product\11.2.0\dbhome_1\RDBMS\ADMIN\utlxpls.sql

Hierarchical query

An example of use of hierarchical queries to print the last explain plan:

SELECT
  LPad(' ', level-1) || operation || ' (' || options || ')' AS operation,
  coalesce(object_name, ' ') object,
  cost,
  cardinality,
  bytes
FROM
  plan_table
START WITH
  id = 0
CONNECT BY PRIOR
  id=parent_id;

Example of the result:

OPERATION OBJECT COST CARDINALITY BYTES
SELECT STATEMENT () 873 1000 20000
  TABLE ACCESS (BY INDEX ROWID) T 873 1000 20000
    INDEX (RANGE SCAN) T_GROUP2 5 1000

Selectivity and Execution Plan

Cardinality is how many rows should be returned after evaluating predicates. Selectivity is cardinality divided by the total number of input rows. DBMS uses selectivity for choosing the right index.

Density is characteristic of data in a column. It is equal to 1 divided by the number of distinct values.

Let’s consider few examples of queries with different selectivity of predicate. We will restrict on one field so the selectivity of the predicate will be equal to the density.

Data

Create a table with five indexed fields. All fields have different number of unique values.

DROP TABLE t;
CREATE TABLE t (id INT, 
                group1 INT, 
                group2 INT, 
                group3 INT, 
                group4 INT);

BEGIN
  FOR id IN 1..1000000 LOOP
    INSERT INTO t VALUES (id, 
                          dbms_random.Value(1,100000), 
                          dbms_random.Value(1,1000), 
                          dbms_random.Value(1,100), 
                          dbms_random.Value(1,10));
  END LOOP;
END;
/

commit;

CREATE INDEX t_id ON t (id);
CREATE INDEX t_group1 ON t (group1);
CREATE INDEX t_group2 ON t (group2);
CREATE INDEX t_group3 ON t (group3);
CREATE INDEX t_group4 ON t (group4);

Example 1

Query with a restriction on id with unique values. Selectivity=1/1M (high)

SELECT * FROM t WHERE id = 1; 

Execution plan. Index used.

TABLE ACCESS (BY INDEX ROWID) of T #1 TABLE (Cost=4 Card=1 Bytes=65)
  INDEX (RANGE SCAN) of T_ID INDEX Optimizer=ANALYZED (Cost=3 Card=1 Bytes=)

Example 2

Query with a restriction on group1. Selectivity=1/100K (still high)

SELECT * FROM t WHERE group1 = 1;

Execution plan. Index used.

TABLE ACCESS (BY INDEX ROWID) of T #1 TABLE (Cost=6 Card=3 Bytes=195)
  INDEX (RANGE SCAN) of T_GROUP1 INDEX Optimizer=ANALYZED (Cost=6 Card=3 Bytes=)

Example 3

Query with a restriction on group2. Selectivity=1/1K

SELECT * FROM t WHERE group2 = 1;

Execution plan. Index used.

TABLE ACCESS (BY INDEX ROWID) of T #1 TABLE (Cost=348 Card=468 Bytes=30,420)
  INDEX (RANGE SCAN) of T_GROUP2 INDEX Optimizer=ANALYZED (Cost=3 Card=468 Bytes=)

Example 4

Query with a restriction on group3. Selectivity=1/100 (low)

SELECT * FROM t WHERE group3 = 1;

Execution plan. Index has NOT been used.

TABLE ACCESS (FULL) of T #1 TABLE (Cost=795 Card=6,425 Bytes=417,625)

Example 5 (hint)

Let’s compare with the plan of the same query but using hint forcing use of the corresponding index.

                       
SELECT /*+ INDEX(t t_group3) */ * FROM t WHERE group3 = 1;

The cost is 1,774 vs 795 i.e. 2.23 times larger.

TABLE ACCESS (BY INDEX ROWID) of T #1 TABLE (Cost=1,774 Card=6,425 Bytes=417,625)
  INDEX (RANGE SCAN) of T_GROUP3 INDEX Optimizer=ANALYZED (Cost=13 Card=6,425 Bytes=)

Example 6

Just to complete picture. Here is query without restriction:

SELECT * FROM t;

Execution plan:

TABLE ACCESS (FULL) of T #1 TABLE (Cost=792 Card=1,179,200 Bytes=76,648,000)