(Familiar with case-insenstive collation already? Skip ahead to the "Side Effects")
A history of failed applications
Before Oracle 12, a common cause of frustration was how to easily handle the case of character data in your applications. It is day 1 of your application being live, and a simple query on your database suggests that all is going well so far
SQL> select * 2 from customers 3 where cust_name = 'ADAMS'; COUNTRY CREATED CUST_NAME ------------ --------- --------------------- AUS 08-NOV-16 ADAMS
Then a few minutes, the phone starts to ring, and people are complaining that their records are not being displayed, or the screens are crashing with duplicate errors, and your day sounds like it is about to get a whole lot worse. You make a small adjustment to your query and the cause is revealed:
SQL> select * 2 from customers 3 where upper(cust_name) = 'ADAMS'; COUNTRY CREATED CUST_NAME ------------ --------- ------------ AUS 07-NOV-16 Adams AUS 08-NOV-16 ADAMS AUS 09-NOV-16 adams
Maybe having more than one ADAMS is OK, maybe its not, but the most common scenario is that some of your development team were expecting all the data to be mixed case, and some others in the team were expecting it all to be upper case; the classic "left hand not knowing what the right hand is doing" problem. For the same reason, you'll probably quickly find that half of the application is running fine, and the other half is not, because of your indexing strategy
SQL> select column_name 2 from user_ind_columns 3 where index_name = 'CUST_IX'; COLUMN_NAME ------------------------------ CUST_NAME SQL> select * from customers 2 where upper(cust_name) = 'ADAMS'; ------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | ------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 152 | |* 1 | TABLE ACCESS FULL| CUSTOMERS | 1 | 152 | -------------------------------------------------------
Rather than refactor great chunks of the application, you'll probably take the quick fix and double up on indexes on the database. Ugh!
SQL> create index cust_ix 2 on customers ( cust_name ); Index created. SQL> create index cust_ix2 2 on customers ( upper(cust_name) ); Index created.
Now your query performance is better, but the one thing that always comes back to bite us when we opt for quick fixes to the problems, is not exploring the potential side effects. For example, in our hypothetical application, you're probably going to get another out of hours phone call from the DBA after you add this index when you discover
SQL> alter table customers shrink space; * ERROR at line 1: ORA-10631: SHRINK clause should not be specified for this object
Yup...no shrink command on a table with a function based index.
In Oracle 12, we introduced better handling of case-insensitivity with the BINARY_CI collation. I recreate my CUSTOMERS table with the new collation clause on the CUSTOMER_NAME column
SQL> CREATE TABLE CUSTOMERS 2 ( 3 COUNTRY VARCHAR2(128), 4 CREATED DATE, 5 CUST_NAME VARCHAR2(150) COLLATE BINARY_CI 6 ); Table created.
And now with a single index, I can get case-insensitive results even through my predicate only nominated one case.
SQL> create index cust_ix 2 on customers ( cust_name); Index created. SQL> set autotrace traceonly explain SQL> select * from customers 2 where cust_name = 'ADAMS'; ----------------------------------------------------------------- | Id | Operation | Name | Rows | ----------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| CUSTOMERS | 1 | |* 2 | INDEX RANGE SCAN | CUST_IX | 1 | ----------------------------------------------------------------- SQL> select * from customers 2 where cust_name = 'ADAMS'; COUNTRY CREATED CUST_NAME ------------ --------- ---------------- AUS 07-NOV-16 Adams AUS 08-NOV-16 ADAMS AUS 09-NOV-16 adams
Side Effects
Once developers have seen how easy it is to make columns case-insensitive in the new collation functionality, those who have come from platforms where case-insensitivity is the default (I'm looking squarely at you Windows file systems! ) are easily tempted to rush out and throw a BINARY_CI as the default at schema level and thus "magically" make your entire application case-insensitive.
Here is why you might want to have a re-think on that approach. In particular, just like the SHRINK SPACE side-effect of function-based indexes, the collation clauses also have side effects that you should be aware of before you head down that path.
For this example, I'll create a table of employees, but I have drunk the collation Kool-aid and my names are going to be case-insensitive.
SQL> create table CI_EMP ( 2 emp_id int, 3 first_name varchar2(30) collate binary_ci, 4 last_name varchar2(30) collate binary_ci, 5 salary int, 6 manager_id int); Table created.
I'll insert some sample data from the standard HR sample schema, and a few queries shows that my case-insensitive queries are working as expected
SQL> insert into CI_EMP 2 select employee_id, first_name, last_name, salary, manager_id 3 from hr.employees; 107 rows created. SQL> SQL> select * from CI_EMP where rownum <= 10; EMP_ID FIRST_NAME LAST_NAME SALARY MANAGER_ID ---------- ------------------------------ ------------------------------ ---------- ---------- 100 Steven King 24000 101 Neena Kochhar 17000 100 102 Lex De Haan 17000 100 103 Alexander Hunold 9000 102 104 Bruce Ernst 6000 103 105 David Austin 4800 103 106 Valli Pataballa 4800 103 107 Diana Lorentz 4200 103 108 Nancy Greenberg 12008 101 109 Daniel Faviet 9000 108 10 rows selected. SQL> select * from CI_EMP where last_name = 'Lorentz'; EMP_ID FIRST_NAME LAST_NAME SALARY MANAGER_ID ---------- ------------------------------ ------------------------------ ---------- ---------- 107 Diana Lorentz 4200 103 1 row selected. SQL> select * from CI_EMP where last_name = 'LORENTZ'; EMP_ID FIRST_NAME LAST_NAME SALARY MANAGER_ID ---------- ------------------------------ ------------------------------ ---------- ---------- 107 Diana Lorentz 4200 103 1 row selected.
Now a new application requirement comes in - we'd like a new column called FULL_NAME which is the simple concatenation of the FIRST_NAME and LAST_NAME.
"No problem!" I say. I'll simply throw together a quick trigger to handle that
SQL> alter table CI_EMP add full_name varchar2(80); Table altered. SQL> SQL> create or replace 2 trigger CI_EMP_FULL_NAME 3 before insert on CI_EMP 4 for each row 5 begin 6 :new.full_name := :new.first_name||' '||:new.last_name; 7 end; 8 / Warning: Trigger created with compilation errors. SQL> sho err Errors for TRIGGER CI_EMP_FULL_NAME: LINE/COL ERROR -------- ----------------------------------------------------------------- 2/21 PLS-00049: bad bind variable 'NEW.FIRST_NAME' 2/43 PLS-00049: bad bind variable 'NEW.LAST_NAME'
It can often be the case that our first cut at trigger code results in a compilation error due to our keyboard skills and the like, but the SHOW ERRORS command reveals that there is more to this error. Triggers (and PL/SQL code in general) are not going to play nicely with collations that deviate from the norm. This is thoroughly covered in the official docs.
Workarounds
For this simple requirement of an expression on existing columns, we could do away with the need of the trigger altogether and just use a virtual column:
SQL> alter table CI_EMP drop column full_name; Table altered. SQL> SQL> alter table CI_EMP add full_name varchar2(80) 2 generated always as ( first_name||' '||last_name ); Table altered.
But in the general case where a trigger contains extensive logic this may not be appropriate. To handle these situations, the key is to remember that PL/SQL is not going to handle the values, and thus you need to be using SQL to perform any logic. For example, I can use a compound trigger to move the :NEW logic to a SQL statement in the AFTER STATEMENT section
SQL> alter table CI_EMP drop column full_name; Table altered. SQL> SQL> alter table CI_EMP add full_name varchar2(80) ; Table altered. SQL> SQL> drop trigger CI_EMP_FULL_NAME; Trigger dropped. SQL> SQL> create or replace 2 trigger CI_EMP_FULL_NAME 3 for insert on CI_EMP 4 compound trigger 5 6 pk sys.odcinumberlist := sys.odcinumberlist(); 7 8 before each row is 9 begin 10 pk.extend; 11 pk(pk.count) := :new.emp_id; 12 end before each row; 13 14 15 after statement is 16 begin 17 update ci_emp 18 set full_name = first_name||' '||last_name 19 where emp_id in ( select column_value from table(pk)); 20 end after statement; 21 22 end; 23 / Trigger created. SQL> SQL> insert into ci_emp (emp_id,first_name,last_name,salary,manager_id) 2 values (1000,'Connor','McDonald',0,100); 1 row created. SQL> SQL> select * from ci_emp 2 where first_name = 'Connor'; EMP_ID FIRST_NAME LAST_NAME SALARY MANAGER_ID FULL_NAME ---------- ------------------------------ ------------------------------ ---------- ---------- ---------------------------------------- 1000 Connor McDonald 0 100 Connor McDonald 1 row selected.
but such solutions might not always be possible, in particular, if you need to reference :OLD values.
Key Takeaway
Heading down the path of case-insensitivity might require some changes in how you design your applications in order to bypass the PL/SQL layer for certain components. Note this is a PL/SQL restriction, not a TRIGGER restriction. Many of the cool things we take for granted in PL/SQL may not be available when you start modifying collations for columns.
SQL> declare 2 myrow ci_emp%rowtype; 3 begin 4 select * 5 into myrow 6 from CI_EMP 7 where last_name = 'Lorentz'; 8 end; 9 / myrow ci_emp%rowtype; * ERROR at line 2: ORA-06550: line 2, column 9: PLS-00762: %TYPE or %ROWTYPE must have the USING_NLS_COMP collation ORA-06550: line 0, column 0: PL/SQL: Compilation unit analysis terminated
As always...test thoroughly.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.