Home » RDBMS Server » Server Utilities » Loading multiple tables from delimited file with SQL*Loader (Oracle 11g)
Loading multiple tables from delimited file with SQL*Loader [message #331930] Sun, 06 July 2008 12:42 Go to next message
renjiths
Messages: 8
Registered: July 2008
Location: Bangalore
Junior Member
Hi,

I need some help understanding the behavior of SQL*Loader. Though I searched thru the topics, was not able to find a very similar post. Even then, I have got some cues already, but want to confirm my understanding.

I have to load multiple, related tables from a single delimited file (CSV). The same logical record will have contents for all the tables. The load for the first table is fine; for subsequent INTO TABLE I was expecting the "position" to continue with the next column after the last processed one. (I deduced so from whatever documentation I have read so far!) However it is not so - I had to use the POSITION(1) clause and then skip columns (as many numbers for the first table) to point to the second table's columns and so on.

Though this solution works now, I want to know whether there is a better way to handle this problem.

I can provide more details if required.

Thanks in advance.

regards,
Renjith
Re: Loading multiple tables from delimited file with SQL*Loader [message #331931 is a reply to message #331930] Sun, 06 July 2008 13:06 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
Please read & follow Posting Guidelines below
http://www.orafaq.com/forum/t/88153/0/

You might want to consider using "external tables" instead of SQL*Loader
Re: Loading multiple tables from delimited file with SQL*Loader [message #331934 is a reply to message #331930] Sun, 06 July 2008 13:45 Go to previous messageGo to next message
Michel Cadot
Messages: 68650
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Quote:
I can provide more details if required.

Yes data file example as well as table description.
But as Ana said, consider external table.

Regards
Michel
Re: Loading multiple tables from delimited file with SQL*Loader [message #332491 is a reply to message #331934] Tue, 08 July 2008 12:49 Go to previous messageGo to next message
renjiths
Messages: 8
Registered: July 2008
Location: Bangalore
Junior Member
I have now concluded that SQL*Loader is not going to work for the use case. The main issue was with the fact that certain tables (to be loaded from a single data file) need to have primary keys generated using SEQUENCEs and the same key values need to be inserted in related tables as foreign keys.

As suggested, I am exploring the use of External tables. From the documentation avaiable, it is not clear about how to add extra columns to the external table (not present in the data file) - to populate them with CONSTANT values, SEQUENCE numbers (NEXTVAL, CURRVAL) etc. I have referred the "Oracle Database Utilities" Guide and the column_transforms Clause seems ideal for use in such situations (or is it?). But there are no usage examples for these. Can anyone help please?

Sampe input file (CSV) and tables are shown below:
Identifier value, Patient First Name, Last Name, Name Type
1001, John, Doe, Legal
1002, Maria, Armstrong, Legal

To be populated into tables
1. Patient_Identifier
Identifier_DBKey, Identifier_Value

2. Patient_Name
Name_DBKey, Identifier_DBKey, First_Name, Last_Name, Name_Type,...

The same Identifier_DBKey generated for Patient_Identifier table while uploading row need to be used for inserting Identifier_DBKey FK in the Patient_Name table.

regards,
Renjith
Re: Loading multiple tables from delimited file with SQL*Loader [message #332492 is a reply to message #332491] Tue, 08 July 2008 12:53 Go to previous messageGo to next message
Michel Cadot
Messages: 68650
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
If you provide a test case as requested, I can show how to do it.

Regards
Michel
Re: Loading multiple tables from delimited file with SQL*Loader [message #332498 is a reply to message #331930] Tue, 08 July 2008 13:13 Go to previous messageGo to next message
renjiths
Messages: 8
Registered: July 2008
Location: Bangalore
Junior Member
Create Statement:

CREATE TABLE PATIENT_EXT (
Identifier_Value VARCHAR2(128),
Identifier_DBKey NUMBER(10),
Name_DBKey NUMBER(10),
First_Name VARCHAR2(64),
Last_Name VARCHAR2(64),
Name_Type VARCHAR2(64)
)
ORGANIZATION EXTERNAL (
TYPE oracle_loader
DEFAULT DIRECTORY DATA_DIR
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
FIELDS TERMINATED BY ','
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
(
Identifier_Value,
Identifier_DBKey, -- COMMENT -- to be a computed column from ID_DBKEY_SEQ.NEXTVAL, how?
Name_DBKey, -- COMMENT -- to be a computed column from ID_DBKEY_SEQ.CURRVAL, how?
First_Name,
Last_Name,
Name_Type
))
LOCATION ('BATCHFEED.csv'))
PARALLEL
REJECT LIMIT 0;
Re: Loading multiple tables from delimited file with SQL*Loader [message #332499 is a reply to message #332498] Tue, 08 July 2008 13:31 Go to previous messageGo to next message
Michel Cadot
Messages: 68650
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
What can I do with that.
I want a FULL test case that I can execute.
create table statements for target table, data file...

Regards
Michel
Re: Loading multiple tables from delimited file with SQL*Loader [message #332500 is a reply to message #331930] Tue, 08 July 2008 14:01 Go to previous messageGo to next message
renjiths
Messages: 8
Registered: July 2008
Location: Bangalore
Junior Member
File with the DDLs, file and external table create statement attached.
Thanks,
Renjith
Re: Loading multiple tables from delimited file with SQL*Loader [message #332532 is a reply to message #332500] Tue, 08 July 2008 19:08 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9092
Registered: November 2002
Location: California, USA
Senior Member
Although this can be done using SQL*Loader, it is easier and faster using an external table, as demonstrated below. I started the sequence at 101, instead of 1000, just to make it easier to differentiate it from the identifier_value. Since you did not say what calculation you wanted to do on the sequence to obtain the name_dbkey, I just subtracted 100 from it. This is just an example that you can modify to suite your needs.


SCOTT@orcl_11g> CREATE OR REPLACE DIRECTORY data_dir AS 'c:\oracle11g'
  2  /

Directory created.

SCOTT@orcl_11g> CREATE TABLE pat_ext (
  2  Identifier_Value VARCHAR2(128),
  3  First_Name VARCHAR2(64),
  4  Last_Name VARCHAR2(64),
  5  Name_Type VARCHAR2(64)
  6  )
  7  ORGANIZATION EXTERNAL (
  8  TYPE oracle_loader
  9  DEFAULT DIRECTORY DATA_DIR
 10  ACCESS PARAMETERS (
 11  RECORDS DELIMITED BY NEWLINE
 12  FIELDS TERMINATED BY ','
 13  MISSING FIELD VALUES ARE NULL
 14  REJECT ROWS WITH ALL NULL FIELDS
 15  (
 16  Identifier_Value,
 17  First_Name,
 18  Last_Name,
 19  Name_Type
 20  ))
 21  LOCATION ('BATCHFEED.csv'))
 22  PARALLEL
 23  REJECT LIMIT 0
 24  /

Table created.

SCOTT@orcl_11g> column identifier_value format a16
SCOTT@orcl_11g> column first_name format a10
SCOTT@orcl_11g> column last_name format a10
SCOTT@orcl_11g> column name_type format a10
SCOTT@orcl_11g> select * from pat_ext
  2  /

IDENTIFIER_VALUE FIRST_NAME LAST_NAME  NAME_TYPE
---------------- ---------- ---------- ----------
1001             John       Doe        Legal
1002             Maria      Armstrong  Legal
1003             Mark       Joiner     Legal

SCOTT@orcl_11g> CREATE TABLE Patient_Identifier (
  2  Identifier_DBKey NUMBER(10),
  3  Identifier_Value VARCHAR2(128)
  4  )
  5  /

Table created.

SCOTT@orcl_11g> CREATE TABLE Patient_Name (
  2  Name_DBKey NUMBER(10),
  3  Identifier_DBKey NUMBER(10),
  4  First_Name VARCHAR2(64),
  5  Last_Name VARCHAR2(64),
  6  Name_Type VARCHAR2(64)
  7  )
  8  /

Table created.

SCOTT@orcl_11g> CREATE SEQUENCE ID_DBKEY_SEQ
  2   START WITH     101
  3   INCREMENT BY   1
  4   NOCACHE
  5   NOCYCLE
  6  /

Sequence created.

SCOTT@orcl_11g> insert all
  2  into patient_identifier (identifier_dbkey, identifier_value)
  3    values (id_dbkey_seq.NEXTVAL, identifier_value)
  4  into patient_name (name_dbkey, identifier_dbkey, first_name, last_name, name_type)
  5    values (id_dbkey_seq.CURRVAL - 100, id_dbkey_seq.CURRVAL, first_name, last_name, name_type)
  6  select * from pat_ext
  7  /

6 rows created.

SCOTT@orcl_11g> select * from patient_identifier
  2  /

IDENTIFIER_DBKEY IDENTIFIER_VALUE
---------------- ----------------
             101 1001
             102 1002
             103 1003

SCOTT@orcl_11g> select * from patient_name
  2  /

NAME_DBKEY IDENTIFIER_DBKEY FIRST_NAME LAST_NAME  NAME_TYPE
---------- ---------------- ---------- ---------- ----------
         1              101 John       Doe        Legal
         2              102 Maria      Armstrong  Legal
         3              103 Mark       Joiner     Legal

SCOTT@orcl_11g>

[Updated on: Tue, 08 July 2008 19:12]

Report message to a moderator

Re: Loading multiple tables from delimited file with SQL*Loader [message #332741 is a reply to message #332532] Wed, 09 July 2008 09:27 Go to previous messageGo to next message
renjiths
Messages: 8
Registered: July 2008
Location: Bangalore
Junior Member
Thanks a lot, it works great.

I have couple of additional questions, though -
1. How does the two solutions - SQL*Loader and External tables compare in terms of performance? BTW, we have to load about 3 million patient records (there are a couple of more similar tables involved) in a time window of about 2 hours.

2. I am curious as to how this can be done in SQL*Loader. The behavior I have seen so far is that it processes each table in the INTO TABLE clause separately and how the SEQUENCE number values be passed between tables for the same patient row insert.

Cheers,
Renjith
Re: Loading multiple tables from delimited file with SQL*Loader [message #332749 is a reply to message #332741] Wed, 09 July 2008 10:16 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9092
Registered: November 2002
Location: California, USA
Senior Member
If you want to use the same sort of sequence, then you would need to use the "rows=1" option in SQL*Loader, which causes it to process one row for both tables before going on to the next row, which is slow. It is faster if you use a SQL*Loader sequence as demonstrated below. Which method is best for you depends on a variety of things. If you need to load from the client, not the server, then you need SQL*Loader. If a SQL*Loader sequence is not acceptable, because you need to continue with an existing sequence, then an external table with a multi-table insert is better. There are several features that each has and the other doesn't. In general, if you can use an external table, it is the better way to go. They would not have invented it if it wasn't better. Bear in mind that external tables are really a nice interface to SQL*Loader behind the scenes. If you use SQL*Loader, then change your SQL*Loader command line syntax to use the "external_table=generate_only", then check your log file, you will get the syntax for an external table that does the same thing, so you can see how similar they are and how one is based on the other. As far as which is faster or better, here is what expert Tom Kyte has to say:

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:6611962171229

-- test.ctl:
LOAD DATA
INFILE batchfeed.csv
INTO TABLE patient_identifier
FIELDS TERMINATED BY ','
TRAILING NULLCOLS
(Identifier_Value,
Identifier_DBKey SEQUENCE (101, 1))
INTO TABLE patient_name
FIELDS TERMINATED BY ','
TRAILING NULLCOLS
(identifier_value FILLER POSITION (1),
First_Name,
Last_Name,
Name_Type,
identifier_dbkey SEQUENCE (101, 1),
name_dbkey ":identifier_dbkey - 100")


SCOTT@orcl_11g> CREATE TABLE Patient_Identifier (
  2  Identifier_DBKey NUMBER(10),
  3  Identifier_Value VARCHAR2(128)
  4  )
  5  /

Table created.

SCOTT@orcl_11g> CREATE TABLE Patient_Name (
  2  Name_DBKey NUMBER(10),
  3  Identifier_DBKey NUMBER(10),
  4  First_Name VARCHAR2(64),
  5  Last_Name VARCHAR2(64),
  6  Name_Type VARCHAR2(64)
  7  )
  8  /

Table created.

SCOTT@orcl_11g> HOST SQLLDR scott/tiger CONTROL=test.ctl LOG=test.log

SCOTT@orcl_11g> column identifier_value format a16
SCOTT@orcl_11g> column first_name format a10
SCOTT@orcl_11g> column last_name format a10
SCOTT@orcl_11g> column name_type format a10
SCOTT@orcl_11g> select * from patient_identifier
  2  /

IDENTIFIER_DBKEY IDENTIFIER_VALUE
---------------- ----------------
             101 1001
             102 1002
             103 1003

SCOTT@orcl_11g> select * from patient_name
  2  /

NAME_DBKEY IDENTIFIER_DBKEY FIRST_NAME LAST_NAME  NAME_TYPE
---------- ---------------- ---------- ---------- ----------
         1              101 John       Doe        Legal
         2              102 Maria      Armstrong  Legal
         3              103 Mark       Joiner     Legal

SCOTT@orcl_11g> 

Re: Loading multiple tables from delimited file with SQL*Loader [message #333067 is a reply to message #332749] Thu, 10 July 2008 07:24 Go to previous messageGo to next message
renjiths
Messages: 8
Registered: July 2008
Location: Bangalore
Junior Member
Thanks Barbara for the insightful reply. Really appreciate it.

regards,
Renjith
Re: Loading multiple tables from delimited file with SQL*Loader [message #333865 is a reply to message #333067] Mon, 14 July 2008 12:23 Go to previous messageGo to next message
renjiths
Messages: 8
Registered: July 2008
Location: Bangalore
Junior Member
Hi,

I am now facing some issue with bulk upload of data using External tables. While the script works fine for small number of records (upto 200), it gives a foriegn key violation error for more records. Please note that all our tables have DBKeys (primary keys, foriegn keys) generated as SEQUENCE numbers. I am suspecting a parallel load kicked off by Oracle messing up the order of insert of the tables and thus causing the referential integrity issue. Is there a way of turning off such a behavior? Please help.

regards,
Renjith
Re: Loading multiple tables from delimited file with SQL*Loader [message #333866 is a reply to message #331930] Mon, 14 July 2008 12:34 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
http://www.orafaq.com/forum/t/88153/0/
Please read & follow posting guidelines as stated in URL above

>Is there a way of turning off such a behavior?
yes, but there might be better ways to eliminate these errors if we really knew EXACTLY what you were doing.
Re: Loading multiple tables from delimited file with SQL*Loader [message #333883 is a reply to message #333865] Mon, 14 July 2008 14:00 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9092
Registered: November 2002
Location: California, USA
Senior Member
The problem is not caused by the external table. It is caused by a known bug that I had forgotten about with multi-table inserts and sequences. A workaround is to used deferrable foreign key constraints. I have provided a reproduction followed by a workaround below.

-- reproduction of bug:
SCOTT@orcl_11g> CREATE TABLE Patient_Identifier (
  2  Identifier_DBKey NUMBER(10),
  3  Identifier_Value VARCHAR2(128),
  4  CONSTRAINT pi_identifier_dbkey_pk
  5  PRIMARY KEY (identifier_dbkey))
  6  /

Table created.

SCOTT@orcl_11g> CREATE TABLE Patient_Name (
  2  Name_DBKey NUMBER(10),
  3  Identifier_DBKey NUMBER(10),
  4  First_Name VARCHAR2(64),
  5  Last_Name VARCHAR2(64),
  6  Name_Type VARCHAR2(64),
  7  CONSTRAINT identifier_dbkey_fk
  8  FOREIGN KEY (identifier_dbkey)
  9  REFERENCES patient_identifier (identifier_dbkey)
 10  )
 11  /

Table created.

SCOTT@orcl_11g> CREATE SEQUENCE ID_DBKEY_SEQ
  2   START WITH     101
  3   INCREMENT BY   1
  4   NOCACHE
  5   NOCYCLE
  6  /

Sequence created.

SCOTT@orcl_11g> insert all
  2  into patient_identifier (identifier_dbkey, identifier_value)
  3    values (id_dbkey_seq.NEXTVAL, identifier_value)
  4  into patient_name (name_dbkey, identifier_dbkey, first_name, last_name, name_type)
  5    values (id_dbkey_seq.CURRVAL - 100, id_dbkey_seq.CURRVAL, first_name, last_name, name_type)
  6  select distinct object_id as identifier_value,
  7  	    object_name as first_name,
  8  	    object_name as last_name,
  9  	    object_type as name_type
 10  from   user_objects
 11  /
insert all
*
ERROR at line 1:
ORA-02291: integrity constraint (SCOTT.IDENTIFIER_DBKEY_FK) violated - parent
key not found


-- workaround:
SCOTT@orcl_11g> ALTER TABLE patient_name
  2  DROP CONSTRAINT identifier_dbkey_fk
  3  /

Table altered.

SCOTT@orcl_11g> ALTER TABLE patient_name
  2  ADD CONSTRAINT pn_identifier_dbkey_fk
  3  FOREIGN KEY (identifier_dbkey)
  4  REFERENCES patient_identifier (identifier_dbkey)
  5  DEFERRABLE INITIALLY IMMEDIATE
  6  /

Table altered.

SCOTT@orcl_11g> SET CONSTRAINT pn_identifier_dbkey_fk DEFERRED
  2  /

Constraint set.

SCOTT@orcl_11g> insert all
  2  into patient_identifier (identifier_dbkey, identifier_value)
  3    values (id_dbkey_seq.NEXTVAL, identifier_value)
  4  into patient_name (name_dbkey, identifier_dbkey, first_name, last_name, name_type)
  5    values (id_dbkey_seq.CURRVAL - 100, id_dbkey_seq.CURRVAL, first_name, last_name, name_type)
  6  select distinct object_id as identifier_value,
  7  	    object_name as first_name,
  8  	    object_name as last_name,
  9  	    object_type as name_type
 10  from   user_objects
 11  /

1172 rows created.

SCOTT@orcl_11g> SET CONSTRAINT pn_identifier_dbkey_fk IMMEDIATE
  2  /

Constraint set.

SCOTT@orcl_11g> select count(*) from patient_identifier
  2  /

  COUNT(*)
----------
       586

SCOTT@orcl_11g> select count(*) from patient_name
  2  /

  COUNT(*)
----------
       586

SCOTT@orcl_11g> 


Re: Loading multiple tables from delimited file with SQL*Loader [message #334122 is a reply to message #333883] Tue, 15 July 2008 08:36 Go to previous message
renjiths
Messages: 8
Registered: July 2008
Location: Bangalore
Junior Member
Hi Barbara,

Thanks for rescuing me again. Making the foreign key constraints deferrable made the script work. But for some instances of the tests, I was getting primary key constraint violation also. So I guess it is advisable to make all the primary key constraints (involving SEQUENCE number columns) also to be deferrable.

best regards,
Renjith
Previous Topic: loader loading only the first character
Next Topic: Copying data from SQL Server to Oracle (merged)
Goto Forum:
  


Current Time: Sat May 11 14:19:02 CDT 2024