Write Less with More – Part 1 (SQL*Loader Express Mode)

Introduction

For some time now I have been presenting a lecture that I call in short “Write Less with More”, or in its full name “Write Less (code) with More (Oracle 12c New Features)”. The idea is to focus on some of the evolutionary – rather than revolutionary – features that were added in version 12c of Oracle. Features that  allow us developers to write less than in previous versions in order to achieve the same functionality.

Instead of starting by presenting the features and then giving some arbitrary examples, I prefer to start with a list of common development tasks, and to suggest for each task a pre-12c solution and a solution that uses a new 12c feature making it possible to write less. I like showing problems and how we can solve them, rather than showing features that look for problems to solve.

I’m starting today a series of blog posts that is based on the “Write Less with More” presentation, and follows its structure. In each post I’ll discuss one new Oracle 12c feature that solves some development task. Each such feature falls under one of the following categories:

  • Write Less Configuration
  • Write Less Application Code
  • Write Less Code in SQL Statements
  • Write Less “Inappropriately-Located” Code

All the Series Parts

Published

Part 1 – write less configuration with SQL*Loader Express Mode
Part 2 – write less application code with Identity Columns
Part 3 – write less application code with In-Database Archiving
Part 4 – write less application code with Temporal Validity
Part 5 – write less code in SQL with Row Limiting
Part 6 – write less “inappropriately located” code with SELECT FROM Package-Level Collection Types
Part 7 – write less code in SQL with Lateral Inline Views
Part 8 – write less “inappropriately located” code with PL/SQL in the WITH Clause

Setup

All the examples in the series are based on this data model:
ERD
We have three tables: PEOPLE, PROJECTS and a join table between them – PROJECT_ASSIGNMENTS. At this point PROJECT_ASSIGNMENTS is very simple – it only contains information about which people are assigned to which projects, not when or for how long. It may also contain duplicate rows for the same person+project combination (with different assignment_id though), which seems quite pointless. We will enhance this naïve design as we progress with the series.

Here is the script that creates the tables and fills a few projects:

create table people (
       person_id integer not null constraint people_pk primary key,
       first_name varchar2(20) not null,
       last_name varchar2(30) not null,
       general_info varchar2(100)
);

create table projects (
       project_id integer not null constraint projects_pk primary key,
       project_name varchar2(100) not null,
       status number(1) not null,
       last_days_to_show_in_reports integer not null
);

create table project_assignments (
       assignment_id integer not null constraint project_assignments_pk primary key,
       person_id integer not null constraint assignments_fk_people references people,
       project_id integer not null constraint assignments_fk_projects references projects
);

insert into projects values (1,'Project A',1,2);
insert into projects values (2,'Project B',2,3);
insert into projects values (3,'Project C',1,4);
insert into projects values (4,'Project D',2,3);

commit;

And this is our “To Do” list:
todo_full

So now we are ready to start with the first task.

Task #1todo1

The PEOPLE table is still empty. We have a CSV file – C:\Data\people.dat – that contains data about people and we want to load this data from the file into the table. The file contains one line for each person, the fields are separated by commas, and the order of the fields in the file is the same as the order of the columns in the table. Here is the content of the file:

101,John,Lennon,
102,Paul,McCartney,18/6/1942
103,Ringo,Starr,Drums
104,George,Harisson,
201,Louis,Armstrong,Jazz
202,Ella,Fitzgerald,15/6/1996
203,Etta,James,20/1/2012
317,Julie,Andrews,

You can see that each line contains details about one person, corresponding to the PEOPLE table’s structure: ID, first name, last name, and some (un-modeled, ugh!) general details.

A Pre-12c Solution

Oracle supports loading data from external files for decades now. There is the good old SQL*Loader utility, and since Oracle9i there is also the ability to use External Tables. Both are very powerful, but require quite a complex configuration even for very simple files. For SQL*Loader we need to write a control file with all the configuration. For External Tables we need something similar to the control file – this time as part of the CREATE TABLE statement – in addition to a DIRECTORY object and an INSERT statement that copies the data from the external table to the “real” table.

A 12c Solution

SQL*Loader Express Mode

In Oracle 12c SQL*Loader can be executed in a new mode – the Express Mode. In this mode we do not use a control file, and many defaults are used implicitly. We can override many of these defaults by specifying explicit parameters in the command-line level.

If you want to learn about many more new features of Oracle 12c, you may be interested in one of my in-house training sessions, particularly the Oracle 12c New Features for Developers full-day seminar.

Let’s Load

The only parameter that is mandatory in this mode is the TABLE parameter. In our case, all we need to do is execute SQL*Loader from C:\Data (where the text file is located) and specify TABLE=people.

First, let’s see that the table is really empty.

C:\Data>sqlplus demo5/demo5
> select * from people;

no rows selected

> exit

Now, let’s execute SQL*Loader:

C:\Data>sqlldr demo5/demo5 table=people

SQL*Loader: Release 12.1.0.2.0 - Production on Sat Aug 15 15:58:48 2015

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Express Mode Load, Table: PEOPLE
Path used:      External Table, DEGREE_OF_PARALLELISM=AUTO

Table PEOPLE:
  8 Rows successfully loaded.

Check the log files:
  people.log
  people_%p.log_xt
for more information about the load.

C:\Data>

And we can see now that the table contains the data from the file:

C:\Data>sqlplus demo5/demo5
> select * from people;

PERSON_ID  FIRST_NAME LAST_NAME       GENERAL_INFO
---------- ---------- --------------- --------------------
       101 John       Lennon
       102 Paul       McCartney       18/6/1942
       103 Ringo      Starr           Drums
       104 George     Harisson
       201 Louis      Armstrong       Jazz
       202 Ella       Fitzgerald      15/6/1996
       203 Etta       James           20/1/2012
       317 Julie      Andrews
 
8 rows selected.

> exit

C:\Data>

We specified the table name only. All the rest of the configuration was derived implicitly:

  • The default file name is <tableName>.dat (in our case, people.dat)
  • The default file location is the directory from which we execute sqlldr (in our case, C:\Data)
  • The default record delimiter is newline
  • The default field delimiter is comma
  • By default, the fields have no enclosures
  • Etc.

Let’s take a look at the output we got on the screen.

First of all, it says “Express Mode Load”. Ok, so it really worked in express mode.

Multiple Loading Mechanisms

A more interesting thing to note is “Path used:      External Table”. SQL*Loader actually loaded the data using its “descendant”, more advanced, feature – External Tables. By default, SQL*Loader Express Mode performs the load using External Tables if possible, and, if it cannot use External Tables, it falls back to a direct path load using its “legacy” way.

Beyond Loading

The last lines of the on-screen output  lead us to more output, which may be very valuable:

Check the log files:
  people.log
  people_%p.log_xt
for more information about the load.

SQL*Loader Express Mode not only loads the data, but also prepares for us some items we can re-use later. The log file – people.log – includes, in addition to the standard logging stuff:

A SQL*Loader control file

Generated control file for possible reuse:
OPTIONS(EXTERNAL_TABLE=EXECUTE, TRIM=LRTRIM)
LOAD DATA
INFILE 'people'
APPEND
INTO TABLE PEOPLE
FIELDS TERMINATED BY ","
(
  PERSON_ID,
  FIRST_NAME,
  LAST_NAME,
  GENERAL_INFO
)
End of generated control file for possible reuse.

A statement for creating the external table

CREATE TABLE "SYS_SQLLDR_X_EXT_PEOPLE" 
(
  "PERSON_ID" NUMBER(38),
  "FIRST_NAME" VARCHAR2(20),
  "LAST_NAME" VARCHAR2(30),
  "GENERAL_INFO" VARCHAR2(100)
)
ORGANIZATION external 
(
  TYPE oracle_loader
  DEFAULT DIRECTORY SYS_SQLLDR_XT_TMPDIR_00010
  ACCESS PARAMETERS 
  (
    RECORDS DELIMITED BY NEWLINE CHARACTERSET UTF8
    BADFILE 'SYS_SQLLDR_XT_TMPDIR_00010':'people_%p.bad'
    LOGFILE 'people_%p.log_xt'
    READSIZE 1048576
    FIELDS TERMINATED BY "," LRTRIM 
    REJECT ROWS WITH ALL NULL FIELDS 
    (
      "PERSON_ID" CHAR(255),
      "FIRST_NAME" CHAR(255),
      "LAST_NAME" CHAR(255),
      "GENERAL_INFO" CHAR(255)
   )
  )
  location 
  (
    'people.dat'
  )
)REJECT LIMIT UNLIMITED

An INSERT statement for filling the “real” table from the external table

INSERT /*+ append parallel(auto) */ INTO PEOPLE 
(
  PERSON_ID,
  FIRST_NAME,
  LAST_NAME,
  GENERAL_INFO
)
SELECT 
  "PERSON_ID",
  "FIRST_NAME",
  "LAST_NAME",
  "GENERAL_INFO"
FROM "SYS_SQLLDR_X_EXT_PEOPLE"

Even if we need to do some complex load which is not supported by the Express Mode, we can still use this generated code as a starting point instead of writing everything from scratch.

Command-Line Level Configuration

SQL*Loader Express Mode allows us to override many attributes of the default configuration using optional command-line parameters.

For example, let’s say we want to load another file, which has the following differences with respect to the original people.dat file:

  • The file name is my_people.txt
  • The field delimiter is semicolon
  • The fields may contain semicolons in their content, and are enclosed by double quotes in these cases

We can still use the Express Mode, like this:

C:\Data>sqlldr demo5/demo5 table=people data=my_people.txt terminated_by=';' optionally_enclosed_by='\"'

You can find the complete list of supported parameters in the SQL*Loader Express Mode Documentation.

Loading into Older Versions

One last note: SQL*Loader Express Mode is a new feature of Oracle 12c, but we can use it also for loading data into tables in pre-12c databases (using an Oracle 12c client).

Conclusion

We saw in this part of the Write Less with More series that SQL*Loader Express Mode allows us to write less configuration.

The next post will be about another new Oracle 12c feature – Identity Columns.

Leave a Reply

Your email address will not be published. Required fields are marked *