How to read xml files from the OS, parse them, and load the data into Oracle

Want to read xml files from the OS, parse them, and load the data into Oracle, all from PL/SQL (and a small Java stored procedure 🙂 )? Well allrightythen!

* This will all be done from within the database.

* The OS directories will be referenced only by Oracle directory name, not actual directory name.

Disclaimer: This code is not all mine. I’ve collected snippets from several different places, and wrote a bunch of my own as well. The goal was to take it all and put it into one post so that you can go from start to finish in one place.

We’ll be doing the following in this post

1) List the source directory
2) Loop through the file names
3) Parse the XML files
4) Insert the data (not the XML file) into a table
5) Move the file to a destination directory

Source directory: /tmp/xml_inbound
Destination directory: /tmp/xml_archive

Example XML file

<?xml version="1.0" encoding="UTF-8"?>
      <ID>Batch 1</ID>
               <Breed>American Bulldog</Breed>

Let’s do a little OS level setup, as the “oracle” user

# mkdir /tmp/xml_inbound
# mkdir /tmp/xml_archive

Copy the above XML into a file called /tmp/xml_inbound/<anything>.xml 

And database level – Note that I’m granting privileges to “ADAM”.
That might not be you..

sqlplus "/ as sysdba"

11:08:35 DEMO (SYS) > Create directory xml_inbound as '/tmp/xml_inbound';
11:08:43 DEMO (SYS) > Create directory xml_archive as '/tmp/xml_archive';
11:08:51 DEMO (SYS) > Grant read,write on directory xml_inbound to ADAM;
11:08:58 DEMO (SYS) > Grant read,write on directory xml_archive to ADAM;
-- This is for the Java stored procedure we'll create below. 
-- It's only used for reading the directory contents.
11:09:03 DEMO (SYS) > Exec dbms_java.grant_permission( 'ADAM','','/tmp/xml_inbound', 'read');

First, we need to be able to get a directory listing from the OS. We’ll use Java for this. One procedure will populate the DIR_LIST global temporary table with the file names. The table will be cleared when we do a commit. The other two procedures are for bringing the data in as a variable, and renaming (moving) the OS file.

As your non-SYS database user from now on.. For me it’s “ADAM”

Create global temporary table DIR_LIST
   filename varchar2(255)
) on commit delete rows;

The below code will create the following procedures

dir_list                Get a listing of files from a directory
get_xml_file_from_os    Load an XML file into a variable of XMLType
move_os_file            Used to move files from inbound to archive

Create or replace
and compile java source named "DirList"
import java.sql.*;

public class DirList
   public static void getList(String oracle_directory)
                   throws SQLException
      String os_directory;
      #sql {Select directory_path into :os_directory from all_directories where directory_name = upper(:oracle_directory)};

      File path = new File( os_directory );
      String[] list = path.list();
      String element;

      for(int i = 0; i < list.length; i++)
         element = list[i];
         #sql { INSERT INTO DIR_LIST (FILENAME) VALUES (:element) };


Create or replace
procedure get_dir_list( oracle_directory in varchar2 )
   as language java
   name 'DirList.getList( java.lang.String )';

Create or replace procedure move_os_file
   source_directory     in varchar2,
   source_file          in varchar2,
   target_directory     in varchar2,
   target_file          in varchar2


   srcdir               varchar2(255) := upper(source_directory);
   tgtdir               varchar2(255) := upper(target_directory);


   -- NOTE: If you're getting the all-too-familiar
   -- ORA-29292: file rename operation failed
   -- and you're SURE that your directory names are correct,
   -- and you're SURE that your privileges are correct, both at the 
   -- OS level, and within the database, there's one last thing that
   -- can get you. I learned the hard way that this command will NOT
   -- work successfully renaming a file from one filesystem to another,
   -- at least when those filesystems are NFS mounted. That is all. 

end move_os_file;

Create or replace procedure get_xml_file_from_os 
   oracle_directory     in varchar2, 
   filename             in varchar2 ,
   o_xml                out xmltype


    dest_clob           clob;
    src_clob            bfile;
    dst_offset          number := 1 ;
    src_offset          number := 1 ;
    lang_ctx            number := DBMS_LOB.DEFAULT_LANG_CTX;
    warning             number;
    os_directory        varchar2(255);
    ora_directory       varchar2(255);


   ora_directory := upper(oracle_directory);

   Select directory_path 
      into os_directory 
   from all_directories
   where directory_name = ora_directory;


   dbms_output.put_line('OS directory: '||os_directory);
   dbms_output.put_line('Oracle directory: '||ora_directory);
   dbms_output.put_line('File: '||filename);
   src_clob := bfilename(ora_directory, filename);

   dbms_output.put_line('Opening lob');,DBMS_LOB.LOB_READONLY);

   dbms_output.put_line('Loading lob');
      dest_lob     => dest_clob,
      src_bfile    => src_clob,
      amount       => DBMS_LOB.GETLENGTH(src_clob),
      dest_offset  => dst_offset,
      src_offset   => src_offset,
      bfile_csid   => DBMS_LOB.DEFAULT_CSID,
      lang_context => lang_ctx,
      warning      => warning


   o_xml := xmltype(dest_clob);

end get_xml_file_from_os;

Now that we can read files from the OS, we need to parse them and insert the data. Let’s create a couple of tables to support this.

Create table ANIMALS
   animal_id   number,
   type        varchar2(20)

Create table BREEDS
   animal_id    number,
   breed        varchar2(255)

And finally, a stored procedure to pull it all together. Here’s where we’ll do the XML parsing and inserting into the database.

Create or replace procedure process_xml


   v_XML                                XMLType;
   v_rowcount                           number;

   -- Header variables
   v_BatchID                            varchar2(20);
   v_AnimalCount                        number;
   v_BreedCount                         number;

   -- Animal variables
   v_AnimalID                           ANIMALS.ANIMAL_ID%TYPE;
   v_AnimalType                         ANIMALS.TYPE%TYPE;

   -- Breed variables
   v_BreedID                            BREEDS.ANIMAL_ID%TYPE;
   v_Breed                              BREEDS.BREED%TYPE;


   dbms_output.put_line('Getting directory listing');
   -- The records in the DIR_LIST table will be removed on commit!
   for i in (select filename from dir_list where upper(filename) like '%.XML')
      v_rowcount := 0;
      dbms_output.put_line('Filename: '||i.filename);
      dbms_output.put_line('File loaded');

      -- Extract the Header
      dbms_output.put_line('Getting Batch ID');
      v_BatchID         := v_XML.extract('/Data/Header/ID/text()').getStringVal();
      dbms_output.put_line('Getting Animal Count');
      v_AnimalCount     := v_XML.extract('/Data/Header/AnimalCount/text()').getNumberVal();
      v_BreedCount      := v_XML.extract('/Data/Header/BreedCount/text()').getNumberVal();
      dbms_output.put_line('Batch ID: '||v_BatchID);
      dbms_output.put_line('Animal Count: '||v_AnimalCount);
      dbms_output.put_line('Breed Count: '||v_BreedCount);

      -- Loop through the animals
      v_AnimalCount := 1;
      While v_XML.existsNode('/Data/Animals/Animal['||v_AnimalCount||']') = 1

      -- Loop through Animals
         v_AnimalID     :=  v_XML.extract('/Data/Animals/Animal['||v_AnimalCount||']/ID/text()').getNumberVal();
         v_AnimalType   :=  v_XML.extract('/Data/Animals/Animal['||v_AnimalCount||']/Type/text()').getStringVal();
         dbms_output.put_line('AnimalID: '||v_AnimalID);
         dbms_output.put_line('AnimalType: '||v_AnimalType);

         Insert into ANIMALS values (v_AnimalID,v_AnimalType);

         -- Loop through breeds
         v_BreedCount := 1;
         While v_XML.existsNode('/Data/Animals/Animal['||v_AnimalCount||']/BreedList/Breed['||v_BreedCount||']') = 1
            v_Breed     :=  v_XML.extract('/Data/Animals/Animal['||v_AnimalCount||']/BreedList/Breed['||v_BreedCount||']/text()').getStringVal();
            dbms_output.put_line('Breed: '||v_Breed);

            Insert into BREEDS values (v_AnimalID,v_Breed);

            v_BreedCount := v_BreedCount + 1;
         end loop;

         v_BreedCount := 1;
         v_AnimalCount := v_AnimalCount + 1;

      end loop;

      v_AnimalCount := 1;

      dbms_output.put_line('Moving file '||i.filename);

   end loop;


End process_xml;

And that’s it! Now, let’s execute it and see what happens 🙂
Note: My xml file is called test1.xml

10:46:41 DEMO (ADAM) > Exec process_xml;
Getting directory listing
Filename: test1.xml
OS directory: /tmp/xml_inbound
Oracle directory: XML_INBOUND
File: test1.xml
Opening lob
Loading lob
File loaded
Getting Batch ID
Getting Animal Count
Batch ID: Batch 1
Animal Count: 2
Breed Count: 6
AnimalID: 1
AnimalType: Dog
Breed: Bullmastiff
Breed: Dalmation
Breed: American Bulldog
AnimalID: 2
AnimalType: Cat
Breed: Tabby
Breed: Manx
Breed: Savannah
Moving file test1.xml

Is the data there?

10:49:03 RMAN (ADAM) > Select count(*) from animals;

10:49:11 RMAN (ADAM) > Select count(*) from breeds;


And if you look at /tmp/xml_archive..

[oracle@mastiff /home/oracle] [RMAN]$ ls /tmp/xml_archive

The file has been moved! That’s all folks!