Saturday, October 07, 2006

Python scripting with Oracle

I had not used Oracle in a while, and when I last used it, I didn't know about Python. Python has become my scripting language of choice for about a year now. My current database at work is Oracle, so it was natural for me to investigate if I could use Python to communicate with the remote Oracle database. This post describes in a step-by-step manner what I needed to get Python to work with the Oracle database.

I run Python 2.4.1 on Fedora Core 4 Linux. The Oracle version is Oracle 10g. The Oracle database runs on a central machine, and I did not have any Oracle software installed on my development machine, and nor did I want to download and run a lightweight version of Oracle, such as Oracle 10g Express Edition to get at the software. Not that I am short of disk or resources on my local box, it just seemed kind of wasteful, and I did not want to have to learn how to administer an Oracle database before I could get Oracle access on my machine.

However, Python libraries for various databases generally follow the same strategy to connect to the database as a person would if using the default command line client. Unlike JDBC on Java, for instance, which provide an uniform interface regardless of whether you connect to MySQL or Oracle or PostgreSQL or Sybase. In this case too, unless you are set up to connect using SQL*Plus (the default client), you are pretty much out of luck connecting with any of the Python libraries for Oracle connectivity. I guess this makes sense - unless you were comfortable working with the database interactively, why would you bother to learn how to start scripting it?

Someone at work pointed me to the Oracle Instant Client. This is a small client that allows you to access a remote Oracle database through a set of shared libraries. The one major application that comes with it is SQL*Plus. I downloaded the 10.1 version, which was the latest at the time. It is packaged as a zip file which you need to unzip into a location of your choice. In order to get SQL*Plus to work, you need to set up your LD_LIBRARY_PATH to point to your Instant Client Installation directory, and your PATH to point to SQL*Plus. You can do this in your .bash_profile, like so:

1
2
3
IC_INSTALL_DIR=/opt/oracle/instantclient10_1
export PATH=$IC_INSTALL_DIR:$PATH
export LD_LIBRARY_PATH=$IC_INSTALL_DIR:$LD_LIBRARY_PATH

This will allow your SQL*Plus prompt to come up, but you will not be able to login to the database. To login to the database, you need to configure the Instant Client libraries with a file called tnsnames.ora. You can configure the Instant Client libraries to look for it in a directory of your choice by also setting the TNS_ADMIN environment variable in your .bash_profile, like so:

1
TNS_ADMIN=/etc

And my /etc/tnsnames.ora looks like this (all names changed to protect the guilty):

1
2
3
4
5
6
7
8
9
remotedb =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = remotehost.mycompany.com)(PORT = 1234))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = remotedb)
    )
  )

So now I can connect to my remote database server remotehost.mycompany.com:1234/remotedb as scott/tiger (not my real user/password, by the way) using SQL*Plus like so:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
$ sqlplus scott/tiger@remotedb

SQL*Plus: Release 10.1.0.5.0 - Production on Sat Oct 7 11:34:04 2006

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.1.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL>

Now that I could connect to the remote database from the command line, it was time to install the Oracle library for Python. There are least two Oracle libraries for Python, DCOracle from Zope Corporation, and cx_Oracle from CXTools. I tried cx_Oracle, and it works great. It is packaged as an RPM, but unfortunately there are no RPMs for my operating system Fedora Core 4. I tried the Fedora Core 5 RPMs, but that has a dependency on glibc 2.4 which I did not have, so I tried the RPM for Fedora Core 3 (cx_Oracle-4.1.2-1), which installed fine.

The only catch was that the cx_Oracle RPM wrote its shared object file to /usr/local/lib/python2.4/site-packages, which seems to be correct behavior, but I could not find it in the list of directories Python looks at for shared objects.

1
2
3
4
5
6
7
8
9
Python 2.4.1 (#1, May 16 2005, 15:19:29)
[GCC 4.0.0 20050512 (Red Hat 4.0.0-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> print sys.path
['', '/usr/lib/python24.zip', '/usr/lib/python2.4', '/usr/lib/python2.4/plat-linux2', 
'/usr/lib/python2.4/lib-tk', '/usr/lib/python2.4/lib-dynload', 
'/usr/lib/python2.4/site-packages', '/usr/lib/python2.4/site-packages/Numeric', 
'/usr/lib/python2.4/site-packages/gtk-2.0']

Since /usr/lib/python2.4/site-packages was in the sys.path, I just created a symbolic link for the cx_Oracle.so file so it would be visible there:

1
2
cd /usr/lib/python2.4/site-packages
ln -s /usr/local/lib/python2.4/site-packages/cx_Oracle.so cx_Oracle.so

Finally, I wrote a little Python test script to check that I will be able to access the database through Python.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
#!/usr/bin/python
# Test script to test connectivity.
import cx_Oracle

uid="scott"
pwd="tiger"
service="remotedb"
db = cx_Oracle.connect(uid + "/" + pwd + "@" + service)
cursor = db.cursor()
cursor.execute("select * from my_test_table")
rows = cursor.fetchall()
print "#-records:", cursor.rowcount
for i in range(0, cursor.rowcount):
  print rows[i][0], rows[i][1]
cursor.close()

I was able to get the number of records, and the first and second columns of each row in my_test_table.

This is basically all that I did to be able to access the remote Oracle database from a Python script. Prior to this, people (at my company) had been wrapping SQL*Plus calls in Shell scripts or dropping down to PL/SQL to write stored procedures. Both these approaches are valid in the sense that they both get the work done. However, this approach requires us to make a concious decision to select the right tool for the problem at hand, and possibly have to rewrite our solution when the level of complexity crosses a certain threshhold. The neat thing about using a (Python) script is that we can use the same tool regardless of the complexity. Having worked with both shell scripts and PL/SQL in the past, I can confidently say that Python is easier to work with and more feature rich than either of these languages.

24 comments (moderated to prevent spam):

Anonymous said...

For history support in SQL*Plus, you can work with SQL*Plus on a Unix box from within emacs. Open up emacs and get a shell (M-x shell) and then fire up sqlplus. You can move forward in the history with M-p and M-n. Another useful thing: using split screen (Ctl-x 2) and setting up one buffer with the sqlplus session you've started, and the other as scratch or saved query buffer (for example), you can edit your queries and then paste them whole into the sqlplus window.

Sujit Pal said...

Thanks, that was very helpful. I tried it and it works great, eliminates one of the reasons for me to use the Eclipse Database Explorer perspective (its kind of annoying to have to switch back and forth between the Java/J2EE and DB perspectives), so I can run sqlplus within emacs on a separate window.

Steve Jones said...

I've created pySQLPLUS, which is a SQLPlus wrapper and (mainly) adheres to the Python Database API Specification v2.0.

It has the advantage of working without any needed libraries and only requires SQLPlus client installed.

Works in Windows and *NIX.

Comments, Improvements?

http://code.google.com/p/pysqlplus

Sujit Pal said...

Hi Steve, this is good stuff, although speaking for myself, it would probably not be too useful. When I downloaded the instant client, I also downloaded SQLPlus, which works fine only with the instant client libs, as long as you set up tnsnames.ora.

It would be nice (at least for people who predominantly do Java, and probably have the Oracle JDBC driver somewhere in their classpath already), if there was one written with Jython using JDBC so there is no necessity to download instant client, just use the Oracle JDBC driver.

lysdexia said...

rlwrap adds emacs/vi style command editing (how many times did I hit the up arrow only to grind my teeth?) to SQL*Plus. It makes SQL*Plus much easier to live with. From the site:

rlwrap is a 'readline wrapper' that uses the GNU readline library to allow the editing of keyboard input for any other command. Input history is remembered across invocations, separately for each command; history completion and search work as in bash and completion word lists can be specified on the command line.

Look here.

Sujit Pal said...

Thanks for the link, should be less intensive than starting within emacs, I will try it out. Also, cool handle :-).

feWill said...

This post was very helpful to me in getting my machine setup to talk remotely to an Oracle database. Thanks for taking the time to post this.

I really appreciate it.

Sujit Pal said...

@feWill: you are welcome, glad it helped.

Anonymous said...

Great post!
I have installed oracle 11g in an xp laptop and changed the python script slightly:
uid="system"
pwd="sa"
service="localhost"
since I am using the system uid and I have a local db.
Besides that, the only other problem I had is that my test table had only one column, so I got the error:
IndexError: tuple index out of range. Once I realized why I changed the script: rint rows[i][0] #,rows[i][1]
Kudos,
Omar
omar.flores.i.t@gmail.com

Sujit Pal said...

Thanks Omar, both for the kind words and for pointing out the bugs and your solution.

Chris B said...

Great post Sujit. My Oracle is shaky and I am new to Python but I got this up and running in no time.

Only difference compared to your implementation was that I do not have permissions to install Python libraries on our environment so the cx_Oracle install step wasn't allowed to put cx_Oracle.so into the Python site packages directory. I used Graham Dumpleton's Python 'Virtualenv' to create my 'own' Python to get round this. A second detail - I also needed to export 'TNS_ADMIN' in my profile to get the tnsnames.ora to be picked up.
Just for info am running:
- Suse Linux x86_64
- Oracle 8.1.7 on a remote machine
- Oracle Instant Client 10.2.0.4 (I tried version 11 but it didn't support my ancient Oracle version)
- cx_Oracle built from source

Sujit Pal said...

Hi Chris, thanks for the comment, and I am glad the post helped you. I have actually started using Jython for this stuff lately, that way I can use JDBC and don't have to worry about having to install the sqlclient, and api differences between cx_oracle and the MySql python drivers. The only downside is that you have to update the classpath to contain the jdbc driver of your choice.

Krishnan R.S. said...

What a shame that Python does not have the equivalent of Oracle thin driver for JDBC. Requiring a oracle client side install is a total pain.
Is there a way to use the thin JDBC drivers from Python (not Jython) ?

Sujit Pal said...

Hi Krishnan, don't know for sure if Python has the equivalent of Oracle thin drivers. Although you may want to consider using Jython, it provides a Pythonic database interface over JDBC, so its kind of the best of both worlds.

Saranath said...

Thank you very much Mr. Sujith. You save my week end. Thanks a lot. :)

Sujit Pal said...

Hi Saranath, you are welcome.

Anonymous said...

Thanks a lot for an this clean doc.. It saved lots of time for me ..keep posting ..

Sujit Pal said...

Hi, thanks and you are welcome.

Robert said...

could you just add "/usr/local/lib/python2.4/site-packages" to PYTHONPATH ?

Sujit Pal said...

Hi Robert, yes, and this IMO is a better solution than symlinking. Thanks for the pointer!

ava said...

Hi I came across a problem, how to use "EXECUTE xx" with cx_oracle in python?

Sujit Pal said...

Hi ava, not really sure, haven't used cx_oracle recently, and never tried to run stored procedures when I /was/ using it - there are some execute* methods in the cx_oracle documentation pages, not sure but maybe they are helpful?

leancz said...

Many thanks for this excellent article. It has saved me a lot of time and I am up and running with Oracle queries from Python in one hour.

Sujit Pal said...

Thanks, leancz, glad it helped you.