I had not used Oracle in a while, and when I last used it, I didn't know about Python. Python has become my scripting language of choice for about a year now. My current database at work is Oracle, so it was natural for me to investigate if I could use Python to communicate with the remote Oracle database. This post describes in a step-by-step manner what I needed to get Python to work with the Oracle database.
I run Python 2.4.1 on Fedora Core 4 Linux. The Oracle version is Oracle 10g. The Oracle database runs on a central machine, and I did not have any Oracle software installed on my development machine, and nor did I want to download and run a lightweight version of Oracle, such as Oracle 10g Express Edition to get at the software. Not that I am short of disk or resources on my local box, it just seemed kind of wasteful, and I did not want to have to learn how to administer an Oracle database before I could get Oracle access on my machine.
However, Python libraries for various databases generally follow the same strategy to connect to the database as a person would if using the default command line client. Unlike JDBC on Java, for instance, which provide an uniform interface regardless of whether you connect to MySQL or Oracle or PostgreSQL or Sybase. In this case too, unless you are set up to connect using SQL*Plus (the default client), you are pretty much out of luck connecting with any of the Python libraries for Oracle connectivity. I guess this makes sense - unless you were comfortable working with the database interactively, why would you bother to learn how to start scripting it?
Someone at work pointed me to the Oracle Instant Client. This is a small client that allows you to access a remote Oracle database through a set of shared libraries. The one major application that comes with it is SQL*Plus. I downloaded the 10.1 version, which was the latest at the time. It is packaged as a zip file which you need to unzip into a location of your choice. In order to get SQL*Plus to work, you need to set up your LD_LIBRARY_PATH to point to your Instant Client Installation directory, and your PATH to point to SQL*Plus. You can do this in your .bash_profile, like so:
1 2 3
IC_INSTALL_DIR=/opt/oracle/instantclient10_1 export PATH=$IC_INSTALL_DIR:$PATH export LD_LIBRARY_PATH=$IC_INSTALL_DIR:$LD_LIBRARY_PATH
This will allow your SQL*Plus prompt to come up, but you will not be able to login to the database. To login to the database, you need to configure the Instant Client libraries with a file called tnsnames.ora. You can configure the Instant Client libraries to look for it in a directory of your choice by also setting the TNS_ADMIN environment variable in your .bash_profile, like so:
And my /etc/tnsnames.ora looks like this (all names changed to protect the guilty):
1 2 3 4 5 6 7 8 9
remotedb = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = remotehost.mycompany.com)(PORT = 1234)) ) (CONNECT_DATA = (SERVICE_NAME = remotedb) ) )
So now I can connect to my remote database server remotehost.mycompany.com:1234/remotedb as scott/tiger (not my real user/password, by the way) using SQL*Plus like so:
1 2 3 4 5 6 7 8 9 10 11
$ sqlplus scott/tiger@remotedb SQL*Plus: Release 10.1.0.5.0 - Production on Sat Oct 7 11:34:04 2006 Copyright (c) 1982, 2005, Oracle. All rights reserved. Connected to: Oracle Database 10g Enterprise Edition Release 10.1.0.3.0 - Production With the Partitioning, OLAP and Data Mining options SQL>
Now that I could connect to the remote database from the command line, it was time to install the Oracle library for Python. There are least two Oracle libraries for Python, DCOracle from Zope Corporation, and cx_Oracle from CXTools. I tried cx_Oracle, and it works great. It is packaged as an RPM, but unfortunately there are no RPMs for my operating system Fedora Core 4. I tried the Fedora Core 5 RPMs, but that has a dependency on glibc 2.4 which I did not have, so I tried the RPM for Fedora Core 3 (cx_Oracle-4.1.2-1), which installed fine.
The only catch was that the cx_Oracle RPM wrote its shared object file to /usr/local/lib/python2.4/site-packages, which seems to be correct behavior, but I could not find it in the list of directories Python looks at for shared objects.
1 2 3 4 5 6 7 8 9
Python 2.4.1 (#1, May 16 2005, 15:19:29) [GCC 4.0.0 20050512 (Red Hat 4.0.0-5)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import sys >>> print sys.path ['', '/usr/lib/python24.zip', '/usr/lib/python2.4', '/usr/lib/python2.4/plat-linux2', '/usr/lib/python2.4/lib-tk', '/usr/lib/python2.4/lib-dynload', '/usr/lib/python2.4/site-packages', '/usr/lib/python2.4/site-packages/Numeric', '/usr/lib/python2.4/site-packages/gtk-2.0']
Since /usr/lib/python2.4/site-packages was in the sys.path, I just created a symbolic link for the cx_Oracle.so file so it would be visible there:
cd /usr/lib/python2.4/site-packages ln -s /usr/local/lib/python2.4/site-packages/cx_Oracle.so cx_Oracle.so
Finally, I wrote a little Python test script to check that I will be able to access the database through Python.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
#!/usr/bin/python # Test script to test connectivity. import cx_Oracle uid="scott" pwd="tiger" service="remotedb" db = cx_Oracle.connect(uid + "/" + pwd + "@" + service) cursor = db.cursor() cursor.execute("select * from my_test_table") rows = cursor.fetchall() print "#-records:", cursor.rowcount for i in range(0, cursor.rowcount): print rows[i], rows[i] cursor.close()
I was able to get the number of records, and the first and second columns of each row in my_test_table.
This is basically all that I did to be able to access the remote Oracle database from a Python script. Prior to this, people (at my company) had been wrapping SQL*Plus calls in Shell scripts or dropping down to PL/SQL to write stored procedures. Both these approaches are valid in the sense that they both get the work done. However, this approach requires us to make a concious decision to select the right tool for the problem at hand, and possibly have to rewrite our solution when the level of complexity crosses a certain threshhold. The neat thing about using a (Python) script is that we can use the same tool regardless of the complexity. Having worked with both shell scripts and PL/SQL in the past, I can confidently say that Python is easier to work with and more feature rich than either of these languages.