Sunday, August 29, 2010

Python Web Application with Spring Python

I read the SpringPython Book about a month ago and liked it, so about two weeks ago I started building a small web application with it to learn more. I have never used Python to build anything larger than a script, so this was a first for me. One reason I had not tried to build a Python webapp before is that we are a Java shop, so maintainability becomes an issue. In this instance, however, I think I have a use case that feels more natural to solve with Python than with Java.

The use case is exposing expensive computing resources to users via a webapp. We have batch jobs that run for long periods and are fairly memory and processor intensive. Developing and running these jobs is restricted to a few people who are given access to large, fast machines (or clusters in some cases). The downside (for these supposedly lucky folks) is that they often have to run jobs (often at odd hours) for people who don't have access. The downside for the others is that a lot of good ideas die because of the limited scope for experimentation (having to ask other people to run stuff for you repeatedly can be a real initiative-killer).

The app that I built (and am about to describe) provides a simple web interface for people to submit jobs. A job is basically a Unix shell script (which would call a Java class), its input file(s), configuration parameters and output file. Jobs are queued up in a database. A server component calls the selected script and runs it on the user's behalf, reporting progress of the job via email.

Customizations

I started off generating the webapp component using SpringPython's coily tool, hoping to have most of the work done. Although the app so generated is functional, it is quite skeletal, and I ended up customizing it quite heavily. The coily generated app is something between a maven archetype and a Roo generated app. The files generated by coily are:

ScriptRunner.py Contains the main method to launch the webapp - its name is derived from the application name given to coily.
app_context.py Contains the objects in the application context, annotated as SpringPython @Object.
controller.py Empty File.
view.py Contains the CherryPy "controller" objects annotated by @cherrypy.expose.

What I ended up with was this set of files:

ScriptRunner.py Main method with option to switch between web client, server and cleaner mode.
app_context.py Contains a (2 level deep) hierarchy of application context objects. Each child application context is customized to the modes that ScriptRunner can run in (listed above).
server.py Contains the top level classes for running server and cleaner modes.
services.py Contains classes for various services. These services encapsulate functionality that is provided as service objects in the context and injected into the top level objects.
view.py Contains the top level class for running the web mode. Similar to the generated view.py, except it has more methods decorated with @cherrypy.expose, and I have moved non-HTML-rendering code (as far as possible) out into the service methods.

Multiple modes

The generated coily app only provides code for a web application, but I also needed code for the server portion (to consume and process the queue) and for cleanup (deleting old files and database entries, scheduled via cron). So I changed the ScriptRunner.py file to use options to run in different modes. Different modes run as different processes, i.e., the server is a completely separate process from the web client and needs to be started separately. Here is the usage output.

Options:
  -h, --help     show this help message and exit
  -w, --web      Run ScriptRunner web client
  -s, --server   Run ScriptRunner server
  -c, --cleanup  Cleanup old files
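
A minimal sketch of how such a mode switch could be wired up with the standard library's optparse module, which prints help in the format shown above. The function names build_parser and parse_mode and the "mode" destination are assumptions made for illustration, not the actual ScriptRunner.py code.

```python
from optparse import OptionParser

def build_parser():
    """Build a parser exposing the three run modes (hypothetical helper)."""
    parser = OptionParser()
    # Each flag stores a constant into the same destination, so the
    # last flag given on the command line wins.
    parser.add_option("-w", "--web", action="store_const", const="web",
                      dest="mode", help="Run ScriptRunner web client")
    parser.add_option("-s", "--server", action="store_const", const="server",
                      dest="mode", help="Run ScriptRunner server")
    parser.add_option("-c", "--cleanup", action="store_const", const="cleanup",
                      dest="mode", help="Cleanup old files")
    return parser

def parse_mode(argv):
    """Return 'web', 'server' or 'cleanup', or None if no mode was given."""
    (options, _args) = build_parser().parse_args(argv)
    return options.mode
```

The caller would then pick the matching application context for the returned mode and start its root object; each mode is launched as its own process.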

Custom User Details Service

Coily's generated webapp comes with an In-Memory User Details Service, which is populated at startup from a dictionary. The model I wanted for my application was a self-service model: there is no concept of administrators or super-users. People sign up and log in with their email/password. Login is required to identify who to send the email notifications to. Because of this, I decided to populate the In-Memory User Details object with the contents of a database table, rather than go with the Database User Details object.

Embedded SQLite3 Database

I wanted this app to be standalone, i.e., with as few moving parts as possible, so I decided to embed a SQLite3 database inside the application. One problem with that, however, is that by default SQLite3 does not allow a database connection to be used from a thread other than the one that created it. Obviously, this doesn't work if a database connection is created by the SpringPython container and then used by a request thread. However, this check can be disabled by passing a flag during connection factory construction. As a temporary measure, I added this to the SpringPython code and reported it; SpringPython 1.2 will allow this.
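
To illustrate the thread check, here is a small sketch that uses only Python's standard sqlite3 module, where the flag is called check_same_thread. The table name t and the helper count_rows_in_worker are made up for the example; how SpringPython's connection factory passes the flag through is not shown here.

```python
import sqlite3
import threading

def count_rows_in_worker(conn):
    """Run a query on `conn` from a second thread and return the result."""
    result = []

    def worker():
        # Without check_same_thread=False this call raises
        # sqlite3.ProgrammingError, because `conn` was created elsewhere.
        result.append(conn.execute("select count(*) from t").fetchone()[0])

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result[0]

# Connection created in the main thread, as a container would do at startup.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("create table t (x integer)")
conn.execute("insert into t values (1)")
rows_seen = count_rows_in_worker(conn)
```

Disabling the check only removes the guard; if several threads can write at once, the application still has to serialize access to the shared connection itself.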

The DDL for the users and jobs tables is shown below. Currently, I don't have code to detect that the database file is missing and create these tables automatically, but I will probably put this in.

CREATE TABLE users (
  login VARCHAR(32) NOT NULL, 
  password VARCHAR(32) NOT NULL
);
CREATE TABLE jobs (
  id INTEGER PRIMARY KEY, 
  script_name VARCHAR(255) NOT NULL, 
  params VARCHAR(255), 
  requester VARCHAR(32) NOT NULL, 
  status VARCHAR(32) NOT NULL, 
  requestDttm INTEGER, 
  startDttm INTEGER, 
  stopDttm INTEGER
);
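
One way the automatic table creation could work is to run the same DDL with IF NOT EXISTS at startup. This is only a sketch using the standard sqlite3 module; the function name ensure_schema is hypothetical and this is not code from the application.

```python
import sqlite3

# Same DDL as above, made safe to run repeatedly.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
  login VARCHAR(32) NOT NULL,
  password VARCHAR(32) NOT NULL
);
CREATE TABLE IF NOT EXISTS jobs (
  id INTEGER PRIMARY KEY,
  script_name VARCHAR(255) NOT NULL,
  params VARCHAR(255),
  requester VARCHAR(32) NOT NULL,
  status VARCHAR(32) NOT NULL,
  requestDttm INTEGER,
  startDttm INTEGER,
  stopDttm INTEGER
);
"""

def ensure_schema(path_to_db):
    """Open (or create) the database file and create any missing tables."""
    conn = sqlite3.connect(path_to_db)
    conn.executescript(SCHEMA)
    conn.commit()
    return conn
```

Because sqlite3.connect creates the file if it does not exist, calling ensure_schema with the path_to_db value from the configuration would cover both a missing file and an empty one.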

Properties Place Holder

To address the maintainability concern mentioned above, I decided to move as many properties as I could out into a Java-style properties file. I then created a Properties Place Holder bean that reads this file on startup and loads it into a dictionary, and exposed the bean in the Application Context. Here is the properties file I used:

# Source: etc/ScriptRunner.conf
# Configuration parameters for ScriptRunner

# Port on which the ScriptRunner web client will listen on (default 8080)
web_port=8080

# Path to the embedded SQLite3 database for ScriptRunner
path_to_db=data/ScriptRunner.db

# User/Pass for this box (so you can copy input files to it)
scp_user=guest
scp_pass=secret

# Full path to the directory on this box where files will be scp'd
inbox_dir=/Users/sujit/Projects/ScriptRunner/data/inputs
# Full path to the directory on this box where output files will be dropped
outbox_dir=/Users/sujit/Projects/ScriptRunner/data/outputs
# Full path to the directory on this box where scripts are located
script_dir=/Users/sujit/Projects/ScriptRunner/data/scripts

# The comment prefix for script parameter annotations
script_doc_prefix=##~~
# For params with this key prefix, we display the contents of the
# inbox directory
input_param_prefix=input
output_param_prefix=output

# ScriptRunner's email address
email_address=scriptrunner@mycompany.com

# Poll interval (sec) if nothing on queue
poll_interval=10

Multiple Application Contexts

The generated coily app contained a single Config object. Since I was essentially building three related components, each with an overlapping but different set of objects, I ended up building 4 Config objects, like so: a common parent, plus one child config object for each of the ScriptRunner modes.

ScriptRunnerCommonConfig
  |
  +-- ScriptRunnerWebConfig
  |
  +-- ScriptRunnerServerConfig
  |
  +-- ScriptRunnerCleanerConfig

Services

The generated coily app contained an empty controller.py file. The view.py file seemed to me to be the closest (Spring-Java) analog to a controller. I guess since we use Python for the view layer, there is really no necessity for a separate controller layer. However, I noticed that the view.py became "less controller-like" if I refactored some of the code into a services layer, so I created a services.py file and deleted the controller.py file.

Script Annotations

In order for ScriptRunner to figure out what information needs to be passed into a job, either the scripts need to be self-contained and expose a uniform interface, or (the approach I have chosen) they describe their interface using a special annotation. Here is an example from one of my test scripts. The code parses parameter information from the lines with the "##~~" prefix (configurable via the conf file, see above).

#!/bin/bash
##~~ input - the input file
##~~ conf - the configuration file
##~~ output - the output file
...
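
To make the annotation format concrete, here is a small standalone sketch of the parsing step. The function name parse_annotations is hypothetical, and it works on a list of lines rather than a file so it can be tried in isolation.

```python
def parse_annotations(lines, prefix="##~~"):
    """Return (name, description) tuples for lines starting with prefix."""
    params = []
    for line in lines:
        if line.startswith(prefix):
            # Split on the first "-" only, so descriptions may contain dashes.
            name, _sep, desc = line[len(prefix):].partition("-")
            params.append((name.strip(), desc.strip()))
    return params

script = """#!/bin/bash
##~~ input - the input file
##~~ conf - the configuration file
##~~ output - the output file
echo "running"
"""
params = parse_annotations(script.splitlines())
```

For the script above, params holds three tuples, one each for input, conf and output, in the order they appear. The web form can then render one field per tuple.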

The Code

Given the rather verbose descriptions above, it's probably not too hard to read the code below without much explanation, so here it is:

app_context.py

Inline comments are provided where I thought it made sense. All methods that coily generated are marked as such. The ScriptRunnerCommonConfig contains all the references to ScriptRunner core classes.

# Source: app_context.py

import logging
import server
import services
import view
from springpython.config import Object
from springpython.config import PythonConfig
from springpython.database.factory import Sqlite3ConnectionFactory
from springpython.security.cherrypy3 import CP3FilterChainProxy
from springpython.security.cherrypy3 import CP3RedirectStrategy
from springpython.security.cherrypy3 import CP3SessionStrategy
from springpython.security.providers import AuthenticationManager
from springpython.security.providers.dao import DaoAuthenticationProvider
from springpython.security.userdetails import InMemoryUserDetailsService
from springpython.security.vote import AffirmativeBased
from springpython.security.vote import RoleVoter
from springpython.security.web import AuthenticationProcessingFilter
from springpython.security.web import AuthenticationProcessingFilterEntryPoint
from springpython.security.web import ExceptionTranslationFilter
from springpython.security.web import FilterSecurityInterceptor
from springpython.security.web import HttpSessionContextIntegrationFilter
from springpython.security.web import SimpleAccessDeniedHandler

class ScriptRunnerCommonConfig(PythonConfig):

  def __init__(self):
    super(ScriptRunnerCommonConfig, self).__init__()
    self.prop_dict = None

  @Object
  def propertiesPlaceHolder(self):
    # Parse the conf file once and cache the result in self.prop_dict
    if self.prop_dict is None:
      self.prop_dict = {}
      prop = open("etc/ScriptRunner.conf", 'r')
      for line in prop:
        line = line.strip()
        if line.startswith("#") or len(line) == 0:
          continue
        # split on the first "=" only, so values may contain "="
        (key, value) = line.split("=", 1)
        self.prop_dict[key.strip()] = value.strip()
      prop.close()
    return self.prop_dict

  @Object
  def dataSource(self):
    return Sqlite3ConnectionFactory(
      self.propertiesPlaceHolder()["path_to_db"])

  @Object
  def loggingService(self):
    return services.ScriptRunnerLoggingService(logging.DEBUG)
  
  @Object
  def queueService(self):
    queueService = services.ScriptRunnerQueueService(self.dataSource())
    queueService.logger = self.loggingService()
    return queueService

  @Object
  def fileService(self):
    return services.ScriptRunnerFileService(self.propertiesPlaceHolder())


class ScriptRunnerWebConfig(ScriptRunnerCommonConfig):

  @Object
  def root(self):
    """This is the main object defined for the web application."""
    form = view.ScriptRunnerView()
    form.filter = self.authenticationProcessingFilter()
    form.userDetailsService = self.userDetailsService()
    form.authenticationManager = self.authenticationManager()
    form.redirectStrategy = self.redirectStrategy()
    form.httpContextFilter = self.httpSessionContextIntegrationFilter()
    form.dataSource = self.dataSource()
    form.propertiesPlaceHolder = self.propertiesPlaceHolder()
    form.queueService = self.queueService()
    form.fileService = self.fileService()
    form.logger = self.loggingService()
    return form

  @Object
  def userDetailsService(self):
    """
      @Override: modified to read (login,password) data from an
      embedded SQLite3 database and populate the data structure of
      its parent, an InMemoryUserDetails object.
    """
    parentUserDetailsService = InMemoryUserDetailsService()
    userDetailsService = services.ScriptRunnerUserDetailsService(
      parentUserDetailsService, self.dataSource())
    userDetailsService.loadUserdata()
    return userDetailsService

  @Object
  def authenticationProvider(self):
    """ Autogenerated by Coily """
    provider = DaoAuthenticationProvider()
    provider.user_details_service = self.userDetailsService()
    return provider

  @Object
  def authenticationManager(self):
    """ Autogenerated by Coily """
    authManager = AuthenticationManager()
    authManager.auth_providers = []
    authManager.auth_providers.append(self.authenticationProvider())
    return authManager

  @Object
  def accessDecisionManager(self):
    """ Autogenerated by Coily """
    adm = AffirmativeBased()
    adm.allow_if_all_abstain = False
    adm.access_decision_voters = []
    adm.access_decision_voters.append(RoleVoter())
    return adm

  @Object
  def cherrypySessionStrategy(self):
    """ Autogenerated by Coily """
    return CP3SessionStrategy()

  @Object
  def redirectStrategy(self):
    """ Autogenerated by Coily """
    return CP3RedirectStrategy()

  @Object
  def httpSessionContextIntegrationFilter(self):
    """ Autogenerated by Coily """
    filter = HttpSessionContextIntegrationFilter()
    filter.sessionStrategy = self.cherrypySessionStrategy()
    return filter

  @Object
  def authenticationProcessingFilter(self):
    """ Autogenerated by Coily """
    filter = AuthenticationProcessingFilter()
    filter.auth_manager = self.authenticationManager()
    filter.alwaysReauthenticate = False
    return filter

  @Object
  def filterSecurityInterceptor(self):
    """ Autogenerated by Coily """
    filter = FilterSecurityInterceptor()
    filter.auth_manager = self.authenticationManager()
    filter.access_decision_mgr = self.accessDecisionManager()
    filter.sessionStrategy = self.cherrypySessionStrategy()
    filter.obj_def_source = [
      ("/.*", ["ROLE_ANY"])
    ]
    return filter

  @Object
  def authenticationProcessingFilterEntryPoint(self):
    """ Autogenerated by Coily """
    filter = AuthenticationProcessingFilterEntryPoint()
    filter.loginFormUrl = "/login"
    filter.redirectStrategy = self.redirectStrategy()
    return filter

  @Object
  def accessDeniedHandler(self):
    """ Autogenerated by Coily """
    handler = SimpleAccessDeniedHandler()
    handler.errorPage = "/accessDenied"
    handler.redirectStrategy = self.redirectStrategy()
    return handler

  @Object
  def exceptionTranslationFilter(self):
    """ Autogenerated by Coily """
    filter = ExceptionTranslationFilter()
    filter.authenticationEntryPoint = \
      self.authenticationProcessingFilterEntryPoint()
    filter.accessDeniedHandler = self.accessDeniedHandler()
    return filter

  @Object
  def filterChainProxy(self):
    """ Autogenerated by Coily """
    # added /signup.* because that needs to be open too
    return CP3FilterChainProxy(filterInvocationDefinitionSource =
      [
        ("/images.*", []),
        ("/html.*",   []),
        ("/login.*",  ["httpSessionContextIntegrationFilter"]),
        ("/signup.*", ["httpSessionContextIntegrationFilter"]),
        ("/.*",       ["httpSessionContextIntegrationFilter",
                       "exceptionTranslationFilter",
                       "authenticationProcessingFilter",
                       "filterSecurityInterceptor"])
      ])


class ScriptRunnerServerConfig(ScriptRunnerCommonConfig):

  @Object
  def root(self):
    """ main server object """
    processor = server.ScriptRunnerProcessor()
    processor.propertiesPlaceHolder = self.propertiesPlaceHolder()
    processor.emailService = self.emailService()
    processor.queueService = self.queueService()
    processor.fileService = self.fileService()
    processor.logger = self.loggingService()
    return processor

  @Object
  def emailService(self):
    return services.ScriptRunnerEmailService(self.propertiesPlaceHolder())
  

class ScriptRunnerCleanerConfig(ScriptRunnerCommonConfig):

  @Object
  def root(self):
    """ main cleaner object """
    cleaner = server.ScriptRunnerCleaner()
    cleaner.propertiesPlaceHolder = self.propertiesPlaceHolder()
    cleaner.fileService = self.fileService()
    cleaner.queueService = self.queueService()
    cleaner.logger = self.loggingService()
    return cleaner

services.py

This is my attempt to refactor some of the view.py code into a services layer.

# Source: services.py

import logging
import os
import re
import smtplib
import time
from email.mime.text import MIMEText
from socket import gethostname
from springpython.context import DisposableObject
from springpython.database.core import DatabaseTemplate
from springpython.database.core import DictionaryRowMapper
from springpython.database.transaction import transactional
from springpython.security.context import SecurityContextHolder
from springpython.security.userdetails import UserDetailsService

class  ScriptRunnerUserDetailsService(UserDetailsService):
  """
    Custom UserDetailsService that wraps an InMemoryUserDetailsService
    (passed in as parent) to leverage its basic authentication
    functionality. The parent depends on a dictionary, which is populated
    on startup from an embedded SQLite3 database table.
  """

  def __init__(self, parent, dataSource):
    UserDetailsService.__init__(self)
    self.parent = parent
    self.dataSource = dataSource
    self.databaseTemplate = DatabaseTemplate(dataSource)
    self.emailRegex = re.compile("[a-zA-Z0-9\._]+\@[a-zA-Z0-9\.]+")
    self.loadUserdata()

  def load_user(self, login):
    """
      Delegates to the parent InMemoryUserDetails object to return a
      valid user from the dictionary.
    """
    user = self.parent.load_user(login)
    return user

  def loadUserdata(self):
    """
      Loads the users into the data structure for the InMemoryUserDetails
      object this UserDetails implementation is derived from, from the
      embedded SQLite3 database.
    """
    rows = self.databaseTemplate.query("select login, password from users",
      rowhandler=DictionaryRowMapper())
    self.parent.user_dict = {}
    for row in rows:
      login = row["login"]
      password = row["password"]
      self.parent.user_dict[login] = (password, ["ROLE_ANY"], True)

  def userDict(self):
    """
      Convenience method to return a handle to the dictionary of the
      parent InMemoryUserDetails object.
    """
    return self.parent.user_dict

  def addUserdata(self, login, password):
    """
      Validate if the user is not signing up with an existing login,
      and if not, add to the user_dict and database (so new logins
      persist across application restarts).
    """
    if login not in self.parent.user_dict:
      self.databaseTemplate.execute(
        "insert into users(login,password) values (?,?)", [login, password])
      self.dataSource.commit()
      self.parent.user_dict[login] = (password, ["ROLE_ANY"], True)

  def isValidEmail(self, login):
    """
      Convenience method to validate email addresses (logins) against
      a precompiled regex.
    """
    return self.emailRegex.match(login) is not None

  def isValidPassword(self, password):
    """
      Convenience method to validate that a password is long enough.
      We can get more fancy here if we need to.
    """
    return len(password.strip()) >= 8

  def getCurrentUser(self):
    """
      Convenience method to return the current user from the context
    """
    return SecurityContextHolder.getContext().authentication.username


class ScriptRunnerLoggingService():
  """
    Convenience class to expose a cleaner logging interface to objects
    that need logging.
  """

  def __init__(self, level=logging.WARN):
    self.logger = logging.getLogger("ScriptRunner")
    self.level = level
    self.logger.setLevel(self.level)
    ch = logging.StreamHandler()
    ch.setLevel(self.level)
    formatter = logging.Formatter(
      "%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    ch.setFormatter(formatter)
    self.logger.addHandler(ch)

  def debug(self, message):
    self.logger.debug(message)

  def info(self, message):
    self.logger.info(message)

  def warn(self, message):
    self.logger.warn(message)

  def error(self, message):
    self.logger.error(message)
    

class ScriptRunnerQueueService():
  """
    Exposes a Queue like abstraction over the jobs table in the
    embedded SQLite3 database.
  """

  def __init__(self, dataSource):
    self.dataSource = dataSource
    self.databaseTemplate = DatabaseTemplate(dataSource)

  def empty(self):
    """ Returns True if there are no jobs to process """
    rows = self.databaseTemplate.query("""
      select id from jobs where status = "New"
      """, rowhandler=DictionaryRowMapper())
    return len(rows) == 0

  def enqueue(self, row):
    """ Add a new element into the queue """
    # an insert returns no rows, so use execute() rather than query()
    self.databaseTemplate.execute("""
      insert into jobs(id, script_name, params, requester,
      status, requestDttm, startDttm, stopDttm)
      values (NULL, ?, ?, ?, "New", ?, 0, 0)
    """, (row["script_name"], row["params"],
    row["requester"], time.time()))
    self.dataSource.commit()

  @transactional
  def dequeue(self):
    """ Get the next job from the queue and mark it Started """
    row = {}
    rows = self.databaseTemplate.query("""
      select id, script_name, params, requester,
      status, requestDttm, startDttm, stopDttm
      from jobs
      where status = "New"
    """, rowhandler=DictionaryRowMapper())
    currentTime = time.time()
    for row in rows:
      self.databaseTemplate.update(
        "update jobs set status = ?, startDttm = ? where id = ?",
        ("Started", currentTime, row["id"]))
      break
    self.dataSource.commit()
    row["status"] = "Started"
    row["startDttm"] = self.formatDate(currentTime)
    return row

  def browse(self, requester="", formatDate=True):
    """ returns a list of queue elements """
    sql = """
      select id, script_name, params, requester,
      status, requestDttm, startDttm, stopDttm
      from jobs"""
    args = None
    if requester != "":
      # bind the requester as a parameter instead of interpolating it
      # into the SQL string, which would allow SQL injection
      sql += " where requester = ?"
      args = (requester,)
    rows = self.databaseTemplate.query(sql, args,
      rowhandler=DictionaryRowMapper())
    if formatDate:
      for row in rows:
        if row["params"] == "":
          row["params"] = "-"
        row["requestDttm"] = self.formatDate(int(row["requestDttm"]))
        row["startDttm"] = self.formatDate(int(row["startDttm"]))
        row["stopDttm"] = self.formatDate(int(row["stopDttm"]))
    return rows

  def remove(self, row):
    """ Remove job from queue when complete (mark it complete) """
    id = row["id"]
    currentTime = time.time()
    self.databaseTemplate.update(
      "update jobs set status = ?, stopDttm = ? where id = ?",
      ("Complete", currentTime, id))
    self.dataSource.commit()
    row["status"] = "Complete"
    row["stopDttm"] = self.formatDate(currentTime)
    return row
    
  def expunge(self, cutoff):
    """ Remove all completed jobs with completion dates older than cutoff """
    self.databaseTemplate.execute(
      "delete from jobs where status = ? and stopDttm < ?",
      ("Complete", cutoff))
    self.dataSource.commit()

  def formatDate(self, secondsSinceEpoch):
    """
      Converts times stored in database as seconds since epoch to human
      readable ISO-8601 format for display.
    """
    if secondsSinceEpoch == 0:
      return "-"
    ts = time.localtime(secondsSinceEpoch)
    return time.strftime("%Y-%m-%d %H:%M:%S", ts)


class ScriptRunnerFileService():
  """
    Convenience class to provide some common functionality to list
    and parse script and input files from the local file system.
  """

  def __init__(self, propertiesPlaceHolder):
    self.propertiesPlaceHolder = propertiesPlaceHolder
    self.prefix = self.propertiesPlaceHolder["script_doc_prefix"]

  def listFiles(self, dir):
    return os.listdir(dir)

  def listParams(self, scriptdir, scriptname):
    """
      Return a list of param tuples by parsing the script metadata
      identified by prefix. We can get fancier if we need to, extracting
      options, default values, etc, but currently we assume positional
      parameters, something like this:
      ##~~ input - the input file
      ##~~ conf - the configuration file
      ##~~ output - the output file, etc.
    """
    params = []
    script = open("/".join([scriptdir, scriptname]), 'rb')
    for line in script:
      if line.startswith(self.prefix):
        # split on the first "-" only, so descriptions may contain dashes
        (name, desc) = line[len(self.prefix):].split("-", 1)
        params.append([name.strip(), desc.strip()])
    script.close()
    return params

  def deleteFilesOlderThan(self, dir, cutoff):
    """ delete files in a directory that are older than cutoff """
    for file in os.listdir(dir):
      path = "/".join([dir, file])
      if os.path.getmtime(path) < cutoff:
        os.remove(path)


class ScriptRunnerEmailService(DisposableObject):
  """
    Provides a convenient abstraction to send email or print
    contents of email to STDOUT (if mock=True).
  """

  def __init__(self, propertiesPlaceHolder, mock=True):
    self.propertiesPlaceHolder = propertiesPlaceHolder
    self.sender = propertiesPlaceHolder["email_address"]
    self.mock = mock
    if not mock:
      self.smtp = smtplib.SMTP()

  def destroy(self):
    if not self.mock:
      self.smtp.quit()

  def sendMail(self, row):
    """ Sends start and completion messages """
    recipient = row["requester"]
    subject = """Job #%d [%s] %s""" % (
      int(row["id"]), row["script_name"], row["status"])
    body = ""
    if row["status"] == "Started":
      body += """
Your Job (Job# %d) has started at %s.
--
View status at http://%s:%d/jobs
        """ % (int(row["id"]), row["startDttm"],
        gethostname(), int(self.propertiesPlaceHolder["web_port"]))
    elif row["status"] == "Complete":
      outputFile = self.getOutputFile(row["params"])
      body += """
Your Job (Job# %d) completed at %s.
Output file [%s] is available for pickup and will be deleted
after 7 days. To retrieve the output, scp it from this machine
using the following command:
--
scp %s@%s:%s/%s %s
(Password: %s)
        """ % (int(row["id"]), row["stopDttm"],
        outputFile, self.propertiesPlaceHolder["scp_user"],
        gethostname(), self.propertiesPlaceHolder["outbox_dir"],
        outputFile, outputFile, self.propertiesPlaceHolder["scp_pass"])
    else:
      return
    if self.mock:
      print "---- Sending message ----"
      print "From: %s" % (self.sender)
      print "To: %s" % (recipient)
      print "Subject: %s" % (subject)
      print body
      print "----"
    else:
      msg = MIMEText(body)
      msg["From"] = self.sender
      msg["To"] = recipient
      msg["Subject"] = subject
      self.smtp.sendmail(self.sender, [recipient], msg.as_string())

  def getOutputFile(self, params):
    outputFile = "NONE"
    for param in params.split(","):
      if param != "":
        (paramName, paramValue) = param.split("=")
        if paramName.startswith(
            self.propertiesPlaceHolder["output_param_prefix"]):
          outputFile = paramValue
          break
    return outputFile

view.py

This corresponds to the Controller and JSTL layer in the Spring Java world. Each method annotated with @cherrypy.expose corresponds to a page.

# Source: view.py

import cherrypy
import re
from socket import gethostname
from springpython.security import AuthenticationException
from springpython.security.context import SecurityContextHolder
from springpython.security.providers import UsernamePasswordAuthenticationToken

def header():
  """ Standard header used for all pages """
  header = """
    <!-- ScriptRunner :: Run Scripts on this Server -->
    <html>
      <head>
        <title>ScriptRunner :: Run Scripts on this Server</title>
        <style type="text/css">
            td { padding:3px; }
            div#top {position:absolute; top: 0px; left: 0px; background-color: #E4EFF3; height: 50px; width:100%; padding:0px; border: none;margin: 0;}
            div#image {position:absolute; top: 50px; right: 0%; background-image: url(images/spring_python_white.png); background-repeat: no-repeat; background-position: right; height: 100px; width:300px }
        </style>
      </head>
      <body>
        <div id="top">&nbsp;</div>
        <div id="image">&nbsp;</div>
        <br clear="all">
        <p/>
        <h2>Welcome to ScriptRunner</h2>
  """
  username = SecurityContextHolder.getContext().authentication.username
  if username is not None:
    header += """
      <font size="-1">
      Hello <b>%s</b> (if you are not %s, please <a href="/logout">click here</a>)
      </font><br/>
    """ % (username, username)
  header += "<hr/>"
  return header

def footer(currentPage=""):
  """ Standard footer used for all pages """
  links = [
    ["Home", "/"],
    ["Submit Job", "/submit"],
    ["View Jobs", "/jobs"],
    ["Logout", "/logout"]
  ]
  username = SecurityContextHolder.getContext().authentication.username
  return """
    <hr>
    <table style="width:100%%">
      <tr>
        <td>%s</td>
        <td style="text-align:right;color:silver">
          Powered by <a href="http://cherrypy.org">CherryPy</a>
          and <a href="http://springpython.webfactional.com">Spring-Python</a>.
        </td>
      </tr>
    </table>
    </body>
    </html>
    """ % (footerLinkText(links, username, currentPage))

def footerLinkText(links, username, currentPage):
  footerLinkText = ""
  if username is not None:
    count = 0
    for link in links:
      if count > 0:
        footerLinkText += "&nbsp;|&nbsp;"
      if currentPage != "" and link[0] == currentPage:
        footerLinkText += """
          <font style="color:silver">%s</font>
        """ % (link[0])
      else:
        footerLinkText += "<a href=\"%s\">%s</a>" % (link[1], link[0])
      count = count + 1
  return footerLinkText


class ScriptRunnerView(object):
  """Presentation layer of the web application."""

  def __init__(self):
    self.emailRegex = re.compile(r"[a-zA-Z0-9._]+@[a-zA-Z0-9.]+")

  @cherrypy.expose
  def login(self, fromPage="/", login="", password="", errorMsg=""):
    """ Controller for login and view for login/signup page """
    if login != "" and password != "":
      try:
        self.attemptAuthentication(login, password)
        return [self.redirectStrategy.redirect(fromPage)]
      except AuthenticationException, e:
        errorMsg = "Bad email/password, please try again"
        return [self.redirectStrategy.redirect("?login=%s&errorMsg=%s" %
          (login, errorMsg))]
    results = header()
    if errorMsg != "":
      results += """
        <b><font color="red">Errors Found:%s, please retry</font></b><br/>
      """ % (errorMsg)
    results += """
      <table cellspacing="4" cellpadding="4" border="0" width="100%%">
        <tr>
          <td><h3>Sign In</h3></td>
          <td><h3>Or Sign Up</h3></td>
        </tr>
        <tr>
          <td><!-- login -->
            <form method="POST" action="/login">
              <input type="hidden" name="fromPage" value="%s">
              <table cellspacing="2" cellpadding="4" border="0">
                <tr>
                  <td><b>Email Address:</b></td>
                  <td><input type="text" name="login" value="" size="10"/></td>
                </tr>
                <tr>
                  <td><b>Password:</b></td>
                  <td><input type="password" name="password" size="10"></td>
                </tr>
                <tr>
                  <td colspan="2"><input type="submit" value="Sign In"/></td>
                </tr>
              </table>
            </form>
          </td>
          <td><!-- signup -->
            <form method="POST" action="/signup">
              <input type="hidden" name="fromPage" value="%s">
              <table cellspacing="2" cellpadding="4" border="0">
                <tr>
                  <td><b>Email Address:</b></td>
                  <td><input type="text" name="login" value="" size="10"/></td>
                </tr>
                <tr>
                  <td><b>Password:</b></td>
                  <td><input type="password" name="password" size="10"></td>
                </tr>
                <tr>
                  <td colspan="2"><input type="submit" value="Sign Up"/></td>
                </tr>
              </table>
            </form>
          </td>
        </tr>
      </table>
      """ % (fromPage, fromPage)
    results += footer()
    return [results]

  @cherrypy.expose
  def signup(self, fromPage="", login="", password="", errorMsg=""):
    """
      Controller for signup page
      Collects the user data from the form, then updates the user
      dictionary in the UserDetailsService, then redirects to login
      page.
    """
    if login == "" or password == "":
      return [self.redirectStrategy.redirect("/login")]
    # validate the inputs
    if not self.userDetailsService.isValidEmail(login):
      errorMsg = "<br/>Malformed Email Address"
    if not self.userDetailsService.isValidPassword(password):
      errorMsg += "<br/>Password too short (< 8 chars)"
    if errorMsg == "":
      # no errors: register the user and log them in right away
      self.userDetailsService.addUserdata(login, password)
      self.attemptAuthentication(login, password)
      return [self.redirectStrategy.redirect(fromPage)]
    # validation failed: send the user back to the login page with the errors
    return [self.redirectStrategy.redirect("/login?errorMsg=%s" % (errorMsg))]

  @cherrypy.expose
  def index(self):
    """ controller for index page (documentation) """
    results = header()
    results += """
    <p>ScriptRunner provides a web interface for users to run large,
    resource-intensive jobs on a large machine (or cluster of machines).
    Traditionally, such resources have been jealously guarded by the
    chosen few who were granted access to it, primarily because of
    security concerns. However, one side effect of this is that resource
    ends up being under-utilized because this group usually have other
    things to do besides trying to coordinate machine usage among
    themselves to keep the resource constantly busy. Another typical
    side effect is that everything is funnelled through this group,
    resulting in this group being overworked.</p>

    <p>ScriptRunner opens up this expensive computing resources to anyone
    within the organization who requires it. Jobs are submitted to ScriptRunner,
    which runs them serially. Progress (job start, job completion) is reported
    back using the submitter's email address.</p>

    <p>A job consists of some kind of transformation, represented by a
    Unix shell script, one or more input files, and optional configuration
    parameters or files.</p>

    <p>Scripts can be those that are already checked into the code repository,
    or something created and freshly checked in (which an admin would then
    need to update on the ScriptRunner box). Input files are scp'ed over to
    a public area of the ScriptRunner machine. ScriptRunner uses metadata
    annotations in the script to prompt for script parameters.</p>

    <p>Jobs are queued up in a database embedded within ScriptRunner. One
    (or more) server component(s), depending on the machine/cluster capacity,
    listen on this queue and runs them. Once done, an email containing
    instructions on where to retrieve the output from (using scp) is sent to
    the requester.</p>
    """
    results += footer("Home")
    return [results]

  @cherrypy.expose
  def submit(self, scriptname="", pageNum=1, **kwargs):
    """ controller for job submission """
    scpUser = self.propertiesPlaceHolder["scp_user"]
    scpPass = self.propertiesPlaceHolder["scp_pass"]
    inboxDir = self.propertiesPlaceHolder["inbox_dir"]
    scriptDir = self.propertiesPlaceHolder["script_dir"]
    inputParamPrefix = self.propertiesPlaceHolder["input_param_prefix"]
    hostName = gethostname()
    """ generate wizard pages based on pageNum """
    results = header()
    results += """<h3>Submit Job (Page %d/3)</h3>""" % (int(pageNum))
    if int(pageNum) == 1:
      scriptFiles = self.fileService.listFiles(scriptDir)
      results += """
        <form method="POST" action="/submit">
          <input type="hidden" name="pageNum" value="2"/>
          <ol>
            <li>scp your input file to this machine<br/>
              <font size="-1">(scp input_file %s@%s:%s), password: %s</font>
            </li>
            <li>Choose Script to run: <select name="scriptname">%s</select></li>
          </ol>
          <input type="submit" value="Next"/>
        </form>
      """ % (scpUser, hostName, inboxDir, scpPass, self.selectHtml(scriptFiles))
    elif int(pageNum) == 2:
      inputFiles = self.fileService.listFiles(inboxDir)
      params = self.fileService.listParams(scriptDir, scriptname)
      results += """
        <form method="POST" action="/submit">
          <input type="hidden" name="pageNum" value="3"/>
          <input type="hidden" name="scriptname" value="%s"/>
          <ol>
            <li><font style="color:silver">scp your input file to this machine<br/>
              <font size="-1">(scp input_file %s@%s:%s), password: %s</font></font>
            </li>
            <li><font style="color:silver">Choose Script to run: %s</font></li>
            <li>Set parameters:<br/>
              %s
            </li>
          </ol>
          <input type="submit" value="Next"/>
        </form>
      """ % (scriptname, scpUser, hostName, inboxDir, scpPass,
      scriptname, self.paramSelectHtml(params, inputParamPrefix, inputFiles))
    else:
      username = self.userDetailsService.getCurrentUser()
      (paramHtml, paramCsv) = self.formatParams(self.getRequestParams("param_"))
      """ save the job into the database """
      element = {
        "script_name" : scriptname,
        "params" : paramCsv,
        "requester" : username
      }
      self.queueService.enqueue(element)
      results += """
        <ol>
          <li><font style="color:silver">scp your input file to this machine<br/>
            <font size="-1">(scp input_file %s@%s:%s), password: %s</font></font>
          </li>
          <li><font style="color:silver">Choose Script to run: %s</font></li>
          <li><font style="color:silver">Set Parameters: %s</font></li>
          <li>Job submitted (<a href="/jobs">check status</a>), progress updates
            will be emailed to %s
          </li>
        </ol>
      """ % (scpUser, hostName, inboxDir, scpPass, scriptname,
      paramHtml, username)
    results += footer("Submit Job")
    return [results]

  @cherrypy.expose
  def jobs(self, requester=""):
    """ controller for listing all submitted jobs """
    results = header()
    if requester != "":
      results += """
        <h3><a href="/jobs">All Jobs</a> | <font style="background-color:#E4EFF3">Jobs for %s</font></h3>
      """ % (requester)
    else:
      username = self.userDetailsService.getCurrentUser()
      results += """
        <h3><font style="background-color:#E4EFF3">All Jobs</font> | <a href="/jobs?requester=%s">Jobs for %s</a></h3>
      """ % (username, username)
    results += """
      <table cellspacing="2" cellpadding="2" border="1" width="100%%">
        <tr>
          <th>Job-#</th>
          <th>Script</th>
          <th>Params</th>
          <th>Requester</th>
          <th>Status</th>
          <th>Requested</th>
          <th>Started</th>
          <th>Completed</th>
        </tr>
    """
    rows = self.queueService.browse(requester, True)
    for row in rows:
      results += """
        <tr>
          <td>%s</td>
          <td>%s</td>
          <td>%s</td>
          <td>%s</td>
          <td>%s</td>
          <td>%s</td>
          <td>%s</td>
          <td>%s</td>
        </tr>
        """ % (row.get("id"), row.get("script_name"),
        row.get("params"), row.get("requester"), row.get("status"),
        row.get("requestDttm"), row.get("startDttm"), row.get("stopDttm"))
    results += """
      </table>
    """
    results += footer("View Jobs")
    return [results]

  @cherrypy.expose
  def logout(self):
    """ controller to log out user """
    self.filter.logout()
    self.httpContextFilter.saveContext()
    raise cherrypy.HTTPRedirect("/")

  def attemptAuthentication(self, username, password):
    """Authenticate a new username/password pair using the authentication manager."""
    token = UsernamePasswordAuthenticationToken(username, password)
    SecurityContextHolder.getContext().authentication = self.authenticationManager.authenticate(token)
    self.httpContextFilter.saveContext()

  def selectHtml(self, files):
    """ Generate select widget to allow selection from collection """
    selectHtml = ""
    for file in files:
      selectHtml += """
        <option value="%s">%s</option>
      """ % (file, file)
    return selectHtml

  def paramSelectHtml(self, params, inputParamPrefix, inputFiles):
    """ Generate table widget to input script parameters """
    paramSelectHtml = """
      <table cellspacing="1" cellpadding="1" border="0">
    """
    for (paramName, paramDesc) in params:
      if paramName.startswith(inputParamPrefix):
        paramSelectHtml += """
          <tr>
            <td align="left">%s<br/><font size="-1">%s</font></td>
            <td><select name="param_input">%s</select></td>
          </tr>
        """ % (paramName, paramDesc, self.selectHtml(inputFiles))
      else:
        paramSelectHtml += """
          <tr>
            <td align="left">%s<br/><font size="-1">%s</font></td>
            <td><input type="text" name="%s" width="40"/></td>
          </tr>
        """ % (paramName, paramDesc, "param_" + paramName)
    paramSelectHtml += "</table>"
    return paramSelectHtml

  def getRequestParams(self, prefix):
    """ Generate list of parameter (name,value) from request params """
    params = []
    for paramName in cherrypy.request.params.keys():
      if paramName.startswith(prefix):
        paramValue = cherrypy.request.params[paramName]
        params.append([paramName[len(prefix):], paramValue])
    return params

  def formatParams(self, params):
    """ Format list of params to different formats (HTML, CSV) """
    paramHtml = ""
    paramCsv = ""
    for (paramName, paramValue) in params:
      paramHtml += """
        <br/>%s = %s
      """ % (paramName, paramValue)
      paramCsv += "%s=%s," % (paramName, paramValue)
    return (paramHtml, paramCsv)

server.py

Just as view.py contains the ScriptRunnerView class for the web client, this file contains the top-level classes for the server and cleaner components.

# Source: server.py

import time
import subprocess

class ScriptRunnerProcessor(object):

  def __init__(self):
    self.terminate = False

  def start(self):
    while not self.terminate:
      if self.queueService.empty():
        self.logger.debug("Sleeping for some time...")
        time.sleep(int(self.propertiesPlaceHolder["poll_interval"]))
        continue
      row = self.queueService.dequeue()
      self.emailService.sendMail(row)
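      # Pass the argument list directly (scripts must be executable). With
      # shell=True and a list, only the first item is run as the command and
      # the remaining items become arguments to the shell itself.
      subprocess.call(self.buildCommand(row))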
      row = self.queueService.remove(row)
      self.emailService.sendMail(row)

  def stop(self):
    self.terminate = True

  def buildCommand(self, row):
    command = []
    scriptname = "/".join(
      [self.propertiesPlaceHolder["script_dir"], row["script_name"]])
    command.append(scriptname)
    inputDir = self.propertiesPlaceHolder["inbox_dir"]
    outputDir = self.propertiesPlaceHolder["outbox_dir"]
    inputParamPrefix = self.propertiesPlaceHolder["input_param_prefix"]
    outputParamPrefix = self.propertiesPlaceHolder["output_param_prefix"]
    params = row["params"]
    for paramNvp in params.split(","):
      if paramNvp != "":
        # split on the first "=" only, so values may themselves contain "="
        (paramName, paramValue) = paramNvp.split("=", 1)
        if paramName.startswith(inputParamPrefix):
          command.append("/".join([inputDir, paramValue]))
        elif paramName.startswith(outputParamPrefix):
          command.append("/".join([outputDir, paramValue]))
        else:
          command.append(paramValue)
    self.logger.debug("Executing: %s" % (command))
    return command


class ScriptRunnerCleaner(object):

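  def __init__(self, propertiesPlaceHolder):
    # keep a reference, since clean() reads the directories from it
    self.propertiesPlaceHolder = propertiesPlaceHolder
    self.secondsPerWeek = 7 * 24 * 60 * 60
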
  def clean(self):
    cutoffTime = time.time() - self.secondsPerWeek
    inbox = self.propertiesPlaceHolder["inbox_dir"]
    outbox = self.propertiesPlaceHolder["outbox_dir"]
    self.fileService.deleteFilesOlderThan(inbox, cutoffTime)
    self.fileService.deleteFilesOlderThan(outbox, cutoffTime)
    self.queueService.expunge(cutoffTime)
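
The cleaner relies on a fileService.deleteFilesOlderThan method that is not shown in this post. Here is a minimal sketch of what such a helper might do, assuming it compares each file's modification time against the cutoff. The function name and its return value are hypothetical, not the actual service code:

```python
import os
import time

def delete_files_older_than(directory, cutoff_time):
    """Delete plain files in `directory` last modified before `cutoff_time`.

    `cutoff_time` is seconds since the epoch, as returned by time.time().
    Returns the sorted names of the deleted files (hypothetical convenience).
    """
    deleted = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # Only plain files are candidates; subdirectories are left alone.
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff_time:
            os.remove(path)
            deleted.append(name)
    return sorted(deleted)
```

Called with time.time() minus one week of seconds, this removes anything that has been sitting in the inbox or outbox for more than seven days, which matches the retention period mentioned in the completion email.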

ScriptRunner.py

This is the main script that is called from the command line. The original code has been changed to add and parse options, and to switch between three modes of operation. Some code has also been factored out into the services layer.

# Source: ScriptRunner.py

import app_context
import cherrypy
import os
import sys
from optparse import OptionParser
from springpython.context import ApplicationContext
from springpython.security.context import SecurityContextHolder

def launch_web():
  """ run in web client mode """
  applicationContext = ApplicationContext(app_context.ScriptRunnerWebConfig())
  props = applicationContext.get_object("propertiesPlaceHolder")
  port = int(props["web_port"]) if props.has_key("web_port") else 8080
  conf = {'/'      : {"server.socket.port": port,
                      "tools.staticdir.root": os.getcwd(),
                      "tools.sessions.on": True,
                      "tools.filterChainProxy.on": True},
          "/images": {"tools.staticdir.on": True,
                      "tools.staticdir.dir": "images"}
  }
  applicationContext.get_object("filterChainProxy")
  """ set up security """
  SecurityContextHolder.setStrategy(SecurityContextHolder.MODE_GLOBAL)
  SecurityContextHolder.getContext()
  form = applicationContext.get_object(name="root")
  cherrypy.tree.mount(form, '/', config=conf)
  cherrypy.engine.start()
  cherrypy.engine.block()

def launch_server():
  """ run in server mode """
  applicationContext = ApplicationContext(app_context.ScriptRunnerServerConfig())
  processor = applicationContext.get_object(name="root")
  try:
    processor.start()
  except KeyboardInterrupt:
    processor.stop()

def launch_cleanup():
  """ scheduled from cron to periodically delete old files and job entries """
  applicationContext = ApplicationContext(app_context.ScriptRunnerCleanupConfig())
  cleaner = applicationContext.get_object(name="root")
  cleaner.clean()

if __name__ == '__main__':
  # Parse launch options
  parser = OptionParser(usage="Usage: %prog [-h|--help] [options]")
  parser.add_option("-w", "--web", action="store_true", dest="web", default=False, help="Run ScriptRunner web client")
  parser.add_option("-s", "--server", action="store_true", dest="server", default=False, help="Run ScriptRunner server")
  parser.add_option("-c", "--cleanup", action="store_true", dest="cleanup", default=False, help="Cleanup old files")
  (options, args) = parser.parse_args()

  # Exactly one of the three modes must be selected
  selected = sum([options.web, options.server, options.cleanup])
  if selected != 1:
    print "Select one of --web, --server, or --cleanup"
    parser.print_help(None)
    sys.exit(2)

  if options.web:
    launch_web()
  elif options.server:
    launch_server()
  elif options.cleanup:
    launch_cleanup()

Screenshots of Usage

Here are some screenshots of the web client, showing how someone would sign up or login, submit a job, then view the status of the job.

Login/Sign up page. New users sign up and are immediately logged in (i.e., without a separate login step). Existing users log in with their email address and password.

Logged in users are sent to an index page which provides a little documentation about the app, and what you can do with it. Now that we recognize the user, links to various pages are exposed in the footer. The header now contains a message showing that me@mycompany.com has logged in.

User clicks Submit Job. This is a three-page wizard that leads the user through the steps to submit the job. The first step is to scp your input file over to the box, then choose the script to run.

User provides parameters to the script. Any parameter whose name begins with the configured input prefix (currently "input") gets a drop-down listing all the files in the inbox, one or more of which the user would have scp-ed there.
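
To make the prefix convention concrete, here is a standalone sketch that mirrors the logic of buildCommand in server.py: parameters are stored as a comma-separated list of name=value pairs, and the name prefix decides which directory the value is resolved against. The directory paths, the "output" prefix value and the threshold parameter are assumptions made up for this illustration:

```python
def build_command(script_dir, inbox_dir, outbox_dir, script_name, params_csv,
                  input_prefix="input", output_prefix="output"):
    """Turn a stored job row into an argument list for the script."""
    command = ["/".join([script_dir, script_name])]
    for pair in params_csv.split(","):
        if pair == "":
            continue  # the stored CSV ends with a trailing comma
        name, value = pair.split("=", 1)
        if name.startswith(input_prefix):
            # input files are looked up in the inbox
            command.append("/".join([inbox_dir, value]))
        elif name.startswith(output_prefix):
            # output files are written to the outbox
            command.append("/".join([outbox_dir, value]))
        else:
            # everything else is passed through unchanged
            command.append(value)
    return command


# Hypothetical paths and parameters, for illustration only
cmd = build_command("/opt/sr/scripts", "/opt/sr/inbox", "/opt/sr/outbox",
                    "runComplexOperation1.sh",
                    "input=input1,output=output1,threshold=0.5,")
```

With these made-up values, cmd holds the script path followed by /opt/sr/inbox/input1, /opt/sr/outbox/output1 and 0.5, in that order.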

Submission is successful! At this point the job is queued up for processing.

User clicks on the "check status" link in the last page, or the "View Jobs" link in the footer. The default view is to show all jobs, but you can also filter for your own jobs using the tab at the top of the page. In this case both are the same.

The server processes the job, sending out an email when it starts the job...

From: scriptrunner@mycompany.com
To: me@mycompany.com
Subject: Job #1 [runComplexOperation1.sh] Started

Your Job (Job# 1) has started at 2010-08-29 09:00:12.
--
View status at http://cyclone.hl.local:8080/jobs

...and one when it completes.

From: scriptrunner@mycompany.com
To: me@mycompany.com
Subject: Job #1 [runComplexOperation1.sh] Complete

Your Job (Job# 1) completed at 2010-08-29 09:00:12.
Output file [output1] is available for pickup and will be deleted
after 7 days. To retrieve the output, scp it from this machine
using the following command:
--
scp guest@cyclone.hl.local:/Users/sujit/Projects/ScriptRunner/data/outputs/output1 output1
(Password: secret)
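
The emailService that produces these messages is not listed in this post. As a rough sketch, the "Started" notification could be assembled with the standard library's email package as follows. The function name and its arguments are hypothetical, and actually sending the message (for example with smtplib) is left out:

```python
from email.mime.text import MIMEText

def build_started_mail(job_id, script_name, started_at, requester,
                       sender, status_url):
    """Build (but do not send) the job-started notification."""
    body = ("Your Job (Job# %d) has started at %s.\n"
            "--\n"
            "View status at %s\n") % (job_id, started_at, status_url)
    msg = MIMEText(body)
    msg["From"] = sender
    msg["To"] = requester
    msg["Subject"] = "Job #%d [%s] Started" % (job_id, script_name)
    return msg


# Values taken from the sample email shown earlier
msg = build_started_mail(1, "runComplexOperation1.sh", "2010-08-29 09:00:12",
                         "me@mycompany.com", "scriptrunner@mycompany.com",
                         "http://cyclone.hl.local:8080/jobs")
```

The completion message would follow the same pattern, with the scp instructions for the output file added to the body.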

After the start and end of each job, the timestamps and current job status are updated. Here is what the View Jobs page looks like after the job is completed. As mentioned above, currently me@mycompany.com is the only user configured, so the two tabs on this page show the same results.
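
The queueService behind these status updates is not shown here either. The following is a minimal sketch of a database-backed queue with the same life cycle, using the standard sqlite3 module. The table layout, the class and method names, and the "Queued" status label are all assumptions; only "Started" and "Complete" appear in the emails above:

```python
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  script_name TEXT, params TEXT, requester TEXT, status TEXT,
  request_dttm REAL, start_dttm REAL, stop_dttm REAL)
"""

class JobQueue(object):
    """Hypothetical job queue stored in a single SQLite table."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(SCHEMA)

    def enqueue(self, script_name, params, requester):
        # New jobs start out as 'Queued' with only the request time set.
        cur = self.conn.execute(
            "INSERT INTO jobs (script_name, params, requester, status, "
            "request_dttm) VALUES (?, ?, ?, 'Queued', ?)",
            (script_name, params, requester, time.time()))
        self.conn.commit()
        return cur.lastrowid

    def dequeue(self):
        # Pick the oldest queued job and mark it as started.
        row = self.conn.execute(
            "SELECT id FROM jobs WHERE status = 'Queued' "
            "ORDER BY id LIMIT 1").fetchone()
        if row is None:
            return None
        self.conn.execute(
            "UPDATE jobs SET status = 'Started', start_dttm = ? WHERE id = ?",
            (time.time(), row[0]))
        self.conn.commit()
        return row[0]

    def complete(self, job_id):
        # Record the stop time once the script has finished.
        self.conn.execute(
            "UPDATE jobs SET status = 'Complete', stop_dttm = ? WHERE id = ?",
            (time.time(), job_id))
        self.conn.commit()

    def status(self, job_id):
        return self.conn.execute(
            "SELECT status FROM jobs WHERE id = ?", (job_id,)).fetchone()[0]
```

Keeping finished rows in the table, rather than deleting them on dequeue, is what lets the View Jobs page show completed jobs until the cleaner expunges them.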

Conclusion

As I mentioned before, this is my very first Python webapp, although I have used Python for scripting (not that much recently) and have used CherryPy before. It took me around two weeks to build this app, which is probably a bit long by the standards of a good Python programmer. I did have a couple of false starts: I wanted to use Apache ActiveMQ for the queueing, so I checked out first stompy and then ActiveMQ's REST interface with Universal Feed Parser before settling on the embedded database approach. In any case, I now have a working (production-ready) Python app with a nice web interface, and I don't think I could have done it in this time using something other than SpringPython. SpringPython helped in two ways:

  • Coily provided a working archetype - this helped immensely. Building a webapp involves many components, and having to work only on the application itself can be a huge time (and effort) saver.
  • Dependency Injection - Since I was familiar with Java Spring, SpringPython's DatabaseTemplate and @transactional decorators, and even concepts such as the ApplicationContext, felt familiar and intuitive to use.

2 comments (moderated to prevent spam):

Anonymous said...

I'm glad Spring Python helped you get this application developed. Your feedback on helping to improve the SQlite3 connection factory was very useful.

Sujit Pal said...

Thanks Greg, and thanks for building SpringPython. I have looked at Python web development alternatives before, but was always put off by the prospect of having to learn yet another framework (for limited gain, since I would not use it that much). SpringPython provided me a convenient and (comparatively) effortless entry point for doing this.