Friday, March 01, 2013

Drools Rules in a Database, Take 2


Quite some time back, I wrote an article describing how I used Drools with database backed rules. Since Drools (version 2) did not support this at the time, I ended up writing some glue code that tied Commons Collections Predicates and Closures into Drools Rule Conditions and Consequences, then built my rules on top of that. It was for a proof of concept project which ultimately never materialized. Over the years, I've had requests for the code backing the article.

Fast forward 6 years, and I find myself trying to do something similar for an upcoming project. Time sure flies when you are having fun. Of course, by this time, Drools is up to version 5, and database backed rules is now natively supported, so its much easier this time round.

What I describe here is a small example I tried out to prove to myself that this will work before putting it into the main application. The example I use is part of a (machine generated) decision tree from Erik Siegel's book "Predictive Analytics", that predicts the risk of mortgage default, given factors such as loan amount, applicant income, etc. Machine generated or not, its a decision tree, and hence can be modeled as a set of rules, which is what I do here.

Rules in Drools (now called JBoss Rules) are written using the "when ${condition} then ${consequence}" pattern. In keeping with the pattern, our rules are also written into a single MySQL table, and the interesting data is contained in two columns: rule_cond (the condition) and rule_cons (the consequence). Here is the dump (edited for readability) of these columns for our toy application.

1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
mysql> select rule_cond, rule_cons from mort_rules order by rule_name;
+---------------------------------------+------------------------------------+
| rule_cond                             | rule_cons                          |
+---------------------------------------+------------------------------------+
| interestRate < 7.94 &&                | setRisk(kcontext, $mortgage, 2.6)  |
|   applicantIncome < 78223             |                                    |
+---------------------------------------+------------------------------------+
| interestRate < 7.19 &&                | setRisk(kcontext, $mortgage, 3.4   |
|   applicantIncome >= 78223            |                                    |
+---------------------------------------+------------------------------------+
| interestRate >= 7.19 &&               | setRisk(kcontext, $mortgage, 9.1)  |
|   interestRate < 7.94 &&              |                                    |
|   applicantIncome >= 78223            |                                    |
+---------------------------------------+------------------------------------+
| interestRate >= 7.94 &&               | setRisk(kcontext, $mortgage, 8.1)  |
|   mortgageAmount < 67751 &&           |                                    |
|   loanToValue < 87.4                  |                                    |
+---------------------------------------+------------------------------------+
| interestRate >= 7.94 &&               | setRisk(kcontext, $mortgage, 8.5)  |
|   mortgageAmount < 182926 &&          |                                    |
|   mortgageAmount >= 67751 &&          |                                    |
|   loanToValue < 87.4 &&               |                                    |
|   interestRate < 8.69 &&              |                                    |
|   isCondo == 1                        |                                    |
+---------------------------------------+------------------------------------+
| interestRate >= 7.94 &&               | setRisk(kcontext, $mortgage, 16.3) |
|   mortgageAmount < 182926 &&          |                                    |
|   mortgageAmount >= 67751 &&          |                                    |
|   loanToValue < 87.4 &&               |                                    |
|   interestRate < 8.69 &&              |                                    |
|   isCondo == 0                        |                                    |
+---------------------------------------+------------------------------------+
| interestRate >= 7.94 &&               | setRisk(kcontext, $mortgage, 25.6) | 
|   mortgageAmount < 182926 &&          |                                    |
|   mortgageAmount >= 67751 &&          |                                    |
|   loanToValue < 87.4 &&               |                                    |
|   interestRate >= 8.69                |                                    |
+---------------------------------------+------------------------------------+
| interestRate >= 7.94 &&               | setRisk(kcontext, $mortgage, 6.4)  | 
|   mortgageAmount < 182926 &&          |                                    |
|   loanToValue >= 87.4                 |                                    |
+---------------------------------------+------------------------------------+
| interestRate >= 7.94 &&               | setRisk(kcontext, $mortgage, 15.2) | 
|   mortgageAmount >= 182926 &&         |                                    |
|   isCondo == 1                        |                                    |
+---------------------------------------+------------------------------------+
| interestRate >= 7.94 &&               | setRisk(kcontext, $mortgage, 40.0) | 
|   mortgageAmount >= 182926 &&         |                                    |
|   isCondo == 0                        |                                    |
+---------------------------------------+------------------------------------+
10 rows in set (0.01 sec)

These rules work are substituted into a DRT (Drools Template) file, which is written in MVEL and in my case looks something like this. Drools provides a ResultSetGenerator object which will use the ResultSet to build a large DRL string.

1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
// Source: src/main/resources/mort_rules.drt
template header
// list of columns that the SELECT will return
rule_cond
rule_cons

// header (will be written out once)
package com.mycompany.mortgage;

dialect "mvel"
import function com.mycompany.mortgage.MortgageFunctions.setRisk;

template "mortgage"
// this section will be repeated for each database row, the value
// of x will be written out for @{x}.

rule "mortgage_@{row.rowNumber}"
no-loop
when
  $mortgage: Mortgage(@{rule_cond})
then
  @{rule_cons}
end

end template

Here is the code to run a bunch of mortgages (represented by the Mortgage object) through the rules thus generated.

1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
// Source: src/main/scala/com/mycompany/mortgage/MortgageRiskCalculator.scala
package com.mycompany.mortgage

import java.io.{FileInputStream, StringReader}
import java.sql.DriverManager

import org.drools.RuleBaseFactory
import org.drools.compiler.PackageBuilder
import org.drools.runtime.rule.RuleContext
import org.drools.template.jdbc.ResultSetGenerator

object MortgageRiskCalculator extends App {
  val mortgages = Seq(
    new Mortgage(120000.00, 2.3, 50000.00, 75.5, 1, -1.0),
    new Mortgage(360000.00, 4.7, 100000.00, 12.0, 0, -1.0),
    new Mortgage(400000.00, 7.5, 250000.00, 20.0, 0, -1.0))
  val rules = new RiskRules("src/main/resources/mort_rules.drt")
  rules.runRules(mortgages)
  mortgages.foreach(x => println("risk=" + x.risk))
}

class Mortgage(
    val mortgageAmount: Double, 
    val interestRate: Double, 
    val applicantIncome: Double, 
    val loanToValue: Double,
    val isCondo: Int,
    var risk: Double)
    
class RiskRules(template: String) {
  // set up database connection
  val jdbcUrl = "jdbc:mysql://localhost:3306/mydb"
  Class.forName("com.mysql.jdbc.Driver")
  val conn = DriverManager.getConnection(jdbcUrl, "myuser", "mypass")
  // build drl with data from database
  val preparedStmt = conn.prepareStatement("""
    select rule_cond, rule_cons 
    from mort_rules""")
  val resultSet = preparedStmt.executeQuery()
  val resultSetGenerator = new ResultSetGenerator()
  val drl = resultSetGenerator.compile(resultSet, 
    new FileInputStream(template))
  resultSet.close()
  preparedStmt.close()
  conn.close()
  // build rule base
  val packageBuilder = new PackageBuilder()
  packageBuilder.addPackageFromDrl(new StringReader(drl))
  val ruleBase = RuleBaseFactory.newRuleBase()
  ruleBase.addPackage(packageBuilder.getPackage())
  val workingMemory = ruleBase.newStatefulSession()
  
  def runRules(facts: Seq[Mortgage]): Unit = {
    facts.foreach(workingMemory.insert(_))
    workingMemory.fireAllRules()
    workingMemory.dispose()
  }
}

object MortgageFunctions {
  def setRisk(ctx: RuleContext, m: Mortgage, r: Double): Unit = {
    m.risk = r
  }
}

The MortgageRiskCalculator is the calling class that instantiates the RiskRules object and runs the Mortgages through the Rule. The RiskRules class is the central class, it reads the rules from the MySQL table, instantiates the RuleSetGenerator, creates the DRL, creates the RuleBase and sets the DRL into it, and finally instantiates a WorkingMemory. Its runRules method inserts all the Mortgage (facts) nto the WorkingMemory and fires all the rules.

In our case, there is only a single Consequence function setRisk that sets the risk value back into the Mortgage object. In practice there can be other such Consequences that are modelled as static methods of MortgageFunctions and are imported into the DRT file so they feel like a DSL to the rule maintainer.

The output of the run is the default risk associated with each mortgage (as a percentage).

1
2
3
risk=2.6
risk=3.4
risk=9.1

In the spirit of full disclosure, in case any of you want to replicate this, I built a standard SBT project with "g8 typesafehub/scala-sbt" and my build.sbt looks like this:

1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
name := "mortgage"

version := "1.0"

scalaVersion := "2.9.2"

resolvers += "Typesafe Repository" at "http://repo.typesafe.com/typesafe/repo/"

// drools

libraryDependencies += "org.drools" % "drools-core" % "5.5.0.Final"

libraryDependencies += "org.drools" % "drools-compiler" % "5.5.0.Final"

libraryDependencies += "org.drools" % "drools-jsr94" % "5.5.0.Final"

libraryDependencies += "org.drools" % "drools-decisiontables" % "5.5.0.Final"

libraryDependencies += "org.drools" % "knowledge-api" % "5.5.0.Final"

// mysql

libraryDependencies += "mysql" % "mysql-connector-java" % "5.1.12"

So now that this functionality (support for database backed rules) is available as part of Drools, I finally have justification for deleting the old code (backing the article) off my hard disk :-).

4 comments (moderated to prevent spam):

Jim White said...

Actually JBoss Rules was a failed experiment shortly after JBoss acquired the Drools project and hasn't been used for years.

Sujit Pal said...

Thanks Jim, I didn't know that. I am guessing you mean that the name "JBoss Rules" was a failed experiment, yes? That would account for the project still being called "Drools". I am going to change the title back to Drools to reflect this, although the URL would still reflect it.

Oggu said...

Nice post, I was wondering if you had a UI that was making insertions to the database other than the Guvnor

Sujit Pal said...

Thanks Oggu, and no, I didn't use Guvnor. Since this was just a proof of concept for figuring out the feasibility of using Drools for one of my work projects, I populated the table using plain old INSERT statements.