Last week, I described a simple JMX setup to manage remotely running shell scripts. Using the HTML Adapter supplied in the JMX toolkit, we were able to provide a nice web-based front end that allowed us to start and stop the scripts, view their logs, and check whether they are running. The setup so far is functional, but not overly convenient. For example, instead of having to point your browser at the MBean page once every day or hour (and remember, there could be multiple machines, each running a bunch of scripts that you have to manage), you may want the server to notify you, perhaps by email or SMS, if anything is amiss. So over the last week, I added this sort of functionality to my JMX server, which I describe in this post.
Fortunately, JMX provides this sort of feature right out of the box, in the form of Monitor and Timer objects, so the work is mainly to choose the right component for the job and then hook the components together correctly. The diagram below shows the components I use for each of my Script adapter MBeans in my toy example:
Looks impressive, doesn't it? Well, essentially, all it is saying is that we have two Monitors and one Timer attached to each Script Adapter. The two Monitors periodically poll the Script Adapter and send out notifications into the MBean server's event pool. These are picked up by the NotificationFilters connected with each ScriptAdapter and passed through to the Listener object, which handles the notification. The Timer is slightly different: it just sends events on a schedule, which get picked up by the NotificationFilter and passed through to the configured Listener.
Please note: the code for all the classes is provided towards the end of the article. I have shown snippets of code where they are more informative than a description in English, but because the classes reuse each other heavily, the code contains many forward references, which would be hard for a reader to reconcile if I provided the full code inline.
Status Monitoring
This monitor is a StringMonitor which calls getStatus() on the ScriptAdapter every minute and looks for an exact match to the String "ERROR". If it finds a match, it sends out an email. Here is the snippet of code from ScriptAgent.java which does this (the full source is provided below).
// add string monitor that watches for getStatus() == "ERROR"
StringMonitor statusMonitor = new StringMonitor();
statusMonitor.addObservedObject(script);
statusMonitor.setObservedAttribute("Status");
statusMonitor.setGranularityPeriod(60000L); // check every minute
statusMonitor.setStringToCompare("ERROR"); // look for errors...
statusMonitor.setNotifyMatch(true); // ...and trigger notifications on match
ObjectName statusMonitorObjectName =
    new ObjectName("monitor:type=Status,script=" + getScriptName(script));
server.registerMBean(statusMonitor, statusMonitorObjectName);
statusMonitor.start();
// ...and associated listener object to send an email once that happens
server.addNotificationListener(statusMonitorObjectName,
    new EmailNotifierListener(script),
    new ScriptNotificationFilter(script),
    statusMonitorObjectName);
The first few lines instantiate the StringMonitor, point it at the ScriptAdapter's Status attribute (backed by getStatus()), and set its properties before we register and start it. The server.addNotificationListener() call then links the monitor to its listener and filter.
To test this monitor, start up the server in a terminal. Switch to the MBean server's web page and check that both scripts are running fine. Then on another terminal, navigate to the /tmp directory and create an empty .err file. Under Unix, this would be something like this:
sujit@sirocco:~$ cd /tmp
sujit@sirocco:/tmp$ touch count_sheep_Dolly.err
sujit@sirocco:/tmp$
Back on the terminal where your MBean server was started up, you should see the following output from the MBean server. It will not be instantaneous, since the monitor polls on a schedule; with the 60-second granularity period configured above, expect it within about a minute.
...
EmailNotifierListener: sending email...
From: scriptmanager@clones-r-us.com
To: dr_evil@clones-r-us.com
Date: Wed Aug 27 08:31:56 GMT-08:00 2008
Subject: Alarm for [script:name=count_sheep_Dolly:Status]
--
Script [script:name=count_sheep_Dolly] reported ERROR.
--
...
Log size Monitoring
Our next monitor is a log size monitor, which checks the size of the log file every 30 seconds to make sure it is growing. To do this, we need to use a GaugeMonitor in difference mode. The code snippet to do this is shown below:
// add gauge monitor to determine if process hung, i.e. if log files don't grow
GaugeMonitor logsizeMonitor = new GaugeMonitor();
logsizeMonitor.addObservedObject(script);
logsizeMonitor.setObservedAttribute("LogFilesize");
logsizeMonitor.setDifferenceMode(true); // check diffs(logsize)
logsizeMonitor.setThresholds(Long.MAX_VALUE, 1L); // notify if less than 1 (0)
logsizeMonitor.setNotifyLow(true);
logsizeMonitor.setNotifyHigh(false);
logsizeMonitor.setGranularityPeriod(30000L); // every 30 seconds
ObjectName logsizeMonitorObjectName =
    new ObjectName("monitor:type=Logsize,script=" + getScriptName(script));
server.registerMBean(logsizeMonitor, logsizeMonitorObjectName);
logsizeMonitor.start();
// ...and the associated listener object to send an email once this happens
server.addNotificationListener(logsizeMonitorObjectName,
    new EmailNotifierListener(script),
    new ScriptNotificationFilter(script),
    logsizeMonitorObjectName);
The configuration is similar to the one above: first we instantiate the GaugeMonitor, then we configure it to observe the ScriptAdapter's LogFilesize attribute, then we configure the GaugeMonitor's behavior. Difference mode is selected because we want to make sure that the difference between consecutive readings of LogFilesize is greater than 0. As before, the last statement sets up the listener and filter objects.
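To make the "difference between consecutive readings" idea concrete, here is a small plain-Java sketch of the arithmetic. It is not how GaugeMonitor is implemented (the real monitor also applies hysteresis between its high and low thresholds, so it would not necessarily fire on every stalled reading); the class name LogGrowthCheck and the sample sizes are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class LogGrowthCheck {

    /**
     * Returns the indices of readings whose growth since the previous
     * reading is below minGrowth. Index 0 has no predecessor and is skipped.
     */
    public static List<Integer> stalledAt(long[] sizes, long minGrowth) {
        List<Integer> stalled = new ArrayList<>();
        for (int i = 1; i < sizes.length; i++) {
            long diff = sizes[i] - sizes[i - 1]; // the "difference mode" value
            if (diff < minGrowth) {
                stalled.add(i);
            }
        }
        return stalled;
    }

    public static void main(String[] args) {
        // hypothetical log sizes sampled every 30 seconds
        long[] sizes = {100L, 180L, 260L, 260L, 260L};
        System.out.println(stalledAt(sizes, 1L)); // prints [3, 4]
    }
}
```

The log stops growing after the third sample, so the fourth and fifth readings are the ones that would count as a hung script.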
To test this monitor, start up the server in a terminal and make sure that the scripts are running. Then on another terminal, kill one of the scripts, like so:
sujit@sirocco:~$ cd /tmp
sujit@sirocco:/tmp$ cat count_sheep_Dolly.pid
3714
sujit@sirocco:/tmp$ kill -9 3714
sujit@sirocco:/tmp$
On the terminal window where the MBean server has been started, you should see the trace of the email being sent after a short while.
...
EmailNotifierListener: sending email...
From: scriptmanager@clones-r-us.com
To: dr_evil@clones-r-us.com
Date: Wed Aug 27 08:28:18 GMT-08:00 2008
Subject: Alarm for [script:name=count_sheep_Dolly:LogFilesize]
--
Script [script:name=count_sheep_Dolly] is HUNG.
--
...
Heartbeat Timer
We also have a heartbeat mechanism for our scripts, built on a JMX Timer MBean. The Timer sends a notification every minute. This notification is picked up (via the NotificationFilter) by the CprListener, which calls start() on the ScriptAdapter. ScriptAdapter.start() checks whether the script is running and launches it only if it is not. Here is the code snippet to set it up:
String scriptName = getScriptName(script);
Timer heartbeatTimer = new Timer();
heartbeatTimer.addNotification("heartbeat", scriptName,
    null, new Date(), 60000L); // every minute
ObjectName heartbeatTimerObjectName =
    new ObjectName("timer:type=heartbeat,script=" + scriptName);
server.registerMBean(heartbeatTimer, heartbeatTimerObjectName);
heartbeatTimer.start();
server.addNotificationListener(heartbeatTimerObjectName,
    new CprListener(script),
    new ScriptNotificationFilter(script),
    heartbeatTimerObjectName);
In this case, notice that the notification has the type "heartbeat", which is what the Listener object will use to determine whether it should handle the notification. The script name travels as the notification's message, which is what the ScriptNotificationFilter matches on. To test this, as before, start the MBean server on a terminal, then just wait until the Timer has a chance to kick in. You should see the following on the MBean Server console.
...
Performing CPR on script:name=count_sheep_Dolly
19:11:14: Attempted start, but script: count_sheep_Dolly already started
Performing CPR on script:name=count_sheep_Polly
19:11:14: Attempted start, but script: count_sheep_Polly already started
...
Timers can also be used with non-daemon scripts, to schedule them to run at a particular time of day.
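If the difference between a timer notification's type and its message is not obvious, the Timer itself can be asked what it stored. The sketch below registers the same kind of periodic notification as above and reads it back; the Timer is never started or registered with an MBean server here, so nothing is actually emitted. The class name HeartbeatTimerInspect is just for illustration.

```java
import java.util.Date;
import javax.management.timer.Timer;

public class HeartbeatTimerInspect {

    /** Adds one periodic notification and reports what the Timer stored for it. */
    public static String describe(String scriptName) {
        Timer timer = new Timer(); // not started, so no notifications are sent
        Integer id = timer.addNotification(
            "heartbeat",  // notification type
            scriptName,   // notification message
            null,         // no user data
            new Date(),   // date of first occurrence
            60000L);      // period in milliseconds
        return "type=" + timer.getNotificationType(id)
            + " message=" + timer.getNotificationMessage(id)
            + " periodMs=" + timer.getPeriod(id);
    }

    public static void main(String[] args) {
        // prints type=heartbeat message=count_sheep_Dolly periodMs=60000
        System.out.println(describe("count_sheep_Dolly"));
    }
}
```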
Automatic script startup
One other thing I did was to start my scripts automatically via the JMX agent. Without JMX, if you have a bunch of application daemons running on a machine, you would probably create individual start/stop scripts for them in your /etc/init.d directory. With a JMX server managing your scripts, you just start the JMX server using a start script in your /etc/init.d directory, and let it start the scripts that are registered with it. As an interesting side effect, this will also make you more popular with your Unix system administrator(s), since your popularity is computed as an inverse of the number of times your application daemons crash between midnight and 2am.
All I had to do for this was to add in a start() call in the init() method, and kill() calls in the destroy() method of my ScriptAgent class, and add a shutdown hook in the main() method to ensure that destroy() gets called on server shutdown. You will see this in action as you start and stop your MBean server.
...
[INFO] [exec:java]
08:52:31: Starting script: count_sheep_Dolly
08:52:32: Starting script: count_sheep_Polly
...
^C08:58:49: Killing script: count_sheep_Dolly
08:58:49: Killing script: count_sheep_Polly
sujit@sirocco:~$
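The shutdown hook pattern on its own looks roughly like the sketch below. The nested Agent class is a hypothetical stand-in for ScriptAgent that only records that destroy() ran. Hooks run on a normal exit and on Ctrl-C, as in the trace above, but not if the JVM itself is killed with kill -9.

```java
public class ShutdownHookSketch {

    /** Hypothetical stand-in for ScriptAgent: records whether destroy() ran. */
    public static class Agent {
        private volatile boolean destroyed = false;

        public void destroy() {
            destroyed = true;
            System.out.println("Killing scripts");
        }

        public boolean isDestroyed() {
            return destroyed;
        }
    }

    /** Builds the thread that the JVM will run while shutting down. */
    public static Thread hookFor(final Agent agent) {
        return new Thread(new Runnable() {
            public void run() {
                agent.destroy();
            }
        });
    }

    public static void main(String[] args) {
        Agent agent = new Agent();
        Runtime.getRuntime().addShutdownHook(hookFor(agent));
        System.out.println("Agent running; the hook fires when the JVM exits");
    }
}
```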
The code
ScriptAdapterMBean.java
This hasn't changed a whole lot, but we did add one new method, getLogFilesize(), so here is the latest code.
// Source: src/main/java/com/mycompany/myapp/ScriptAdapterMBean.java
package com.mycompany.myapp;

public interface ScriptAdapterMBean {
    // read-only properties
    public boolean isRunning();
    public String getStatus();
    public String getLogs();
    public long getRunningTime();
    public long getLogFilesize();
    // operations
    public void start();
    public void kill();
}
ScriptAdapter.java
The only thing that's changed since last week is the implementation of getLogFilesize(), which is a new method in the interface.
// Source: src/main/java/com/mycompany/myapp/ScriptAdapter.java
package com.mycompany.myapp;
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Date;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.FilenameUtils;
import org.apache.commons.lang.StringUtils;
import org.apache.commons.lang.time.DateFormatUtils;
import org.springframework.jmx.JmxException;
public class ScriptAdapter implements ScriptAdapterMBean {
private static final String LOG_DIR = "/tmp";
private static final String ERR_DIR = "/tmp";
private static final String PID_DIR = "/tmp";
private String path;
private String args;
private String adapterName;
private String pidfileName;
private String logfileName;
private String errfileName;
private Process process;
public ScriptAdapter(String path, String[] args) {
this.path = path;
this.args = StringUtils.join(args, " ");
this.adapterName = StringUtils.join(new String[] {
FilenameUtils.getBaseName(path),
(args.length == 0 ? "" : "_"),
StringUtils.join(args, "_")
});
this.pidfileName = FilenameUtils.concat(PID_DIR, adapterName + ".pid");
this.logfileName = FilenameUtils.concat(LOG_DIR, adapterName + ".log");
this.errfileName = FilenameUtils.concat(ERR_DIR, adapterName + ".err");
}
/**
* Not part of the MBean so it will not be available as a manageable attribute.
* @return the computed adapter name.
*/
public String getAdapterName() {
return adapterName;
}
/**
* Checks for existence of the PID file. Uses naming conventions
* to locate the correct pid file.
*/
public boolean isRunning() {
File pidfile = new File(pidfileName);
return (pidfile.exists());
}
/**
* If the .err file exists, then status == ERROR (whether or not the
* script is running).
* Otherwise, if isRunning, then status == RUNNING.
* Otherwise, status == READY.
*/
public String getStatus() {
File errorfile = new File(errfileName);
if (errorfile.exists()) {
return "ERROR";
} else {
if (isRunning()) {
return "RUNNING";
} else {
return "READY";
}
}
}
public String getLogs() {
if ("ERROR".equals(getStatus())) {
File errorfile = new File(errfileName);
try {
return FileUtils.readFileToString(errorfile, "UTF-8");
} catch (IOException e) {
throw new JmxException("IOException getting error file", e);
}
} else {
try {
Process tailProcess = Runtime.getRuntime().exec(
StringUtils.join(new String[] {"/usr/bin/tail", "-10", logfileName}, " "));
tailProcess.waitFor();
BufferedReader console = new BufferedReader(
new InputStreamReader(tailProcess.getInputStream()));
StringBuilder consoleBuffer = new StringBuilder();
String line;
while ((line = console.readLine()) != null) {
consoleBuffer.append(line).append("\n");
}
console.close();
tailProcess.destroy();
return consoleBuffer.toString();
} catch (IOException e) {
e.printStackTrace();
throw new JmxException("IOException getting log file", e);
} catch (InterruptedException e) {
throw new JmxException("Tail interrupted", e);
}
}
}
/**
* Returns the difference between the PID file creation and the current
* system time.
*/
public long getRunningTime() {
if (isRunning()) {
File pidfile = new File(pidfileName);
return System.currentTimeMillis() - pidfile.lastModified();
} else {
return 0L;
}
}
/**
* Returns the current size of the log file in bytes.
* @return the current size of the log file.
*/
public long getLogFilesize() {
File logfile = new File(logfileName);
return logfile.length();
}
public void start() {
try {
if (! isRunning()) {
log("Starting script: " + adapterName);
process = Runtime.getRuntime().exec(
StringUtils.join(new String[] {path, args}, " "));
// we don't wait for it to complete, just start it
} else {
log("Attempted start, but script: " + adapterName + " already started");
}
} catch (IOException e) {
throw new JmxException("IOException starting process", e);
}
}
public void kill() {
if (isRunning()) {
File pidfile = new File(pidfileName);
log("Killing script: " + adapterName);
try {
String pid = FileUtils.readFileToString(pidfile, "UTF-8");
Runtime.getRuntime().exec(StringUtils.join(new String[] {
"/usr/bin/kill", "-9", pid}, " "));
if (process != null) {
// remove hanging references
process.destroy();
}
pidfile.delete();
} catch (IOException e) {
throw new JmxException("IOException killing process", e);
}
}
}
private void log(String message) {
System.out.println(DateFormatUtils.ISO_TIME_NO_T_FORMAT.format(new Date()) +
": " + message);
}
}
ScriptAgent.java
This class has had some pretty big changes since last week. For each script adapter, we call registerMonitors() and registerTimer(), which add the two monitors and one timer for that script adapter. Each monitor and timer is linked to the appropriate Listener in these two extra methods. The init() method starts up the scripts on MBean server startup, and there is a new destroy() method which kills the scripts on server shutdown.
// Source: src/main/java/com/mycompany/myapp/ScriptAgent.java
package com.mycompany.myapp;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.ObjectName;
import javax.management.monitor.GaugeMonitor;
import javax.management.monitor.StringMonitor;
import javax.management.timer.Timer;
import com.sun.jdmk.comm.HtmlAdaptorServer;
public class ScriptAgent {
// the port number of the HTTP Server Adapter
private final static int DEFAULT_AGENT_PORT = 8081;
private MBeanServer server;
private List<ScriptAdapter> scriptAdapters = new ArrayList<ScriptAdapter>();
public ScriptAgent() {
super();
}
public void addScriptAdapter(ScriptAdapter scriptAdapter) {
this.scriptAdapters.add(scriptAdapter);
}
protected void init() throws Exception {
server = MBeanServerFactory.createMBeanServer();
// load all script adapters
for (ScriptAdapter scriptAdapter : scriptAdapters) {
ObjectName script = new ObjectName(
"script:name=" + scriptAdapter.getAdapterName());
server.registerMBean(scriptAdapter, script);
// each script will have a monitor to check if its running and
// if its log file size is growing
registerMonitors(script);
// each script will have a timer to attempt restart every 1m
registerTimer(script);
// start all script services
scriptAdapter.start();
}
// load HTML adapter
HtmlAdaptorServer adaptor = new HtmlAdaptorServer();
adaptor.setPort(DEFAULT_AGENT_PORT);
server.registerMBean(adaptor, new ObjectName("adapter:protocol=HTTP"));
// start 'er up!
adaptor.start();
}
protected void destroy() throws Exception {
for (ScriptAdapter scriptAdapter : scriptAdapters) {
// stop all script services
scriptAdapter.kill();
}
}
private void registerMonitors(ObjectName script) throws Exception {
// add string monitor that watches for getStatus() == "ERROR"
StringMonitor statusMonitor = new StringMonitor();
statusMonitor.addObservedObject(script);
statusMonitor.setObservedAttribute("Status");
statusMonitor.setGranularityPeriod(60000L); // check every minute
statusMonitor.setStringToCompare("ERROR"); // look for errors...
statusMonitor.setNotifyMatch(true); // ...and trigger notifications on match
ObjectName statusMonitorObjectName =
new ObjectName("monitor:type=Status,script=" + getScriptName(script));
server.registerMBean(statusMonitor, statusMonitorObjectName);
statusMonitor.start();
// ...and associated listener object to send an email once that happens
server.addNotificationListener(statusMonitorObjectName,
new EmailNotifierListener(script),
new ScriptNotificationFilter(script),
statusMonitorObjectName);
// add gauge monitor to determine if process hung, ie if log files don't grow
GaugeMonitor logsizeMonitor = new GaugeMonitor();
logsizeMonitor.addObservedObject(script);
logsizeMonitor.setObservedAttribute("LogFilesize");
logsizeMonitor.setDifferenceMode(true); // check diff(logsize)
logsizeMonitor.setThresholds(Long.MAX_VALUE, 1L); // notify if <1 (0)
logsizeMonitor.setNotifyLow(true);
logsizeMonitor.setNotifyHigh(false);
logsizeMonitor.setGranularityPeriod(30000L); // every 30 seconds
ObjectName logsizeMonitorObjectName =
new ObjectName("monitor:type=Logsize,script=" + getScriptName(script));
server.registerMBean(logsizeMonitor, logsizeMonitorObjectName);
logsizeMonitor.start();
// ...and the associated listener object to send an email once this happens
server.addNotificationListener(logsizeMonitorObjectName,
new EmailNotifierListener(script),
new ScriptNotificationFilter(script),
logsizeMonitorObjectName);
}
private void registerTimer(ObjectName script) throws Exception {
String scriptName = getScriptName(script);
Timer heartbeatTimer = new Timer();
heartbeatTimer.addNotification("heartbeat", scriptName,
null, new Date(), 60000L); // every minute
ObjectName heartbeatTimerObjectName =
new ObjectName("timer:type=heartbeat,script=" + scriptName);
server.registerMBean(heartbeatTimer, heartbeatTimerObjectName);
heartbeatTimer.start();
server.addNotificationListener(heartbeatTimerObjectName,
new CprListener(script),
new ScriptNotificationFilter(script),
heartbeatTimerObjectName);
}
private String getScriptName(ObjectName script) {
String canonical = script.getCanonicalName();
return canonical.substring("script:name=".length());
}
public static void main(String[] args) {
final ScriptAgent agent = new ScriptAgent();
Runtime.getRuntime().addShutdownHook(new Thread() {
public void run() {
try {
agent.destroy();
} catch (Exception e) {
throw new RuntimeException(e);
}
}
});
try {
agent.addScriptAdapter(new ScriptAdapter(
"/home/sujit/src/rscript/src/main/sh/count_sheep.sh", new String[] {"Dolly"}));
agent.addScriptAdapter(new ScriptAdapter(
"/home/sujit/src/rscript/src/main/sh/count_sheep.sh", new String[] {"Polly"}));
agent.init();
} catch (Exception e) {
e.printStackTrace(System.err);
}
}
}
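One small aside on getScriptName(): it recovers the script name by chopping "script:name=" off the canonical name, which works as long as name is the only key in the ObjectName. A possible alternative, sketched below, is to ask the ObjectName for the key directly with getKeyProperty(). The class ScriptNameLookup and the extra host key are my own illustration, not part of the agent.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class ScriptNameLookup {

    /** Reads the value of the "name" key, regardless of other keys. */
    public static String scriptNameOf(ObjectName script) {
        return script.getKeyProperty("name");
    }

    /** Convenience overload that parses the object name first. */
    public static String scriptNameOf(String objectName) {
        try {
            return scriptNameOf(new ObjectName(objectName));
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Bad object name: " + objectName, e);
        }
    }

    public static void main(String[] args) {
        // prints count_sheep_Dolly
        System.out.println(scriptNameOf("script:name=count_sheep_Dolly"));
        // still prints count_sheep_Dolly, even with a second (hypothetical) key
        System.out.println(scriptNameOf("script:name=count_sheep_Dolly,host=sirocco"));
    }
}
```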
ScriptNotificationFilter.java
The ScriptNotificationFilter is an optional item that makes sure that events raised for a script are handled by the listener that is configured for that script.
// Source: src/main/java/com/mycompany/myapp/ScriptNotificationFilter.java
package com.mycompany.myapp;
import javax.management.Notification;
import javax.management.NotificationFilter;
import javax.management.ObjectName;
import javax.management.monitor.MonitorNotification;
import javax.management.timer.TimerNotification;
public class ScriptNotificationFilter implements NotificationFilter {
private static final long serialVersionUID = 6299049832726848968L;
private String scriptName;
public ScriptNotificationFilter(ObjectName objectName) {
super();
this.scriptName = objectName.getCanonicalName();
}
/**
* Is the notification meant for this script?
*/
public boolean isNotificationEnabled(Notification notification) {
if (notification instanceof MonitorNotification) {
MonitorNotification monitorNotification = (MonitorNotification) notification;
String observedObjectName =
monitorNotification.getObservedObject().getCanonicalName();
return scriptName.equals(observedObjectName);
} else if (notification instanceof TimerNotification) {
TimerNotification timerNotification = (TimerNotification) notification;
return scriptName.substring("script:name=".length()).equals(
timerNotification.getMessage());
} else {
// unknown notification type
return false;
}
}
}
EmailNotifierListener.java
This listener is paired with the two Monitors. It composes and sends emails to the address configured for the script. The actual emails are shown in the tests above.
// Source: src/main/java/com/mycompany/myapp/EmailNotifierListener.java
package com.mycompany.myapp;
import java.util.Date;
import java.util.Map;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.ObjectName;
import javax.management.monitor.MonitorNotification;
import org.apache.commons.lang.ArrayUtils;
public class EmailNotifierListener implements NotificationListener {
// TODO: this should probably be a configuration item
@SuppressWarnings("unchecked")
private static final Map<String,String> ALARM_EMAILS =
ArrayUtils.toMap(new String[][] {
new String[] {
"script:name=count_sheep_Dolly", "dr_evil@clones-r-us.com"
},
new String[] {
"script:name=count_sheep_Polly", "pinky_n_brain@clones-r-us.com"
}
});
private String scriptName;
public EmailNotifierListener(ObjectName objectName) {
this.scriptName = objectName.getCanonicalName();
}
public void handleNotification(Notification notification, Object handback) {
if (notification instanceof MonitorNotification) {
MonitorNotification monitorNotification = (MonitorNotification) notification;
String observedAttribute = monitorNotification.getObservedAttribute();
String observedScriptName = monitorNotification.getObservedObject().getCanonicalName();
String emailTo = ALARM_EMAILS.get(observedScriptName);
String emailSubject = "Alarm for [" + observedScriptName + ":" +
monitorNotification.getObservedAttribute() + "]";
StringBuilder emailBody = new StringBuilder();
if (observedAttribute.equals("Status")) {
emailBody.append("Script [").
append(observedScriptName).
append("] reported ERROR.");
} else if (observedAttribute.equals("LogFilesize")) {
emailBody.append("Script [").
append(observedScriptName).
append("] is HUNG.");
} else {
// nothing to do for now, place holder for future notifications
}
sendEmail(emailTo, emailSubject, emailBody);
}
}
private void sendEmail(String emailTo, String emailSubject, StringBuilder emailBody) {
System.out.println("EmailNotifierListener: sending email...");
System.out.println("From: scriptmanager@clones-r-us.com");
System.out.println("To: " + emailTo);
System.out.println("Date: " + new Date());
System.out.println("Subject: " + emailSubject);
System.out.println("--");
System.out.println(emailBody.toString());
System.out.println("--");
}
}
CprListener.java
This listener is paired with the Heartbeat Timer and attempts to administer CPR to a downed script in order to restart it automatically.
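For readers who want something to experiment with, here is a rough sketch of what such a listener could look like. It is not the CprListener used above: it is a guess at the shape, under the assumptions that the listener reacts only to TimerNotifications of type "heartbeat" and that it calls the start operation through the MBeanServer. Unlike the one-argument constructor used in ScriptAgent, this sketch takes the MBeanServer as a constructor argument, and the FakeScript MBean exists only to make the demo self-contained.

```java
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.ObjectName;
import javax.management.timer.TimerNotification;

public class CprListenerSketch implements NotificationListener {

    private final MBeanServer server; // assumption: passed in by the caller
    private final ObjectName script;

    public CprListenerSketch(MBeanServer server, ObjectName script) {
        this.server = server;
        this.script = script;
    }

    public void handleNotification(Notification notification, Object handback) {
        // only react to heartbeat notifications coming from a Timer
        if (notification instanceof TimerNotification
                && "heartbeat".equals(notification.getType())) {
            System.out.println("Performing CPR on " + script.getCanonicalName());
            try {
                // invoke the no-argument start operation on the script MBean
                server.invoke(script, "start", new Object[0], new String[0]);
            } catch (Exception e) {
                System.err.println("Could not start " + script + ": " + e);
            }
        }
    }

    /** Hypothetical stand-in for ScriptAdapterMBean, for the demo only. */
    public interface FakeScriptMBean {
        void start();
        int getStartCalls();
    }

    public static class FakeScript implements FakeScriptMBean {
        private int startCalls = 0;
        public void start() { startCalls++; }
        public int getStartCalls() { return startCalls; }
    }

    /** Sends one heartbeat and one unrelated notification; returns start() calls. */
    public static int demo() {
        try {
            MBeanServer server = MBeanServerFactory.newMBeanServer();
            ObjectName name = new ObjectName("script:name=count_sheep_Dolly");
            FakeScript fake = new FakeScript();
            server.registerMBean(fake, name);
            CprListenerSketch listener = new CprListenerSketch(server, name);
            listener.handleNotification(new TimerNotification("heartbeat", name, 1L,
                System.currentTimeMillis(), "count_sheep_Dolly", Integer.valueOf(1)), null);
            listener.handleNotification(new Notification("other", name, 2L), null);
            return fake.getStartCalls();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("start() calls: " + demo()); // prints start() calls: 1
    }
}
```

Because ScriptAdapter.start() already refuses to launch a script that is running, a listener like this can fire on every heartbeat without any extra bookkeeping.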