Recently I added functionality to an application that increased its memory footprint considerably. The original application keeps its data in in-memory data structures for performance, and the new code had to interoperate with those structures, so I did the same. For a while I was getting the dreaded OutOfMemoryError (OOME), but it went away after I replaced a MultiMap-like structure (really a HashMap<String,List<String>>) with a plain Java HashMap.
However, that one afternoon of tracking down the OOME set me thinking seriously about whether it would be better to use something like BerkeleyDB as my data store. It is not as fast as in-memory data structures, but it is a lot faster than disk-based SQL databases such as MySQL or Oracle. Moreover, it attempts to keep as much of the data in memory as possible, swapping out to disk files when it cannot. In the past, I had run performance tests on some in-memory databases, and HSQLDB actually came out on top, but I was using BerkeleyDB version 2.1.30 at the time (from Sleepycat, before it was acquired by Oracle, I think). This time I decided to use version 3.1.0, the latest available from Oracle's website.
To get up to speed with BerkeleyDB, I decided to create a DAO that persists a data structure representing a user's preferences. The preferences object is keyed by the userId for registered, logged-in users, and by a temporary id built from the user's IP address and user-agent string for everyone else.
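As a rough illustration of the temporary id, here is a minimal sketch. The TempUserIds class and the choice of hashing the two strings with SHA-256 are my own assumptions for this example; the only part taken from the DAO below is the "t-" prefix, which the expiry code uses to recognize temporary users.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper, not part of the DAO shown in this post. */
final class TempUserIds {
    private TempUserIds() {}

    /** Builds a "t-" prefixed id from the client's IP address and user-agent. */
    static String tempUserId(String ipAddress, String userAgent) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            // the separator keeps ("1.2", "3x") and ("1.23", "x") from colliding
            byte[] digest = sha.digest(
                (ipAddress + "|" + userAgent).getBytes(StandardCharsets.UTF_8));
            StringBuilder id = new StringBuilder("t-");
            for (byte b : digest) {
                id.append(String.format("%02x", b & 0xff));
            }
            return id.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform
            throw new IllegalStateException(e);
        }
    }
}
```

The same IP address and user-agent always produce the same id, so a returning anonymous user lands on the same preferences record.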
One of the advantages touted for BerkeleyDB is the absence of an SQL parsing layer. This makes it much faster than the other databases, but it also leads to having to write more code. One of the things I did not like about BerkeleyDB in the past is that if you were persisting anything more complicated than a String, you would need to write the serialization and deserialization code to convert the object to and from a byte stream. However, BerkeleyDB-JE 3.1 has a new Direct Persistence Layer (DPL) which generates these for you dynamically. The programmer just has to annotate the class to be persisted and the DPL takes care of the rest. I used the DPL for this user preference DAO example.
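To give a feel for the boilerplate the DPL removes, here is a minimal sketch of hand-written serialization for a preferences map. It uses only JDK streams rather than BerkeleyDB's own binding classes, and the PrefsCodec name and its byte layout are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical hand-rolled codec: entry count, then key/value pairs. */
final class PrefsCodec {
    private PrefsCodec() {}

    static byte[] toBytes(Map<String, String> prefs) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(prefs.size());
            for (Map.Entry<String, String> e : prefs.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static Map<String, String> fromBytes(byte[] bytes) {
        Map<String, String> prefs = new TreeMap<String, String>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int size = in.readInt();
            for (int i = 0; i < size; i++) {
                String key = in.readUTF();
                prefs.put(key, in.readUTF());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return prefs;
    }
}
```

Every new field means touching both methods and keeping them in step, which is exactly the kind of code the annotations make unnecessary.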
For our application, we first define the UserPrefsEntity bean. We need to annotate the class itself as an @Entity, the userId as a @PrimaryKey, and the updated timestamp as a @SecondaryKey. In addition, the DPL framework requires a public constructor with the primary key field as the argument, and a private null (no-args) constructor. Getters and setters for the fields are optional, but you probably need them in the DAO, so I would just put them in and remove them later if they are not used. Here is the code for the bean.
import java.util.Map;
import java.util.TreeMap;
import com.sleepycat.persist.model.Entity;
import com.sleepycat.persist.model.PrimaryKey;
import com.sleepycat.persist.model.Relationship;
import com.sleepycat.persist.model.SecondaryKey;
/**
* Entity representing a User session object.
*/
@Entity
public class UserPrefsEntity {
@PrimaryKey private String userId;
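// MANY_TO_ONE: several users may be updated in the same millisecond,
// so the secondary key must not be required to be unique
@SecondaryKey(relate=Relationship.MANY_TO_ONE) private long updatedMillis;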
private Map<String,String> prefs = new TreeMap<String,String>();
public UserPrefsEntity(String userId) {
this.userId = userId;
}
private UserPrefsEntity() {
super();
}
public String getUserId() {
return userId;
}
public void setUpdatedMillis(long updatedMillis) {
this.updatedMillis = updatedMillis;
}
public long getUpdatedMillis() {
return updatedMillis;
}
public Map<String,String> getPrefs() {
return prefs;
}
public void setPrefs(Map<String,String> prefs) {
this.prefs.clear();
this.prefs.putAll(prefs);
}
}
The DAO provides methods to operate on the bean. BerkeleyDB lets you look up stored entities through PrimaryIndex and SecondaryIndex accessors. These accessors, along with the Environment and EntityStore objects, are declared as fields and initialized in the init() method, and the store and environment are closed in the corresponding destroy() method. Since I use Spring, I make sure that the DAO's bean definition has its init-method and destroy-method attributes set to "init" and "destroy" respectively. Non-Spring code, such as my JUnit test shown below, must take care to call init() before any other call to the DAO, and destroy() after the last one.
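If the two kinds of index are unfamiliar, the following in-memory stand-in may help. It is only a sketch of the idea using JDK collections, not BerkeleyDB code, and the IndexSketch class is hypothetical: a primary index maps the unique key to the record, while a secondary index keeps the same records sorted by another field, so a range scan over that field is cheap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for a primary index plus one sorted secondary index. */
final class IndexSketch {
    // "primary index": userId -> last update time
    private final Map<String, Long> byUserId = new HashMap<String, Long>();
    // "secondary index": update time -> userIds, kept sorted by time;
    // several users can share a timestamp, hence the list
    private final TreeMap<Long, List<String>> byUpdatedMillis =
        new TreeMap<Long, List<String>>();

    void put(String userId, long updatedMillis) {
        remove(userId); // drop any stale secondary entry first
        byUserId.put(userId, updatedMillis);
        byUpdatedMillis
            .computeIfAbsent(updatedMillis, k -> new ArrayList<String>())
            .add(userId);
    }

    void remove(String userId) {
        Long old = byUserId.remove(userId);
        if (old != null) {
            List<String> ids = byUpdatedMillis.get(old);
            ids.remove(userId);
            if (ids.isEmpty()) {
                byUpdatedMillis.remove(old);
            }
        }
    }

    /** Ids updated strictly before the cutoff, oldest first. */
    List<String> updatedBefore(long cutoff) {
        List<String> result = new ArrayList<String>();
        for (List<String> ids : byUpdatedMillis.headMap(cutoff, false).values()) {
            result.addAll(ids);
        }
        return result;
    }
}
```

Finding stale records is then a scan over the head of the sorted index instead of a pass over every record.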
The DAO retrieves all or part (by preference key prefix) of a user's preferences through the two load() methods, and saves preferences through save(). If we have been collecting preferences for a user before he registers or logs in, then once he does, we copy the collected preferences to his new userId using the migrate() method. Finally, there is an expire() method that a scheduled job can call to clean out preferences for temporary users after a certain time.
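Retrieval by key prefix relies on the preferences being held in a sorted map. Here is a minimal sketch of that lookup on a plain TreeMap; the PrefixLookup class is hypothetical and exists only for this example. Note that it needs an upper bound as well as a lower one: a tailMap alone would also return every key that merely sorts after the prefix.

```java
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical helper showing a prefix range query on a sorted map. */
final class PrefixLookup {
    private PrefixLookup() {}

    /** Returns a view of the entries whose keys start with keyPrefix. */
    static SortedMap<String, String> withPrefix(
            TreeMap<String, String> prefs, String keyPrefix) {
        // keys starting with keyPrefix sort between keyPrefix (inclusive)
        // and keyPrefix followed by Character.MAX_VALUE (exclusive)
        return prefs.subMap(keyPrefix, true, keyPrefix + Character.MAX_VALUE, false);
    }
}
```

With the keys a.b, a.b.c.d, a.b.c.d2 and x.y, the prefix a.b.c selects the two a.b.c.* entries and nothing else.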
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import org.apache.commons.io.FileUtils;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.persist.EntityCursor;
import com.sleepycat.persist.EntityStore;
import com.sleepycat.persist.PrimaryIndex;
import com.sleepycat.persist.SecondaryIndex;
import com.sleepycat.persist.StoreConfig;
/**
* DAO that uses an embedded Berkeley DB JE database (cached in memory, backed
* by files in the data directory) as its datastore.
*/
public class UserPrefsDao {
private static final Log logger = LogFactory.getLog(UserPrefsDao.class);
private String dataDirectory;
private long timeToLiveMillis = 24 * 60 * 60 * 1000; // 1 day
private Environment env;
private EntityStore store;
private PrimaryIndex<String,UserPrefsEntity> userPrefsByUserId;
// type parameters are <secondary key, primary key, entity>
private SecondaryIndex<Long,String,UserPrefsEntity> userPrefsByUpdatedMillis;
public void setDataDirectory(String dataDirectory) {
this.dataDirectory = dataDirectory;
}
public void setTimeToLiveMillis(long timeToLiveMillis) {
this.timeToLiveMillis = timeToLiveMillis;
}
protected void init() throws Exception {
File dataDir = new File(dataDirectory);
if (! dataDir.exists()) {
FileUtils.forceMkdir(dataDir);
}
EnvironmentConfig environmentConfig = new EnvironmentConfig();
environmentConfig.setAllowCreate(true);
environmentConfig.setTransactional(true);
env = new Environment(dataDir, environmentConfig);
StoreConfig storeConfig = new StoreConfig();
storeConfig.setAllowCreate(true);
storeConfig.setTransactional(true);
store = new EntityStore(env, dataDir.getName(), storeConfig);
userPrefsByUserId = store.getPrimaryIndex(String.class, UserPrefsEntity.class);
// the second argument is the class of the secondary key field (long -> Long)
userPrefsByUpdatedMillis = store.getSecondaryIndex(
this.userPrefsByUserId, Long.class, "updatedMillis");
}
protected void destroy() throws Exception {
if (store != null) {
store.close();
}
if (env != null) {
env.close();
}
}
/**
* Retrieve the preferences for the specified user.
* @param userId the userId.
* @return the preferences for the user, if it exists.
* @throws Exception if one is thrown.
*/
public Map<String,String> load(String userId) throws Exception {
UserPrefsEntity userPrefs = userPrefsByUserId.get(userId);
if (userPrefs == null) {
return Collections.emptyMap();
}
return userPrefs.getPrefs();
}
/**
* Retrieves a partial map of preferences for the specified user. This is
* useful when we want to partition the preferences across multiple applications,
* so each application only saves and uses a non-overlapping subset of the
* preferences.
* @param userId the userId.
* @param keyPrefix the preference key prefix, eg. language.dialect
* @return the partial Map of preferences. Only the keys which start with the
* specified keyPrefix will be returned.
* @throws Exception if one is thrown.
*/
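public Map<String,String> load(String userId, String keyPrefix) throws Exception {
// copy into a TreeMap so this also works when load() returns the empty map
TreeMap<String,String> allPrefs = new TreeMap<String,String>(load(userId));
// keys starting with keyPrefix sort between keyPrefix (inclusive) and
// keyPrefix followed by Character.MAX_VALUE (exclusive)
return allPrefs.subMap(keyPrefix, true, keyPrefix + Character.MAX_VALUE, false);
}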
/**
* Migrate the user's preferences to a permanent storage when he registers.
* Temporary preference values are stored for a configurable time, by default
* it is 1 day. However, once the user registers, his preferences are never
* expired.
* @param sourceUserId the temporary user id.
* @param targetUserId the permanent user id.
* @return the preferences for the target user id.
* @throws Exception if one is thrown.
*/
public Map<String,String> migrate(String sourceUserId, String targetUserId)
throws Exception {
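UserPrefsEntity sourceEntity = userPrefsByUserId.get(sourceUserId);
if (sourceEntity == null) {
// no temporary preferences were collected, so there is nothing to copy
return load(targetUserId);
}
// save under the new id first so a failure cannot lose the preferences
Map<String,String> migrated = save(targetUserId, sourceEntity.getPrefs());
logger.debug("Deleting temp user:" + sourceUserId);
userPrefsByUserId.delete(sourceUserId);
return migrated;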
}
/**
* Save the user preferences. The map of preferences passed in can be partial
* or full. Only the preference values provided will be updated, the rest will
* remain untouched.
* @param userId the user id.
* @param values the Map of preferences.
* @return the updated map.
* @throws Exception if one is thrown.
*/
public Map<String,String> save(String userId, Map<String,String> values)
throws Exception {
// merge into the existing entity, if any, so untouched preferences survive
UserPrefsEntity entity = userPrefsByUserId.get(userId);
if (entity == null) {
entity = new UserPrefsEntity(userId);
}
entity.getPrefs().putAll(values);
entity.setUpdatedMillis(System.currentTimeMillis());
logger.debug("Saving prefs for userId:" + userId);
userPrefsByUserId.put(entity);
return entity.getPrefs();
}
/**
* Used for one time load of the existing data. Will probably never be used
* after that.
* @param data the Prefs data from the old system.
* @throws Exception if one is thrown.
*/
public void saveAllPrefs(Map<String,Map<String,String>> data) throws Exception {
for (Map.Entry<String,Map<String,String>> entry : data.entrySet()) {
save(entry.getKey(), entry.getValue());
}
}
/**
* Used by backend scheduled job to expire temporary (non-registered user)
* preferences. The cutoff time is the time specified in the call to
* expire. Any entries which are older than millisSinceEpoch - timeToLiveMillis
* will be expired.
* @param millisSinceEpoch the current time in milliseconds since epoch.
* @throws Exception if one is thrown.
*/
public void expire(long millisSinceEpoch) throws Exception {
long cutoff = millisSinceEpoch - timeToLiveMillis;
List<String> userIdsToDelete = new ArrayList<String>();
EntityCursor<UserPrefsEntity> userPrefsCursor = null;
try {
userPrefsCursor = userPrefsByUpdatedMillis.entities();
for (UserPrefsEntity userPrefs : userPrefsCursor) {
long updatedMillis = userPrefs.getUpdatedMillis();
if (updatedMillis < cutoff) {
String userId = userPrefs.getUserId();
if (userId.startsWith("t-")) {
userIdsToDelete.add(userId);
}
} else {
// all entries will have been updated after the cutoff
break;
}
}
} finally {
if (userPrefsCursor != null) {
userPrefsCursor.close();
}
}
for (String userIdToDelete : userIdsToDelete) {
logger.debug("Deleting expired user:" + userIdToDelete);
userPrefsByUserId.delete(userIdToDelete);
}
}
}
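I created a JUnit test to exercise this class, which I show below to illustrate usage. Because the test is not Spring-enabled, I use @BeforeClass and @AfterClass methods to call the DAO's init() and destroy() methods. Note that the tests share one store and build on each other's data, so they assume they run in the order shown, which JUnit 4 does not guarantee. The rest of it is pretty self-explanatory.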
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import org.junit.Assert;
import org.apache.commons.io.FileUtils;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;
public class UserPrefsDaoTest {
private static final Log logger = LogFactory.getLog(UserPrefsDaoTest.class);
private static UserPrefsDao dao;
@BeforeClass
public static void setUpBeforeClass() throws Exception {
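// start from a clean store; unlike forceDelete(), deleteDirectory() does
// not fail when the directory does not exist yet
FileUtils.deleteDirectory(new File("/tmp/UserPrefs"));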
dao = new UserPrefsDao();
dao.setDataDirectory("/tmp/UserPrefs");
dao.setTimeToLiveMillis(0L);
dao.init();
}
@AfterClass
public static void tearDownAfterClass() throws Exception {
dao.destroy();
}
@Test
public void testSavePrefs() throws Exception {
// save a temp user
Map<String,String> value1 = new HashMap<String,String>();
value1.put("a.b.c.d", "14.0");
value1.put("a.b.c.d2", "16.0");
value1.put("a.b", "false");
dao.save("t-1234", value1);
Assert.assertNotNull(dao.load("t-1234"));
// save a perm user
Map<String,String> value2 = new HashMap<String,String>();
value2.put("x.y.z.a", "234");
value2.put("x.y", "true");
value2.put("x.y.z.1", "123");
dao.save("12345678", value2);
Assert.assertNotNull(dao.load("12345678"));
// save another temp user
Map<String,String> value3 = new HashMap<String,String>();
value3.put("x.y.z.a", "986");
value3.put("x.y.a", "true");
value3.put("x.y.z.1", "234");
dao.save("t-2345", value3);
Assert.assertNotNull(dao.load("t-2345"));
}
@Test
public void testRetrieve() throws Exception {
// get back the first temp user
Map<String,String> rvalues1 = dao.load("t-1234");
logger.debug("retrieved values for t-1234:" + rvalues1.toString());
Assert.assertNotNull(rvalues1);
// get back a perm user
Map<String,String> rvalues2 = dao.load("12345678");
logger.debug("retrieved values for 12345678:" + rvalues2.toString());
Assert.assertNotNull(rvalues2);
}
@Test
public void testRetrieveInvalidUser() throws Exception {
// try to get a user with incorrect id, should return empty map
Map<String,String> ivalues1 = dao.load("23456789");
logger.debug("retrieved values for invalid user 23456789:" + ivalues1.size());
Assert.assertNotNull(ivalues1);
Assert.assertEquals(0, ivalues1.size());
}
@Test
public void testPropertySubsetRetrieval() throws Exception {
// try to get a subset of properties for a user
Map<String,String> svalues1 = dao.load("t-1234", "a.b.c");
logger.debug("retrieved values for t-1234 for a.b.c:" + svalues1.toString());
Assert.assertNotNull(svalues1);
Assert.assertEquals(2, svalues1.size());
}
@Test
public void testInvalidPropertySubsetRetrieval() throws Exception {
// try to get an invalid subset of properties for a user, should return empty map
Map<String,String> svalues1 = dao.load("t-1234", "x.y.z");
logger.debug("retrieved values for t-1234 for x.y.z:" + svalues1.toString());
Assert.assertNotNull(svalues1);
Assert.assertEquals(0, svalues1.size());
}
@Test
public void testMigrate() throws Exception {
// migrate the t-2345 user to perm user 23456789
Map<String,String> mvalues1 = dao.load("t-2345");
Map<String,String> mvalues2 = dao.migrate("t-2345", "23456789");
logger.debug("migrate source values (t-2345):" + mvalues1.toString());
logger.debug("migrate target values (23456789):" + mvalues2.toString());
Assert.assertNotNull(mvalues2);
Assert.assertEquals(mvalues1.size(), mvalues2.size());
}
@Test
public void testExpire() throws Exception {
// expire prefs, temp users (only) should be deleted
dao.expire(System.currentTimeMillis());
Map<String,String> rvalues1 = dao.load("t-1234");
Assert.assertNotNull(rvalues1);
Assert.assertEquals(0, rvalues1.size());
Map<String,String> rvalues2 = dao.load("12345678");
Assert.assertNotNull("User 12345678 should have non-null prefs", rvalues2);
Assert.assertEquals(3, rvalues2.size());
}
}
I was quite pleasantly surprised by the BerkeleyDB DPL. BerkeleyDB does not have much of a following in the Java community, perhaps because it is perceived as difficult to use. The annotation-based persistence mechanism provided by the DPL goes a long way toward alleviating this problem. There are many situations where BerkeleyDB would be a great fit, and with the DPL it is easier to apply. Hopefully, this example illustrates how easy it is to use BerkeleyDB to solve real-life business problems.
On a personal note, when annotations were introduced in Java 1.5, I did not like them that much. I started using @Override, @SuppressWarnings, etc. because Eclipse would provide them as suggestions, then I started to use the Spring @Required annotation, then the various JUnit 4.0 annotations, and now the DPL annotations. I still don't know much about how annotations work, but I seem to be pretty much hooked on them now.