Saturday, January 28, 2006

Using rsync with Eclipse

I use Eclipse, the free, open-source IDE (Integrated Development Environment) from the Eclipse Foundation, with an extensive plug-in suite from MyEclipse.com. Eclipse by itself is good for Java development, but lacks sufficient tools for JSP and XML file editing (although this is changing thanks to the Eclipse WTP (Web Tools Platform)), which is where MyEclipse fills in. At my workplace, we have a corporate license for the IDEA IDE from JetBrains, and most people use IDEA, so I am one of the few odd ones out, which also means that I have to roll my own solutions when it comes to IDEA-ish features that are not supported by Eclipse.

I will concede that IDEA's web application support for JSP, CSS and JavaScript editing is pretty fantastic, and Eclipse (or MyEclipse) does not even come close, although I am told that the WTP may change this. One major reason I use Eclipse is that I started using it before IDEA (I was a vim user when the early adopters at work were banging away at IDEA), so I am just more familiar with Eclipse's keyboard shortcuts and general layout, and switching to IDEA represents a steep learning curve for me. The other reason is that IDEA is a commercial product, and since I do some open-source work on my own computer and on my own time, switching back and forth between the two IDEs was not an appealing prospect for me.

Anyway, one thing I have always liked about IDEA was its Unison plugin. Unison is a GUI tool that allows you to sync up code from your desktop to the central minicomputer where your application server resides. Personally, I have no use for this at work, because I develop on a shared NFS mount from my local Linux workstation, and that mount is also visible from any machine on the network. For Windows users, there is a similar shared Samba mount, but the mount is quite slow, so that is not a real option.

I do have a use for this functionality, however, when working with Eclipse from home on my laptop, over a VPN connection. For a while, I tried opening a VNC remote desktop session on my workstation (using the RealVNC software on both ends) from my laptop, but full-screen refreshes are slow enough to make you wish you had switched back to using vim for remote editing, especially for heavy-duty coding against a tight deadline.

The only real alternative I found at the time was to transfer the files to my laptop over the VPN, work on them locally, and transfer them back up to my workstation at work. My first approach was to tar the directory on the remote machine, use scp to copy over the tarball, unpack it locally, work on the files, tar them back up, and scp the tarball over to the remote machine. This is quite a bit of overhead if you are downloading and uploading 400+ files to make a single change and see if your change worked.

I looked at Unison working on a co-worker's Apple PowerBook which he used at work, and it seemed to me that what Unison was really using under the covers was rsync. Of course, I was only partially correct: Unison does have features that go beyond what rsync does, but that is how I got the idea of using rsync to sync up files between my laptop and my desktop at work over VPN. I am ashamed to admit that I did not even look at the Unison site at the time, thinking that it must be some closed-source freeware with very tight bindings to IDEA, and consequently of no use to me as an Eclipse user. I know now that Unison is free and open source, and released under the GNU General Public License.

In any case, I was learning Python at the time, so I thought it would be cool to wrap the rsync call in a Python script which exposed only a very simple interface, shown below:

sync.py up|down webapp-name

The "up" would sync the files up from my local laptop to my workstation (which is conveniently located on an NFS mount, so the files would be instantly available for working with the web application). The "down" would sync files down from my desktop at work to my laptop. The webapp-name would name the web application I would be working on at the time. This takes advantage of the nearly identical directory structures I have set up on my desktop at work and my laptop. The exact mapping between the two locations is embedded within the script, since I don't foresee changing it any time soon. Here's the script:

#!/usr/local/bin/python2.4
# $Id: sync.py,v 1.1 2005/11/25 17:44:24 sujit Exp sujit $
# $Source: /home/sujit/bin/python/RCS/sync.py,v $
# Called from within Eclipse when trying to rsync between laptop and desktop
#
import os
import sys

# CONFIG - start of configuration
# Trailing slash is required: app_name is appended directly to this path
LOCAL_BASEDIR = "/home/remoteuser/src/company/"
REMOTE_BASEDIR = "nfs-mount/head/"
REMOTE_HOST = "remoteuser@remotedesk.company.com"
RSYNC_COMMAND = "/usr/bin/rsync -Cavzu --stats --progress"
# CONFIG

if (len(sys.argv) != 3):
   print "sync.py up|down ${app.name}\n"
   sys.exit(-1)
  
rsync_mode = sys.argv[1]
app_name = sys.argv[2]

print "Connecting to " + REMOTE_HOST + ":" + REMOTE_BASEDIR + "..."
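if (rsync_mode == "up"):
   args = [RSYNC_COMMAND, LOCAL_BASEDIR + app_name + "/*", REMOTE_HOST + ":" + REMOTE_BASEDIR + app_name]
elif (rsync_mode == "down"):
   args = [RSYNC_COMMAND, REMOTE_HOST + ":" + REMOTE_BASEDIR + app_name + "/*", LOCAL_BASEDIR + app_name]
else:
   # Reject anything other than "up" or "down" instead of silently syncing down
   print "sync.py up|down ${app.name}\n"
   sys.exit(-1)
print "Executing " + " ".join(args)
os.system(" ".join(args))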

The CONFIG block within the script specifies the mapping between the location of the files on the local laptop and on the remote desktop. The LOCAL_BASEDIR is where I put all my company code when I have to work on it, under src/company. The REMOTE_BASEDIR is where I mount the NFS file system under my home directory on my workstation at work. Under that is all the code I am working on. The REMOTE_HOST is the remote user name, followed by an @ sign, followed by the full machine name of the desktop. Finally, the RSYNC_COMMAND is the command we will use to do the upload or download, with all the required options. See the man page (man rsync) for details about these options.
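
As a rough sketch of the same mapping, the snippet below builds the rsync command as an argument list instead of one concatenated string. It is written for Python 3 rather than the Python 2.4 the script above targets, and the function name build_rsync_args is made up for illustration. It also assumes a trailing slash on the source directory in place of the "/*" glob, which tells rsync to copy the directory's contents (including top-level dotfiles, which the glob would skip).

```python
import posixpath
import shlex

# Values mirroring the CONFIG block above; adjust for your own machines.
LOCAL_BASEDIR = "/home/remoteuser/src/company"
REMOTE_BASEDIR = "nfs-mount/head"
REMOTE_HOST = "remoteuser@remotedesk.company.com"
RSYNC = ["/usr/bin/rsync", "-Cavzu", "--stats", "--progress"]


def build_rsync_args(mode, app_name):
    """Return the rsync argument list for an 'up' or 'down' sync (hypothetical helper)."""
    if mode not in ("up", "down"):
        raise ValueError("mode must be 'up' or 'down', got %r" % mode)
    # A trailing slash on an rsync source means "the contents of this directory".
    local = posixpath.join(LOCAL_BASEDIR, app_name) + "/"
    remote = "%s:%s/" % (REMOTE_HOST, posixpath.join(REMOTE_BASEDIR, app_name))
    src, dst = (local, remote) if mode == "up" else (remote, local)
    return RSYNC + [src, dst]


if __name__ == "__main__":
    args = build_rsync_args("up", "webapps/your-webapp-name")
    print(" ".join(shlex.quote(a) for a in args))
    # To actually run the sync, pass the list to subprocess.call(args).
```

Passing a list to subprocess avoids going through a shell, so paths with spaces need no extra quoting.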

The next step is hooking this up from Eclipse. This is fairly easy: click on Run::External Tools::External Tools. This will bring up an External Tool configuration popup. You will need to configure two entries for each web application you are working on. Fill in the Location field with the full path to the sync.py file (you will need to make this executable if you are on Unix). In the arguments, put "up webapps/your-webapp-name" for the sync-up version and "down webapps/your-webapp-name" for the sync-down version. This will create two entries under the Run::External Tools menu option. You will need to sync down before you start making modifications, and sync up when you are done.

Does this work as well as working locally on the NFS mount? No, not really, since there is one additional step to remember to do when working over the VPN. When working locally, you would make your changes and save in your Eclipse IDE, then push the files over to your development application server using an Ant script or calling the Ant task from within Eclipse. When working remotely, you will need to have a terminal window open on the remote machine and run the Ant task from that machine. Since I rarely use Ant tasks from within Eclipse, for me the extra step is really the "remembering to rsync up" step.

Monday, January 16, 2006

Windows XP - Reinstall and Network

Last weekend, I reinstalled the Windows XP OS on my wife's laptop. I have pretty much given up using Windows since about Windows 2000, when my employer allowed me to switch to a Linux desktop. My laptop had already been converted to Red Hat 9.0 a few months before that, and now runs Fedora Core 2. I do have a secondary Windows XP Professional PC at work, although all I use it for is to rdesktop to it about once a month to check if some JavaScript code works on Microsoft Internet Explorer. At home, too, we used to be more of a Linux than a Windows household, with 2 Linux boxes and 1 Windows box, which changed with the recent purchase of a Dell Windows desktop for the kids to play games.

Well, anyway, back to the re-install. The laptop is a Dell Inspiron 1100, purchased new in 2002 for my wife when she decided to go back to school to get her business degree. Dell provided a one-year trial subscription for McAfee's anti-virus, which I did not renew at the end of that time. So by the time I got to reinstalling the OS in 2005, it was pretty much dripping with viruses, adware and spyware. Fortunately, we never used this computer for financial or other internet transactions, so the most the spyware picked up were Google links that my wife went to while doing her research.

Which brings me to a philosophical question. I understand how spyware can create zombie networks out of random computers on the internet and launch denial of service attacks or break cryptos for malicious purposes, and I am sure that this computer participated unknowingly in these, but what of the adware that pops up with "helpful" information to refinance your home and fix your bad credit? Over the last few months, the computer would routinely open 20 or more Internet Explorer windows on its own, slowing itself down to a crawl. Not surprisingly, my feelings are not too favorable towards the companies marketing to me in this manner, and I am sure that was not those companies' intent when they signed up for service with the adware vendor.

The last time I tried to re-install Windows 95 on my old IBM Aptiva, the restore disk did not work. The machine was out of warranty by the time I got around to doing the restore, and I had added RAM and disk to it, so I ended up installing Red Hat Linux 9.0 on it. My first attempt in this case was to try to restore using the Dell application software CD, which did not boot. I then found that Dell provides a way to burn a bootable operating system CD using software preloaded on the PC. Not as good an idea as it sounds, especially when your PC is so infected that even pulling up your browser and going to a specific website is a 10 minute project in itself. Thankfully, I finally found the Windows CD that Dell had supplied with this PC. I had specifically requested this disk at additional cost when ordering the PC. I think that companies should provide the Windows CD by default. It's the honest thing to do; after all, we are paying extra for the operating system, so we should get it. Supplying non-working restore disks and/or making the whole process of making one so complicated is one reason why customers end up dumping PCs which could have been restored to their previous working state. Keeping those PCs in service would be not so good for hardware/software manufacturers, and arguably for the economy, but probably better for the environment and customers' pocketbooks.

Booting with the Windows OS disk and the installation was fairly painless. It's just a matter of repartitioning the hard drive to one big NTFS filesystem (the Dell folks had a 9MB FAT partition which I blew away), and installing the drivers for the various peripherals (a Lexmark Z25 printer and a Netgear wireless card). The first thing I did this time was to install the McAfee anti-virus, firewall and privacy manager. These are free (or built into the price of monthly high speed internet access) for Comcast customers, so I did not have to shell out for these. I only wish I had known about the free offer sooner; then I wouldn't have had to reinstall in the first place.

Re-installing the Netgear wireless card involved having to set the SSID of my Belkin 802.11b wireless access point that hangs off one of the ports on my Netgear 4-port router. Since I use WEP, I also had to set the network key. I also use MAC address authentication, but that was not an issue since I was using the same PCI card that I had before. Fortunately, I had the SSID and WEP key written down in my Belkin user manual; I would have had a hard time figuring out the WEP encryption key otherwise.

Installing the Lexmark Z25 printer was a little more involved. I had to download the driver from the internet since I had misplaced the driver disk that came with the printer. Google was my friend in this case, and I finally found the version I required from the good folks at Soft32.com.

Other software I had to reinstall was Microsoft Office. I took this opportunity to not install the free stuff that came preinstalled with the original PC, such as Microsoft Money (I use GnuCash on my Linux box), and AOL and Earthlink free offers. I also installed Firefox, since my wife is quite dependent on its tabbed browsing feature, and only goes back to MSIE for sites which require it, such as her university website.

My wife had pretty much given up working on the laptop for the past few weeks, which meant that I either had to copy over all the files she had created on the kids' desktop during this time, or just set up networking between the two Windows XP machines. I did the latter, since that was on my list of things to do, and now was as good a time as any.

To do this, I had to create a network connection on the laptop using the control panel and build the netsetup disk. I found this page from the geekgirls site very helpful in doing this. I then installed the netsetup disk on the Windows XP desktop machine. The share on the desktop appeared on the list of network shares available on the laptop automatically. However, I could not access the share. It turns out that Windows XP creates a bridge network between the two machines, which effectively turns off any network access. Deleting the bridge device and manually mapping a remote drive to the network share allowed me to access the files on the desktop from the laptop. It's important to remember not to have the mapping happen on login, since the desktop will not always be on, and the boot process will hang or have errors.

Next steps? Get my Linux laptop talking to the Windows boxes, resurrect my old IBM Aptiva running Red Hat 9.0 as a headless Debian box, and make it a Samba print and file server accessible from both my Linux and Windows boxes. Not going to happen for a while though, since I have too much going on at the moment, but I will write about it when I do.

Sunday, January 01, 2006

Version Control with RCS

We Pragmatic Programmers know that putting our source code under version control is important. I use CVS for version control both at work and for my open source stuff at SourceForge. But there are times when I work on my laptop without checking in for days or weeks at a time. At these times, I rely on RCS, a pessimistic version control system that ships as part of the Unix (and Linux) operating systems. By pessimistic, I mean that a checkout will lock the file, which cannot be checked out by someone else until the file is checked back in. This is in contrast to the optimistic style favored by CVS and Subversion, where multiple authors can check out the same file at the same time, and any reconciliation is done at the time the changes are checked in.
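
To make the pessimistic model concrete, here is a toy sketch in Python 3 of lock-on-checkout. It is only an illustration of the idea, not how RCS is implemented, and the class and method names are invented for this example.

```python
class PessimisticRepo:
    """Toy model of lock-on-checkout; not how RCS itself is implemented."""

    def __init__(self):
        self.locks = {}  # filename -> user currently holding the lock

    def checkout(self, filename, user):
        """Lock the file for user, failing if someone else holds the lock."""
        holder = self.locks.get(filename)
        if holder is not None and holder != user:
            raise RuntimeError("%s is locked by %s" % (filename, holder))
        self.locks[filename] = user

    def checkin(self, filename, user):
        """Release the lock; only the current lock holder may check in."""
        if self.locks.get(filename) != user:
            raise RuntimeError("%s does not hold the lock on %s" % (user, filename))
        del self.locks[filename]


if __name__ == "__main__":
    repo = PessimisticRepo()
    repo.checkout("MyFile.java", "alice")
    try:
        repo.checkout("MyFile.java", "bob")  # refused while alice holds the lock
    except RuntimeError as err:
        print(err)
    repo.checkin("MyFile.java", "alice")
    repo.checkout("MyFile.java", "bob")      # succeeds once the lock is released
```

In the optimistic style, both checkouts would succeed and the conflict, if any, would surface only when the second author checks in.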

Unlike CVS or Subversion, RCS does not need a server component, so there is no additional daemon to run on your laptop. Simply create an RCS subdirectory under the directory you want to put under version control, and you can use the standard RCS commands to put files under version control. To check in a file for the first time, use the command:

ci -l MyFile.java

and this will create a version-controlled file RCS/MyFile.java,v. The ci (check-in) is slightly misnamed in this example, since with the -l option it also locks the file and checks it back out in the same command. There are other commands like rcsdiff, rlog and ident, which let you see the differences between two revisions of a file, display a log of all commit comments, and display the ident information for a file, respectively. For a complete list, check out:

man ci

and follow the links to the other commands.

One thing that was holding me back from using RCS was the need to remember an entirely new set of commands for version control. I have been using CVS for so long that I have developed finger memory. Also it was not clear to me how to apply a command to a hierarchy of files. Since I develop mostly in Java, and Java code is organized into a hierarchy of packages, this is important functionality to me.

For these reasons, I developed a script in Python which provides me functionality which is fairly close to that in CVS. Here is the help output from the script:

Usage: rcstool.py [add|diff|log|commit|help] [-m=comment] [filename]
 add [dirname] - puts a new directory under RCS control. If no
                 dirname is specified, then puts the current directory
                 under RCS control
 diff [filename] - Reports differences between all files under
                 the current directory and the RCS version, or if a filename
                 is specified, the actual differences between the file and
                 its RCS version
 log filename - Prints the RCS log for the specified file
 commit -m=comment [filename]+ - Specifies a list of files that
                 needs to be checked into RCS. The comment to put in all
                 files is specified by preceding with -m=. Note that multi-
                 word comments should be enclosed in quotes
 help - print this message

The add is analogous to the CVS add subcommand, the diff to cvs -nq update (when no filename is supplied) or cvs diff (when a filename is supplied), the log to cvs log, and the commit to cvs commit. More importantly, when no filename is supplied, diff traverses the directory tree from the current directory and checks each file it encounters.
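
The traversal can be sketched in Python 3 with os.walk. This is not the script's own approach: it is a variation that assumes each directory keeps its archives in a sibling RCS subdirectory as RCS/<name>,v, and it yields only the working files that have such an archive. The function name iter_controlled_files is made up for illustration.

```python
import os


def iter_controlled_files(root):
    """Yield working files under root that have an archive in a sibling RCS directory."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune RCS directories in place so os.walk does not descend into them.
        if "RCS" in dirnames:
            dirnames.remove("RCS")
        dirnames.sort()  # visit subdirectories in a predictable order
        for name in sorted(filenames):
            if name.endswith(",v"):
                continue  # an archive stored next to its working file
            archive = os.path.join(dirpath, "RCS", name + ",v")
            if os.path.isfile(archive):
                yield os.path.join(dirpath, name)


if __name__ == "__main__":
    for path in iter_controlled_files(os.getcwd()):
        print(path)
```

Each yielded path could then be handed to rcsdiff or ci, which avoids running the RCS command on files that were never checked in.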

Here is the script (rcstool.py) for those interested. You will notice that rcstool.py has been put under RCS version control as well :-). You will need the Python interpreter installed to run this. Most Red Hat Linux distributions will have this pre-installed, since Python is used for writing and running the sysadmin GUI tools.

#!/usr/bin/python
# $Id: rcstool.py,v 1.3 2005/12/24 01:01:24 sujit Exp sujit $
# $Source: /home/sujit/bin/python/RCS/rcstool.py,v $
"""
Provides a CVS like wrapper for RCS with common commands. Please make sure
to modify the commands RLOG, RCSDIFF and COMMIT to conform to your local
system.
"""
import sys
import os
 
# Specify locations of the various commands you want to use here
RLOG = "/usr/bin/rlog"
RCSDIFF = "/usr/bin/rcsdiff"
COMMIT = "/usr/bin/ci -l"
 
def main():
    """
    This is how we are called.
    """
    if (len(sys.argv) == 1 or sys.argv[1] == "help"):
        help()
    elif (sys.argv[1] == "add"):
        adddir = os.getcwd()
        if (len(sys.argv) == 3):
            adddir = sys.argv[2]
        add(adddir)
    elif (sys.argv[1] == "diff"):
        filename = "*"
        if (len(sys.argv) == 3):
           filename = sys.argv[2]
        diff(filename)
    elif (sys.argv[1] == "log"):
        if (len(sys.argv) != 3):
            help()
        log(sys.argv[2])
    elif (sys.argv[1] == "commit"):
        if (len(sys.argv) < 4):
            help()
        comment = sys.argv[2]
        if not comment.startswith("-m="):
            help()
        filenames = sys.argv[3:]
        commit(comment, filenames)
    else:
        help()
 
def add(adddir):
    """
    Creates an RCS directory under the specified directory. If there is
    already an RCS directory, it prints an error message and returns.
    """
    rcsdir = os.path.join(adddir, "RCS")
    if (os.path.isdir(rcsdir)):
        print "This directory is already under RCS control"
    else:
        os.mkdir(rcsdir)
 
def diff(filename):
    """
    Reports on diffs between the actual and RCS version. If no filename is
    supplied, it will list the files that are different from the RCS version.
    If the filename is supplied, then the actual differences from RCS are
    displayed for that file.
    """
    if (filename == "*"):
        visitFiles(os.getcwd(), RCSDIFF, 0)
    else:
        diff = os.popen(" ".join([RCSDIFF, filename]), 'r')
        for result in diff.readlines():
            # Trailing comma: each line already ends with its own newline
            print result,
        diff.close()
 
def log(filename):
    """
    Prints the RCS log for the specified filename
    """
    log = os.popen(" ".join([RLOG, filename]), 'r')
    for result in log.readlines():
        # Trailing comma: each line already ends with its own newline
        print result,
    log.close()
 
def commit(comment, filenames):
    """
    Commits a list of supplied filenames, with the appropriate commit comment.
    Our preferred mode of checking in is "ci -l", which locks the file again
    after check-in, so it's always writable.
    """
    # ci expects the message directly after -m (no "="), so strip our "-m=" prefix
    commitMessage = "-m\"" + comment[3:] + "\""
    command = [COMMIT, commitMessage]
    for filename in filenames:
        command.append(filename)
    commit = os.popen(" ".join(command), 'r')
    for result in commit.readlines():
        # Trailing comma: each line already ends with its own newline
        print result,
    commit.close()
 
def visitFiles(root, operation, level):
    """
    Generic recursive directory walk. Runs the operation on each file
    encountered in the walk and prints the path of every file for which the
    operation writes something to stdout. With rcsdiff as the operation, that
    means only the files which differ from their RCS version are listed.
    RCS directories (which hold the ",v" files) are skipped.
    """
    for filename in os.listdir(root):
        fullpath = os.path.join(root, filename)
        if (os.path.isfile(fullpath)):
            oper = os.popen(" ".join([operation, fullpath, "2>/dev/null"]), 'r')
            numlines = 0
            for result in oper.readlines():
                numlines = numlines + 1
            if (numlines > 0):
                print fullpath
            oper.close()
        else:
            if filename == "RCS":
                continue
            visitFiles(fullpath, operation, level + 1)
 
def help():
    """
    Simple usage help text that prints and exits.
    """
    print "Usage: rcstool.py [add|diff|log|commit|help] [-m=comment] [filename]"    
    print "  add [dirname] - puts a new directory under RCS control. If no"
    print "       dirname is specified, then puts the current directory"
    print "       under RCS control"
    print "  diff [filename] - Reports differences between all files under"
    print "       the current directory and the RCS version, or if a filename"
    print "       is specified, the actual differences between the file and"
    print "       its RCS version"
    print "  log filename - Prints the RCS log for the specified file"
    print "  commit -m=comment [filename]+ - Specifies a list of files that"
    print "       needs to be checked into RCS. The comment to put in all "
    print "       files is specified by preceding with -m=. Note that multi-"
    print "       word comments should be enclosed in quotes"
    print "  help - print this message"
    sys.exit(-1)
 
if __name__ == "__main__":
    main()

Update: I don't use this anymore. As pragmatic as it seemed when I started writing this, it turned out to be too inconvenient to learn another set of commands.