Saturday, August 11, 2012

Learning Scala ... again

Recently, I read that Java (my primary language and what Steve Yegge calls "your father's language" :-)) was borrowing features from Scala for its upcoming version 8 release. This is obviously good news for Java, but as one of the commenters on the article pointed out, Java programmers now have two choices: either wait for Scala features to trickle into Java and then figure out how to use them, or learn Scala now in anticipation of these features arriving in Java. And once you learn Scala, why not just start using it instead of Java?

I looked at Scala some three years ago (see here, here and here), when I was experimenting with its Actor model. Although I liked the language at the time, the audience seemed to be more of language enthusiast types rather than application programmer types. Plus, there was almost no supporting ecosystem (frameworks, tooling, etc) for Scala. So I decided to set it aside for a while and come back to it once it got a bit more mature.

Fast forward to 2012, and Scala has come a long way. Scala always had a dedicated and smart community of developers, but now it is being increasingly adopted by quite a few big name companies. There is much more (application programmer friendly) documentation and better frameworks and tooling around. So, all in all, a good time to learn and start working in Scala for me.

I don't anticipate using Scala at work, at least not in the immediate future, but I figured it may be good to start using it for my own stuff instead of Java - that way, I can start putting in my 10,000 hours towards Scala proficiency. So this post is mostly about my experience picking up Scala again, and about setting up a standard Scala project with the Typesafe software stack.

The last time I attempted to learn Scala, I used the Odersky book, the first and at that time the only book on Scala. This time round, almost coincidentally (or through some very effective contextual ad targeting), I came across Cay Horstmann's Scala for the Impatient (SFTI) book, which helped me pick up a working knowledge of Scala in about 1.5 weeks.

The SFTI book is written for Java programmers rather than novices, so it assumes that you know the basic stuff. At the same time, the focus is on doing things (by example) in Scala that you can do (either poorly or not at all) in Java. Each book chapter (and sometimes section) is annotated with the Scala expertise levels (A1 to A3 for Application developers, L1 to L3 for Library developers), so you can decide what proficiency level you want to aim for initially (A2/L1 for me), and not feel too guilty or waste too much time if you don't fully understand some concept or can't solve an exercise problem above that level. All in all, a book geared to get you writing useful Scala code as quickly as possible.

It's a bit of a no-brainer, but speaking of exercises, don't skip them. They help reinforce the concepts you've learned, and by the end of the book, you will be the proud owner of 150 or so machine searchable (grep) and potentially reusable Scala code snippets that you have written (and therefore understand intimately). Also do yourself a favor and download ScalaConsole. It's a JAR file which you invoke as "scala /path/to/scalaconsole.jar", and it provides a GUI which is much nicer to edit code in than the Scala REPL. Another advantage of doing the exercises is that your mind learns better by doing than by seeing, so you are better prepared when the time comes to write real code.

So anyway, after you go through the book and have learned enough Scala to be comfortable striking out on your own, it's time to set up a Scala project and your IDE to work comfortably with Scala code. I chose the Typesafe Stack consisting of Scala, sbt (originally short for "Simple Build Tool", and the usual build tool for Scala projects) and giter8 (to generate the project). In any case, to create a new project:

sujit@cyclone:LearnScala$ g8 typesafehub/scala-sbt

Scala Project Using sbt 

organization [org.example]: com.mycompany
name [Scala Project]: hello-world
scala_version [2.9.2]: 
version [0.1-SNAPSHOT]: 

Applied typesafehub/scala-sbt.g8 in hello-world

This creates a standard sbt-enabled Scala project similar in structure to one created by Maven. Like Maven, your source code resides under src/main/scala and your unit tests reside under src/test/scala. Unlike Maven, your build is customized by a file of key-value pairs called build.sbt in the project's root directory. There is also a subdirectory named project which contains generated Scala code for a default build, which you can change if you want to customize the build.
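To make the "file of key-value pairs" concrete, here is a minimal sketch of what a build.sbt for the project above could contain. The values are taken from the giter8 prompts shown earlier; the file that the template actually generates may contain additional settings, so treat this as an illustration rather than the exact output.

```scala
// build.sbt (illustrative sketch; the generated file may differ)
// Each setting is a key := value expression. sbt versions of this era
// require a blank line between settings.

name := "hello-world"

organization := "com.mycompany"

version := "0.1-SNAPSHOT"

scalaVersion := "2.9.2"
```

Although it reads like a properties file, each line is a Scala expression, which is why string values are quoted.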

The standard tasks in sbt are similar to those in Maven. A list of common commands can be found in the sbt Getting Started Guide.

The next step is to install the ScalaIDE plugin into MyEclipse. I did this using the update site and everything worked fine.

The final step is to generate the Eclipse .classpath and .project files. There is a sbteclipse plugin from Heiko Seeberger which works without problems (unlike the earlier sbteclipsify which I couldn't get working after multiple tries). Simply add the following line to your $HOME/.sbt/plugins/build.sbt:

addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "2.1.0")

Now run "sbt eclipse" in your project directory. This will create the .classpath and .project files. Note that as your project's library dependencies change, you can simply update your project's build.sbt and rerun "sbt eclipse" to regenerate the Eclipse files.
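As an illustration of that workflow, a dependency line in the project's build.sbt might look like the sketch below. The ScalaTest coordinates and version here are assumptions chosen only to show the syntax; substitute whatever library your project actually needs.

```scala
// build.sbt fragment (illustrative; the library and version are assumptions)
// Format: "groupId" %% "artifactId" % "version" % "configuration"
// The %% operator appends the Scala version to the artifact name.

libraryDependencies += "org.scalatest" %% "scalatest" % "1.8" % "test"
```

After saving the change, rerunning "sbt eclipse" should pick up the new dependency and add it to the generated .classpath.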

Finally, you can open up the Scala project in Eclipse. Unlike three years ago, when Eclipse/Scala integration was completely unusable, this time around it's actually quite nice. It flags syntax errors, and has decent (still not as nice as Java, but usable) code completion. You can compile, run and debug Scala code from within the IDE.
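If you want something small to try the compile, run and debug cycle on, a sketch like the following should do. The file name and the object names (Greeter, Hello) are made up for this example and are not part of the generated project.

```scala
// src/main/scala/Hello.scala (hypothetical file name)

object Greeter {
  // A pure function, convenient for setting a breakpoint and stepping through
  def greet(name: String): String = "Hello, " + name + "!"
}

object Hello {
  def main(args: Array[String]): Unit = {
    // Use the first command line argument if present, else a default
    val name = if (args.isEmpty) "World" else args(0)
    println(Greeter.greet(name))
  }
}
```

From the command line, "sbt run" in the project directory should execute the same main method.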

4 comments (moderated to prevent spam):

Jonathan said...

I think that learning Scala is a good choice for a Java programmer right now. About a year ago, I (a PHP developer) started working with Haskell, which is, like Scala, a functional language, but, unlike Scala, doesn't run on the JVM. Haskell has revolutionized the way I think about programming, when I realized that minimizing the number of variable changes (based on Haskell's variable immutability) could dramatically lower the number of bugs that I would have to deal with. Overall I think that Haskell has made me a better programmer and I think that Scala could do the same thing.

Sujit Pal said...

Yes, I agree. I have already started using Scala in my personal projects. After reading the SFTI book, I chanced upon a "Functional Programming with Scala" course on Coursera taught by Dr Martin Odersky himself, which I took to further solidify my grasp of the language. While I still don't consider myself as proficient in Scala as in Java or even Python, I can do a fair bit with it that I couldn't before.

Unknown said...

Hi Sir,

I was going through the examples for patent data.
Could you please explain in detail the data sets being used?

1) the patent citation data set
This data set contains two columns citing and cited patents.

Does the citing column refer to the owner ID of whoever submitted the patent?
Does the cited column refer to the patent ID which forms the key to the second data set?

2 ) the patent description data set.

There are number of fields in this data set.
To map these two data sets, is it the citing or the cited column that maps to the first column (patent) in the second data set?

Could you please explain the details, because this is confusing me a lot.

Thanks in advance.
Trilok.


Sujit Pal said...

Hi Trilok, can you please comment on the post you have a question for? I don't see any patent data in this post.