Category Archives: Scala

Basic Learning Scala Resources

Scala is one of my primary languages which I have found useful for creating web services, large data-modeling and data processing. Below are some of my notes from my experience learning Scala.

Installation

Scala runs on the Java Virtual Machine. You’ll need to install Java first. Install Oracle Java 7, not OpenJDK or Java 6 unless you have to. I encountered a few bugs on OpenJDK but not on Oracle Java.

You can install Scala system-wide, but it isn’t required: your build system will download the specific version of Scala for your project.

Programming Language

The official tutorial lists a lot of topics (Implicit Parameters, Variances, Upper Type Bounds, etc.), but I recommend not diving in head first.

I recommend just starting with these four topics first. These topics showcase the benefits of Scala while being approachable for most programmers.

  • Functions as first class citizens – You can pass functions around and use closures, unlike Java 7 and earlier.

  • Collections and collection operations – The ease of performing operations on collections makes Scala a very productive language. Learn the class hierarchy and its methods.

  • Class system – classes, objects (singletons), traits and case classes.

  • Pattern matching – Another great language feature in making code Scala-idiomatic.

With just these four topics, you can start writing effective programs in Scala. Once you master these, move on to the advanced material (type system, actors, etc.).
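To make the four topics concrete, here is a minimal sketch that touches each of them once. The names (Shape, Circle, Rect, Geometry) are made up for illustration and don't come from any library.

```scala
// Class system: a sealed trait with two case classes.
sealed trait Shape
case class Circle(radius: Double) extends Shape
case class Rect(width: Double, height: Double) extends Shape

// Class system: `object` defines a singleton.
object Geometry {
  // Pattern matching: pick a branch by case class and pull out its fields.
  def area(shape: Shape): Double = shape match {
    case Circle(r)  => math.Pi * r * r
    case Rect(w, h) => w * h
  }

  // Functions as first class citizens: returns a closure that captures `factor`.
  def scaleBy(factor: Double): Double => Double = x => x * factor

  // Collection operations: map every shape to its area, then sum the results.
  def totalArea(shapes: List[Shape]): Double = shapes.map(area).sum

  def main(args: Array[String]): Unit = {
    val shapes = List(Rect(2, 3), Rect(1, 1), Circle(1))
    println(totalArea(shapes))
    println(List(1.0, 2.0, 3.0).map(scaleBy(2)))
  }
}
```

Nothing here needs implicits, variance annotations or type bounds, which is the point of starting with these four topics.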

These free books from Typesafe are a good start for the language.

Coding style

Ecosystem

The Scala ecosystem is as important to learn as the language itself.

Sites

  • Reddit Scala – Keep up to date quickly.
  • Scala Package Release Notes – Typically a lot of recently released packages will be noted here.
  • Maven Central – All major Scala (and Java) packages are published here. You can access these with any Scala build system.

Involved Companies

  • Typesafe – Founded by the Scala team, Typesafe is the Scala support team. They help companies build robust Scala environments.
  • Companies using Scala – See what others are doing with Scala.
  • twitter github – Twitter has released a lot of useful Scala libs including Finagle.
  • foursquare github – Foursquare

Build System

You can decide if you want to use an IDE or just plain text editors but I highly recommend using a standard build system.

Here’s an example pom.xml from a sample project. The main commands are to compile continuously (mvn scala:cc), run tests (mvn test) and package (mvn package).

  • SBT – The official build system supported by Typesafe. I had bad experiences with this tool after trying it from version 0.7 to 0.10. It has arcane syntax and breaking changes between versions, so I just stuck with Maven. Luckily the fast compiler inside SBT is available for Maven now too.

Editors

I typically use a plain text editor (VIM) with mvn scala:cc running in the background. Sometimes I use an IDE (Intellij) if I need a debugger or am working on a larger project. You’ll find both build systems (Maven and SBT) work well with Eclipse and Intellij.

  • [VIM] I use Janus to bootstrap a VIM setup with basic Scala syntax highlighting. vim-scala is another good VIM plugin.

  • Sublime Text – A great general text editor with Scala plugins.

  • Eclipse – Officially supported IDE.

  • Intellij – I prefer Intellij’s interface.

Basic Libs

Web Frameworks

  • Play – A full featured web framework similar to Ruby on Rails. Good for any general purpose web application.

  • Scalatra – Modeled on Ruby’s Sinatra. A very easy micro-web framework to start with. Mostly good for web services type of applications.

  • Lift – Used by Foursquare. I have never tried it but it is stable and well supported by a community.

Database

Other

  • Json4S – This is Scala’s primary JSON parsing package.

  • logback – logback is a widely used logging package.

  • junit – Junit is the most popular Java testing package. I like the simple syntax over other testing packages which feature a DSL including Specs2.

Help

Misc

Don’t worry about the complexity of Scala early on.

Specifically, if you try to read the Scala source code, you will be perplexed at first. I would start learning by writing Scala in a traditional Ruby/Python or Java programming style.

Learn how the JVM works.

If you wrote production Java code in the past, this will come in handy as the tuning/debugging/profiling process will be nearly the same.

All your favorite JVM tools like jstack and jvisualvm will still work.

Learn the configuration flags of the java command line. The garbage collector, heap memory and logging are among the configurable options. This blog post and the Resin JVM Tuning page are helpful.
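To check which settings a running JVM actually picked up, something like the following could be used. It is only a sketch: it assumes Scala 2.13 or newer (for scala.jdk.CollectionConverters; older versions have scala.collection.JavaConverters instead), and the object name JvmSettings is made up for illustration.

```scala
import java.lang.management.ManagementFactory
import scala.jdk.CollectionConverters._

object JvmSettings {
  // Maximum heap the JVM will try to use, in megabytes (driven by -Xmx).
  def maxHeapMb: Long = Runtime.getRuntime.maxMemory / (1024L * 1024L)

  // Flags passed to the java command itself, e.g. -Xmx512m or -verbose:gc.
  def jvmFlags: List[String] =
    ManagementFactory.getRuntimeMXBean.getInputArguments.asScala.toList

  // Names of the garbage collectors that are active in this JVM.
  def collectorNames: List[String] =
    ManagementFactory.getGarbageCollectorMXBeans.asScala.map(_.getName).toList

  def main(args: Array[String]): Unit = {
    println(s"Max heap (MB): $maxHeapMb")
    println(s"JVM flags:     ${jvmFlags.mkString(" ")}")
    println(s"Collectors:    ${collectorNames.mkString(", ")}")
  }
}
```

Running it with different flags (for example a different -Xmx value) is a quick way to confirm a flag does what you expect before putting it in production.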

Be forewarned of binary compatibility.

I find this to be the biggest problem with the Scala environment. Packages built with an older major version (2.8.X) probably won’t work with newer versions (2.10.X).

If you find a relatively new Scala package and a well supported Java package that do nearly the same thing, I would recommend picking the Java one for your project, because the Scala package might be outdated in a few months and you won’t be able to upgrade.

Rolling Restarts with Capistrano and Haproxy for Java Web Service Apps

Java web apps can be efficient because they are multithreaded and you only need to run one copy of the process to serve multiple concurrent requests.
This is in contrast to Ruby apps, where you often need multiple processes to serve multiple requests. Using one process instead of ~8 will save you a lot of memory on the system.

The downside of one process is dealing with rolling restarts. Ruby app servers like Unicorn run multiple processes and thus can be set up to provide rolling restarts.

If you are using a web container such as Tomcat 7, it can support hot reload in place.

But let’s assume your JVM web app is run with a single command (e.g. java -jar backend-1.0.jar &). The idea of this setup is that it can be generalized to any single-process web service.

To get rolling restarts out of this setup, we can use capistrano with haproxy.

We want to:

* start two different servers with one process each (or use two processes on one server, though this won’t provide failover)
* use an haproxy as a load balancer to these servers

In your Java web service apps, add a health check endpoint(/haproxy-bdh3t.txt) and have it serve an empty text file.

[It’s important to use a random string in your endpoint name if you are running in the public cloud: the load balancer could be referencing an old server address that has since been reassigned, and haproxy could think a server is up when it isn’t.]
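As a rough sketch of such an endpoint, the handler below answers 200 while the check file exists and 404 once it has been removed, which is what lets a deploy script take the server out of rotation. It uses the HTTP server bundled with the JDK (com.sun.net.httpserver) purely for illustration; the object name HealthCheck, the statusFor helper and the port are assumptions, and a real app would serve the file through whatever web framework it already uses.

```scala
import java.net.InetSocketAddress
import java.nio.file.{Files, Path}
import com.sun.net.httpserver.{HttpExchange, HttpHandler, HttpServer}

object HealthCheck {
  // 200 while the check file is present, 404 after it has been removed.
  def statusFor(checkFile: Path): Int =
    if (Files.exists(checkFile)) 200 else 404

  // Starts a tiny server that only answers the health check path.
  def start(port: Int, checkFile: Path): HttpServer = {
    val server = HttpServer.create(new InetSocketAddress(port), 0)
    server.createContext("/haproxy-bdh3t.txt", new HttpHandler {
      def handle(exchange: HttpExchange): Unit = {
        // -1 means "no response body", which suits haproxy's HEAD request.
        exchange.sendResponseHeaders(statusFor(checkFile), -1)
        exchange.close()
      }
    })
    server.start()
    server
  }
}
```

Checking the file on every request keeps the app itself stateless: removing or touching the file is the only switch.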

In your haproxy.cfg, add

option httpchk HEAD /haproxy-bdh3t.txt HTTP/1.0

as your check condition to the backend services.

In your capistrano script, let’s add two servers set as the app role

server "XXX1", :app
server "XXX2", :app

and alter the restart task to:

* remove the check file for one server. This will remove the server from the load balancer
* restart server.
* ping the server.
* add the check file back to the started server which haproxy will add back into the load balancer.
* repeat as a loop for each server.


desc "Restart"
task :restart, :roles => :app do
  haproxy_health_file = "#{current_path}/path-static-files-dir/haproxy-bdh3t.txt"

  # Restart each host serially
  self.roles[:app].each do |host|
    # take the app out of the load balancer (-f: don't fail if the file is already gone)
    run "rm -f #{haproxy_health_file}", :hosts => host
    # let existing connections finish
    sleep(5)

    # restart the app using upstart
    run "sudo restart backend_server", :hosts => host

    # give it some time to start up (a fixed wait stands in for pinging the server)
    sleep(5)

    # put the check file back so haproxy re-adds the server to the load balancer
    run "touch #{haproxy_health_file}", :hosts => host
  end
end

How to parse URL strings in Java

I tend to use Apache HttpClient as my preferred Java HTTP client. I hit an error with invalid characters, such as the space, in this URL:

val urlString = "http://maps.google.com/maps?q=Merrick, NY"

val cm = new ThreadSafeClientConnManager()
val client = new DefaultHttpClient(cm)
val httpRequest = new HttpGet(urlString)

java.lang.IllegalArgumentException: Illegal character in query at index 38: http://maps.google.com/maps?q=Merrick, NY
at java.net.URI.create(URI.java:859)
at org.apache.http.client.methods.HttpGet.<init>(HttpGet.java:69)

My first attempt was to use java.net.URLEncoder.encode:

java.net.URLEncoder.encode(urlString, "UTF-8")
res4: java.lang.String = http%3A%2F%2Fmaps.google.com%2Fmaps%3Fq%3DMerrick%2C+NY

But this doesn’t work: URLEncoder is meant for form data, and here it encodes the entire URL string, including the scheme and the slashes.

Our goal is to escape just the query part of the URL, turning “http://maps.google.com/maps?q=Merrick, NY” into “http://maps.google.com/maps?q=Merrick,%20NY”. (The comma is a legal character in a URI query, so only the space needs escaping.)

The trick is to construct a URL object so we can get the separate components and then create a URI object from these components:

val urlString = "http://maps.google.com/maps?q=Merrick, NY"
val url = new java.net.URL(urlString)
val uri = new java.net.URI(url.getProtocol, url.getAuthority, url.getPath, url.getQuery, null)
val httpRequest = new HttpGet(uri)
httpRequest: org.apache.http.client.methods.HttpGet = org.apache.http.client.methods.HttpGet@15ec2337
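The same trick can be wrapped in a small helper, sketched below with only the JDK so it runs without HttpClient on the classpath. The names UrlEscaper and toUri are made up for illustration. Unlike the snippet above it also passes the fragment through (url.getRef) instead of dropping it. One caveat: the multi-argument URI constructors always escape the percent character, so a URL that is already escaped would be escaped twice.

```scala
import java.net.{URI, URL}

object UrlEscaper {
  // Split the raw string with java.net.URL (which does not validate characters),
  // then let the multi-argument URI constructor escape each component.
  def toUri(urlString: String): URI = {
    val url = new URL(urlString)
    new URI(url.getProtocol, url.getAuthority, url.getPath, url.getQuery, url.getRef)
  }

  def main(args: Array[String]): Unit = {
    // prints http://maps.google.com/maps?q=Merrick,%20NY
    println(toUri("http://maps.google.com/maps?q=Merrick, NY"))
  }
}
```

The resulting URI can be handed straight to new HttpGet(uri), as in the snippet above.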