Dave Hrycyszyn

2014-05-12 10:00:00 +0100

A basic Scala application development setup

If you’re following along with any tutorials on this blog, or just getting started with Scala, you’ll need to install some stuff to generate, edit, compile, build, and run Scala code.

Here’s the quickstart.

First, don’t download Scala from the Scala website. Don’t install it from your package manager. When used with SBT (see below), Scala auto-installs itself, and having two Scalas on your system may just lead to hassle.

Here’s what you do need. Click each link in turn for setup instructions. Everything works on Mac, Linux, and Windows:

  • the Java 7 JDK as a runtime.
  • SBT for downloading Scala, dependency management, compiling, and running your applications.
  • Conscript - a way of grabbing Scala code off the internet.
  • Giter8 - a utility for dynamically templating Scala application skeletons, built on top of Conscript.

Editor-wise, I recommend IntelliJ IDEA (Community Edition) and its Scala plugin, which you install through the IDE preferences. Having said that, use whatever text editor you like: the simpler the better.

Now, onwards to a detailed overview of Scala environment basics. You don’t need to read all of this right now - skim it so you know what’s here, then come back to it when you get stuck. Maybe it’ll help you!

A Scala development environment for beginners

Scala programming is extremely rewarding, and the language has forced me to re-think a lot of what I know about programming (in a good way). But as with any other language, the biggest initial barrier to Scala isn’t the language itself. The most time-consuming part is getting to grips with the environment.

Here’s the guide I wish I’d had when I started messing with Scala. It’s a turbo kickstart, written from the perspective of a web application, API, and Big Data developer. There’s nothing particularly comprehensive or coherent about it: it’s just a list of things I wish I’d known on Day 1, hopefully presented in an order that makes sense.

You don’t really need to install Scala

Weird, huh? It surprised me too, once I realized it. As long as you have a Java Development Kit installed, you’re fine. Install the JDK 7 which matches your operating system.

It’s worth mentioning that you can most easily get Java installed on Debian or Ubuntu Linux by using the Webupd8 PPAs:

SBT (see below) will take care of grabbing the Scala language, compiler, and core libraries for you.

Use Java 7

Java 8 came out fairly recently, and a lot of libraries still need to stabilize on it. Stick with Java 7 for now: if you’re just starting out, the last thing you need is to debug library code on a new JVM version.

Next, we’ll turn our attention to Scala’s SBT, which is the most commonly-used build tool for Scala, and is the easiest way to get Scala itself installed.

SBT Overview

The SBT setup guide provides the best instructions for all platforms. Go there now and install it, if you haven’t already done so.

SBT, like any software with the words simple or lightweight in its name, is complex. It has quite a few functions:

  • it will automatically download and install a full Scala installation, including a Scala compiler, on first use
  • it builds your application
  • it runs your automated tests
  • it manages dependencies, rather like Maven, Ruby Gems, or Python Eggs do in other languages
  • it can automate common tasks, like Ant or Rake do in other languages

Library dependencies in Scala are packaged as .jar files, also called jars, jarfiles, dependencies, or libraries. These are basically zip files which are structured according to a known set of conventions and contain metadata and Java bytecode.

When you write an application, typically you depend on multiple jars written by other people. You’ll want to download those dependencies and make them available to your program when it compiles and runs. This is one of the things sbt does.

Starting sbt

Once it’s installed, you run it from a terminal, using the command sbt. The first time you run it within a directory containing an SBT project, a full Scala environment will be downloaded to your local machine.

What is an SBT project? It’s any directory with an SBT config file, and it’s laid out in a specific directory structure.

Code execution is worth a thousand words. I assume you’ve already installed Java, SBT, Giter8 and Conscript from the links up at the top of the article? Good. Let’s generate a basic SBT project now. We’ll use the g8 command (Giter8) to download a basic SBT application template from the GitHub repository at https://github.com/chrislewis/basic-project.g8.

We’ll customise the template by answering a very short questionnaire, and it’ll save on disk with the customisations applied.

Do this from your regular shell prompt:

$ g8 chrislewis/basic-project
name [Basic Project]: A Basic SBT Project
organization [com.example]: com.constructiveproof
version [0.1.0-SNAPSHOT]:

Template applied in ./a-basic-sbt-project

Next, cd into the generated project, and start sbt:

$ cd a-basic-sbt-project
$ sbt

Assuming sbt is on your $PATH, Scala will install itself, the heavens will open up, and you will have become a kick-ass object-functional programmer. Everything will take a while to download: go get yourself a cup of coffee.

SBT tasks

You’ve entered the top-level directory of an SBT project, and typed the sbt command. What can you do now?

The first thing to do is try compiling. To do this, type the compile command, and you should see output something like this:

> compile
[info] Updating {file:/Users/dave/Desktop/a-basic-sbt-project/}a-basic-sbt-project...
[info] Resolving org.fusesource.jansi#jansi;1.4 ...
[info] Done updating.
[info] Compiling 1 Scala source to /Users/dave/Desktop/a-basic-sbt-project/target/scala-2.10/classes...
[success] Total time: 1 s, completed 26-May-2014 16:39:26
>

If you want to see a list of available tasks, type tasks. You can compile your application, run unit tests, generate documentation, publish it to your local machine for use by other applications, and many other things.

The run command will run your application’s main class (an object with a main method), if there is one.
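
To make that concrete, here’s a minimal sketch of the kind of object the run task can pick up. The file path and the Greeter and Main names are my own illustration, not something the basic-project template is guaranteed to generate.

```scala
// src/main/scala/Main.scala (hypothetical path and names, for illustration only)

object Greeter {
  // A pure helper, so the greeting can be checked without running the app.
  def greeting(name: String): String = s"Hello, $name!"
}

object Main {
  // An object with this main signature is what the run task executes.
  def main(args: Array[String]): Unit = {
    // Use the first command-line argument if there is one, otherwise "sbt".
    val name = args.headOption.getOrElse("sbt")
    println(Greeter.greeting(name))
  }
}
```

With that file in place, typing run at the sbt prompt should print Hello, sbt! and typing run Dave should print Hello, Dave!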

The console command fires up a regular Scala console for you. This is the nicest way to get a Scala console without installing the language from the Scala website or from your package manager. One advantage of using the SBT console to access the Scala REPL is that all your SBT-defined dependencies are available. It’s a great way to play with libraries and code ideas.

To stop SBT, type exit at the SBT command prompt, or hit ctrl-c.

Now, let’s talk about a few things which took me a long time to figure out on my own.

SBT has two kinds of config files

When you run it, SBT will look around for the following files:

build.sbt - usually used in smaller projects. This file lists app dependencies and allows some limited customization of the application’s build environment.

project/build.scala - usually used in larger projects, where the build may become more complex. You can manage dependencies and build with project/build.scala instead of build.sbt, and use Scala itself as a configuration language for the application.

You can have both a build.sbt and a project/build.scala in the same sbt project, and settings from both files will be used. Most projects go one way or the other, though.
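
For comparison, here’s a rough sketch of what the Scala-file style can look like. It assumes an sbt 0.13.x release (the Build trait it relies on belongs to that era of sbt), and the ExampleBuild and root names are hypothetical ones I picked for illustration.

```scala
// project/build.scala - a sketch for sbt 0.13.x
import sbt._
import Keys._

// "ExampleBuild" and "root" are hypothetical names chosen for this illustration.
object ExampleBuild extends Build {
  lazy val root = Project("root", file(".")).settings(
    name := "A Basic SBT Project",
    organization := "com.constructiveproof",
    version := "0.1.0-SNAPSHOT",
    scalaVersion := "2.10.3",
    // Dependencies use the same syntax as they do in build.sbt.
    libraryDependencies += "com.typesafe" % "config" % "1.2.1"
  )
}
```

Because this is ordinary Scala, the settings are just comma-separated arguments to a method call.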

Dependency management with sbt

The most common SBT-related thing you’ll do will be to open up build.sbt or project/build.scala and manage your application’s dependencies. Dependency listings look like this:

libraryDependencies ++= Seq(
  "com.typesafe" % "config" % "1.2.1",
  "org.apache.spark" %% "spark-core" % "0.9.1",
  "org.apache.spark" %% "spark-streaming-twitter" % "0.9.1",
  "org.scalatest" %% "scalatest" % "2.1.RC1" % "test",
  "org.scalacheck" %% "scalacheck" % "1.11.3" % "test"
)

In the Java (and thus Scala) world, every library jar has at least 3 pieces of information associated with it:

  1. an organization, typically defined by its domain name
  2. a name
  3. a version number

This is exactly what we see in a line of SBT dependencies:

"com.typesafe" % "config" % "1.2.1",

In this case, com.typesafe is the organization that made this library. The name of the library is config, and it has a version number of 1.2.1. This is the line of info that you’d add into your application’s Seq of dependencies if you wanted to use it. Make sure you include the comma at the end of the line unless it’s the last dependency.

Don’t get confused and think that you need to download this library from http://typesafe.com. The domain name acts as a namespace rather than a download link.
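
One thing the listings above don’t spell out is the difference between % and %%. As far as I understand it, %% tells SBT to append your project’s Scala binary version to the library name, because Scala libraries are published separately for each Scala version. A sketch, assuming your project’s scalaVersion is a 2.10.x release:

```scala
// With scalaVersion set to 2.10.x, these two lines resolve the same artifact.
// Use one or the other, not both.
libraryDependencies += "org.scalatest" %% "scalatest" % "2.1.RC1" % "test"

libraryDependencies += "org.scalatest" % "scalatest_2.10" % "2.1.RC1" % "test"
```

Plain Java libraries, like the config library, have no Scala version in their name, so they use a single %. The optional fourth field, "test", restricts the dependency to your test code.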

When people specify dependencies in their installation instructions, you may also see this alternate form of listing libraries:

libraryDependencies += "org.eligosource" %% "eventsourced-journal-dynamodb" % "0.6.0"

libraryDependencies += "org.eligosource" %% "eventsourced-journal-hbase" % "0.6.0"

libraryDependencies += "org.eligosource" %% "eventsourced-journal-mongodb-casbah" % "0.6.0"

Doing it this way has exactly the same effect as adding each dependency to the libraryDependencies ++= Seq(...) sequence above: += appends a single dependency, while ++= appends a whole sequence of them.

SBT config files require double-spacing

I’ll save you two hours of your life, right in the next 10 seconds.

The SBT file format (in sbt versions before 0.13.7) requires a blank line between every top-level expression.

That means this will work:

name := "Spark Streaming Example"

organization := "com.constructiveproof"

version := "1.0"

scalaVersion := "2.10.3"

libraryDependencies += "org.eligosource" %% "eventsourced-journal-dynamodb" % "0.6.0"

libraryDependencies += "org.eligosource" %% "eventsourced-journal-hbase" % "0.6.0"

libraryDependencies += "org.eligosource" %% "eventsourced-journal-mongodb-casbah" % "0.6.0"

This will fail miserably:

name := "Spark Streaming Example"
organization := "com.constructiveproof"

version := "1.0"

scalaVersion := "2.10.3"

libraryDependencies += "org.eligosource" %% "eventsourced-journal-dynamodb" % "0.6.0"

libraryDependencies += "org.eligosource" %% "eventsourced-journal-hbase" % "0.6.0"
libraryDependencies += "org.eligosource" %% "eventsourced-journal-mongodb-casbah" % "0.6.0"

Don’t let your inner code-cleaner suck you into saving whitespace for future generations. You’ll be left wondering why your recently-working application suddenly refuses to build.
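
The rule applies between expressions, not between physical lines. A single expression can span several lines, as long as there are no blank lines inside it. A minimal sketch, reusing dependencies from earlier in this post:

```scala
name := "Spark Streaming Example"

scalaVersion := "2.10.3"

// One expression spread over four lines: no blank lines inside the Seq(...).
libraryDependencies ++= Seq(
  "com.typesafe" % "config" % "1.2.1",
  "org.scalatest" %% "scalatest" % "2.1.RC1" % "test"
)
```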

Picking up a new dependency

When you add a new dependency to libraryDependencies, you can do one of two things to pick it up when you’re in a running SBT shell:

  1. type reload, or
  2. type exit to get out of the SBT shell, then type sbt again to get back into the shell.

The new dependency will be downloaded the next time SBT resolves your dependencies (for example, when you next compile), and automatically made available to your application.

Resolvers

By default, SBT downloads libraries from code repositories on a server called Maven Central, which acts as a central repository for jar files. Many projects publish their jars to this site.

However, the system isn’t centralized. Some projects put their jars on other servers.

SBT may need you to add a new resolver so that it knows to search on other sites in addition to Maven Central. Libraries that require resolvers will tell you what to add in their install instructions. Typically you’ll see something like this:

resolvers += "Typesafe Repository" at "http://repo.typesafe.com/typesafe/releases/"

Just dump that straight into your build.sbt or project/build.scala file, and you’ll pick up the new resolver.
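
If you need more than one extra repository, the same += and ++= forms used for dependencies work for resolvers too. A sketch, assuming sbt 0.13.x; the Sonatype snapshots entry is just my example of a second repository, not something the projects in this post require:

```scala
resolvers ++= Seq(
  "Typesafe Repository" at "http://repo.typesafe.com/typesafe/releases/",
  // Resolver.sonatypeRepo is a helper built into sbt for the Sonatype OSS repositories.
  Resolver.sonatypeRepo("snapshots")
)
```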

Finding and using libraries

You’ve got full access to a huge number of Java and Scala libraries. Maven Central alone currently lists 78,409 separate libraries in 676,469 versions available for your use.

If you want to use a library in your app, look in the following places.

On GitHub - typically a library will have a section on installing it, and if there are resolvers needed, it’ll mention that too.

On the library’s website - again, there’s usually an installation section.

On Maven Central - you can search for libraries on Maven Central. For example, if you type Scalatra into the search box, you’ll get back a huge number of artifacts. Click on the top one, and you’ll see a big pile of XML, which you might love if you were a Java coder.

The important thing: click on Scala SBT in the Dependency Information box on the left side. You’ll be presented with the correct libraryDependencies for your application, like this:

[Screenshot: SBT deps on Maven Central]

Giter8

Giter8 is a code generator used for quickly grabbing and customizing project skeletons. You typically use it to get going quickly when you’re starting a new project.

Giter8 downloads templates from GitHub onto your local machine, using another utility called Conscript. It then asks you a series of questions about your application: what you want to name it, what organization you work for, and what the version number should be. The values you enter are then injected into the application templates and saved to disk on your system.

You end up with a new project skeleton so you can quickly get to work.

Install Conscript first, then use Conscript to install Giter8. By default, they both install into a directory called bin in your home directory. On Linux or Mac systems, you’ll need to mess with your $PATH a bit in order to make them available on the command line.

# Add this to ~/.bashrc (Linux) or ~/.bash_profile (Mac)
export PATH=$HOME/bin:$PATH

Don’t forget to run source ~/.bashrc (Linux) or source ~/.bash_profile (Mac) so that your current terminal picks up the new $PATH variable. You may find it easiest to hit it with the big hammer: close and re-open all your terminal sessions.

There are Giter8 templates for many different project types. Two that you will see quite a bit on this blog:

Should you use an IDE?

There’s no cast-iron reason to use a big steam-powered IDE. If you’re already comfortable in Sublime, Emacs, Vim, Gedit, jEdit, or whatever, use that. All of your compilation will be handled from the SBT console anyway.

Having said that, although I prefer a light text editor (Vim or Sublime) for other languages, I personally have switched to using IntelliJ IDEA for Scala work. The static typing support is amazing. Download the Community Edition if you want to try it out.

It’s fast and rock-solid, code completion and refactoring support are excellent, and it has near-perfect syntax highlighting. Install the Scala plugin for it, using the built-in plugin manager within the IDE. It’s not in the plugins list by default, so you need to click on “Settings”, then “Plugins”, then “Browse Repositories…” in order to find it.

Once the Scala plugin is installed, you can just open the top-level directory of any SBT project and IDEA will recognize it as a Scala project and download necessary dependencies for use by its presentation compiler. When you open a project, make sure that you tick the “Use auto-import” tickbox. That will cause IDEA to download all your application’s dependencies, so that code highlighting works.

It used to be necessary to generate an IDEA project using an SBT plugin called sbt-idea, but now IDEA’s Scala plugin recognizes SBT projects without the additional help, so you don’t need to mess around with that.

What about Eclipse / Scala IDE?

Moore’s Law does not seem to be making any impact on Eclipse, so for me the Eclipse-based Scala IDE is not really an option. Maybe you’ve got a faster computer than me.

If you do want to use Eclipse, I highly recommend looking at the SBT Eclipse plugin, which can be used to generate an Eclipse project from the SBT command line.
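
For reference, SBT plugins are added in their own file under the project directory. Here’s a sketch of how the Eclipse plugin is typically wired in. The version number is an assumption on my part, so check the plugin’s README for the current one:

```scala
// project/plugins.sbt
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "2.4.0")
```

After a reload, typing eclipse at the sbt prompt should generate the Eclipse project files.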

Conclusion

It’s a long post and won’t be of much interest to anybody who’s been using Scala for a while. But if I’d known these things when I first started out, it would have saved me many hours of head-scratching.

Do you think there’s anything I missed? Got wrong? Is there anything that’s still confusing despite having been covered here? Let me know in the comments.