Scala, big data, and machine learning explorations

of a man with an encrypted last name.

Hello everybody. I'm Dave Hrycyszyn, a founding partner and tech director at Head London. When I'm not chatting to people about Big Data, application development, and APIs, I attempt to be a Scala coder.

Dave Hrycyszyn

2014-05-25 10:00:00 +0100

Streaming Twitter into Spark

I mentioned during the introduction to this series that Spark is more than a faster Hadoop. Besides batch processing, it can also operate on very large streams of incoming da...

Dave Hrycyszyn

2014-05-22 10:00:00 +0100

Running jobs on the Spark cluster

Ok, so now you’ve got a Spark cluster running. How do we take our earlier example and run it over the cluster? If you don’t have it handy, check out the SimpleApp exam...

Dave Hrycyszyn

2014-05-14 10:00:00 +0100

Setting up Spark in cluster mode

In the previous post, we set up the simplest possible Spark job and ran it in local mode. This is about as easy as it gets, and it was a good intro experim...

Dave Hrycyszyn

2014-05-13 10:00:00 +0100

Big Data in Spark for absolute beginners

Spark is a Big Data framework which allows you to run batch jobs, query data interactively, and process incoming information as it streams into your system. Spark runs on top of normal

Dave Hrycyszyn

2014-05-12 10:00:00 +0100

A basic Scala application development setup

If you’re following along with any tutorials on this blog, or just getting started in Scala, you’ll need to install some stuff to generate, edit, compile, build, and run Scala code. Here’s the quickstart. First, don’t down...