For + yield = for comprehension

For loops and for comprehensions in Scala

Looping over a collection of items and transforming the individual elements within is quite a common task. So it seems natural that Scala offers a nice way to solve this.

If you know some other, more “traditional”, imperative programming languages, then you probably have some assumptions about a construct whose name begins with “for”. These can help you understand Scala’s for capabilities, but you might be surprised at how powerful Scala’s solution is.

Some simple examples

As a warm-up, let’s see a couple of short examples.

They will use the Twitter-related case classes that were introduced in a previous post, Lists in Scala.

  case class User(name: String)
  case class TweetMsg(id: Long, user: User, msg: String)

A for loop for printing

First, let’s just create a list of three TweetMsg objects and print their message strings to the screen within a for loop:

def createTweetList: List[TweetMsg] = {
  // case classes get a generated apply method, so "new" is not needed
  val user1 = User("ador")
  val user2 = User("szeli")
  val t1 = TweetMsg(9243012L, user1, "scala program")
  val t2 = TweetMsg(9534594L, user2, "ice cream")
  val t3 = TweetMsg(9811235L, user1, "playing with scala")
  // the last expression of the block is the method's result
  List(t1, t2, t3)
}

val tweets = createTweetList

for (tw <- tweets) {
  println(tw.msg)
}

The last three lines loop over all the tweet objects in the list that we created previously. It’s really just the for keyword plus a so-called generator expression, which uses the ‘<-’ arrow to bind each element of the collection in turn to a temporary value (called tw in our example). That value can be used within the loop’s body (that is, within the curly braces) to do something with the element; here we just print the message part.
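As a rough sketch of what such a loop amounts to: a for without yield is rewritten by the compiler into a foreach call on the collection. The object name ForLoopSketch and the two helper methods below are made up for this illustration, and the messages are collected into a buffer (instead of printed) only so that the two variants can be compared.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object ForLoopSketch {
  case class User(name: String)
  case class TweetMsg(id: Long, user: User, msg: String)

  val tweets: List[TweetMsg] = List(
    TweetMsg(9243012L, User("ador"), "scala program"),
    TweetMsg(9534594L, User("szeli"), "ice cream")
  )

  // A for loop without yield: the body runs once per element, for its side effect.
  def collectWithFor(ts: List[TweetMsg]): List[String] = {
    val buf = ListBuffer.empty[String]
    for (tw <- ts) {
      buf += tw.msg
    }
    buf.toList
  }

  // The same loop written directly as a foreach call.
  def collectWithForeach(ts: List[TweetMsg]): List[String] = {
    val buf = ListBuffer.empty[String]
    ts.foreach(tw => buf += tw.msg)
    buf.toList
  }

  def main(args: Array[String]): Unit = {
    // Both variants print the same two messages.
    for (tw <- tweets) println(tw.msg)
    tweets.foreach(tw => println(tw.msg))
  }
}
```

Both helpers visit the elements in list order, so they produce the same result.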

This first example, strictly speaking, is just a for loop, not really a for comprehension yet. To write a real comprehension, we’ll need the yield keyword.

A real for-comprehension

In the previous example we looped over a list and printed something, based on each element of the list.

Now we’ll write a single line of code where we transform the incoming list of TweetMsg objects into a new list of String objects by selecting only the message part from the tweets.

val strMsgs = for (tw <- tweets) yield tw.msg

Simple, huh? :)

One big advantage of this approach (transforming the data instead of performing “side effects” like printing) is that we stay close to the functional programming paradigm: the original list is left untouched and the result is a new value that we can pass on, test or transform further.
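To make the “transformation” idea a bit more concrete, here is a small sketch showing that a single-generator for with yield gives the same result as calling map on the list, and that the yield part can hold any expression. The object name YieldSketch and the lengths value are invented for this example.

```scala
// Hypothetical wrapper object, used only to keep the sketch self-contained.
object YieldSketch {
  case class User(name: String)
  case class TweetMsg(id: Long, user: User, msg: String)

  val tweets: List[TweetMsg] = List(
    TweetMsg(9243012L, User("ador"), "scala program"),
    TweetMsg(9534594L, User("szeli"), "ice cream"),
    TweetMsg(9811235L, User("ador"), "playing with scala")
  )

  // The comprehension from the post: one generator plus yield.
  val viaFor: List[String] = for (tw <- tweets) yield tw.msg

  // The equivalent map call.
  val viaMap: List[String] = tweets.map(tw => tw.msg)

  // yield accepts any expression, for example the length of each message.
  val lengths: List[Int] = for (tw <- tweets) yield tw.msg.length

  def main(args: Array[String]): Unit = {
    println(viaFor)  // the three message strings, in list order
    println(lengths) // List(13, 9, 18)
  }
}
```

Note that tweets itself is not modified; each comprehension builds a new list.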

Advanced features

Filtering

In the next example we go one step further: we filter the resulting String list so that it only contains messages that include the word “scala”. We can do this with so-called guards, which are simply boolean expressions added to the for comprehension after an if keyword, like here:

val filteredMessageList = for {
  tw <- tweets
  if tw.msg.contains("scala")
} yield tw.msg

There is another change from the previous example, as you might have noticed: we use curly braces here to enclose the generator part of the structure. The Scala style guide recommends using simple parentheses when you have only one generator (or when you don’t yield anything), and curly braces with one generator or guard per line, for readability, when you use several generators and guards.
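Guards can also be stacked. The following sketch, under a few assumptions, keeps only the “scala” messages written by one user: the object name GuardSketch is invented, and the fourth tweet (“learning scala” by szeli) is hypothetical data added here so that the second guard actually removes something.

```scala
// Hypothetical wrapper object, used only to keep the sketch self-contained.
object GuardSketch {
  case class User(name: String)
  case class TweetMsg(id: Long, user: User, msg: String)

  val tweets: List[TweetMsg] = List(
    TweetMsg(9243012L, User("ador"), "scala program"),
    TweetMsg(9534594L, User("szeli"), "ice cream"),
    TweetMsg(9811235L, User("ador"), "playing with scala"),
    TweetMsg(9900001L, User("szeli"), "learning scala") // hypothetical extra tweet
  )

  // Two guards: an element is kept only if both conditions hold.
  val scalaByAdor: List[String] = for {
    tw <- tweets
    if tw.msg.contains("scala")
    if tw.user.name == "ador"
  } yield tw.msg

  // The same filter with the conditions combined into a single guard.
  val combined: List[String] =
    for (tw <- tweets if tw.msg.contains("scala") && tw.user.name == "ador") yield tw.msg

  def main(args: Array[String]): Unit = {
    println(scalaByAdor) // List(scala program, playing with scala)
  }
}
```

Whether to use several guards or one combined condition is mostly a matter of readability; both forms select the same elements.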

Looping over more collections

Instead of ugly nested loops, Scala’s for comprehensions allow us to process elements from more than one collection simply by adding more generator expressions to the for. A guard can then refer to any value bound by a generator above it.

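For example, let’s say that we have a list of words (Strings in a list called words) and we want to filter our long list of tweets (called tweets) so that only those tweets remain whose message includes any of the pre-defined words. Each tweet is paired with each word, so a tweet that contains several of the words shows up once per match in the result: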

for {
  tw <- tweets
  word <- words
  if tw.msg.contains(word)
} yield tw
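Here is a small sketch of how the two generators combine every tweet with every word. The contents of words are not given in the post, so the two words below are an assumption, as are the object name MultiGeneratorSketch and the value names. With this data the first tweet matches both words and therefore appears twice; the second variant is one possible way to keep each tweet at most once.

```scala
// Hypothetical wrapper object, used only to keep the sketch self-contained.
object MultiGeneratorSketch {
  case class User(name: String)
  case class TweetMsg(id: Long, user: User, msg: String)

  val t1 = TweetMsg(9243012L, User("ador"), "scala program")
  val t2 = TweetMsg(9534594L, User("szeli"), "ice cream")
  val t3 = TweetMsg(9811235L, User("ador"), "playing with scala")

  val tweets: List[TweetMsg] = List(t1, t2, t3)
  val words: List[String] = List("scala", "program") // assumed example words

  // Two generators: the guard is checked for every (tweet, word) pair,
  // so a tweet is yielded once for each word it contains.
  val perMatch: List[TweetMsg] = for {
    tw <- tweets
    word <- words
    if tw.msg.contains(word)
  } yield tw

  // One generator with an "any word matches" guard: each tweet at most once.
  val atMostOnce: List[TweetMsg] = for {
    tw <- tweets
    if words.exists(word => tw.msg.contains(word))
  } yield tw

  def main(args: Array[String]): Unit = {
    println(perMatch.map(_.id))   // List(9243012, 9243012, 9811235)
    println(atMostOnce.map(_.id)) // List(9243012, 9811235)
  }
}
```

Calling distinct on the first result would also remove the repeats; which variant fits better depends on whether you care which word matched.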

Summary and other sources

The nice thing about Scala’s for comprehensions is that you don’t need to understand all the details to use them effectively. You don’t have to know what the compiler transforms a for into under the hood (it’s a series of map, flatMap and withFilter calls, or foreach calls when there is no yield).
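For the curious, here is a sketch of that rewriting for a comprehension with two generators and a guard, next to a hand-written chain of calls that mirrors it. The object name DesugarSketch and the contents of words are assumptions made for this illustration, and the exact code the compiler generates may differ in detail.

```scala
// Hypothetical wrapper object, used only to keep the sketch self-contained.
object DesugarSketch {
  case class User(name: String)
  case class TweetMsg(id: Long, user: User, msg: String)

  val tweets: List[TweetMsg] = List(
    TweetMsg(9243012L, User("ador"), "scala program"),
    TweetMsg(9534594L, User("szeli"), "ice cream"),
    TweetMsg(9811235L, User("ador"), "playing with scala")
  )
  val words: List[String] = List("scala", "program") // assumed example words

  // The comprehension: pair each user name with every word found in their tweets.
  val viaFor: List[(String, String)] = for {
    tw <- tweets
    word <- words
    if tw.msg.contains(word)
  } yield (tw.user.name, word)

  // Hand-written equivalent:
  //   outer generator -> flatMap
  //   guard           -> withFilter
  //   last generator  -> map over the yield expression
  val viaCalls: List[(String, String)] =
    tweets.flatMap { tw =>
      words
        .withFilter(word => tw.msg.contains(word))
        .map(word => (tw.user.name, word))
    }

  def main(args: Array[String]): Unit = {
    println(viaFor == viaCalls) // true
  }
}
```

The for version reads top to bottom, while the call chain nests one level deeper with every extra generator, which is the main reason to prefer the comprehension once more than one collection is involved.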

Code examples are available here.

Note n+1 : Feedback is welcome on Twitter or on GitHub :)