Playing with Scala – Implementing Filters

For a long time I stayed away from Scala, just because I had heard a lot of horror stories from people using Scala and why Java is better. Clearly I was judging the book by its cover. Java and Scala are very different worlds and if we try to mix it up the resulting code is definitely a Horror.

Recently I started learning Scala(for work) and I have started enjoying it. I was trying multiple ways of implementing a filter function and found a few.

What is a filter? – It filters a collection based on a given condition.

It takes a predicate, which is a function which evaluates to either true or false and then returns a collection of all the items which hold true for the given predicate.

For all the approaches I have used same function signature.

def filter[T](predicate: T => Boolean) (seq: Seq[T]) : Seq[T] = {
  .....
}

Closer look at the function signature.

  1. It is applied on a given type T.
  2. It takes a predicate which is a function which takes an input of type T and returns true or false.
  3. It takes a Sequence of type T on which the filter is applied.
  4. At the end filter returns the filtered Sequence of type T

Approach 1: This uses the standard collect function on a Seq which collects only those items for which predicate function returns true.

def filterV1[T](predicate: T => Boolean) (seq: Seq[T]) : Seq[T] = {
     seq.collect({
       case item if predicate(item) => item
     })
}

Approach 2: This approach uses pattern matching on the given sequence and recursively accumulates the items which holds up true for the given predicate function. To optimize the recursion it is designed in a tail recursive manner. @tailrec annotation is used to validate the tail recursion behavior.

def filterV2[T](predicate: T => Boolean)(seq: Seq[T]): Seq[T] = {
  @tailrec
  def filterInner(acc: Seq[T], originalSeq: Seq[T]): Seq[T] = {
    originalSeq match {
      case Nil => acc
      case first :: last => 
        if (predicate(first)) filterInner(acc :+ first, last) 
        else filterInner(acc, last)
    }
  }

  filterInner(Nil, seq)
}

Approach 3: Here we use a function called foldLeft on the sequence which takes a function to perform the folding operation. It appends the item to the accumulator only if predicate holds true.

def filterV3[T](predicate: T => Boolean)(seq: Seq[T]): Seq[T] = {
  seq.foldLeft(Seq[T]()) {
    (acc, item) => if (predicate(item)) acc :+ item else acc
  }
}

Approach 4: This approach is very similar to approach 3 except it does it in a reverse manner.

def filterV4[T](predicate: T => Boolean)(seq: Seq[T]): Seq[T] = {
seq.foldRight(Seq[T]()) {
(item, acc) => if (predicate(item)) acc :+ item else acc
}
}

Comparing performance of the approaches.

I generated a random list of 100k numbers and applied isEven predicate on them for each of the approaches.

ApproachTime Taken (ms)
v194
v216277
v315642
v415585

Not so surprisingly the standard filter function was the fastest and took only 9ms, but no fun in using that 🙂

Anyway it was a fun exercise. Do you know of more ways of doing this? Please share.

Thanks for reading,

Kaivalya

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s