For a long time I stayed away from Scala, just because I had heard a lot of horror stories from people using Scala and why Java is better. Clearly I was judging the book by its cover. Java and Scala are very different worlds and if we try to mix it up the resulting code is definitely a Horror.
Recently I started learning Scala(for work) and I have started enjoying it. I was trying multiple ways of implementing a filter function and found a few.
What is a filter
? – It filters a collection based on a given condition.
It takes a predicate, which is a function which evaluates to either true
or false
and then returns a collection of all the items which hold true for the given predicate.
For all the approaches I have used same function signature.
def filter[T](predicate: T => Boolean) (seq: Seq[T]) : Seq[T] = { ..... }
Closer look at the function signature.
- It is applied on a given type
T
. - It takes a predicate which is a function which takes an input of type T and returns
true
orfalse
. - It takes a
Sequence
of typeT
on which the filter is applied. - At the end filter returns the filtered
Sequence
of typeT
Approach 1: This uses the standard collect
function on a Seq
which collects only those items for which predicate
function returns true
.
def filterV1[T](predicate: T => Boolean) (seq: Seq[T]) : Seq[T] = {
seq.collect({
case item if predicate(item) => item
})
}
Approach 2: This approach uses pattern matching on the given sequence and recursively accumulates the items which holds up true
for the given predicate function. To optimize the recursion it is designed in a tail recursive manner. @tailrec
annotation is used to validate the tail recursion behavior.
def filterV2[T](predicate: T => Boolean)(seq: Seq[T]): Seq[T] = { @tailrec def filterInner(acc: Seq[T], originalSeq: Seq[T]): Seq[T] = { originalSeq match { case Nil => acc case first :: last => if (predicate(first)) filterInner(acc :+ first, last) else filterInner(acc, last) } } filterInner(Nil, seq) }
Approach 3: Here we use a function called foldLeft
on the sequence which takes a function to perform the folding operation. It appends the item to the accumulator only if predicate holds true.
def filterV3[T](predicate: T => Boolean)(seq: Seq[T]): Seq[T] = { seq.foldLeft(Seq[T]()) { (acc, item) => if (predicate(item)) acc :+ item else acc } }
Approach 4: This approach is very similar to approach 3 except it does it in a reverse manner.
def filterV4[T](predicate: T => Boolean)(seq: Seq[T]): Seq[T] = {
seq.foldRight(Seq[T]()) {
(item, acc) => if (predicate(item)) acc :+ item else acc
}
}
Comparing performance of the approaches.
I generated a random list of 100k numbers and applied isEven
predicate on them for each of the approaches.
Approach | Time Taken (ms) |
v1 | 94 |
v2 | 16277 |
v3 | 15642 |
v4 | 15585 |
Not so surprisingly the standard filter
function was the fastest and took only 9ms
, but no fun in using that 🙂
Anyway it was a fun exercise. Do you know of more ways of doing this? Please share.
Thanks for reading,
Kaivalya