Sunday, July 13, 2008

Traits in Scala: a Powerful Design Tool

Traits in Scala are an extremely powerful Object Oriented (OO) design tool. They also provide a powerful mechanism for code reuse by making good use of subtyping and delegation.

In terms of raw functionality, traits allow us to:
  • Define types by specifying the signatures of supported methods. This is similar to how interfaces work in Java.
  • Provide partial/full implementations that can be mixed into classes that use the trait. A class can mix in multiple traits, as opposed to (single) inheritance, where a class can extend only one superclass.
This core functionality can be used in different ways to provide powerful usage patterns. But before I start digging into that, let me define, for quick reference, some frequently used terms in the area of OO design:


Type: a set of methods that an object responds to, as defined by an Interface.

Class: a definition of an object's implementation.

Subtype: a relation between Types. A given Type is a subtype of another Type if its Interface is a superset of the Interface of the other type.

Subclass: a relation between Classes. A subclass inherits its Type and its Implementation from its parent class.

Interface Inheritance: enables the creation of a subtype relation between two interfaces.

Class Inheritance: enables the creation of a subclass relation between two classes. Class inheritance is essentially a code reuse mechanism that is defined statically at compile time. Class inheritance also implies subtyping: a class that inherits from another class is a subtype of (the type of) the parent class.

Composition: Another code reuse mechanism, in which new functionality is obtained by assembling or composing objects - at runtime. Composition does not imply subtyping. Composition involves a wrapped object, a wrapper object, and method forwarding.


With that out of the way, let's move on...

Inheritance and composition are the primary mechanisms of code reuse within Object Oriented systems. A lot has been written about the relative merits and demerits of these two approaches, so I will not go into that in a lot of detail here. In general, composition is preferred to inheritance for the following reasons:
  • Dynamic reconfigurability: with composition (and its slightly looser variant - aggregation), objects are wired together at runtime to achieve desired functionality. Consequently, these objects can be rewired at runtime to tweak behavior. Inheritance, on the other hand, results in systems with runtime structures that are statically fixed at compile time.
  • Ability to compose multiple objects without complication, to provide different facets of an object's behavior. A similar effect can potentially be achieved with multiple-inheritance (MI), but MI has a host of problems associated with it.
  • Ability to compose multiple objects in a chain without complication, to refine a particular facet of an object's behavior. An example of this is a chain of Interceptor objects attached to an object. This kind of effect is difficult to achieve via inheritance.

Another touted benefit of composition is that it involves black-box reuse with loose coupling and good encapsulation. Inheritance, on the other hand, involves white-box reuse, and consequently breaks encapsulation.

I think this argument is a double-edged sword. Inheritance definitely involves stronger coupling than composition for a scenario where a class wants to reuse the code in another (supplier) class; but this comes with potential benefits. To dig deeper into this, let's focus on just the public interface of a supplier class (as opposed to also talking about its protected interface, which is not available via composition). The primary reason for the stronger coupling shown by inheritance is the potential for self-use of public methods. This happens when a public method (say m1) in the supplier class calls another public method (say m2) in the same class. If a subclass overrides m2 (but not m1), then a call to the supplier's m1 method on an instance of the subclass will result in a call to the overridden m2. In other words, a subclass can alter the behavior of a method in its parent supplier class without redefining it. As opposed to this, with composition, there is no way that a wrapper object can interfere with a method call to a wrapped supplier object.
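To make the self-use point concrete, here is a minimal sketch. The names Supplier, Sub, Wrapper, and SelfUseDemo are hypothetical and exist only for illustration; m1 and m2 play the roles described above.

```scala
// Hypothetical supplier class: the public method m1 self-uses the public method m2.
class Supplier {
  def m1: String = "m1 -> " + m2
  def m2: String = "Supplier.m2"
}

// Inheritance: overriding m2 alone changes what the inherited m1 returns.
class Sub extends Supplier {
  override def m2: String = "Sub.m2"
}

// Composition: the wrapper forwards m1, but the wrapped object's
// self-call to m2 never reaches the wrapper's own m2.
class Wrapper(wrapped: Supplier) {
  def m1: String = wrapped.m1
  def m2: String = "Wrapper.m2"
}

object SelfUseDemo {
  val viaInheritance: String = new Sub().m1                  // "m1 -> Sub.m2"
  val viaComposition: String = new Wrapper(new Supplier).m1  // "m1 -> Supplier.m2"
}
```

The subclass changes m1 without touching it; the wrapper cannot.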

So we see that, for our scenario of interest, inheritance involves stronger coupling than composition because of self-use; but if this self-use is done in a controlled fashion, the stronger coupling afforded by inheritance can actually be useful. We will see an example of this later in the post. In a similar scenario, composition does not work out quite as well; this is because of the Self Problem: once a wrapper forwards a call to the wrapped object, self inside that object refers to the wrapped object itself, so its self-calls never come back through the wrapper.

On the other side of the coin, the stronger coupling shown by Inheritance has potential drawbacks. These are documented by Joshua Bloch in Effective Java, 1st ed., Item 14.


Composition, as a mechanism for reuse, has a lot of benefits compared to inheritance. But it also has some drawbacks:
  • Unlike inheritance, it does not play well with polymorphism in situations where it is used to enhance the functionality of an existing class. Consider, for example, a class A, and another class that contains a List of As. If a class B extends A via composition, instances of B cannot be added to the List of As, thus defeating the whole purpose of polymorphism.
    This is easily fixed via subtyping, with class B inheriting from an interface that it shares with A (say X), and then using composition to implement X. The List of As now becomes a List of Xs.
    But when this fix is applied, there is a lot of grunt work and code noise involved in forwarding all the methods of the implemented interface to the wrapped object.
  • Unlike inheritance, it exhibits the Self Problem.
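Both drawbacks can be seen in a small sketch. Everything here is hypothetical: X stands for the shared interface mentioned above, A and B for the two classes, and the methods name, greet, and shout are made up for the example.

```scala
// Hypothetical shared interface X, implemented by both A and B.
trait X {
  def name: String
  def greet: String
  def shout: String
}

class A extends X {
  def name: String = "A"
  def greet: String = "hello from " + name   // self-use of name
  def shout: String = greet.toUpperCase
}

// B reuses A via composition; every method of X has to be forwarded by hand.
class B(wrapped: X) extends X {
  def name: String = "B"                // the one thing B wants to change
  def greet: String = wrapped.greet     // forwarding noise
  def shout: String = wrapped.shout     // forwarding noise
}

object ForwardingDemo {
  // Thanks to the shared interface, A and B fit into the same List of Xs.
  val xs: List[X] = List(new A, new B(new A))

  // Self Problem: the wrapped A still sees its own name, not B's.
  val greetings: List[String] = xs.map(_.greet)
}
```

Here greetings contains "hello from A" twice: B's name never takes part in the forwarded greet call.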
This is where traits come in. Traits address the problems with composition, while keeping most of its advantages. A couple of points are worth discussing:
  • One element of composition that is compromised by traits is 'dynamic reconfigurability', but this is not a big issue in practice. Most cases of 'dynamic reconfigurability' make use of Dependency Injection to wire dependencies into an object, and this is normally done only once at program startup. Traits provide similar functionality, albeit at compile time, via required methods or self-types.
  • With regard to the Self Problem, traits make use of true delegation; they consequently do not suffer from the Self Problem. The use of delegation introduces tighter coupling than composition for self-use scenarios, with the attendant risks and benefits.
That's more than enough talk! Let's look at some code.

Let's say that we want to enrich Java Sets so that we can do folds and foreachs and all that good functional stuff with them. Here's a trait that defines the methods that we want to provide:
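import java.util.Iterator

trait RichIterableOps[A] {

  // required method: supplied by the Java collection that mixes in this trait
  def iterator: Iterator[A]

  // apply f to every element, in iteration order
  def foreach(f: A => Unit): Unit = {
    val iter = iterator
    while (iter.hasNext) f(iter.next())
  }

  // fold the elements from left to right, starting from seed
  def foldLeft[B](seed: B)(f: (B, A) => B): B = {
    var result = seed
    foreach(e => result = f(result, e))
    result
  }
}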
Let me fire up the Scala Interpreter to play with this. I am going to mix the trait into a HashSet, and then do some operation on the Set; let's see how that plays out:
scala> val richSet = new HashSet[Int] with RichIterableOps[Int]
richSet: java.util.HashSet[Int] with traits.RichIterableOps[Int] = []

scala> richSet.add(1); richSet.add(2)

scala> richSet
res0: java.util.HashSet[Int] with traits.RichIterableOps[Int] = [1, 2]

scala> richSet.foldLeft(1)((x,y) => x+y)
res1: Int = 4
That's exactly what we wanted!

So why did I call the trait RichIterableOps as opposed to RichSetOps? Because it makes no assumptions about the kind of collection it is working with. All it needs to do its work is for the collection to provide an iterator() method. To validate this point, let me try it with a List now:
scala> val richList = new ArrayList[Int] with RichIterableOps[Int]
richList: java.util.ArrayList[Int] with traits.RichIterableOps[Int] = []

scala> richList.add(1)
res2: Boolean = true

scala> richList.add(2)
res3: Boolean = true

scala> richList
res4: java.util.ArrayList[Int] with traits.RichIterableOps[Int] = [1, 2]

scala> richList.foldLeft(1)((x,y) => x+y)
res5: Int = 4
Already, we're starting to see the power of traits. Let me pause for a moment here and enumerate the trait features that we've seen so far:
  • Trait usage Pattern 1: A trait can be used to provide a rich interface for a class. It does this by declaring required methods, which a class that mixes in the trait needs to implement. Based on these required methods, the trait can provide additional (rich) methods to the class. Required methods in a trait lead to low, controlled, and good coupling; more on that later.
  • A trait functions as a powerful unit of code reuse: it can be mixed into unrelated classes within an existing class hierarchy to provide extra functionality. This works despite the following potential obstacles:
    • The existing classes already extend other classes.
    • No source code is available for the (possibly third party) class hierarchy, so making tweaks to the existing classes is not possible.
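The rich-interface pattern is not tied to Java collections. As a minimal sketch with hypothetical names (RichCompare, Version), a trait can declare a single required method and derive a family of comparison operators from it:

```scala
// Hypothetical trait: one required method, four rich methods built on top of it.
trait RichCompare[A] {
  // required method
  def compare(that: A): Int

  // rich methods, all derived from compare
  def <(that: A): Boolean = compare(that) < 0
  def >(that: A): Boolean = compare(that) > 0
  def <=(that: A): Boolean = compare(that) <= 0
  def >=(that: A): Boolean = compare(that) >= 0
}

// The class implements only the required method and gets the operators for free.
class Version(val major: Int, val minor: Int) extends RichCompare[Version] {
  def compare(that: Version): Int =
    if (major != that.major) major - that.major else minor - that.minor
}

object RichCompareDemo {
  val older = new Version(2, 7)
  val newer = new Version(2, 8)
  val isOlder: Boolean = older < newer       // true
  val isAtLeast: Boolean = older >= newer    // false
}
```

The only coupling between Version and the trait is the signature of compare.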
Next, let's say that we want a Set of Strings that does not care about the case of its elements. This looks like something that can be implemented using an existing Set class, with minor code enhancements. I'll capture the desired enhancement in a trait:
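import java.util.Set

trait IgnoreCaseSet extends Set[String] {

  // lower-case each element on its way in
  abstract override def add(e: String): Boolean = {
    super.add(e.toLowerCase)
  }

  // the casts assume String arguments; anything else throws ClassCastException
  abstract override def contains(e: Any): Boolean = {
    super.contains(e.asInstanceOf[String].toLowerCase)
  }

  abstract override def remove(e: Any): Boolean = {
    super.remove(e.asInstanceOf[String].toLowerCase)
  }
}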
Let me try this out:
scala> val icSet = new HashSet[String] with IgnoreCaseSet
icSet: java.util.HashSet[String] with traits.IgnoreCaseSet = []

scala> icSet.add("Hi There")
res0: Boolean = true

scala> icSet
res1: java.util.HashSet[String] with traits.IgnoreCaseSet = [hi there]

scala> icSet.contains("hi there")
res2: Boolean = true

scala> icSet.remove("hi there")
res3: Boolean = true

scala> icSet
res4: java.util.HashSet[String] with traits.IgnoreCaseSet = []
Looks good!

Once again, let me enumerate the trait features that we just saw:
  • Trait usage Pattern 2: A trait can decorate the behavior of an existing class. It does this by:
    • extending an Interface that it shares with the class.
    • overriding the methods that it wants to decorate with the help of the abstract override modifier.
    • delegating to the original methods within its overridden methods using the super keyword.
It is possible for multiple traits to decorate the same method in a class by forming a chain of stackable decorators. In this scenario, the use of super calls within the traits provides control flow along the stack of decorators. The precise order in which this happens is governed by a process called Linearization. Details about Linearization are available in the Scala Language Spec, if you're interested.
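Here is a minimal sketch of such a stack, to show that the mixin order matters. Formatter, Plain, Exclaim, and Doubled are hypothetical names; like IgnoreCaseSet, each trait tweaks the argument and then delegates via super.

```scala
// Hypothetical shared interface and base implementation.
trait Formatter {
  def format(s: String): String
}

class Plain extends Formatter {
  def format(s: String): String = s
}

// Two stackable decorators: each modifies the argument, then calls super.
trait Exclaim extends Formatter {
  abstract override def format(s: String): String = super.format(s + "!")
}

trait Doubled extends Formatter {
  abstract override def format(s: String): String = super.format(s + s)
}

object StackingDemo {
  // The rightmost trait runs first; its super call goes to the trait on its left.
  val doubledFirst: String = (new Plain with Exclaim with Doubled).format("hi")  // "hihi!"
  val exclaimFirst: String = (new Plain with Doubled with Exclaim).format("hi")  // "hi!hi!"
}
```

In the first case Doubled sees "hi" and hands "hihi" to Exclaim; in the second, Exclaim sees "hi" and hands "hi!" to Doubled.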

Let's test something here. I want to see what happens when I add multiple Strings to the Set using the addAll() method. The interesting thing here is that the IgnoreCaseSet trait does not override the addAll() method. Does that mean that we can sneak in behind the covers and add Strings with uppercase letters to the Set - by using the addAll() method? Or is the trait going to be able to hook into this operation?
scala> val list = new ArrayList[String]
list: java.util.ArrayList[String] = []

scala> list.add("Element 1"); list.add("Element 2"); list.add("Element 3")

scala> list
res5: java.util.ArrayList[String] = [Element 1, Element 2, Element 3]

scala> icSet.addAll(list)
res6: Boolean = true

scala> icSet
res7: java.util.HashSet[String] with traits.IgnoreCaseSet = [element 3, element 2, element 1]
As we can see, the trait was able to hook into the functioning of the addAll() method, to help us do the right thing. This worked for two reasons:
  • HashSet's addAll() method calls its add() method (which it should, to keep things DRY).
  • The self variable for the call chain is such that HashSet.addAll()'s call to add() gets routed to the mixed in Trait. This feature distinguishes the delegation used by Traits from the simpler notion of composition; it also overcomes the Self Problem.
Note: if we use composition in this scenario, we will need to wrap the addAll() method to get the right behavior.
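For comparison, here is a rough sketch of what the composition route might look like. IgnoreCaseWrapper and its addAllIgnoringCase method are hypothetical names invented for this example; only the java.util classes are real.

```scala
import java.util.{ArrayList, Collection, HashSet, Set => JSet}

// Hypothetical wrapper: ignores case via composition instead of a trait.
class IgnoreCaseWrapper(wrapped: JSet[String]) {
  def add(e: String): Boolean = wrapped.add(e.toLowerCase)
  def contains(e: String): Boolean = wrapped.contains(e.toLowerCase)

  // Plain forwarding: the wrapped set's internal add() calls bypass our add().
  def addAll(c: Collection[String]): Boolean = wrapped.addAll(c)

  // The extra wrapping that composition needs: route every element through add().
  def addAllIgnoringCase(c: Collection[String]): Boolean = {
    var changed = false
    val it = c.iterator()
    while (it.hasNext) if (add(it.next())) changed = true
    changed
  }
}

object CompositionDemo {
  private val list = new ArrayList[String]
  list.add("Element 1")

  private val leaky = new IgnoreCaseWrapper(new HashSet[String])
  leaky.addAll(list)                                      // stores "Element 1" unchanged
  val leakyFinds: Boolean = leaky.contains("Element 1")   // false

  private val fixed = new IgnoreCaseWrapper(new HashSet[String])
  fixed.addAllIgnoringCase(list)                          // stores "element 1"
  val fixedFinds: Boolean = fixed.contains("Element 1")   // true
}
```

With plain forwarding, the mixed-case String slips into the wrapped set and the lower-cased lookup misses it; the hand-written addAllIgnoringCase is the price of fixing that.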

Now - let's say that we want a String Set that provides rich operations, but also ignores the case of its elements. I should be able to combine the two traits that I just created to get the desired behavior:
scala> val icRichSet = new LinkedHashSet[String] with IgnoreCaseSet with RichIterableOps[String]
icRichSet: java.util.LinkedHashSet[String] with traits.IgnoreCaseSet with traits.RichIterableOps[String] = []

scala> icRichSet.add("Hi There"); icRichSet.add("My Friends")

scala> icRichSet
res8: java.util.LinkedHashSet[String] with traits.IgnoreCaseSet with traits.RichIterableOps[String] = [hi there, my friends]

scala> icRichSet.foldLeft("Namaskar, and")((x,y) => x + " " + y)
res9: java.lang.String = Namaskar, and hi there my friends
Looks good.

Btw, I used a LinkedHashSet above because I needed the fold operation to return deterministic output based on the order in which I put things into the Set.

Again, let me enumerate the trait features that we just saw:
  • Trait usage Pattern 3: Multiple traits can be mixed into a class to provide different facets of a class's behavior. This is an extremely powerful capability, because it works at the level of two of the most fundamental forces in OO design, cohesion and coupling:
    • Cohesion: traits encourage small, cohesive chunks of code that focus on doing one thing well. These chunks can then be mixed together into a class to provide the class's functionality.
    • Coupling: when a trait is mixed into a class, there is very little coupling between it and the class. In general, three different kinds of coupling are possible when a class mixes in a trait:
      • No coupling: the trait just mixes in and does its thing.
      • Coupling based on required methods: in this case, the class mixing in a trait just needs to implement methods that match the required methods of the trait.
      • Coupling based on overridden methods in a shared Interface: in this case, the coupling is based on an interface. This is the highest degree of coupling that we encounter when a trait is mixed into a class. But even this is good coupling, because it is based on well-defined interfaces.
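The three kinds of coupling can sit side by side in one object. In this sketch all names (HitCounter, Described, Shape, Scaled, UnitSquare) are hypothetical, chosen only to show one trait per kind:

```scala
// No coupling: the trait carries its own state and behavior.
trait HitCounter {
  private var hits = 0
  def hit(): Int = { hits += 1; hits }
}

// Coupling based on a required method.
trait Described {
  def name: String                          // required method
  def describe: String = "<" + name + ">"
}

// Coupling based on overridden methods in a shared interface.
trait Shape {
  def area: Double
}

trait Scaled extends Shape {
  abstract override def area: Double = super.area * 2
}

class UnitSquare extends Shape {
  def area: Double = 1.0
}

object FacetsDemo {
  // Each trait contributes one facet of the object's behavior.
  val square = new UnitSquare with Scaled with HitCounter with Described {
    def name: String = "square"
  }
  val description: String = square.describe         // "<square>"
  val area: Double = square.area                    // 2.0
  val hits: Int = { square.hit(); square.hit() }    // 2
}
```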

So, do traits have any downsides? A few minor ones:
  • Traits do not have constructor parameters (but this should not be an issue in practice).
  • Traits impose a slight performance overhead (but this is unlikely to impact the overall performance of your program).
  • Traits introduce compilation fragility for classes that mix them in. If a trait changes, a class that mixes it in has to be recompiled.
And what happens to abstract classes now that we have traits? Traits definitely encroach on the design space occupied by abstract classes, which, in my mind, is the capturing of commonality within a very localized class hierarchy. If you find that even three (or two?) classes that implement an interface share some commonality, this commonality is a good candidate for being pulled out into the interface trait. This would fall under the category of the 'provide rich interface' mode of trait usage.

Conclusion

I have covered a fair bit of ground in this post. We saw that traits are a powerful tool for OO design, and that they serve as a great mechanism for code reuse. We also looked at the three primary usage patterns for traits:
  • To provide rich interfaces for classes that mix them in.
  • To decorate the behavior of existing classes in a stackable manner.
  • To provide different facets of a class's behavior.
Along the way we also saw how traits:
  • Eliminate the grunt work and code noise involved in doing manual composition with interface inheritance/subtyping.
  • Overcome the Self Problem.
  • Help in achieving high cohesion and low coupling.
I hope all of this is starting to make you see that traits are a great tool to have in your design toolbox.

Happy Trait-ing (and Scala-ing)!

Relevant Reads:
Programming in Scala; the chapter on Traits
Traits: Composable Units of Behaviour
Using Prototypical Objects to Implement Shared Behavior in Object Oriented Systems
What is (Not) Delegation
Scala Language Spec; the section on Trait Linearization

3 comments:

Jesper Nordenberg said...

Excellent post. For interoperability, and performance, reasons it's very understandable to include traits and (abstract) classes in Scala, but in a minimal OO language it would be sufficient to have interfaces, objects and delegation support (traits in Scala are basically implemented as delegation with a self parameter). With proper language support for delegation what you call "grunt work and code noise" would not be an issue. I also think it's valuable to cleanly separate interface (types) and implementation (i.e. objects). Eliminate classes and you eliminate a lot of problems with inheritance, access rights, initialization etc. It would be interesting to see a JVM language with delegation support and without classes.

Germán said...

Great post! One question: why do the overrides in IgnoreCaseSet need the abstract keyword? Is it because Set is an interface?

Lalit Pant said...

Germán,
Partly so. The need for the abstract keyword in the overridden methods of IgnoreCaseSet is driven by two factors:
- the methods override (effectively) abstract methods in java.util.Set
- they make use of super calls

Given that these super calls are calling into abstract methods (which are not defined!), we need a mechanism to tell the Scala compiler that this usage is deliberate, that we're trying to decorate existing behavior, and that we intend this trait to be mixed in with a class/trait which does define the overridden methods. The abstract keyword serves this purpose.