Wednesday, April 14, 2010

Scala Constructor Oddities and NPEs

While trying to track down a bug today I discovered a set of behaviors in scala that seem very dangerous to me and they exposed a few scala implementation details I wasn't expecting to see. The first thing I found was its trivial to make a class that is impossible to construct:


class UnConstructableClass{
def dontExcept() { b.size}
dontExcept
val b = List()
}


If you run "new UnConstructableClass", you will get a NullPointerException. It turns out that while the object is being constructed all of the fields (val and var) are assigned null before that line in the constructor is run, so its possible to have unexpected null values and exceptions in scala code that is supposed to be immune to nulls!

Thats an obviously pathological case and veteran scala developers will say that "you should allways define your vals/vars before using them". This is of-course true and a good rule of thumb but I would make 2 points. The first is that the compiler has enough information to error or atleast warn us when we've done something stupid like this. The second issue is that function definitions don't behave the same way, we can use them before they have been defined, we can switch the definiton and call for dontExcept and it still excepts. I think I understand the technical reasons why this occurs but its very non-intutive behavior in my mind.

What makes this behavior so insidious is the effects when using abstract classes. If we call a base classes function we have be completley sure that all parts of our derived class have been constructed or else we can end up with NPE's in completely unexpected locations. Consider the following:


class DerivedClass(delayCall : Boolean) extends BaseClass(delayCall)
{
start
val b = List()
def act() : Int = {b.size}
}


abstract class BaseClass(delayCall: Boolean){
implicit def asRunnable(func : ()=>unit) : Runnable = {new Runnable(){def run(){func()}}}

def act() : Int // abstract or overriden function
var result : Int = -1;
def start(){
if(delayCall) {
var t = new Thread(() => {
try{
result = act()
}catch{
case n : NullPointerException =>
result = -2;
}
})
val r = new Random
t.start
if(r.nextBoolean())Thread.sleep(10)
}
else result = act()
}
}


If we run try to construct a DerivedClass(false) then we will get an immediate NPE during construction because the behavior is totally deterministic. If instead we construct a DerivedClass(true) which calls our abstract function at a non-deterministic time in the future we may or may not get an NPE. This effect may also occur with overriden functions though I haven't investigated that yet.

Combining this type of issue and the often non-obvious asyncronous behavior of method calls (!?) its making me question when it is safe to call functions inside the class body constructor. The original bug that drove me to investigate this issue was "autostarting" an actor, calling the start function inside of the constuctor. This would cause a 1 out of 10 failure where a derived class wasn't fully constructed before the act() method was called and it tried to use an abstract field. For now the rule of thumb of initializing all fields before calling any functions will have to suffice.

I have a working example of these issues with test cases in a gist: http://gist.github.com/366364

5 comments:

Tony Morris said...

No different to Java.

import java.util.*;

class UnconstructableClass {
int dontExcept() { return b.size(); }

{
dontExcept();
}

List<Object> b = new LinkedList<Object>();
}

SamH said...

Actually this is different than javas behavior and that is what surprised me. In java all of the members have been initialized before the constructor is called which prevents this type of error from occurring. I have a gist of a simple testcase I ran on my machine and on an online compiler and it showed that the class is constructable.

Secondly my main point (which upon rereading of the article isn't incredibly clear) is that it was incredibly surprising to me that a scala class with only a 'val' could explode with an NPE if used incorrectly. The scala runtime should be able to detect that I am doing this and throw a more helpful error like "ConstructionOrderError" if I am ever trying to use a non-initialized 'val' field.

James Iry said...

It's not enabled by default because it's expensive, but a runtime check is available.

$ scalac -X
Usage: scalac
[...]
-Xcheckinit Add runtime checks on field accessors. Uninitialized accesses result in an exception being thrown.

Tony Morris said...

Sam,
The code you have given is different to that which I gave and is also different to the Scala code.

The code that I gave is not constructable.

SamH said...

Tony, you are correct, I hadn't encountered "Instance initializers" in java before so I had assumed you were writing pseduo code and had meant to use a constructor. While I was looking for information on the instance initializers I found the following article which was very helpful.

http://www.javaworld.com/javaworld/jw-03-1998/jw-03-initialization.html?page=5

The article claims that in java the compiler disallows using explicit forward references in those blocks and even uses the same function "trick" to force the compiler into letting us write incorrect code. This is something that the scala compiler doesn't appear to do, I can make an even simpler unconstructable class by using a val before declaring it.