David Buck
Note: When this article was originally written, I was using the term Static Typing to refer to the kind of explicitly-declared static typing that is used by Java, C# and C++. Languages that are statically typed but have type inference, such as Haskell, OCaml and Scala, don't suffer from the problems described in this article. I've therefore changed the term to explicit static typing. Many people use the term manifest typing for this as well, and it's not clear which term is best suited. In any event, I'm referring to languages that require the programmer to explicitly provide the type of each variable, parameter and return value.
Java borrowed many concepts from Smalltalk - object-oriented programming, a bytecode language, just-in-time compiling, automatic garbage collection and collections, to name just a few. It's interesting that one concept that's fundamental to the Smalltalk language wasn't adopted by Java - blocks. Blocks are objects which contain code that can be executed later. Because Smalltalk has blocks, it doesn't need special syntax for conditional and looping structures. Smalltalk simply uses blocks and ordinary message passing to make these constructs work.
This leads to an interesting question: what would Java look like if it had blocks? To answer this question, let's look at Smalltalk blocks a bit more, then see what would be required to add explicit static typing to them.
Smalltalk blocks may or may not have parameters. Consider the following Smalltalk statement:
10 timesRepeat: [Transcript show: 'Hello'; cr]
This statement uses a block that doesn't have any parameters. The method timesRepeat: for Integers knows how to run the block the proper number of times and doesn't need to pass a parameter to the block each time.
On the other hand, consider the following statement:
1 to: 10 do: [:index | Transcript show: index printString; cr]
This block takes one parameter. The to:do: method for Integers knows how to run the block 10 times passing in the numbers 1 to 10 as parameters.
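For comparison, here is a rough Java sketch of the same two calls, using the lambda expressions that were added in Java 8, well after the language was first designed. The class BlockLoops and its methods timesRepeat and toDo are hypothetical names chosen to mirror the Smalltalk selectors; they are not part of the Java library. Note that the parameterless block and the one-parameter block already need two different declared types (Runnable and IntConsumer).

```java
import java.util.function.IntConsumer;

// Hypothetical helper class; the method names mirror the Smalltalk selectors.
class BlockLoops {
    // Analogue of `n timesRepeat: aBlock`: the block takes no parameters.
    static void timesRepeat(int n, Runnable block) {
        for (int i = 0; i < n; i++) {
            block.run();
        }
    }

    // Analogue of `start to: stop do: aBlock`: the block receives each index.
    static void toDo(int start, int stop, IntConsumer block) {
        for (int i = start; i <= stop; i++) {
            block.accept(i);
        }
    }

    public static void main(String[] args) {
        timesRepeat(10, () -> System.out.println("Hello"));
        toDo(1, 10, index -> System.out.println(index));
    }
}
```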
Blocks always return some sort of result. The result returned is the result of running the last statement in the block. Here's a sample block:
[4 + 5] value
The result of this statement is 9. Many times, the caller doesn't care about the return value of the block and just discards it. Notice that the value message executes the block and returns the result. If you have parameters, you must use value:, value:value:, etc. to pass the right number of parameters.
[:x :y | x * y] value: 4 value: 7 ==> 28
If you send the wrong message and don't provide enough parameters, you'll get a runtime exception.
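The same evaluations might look like this in Java (a sketch only; BlockValues and its method names are hypothetical). Each block arity maps to a different interface type, so calling a two-parameter block with a single argument is rejected by the compiler rather than failing at run time.

```java
import java.util.function.BiFunction;
import java.util.function.Supplier;

// Hypothetical demo class, not part of any library.
class BlockValues {
    // Analogue of `[4 + 5] value`.
    static int evaluateNoArgs() {
        Supplier<Integer> block = () -> 4 + 5;
        return block.get();
    }

    // Analogue of `[:x :y | x * y] value: 4 value: 7`.
    static int evaluateTwoArgs() {
        BiFunction<Integer, Integer, Integer> block = (x, y) -> x * y;
        // block.apply(4) would not compile: BiFunction.apply takes two arguments.
        return block.apply(4, 7);
    }

    public static void main(String[] args) {
        System.out.println(evaluateNoArgs());  // prints 9
        System.out.println(evaluateTwoArgs()); // prints 28
    }
}
```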
Suppose we now want to add explicit static typing to blocks. Clearly, you'd have to provide types for the block parameters. Let's say the syntax looks like this:
[Customer customer | customer name]
This block returns the customer's name. The parameter has to be a Customer.
Now, suppose we can run the block by providing it with a parameter. Again, let's invent a syntax for it. (Smalltalk has no special syntax for evaluating blocks; it uses normal message sends. At this point, however, that would complicate things too much.)
[Customer customer | customer name] (myCustomer)
We should be able to assign the result of running the block into a variable:
String result := [Customer customer | customer name] (myCustomer)
Something's wrong here. We've explicitly typed the parameter, but it's equally important to specify the return type of the block. Otherwise, when you execute the block, you won't know the type of the value returned in order to assign it into result without a type error. Let's extend the syntax:
String result := <String>[Customer customer | customer name] (myCustomer)
This block declares that it returns a String and allows us to assign it into the result variable which is declared to be a String.
We should be able to assign the block itself into a variable. What's the type of the variable? I would suspect that Block isn't enough.
Block myBlock := <String>[Customer customer | customer name].
We have several problems here. First, you shouldn't really be able to assign a block with the wrong number of parameters to myBlock - otherwise you may later call it with the wrong number of parameters and the problem wouldn't be detectable at compile time. Catching such problems at compile time is what explicit static typing is all about. In addition, you shouldn't be able to assign a block that has the wrong parameter types or the wrong return type to the variable. To prevent this, the type of the variable must be better specified. Let's try this:
Block <String>(Customer)myBlock := <String>[Customer customer | customer name]
To evaluate the block, we have to provide a special syntax that can reflect the parameter types and return type of the block. Let's say we call it this way:
Block <String>(Customer)myBlock :=
    <String>[Customer customer | customer name].
String string := myBlock(myCustomer).
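Java's java.util.function.Function<Customer, String> is a reasonably close real-world counterpart to the invented type Block <String>(Customer). A minimal sketch, assuming a hypothetical Customer class with a name() accessor (neither is defined in this article):

```java
import java.util.function.Function;

// Hypothetical demo class, not part of any library.
class TypedBlockDemo {
    // Hypothetical stand-in for the article's Customer class.
    static final class Customer {
        private final String name;

        Customer(String name) {
            this.name = name;
        }

        String name() {
            return name;
        }
    }

    static String nameOf(Customer myCustomer) {
        // The variable's type records both the parameter type and the return type.
        Function<Customer, String> myBlock = customer -> customer.name();
        // Evaluating the block yields a String without any cast.
        return myBlock.apply(myCustomer);
    }

    public static void main(String[] args) {
        System.out.println(nameOf(new Customer("Alice")));
    }
}
```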
Now that we can declare types for the block's parameters, the return types, and the block itself, we can start looking for good uses for our statically typed blocks. Let's take a Smalltalk example:
#(6 8 4 2 7) collect: [:each | each printString] ==> #('6' '8' '4' '2' '7')
The collect: method is implemented by collections and takes a one-parameter block as its parameter. It loops through the elements of the collection and passes each one in turn to the block. It returns a collection of the return values of the block.
We might define collect: like this:
collect: aBlock
    | result |
    result := self class new: self size.
    1 to: self size do: [:index |
        result at: index put: (aBlock value: (self at: index))].
    ^result
(This isn't a particularly good implementation, but it's good for explaining the concepts.)
In other words, create a new collection, then loop through the indexes from 1 to my size, extract the element from myself, evaluate the block with that element, and write the result into the result collection.
Now, how would this translate when we have explicit static typing? We'll have to provide types for each variable and each return value. Keeping a Smalltalk-like syntax, the result would look something like this:
<Array of String>collect: <String>(Customer)aBlock
    <Array of String> result;
    result := <Array of String> new: self size.
    1 to: self size do: [Integer index |
        result at: index put: (aBlock(self at: index))].
    ^result
This method would work in our particular case, but it's clearly not general. In general, collect: could iterate over any kind of collection of any kind of objects and return a collection of any kind of objects. Restricting it as we did above would mean that we'd have to create many versions of collect: each with a different type of parameter and a different return type. This quickly becomes infeasible.
One possible solution is to loosen our explicit static typing and require casting in order to make the type system happy. This is how Java collections work in general. If you have a collection of Vehicles and put a Car in, you have to cast it as a Car when you extract it from the collection if you ever want to use it as a Car and not just a Vehicle.
To apply that principle to this case, we would probably want to change the method to look like this:
<Array of Object>collect: <Object>(Object)aBlock
    <Array of Object> result;
    result := <Array of Object> new: self size.
    1 to: self size do: [Integer index |
        result at: index put: (aBlock(self at: index))].
    ^result
The part that feels wrong here is executing the block. The block may not work if you pass some object other than a Customer but we're forced to declare it to take an Object which allows us to pass any object to it. The system would have to do an implicit cast down in order to make this call work. This automatic cast seems wrong.
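In Java, the Object-based version could be sketched as below (ObjectCollect is a hypothetical name). Java does not insert the downcast implicitly, so in this sketch it is written by hand inside the block, and a wrong element type is only caught at run time.

```java
import java.util.function.Function;

// Hypothetical demo class, not part of any library.
class ObjectCollect {
    // Everything is typed as Object, so any array and any block are accepted.
    static Object[] collect(Object[] items, Function<Object, Object> aBlock) {
        Object[] result = new Object[items.length];
        for (int index = 0; index < items.length; index++) {
            result[index] = aBlock.apply(items[index]);
        }
        return result;
    }

    public static void main(String[] args) {
        Object[] numbers = {6, 8, 4, 2, 7};
        // The block must cast its argument down before using it as an Integer.
        Object[] doubled = collect(numbers, each -> ((Integer) each) * 2);
        // The caller must cast each result as well.
        int first = (Integer) doubled[0];
        System.out.println(first); // prints 12
        // This also compiles, but would throw ClassCastException when run:
        // collect(new Object[] {"six"}, each -> ((Integer) each) * 2);
    }
}
```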
The C++ approach would be to declare a template for collect: but not the specific method. The template might look something like this:
Method for <Array of <<T2>>>:
<Array of <<T1>>> collect: <<T1>>(<<T2>>)aBlock
    <Array of <<T1>>> result;
    result := <Array of <<T1>>> new: self size.
    1 to: self size do: [Integer index |
        result at: index put: (aBlock(self at: index))].
    ^result
This would mean that you never have a single collect: method - you have a family of collect: methods that have different parameters and return types. Since there is no single method for collect:, it's more difficult to debug. There's no single place where you can put breakpoints and it's harder to locate the actual code.
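As a point of comparison, Java's generic methods can express the same two type parameters. This is only a sketch of the typing: Java generics are implemented by type erasure, so it does not reproduce the C++ behaviour of generating a separate method for each instantiation. GenericCollect is a hypothetical name, and T1 and T2 follow the roles they have in the template.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical demo class, not part of any library.
class GenericCollect {
    // T2 is the element type and T1 the result type, as in the template.
    static <T2, T1> List<T1> collect(List<T2> items, Function<T2, T1> aBlock) {
        List<T1> result = new ArrayList<>(items.size());
        for (T2 item : items) {
            result.add(aBlock.apply(item));
        }
        return result;
    }

    public static void main(String[] args) {
        // Here the compiler infers T2 = Integer and T1 = String.
        List<String> strings =
                collect(Arrays.asList(6, 8, 4, 2, 7), each -> each.toString());
        System.out.println(strings); // prints [6, 8, 4, 2, 7]
    }
}
```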
If a general method wants to call collect: and pass in a block it got as a parameter, the method has to be a template.
Method for <Array of <<T2>>>:
<Array of String>collectPrintStrings: <<T1>>(<<T2>>)aBlock
    ^self collect:
        <String>[<<T2>> each | aBlock(each) printString]
Pretty soon, large portions of your library become templates and the complexity increases dramatically.
Suppose I want an object that can do formatted printing of any other object. To make it general, I want to hold a Dictionary (Mapping) that maps classes to blocks that take an instance of that class and return a formatted string. In Smalltalk, I might write the formatting routine like this:
format: anObject
^(formats at: anObject class) value: anObject
The variable formats is a Dictionary. The keys are classes and the values are one-parameter blocks. I might create formats like this:
formats := Dictionary new
    at: Integer put: [:integer | integer printString];
    at: Double put: [:double | double printFormattedBy: '###.##'];
    at: Timestamp put: [:timestamp | timestamp printFormattedBy: 'mm/dd/yyyy hh:mm:ss'];
    yourself.
In an explicit static typing system, what would the type of formats be? I don't know of any explicit static typing system that is able to even express this concept. That's not to say that there isn't one, just that I don't know about it :-).
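One way to approximate this in Java, sketched below under assumed names (ObjectFormatter, register, format), is to give up on expressing the key-to-block relationship in the type of the map itself. The blocks are stored as functions of Object, and a run-time cast through the Class key restores the specific type. The %.2f format string is a stand-in for the Smalltalk pattern, not an equivalent of it.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical demo class, not part of any library.
class ObjectFormatter {
    // The value type cannot say "a block whose parameter matches the key",
    // so every block is stored as a function of Object.
    private final Map<Class<?>, Function<Object, String>> formats = new HashMap<>();

    // The downcast to T happens here, at run time, via Class.cast.
    <T> void register(Class<T> type, Function<? super T, String> block) {
        formats.put(type, anObject -> block.apply(type.cast(anObject)));
    }

    String format(Object anObject) {
        Function<Object, String> block = formats.get(anObject.getClass());
        if (block == null) {
            throw new IllegalArgumentException("No format for " + anObject.getClass());
        }
        return block.apply(anObject);
    }

    public static void main(String[] args) {
        ObjectFormatter formatter = new ObjectFormatter();
        formatter.register(Integer.class, integer -> integer.toString());
        formatter.register(Double.class, d -> String.format(Locale.ROOT, "%.2f", d));
        System.out.println(formatter.format(42));      // prints 42
        System.out.println(formatter.format(3.14159)); // prints 3.14
    }
}
```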
The only explicitly statically typed language I'm aware of that has something like blocks is Eiffel, which has inline agents, but to be fair, I haven't researched Eiffel enough to see how it handles these problems.
It seems that the more you try to apply explicit static typing to blocks, the more the complexity of your system increases. To make it work, you have to use manual and automatic casts to defeat the type system, or turn large portions of your program into templates and deal with the resulting complexity and loss of debugging power. It's no wonder that lexical closures seem to turn up mainly in languages without explicit static typing. In such languages, lexical closures add both a level of expressiveness and a level of simplicity not available otherwise. Without lexical closures, you are left forever with a small set of low-level looping and conditional constructs and have no ability to rise above that level.