Thursday, November 12, 2009

Another encoding of typeclasses in Scala

I've been thinking about different ways of doing typeclass-like things in Scala, using Scala's first-class modules (aka objects), and functions over these modules (aka classes). More generally, I'm interested in exploring whether there's anything as convenient as typeclasses that could be implemented in a dynamic language.

Haskell's typeclasses can be encoded pretty directly in Scala, using an implicit module derived from some polymorphic type parameter:
  def dot[A](a: List[A], b: List[A])(
implicit r: Ring[A]): A = ...

def sequence[M[_],A](l: List[M[A]])(
implicit a: Applicative[M]): M[List[A]] = ...
From what I understand, Haskell's implementation of typeclasses works similarly: bounded polymorphic functions receive one or more hidden dictionary (module) parameters. Except in Haskell modules are not first class objects that the programmer can access or manipulate. (Although I'm unsure how much of a restriction this is.)

Given that modules are first class in Scala, though, why don't we just pass the module in at some outer scope?
  class ApplicativeM[M[_]](a: Applicative[M]) {
def sequence[A](l: List[M[A]]): M[List[A]] = ...
// other Applicative derived functions here
}
We can then put all the Applicative-derived functions nested within the body of this class. If we then give a name to our ApplicativeM[List] instance, like ListA, we can have syntax like ListA.sequence(blah), which is almost the best you can do. One thing I like about this is we avoid having to restate the bound for each and every Applicative-derived function - all functions simply reference M, where the bound on M stated in the outer scope.

Incidentally, this sounds very similar to the encoding discussed here:
The translation from Haskell type classes to ML modules encodes type classes as signatures and instances of type classes as functors that yield structures of these signatures.
Sidebar question for ML people: In ML, can you have higher-kinded modules like Applicative, and can ML functors receive such modules as arguments? Or are you limited to accepting first-order modules like Ring?

Per-function implicit parameters is still more flexible, though, since it allows the functions derived from a typeclass to be spread out over multiple modules. The most logical place to put a function isn't necessarily in the module with the typeclass definition since the bound is often somewhat incidental. Per-function implicits scale better, too - you don't need to have a separate module for every combination of type bounds (although I doubt very many combinations occur in real programs).

I wonder if it would be possible to get around this first restriction somehow by adding a few operators to modules - perhaps union, subtraction, and renaming?

Sunday, September 13, 2009

A Scala success story: commercial usage of Scala at Capital IQ ClariFI

Update: There's some additional comments and discussion on reddit. Also, made minor edits on our company background to reflect our recent rebranding.



For a while I've been meaning to put together a post discussing my experience using Scala successfully in a large commercial product.

First, a bit of background: the company I work for, Capital IQ, delivers fundamental and quantitative research and analytics software to the investment management, banking, corporate, and academic communities. Our flagship quant product, ClariFI, supports asset managers at every stage of the quantitative investment process: building and backtesting of factors, portfolio optimization, simulation of trading strategies, performance and risk attribution, overall data management (including organizing huge amounts of time series data pulled from a diverse set of raw sources), and lots more. In addition to ClariFI, though, we have started serving up some of our analytics to the Capital IQ web application.

How did we end up using Scala? About a year and a half ago we had a legitimate business case for rewriting—or at least substantially redesigning and refactoring—the core analytics backing our portfolio attribution (PA) workflow. I was slated to be the person doing this rewriting/redesign, and as my boss Scott was describing to me how the existing ModelStation PA worked, it became clear to me that a functional language was a natural fit for the domain. I'd been pushing for a while for us to try using languages other than Java—at least for some projects that were more isolated from the rest of our codebase—and this project seemed like the perfect opportunity. After doing some additional research we settled on Scala.

The decision to write in Scala implied this would be a true rewrite and I would be totally unencumbered by the previous architecture. Rewrites aren't always a bad idea, though. Our new Scala-backed PA exceeded the functionality of the old codebase, using about a third of the code of the old Java implementation, with comparable memory and speed footprints. The new codebase was more generic (components developed to support PA are now being reused elsewhere), more modular (the core PA engine itself is now being accessed by two completely different front ends), and much more testable. I made extensive use of ScalaCheck—an absolutely huge boon that dramatically cut down on the time and effort needed to find and fix bugs. In developing the new PA, Scott and I had conversations in which I would ask him for a property to specify a piece of code I needed to write, and then I would transcribe his description, fairly directly, into a ScalaCheck property. Originally, we were thinking of porting over Scott's old hand-made test cases from the previous PA, but we didn't end up doing this because our properties gave such good coverage. Having such an extensive set of tests also gave me a lot more confidence to make later changes to tweak the design, knowing I was not breaking anything in the process.

Part of the reason the code was so testable was that close to 100% of the Scala I wrote was purely functional, exceptions being the occasional use of local state whose effects did not escape function boundaries and "benign" uses of state that preserved referential transparency, like memoizing a function. Writing stateless code, especially within the analytics layer of our product, is extremely natural and doesn't require any contortions. And stateless code is very easy to test.

Scala is an interesting language. It doesn't really impose any particular view on you of how to write software. Everything from higher-order functional code to low-level imperative programming is well-supported and moving between these paradigms is seamless. On the other hand, there's so much to the language, and the features aren't all totally orthogonal, that it sometimes feels a bit clunky to me. But it is without a doubt an exceedingly practical, powerful language. Let me highlight a couple aspects of Scala I think have been important for us:
  • Integration between Scala and Java is trivial: If you were so inclined you could write "Java in Scala" and have it compile to essentially the same classfiles that Java would compile to. You can set breakpoints in Scala, step into Scala code from Java, call any Scala method as if it were a Java method and vice versa, profile Scala code using the same tools you'd use for profiling Java code, etc. This level of integration was important for us since we were interfacing our Scala code with a large Java application.

  • Basic functional idioms: Without a doubt, there's simply a huge productivity boost that comes with having access to first-class functions, convenient syntax for annoymous functions, and the various list processing functions like map, filter, fold, zip, etc. Scala's Option type is extremely useful as well, as are for-comprehensions.

  • The type system: It's quite good. I'm still exploring its limits. In particular, though, Scala's generics plus implicit parameters enable powerful libraries like ScalaCheck and SBinary, both of which we have used extensively.

  • Optional non-strict evaluation: One of our projects involved running a whole series of memory-hungry calculations whose results were then encoded as XML and streamed across the network. We couldn't construct all the results up front as that would consume too much memory, and we didn't want to rewrite the producer and/or consumer to have to tightly coordinate with the other. By constructing the results lazily via judicious use of Scala's by-name function parameters, the producer (the calculation) and consumer (the xml writer) remained ignorant of one another, and memory usage was nearly the same as if the producer code and consumer code were explicitly interleaved. Not that this was news, but laziness really can improve modularity.
Some other things weren't as nice:
  • At the time of this first project, the only developed IDE support was the Eclipse plugin which was extremely buggy and limited in functionality. Workable, but not good. Since then the situation has improved considerably.

  • In Scala 2.7.1 (the version we used up until about 6 months ago), Scala's parameterized types did not show up as Java parameterized types. This was actually rather annoying to deal with, especially given that I made frequent use of parameterized classes and methods. It wasn't until Scala 2.7.4 that generics support was actually bug-free enough for us to use. Even still, though, Java's lack of both type aliases and type inference makes dealing with generic types extremely painful.

  • Scala's type inference, while better than nothing, is quite limited. Method parameters always require type annotations. Higher-kinded type parameters are not inferred (although I've heard that there might be some form of inference of these parameters in 2.8). The inference algorithm seems unpredictable to me... my usual strategy is to start without type annotations, try compiling, fix errors by adding annotations, repeat (of course, I sometimes annotate types for clarity anyway). To be fair, the situation here is still much better than Java, but not nearly as nice as (say) Haskell.

  • Functions aren't curried by default, and I find working with curried functions kind of cumbersome in Scala. There are also some language warts that make it pretty much impossible to program in a point-free style (namely, the somewhat arbitrary distinction between methods, defined using def, and function values which implement one of the FunctionN traits). This might not be so terrible if you didn't have to fully annotate parameter types in method definitions, but you do, and it can get kind of annoying. Again, we are still much better off than we are in Java, but it's possible to do better (for instance, Haskell).

  • The standard library is kind of uninspiring and is missing some basic stuff. It also has some warts—for example, the function zip is defined for List but not for all sequences, there's no zipWith function, no scanl function, etc. Of course, you can write these functions yourself, and we did. The collections library is also getting revamped for 2.8 and a lot of these things are getting cleaned up. There's also an interesting project, Scalaz, that we haven't gotten around to using in any production code—but it's a nicely structured library that fills in a lot of the gaps in Scala's standard lib. In general, I don't want to dwell on this issue too much since every language's standard library has warts and Scala's is still certainly much saner than Java where they overlap in functionality (sensing a theme here?).
Overall I found these shortcomings to be more annoyances than showstoppers and I would certainly recommend Scala to anyone targeting the JVM. At our company, I know we will continue to expand our Scala usage: since shipping the Scala-backed PA, we've used Scala for one other major project in production and have several others either in progress now or in the pipeline. So we're pretty much hooked.

Takeaways


Typeful programming is a huge win, but it does take some getting used to. When I started on this first Scala project, I'd used Haskell for some toy projects but otherwise didn't have much experience with statically typed functional languages. There is a certain sort of brain-rewiring that occurs as you become proficient with the type system in a language like Scala—you reach a point where types become ingrained in how you think about and structure code, and the type system becomes something you work with, rather than against. Types are also an extremely concise way of communicating and documenting designs.

Another takeaway is that (no surprise), good design takes a while and is extremely hard, in particular getting the details right. Scott and I had a vague sketch of a design after a few hours of whiteboarding, but nailing down the details took much longer. There are always a lot of small design decisions beyond what you uncover whiteboarding. In general, I've become a lot less trusting of whatever initial ideas pop into my head when working on a design, and I prefer now, where possible, to give myself time to marinate on a problem. Several times I've had the experience of realizing something, months after the fact, which had I realized several months prior would have saved me a ton of work and made the code much simpler. This is very humbling, and it's why I think doing prototyping and giving yourself time to explore the problem domain often yields greater productivity in the long run.

Finally, I'd like to scrutinize the folk wisdom that rewrites are generally a bad idea. The argument goes that old, battle-tested code has accumulated scores of bugfixes, corner-case handling, and so forth and rewrites mean throwing away all of this instantiated knowledge. Rebuilding this knowledge in the new codebase is often more time-consuming than it's worth.

The experience with rewriting the PA has led me to suspect that this argument is more applicable for code that is underspecified and difficult to test. When code is testable and well-specified, the cost of discovering bugs and handling corner cases drops substantially... to the point where they are no longer such a huge time sink. I think this myth arises because most code is not very testable—such code tends to result in bugs being discovered much later in the development process when the cost of investigating and fixing them is much higher. When considering a rewrite of some codebase that has been this costly and time-consuming to battle-harden, it's easy to assume you will have to incur the same costs when developing the new codebase. But this is only true if you keep using the same old, busted development strategies.

Rewrites aren't always good: if you are going to rewrite code using substantially the same design, using the same language and tools, just to "clean up the code", then you are just code polishing and are not really adding any value. But if you are redesigning the code with an eye for making future work much easier, then you are simply trading off some current throughput for an increase in long term productivity. And this is a completely sensible thing to do.

Tuesday, September 8, 2009

Encoding functional dependencies in Scala

Update: Nevermind, I don't think this is useful. There's no advantage that I can see to this over using plain old abstract type members. Also, nothing stops us from defining two classes with the same type parameters in the same scope, but with different type constructors for converting to the dependent types. Original post follows.



Here is a way you can encode functional dependencies in Scala. I'm not sure if this technique has been described elsewhere, but I hadn't seen it so I thought I'd write it up.

Sometimes you will have a type Foo[X,Y], in which the choice of X uniquely determines Y (or more generally, in which the choice of one or more type parameters uniquely determines other type parameters). The example given on the Haskell page for fundeps is a multiplication typeclass: if both items being multiplied are matrices, the result is also a matrix; if both are scalars, the result is a scalar, if one is a matrix and the other is a scalar, the result is a matrix, etc. The types of the two things being multiplied uniquely determines the result type.

The basic idea for encoding this in Scala is to make the dependent type paramters into concrete type members of the class, dependent on an abstract type constructor that converts the supplied type parameters to the dependent type. An example:
object Fundep {
trait I[X] {
type XToZ[Q]
type Z = XToZ[X]
def z: Z
}
class J[X] extends I[X] {
type XToZ[Q] = String
def z = "hello"
}
class K[X] extends I[X] {
type XToZ[Q] = Int
def z = 42
}
def getZ[T<:I[_]](t: T): T#Z = t.z
}
The function getZ shows how users of the type can access the types that are derived from the supplied type parameters. We can call this with getZ(new K[Blah]) and it will return an Int; call it with getZ(new J[Blah]) and we get back a String!

The nice thing about this is that clients do not need to have knowledge of the type constructors for converting the known type parameters to those that are dependent. In contrast, consider the situation if we made the type constructor for doing the conversion into a type parameter of the class, like so: trait Foo[X,XToZ[_]] { type Z = XToZ[X] }. If we do this, all clients of the class are forced to know about the type constructor for converting X to Z, which they don't really care about - they just need to know what Z is, not how it is generated from X.

Note: this only actually works in 2.8. In 2.7.4, the example above crashes the compiler.

Wednesday, July 22, 2009

Market failure in the EA spouse case

There's an interesting class of market failures that occur in the presence of both imperfect information and high switching costs. As an example, consider the infamous EA Spouse case (background on the wikipedia page). Quick synopsis: a programmer joined EA with the expectation that working conditions would be relatively sane. After starting work, it became increasingly apparent that the company was a sweatshop in perpetual crunch mode. From the original EA spouse post:

Electronic Arts offered a job, the salary was right and the benefits were good, so my SO took it. I remember that they asked him in one of the interviews: "how do you feel about working long hours?" It's just a part of the game industry -- few studios can avoid a crunch as deadlines loom, so we thought nothing of it...

Within weeks production had accelerated into a 'mild' crunch: eight hours six days a week. Not bad. Months remained until any real crunch would start, and the team was told that this "pre-crunch" was to prevent a big crunch toward the end; at this point any other need for a crunch seemed unlikely, as the project was dead on schedule... Weeks passed. Again the producers had given a termination date on this crunch that again they failed... Now, it seems, is the "real" crunch... the current mandatory hours are 9am to 10pm -- seven days a week -- with the occasional Saturday evening off for good behavior (at 6:30pm). This averages out to an eighty-five hour work week. Complaints that these once more extended hours combined with the team's existing fatigue would result in a greater number of mistakes made and an even greater amount of wasted energy were ignored...

And the kicker: for the honor of this treatment EA salaried employees receive a) no overtime; b) no compensation time! ('comp' time is the equalization of time off for overtime -- any hours spent during a crunch accrue into days off after the product has shipped); c) no additional sick or vacation leave. The time just goes away. Additionally, EA recently announced that, although in the past they have offered essentially a type of comp time in the form of a few weeks off at the end of a project, they no longer wish to do this, and employees shouldn't expect it. Further, since the production of various games is scattered, there was a concern on the part of the employees that developers would leave one crunch only to join another. EA's response was that they would attempt to minimize this, but would make no guarantees.

In discussions of this story when it became widely publicized, a very common response was to say that either the programmer "should have known" what he was getting himself into, and/or if he didn't like the working conditions he should have simply quit - no one was forcing him to work for EA. While this is technically true, it is beside the point - the employee in this case accepted the job given imperfect information about the working conditions (we can quibble about whether this was his fault, but that is not relevant). After the fact, if he wished to find a different job he would be forced to incur various switching costs. Such costs could be substantial for an individual changing jobs (searching for a job takes a lot of time and energy, may require moving expenses and so forth).

In general, for a person in a job that he might not have accepted given better information, the switching costs give the status quo greater net utility than it would have on its own. This leads the employee to remain at the job longer, until conditions deteriorate (or the employee's utility function changes) to the point where the switching costs become acceptable.

This is a market failure. Imperfect information causes the market to misdirect productive capacity; switching costs then sustain the resulting inefficiency. In the presence of better information or in the presence of zero switching costs, a company like EA would not be able to attract and retain programmers at the salary they offered. It is only when both factors combine that EA's strategy becomes effective. In fact, given the presence of these factors, it may be the case that EA's strategy actually gives them a competitive advantage over companies that are more up front about their working conditions.

There are a lot of possible responses to this sort of market failure. Do we pass laws preventing these sort of working conditions? We could, but then we restrict legitimate market activity. Suppose EA were completely upfront about their working conditions and offered increased pay as compensation. If a programmer decides the deal is worth it to him, should he really be legally prevented from accepting the job? By having such a law we are restricting a whole class of mutually beneficial transactions.

No, the ideal way to deal with this sort of market failure is to provide the market with better information. This allows legitimate, beneficial activity to continue while preventing the inefficiency that would otherwise occur when employees accept jobs under assumptions that are out of sync with reality. But how do we provide the market with better information? I'm not sure. In principle, the government can play a role here - for instance, it can force employers to provide information about working hours. But again, this is a very rigid tool - in addition to increasing the overall regulatory burden, laws like this will always lag behind the needs of the market. If a new sort of information becomes relevant to programming jobs, the law does not automatically adapt to compel the release of this new information. Involving the government means we must also consider the possibility of government failure - even if government action can in principle make the market more efficient, in practice it isn't guaranteed that it will do so given the incentives of politicians and voters and the limitations of the people involved.

I'm very interested in understanding when markets will, via endogenous processes, tend to endow their participants with sufficient information relevant to market decisions. That is, when is a market naturally induced to provide its own information, and when it is an equilibrium for market participants to withhold information? A related question is: if we factor in the cost of information acquisition, when do the costs of acquiring information exceed its value to participants? And how should information be valuated? A better understanding of these questions might lead to ideas for how to coax a market into providing participants with better information - self-provided information like this will I think tend to be more subtle, more adaptive, more specialized, and more responsive to the needs of market participants.

Sunday, June 14, 2009

Perfect strictness analysis (Part 2)

Strictness analysis fails to see through polymorphic functions whose strictness is in whole or in part determined by arguments supplied by callers. The example I gave was the function foldl, whose seed parameter can be made strict so long as the binary function used for the fold is also strict.

Can we do better? Maybe, but I can't think of a clean solution. Here, I'll sketch out an algorithm that doesn't quite work. Note that there is a runtime cost for implementing such an approach, but I believe the runtime cost can be made less than the cost of unnecessarily allocating a thunk on the heap - which requires at the very least a handful of instructions.

Before giving the algorithm, let's be clear about the goal: the goal of perfect strictness analysis is to avoid allocating a thunk for any argument that could be made strict without altering the callee's semantics. We don't care whether this determination can be done statically or whether it must wait until runtime.

In the case that it must wait until runtime, here is a sketch of an algorithm that seems like it might work, but doesn't. I'm going to loosely assume a stack-based machine, in which we push a function then its arguments onto a stack, then either call the function or create a thunk:
  • Each function stores (at runtime) an annotation indicating which arguments to it are strict, or, more generally, how the strictness of each argument is dependent on the strictness of other arguments.

  • Rather than building thunks bottom up, we build them top-down: when creating a thunk, we first push the callee onto a stack.

  • Before building a thunk for an argument to the callee, we ask the callee whether it is strict in that argument. If it is, we evaluate the argument. Note that the callee may examine its caller, recursively, to determine strictness, or it may examine other arguments that are being passed to the callee. In the case that the information is not yet known or the callee is lazy in that argument, we drop back to building a thunk for that argument.
Let's look out our definition for foldl again, and see how this would work. Recall the definition for foldl:
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f s [] = s
foldl f s (h:t) = foldl f (f s h) t
I'll assume we can perform some analysis to determine that foldl is strict in its third parameter (since we must always evaluate the third parameter to distinguish between the empty list case and the non-empty list case), lazy in its first parameter (since we won't necessarily apply that function to any arguments), and strict in its second parameter iff the first parameter is a binary function strict in both its arguments. Notice that the strictness of f actually depends not just on the type of the third argument, but on its runtime value - if the list is empty, f will not be evaluated; if it is nonempty, f will be evaluated. This is going to cause problems, as we'll see in a moment.

In the recursive call, foldl f (f s h) t, we would push foldl onto a stack. Now we are at the point where we are passing the first argument to foldl, so we would ask foldl whether it is strict in this argument. It responds 'no', so we leave f unevaluated. Next we are at the point we we are about to pass the second argument to foldl. We again query foldl, asking if it is strict in its second argument. foldl checks its first argument - except if f is unevaluated, its strictness is unknown, so we have to fall back to thunking the second argument. We're back where we started - the strictness of the function f won't become known until we reach the end of the list, so we end up with the same behavior as before. Ouch! Note that if f were already evaluated, foldl would be able to determine that the second parameter could be evaluated before being supplied.

It seems we would need another level of analysis to determine that the first parameter, f, can be made strict if the list being folded over is nonempty. In principle, I don't see why we couldn't do this sort of analysis, but this is starting to get ugly. In addition, we have the problem that when we go to supply f to foldl, the list argument hasn't been supplied yet, so we can't inspect the list to determine if it is nonempty. This might not be a fundamental problem since we obviously have a reference to the list even if we haven't formally supplied to foldl yet. But still, this is getting a lot more complex than I'd initially hoped.

Sunday, June 7, 2009

Perfect strictness analysis (Part 1)

Previously, I talked about how strict-by-default results in less modular code. Now, I want to talk about problems with lazy-by-default languages, and how these problems could be addressed by better strictness analysis. This will be in multiple parts. (Part 2)

Let's look at a simple example where laziness causes problems. Here's a definition for a left fold in Haskell:
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f s [] = s
foldl f s (h:t) = foldl f (f s h) t
If you try evaluating foldl (+) 0 [1..1000000], you'll get a stack overflow. The stack overflow occurs because foldl is lazy in the parameter s (the 'seed' parameter); each iteration of the fold creates a thunk with one more level of nesting than the current seed. When the result of the fold is finally evaluated it requires stack space proportional to the length of the list folded over. If you want to sum a list in constant space, you can either use a hand-coded loop, or you can use foldl', which is strict in its seed argument. (Update: there's another way, too, see below.)

But wait a minute - if you look at the definition for foldl, you can see that if the operator, f is strict in its arguments, then the seed could also be made strict without any change in semantics. Strictness analysis has failed us: since foldl is polymorphic, the compiler can't assume anything about the strictness of the operator and must therefore treat the seed parameter as lazy. In principle the compiler could produce specialized versions of foldl automatically and apply the usual strictness analysis to the specialized versions. However, I doubt this works as a general solution to the problem as it's possible to construct examples where this strategy leads to an explosion in code size. (Open question: are said examples really that common in practice?)

Update: foldl, as it's written in GHC, will actually do the right thing assuming you compile with strictness analysis turned on. However, the recursive definition I gave above still yields a stack overflow.

Optimizations like strictness analysis that can't be counted on to be applied in all relevant cases are leaky abstractions. And like all leaky abstractions, strictness analysis fails to form a new semantic level upon which we can build new abstractions and modes of thinking; rather, it is an ad hoc piece of machinery that must be explicitly unpacked and considered when writing code. This places additional conceptual burden on the programmer.

Contrast this with laziness itself: laziness is pervasively and predictably applied, not an optimization you hope gets triggered. Laziness lets you program in a different style than you would in a strict language.

What would it look like to have perfect strictness analysis, which worked predictably, correctly handling polymorphic functions like foldl? I'll detail a possible solution in my next post.

Update: It appears I was overly optimistic about the difficulty of this problem. Read on for the details.

Wednesday, May 27, 2009

Optional laziness doesn't quite cut it

Strict languages that support optional laziness just don't provide the same modularity and composability that you get with a language that is lazy by default. In a strict language with optional laziness, callers (direct and indirect) all may need to be aware of the laziness of any potential callee (direct or indirect). This is a devastating handicap that makes these functions much less useful and composable than they would be otherwise.

Here's an example, in Scala. Suppose we create a method, foo, lazy in its second argument:
def foo[A,B](a: A, b: => B): B = ...
If we want to call foo, no problem, just do foo(a, hugeFunction()). Scala automatically creates a thunk wrapping the hugeFunction() call, ensuring that it won't be executed before foo gets called. The problem comes when we consider a caller of foo:
def bar[A,B,C](a: A, b: B, c: C): B = {
// big long function
foo(a, b)
}
This isn't quite right - bar is strict in its second argument, but when that argument is actually used, in the call to foo, it is lazy. If foo depends on the laziness of b for its semantics or performance, we probably should adjust the signature of bar to also be lazy:
def bar[A,B,C](a: A, b: => B, c: C): B = ...
Starting to see a problem yet? What about callers of bar? These need fixing, too. In principle we might have to adjust the signatures of a long chain of callers into foo, until we get to the function that generates the instance of B that gets passed ultimately to foo. The problem is especially bad when writing generic code that can appear in several locations in the call graph. Very often these functions will not have full knowledge of who they call; the only way for them to be fully general is for them to be lazy by default (since some callee may depend on this laziness). If you program in a style where you make plentiful use of HOFs, or you make use of a lot of generic interfaces, you start to see what a PITA it can be to deal with strictness. A function that appears deep in the call graph needing to be made lazy might result in the programmer having to propagate laziness information to a huge number of dependent functions.

As an example, I recently built an experimental combinator library for defining parallel computations. I won't go into details here, but you can build a nice set of combinators all derived from just three primitives: unit, map, and join. As it turns out, these three functions form a monad, and the combinator library I was going to write was just going to be the usual combinators that can be derived given the monad interface. I figured I'd just save myself the trouble and just use implementation of the monad combinators from scalaz.

... except, the monad interface in scalaz defined unit to be a strict function (this is being fixed in an upcoming release), and my particular implementation of the monad interface needed unit to be lazy. Short of modifying scalaz, my only option was to rewrite all the functions that called into unit. Unfortunately, the unit function has a lot of dependents - the functions that can be derived given a monad have a nice, layered structure, where each layer builds on the layer below, with the bottom layer being just the three functions that define a monad. Changing the signature of unit in the interface causes a cascade of changes to all the layers above.

There are certainly some problems with laziness being the default, but I've started to think the problems are of the type that can be solved or have already been solved by better technology (I'll talk more about this in a future post). In contrast, the above problems with strict-by-default are fundamental.