<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-2267747408703654731</id><updated>2012-01-27T00:48:12.302-08:00</updated><category term='mathematics'/><category term='economics'/><category term='scala'/><category term='health'/><category term='haskell'/><category term='politics'/><category term='programming'/><title type='text'>Prettt-tty, pretty, pretty good!</title><subtitle type='html'>(Stuff worth thinking about)</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>32</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-8137870764380768166</id><published>2012-01-19T07:57:00.000-08:00</published><updated>2012-01-19T17:07:54.590-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Possible projects for Boston Haskell Hackathon: smarter evaluation strategies and refactoring combinators</title><content type='html'>&lt;p&gt;This weekend I'm going to be participating in the &lt;a href="http://www.haskell.org/haskellwiki/Hac_Boston"&gt;Boston Haskell hackathon&lt;/a&gt;. I'm very excited about it and I have a couple idea for projects to work on. If any of these sound interesting or you are thinking of something similar, I'm looking for people to collaborate with! Send me an email or come talk to me in person! I think I'm going to get there sometime in the late afternoon to early evening on Friday and I'll be around all weekend.&lt;/p&gt;&lt;h2 id="evaluation-strategies-for-haskell-that-dont-leak-space"&gt;Evaluation strategies for Haskell that don't leak space&lt;/h2&gt;&lt;p&gt;The first project is doing some research / prototyping of an alternate evaluation strategy with the same termination properties as normal order evaluation, but with much easier reasoning about space usage. For lack of a better name, I'm calling it &lt;em&gt;specializing, strictness-propagating evaluation&lt;/em&gt;. In this model, calling a function is something like two communicating coroutines. When calling a function, the callee begins evaluating its body, yielding control back to the caller when it needs its first argument, and also indicating whether that argument should be strict or lazily passed, using whatever information is available at runtime. Subsequent arguments work similarly. As a result, functions are automatically specialized as arguments are passed, and we do not construct thunks if they are going to be consumed strictly by a subsequent callee. This can be implemented efficiently using just two call stacks, and there are various optimizations to the scheme. It is intended to augment, not replace, the existing static strictness analysis and argument passing.&lt;/p&gt;&lt;p&gt;Here's an example working through this for the &lt;code&gt;if&lt;/code&gt; function, which let's assume has the following implementation:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;foldl f z l = case l of &lt;br /&gt;  [] -&amp;gt; z&lt;br /&gt;  h:t -&amp;gt; foldl t f (f z h)&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;I'm also going to assume we've done some static strictness analysis to determine that all branches evaluate &lt;code&gt;z&lt;/code&gt; and that therefore the &lt;code&gt;h:t&lt;/code&gt; branch evaluates &lt;code&gt;f&lt;/code&gt; (since all branches evaluate &lt;code&gt;z&lt;/code&gt; and in the &lt;code&gt;h:t&lt;/code&gt; case, &lt;code&gt;f&lt;/code&gt; appears at the head of an expression passed as &lt;code&gt;z&lt;/code&gt;). Suppose we call this with &lt;code&gt;foldl (+) 0 [1,2,3,4]&lt;/code&gt;.&lt;/p&gt;&lt;ol style="list-style-type: decimal"&gt;&lt;li&gt;Caller pushes &lt;code&gt;foldl&lt;/code&gt; onto the call stack. &lt;code&gt;foldl&lt;/code&gt; begins evaluating with no arguments. It gets as far as the &lt;code&gt;case l&lt;/code&gt;. It will then request &lt;code&gt;l&lt;/code&gt; strictly, since it is about to evaluated it anyway.&lt;/li&gt;&lt;li&gt;To request the argument, &lt;code&gt;foldl&lt;/code&gt; pops its currently running frame from the call stack and pushes it onto the &lt;em&gt;save stack&lt;/em&gt;. It then resumes the caller now at the top of the call stack with an argument of &lt;code&gt;strict&lt;/code&gt;.&lt;/li&gt;&lt;li&gt;The caller passes the argument as requested - if the caller were itself receiving these arguments as function parameters, it would propagate the strictness request of &lt;code&gt;foldl&lt;/code&gt; to &lt;em&gt;its&lt;/em&gt; caller.&lt;/li&gt;&lt;li&gt;To resume &lt;code&gt;foldl&lt;/code&gt;, it pops &lt;code&gt;foldl&lt;/code&gt; from the save stack and pushes it onto the call stack, giving it the (strictly evaluated) list it requested.&lt;/li&gt;&lt;li&gt;Now we hit the interesting case: inside the &lt;code&gt;h:t&lt;/code&gt; branch, we know that &lt;code&gt;z&lt;/code&gt; is strict (this is known statically). We also know that &lt;code&gt;f&lt;/code&gt; can now be evaluated, so we request this argument strictly from our caller. With &lt;code&gt;f&lt;/code&gt; now evaluated, we can propagate its stricness information. We know we will be evaluating &lt;code&gt;f z h&lt;/code&gt; - what we did not know until runtime was that &lt;code&gt;f&lt;/code&gt; was plus (let's just say it was &lt;code&gt;+&lt;/code&gt; specialized to &lt;code&gt;Int&lt;/code&gt;), and therefore static SA has no choice but to pass &lt;code&gt;f z h&lt;/code&gt; as a thunk. We now know that &lt;code&gt;f&lt;/code&gt; is strict in both its arguments, so the call to &lt;code&gt;f z h&lt;/code&gt; means we can fully evaluate &lt;code&gt;z&lt;/code&gt; (which we do by requesting &lt;code&gt;z&lt;/code&gt; strictly from our caller), &lt;code&gt;h&lt;/code&gt;, and then &lt;code&gt;f z h&lt;/code&gt;.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Each step of the iteration works similarly and &lt;code&gt;foldl&lt;/code&gt; ends up running in constant space. I'm handwaving a lot here, but in general I want an evaluation order that is totally predictable in its space usage - values are immediately forced as soon as their consuming functions are known at runtime. The consuming functions tell us if an argument will ultimately be forced so we find out sooner rather than building up enormous thunks.&lt;/p&gt;&lt;p&gt;This needs some serious whiteboarding, but assuming it is at all sensible, here's what I propose doing:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Come up with an instruction set for this evaluation model, and write a simple interpreter for it&lt;/li&gt;&lt;li&gt;Write a compiler for a toy functional language to this instruction set, including the basic static analysis needed to kickstart the dynamic analysis&lt;/li&gt;&lt;li&gt;Try writing some programs with it&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Some other interesting ideas - I wonder if there's some way to embed this evaluation model in GHC itself.&lt;/p&gt;&lt;h2 id="a-code-database-for-haskell-and-refactoring-combinators"&gt;A code database for Haskell and refactoring combinators&lt;/h2&gt;&lt;p&gt;The other project I'm interested in working on is a code database for Haskell, and a &lt;a href="http://en.wikipedia.org/wiki/Datalog"&gt;Datalog&lt;/a&gt; interpreter to go with it. Using this database and the datalog query language, I then want to implement a set of &lt;em&gt;refactoring combinators&lt;/em&gt;. A &amp;quot;refactoring&amp;quot; is simply a &lt;em&gt;compilation-preserving&lt;/em&gt; function from one code database to another. I've started tinkering with a set of combinators that individually preserve compilation and can be composed to allow arbitrary code transformations. I wrote up some ideas for that &lt;a href="http://pchiusano.blogspot.com/2011/12/future-of-programming.html"&gt;here&lt;/a&gt;:&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;... Refactoring times in this new model will go from weeks or months to hours, and writing code to transform a codebase will become a separate but critical skill, distinct from the usual act of programming. That is, programmers do not simply conceive of a refactoring (which is often quite simple to express to another programmer), then begin a tedious, manual and error-prone process of text munging to implement it. Instead, the programmer conceives of a refactoring, then conceives of a code transforming program to implement the refactoring, then applies this transformation to the code database, all in the span of a few hours.&lt;/p&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;p&gt;... First, I am not advocating for datalog syntax. I don't care about that. The key functionality enabled by datalog over and above the relational algebra is the ability to express transitive closure and mutual recursion guaranteed to terminate. Together these features enable many of the common queries we'd like to express in transforming and querying our codebases. For instance, here is a hypothetical query to find all references to a given function id, fid. Don't worry if the syntax looks alien or doesn't make sense. The key is more that this query is just a few lines of code to express, and it can be reused and built upon.&lt;/p&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;pre&gt;&lt;code&gt;-- propagate reference to containing apply&lt;br /&gt;refs(Id) :- apps(Id, fid, _). &lt;br /&gt;refs(Id) :- apps(Id, _, fid).&lt;br /&gt;refs(Id) :- refs(X), apps(Id,X,_).&lt;br /&gt;refs(Id) :- refs(X), apps(Id,_,X).&lt;br /&gt;-- any lambda whose body is or contains fid&lt;br /&gt;-- is considered to reference fid&lt;br /&gt;return(Id) :- lambdas(Id,_,Id1), refs(Id1).&lt;br /&gt;return(Id) :- lambdas(Id,_,fid).&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Much of the analysis required to implement refactorings has this sort of &amp;quot;transitive-closure&amp;quot; feel to it - you need to do something to the &amp;quot;direct&amp;quot; callers, then do some transformation for their callers as necessary, and so on.&lt;/p&gt;&lt;p&gt;Here's what I propose for this project:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Implement datalog, possibly backed by just in-memory data structures, or maybe tied to something like SQLite. Or if there's an existing free datalog interpreter and backend for it somewhere, let's see if we can use that.&lt;/li&gt;&lt;li&gt;Come up with the normalized datalog representation for the Haskell AST and type information - besides just the AST I think you'll need to know all the type information. Is there some way to use the GHC API to get the type of all expressions in the&lt;/li&gt;&lt;li&gt;Implement or steal a Haskell parser, and write code to translate this to the normalized datalog representation. As a proof of concept, take some existing Haskell project and &amp;quot;code-database-ify&amp;quot; it.&lt;/li&gt;&lt;li&gt;Come up with a good set of refactoring combinators. Implement them using datalog. As a proof of concept, use the combinators to express some nontrivial refactoring (like - make this value monadic rather than pure, and propagate the change in calling convention to all direct and indirect callers as needed - this is exactly the sort of refactoring that is trivial to describe to another Haskell programmer, and is totally mechanical, but is still done via a tedious process of text munging)&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;If all this is too much, I propose not doing this for Haskell but instead for a toy functional language with a very simple AST and type system.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-8137870764380768166?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/8137870764380768166/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=8137870764380768166' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/8137870764380768166'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/8137870764380768166'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2012/01/possible-projects-for-boston-haskell.html' title='Possible projects for Boston Haskell Hackathon: smarter evaluation strategies and refactoring combinators'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-5656318608400538034</id><published>2011-12-28T19:39:00.000-08:00</published><updated>2012-01-02T13:30:42.582-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>The future of programming</title><content type='html'>&lt;p&gt;What will programming look like 10 or even 20 years from now? With another new year almost here, now is the time to wax philosophical about the future of our industry. We are on the cusp of a number of major transformations in programming that will make 2011 programming technology, techniques, and ideas seem primitive by comparison. The transformations will occur in several key areas: &lt;em&gt;tooling and infrastructure&lt;/em&gt;, &lt;em&gt;languages and type systems&lt;/em&gt;, and &lt;em&gt;runtime systems&lt;/em&gt;.&lt;/p&gt;&lt;p&gt;Before I get started, let me preface this all with the disclaimer that this should be taken with a grain of salt, in the spirit of being provocative, and these are not really predictions &lt;em&gt;per se&lt;/em&gt;. There are lots of reasons why inferior technologies might continue to dominate the industry longer than expected. I won't talk much about that here, although it's certainly interesting. No, instead I want to give &lt;em&gt;my vision&lt;/em&gt; of the future of programming. If there were no accumulated network effects attached to all the mediocre technologies currently in use for programming, if the programming world were given a clean slate and the directive to invent the future, what would I want to result?&lt;/p&gt;&lt;p&gt;An overarching theme to all these transformations is the move away from &lt;em&gt;incidental structure&lt;/em&gt; and its close cousin &lt;em&gt;incidental complexity&lt;/em&gt;. Manifested in various forms, these factors are major barriers to the building programs of greater complexity, and many of the major transformations will be to enable lifting of these barriers. The resulting programming world will look much different--orders of magnitude more code reuse, ease of deployment, efficiency, and much more.&lt;/p&gt;&lt;h4 id="where-we-begin"&gt;Where we begin&lt;/h4&gt;&lt;p&gt;The first major change is the move away from storing programs as text, split across &amp;quot;files&amp;quot;. You can be forgiven for thinking of this as a minor thing. Who cares how programs are stored or represented, right? In fact, this major source of incidental complexity has spillover effects all the way down the programming toolchain, starting with the pollution of programmers mental thought processes, continuing to development environments (which I'll discuss next), and then to build systems and deployment. Like many of the things I discuss here, the full extent of lost productivity due to this incidental complexity is below most programmers radars. Programmers have unconsciously accepted this lost productivity, to the point that many now rationalize their dependence on files and text for programming.&lt;/p&gt;&lt;p&gt;But what about &lt;a href="http://en.wikipedia.org/wiki/Intentional_programming"&gt;intentional programming&lt;/a&gt;, you ask? Or Smalltalk development environments? The idea of moving away from programs as text spread across files is not exactly new, but the devil is in the details. Previous efforts to rid ourselves of this incidental structure have failed in part because they merely substituted one arbitrary structure for another. For instance, intentional programming advocated the good idea of separating code presentation from its storage, substituting instead another format like XML with only slightly less incidental structure. Any advantage conferred by the new format and structure was not exploited to significant benefit, and there was a huge, very real cost in terms of lost network effects from all the text-based tools that could no longer be leveraged. Likewise for Smalltalk development environments. Also, somewhat ironically for Smalltalk, being an OO language, its fundamental premise includes one of the most arbitrary choices of all, in some sense the ultimate generator of incidental complexity, namely, &lt;em&gt;the decision of which class should implement a particular method.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;There is a real barrier to entry here for any new technology. It doesn't need to just be better than the antiquated status quo; it needs to be &lt;em&gt;significantly so&lt;/em&gt; to overcome the network effects and switching cost advantage that status quo technologies inherently posses. In my opinion, none of the proposed text-in-file alternatives so far have come close to overcoming these handicaps.&lt;/p&gt;&lt;p&gt;But this is nothing fundamental. We should not conclude from these failed experiments that the idea is unsound. It's merely been poorly executed. The first major shift in getting this right is not storing programs as text at all, not even some normalized textual form like XML, and not some normalized binary form which nonetheless contains a huge amount of incidental structure. Likewise, get rid of files. Instead, a codebase is stored relationally, as a database, with some version of &lt;a href="http://en.wikipedia.org/wiki/Datalog"&gt;datalog&lt;/a&gt; being used for querying and update. Careful thought will be given to this query-update language so that many of the large-scale refactorings that are currently difficult or impossible can be fully automated, or guided automated, with just a few lines of this codebase transformation code. As a simple example, we eliminate the notion of where a function or datatype &amp;quot;is located&amp;quot;. Instead, each function gets a unique id (perhaps content addressed, to enable a simple, automated form of code deduplication), and a separate &lt;code&gt;names&lt;/code&gt; table maps these ids to names for purposes of rendering the code in some fashion to a human (and in principle, there is really no reason why there can't be multiple &lt;code&gt;names&lt;/code&gt; tables, with different programmers choosing different names for the same operation). This representation trivially enables the &amp;quot;renaming&amp;quot; refactoring in just a single table cell update.&lt;/p&gt;&lt;p&gt;Let me sketch out a few additional thoughts on how this could work. First, I am not advocating for datalog &lt;em&gt;syntax&lt;/em&gt;. I don't care about that. The key functionality enabled by datalog over and above the relational algebra is the ability to express transitive closure and mutual recursion guaranteed to terminate. Together these features enable many of the common queries we'd like to express in transforming and querying our codebases. For instance, here is a hypothetical query to find all references to a given function id, &lt;code&gt;fid&lt;/code&gt;. Don't worry if the syntax looks alien or doesn't make sense. The key is more that this query is just a few lines of code to express, and it can be reused and built upon.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;-- propagate reference to containing apply&lt;br /&gt;refs(Id) :- apps(Id, fid, _). &lt;br /&gt;refs(Id) :- apps(Id, _, fid).&lt;br /&gt;refs(Id) :- refs(X), apps(Id,X,_).&lt;br /&gt;refs(Id) :- refs(X), apps(Id,_,X).&lt;br /&gt;-- any lambda whose body is or contains fid&lt;br /&gt;-- is considered to reference fid&lt;br /&gt;return(Id) :- lambdas(Id,_,Id1), refs(Id1).&lt;br /&gt;return(Id) :- lambdas(Id,_,fid).&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Current IDEs, with their enormous development staff and reams of special purpose code, support a very limited subset of code transformation/querying operations, poorly, slowly, and with huge overhead. As a result, there is an entire subculture of programmers (myself included) who have largely rejected IDEs in favor of minimal, less feature-rich but more dependable text editors. Future development environments will take the refactorings now supported by the most advanced IDEs as the most insignificant starting point, and build from there.&lt;/p&gt;&lt;p&gt;Large scale refactorings are the inevitable consequence of achieving the reuse needed to build programs of significant complexity that don't die of heat death. Refactoring times in this new model will go from weeks or months to hours, and writing code to transform a codebase will become a separate but critical skill, distinct from the usual act of programming. That is, programmers do not simply conceive of a refactoring (which is often quite simple to express to another programmer), then begin a tedious, manual and error-prone process of text munging to implement it. Instead, the programmer conceives of a refactoring, then conceives of a code transforming program to implement the refactoring, then applies this transformation to the code database, all in the span of a few hours.&lt;/p&gt;&lt;p&gt;The code as database concept has spillover simplification effects in other areas of tooling, in particular version control. The problem of handling merges of ordered sequences of characters spread across files and directories with yet more arbitrary structure is extremely difficult, resulting in a huge amount of complexity in the area of version control software. The difficulties have led many to settle for what are arguably inferior VCS models (Git, Mercurial), where changesets form a total order and cherrypicking &lt;em&gt;just doesn't work&lt;/em&gt;. In the future, with code databases, we'll see a resurrection of Darcs' patch theory, only this time, it will work &lt;em&gt;exactly as expected&lt;/em&gt;, and will be trivial to implement.&lt;/p&gt;&lt;p&gt;Related to this notion of code databases, we get much more fine-grained dependency management. Dependency management is another serious problem for software development, a huge hindrance to code reuse, and once again, something that is simply below many programmer radars. A library will often introduce some new interface or abstraction and give several instances of that abstraction for particular concrete types. The library now &lt;em&gt;depends on&lt;/em&gt; these concrete types being available for compilation. Including this library now means pulling in all these dependencies, even if you just require the one instance. The overhead of doing this in conjunction with the existing complexity of builds (again partially caused by the programs-as-text-in-files paradigm) means code is reused much less than would be possible or easy otherwise.&lt;/p&gt;&lt;p&gt;In the future, we'll be able to conditionally specify code, and not using an ad hoc, 1970s technology like the C preprocessor. The datalog query and update language will factor in here, and allow us to express things like: if a concrete type &lt;code&gt;X&lt;/code&gt; is available in the codebase, define some set of functions and new datatypes. Likewise, dependencies will in general not be on &amp;quot;a library&amp;quot; or &amp;quot;a module&amp;quot;. A function will depend only on the set of functions and types it references, and a codebase will be a much more fluid thing. We can extract any connected component of functions and datatypes from a codebase, and this is trivial, automated transformation supported by the query language. There are some unknowns here, all solvable, around versioning, and they are related to how we reconceptualize version control as a partial order of changesets, with no extraneous dependencies due to the representation of programs as ordered sequences of characters.&lt;/p&gt;&lt;h4 id="code-editing-ides-and-type-systems"&gt;Code editing, IDEs, and type systems&lt;/h4&gt;&lt;p&gt;Code editing will be done in structural editors, which will look nothing like the existing batch of IDEs that are little more than glorified text editors (and they are actually rather poor text editors). In a structural editor, the programmer will construct expressions which may have holes in them not yet filled with terms. Importantly, these structural editors will be type-directed, so for any given hole the programmer can be presented with set of values matching the expected type, ordered in some sensible way. The editor will perform local program search to enable autocompleting of multiple expressions. If you've ever seen someone live-code Agda, you'll know how powerful and productive this idea could be. Yeah, the actual interface for programming Agda is still kind of 1970s (a custom Emacs mode), but the idea of type-directed editors is powerful. It makes it clear that types are effective not just at preventing many software errors, but also in &lt;em&gt;guiding development&lt;/em&gt;. They are a powerful tool to augment our puny programming brains.&lt;/p&gt;&lt;p&gt;Along these lines, the rise of type-directed development in languages with real, state of the art type systems will mark the beginning of the end for dynamically typed (aka, single-typed) languages. Dynamic typing will come to be perceived as a quaint, bizarre evolutionary dead-end in the history of programming. This is already widely accepted in some programming circles, where the general consensus is that most dynamic typing advocates are not familiar with type-directed development in a language with a real type system and are basically unaware of the state of the art in type systems and programming language theory. With some additional developments in type systems we'll see any last of any advantages to dynamic typing completely evaporate.&lt;/p&gt;&lt;p&gt;A related transition is that types will become even more critical and type systems will grow features to better handle data normalization, the lack of which is a major source of incidental structure and complexity. A big problem in large codebases is data normalization. Lack of data representation normalization is responsible for the significant amounts of plumbing code involved in aligning two pieces of code so they can talk to each other. Most of the mainstream programming world isn't really aware of this issue because code reuse in most imperative codebases is extremely limited (because side-effects don't compose well).&lt;/p&gt;&lt;p&gt;In the functional programming world, there is insane amounts of code reuse happening, but along with this comes a fair bit of plumbing. As a small example, consider a function, &lt;code&gt;f&lt;/code&gt;, of type &lt;code&gt;X -&amp;gt; Y -&amp;gt; Int&lt;/code&gt;, and a value, &lt;code&gt;xs&lt;/code&gt; of type &lt;code&gt;[(Y, X)]&lt;/code&gt; and suppose you want to apply &lt;code&gt;f&lt;/code&gt; to the list. An experienced functional programmer will bust out &lt;code&gt;map (uncurry f . swap) xs&lt;/code&gt; in about 3 seconds and not think twice about it. But this code is total boilerplate, there to convince the compiler this is what you really want to do, and is noise when reading the code. Yes, you are acheiving reuse (an imperative programmer would still be writing out a &lt;code&gt;for&lt;/code&gt; loop for the millionth time instead of reusing higher-order functions) but it should be cleaner. If this code needs to exist (and I'm not sure it even does, see below), I'd rather get to this point in the editing process and then tell my editor to map &lt;code&gt;f&lt;/code&gt; over &lt;code&gt;xs&lt;/code&gt;. The editor will search for a program to make the types align, show me the program for confirmation if I request it, and then &lt;em&gt;not&lt;/em&gt; show this subexpression in the main editor view, perhaps just showing &lt;code&gt;map f* xs&lt;/code&gt;, where the &lt;code&gt;*&lt;/code&gt; can be clicked and expanded to see the full program.&lt;/p&gt;&lt;p&gt;Even better, perhaps we can get away with much less plumbing in the first place. Plumbing can be eliminated from code in two general ways--the first, which I've just discussed, is to have plumbing code written for you automatically, then hidden unless requested. The second is to have more normalized types and type systems so that there &lt;em&gt;is no incidental structure for code to broker between in the first place&lt;/em&gt;. To support this we'll see things like row types, unordered tuples, type sets, and so on, and any related type system machinery needed to make all this feasible &lt;em&gt;and convenient&lt;/em&gt;. An idea I'm interested is explicit separation between the model underlying a type (which could be something very normalized, with no incidental structure), and views of that type, which may contain some additional structure. Functions will generally be written to operate over the model, from which any number of views can be reconstructed post-transformation. The compiler or runtime is generally smart about choosing runtime representations to avoid unnecessary round trip conversions between different representations.&lt;/p&gt;&lt;h4 id="language-runtimes"&gt;Language runtimes&lt;/h4&gt;&lt;p&gt;Non-strict languages will come to dominate the programming world, due to the increase in code reuse and modularity that comes with pervasive non-strictness. As I've argued before, &lt;a href="http://pchiusano.blogspot.com/2009/05/optional-laziness-doesnt-quite-cut-it.html"&gt;optional laziness doesn't cut it&lt;/a&gt;. As with other issues I've mentioned, the problems with strict as default aren't apparent to most programmers, even those who view themselves as well-versed in functional programming. The problems with strictness only become fully apparent as you get much further along in the development of the FP style, in particular, after the discovery of combinator libraries and the further abstractions that develop to remove duplication across these libraries. This has been a source of tension in the Scalaz library, which supports functional programming in Scala, a strict-by-default language, and also in our Scala codebase at work.&lt;/p&gt;&lt;p&gt;The only real problem with laziness has to do with reasoning about space usage performance, and evaluation stack usage. These problems get a lot of play among people who enjoy rationalizing their ignorance of Haskell and FP in general, but the truth is some additional research and smarter evaluation strategies can address the problems. There's nothing fundamental here suggesting we should throw up our hands and resort to strict evaluation.&lt;/p&gt;&lt;p&gt;The usual normal order evaluation is guaranteed to terminate if any is, but reasoning about is space usage in this evaluation order is problematic. We can do much better. Besides static strictness analysis, which only covers a fraction of the cases we'd like, we can conceive of additional evaluation orders which terminate for the same set of programs as normal-order evaluation, and which can propagate additional strictness information. The one I am particularly interested in I refer to as &lt;em&gt;specializing, strictness-propagating evaluation&lt;/em&gt;. I'll elaborate on this in another post, but in this evaluation model, calling a function is something like two communicating coroutines. When calling a function, the callee begins evaluating its body, yielding control back to the caller when it needs its first argument, and also indicating whether that argument should be strict or lazily passed, using whatever information is available at runtime. Subsequent arguments work similarly. As a result, functions are automatically specialized as arguments are passed, and we do not construct thunks if they are going to be consumed strictly by a subsequent callee. This can be implemented efficiently using just two call stacks, and there are various optimizations to the scheme. It is intended to augment, not replace, the existing static strictness analysis and argument passing.&lt;/p&gt;&lt;p&gt;In general, the goal of new evaluation strategies should not be efficiency (though that is a nice goal as well), but simple reasoning about space and evaluation stack usage, so that these things do not depend as they currently do on incidental details of how a function is factored or what the compiler chooses to inline (being forced to reason about these things in Haskell breaks the black box abstraction of functions that FP and higher-order programming depend on).&lt;/p&gt;&lt;p&gt;Language runtimes and VMs will grow to support these new evaluation strategies, and the current crop of &amp;quot;general-purpose&amp;quot; VMs that are actually quite specialized for strict, imperative languages (the JVM, the CLR) will likely die off.&lt;/p&gt;&lt;h4 id="code-distribution-and-the-future-of-the-web"&gt;Code distribution and the future of the web&lt;/h4&gt;&lt;p&gt;What of the web, javascript, html 5 and beyond? Are we going to keep hobbling along with these technologies indefinitely? I don't think it's too controversial to say that writing applications of any significant complexity as a &lt;em&gt;web application&lt;/em&gt; involves far more work than would be required with access to a truly powerful client-side language. Again, I don't think many web programmers realize just how bad the situation is. In comparison to mainstream languages like Java, C#, or Python, Javascript isn't so bad; in some ways it's even a better language. But compared to a programming language with a good type system and real support for FP like Haskell, Scala, and whatever the next generation of languages can bring, Javascript is quite sad.&lt;/p&gt;&lt;p&gt;Of course, proponents of existing web technologies are always finding ways to rationalize the status quo, pointing out how much better things are today than they used to be. That's maybe true, but why settle?&lt;/p&gt;&lt;p&gt;I envision a future where Javascript dies off, as do browser-specific plugins like Flash, and instead we'll see client code written in arbitrary, compiled languages, using something like &lt;a href="http://code.google.com/p/nativeclient/"&gt;NaCl&lt;/a&gt;. This will be combined with a signed code caching and dependency tracking mechanism, so that for instance, you can distribute an application that downloads the entire Java virtual machine and other dependencies, but only if they aren't already cached on your local machine (and the signatures match, of course). This changes how we think about software. Software won't be something you &amp;quot;download onto our computer and then run&amp;quot;. Instead, software exists &amp;quot;out there&amp;quot;, and you run it. As an implementation detail of running it, it may choose to download some additional code it needs to do its work.&lt;/p&gt;&lt;p&gt;This will end the monopoly that html+javascript has on client-side web applications and largely eliminate the switching costs of moving to new client-side technologies. A few lower-level protocols and standards will be enough to tie everything together and maintain the good parts of the web today (I'll say more about this next), but nothing so high-level as specifying the client side language for interaction (Javascript, or Dart, or whatever) or the client-side language for layout and display (html + CSS). In the future, client-side code is written in whatever language, compiled to native code if desired.&lt;/p&gt;&lt;p&gt;In place or in addition to native client support, another option would be some sort of in-browser low-level VM designed by people who know what they're doing, who are fluent in the state of the art in programming languages and runtimes. In other words, not the people who designed Dart, Go, Ceylon, or any of the other recent languages that are 30 years or more behind state of the art. We need something that actually supports languages of the future, not something that rearranges the deck chairs on the titanic sinking ship that form the current batch of mainstream languages.&lt;/p&gt;&lt;p&gt;What of the suggestion that we simply use Javascript as the &amp;quot;assembly language&amp;quot; of the web, use it as a compilation target, and avoid programming in it directly? This obviously is not ideal. Compiling a real programming language like Haskell or whatever is next through Javascript or Dart is obviously not how anyone would not how design things today given a clean slate. Even if this were possible to do well without bloating client code size beyond what's acceptable, there is the inevitable efficiency hit of compiling to a language whose performance is as bad as Javascript. When you think about it, it makes no real sense to be reverting to 1/10th or 1/100th the speed of what native code or a well-JIT'd VM could provide, &lt;em&gt;purely for ease of deployment&lt;/em&gt;. We don't need to be making this tradeoff, and with the rise of something like NaCl and/or a real browser VM, we won't have to.&lt;/p&gt;&lt;p&gt;There's one wrinkle. There are tremendous network effects on the web. This is part of its power, and its usefulness, and we don't want to lose it. We do still need the moral equivalent of urls and hyperlinking but one thing we don't necessarily need is the DOM. What will take its place? Don't we need the DOM to enable mashups and services like search engines? Actually, no. DOM munging and traversal is not the only way programs could obtain information from applications on the web, and when you think about it, it's actually rather primitive. Why screen scrape when you can call a real API? This transformation is already sort of happening; most modern web applications expose APIs in the form of REST+JSON. It just needs to be taken a little further.&lt;/p&gt;&lt;p&gt;What would this look like? Well, we would need a standard type system for the web. By this I mean a standard way for the applications that live at a url to expose a module of types and functions they support. The underlying type system would be expressive enough to encode richer data than what JSON currently provides (which besides being dynamically typed, cannot even represent sum types effectively) and will support for algebraic data types and some of the type system features alluded to earlier. With this in place, standards will arise for certain function signatures, with standard meanings attached to them. So, for instance, rather than the googlebot crawling the DOM looking for links, it invokes the &lt;code&gt;getAllLinks&lt;/code&gt; function of the application at the given url, which returns an actual list of url values. &lt;code&gt;getAllLinks&lt;/code&gt; is some separate standard, and new ones can arise on an ad hoc basis, as we grow new modes of interaction with web sites. There will be certain near-universal standards (like &lt;code&gt;getAllLinks&lt;/code&gt;) and more specialized ones, specific to the web application in question (for instance, Facebook exports certain functions and datatypes that are specific to Facebook, which are unlikely to be implemented by other sites, though this is certainly not a requirement).&lt;/p&gt;&lt;p&gt;Already, we can see this happening somewhat: there are various ad hoc APIs and mechanisms for basically controlling how web crawlers should interpret the DOM.&lt;/p&gt;&lt;p&gt;With a standard way of interacting with web sites programmatically, there's no longer any need for the DOM and we can see a proliferation of innovation of different display and layout technologies.&lt;/p&gt;&lt;h4 id="closing-thoughts-the-rise-of-functional-programming"&gt;Closing thoughts: the rise of functional programming&lt;/h4&gt;&lt;p&gt;I've hinted at this throughout: functional programming will &lt;em&gt;absolutely win&lt;/em&gt;. Not everyone shares my views, but many of the improvements I've talked about start with the assumption that we are working in a functional, post-imperative world. FP provides a dramatic increase in productivity due to massive increases in code reuse and the ease of reasoning about functional code (not to mention ease of parallelization). The industry is slowly starting to understand this, but it doesn't really matter if many programmers are still for whatever reason resistant to learning FP (which is unfortunately true). Eventually, the selection pressures of a competitive market will weed out less productive and effective techniques, and this largely means the death of imperative programming as we know it, except in certain niche areas. Regardless of initial biases or attitudes, programmers and companies who wish to stay competitive will be forced to employ FP.&lt;/p&gt;&lt;p&gt;As for why FP hasn't gained more prominence already, well, perhaps I'll write more about that in another post. What I will say is FP is currently reaching a tipping point enabling its wider adoption and relevance. There are several factors in play: functional language compiler technology is advanced enough, computers are fast enough, and most importantly, the FP community is rapidly discovering all the necessary techniques to organize large programs that preserve referential transparency. The Haskell community has mostly led this charge, Haskell being the only widely-used language that, due to its non-strict evaluation, had no choice but to fully commit to preserving referential transparency throughout. Ten years ago, admittedly, there was a good chance that expressing some programs purely functionally involved what was basically new research. Today, many of the required techniques are known, and expressing even the most imperative-seeming program functionally is possible, and for an experienced functional programmer, pretty effortless. There are still interesting open questions of how to express certain programs, but these are diminishing rapidly.&lt;/p&gt;&lt;p&gt;That said, I understand why many people claim FP is too hard, or it's unnatural or awkward, etc. Like many worthwhile subjects, attaining fluency in FP is difficult and requires dedication. Once this level of fluency is reached, though, expressing functional programs is quite natural and effortless (of course, software design is still hard, but finding functional designs becomes easy with practice). For people looking in, the techniques of FP seem opaque and unnecessarily difficult. For people on the inside, there's nothing difficult or complex about it and the benefits are enormous. For those who would criticise FP, I think &lt;a href="http://www.leavesofcode.com/2011/06/we-need-programming-language-for-rest.html?showComment=1308044152245#c7256366655841365329"&gt;a little humility is in order&lt;/a&gt;. To draw an analogy, no one without mathematical background would feel equipped to dismiss or criticise an entire branch of mathematics (&amp;quot;real analysis is a stupid idea&amp;quot;), and yet programmers with barely a cursory understanding of FP regularly (and loudly) criticise it.&lt;/p&gt;&lt;p&gt;I see why some people are frustrated with some of the existing resources for learning the subject. But it's wrong to dismiss the subject on that basis, or on the basis of personalities of people (myself included!) who feel that FP is more productive; objectively, either FP is worth knowing because of the productivity and other benefits it provides, or it isn't. For my part, I am &lt;a href="http://www.scala-lang.org/node/10448"&gt;co-authoring a book on FP&lt;/a&gt; that I hope is helpful to some who are interested in learning the subject. With the solidification of FP techniques I expect to see more and more resources like this, to the point that fluency and the huge resulting benefits are something that any motivated programmer can work toward, without encountering some of the discouraging hurdles to learning FP that are present today.&lt;/p&gt;&lt;p&gt;Will the future of programming look anything like what I've laid out here? Maybe, maybe not. But I certainly hope it will.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-5656318608400538034?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/5656318608400538034/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=5656318608400538034' title='48 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/5656318608400538034'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/5656318608400538034'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2011/12/future-of-programming.html' title='The future of programming'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>48</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-8708740969094817999</id><published>2011-12-27T19:39:00.000-08:00</published><updated>2011-12-29T10:58:37.801-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Programmatic translation to iteratees from pull-based code</title><content type='html'>&lt;p&gt;When working with iteratees, one notices a sort of dual between producer and consumer--for any given stream processing task (which pairs a producer of values with some consumer of these values), operations can often be expressed on either the producer side or the consumer side. For instance, if we want to square each value in a stream of integers, we can map over the stream (the producer), or we can map (contravariantly) over the values as they enter the stream's consumer (the iteratee). Here is an intriguing idea: what if we could programmatically translate stream processing tasks from one mode to the other? That is, write pull-based code, but translate to push-based for purposes of execution. It turns out this is actually possible and I'll show code in this post for doing the conversion.&lt;/p&gt;&lt;p&gt;Why is this desirable? Push-based APIs like iteratees are generally preferable for execution--data is produced and sent to its consumers, then any resources associated with that data can be freed. This model is simple and easy to reason about in terms of memory and resource usage. Pull-based execution is more problematic: resource lifetimes (and therefore space usage) are not deterministic since at any point in time, a function may choose to request &amp;quot;historical&amp;quot; data from the producer. The consequences of this have plagued some FRP implementations (though there are ad hoc workarounds that involve limiting access to history in various ways or only exposing certain &amp;quot;safe&amp;quot; combinators rather than the full pull-based API that forms the underlying model) and led many to reject lazy IO in favor of iteratees.&lt;/p&gt;&lt;p&gt;While iteratees address these problems for streaming IO, programming with them directly requires inverting your thinking, which is arguably somewhat unnatural. Likewise for other push-oriented APIs like arrows, which have used as the basis for FRP implementations that aren't as susceptible to space leaks. Honestly, I'm not sure what I think about the claims that push-based APIs are less &amp;quot;natural&amp;quot;. What's most important is that whatever API you use &lt;em&gt;composes&lt;/em&gt;, avoids code duplication, and gives you some reasoning tools. With those ingredients, I suspect you can get comfortable with any API. (I remember when I first started writing iteratee code, which we have a lot of at work, my brain nearly exploded when writing even the simplest of iteratees and HOFs for composing them... but after some practice it came much more naturally.)&lt;/p&gt;&lt;p&gt;Moving on, the key to converting from pull-based code to push-based code is to represent streams of data as a &lt;a href="http://www.haskell.org/haskellwiki/ListT_done_right"&gt;ListT&lt;/a&gt;, &lt;em&gt;universally quantified over the underlying monad&lt;/em&gt;. I'll explain this in a minute. First, a &lt;code&gt;ListT&lt;/code&gt; is just a list that yields a monadic effect with each &lt;code&gt;uncons&lt;/code&gt;. Here is a &lt;code&gt;ListT&lt;/code&gt; implementation:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;-- Skip nodes let us do filtering without lookahead&lt;br /&gt;-- uncons is then a simple loop that removes any Skip nodes&lt;br /&gt;data Step a s = Empty | Yield a s | Skip s&lt;br /&gt;&lt;br /&gt;newtype ListT f a = &lt;br /&gt;  ListT { runListT :: f (Step a (ListT f a)) }&lt;br /&gt;&lt;br /&gt;instance Monad f =&amp;gt; Monad (ListT f) where&lt;br /&gt;  return a = ListT (return (Yield a empty))&lt;br /&gt;&lt;br /&gt;  (ListT s) &amp;gt;&amp;gt;= f = ListT $&lt;br /&gt;    s &amp;gt;&amp;gt;= \step -&amp;gt; case step of  &lt;br /&gt;      Yield h t -&amp;gt; return $ &lt;br /&gt;        (Skip $ f h `mplus` (t &amp;gt;&amp;gt;= f))&lt;br /&gt;      Skip t -&amp;gt; return $ Skip (t &amp;gt;&amp;gt;= f)&lt;br /&gt;      Empty -&amp;gt; return Empty&lt;br /&gt;&lt;br /&gt;instance (Monad f) =&amp;gt; MonadPlus (ListT f) where&lt;br /&gt;  mzero = ListT (return Empty)&lt;br /&gt;&lt;br /&gt;  (ListT s) `mplus` b = ListT $ &lt;br /&gt;    s &amp;gt;&amp;gt;= \step -&amp;gt; case step of  &lt;br /&gt;      Yield h t -&amp;gt; return $ Yield h (t `mplus` b)&lt;br /&gt;      Skip t -&amp;gt; return $ Skip (t `mplus` b)&lt;br /&gt;      Empty -&amp;gt; return $ Skip b&lt;br /&gt;&lt;br /&gt;instance MonadTrans ListT where&lt;br /&gt;  lift fa = ListT $ liftM (\a -&amp;gt; Yield a empty) fa&lt;br /&gt;&lt;br /&gt;empty :: Monad f =&amp;gt; ListT f a&lt;br /&gt;empty = ListT (return Empty)&lt;br /&gt;&lt;br /&gt;cons :: Monad f =&amp;gt; a -&amp;gt; ListT f a -&amp;gt; ListT f a&lt;br /&gt;cons h t = ListT . return $ Yield h t&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now, consider a function of type &lt;code&gt;forall f . Monad f =&amp;gt; ListT f a -&amp;gt; ListT f a&lt;/code&gt;. Parametricity guarantees the function cannot assume anything about &lt;code&gt;f&lt;/code&gt; other than that it's a monad. Substitute the identity monad for &lt;code&gt;f&lt;/code&gt; and we can see how any function &lt;code&gt;[a] -&amp;gt; [b]&lt;/code&gt; could be written as a &lt;code&gt;forall f . Monad f =&amp;gt; ListT f a -&amp;gt; ListT f b&lt;/code&gt;. In other words, this API is just as expressive as the usual lazy IO style pull-based API.&lt;/p&gt;&lt;p&gt;The trick is that the caller of such a function can substitute another monad in for &lt;code&gt;f&lt;/code&gt;, namely the reader monad. This lets us &lt;em&gt;push values into&lt;/em&gt; the &lt;code&gt;ListT&lt;/code&gt;.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;data Input a = Done | Await | Element a &lt;br /&gt;&lt;br /&gt;prompts :: ListT (Reader (Input a)) a &lt;br /&gt;prompts = ListT . reader $ \input -&amp;gt; case input of&lt;br /&gt;  Done -&amp;gt; Empty&lt;br /&gt;  Await -&amp;gt; Skip prompts&lt;br /&gt;  Element a -&amp;gt; Yield a prompts&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;code&gt;prompts&lt;/code&gt; is a potentially infinite stream of &lt;code&gt;a&lt;/code&gt;, but with each step, we obtain a function from &lt;code&gt;Input a -&amp;gt; ListT (Reader (Input a)&lt;/code&gt;. On the one hand, we can feed this to a &lt;code&gt;forall f . Monad f =&amp;gt; ListT f a -&amp;gt; ListT f b&lt;/code&gt; and from its perspective it is pulling values from the stream. On the other hand, at each step we get to push a value &lt;em&gt;into&lt;/em&gt; the stream (or signal termination, or whatever &amp;quot;instruction set&amp;quot; we wish to support).&lt;/p&gt;&lt;p&gt;We need one more ingredient, a slightly-reformulated version of iteratees. A well-behaved iteratee will always yield a result (rather than a &lt;code&gt;Cont&lt;/code&gt;) when fed &lt;code&gt;EOF&lt;/code&gt;. We can make this more explicit in the type:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;type IsEOF = Bool&lt;br /&gt;&lt;br /&gt;data Moore a b = &lt;br /&gt;    Feed b (Input a -&amp;gt; Moore a b) &lt;br /&gt;  | Stop b IsEOF&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;I've renamed this &lt;code&gt;Moore&lt;/code&gt; since it is essentially just a &lt;a href="http://en.wikipedia.org/wiki/Moore_machine"&gt;Moore machine&lt;/a&gt; supporting early termination (the &lt;code&gt;Stop&lt;/code&gt; case just encodes the fact that the state transition function always leads back to the same state from that point). With iteratees, it is really just a convention that intermediate &lt;code&gt;b&lt;/code&gt; values are not inspected (and hence not computed in a lazy language) until either &lt;code&gt;EOF&lt;/code&gt; is hit or the iteratee signals termination early.&lt;/p&gt;&lt;p&gt;We can now use this to perform the inversion of control programmatically. Note the signature for invert!&lt;/p&gt;&lt;pre&gt;&lt;code&gt;isDone :: Input a -&amp;gt; Bool&lt;br /&gt;isDone Done = True &lt;br /&gt;isDone _ = False&lt;br /&gt;&lt;br /&gt;invert :: (forall f . Monad f =&amp;gt; ListT f a -&amp;gt; ListT f b) &lt;br /&gt;       -&amp;gt; Moore a (Maybe b)&lt;br /&gt;invert f = go Nothing (f prompts) where&lt;br /&gt;  go cur res = Feed cur $ \a -&amp;gt; &lt;br /&gt;    case runReader (runListT res) a of&lt;br /&gt;      Empty -&amp;gt; Stop cur (isDone a)&lt;br /&gt;      Yield h t -&amp;gt; go (Just h) t&lt;br /&gt;      Skip s -&amp;gt; go cur s&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;We can write very similar code for monadic iteratees:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;data MooreT f a b = &lt;br /&gt;    FeedT b (Input a -&amp;gt; f (MooreT f a b)) &lt;br /&gt;  | StopT b IsEOF&lt;br /&gt;&lt;br /&gt;invertT :: Monad f &lt;br /&gt;        =&amp;gt; (forall t . (MonadTrans t, Monad m)&lt;br /&gt;          =&amp;gt; ListT (t f) a -&amp;gt; ListT (t f) b) &lt;br /&gt;        -&amp;gt; MooreT f a (Maybe b) &lt;br /&gt;invertT = ...&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Here is &lt;a href="https://gist.github.com/1525429"&gt;the full code&lt;/a&gt;. I haven't played with it much, but isn't it fascinating that this translation is possible? (I suspect there is some categorical connection here though I'm not quite sure what it is yet) It means we can write code using a pull-based API, where functions have access to the &amp;quot;full history&amp;quot; of the stream they are transforming, but translate to a push-based API for purposes of execution. The translation will discover and retain the exact portion of the history required to express the transformation, and it will retain this history for only as long as needed. I find it interesting to think about how different operations on &lt;code&gt;ListT&lt;/code&gt; will get mapped to the equivalent &lt;code&gt;Moore&lt;/code&gt; machine. For instance, a function that conses onto a &lt;code&gt;ListT&lt;/code&gt; results in a &lt;code&gt;Moore&lt;/code&gt; that &amp;quot;delays&amp;quot; the input stream by one step. And so on...&lt;/p&gt;&lt;p&gt;This trick of writing code which is parametric in the choice of monad is not really specific to stream processing and I suspect it has other uses. (Ed Kmett uses a similar technique in his recent &lt;a href="http://comonad.com/reader/2011/searching-infinity/"&gt;searching infinity post&lt;/a&gt; and it's probably been used elsewhere) For any given function parametric in the choice of monad, it's fun to consider what sort of interesting structures can be built by substituting different monads.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-8708740969094817999?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/8708740969094817999/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=8708740969094817999' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/8708740969094817999'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/8708740969094817999'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2011/12/programmatic-translation-to-iteratees.html' title='Programmatic translation to iteratees from pull-based code'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-2713226355452350729</id><published>2011-08-13T09:21:00.000-07:00</published><updated>2011-08-13T20:25:17.019-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='economics'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>(Software) patents don't make much economic sense</title><content type='html'>&lt;p&gt;Patents, in particular software patents, usually don't make sense from an economic standpoint. Patents grant a monopoly to an individual for the invention. Having a single individual control production and use of an invention is obviously inefficient compared to allowing everyone to use and build on it freely - there are a huge number of people in the world who could be doing useful things with the invention that are unable to do so because of the monopoly granted. Thus, granting the monopoly means that during the period the patent is valid, society makes less efficient usage of the invention than if it were in the public domain. How much? Well, let us say for the sake of argument that it's 10x less efficient. This number is completely made up, but bear with me.&lt;/p&gt;&lt;p&gt;Of course, patents act as an incentive to inventors. Without the incentive, perhaps it would have taken longer for the invention to become common knowledge. Let us say for the sake of argument that it took an extra five years for invention X to arise in the absence of patents. In this case, after the five year period, society benefits ten times more each year than it would if a single inventor were given a monopoly. If we assume, say, that society derives one unit of utility each year from the invention, then after a single year in the public domain, society will already have obtained double the benefit from the invention than in the previous five years that the patent holder was granted his monopoly. And this will continue for the lifetime of the patent.&lt;/p&gt;&lt;p&gt;Obviously, these numbers are made up. The point is more that the invention really does have to be something that would not have been discovered for a very long time to make up for the inefficiency of granting a monopoly. If it would have been discovered relatively soon anyway we would have been better off just waiting for that to occur. Software patents rarely make sense because software development requires almost no capital investment, and as a result, it is almost impossible for an individual to develop some software invention that would not be discovered by multiple other people soon in the future. Do you know of any individual or organization that is even capable of creating some software "invention" that would not be rediscovered independently anyway in the next five or ten years? I don't. No one is that far ahead of everyone else in software, precisely because there is no capital investment required and no real barriers to entry.&lt;/p&gt;&lt;p&gt;What I find irritating about all the software patent discussion is that patents are intended to benefit society - that is their purpose, &amp;quot;To promote the Progress of Science and useful Arts&amp;quot;. But no one seems to want to reason about whether that is actually happening - that would mean doing things like thinking about how likely the invention was to be independely discovered soon anyway, estimating the multiplier of having the invention be in the public domain, etc. Instead we get regurgitation of this meme about making sure the little guy working in his basement gets compensated for his invention. Aside from the disingenuousness of this example, this is shifting the discussion away from the real issue. The issue is not whether &amp;quot;the little guy&amp;quot; deserves to be compensated for his work, the issue is whether granting &amp;quot;the little guy&amp;quot; or anyone else a monopoly for 17 years is net beneficial to society.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-2713226355452350729?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/2713226355452350729/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=2713226355452350729' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/2713226355452350729'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/2713226355452350729'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2011/08/software-patents-dont-make-much.html' title='(Software) patents don&apos;t make much economic sense'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-2665671493398191964</id><published>2011-07-22T07:19:00.000-07:00</published><updated>2011-07-22T07:21:43.889-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Unrestricted effects and nondeterminism in purely functional code</title><content type='html'>&lt;p&gt;This post is partially a continuation of &lt;a href="http://pchiusano.blogspot.com/2011/06/referentially-transparent.html"&gt;Referentially transparent nondeterminism&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;In a program's outermost input and output layers, unrestricted effects and nondeterminism do not break referential transparency since there is no enclosing program around to observe the effects. This gives us a lot of freedom to do things we wouldn't ordinarily be able to do without breaking purity. Just how far can we push this?&lt;/p&gt;&lt;p&gt;Consider the mealy machine datatype:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;data Mealy a b = Mealy (a -&amp;gt; (b, Mealy a b))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Clearly, we can feed a &lt;code&gt;Mealy a b&lt;/code&gt; from an &lt;code&gt;IO a&lt;/code&gt; producer. But we can also combine two &lt;code&gt;IO a&lt;/code&gt; producers &lt;em&gt;nondeterministically&lt;/em&gt;. It is only when we sample the output that we observe the effect of the nondeterminism. Here is a typeclass expressing this:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;class (MonadPlus s) =&amp;gt; Source s where&lt;br /&gt;  -- &amp;lt;|&amp;gt; has the effect of merging two sources, nondeterministically&lt;br /&gt;  -- perfectly okay, since when we sample later &lt;br /&gt;  -- (at the end of the universe), we get IO&lt;br /&gt;  (&amp;lt;|&amp;gt;) :: s a -&amp;gt; s a -&amp;gt; s a&lt;br /&gt;  transform  :: s a -&amp;gt; Mealy a b -&amp;gt; s b&lt;br /&gt;  drain :: s a -&amp;gt; (a -&amp;gt; IO b) -&amp;gt; IO b&lt;br /&gt;  -- deterministic merge - explicit interleaving&lt;br /&gt;  zip   :: s a -&amp;gt; s b -&amp;gt; s (a,b)&lt;br /&gt;  filter :: (a -&amp;gt; Bool) -&amp;gt; s a -&amp;gt; s a&lt;br /&gt;  -- etc&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This gives us nondeterminism in our input streams, which are nicely first class. Our choice of stream transformers, &lt;code&gt;Mealy&lt;/code&gt;, are also first class. Only when we observe the output of a source (in &lt;code&gt;drain&lt;/code&gt;) do we pay for the side effects we may have accumulated. Note that our &lt;code&gt;(&amp;lt;|&amp;gt;)&lt;/code&gt; does not require us to pick a deterministic interleaving of the two sources - imagine we have a &lt;code&gt;Source&lt;/code&gt; for keypresses of the letter &lt;code&gt;a&lt;/code&gt; and another &lt;code&gt;Source&lt;/code&gt; of keypresses of &lt;code&gt;b&lt;/code&gt;. If our source representation were something like &lt;code&gt;data Source a = Source { sample :: IO a }&lt;/code&gt;, our implementation of &lt;code&gt;(&amp;lt;|&amp;gt;)&lt;/code&gt; would be required to pick a deterministic interleaving (wait for an &lt;code&gt;a&lt;/code&gt;, then wait for a &lt;code&gt;b&lt;/code&gt;, or vice versa). Our implementation can simply pass through either event when it occurs. Effects are paid for only when the stream is observed.&lt;/p&gt;&lt;p&gt;We can allow for more composition in our output handlers with a type like:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;type Sink a b = Iteratee a IO b&lt;br /&gt;&lt;br /&gt;-- replace drain signature in Source with:&lt;br /&gt;drain :: s a -&amp;gt; Sink a b -&amp;gt; IO b&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;We then compose &lt;code&gt;Sink&lt;/code&gt; values like we would any monadic &lt;code&gt;Iteratee&lt;/code&gt;. What we lack, though, is the ability to pass along the nondeterminism from the merging we did with &lt;code&gt;Source&lt;/code&gt;. Meaning, if we call &lt;code&gt;drain (a &amp;lt;|&amp;gt; b) i&lt;/code&gt;, we ought to have the option (since this is the end of the universe), to evaluate this like &lt;code&gt;drain a i OR drain b i&lt;/code&gt;, meaning we do not specify how the effects produced by the two separate instances of &lt;code&gt;i&lt;/code&gt; are interleaved. To acheive this, we can add another constructor to &lt;code&gt;Sink&lt;/code&gt;:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;data Sink a b = One (Iteratee a IO b) | Many (Sink a b)&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;code&gt;Many&lt;/code&gt; is used to signal to the &lt;code&gt;Source&lt;/code&gt; evaluator than any effects &lt;code&gt;&amp;lt;|&amp;gt;&lt;/code&gt;'d together may be fed to the given &lt;code&gt;Sink&lt;/code&gt; concurrently. Again, since no one is around to observe it, this nondeterminism does not break RT. I don't know how often this sort of output nondeterminism is really useful, but it's interesting that it is possible.&lt;/p&gt;&lt;p&gt;So, we so far have allowed unrestricted side effects and nondeterminism at both the input to our programs as well as the output of our programs. What else can we do? Well, we can allow &lt;em&gt;commutative&lt;/em&gt; effects in the implementation of our stream transformer:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;data Mealy m a b = Mealy (a -&amp;gt; m (a, Mealy a b))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Here &lt;code&gt;m&lt;/code&gt; must be some commutative monad. This ensures that we do not need to pick a global ordering on our effects. One interesting addition is to explicitly separate &lt;code&gt;zip&lt;/code&gt; from the monad interface and require only that &lt;code&gt;zip&lt;/code&gt; commute. This lets &lt;code&gt;Mealy&lt;/code&gt; pipelines see effects from earlier pipeline stages, but not from parallel pipelines:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;class Monad m =&amp;gt; Commutative m where &lt;br /&gt;  -- subject to zip a b == fmap (snd *** fst) (zip b a)&lt;br /&gt;  zip :: m a -&amp;gt; m b -&amp;gt; m (a,b)&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As a final thought, I wonder: is it possible to make an instance of &lt;code&gt;Commutative&lt;/code&gt; for something like &lt;code&gt;State&lt;/code&gt;? Clearly, the implementation of &lt;code&gt;zip&lt;/code&gt; should feed the same input &lt;code&gt;s&lt;/code&gt; into both rather than sequencing the state as usual, but what do we do with the two different output states? We require either some sort of commutative state merging function &lt;code&gt;s -&amp;gt; s -&amp;gt; s&lt;/code&gt; to combine the two new states, or we are forced to propagate both values along somehow, without committing to an order!&lt;/p&gt;Related posts:&lt;ul&gt;&lt;li&gt;&lt;a href="http://pchiusano.blogspot.com/2011/07/do-side-effects-really-need-total-order.html"&gt;Do side effects really need a total order?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://pchiusano.blogspot.com/2011/06/referentially-transparent.html"&gt;Referentially transparent nondeterminism&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-2665671493398191964?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/2665671493398191964/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=2665671493398191964' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/2665671493398191964'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/2665671493398191964'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2011/07/unrestricted-effects-and-nondeterminism.html' title='Unrestricted effects and nondeterminism in purely functional code'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-1107130285581475244</id><published>2011-07-06T07:03:00.000-07:00</published><updated>2011-07-06T12:25:56.695-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Do (side) effects really need a total order?</title><content type='html'>Monads and applicative functors enforce a total ordering on sequencing of effects - often not the right choice. Sure, you &lt;i&gt;could&lt;/i&gt; write your distributed or concurrent program to globally sequence effects, but the result is often unnatural, less efficient, and requires more coordination than truly dictated by the problem domain. At the other end of the spectrum we have commutative monads and applicatives - structures where &lt;code&gt;liftM2 f a b == liftM2 (flip f) b a&lt;/code&gt;. These give maximum flexibility in how effects may be sequenced, but the full commutativity requirement is just not possible for most situations. In a commutative monad, there can be no dependencies between the effects - an effect cannot depend in any way on what effects have come before, only on the set of effects that constitute it.&lt;br /&gt;&lt;br /&gt;As an aside, it makes more sense to talk about &lt;i&gt;the commutative applicative functor underlying a monad&lt;/i&gt;, rather than speaking of "commutative monads". The reason is the usual definition of commutativity of effects for monads - that &lt;code&gt;do { a &lt;- e1; b &lt;- e2; f a b }&lt;/code&gt; is equal to &lt;code&gt;do { b &lt;- e2; a &lt;- e1; f a b }&lt;/code&gt; - is only even defined if &lt;code&gt;e2&lt;/code&gt; does not reference &lt;code&gt;a&lt;/code&gt;. That is, the code cannot be actually monadic - for effects that commute, it must be possible to describe the behavior in terms of some underlying applicative. Applicative lifts the global ordering on the &lt;i&gt;production of the effects&lt;/i&gt; (we no longer have to wait until all previous effects have generated to produce the current effect), but not the global ordering on their sequencing (since we cannot rearrange effects without changing meaning).  &lt;br /&gt;&lt;br /&gt;What I'd like is the ability to specify and reason with a &lt;i&gt;partial order&lt;/i&gt; on effects - this would incidentally cover commutativity as the degenerate case where all effects are incomparable. I'm not sure how to express this as an API, though. Here was a first attempt:&lt;pre&gt;&lt;code&gt;class PosetApplicative m where &lt;br /&gt;  pure :: a -&gt; m a&lt;br /&gt;  lift2 :: (a -&gt; b -&gt; c) -&gt; m a -&gt; m b -&gt; m c&lt;br /&gt;  -- False if m a &lt;= m b, that is m b depends on m a&lt;br /&gt;  -- True if not (m a &lt;= m b), m b does not depend on m a&lt;br /&gt;  commutes :: m a -&gt; m b -&gt; Bool&lt;/code&gt;&lt;/pre&gt;The idea here is that if &lt;code&gt;commutes a b&lt;/code&gt;, then &lt;code&gt;lift2 f a b == lift2 (flip f) b a&lt;/code&gt;. This enables a few things - &lt;code&gt;sequence&lt;/code&gt; and &lt;code&gt;traverse&lt;/code&gt; could now work very differently - they might internally rearrange the effects in some way that is more efficient, then reconstruct the original sequence at the end. Or imagine the stream of effects one might feed to an iteratee - the iteratee could choose to skip a value in the input - the corresponding effect needed to generate this input could then be placed in a buffer and sequenced in later only if a subsequent effect depends on it. Hence the iteratee would not be forced to wait for evaluation of effects it was not interested in!   &lt;br /&gt;&lt;br /&gt;This interface would need to be implemented a bit differently than most applicatives. Consider the state applicative. The &lt;code&gt;commutes a b&lt;/code&gt; translates to the claim that &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt; operate on separate portions of the state. E.g., the state is a pair and &lt;code&gt;a&lt;/code&gt; only touches the first element of the pair and &lt;code&gt;b&lt;/code&gt; only touches the second. We might say there is some lensing going on, and the two foci of the lenses do not overlap. But it should be clear that the regular representation of state - &lt;code&gt;s -&gt; (a,s)&lt;/code&gt; - is not going to work, as we have no knowledge of what this opaque function might do with the given &lt;code&gt;s&lt;/code&gt;. We need some sort of ADT to track the portion of the state each state action is given access to, so we can explicitly reason about dependencies and how they overlap. &lt;br /&gt;&lt;br /&gt;I am only just starting to think about these possibilities. I'd love to hear if this has been looked into before, or if anyone else has experience with this sort of thing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-1107130285581475244?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/1107130285581475244/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=1107130285581475244' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/1107130285581475244'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/1107130285581475244'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2011/07/do-side-effects-really-need-total-order.html' title='Do (side) effects really need a total order?'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-2377977402547653388</id><published>2011-06-06T21:00:00.000-07:00</published><updated>2011-06-06T21:00:18.880-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Referentially transparent nondeterminism</title><content type='html'>At &lt;a href="http://www.nescala.org/2011/"&gt;NEScala&lt;/a&gt;, I &lt;a href="http://vimeo.com/20307408"&gt;gave a talk&lt;/a&gt; on using mealy machines for deterministic parallelism. I think it's a nice model that works out pretty well. But, the more I've thought about it, the more I think deterministic parallelism isn't enough - nondeterminism is needed in some form, but with a way of preserving referential transparency (RT) by not allowing this nondeterminism to be observable by programs. &lt;br /&gt;&lt;br /&gt;Nondeterminism is both useful and important. In the context of stream transforming functions, we need to be able to merge two inputs to a stream transformer without specifying an ordering on the elements. As an example, if we have an event stream for mouse clicks and an event stream for key presses, we need to be able to merge these streams in a nondeterministic way, without saying anything at all about how the streams interleave - that is, I do not want to specify that the output stream is, say, all the mouse clicks followed by all the key presses, or the two interleaved in a completely regular way. No, what I want is more like the result of summing two independent random variables - the actual stream of events for the merged stream will be interleaved in a random way (not truly random, but modeled outside of the system), where there is no interaction between these events. I should be able to remove (via filtering) all the key presses from the merged stream and obtain something equivalent to the original mouse click stream. &lt;br /&gt;&lt;br /&gt;In order to keep RT, then, whatever consumes this stream must also end the universe (it can never return). If such a consumer did return, we could then feed the stream to another consumer and observe that we did not have the same results. But this is nothing new, functional programming is all about moving side effects to the edges of the program where they cannot be observed. The only difference is what effects we're hiding in this way. There's something neat about the approach though - instead of eliminating the nondeterminism entirely, which would force us to chose some global, explicit ordering for our effects (unrealistic for a distributed system and many other situations), we keep the nondeterminism and just focus on pushing it to the outside of our programs like we would other effects.&lt;br /&gt;&lt;br /&gt;At some point I'll write up these ideas in code to make it clearer what heck I'm talking about.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-2377977402547653388?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/2377977402547653388/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=2377977402547653388' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/2377977402547653388'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/2377977402547653388'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2011/06/referentially-transparent.html' title='Referentially transparent nondeterminism'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-5081387013884503903</id><published>2011-05-18T06:53:00.000-07:00</published><updated>2011-10-20T07:01:12.774-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Making the most of Scala's (extremely limited) type inference</title><content type='html'>The lack of good inference is one of the most annoying things hindering Scala's usability as a functional language, and it's actually been a real problem for us in some of the code we write at &lt;a href="http://capitaliq.com"&gt;Capital IQ&lt;/a&gt;. Good type inference is important for functional programming - it is important to be able to pull any common functionality out into higher-order functions, and not be penalized with a boatload of manual type annotations for doing so. I liken the friction we sometimes encounter dealing with Scala's poor type inference to be something like what one would encounter doing functional programming in Java - yes, you can use anonymous classes to simulate first-class functions, etc, but the amount of boilerplate really discourages you. &lt;br /&gt;&lt;br /&gt;With all that said, there are some workarounds and rules you can follow to make the most of Scala's limited inference. Type information in Scala flows from function arguments to their results (when the types of a function's arguments are known, that type information will flow into the body of the function), from left to right across argument lists, and from first to last across statements. This is in contrast to a language with full type inference, where (roughly speaking) type information flows unrestricted in all directions. But, if you pay attention to Scala's limitations, you can get better type inference out of your APIs. (Scala will also never infer a higher-kinded type which is a partially applied type-level function, but that is a topic for another post.)&lt;br /&gt;&lt;br /&gt;Let's look at a first example, illustrating the fact that type information does not flow from a function's body to its arguments:&lt;pre&gt;&lt;code&gt;scala&gt; case class Foo() { def quuux(f: Foo) = 12 }&lt;br /&gt;defined class Foo&lt;br /&gt;&lt;br /&gt;scala&gt; val f = (x,y) =&gt; x quuux y&lt;br /&gt;error: missing parameter type&lt;br /&gt;val f = (x,y) =&gt; x quuux y&lt;br /&gt;         ^&lt;br /&gt;error: missing parameter type&lt;br /&gt;val f = (x,y) =&gt; x quuux y&lt;br /&gt;           ^&lt;/code&gt;&lt;/pre&gt;So, even though the body of the lambda fixes the types of &lt;code&gt;x&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; to be &lt;code&gt;Foo&lt;/code&gt; (no other class in scope has a method &lt;code&gt;quuux&lt;/code&gt;), this type information will not flow 'upwards' to the function's arguments. This is a good thing to keep in mind as you are designing your APIs. Epecially when writing higher-order functions, you often expect to receive anonymous functions whose types you'd like inferred. &lt;br /&gt;&lt;br /&gt;Here's an example:&lt;pre&gt;&lt;code&gt;def zipWith[A,B,C](a: List[A], b: List[B], f: (A,B) =&gt; C): &lt;br /&gt;  List[C] = a zip b map f.tupled&lt;/code&gt;&lt;/pre&gt;Note what happens when we try using this function:&lt;pre&gt;&lt;code&gt;scala&gt; zipWith(List(1,2), List(2,3), (a,b) =&gt; a+b)&lt;br /&gt;&lt;br /&gt;error: missing parameter type&lt;br /&gt;zipWith(List(1,2), List(2,3), (a,b) =&gt; a+b)&lt;br /&gt;                               ^&lt;br /&gt;error: missing parameter type&lt;br /&gt;zipWith(List(1,2), List(2,3), (a,b) =&gt; a+b)&lt;br /&gt;                                 ^&lt;/code&gt;&lt;/pre&gt;What's going on here? Type information does not flow from left to right &lt;i&gt;within&lt;/i&gt; an argument list, only from left to right &lt;i&gt;across&lt;/i&gt; argument lists. So, even though Scala knows the types of the first two arguments, which would fix the type parameters &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;B&lt;/code&gt; to be &lt;code&gt;Int&lt;/code&gt;, that information does not flow to our anonymous function.&lt;br /&gt;&lt;br /&gt;If we change our definition of zipWith to be curried, then we can get type information to flow from the arguments to our binary function: &lt;pre&gt;&lt;code&gt;scala&gt; def zipWith[A,B,C](a: List[A], b: List[B])(f: (A,B) =&gt; C):&lt;br /&gt;  List[C] = a zip b map f.tupled&lt;br /&gt;&lt;br /&gt;zipWith: [A,B,C](a: List[A],b: List[B])(f: (A, B) =&gt; C)List[C]&lt;br /&gt;&lt;br /&gt;scala&gt; zipWith(List(1,2), List(2,3))((a,b) =&gt; a+b) &lt;br /&gt;res1: List[Int] = List(3, 5)&lt;br /&gt;&lt;br /&gt;scala&gt; zipWith(List(1,2), List(2,3))(_ + _)&lt;br /&gt;res2: List[Int] = List(3, 5)&lt;/code&gt;&lt;/pre&gt;Now that our binary function is in a separate argument list, any type information from the previous argument lists is used to fill in the types for our function. So, &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;B&lt;/code&gt; are fixed to be &lt;code&gt;Int&lt;/code&gt; by the first argument list, and therefore we don't need to annotate our lambda's parameters. From here, the type information about our lambda's function parameters will flow into the body of the lambda, and in this case, following the same inference rules, Scala is able to infer the result type - in this case, again, &lt;code&gt;Int&lt;/code&gt;. Note that if you have a method &lt;code&gt;bar&lt;/code&gt; on some object &lt;code&gt;foo&lt;/code&gt;, the type inference flow for &lt;code&gt;foo.bar(x,y)&lt;/code&gt; is going to be the same as if you defined bar as a standalone function &lt;code&gt;def bar (foo: Foo)(x: X, y: Y)&lt;/code&gt;. This explains why our actual implementation of this function - &lt;code&gt;a zip b map f.tupled&lt;/code&gt; - does not require type annotation.&lt;br /&gt;&lt;br /&gt;Incidentally, this restriction that type information does not flow at least left to right within a single argument list appears completely artificial to me. I posted &lt;a href="http://lampsvn.epfl.ch/trac/scala/ticket/3293"&gt;a ticket&lt;/a&gt; about this a while ago - Jason Zaugg commented there about the trick of currying the arguments to get better type propagation.&lt;br /&gt;&lt;br /&gt;Question: do you think the following will require type annotations? &lt;pre&gt;&lt;code&gt;scala&gt; zipWith(List(List(1,2)), List(List(2,3)))(&lt;br /&gt;  zipWith(_,_)(_ + _))&lt;/code&gt;&lt;/pre&gt;The answer is... no! But why? Expanding this out a bit, we have:&lt;pre&gt;&lt;code&gt;scala&gt; zipWith(List(List(1,2)), List(List(2,3)))(&lt;br /&gt;  (a,b) =&gt; zipWith(a,b)(_ + _))&lt;/code&gt;&lt;/pre&gt;By the previous reasoning, the types of a and b will be known (fixed from the previous argument list). Now we apply the rule that type information flows from a function's arguments into its body. From this Scala infers the type &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;B&lt;/code&gt; for the inner zipWith, and as a result the argument types for &lt;code&gt;(_ + _)&lt;/code&gt; are known. This information flows into the body of &lt;code&gt;(_ + _)&lt;/code&gt;, and Scala is able to infer the result is &lt;code&gt;Int&lt;/code&gt;. &lt;br /&gt;&lt;br /&gt;Note that if we were to declare our lambda outside of the call to zipWith, we don't get good type inference flow since type information does not flow backwards across statements - only forwards:&lt;pre&gt;&lt;code&gt; scala&gt; {val f = (_ + _); zipWith(List(1,2), List(2,3))(f)}&lt;br /&gt;error: missing parameter type for &lt;br /&gt;expanded function ((x$1, x$2) =&gt; x$1.$plus(x$2))&lt;br /&gt;{val f = (_ + _); zipWith(List(1,2), List(2,3))(f)}&lt;br /&gt;          ^&lt;br /&gt;error: missing parameter type for &lt;br /&gt;expanded function ((x$1 &lt;error&gt;, x$2) =&gt; x$1.$plus(x$2))&lt;br /&gt;{val f = (_ + _); zipWith(List(1,2), List(2,3))(f)}&lt;br /&gt;              ^&lt;/code&gt;&lt;/pre&gt;So, it's actually better for type inference to not declare variables for expressions you are only going to reference once. This is the style I prefer anyway, and it's much more common in purely functional programming - you just get used to reading more nested expressions, and the structure of the code is more clear. &lt;br /&gt;&lt;br /&gt;Judicious use of these rules can make your APIs much more usable. But don't overdo it - for instance, writing &lt;code&gt;zipWith&lt;/code&gt; like this does not improve type inference over our previous version:&lt;pre&gt;&lt;code&gt;def zipWith[A,B,C](a: List[A])(b: List[B])(f: (A,B) =&gt; C): &lt;br /&gt;List[C]&lt;/code&gt;&lt;/pre&gt;There is no type information to flow from the first argument list to the second - the types &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;B&lt;/code&gt; are completely unrelated. So we might as well put &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt; in the same argument list.&lt;br /&gt;&lt;br /&gt;Scala's limitations are particularly unfortunate - as we've seen, in many cases the information is there to do better inference and the restrictions in place are artificial. I really hope the Scala developers put some more resources into improving the type inference in Scala - it's one of those improvements that isn't an obvious 'feature' win but it can dramatically improve everyday work in Scala for those who are using the language heavily.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-5081387013884503903?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/5081387013884503903/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=5081387013884503903' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/5081387013884503903'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/5081387013884503903'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2011/05/making-most-of-scalas-extremely-limited.html' title='Making the most of Scala&apos;s (extremely limited) type inference'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-374930394926452096</id><published>2011-03-02T09:00:00.000-08:00</published><updated>2011-03-02T09:00:12.410-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='economics'/><category scheme='http://www.blogger.com/atom/ns#' term='politics'/><title type='text'>Cliff Asness's excellent rant on health care</title><content type='html'>I was just rereading &lt;a href="http://www.realclearpolitics.com/articles/2009/07/23/health_care_mythology_97552.html"&gt;Health Care Mythology&lt;/a&gt;, by &lt;a href="http://en.wikipedia.org/wiki/Cliff_Asness"&gt;Cliff Asness&lt;/a&gt;. Most of his analysis is spot on. Here's a nice excerpt:&lt;br /&gt;&lt;blockquote&gt;The government does not co-exist or compete fairly with private enterprise, anywhere. It does not play well with others. The regulator cannot be a competitor at the same time. It cannot compete fairly while it owns the armed forces and courts. Finally, it cannot be a fair competitor if when the "public option" screws up (can't pay its bills), the government implicitly or explicitly guarantees its debts. We have seen what happens in that case and don't need a re-run.&lt;br /&gt;&lt;br /&gt;The first thing the government does is underprice the private system. You can easily be forgiven for thinking this is a good thing. Why not, cheaper is better, right? Wrong. They will underprice private enterprise by charging less to the purchaser of health insurance, not by actually creating it cheaper. Who makes up the difference? Well, you and your family do if you pay taxes, or your kids will pay taxes, or their kids will pay taxes. The government can always underprice competition, not through the old fashioned way of doing it better, they never do that, but by robbing Peter to pay for Paul. They are taking money from your left pocket and giving you a small portion of it back in your right pocket. They do it every day before breakfast, and take a victory lap for the small portion they return.&lt;br /&gt;&lt;br /&gt;Second, the government ultimately always cheats when it's involved in "honest" competition. Try mailing a first class letter through Fed-Ex, or placing an off-track bet on your favorite horse with a bookie, or playing a lottery through a private company. Uh, you can't, so please stop trying, I don't want you to hurt yourself. Once the government discovers it cannot win, it changes the rules. You see, the government has the power to legislate, steal, imprison, and even kill. Those are advantages most private firms do not have, save Google, and you did not hear that from me (we all know the Google guy with one eye-brow would crush your larynx for creating a competing search engine).&lt;/blockquote&gt;There's some other nice essays on &lt;a href="http://www.stumblingontruth.com/"&gt;his personal site&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-374930394926452096?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/374930394926452096/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=374930394926452096' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/374930394926452096'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/374930394926452096'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2011/03/cliff-asnesss-excellent-rant-on-health.html' title='Cliff Asness&apos;s excellent rant on health care'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-6948600826939274009</id><published>2011-02-08T18:43:00.000-08:00</published><updated>2011-02-08T20:39:56.904-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Comonads for stream processing... not so great</title><content type='html'>The problem with comonads is the &lt;code&gt;extend&lt;/code&gt; operation. The signature tells you all you need to know: &lt;code&gt;&lt;pre&gt;extend :: (f a -&gt; b) -&gt; f a -&gt; f b&lt;/pre&gt;&lt;/code&gt;What's wrong with this signature? Well, it means the function passed to &lt;code&gt;extend&lt;/code&gt; will have access to the full context of each element of the stream, and moreover, the extent of this context is &lt;i&gt;globally&lt;/i&gt; determined by the comonad's implementation of &lt;code&gt;extend&lt;/code&gt;. So, in the case of causal stream functions, where the context is say, the list of previous values in the stream, we can, at any point after the fact, reach arbitrarily far back into the past to inspect some values. This implies we must somehow hold onto these values, even if we discover (dynamically) we didn't need them.&lt;br /&gt;&lt;br /&gt;If we attempt to provide some more limited notion of context, that limited notion is global, and it restricts the access of every single node in our dataflow graph that might want to use such information.&lt;br /&gt;&lt;br /&gt;In a decent API for causal stream functions, it is an absolute requirement that each individual node be capable of deciding how much context it needs to hold onto, not after the fact, but as it receives each element. And importantly, we want these decisions be made locally. Thus, the stream processors are consumers to which we push data, rather than having them pull certain data from history, maintained globally by the producer.&lt;br /&gt;&lt;br /&gt;There's a nice comment on the &lt;a href="http://lambda-the-ultimate.org/node/988#comment-13305"&gt;LtU discussion of "The Essence of Dataflow Programming" that sums it up for me&lt;/a&gt;:&lt;blockquote&gt;Perhaps the fact that it can be done at all with a comonad (regardless of how simply) is of theoretical interest, but that is not the slant Uustalu and Vene and put on it. They claim (over-ambitiously I think) that comonadic semantics capture the "essence" of dataflow semantics...&lt;br /&gt;&lt;br /&gt;My personal take on the whole thing is they succumbed to the understandable excitement of finding an application for comonads and confused a task that *can* be done with a comonad for one that *should* be done with a comonad.&lt;/blockquote&gt;... although I disagree with the author of that comment that applicative functors capture more of the 'essence' of dataflow programming.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-6948600826939274009?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/6948600826939274009/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=6948600826939274009' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/6948600826939274009'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/6948600826939274009'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2011/02/comonads-for-stream-processing-not-so.html' title='Comonads for stream processing... not so great'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-6207379455016501135</id><published>2010-09-22T16:55:00.000-07:00</published><updated>2010-09-22T21:11:39.288-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>The four stages of (functional) programming</title><content type='html'>There seem to be a series of stages of abstraction in functional programming - actually, I'd say in all programming, but it's more obvious in FP:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Stage 0: no functions, and thus, no abstraction&lt;br /&gt;&lt;li&gt;Stage 1: functions, but no higher-order functions&lt;br /&gt;&lt;li&gt;Stage 2: higher-order functions, combinator libraries&lt;br /&gt;&lt;li&gt;Stage 3: typeclasses, abstracting over combinator libraries&lt;br /&gt;&lt;li&gt;Stage 4: abstracting over typeclasses&lt;/ul&gt;Each stage lets you remove a class of code duplication. If you were to code in a stage 0 language, your only means of reuse would be copy/paste. Stage 1 introduces the ability to abstract over functionality, but in a limited way, since the structure of the computation is fixed. (I think many programs in a language like Java are stage 1 programs, or close to it.) Stage 2 introduces HOFs which removes a class of duplication by letting us abstract over control flow (this is where I put most Lisp/Scheme programs). From this we get the usual combinator library style. But at this point, we are still dealing with concrete types. Common patterns across these libraries are not abstracted over.&lt;br /&gt;&lt;br /&gt;Stage 3 is the generalization of combinator libraries - code that is generic across multiple such libraries - think of the generalized combinator library we obtain from a monad, or a traversable. This removes the duplication found due to common patterns in collections of higher-order functions. I wonder - was Haskell's type system and commitment to purity a prerequisite to developing this stage of programming? In principle stage 3 is possible in any dynamic language with first class modules and first class functions.&lt;br /&gt;&lt;br /&gt;And stage 4... well, we're only starting to see what stage 4 looks like. In stage 3 programming, we can write code polymorphic to any instance of the typeclass, but there could be duplication in the implementations of different typeclass instances - for instance, you might create a data type that is essentially a list of trees, but in giving the &lt;code&gt;Applicative&lt;/code&gt; instance for your type, you start from scratch. In stage 4 programming, you'd assemble your type &lt;i&gt;and appropriate typeclass implementations&lt;/i&gt; from combinators. This is done a little bit here and there in FP - you might compose Applicatives or take their product, but in stage 4 this style is &lt;i&gt;pervasive&lt;/i&gt; and taken to its logical limit.&lt;br /&gt;&lt;br /&gt;Stage 4 requires some serious rewiring of your programming brain. Forget concrete types or concrete type implementations. You don't implement a tree datatype. You &lt;i&gt;assemble&lt;/i&gt; the type and all its useful functions from a catalog of typeclass combinators. Category theory would become an &lt;i&gt;essential&lt;/i&gt; tool for programming in this style ("oh, there's a natural isomorphism between this type and blah, just gotta establish this isomorphism to get all that functionality"). Would we then see category theory and this style of programming taught as part of the standard computer science curriculum? Maybe...&lt;br /&gt;&lt;br /&gt;I don't know of any any language that supports stage 4 programming adequately. It seems to require, among other things, first class and higher order modules. Probably more powerful type systems, too. &lt;a href="http://comonad.com/reader/"&gt;Ed Kmett&lt;/a&gt; has an interesting language he's been working on, called Kata. Based on what he's told me about it, it looks specifically designed to make stage 4 programming easy. Interestingly, I think he's given up on static typing.&lt;br /&gt;&lt;br /&gt;What do you think? Is stage 4 the future of (functional) programming?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-6207379455016501135?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/6207379455016501135/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=6207379455016501135' title='18 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/6207379455016501135'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/6207379455016501135'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2010/09/four-stages-of-functional-programming.html' title='The four stages of (functional) programming'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>18</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-7794304893218514529</id><published>2010-09-13T16:00:00.000-07:00</published><updated>2010-09-14T21:46:04.444-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Push libraries vs pull libraries</title><content type='html'>Ever get the feeling there's something very wrong with the library you're using, something a little too &lt;span style="font-style:italic;"&gt;framework-y&lt;/span&gt; about it, but you can't quite put your finger on the problem? My pet theory: it's often that the library is structured to &lt;i&gt;push&lt;/i&gt; its functionality to client code, rather than having client code &lt;i&gt;pull&lt;/i&gt; functionality that's needed.&lt;br /&gt;&lt;br /&gt;In a push library, the library pushes the functionality to clients, who then override how certain steps are performed. In this model, client code can only change overall program behavior by overriding or changing code in places predetermined by the library. A consequence is that overall control flow is rigid - it is mostly fixed by the library writer. As push libraries develop over time, they accumulate more and more options and hooks in different places to handle all the different control flows users want. The result isn't pretty. But even with all the complexity that accumulates, the library never seems to do exactly what you want it to. &lt;br /&gt;&lt;br /&gt;In contrast, a "pull" library puts client code in control. Client code makes all decisions regarding control flow and the library provides control flow combinators appropriate for the domain. Not only is this way more flexible and expressive (the client can construct control flows not envisioned by the framework designer), it's also much more pleasant for the client programmer. The client programmer now just has to understand the primitives and the control flow combinators he's using for his particular application, and nothing more.&lt;br /&gt;&lt;br /&gt;Libraries written in the pull style are more like combinator libraries and tend to make use of higher order functions, closures, etc. Perhaps this explains why this style isn't very common in languages like Java and why the big "frameworks" in these languages are often bloated monstrosities nobody likes using.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-7794304893218514529?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/7794304893218514529/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=7794304893218514529' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/7794304893218514529'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/7794304893218514529'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2010/09/push-libraries-vs-pull-libraries.html' title='&lt;i&gt;Push&lt;/i&gt; libraries vs &lt;i&gt;pull&lt;/i&gt; libraries'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-5471056167509795878</id><published>2010-09-12T20:26:00.001-07:00</published><updated>2010-09-12T20:38:03.101-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Question about simplifying Scala 2.8 collections</title><content type='html'>In &lt;a href="http://www.reddit.com/r/programming/comments/d5s9g/scala_is_for_vb_programmers/c0xs4ps"&gt;this reddit thread&lt;/a&gt;, I speculated that Scala's 2.8 collections &lt;a href="http://www.reddit.com/r/programming/comments/d5s9g/scala_is_for_vb_programmers/c0xt9a3"&gt;could be simplified&lt;/a&gt; by (among other things) separating &lt;code&gt;def map[B](f: A =&gt; B): F[B]&lt;/code&gt; and &lt;code&gt;def mapId(f: A =&gt; A): F&lt;/code&gt;. Can anyone come up with an example where having these be separate methods results in code duplication? I suspect there's no such example, but it might be that I'm not thinking hard enough. Someone prove me wrong!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-5471056167509795878?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/5471056167509795878/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=5471056167509795878' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/5471056167509795878'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/5471056167509795878'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2010/09/question-about-simplifying-scala-28.html' title='Question about simplifying Scala 2.8 collections'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-4647385221965628479</id><published>2010-07-30T09:36:00.000-07:00</published><updated>2010-08-03T14:20:59.252-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Reification of time in FRP is a conceptual error</title><content type='html'>I'm finally getting around to learning more about functional reactive programming and it is an absolutely fantastic field. FRP is a huge, important idea in programming... the missing link connecting FP to all the "inherently stateful" tasks that supposedly necessitate imperative programming (things like simulations, GUIs, etc). I'm quite excited to see how this technology evolves.&lt;br /&gt;&lt;br /&gt;That said, I think some of the explanations for FRP indicate what is in my opinion a conceptual error - &lt;i&gt;the reification of time&lt;/i&gt;. It may be that I am misunderstanding what is being meant by FRP researchers and the following rant will seem wholly uncontroversial. But if not, or if what I say seems crazy, please leave comments! Just to clarify, this is not a post trashing FRP - I am simply arguing against a particular way of viewing FRP that I feel is unhelpful.&lt;br /&gt;&lt;br /&gt;Reactive behaviors in FRP are often described in terms of the following "purely conceptual" model: &lt;code&gt;(Time -&gt; a) -&gt; (Time -&gt; b)&lt;/code&gt;. Time? Why on earth does time appear explicitly in the model? Values do not depend on time, they depend on other values. When we say that a value is time-dependent, we are just being sloppy, implicitly assuming the presence of some process that will provide our function (a function is a dependent value) with a series of possibly distinct inputs. &lt;br /&gt;&lt;br /&gt;Thus, for example, if we are really trying to be precise, it is a conceptual error to say that "the stock price of Microsoft is time-dependent" (of course, I don't object to this as an informal statement). The stock price of Microsoft is dependent on various things - current interest rates, say, the set of competing companies' products, and who knows what else... the price of oil, whether a butterfly in Idaho is flapping wings, etc. Time is just a proxy for some process that will provide our Microsoft price computing function with a series of &lt;code&gt;(interest rate, competing products, oil price, ...)&lt;/code&gt; tuples. In principle, if we could clone our universe and "at a later time" supply this function with the same values, it would produce the same output! So why talk about time at all? &lt;br /&gt;&lt;br /&gt;Reifying time, even in informal language about FRP or "loose conceptual models" is therefore confusing and misleading because it obscures the real story (of processes and their dependencies) behind an opaque and ill-defined notion. And if we attempt to reify time in the programming model we run into problems as soon as we attempt to make it less opaque - endowing time with any sort of operations yields unimplementable gobbledeygook, where functions can reach back into the past arbitrarily far (which suggests an omniscient runtime remembering every value that ever was), or look into the future, violating causality! (These problems are also &lt;a href="http://conal.net/blog/posts/garbage-collecting-the-semantics-of-frp/"&gt;discussed here by Conal&lt;/a&gt;) For these reasons I think it makes more sense to talk directly about processes and their dependencies, and it seems this is exactly what is done in modern FRP formulations! For instance, in &lt;a href="http://lambda-the-ultimate.org/node/3659"&gt;this paper on causal commutative arrows&lt;/a&gt;, time is not even explicitly mentioned! One possible instantiation of their model is: &lt;code&gt;data SF a b = SF (a -&gt; (b, SF a b))&lt;/code&gt;. Time is not mentioned at all; the paper deals with processes directly, which are the real story. Continuity is handled without any special consideration, as we can push updates through the dependency graph as often as we like, and "step size" is just like any other parameter propagating through the graph! &lt;br /&gt;&lt;br /&gt;As an aside, you can tell there's problems when it is not possible to expose a model in its full power (and not just for efficiency - most &lt;code&gt;(Time -&gt; a) -&gt; (Time -&gt; b)&lt;/code&gt; functions are actually ill-defined) and we must instead resort to a rather ad hoc set of combinators that expose only pieces. Another way I like to think of this situation is that the combinators chosen actually induce another model that may be quite different than the intended "conceptual model", and the conceptual model becomes a rather bad way of discussing what is actually happening. I distinguish between this and omitting possible operations in the conceptual model purely for efficiency reasons. (I wonder if there is a precise way of clarifying this distinction.)&lt;br /&gt;&lt;br /&gt;Lastly, when I was initally thinking about FRP, it was in the context of distributed, reactive programs that could replace &lt;a href="http://pchiusano.blogspot.com/2010/01/actors-are-not-good-concurrency-model.html"&gt;actors, which are stateful&lt;/a&gt;, and I came up with this type: &lt;code&gt;data Pipe a b = Pipe (a -&gt; Future (b, Pipe a b))&lt;/code&gt;, where &lt;code&gt;Future&lt;/code&gt; is the monad for concurrent evaluation (I couldn't believe it when I read the CCA paper and encountered the almost identical type, &lt;code&gt;SF&lt;/code&gt;!). But this could be generalized to &lt;code&gt;data Pipe m a b = Pipe (a -&gt; m (b, Pipe a b))&lt;/code&gt;. With different choices of &lt;code&gt;m&lt;/code&gt;, we obtain different sorts of systems. When I showed this to my coworker &lt;a href="http://apocalisp.wordpress.com/"&gt;Rúnar&lt;/a&gt;, he immediately suggested the list monad, implying nondeterminism. Along these lines, we can also imagine a probability monad, also implying nondeterminism in the system's evolution but something more akin to quantum physics. This would admit the possibility of a more interesting evaluator, one that culls out branches whose probability dips below some epsilon (or even lazily evaluates all possible branches, even low probability ones, but only when they are demanded). Interestingly, with &lt;code&gt;m&lt;/code&gt; as an arbitrary &lt;code&gt;Applicative&lt;/code&gt; (but not a monad), I believe we obtain the "commutativity" law suggested in the CCA paper, which basically enforces that processes, once split, cannot interfere with one another.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-4647385221965628479?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/4647385221965628479/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=4647385221965628479' title='33 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/4647385221965628479'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/4647385221965628479'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2010/07/reification-of-time-in-frp-is.html' title='Reification of time in FRP is a conceptual error'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>33</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-4680500345704361906</id><published>2010-06-29T11:21:00.000-07:00</published><updated>2010-07-05T12:15:20.903-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Alignable functors: a typeclass for 'zippy' containers</title><content type='html'>Applicative functors don't quite work for some applications. At work we had some code that needed to abstract over several containers each equipped with some sort of "zipWith" operation (for instance: lists, maps, trees). Unfortunately the &lt;code&gt;Applicative&lt;/code&gt;-derived version of zipWith was unsuitable. Here's that implementation:&lt;pre&gt;&lt;code&gt;zipWith :: (a -&amp;gt; b -&amp;gt; c) -&amp;gt; f a -&amp;gt; f b -&amp;gt; f c&lt;br /&gt;zipWith f x y = f &amp;lt;$&amp;gt; x &amp;lt;*&amp;gt; y&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;For "zippy" applicatives, in addition to the usual &lt;code&gt;Applicative&lt;/code&gt; laws, &lt;code&gt;zipWith&lt;/code&gt; should satisfy:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;zipWith f xs xs == fmap (\x -&amp;gt; f x x) xs&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;em&gt;Aside:&lt;/em&gt; it is well known that we could define &lt;code&gt;&amp;lt;*&amp;gt;&lt;/code&gt; in terms of &lt;code&gt;zipWith&lt;/code&gt;: &lt;code&gt;(&amp;lt;*&amp;gt;) = zipWith ($)&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The &lt;code&gt;Applicative&lt;/code&gt;-derived &lt;code&gt;zipWith&lt;/code&gt; has no good way of handling containers with different shapes. Since the lifted function always receives both an &lt;code&gt;a&lt;/code&gt; and a &lt;code&gt;b&lt;/code&gt;, the shape of the resulting &lt;code&gt;f c&lt;/code&gt; &lt;em&gt;must&lt;/em&gt; be the intersection of the two shapes being zipped -- &lt;code&gt;zipWith&lt;/code&gt; has no way to notify the zipping function, &lt;code&gt;f&lt;/code&gt;, that it has encountered a position where only the first container or only the second container has a value.&lt;br /&gt;&lt;br /&gt;Here are some examples of how this plays out:&lt;ul&gt;&lt;li&gt;&lt;code&gt;zipWith&lt;/code&gt; for two lists results in a list whose length is the smaller of the two&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;code&gt;zipWith&lt;/code&gt; for two maps results in a map whose key set is the intersection of the two input key sets&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;code&gt;zipWith&lt;/code&gt; for two binary trees will contain only the nodes with a path from the root that exists in both input trees&lt;/li&gt;&lt;/ul&gt;This behavior might be acceptable for some applications but for ours it wasn't. Incidentally, the behavior also propagates to &lt;code&gt;Traversable&lt;/code&gt;, so for instance using &lt;code&gt;traverse&lt;/code&gt; to transpose a &lt;code&gt;[[a]]&lt;/code&gt; will truncate the output lists:&lt;pre&gt;&lt;code&gt;Prelude&amp;gt; getZipList $ sequenceA [ZipList [1], ZipList [1,2]]&lt;br /&gt;[[1,1]]&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;The second column, the one that should be &lt;code&gt;[2]&lt;/code&gt;, is removed because not all lists being transposed have a second element. We get similar behavior with other zippy applicatives.&lt;br /&gt;&lt;br /&gt;If we choose to take the &lt;em&gt;union&lt;/em&gt; of the two shapes being zipped rather than their intersection, we obtain the &lt;code&gt;Alignable&lt;/code&gt; typeclass:&lt;pre&gt;&lt;code&gt;class Functor f =&amp;gt; Alignable f where &lt;br /&gt;   empty :: f a&lt;br /&gt;   align :: (OneOrBoth a b -&amp;gt; c) -&amp;gt; f a -&amp;gt; f b -&amp;gt; f c&lt;br /&gt;&lt;br /&gt;data OneOrBoth a b = Both a b | OneL a | OneR b&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;This should satisfy:&lt;ul&gt;&lt;li&gt;&lt;code&gt;align f a empty == fmap (f . OneL) a&lt;/code&gt;&lt;/li&gt;&lt;li&gt;&lt;code&gt;align f empty a == fmap (f . OneR) a&lt;/code&gt;&lt;/li&gt;&lt;li&gt;&lt;code&gt;align f a a == fmap (\x -&amp;gt; f $ Both x x) a&lt;/code&gt;&lt;/li&gt;&lt;/ul&gt;So where &lt;code&gt;Applicative&lt;/code&gt; has &lt;code&gt;pure&lt;/code&gt;, &lt;code&gt;Alignable&lt;/code&gt; has &lt;code&gt;empty&lt;/code&gt;, and where &lt;code&gt;Applicative&lt;/code&gt; has &lt;code&gt;&amp;lt;*&amp;gt;&lt;/code&gt;, &lt;code&gt;Alignable&lt;/code&gt; has &lt;code&gt;align&lt;/code&gt;. Like &lt;code&gt;Applicative&lt;/code&gt;, we could use these to implement &lt;code&gt;fmap&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;If you have an &lt;code&gt;Alignable&lt;/code&gt; for some &lt;code&gt;f&lt;/code&gt;, you can lift Iteratee, &lt;a href="http://comonad.com/reader/2009/iteratees-parsec-and-monoid/"&gt;&lt;code&gt;Reducer&lt;/code&gt;&lt;/a&gt; and various other &lt;code&gt;Reducer&lt;/code&gt;-like objects to &lt;code&gt;f&lt;/code&gt;. Monoids and groups can also be lifted to &lt;code&gt;Alignable&lt;/code&gt;. And the version of &lt;code&gt;traverse&lt;/code&gt; based on &lt;code&gt;Alignable&lt;/code&gt; is more appropriate for some applications.&lt;br /&gt;&lt;br /&gt;One of the problems with &lt;code&gt;Alignable&lt;/code&gt; is it doesn't provide a way to lift algebraic structures with more than one distinguished element. If the structure has just one distinguished element (like a monoid or group), then &lt;code&gt;empty&lt;/code&gt; can represent the lifted version of that element - we just need to have the function we pass to &lt;code&gt;align&lt;/code&gt; fill in this element when it receives a &lt;code&gt;OneL&lt;/code&gt; or a &lt;code&gt;OneR&lt;/code&gt;. If there's multiple distinguished elements (like in a ring) then this doesn't work and we need something like &lt;code&gt;pure :: a -&amp;gt; f a&lt;/code&gt;. But this can't be the infinite constant container of &lt;code&gt;f&lt;/code&gt; like in a typical zippy applicative. We want it to interact properly with &lt;code&gt;align&lt;/code&gt; - &lt;code&gt;pure&lt;/code&gt; should satisfy &lt;code&gt;align f a (pure b) == fmap (\a -&amp;gt; f (Both a b)) a&lt;/code&gt;. Most containers aren't going to have a way of representing &lt;code&gt;pure&lt;/code&gt; which satisfies this property, so it seems we have to rely on a datatype like this:&lt;pre&gt;&lt;code&gt;data PureT f a = Pure a | Finite (f a)&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;This just adds a data constructor for symbolically representing the result of a call to &lt;code&gt;pure&lt;/code&gt;. And we can define our &lt;code&gt;Alignable&lt;/code&gt; instance such that &lt;code&gt;Pure a&lt;/code&gt; has the desired meaning.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-4680500345704361906?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/4680500345704361906/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=4680500345704361906' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/4680500345704361906'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/4680500345704361906'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2010/06/alignable-functors-typeclass-for-zippy.html' title='Alignable functors: a typeclass for &apos;zippy&apos; containers'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-3688679227423124520</id><published>2010-06-23T18:45:00.000-07:00</published><updated>2010-09-22T21:15:55.953-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='health'/><title type='text'>What causes people to gain or lose weight?</title><content type='html'>A while ago I finished Gary Taubes' &lt;a href="http://www.amazon.com/Good-Calories-Bad-Controversial-Science/dp/1400033462/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1272419246&amp;sr=1-1"&gt;Good Calories, Bad Calories: Fat, Carbs, and the Controversial Science of Diet and Health&lt;/a&gt; (GCBC). Surprisingly for a mainstream book, &lt;i&gt;Good Calories, Bad Calories&lt;/i&gt; has, like, actual science. The &lt;i&gt;horror&lt;/i&gt;!&lt;br /&gt;&lt;br /&gt;I'm going to focus on Taubes' discussion and explanation of obesity, since I found that the most interesting, but he also talks about heart disease, cancer, Alzheimer's and dementia, among other things. In addition, he traces the history of the science leading up to the our current understanding of nutrition and health, how certain ideas came to be accepted, etc. This chronicling is actually quite interesting in its own right and is a major focus of the book, but I won't discuss it here.&lt;br /&gt;&lt;h4&gt;Fat and carbohydrate metabolism&lt;/h4&gt;Any fully satisfying explanation for why people gain or lose weight must get down to the level of biochemistry. It isn't enough to say that "eating X makes you fat" - we need an actual mechanism by which the body chooses to store more fat in response to eating X. So what follows is a very brief overview of fat and carbohydrate metabolism, distilled from GCBC and some Wikipedia and web surfing. &lt;i&gt;Disclaimer:&lt;/i&gt; I am not a biochemist. Please provide corrections in the comments! &lt;br /&gt;&lt;br /&gt;During digestion carbohydrates are broken down into &lt;a href="http://en.wikipedia.org/wiki/Glucose"&gt;glucose&lt;/a&gt; and released into the bloodstream. Cells in the pancreas called &lt;a href="http://en.wikipedia.org/wiki/Beta_cells"&gt;beta cells&lt;/a&gt; detect glucose in the blood and secrete &lt;a href="http://en.wikipedia.org/wiki/Insulin"&gt;insulin&lt;/a&gt; in response. In addition to inhibiting fat metabolism, insulin promotes metabolism of glucose which brings blood sugar levels back down to normal.&lt;br /&gt;&lt;br /&gt;Glucose not immediately used for energy is converted to &lt;a href="http://en.wikipedia.org/wiki/Glycogen"&gt;glycogen&lt;/a&gt; and stored in the liver and muscles. If these glycogen stores are already at capacity (somewhere around 1500 calories' worth), the body converts the glucose to fat and stores it. People with type 2 diabetes are unable to synthesize and store this fat rapidly enough to maintain pace and end up excreting excess glucose in the urine.&lt;br /&gt;&lt;br /&gt;As blood glucose levels fall below normal levels, the pancreas secretes &lt;a href="http://en.wikipedia.org/wiki/Glucagon"&gt;glucagon&lt;/a&gt;, a hormone which signals the liver to convert glycogen back to glucose, again bringing blood sugar back to normal levels - this process is known as &lt;a href="http://en.wikipedia.org/wiki/Glycogenolysis"&gt;glycogenolysis&lt;/a&gt;. It seems the body prefers to use stored glycogen as energy before turning to fat metabolism, although the two sources are probably not mutually exclusive. In addition to glycogen, the liver can also syntesize glucose from other &lt;i&gt;non-carbohydrate&lt;/i&gt; sources (&lt;a href="http://en.wikipedia.org/wiki/Gluconeogenesis"&gt;gluconeogenesis&lt;/a&gt;), like proteins.&lt;br /&gt;&lt;br /&gt;In addition to this energy conversion process happening in the liver, low insulin levels enable fat tissue to convert fat from its stored form (&lt;a href="http://en.wikipedia.org/wiki/Triglyceride"&gt;triglycerides&lt;/a&gt;) to free fatty acids (&lt;a href="http://en.wikipedia.org/wiki/Fatty_acid#Free_fatty_acids"&gt;FFAs&lt;/a&gt;). These FFAs are released into the bloodstream and can be metabolized for energy. This is an active process, happening throughout the day, between meals and during sleep. In general, the body attempts to maintain a constant supply of energy to its cells despite fluctuations in the flow of fats and carbohydrates coming from digestion.&lt;br /&gt;&lt;br /&gt;The above story is an oversimplification in several ways. Glucagon production can also be promoted and inhibited by factors other than blood sugar levels. The mere taste of something sweet can prompt insulin production, well before any sugar would be digested and enter the bloodstream. And, going one level further down, each of the above responses has an underlying biochemical mechanism. For instance, in order to be stored as fat, the FFAs in the blood must be packaged up into triglycerides. The formation of triglycerides requires the presence of a molecule called glycerol phosphate, a byproduct of carbohydrate metabolism. (Glycerol phosphate, incidentally, is produced from fructose, the "fruit" sugar, more readily than from glucose.) It's easy to imagine various scenarios where, depending on the timing of when fats and carbohydrates are released into the bloodstream after a meal and the rate at which glucose and FFAs are absorbed, we get different amounts of fat stored. &lt;br /&gt;&lt;br /&gt;The other issue I've ignored is &lt;a href="http://en.wikipedia.org/wiki/Insulin_sensitivity"&gt;insulin sensitivity&lt;/a&gt;. Obese persons typically have higher than normal resting levels of blood sugar and insulin because their cells do not respond sufficiently to the hormone and thus fail to remove glucose from the blood. This condition is called insulin resistence, or low insulin sensitivity. Insulin resistence can lead to a vicious cycle ending in diabetes - because cells don't uptake the glucose, blood glucose remains high; beta cells in the pancreas respond to this heighted blood sugar by secreting even more insulin to compensate, which tends to increase insulin resistence further, and so on.&lt;br /&gt;&lt;br /&gt;Technically, it's more accurate to talk about the insulin sensitivity of &lt;i&gt;particular tissues&lt;/i&gt; in the body (when people talk about "insulin sensitivity" in general, as I did above, it's usually referring to insulin sensitivity of muscle tissue) and to take into account the &lt;i&gt;relative sensitivities&lt;/i&gt; of these tissues. Taubes discusses this in chapter 22, page 395, where he summarizes the research of James Neel. Neel listed listed three "scenarios" by which the body might be more inclined to store rather than burn fat. I'm going to just quote Taubes' descriptions:&lt;blockquote&gt;The first of these scenarios was what Neel called a "quick insulin trigger." By this Neel meant that the insulin-secreting cells in the pancreas are hypersensitive to the appearance of glucose in the bloodstream. They secrete too much insulin in response to the rise in blood sugar during a meal; that encourages fat deposition and induces a compensatory insulin resistence in the muslces...&lt;br /&gt;&lt;br /&gt;In Neel's second scenario, there is a tendency to become slightly more insulin-resistant than would normally be the case when confronted with a given amount of insulin in the circulation. So even an appropriate insulin response to the waves of blood sugar that appear during meals will result in insulin resistance, and that in turn requires a ratcheting up of the insulin response...&lt;br /&gt;&lt;br /&gt;Neel's third scenario is slightly more complicated, but there's evidence to suggest that this one comes closest to reality. Here an appropriate amount of insulin is secreted in response to the "excessive glucose pulses" of a modern meal, and the response of the muscle cells to the insulin is also appropriate. The defect is in the relative sensitivity of muscle and fat cells to the insulin. The muscle cells become insulin-resistent... but the fat cells fail to compensate. They remain stubbornly sensitive to insulin. So, as Neel explained, the fat tissue accumulates more and more fat, but "mobilization of stored fat would be inhibited."&lt;/blockquote&gt;Taubes doesn't offer much explanation or give any mechanism by which the insulin sensitivities of these tissues might be influenced, and he also doesn't talk about what factors other than insulin levels might affect the propensity of the body to burn rather than store fat. &lt;br /&gt;&lt;br /&gt;There is however an interesting discussion in the book of the role of the enzyme &lt;a href="http://en.wikipedia.org/wiki/Lipoprotein_lipase"&gt;lipoprotein lipase (LPL)&lt;/a&gt;, which acts as a "gatekeeper" for fat storage and gets us closer to understanding what's happening. LPL acts differently in different tissue types: in fat tissue, LPL acts to break lipids housed in &lt;a href="http://en.wikipedia.org/wiki/Lipoprotein"&gt;lipoproteins&lt;/a&gt; down into their component fatty acids and promotes flow of these FFAs into the fat cells where they are then repackaged as triglycerides for storage (lipoproteins serve as water-soluble transport for fats and cholesterol in the blood and are often characterized by their density - high-density lipoprotein (HDL), low density lipoprotein (LDL), etc). In muscle tissue, LPL promotes uptake of FFAs into the muscle where it can be used for energy. Differing levels of LPL activity in different parts of the body provide a good account for why men and women and even individuals tend to accumulate fat in different areas. &lt;br /&gt;&lt;br /&gt;LPL activity is upregulated in fat tissue by insulin - this provides a possible explanation for exactly how insulin acts to promote storage of fat in the fat tissue. But it seems we don't have the full story. Quoting Taubes, pg 399:&lt;blockquote&gt;LPL activity in fat tissue increases with weight loss on a calorie-restricted diet and it decreases in muscle tissue; both reactions will work to maintain fat in the fat tissue, regardless of any negative energy balance that may be induced by the semi-starvation diet. During exercise, LPL activity increases in muscle tissue, enhancing the absorption of fatty acids into the muscles to be burned as fuel. But when the workout is over, LPL activity in the fat tissue increases. The sensitivity of fat cells to insulin will also be "sufficiently altered," as the University of Colorado physiologist Robert Eckel has described it, so as to restock the fat tissue with whatever fat it might have surrendered.&lt;/blockquote&gt;This suggests there's more going on here - what is the mechanism by which the body affects this LPL activity and insulin sensitivity? Taubes doesn't quite close the biochemical loop here - I'm not sure if this is because the information just isn't known or that Taubes himself didn't know or chose to omit it.&lt;br /&gt;&lt;br /&gt;Since finishing the book I've been searching for additional info; what I've read so far is vague and handwavy, but seems to indicate that Omega-3 fatty acids, low &lt;a href="http://en.wikipedia.org/wiki/Cortisol"&gt;cortisol&lt;/a&gt; levels (cortisol is the "stress" hormone), sufficient sleep, vitamin D, and exercise all might promote (or at least prevent decline in) a tendency to mobilize fat out of the fat tissue for use as energy. But I'm still looking for more convincing accounts that get down to the biochemical nitty-gritty. Maybe I'll write another post after doing further research.&lt;br /&gt;&lt;br /&gt;As an aside, Taubes does offer an explanation for why some people might have pancreatic cells that are more sensitive to glucose:&lt;blockquote&gt;[A]s women of childbearing age get heavier and more of them become diabetic, they pass the metabolic consequences on to their children through what is technically known as the intrauterine environment... If the mother has high blood sugar, then the developing pancreas in the fetus will respond to this stimulus by over-producing insulin-secreting cells. "The baby is not diabetic," explains Boyd Betzger, who studies diabetes and pregnancy at Northwestern Univeristy, "but the insulin-producing cells in the pancreas are stimulated to function and grow in size and number by the environment they're in. So they start overfunctioning. That in turn leads to a baby laying down more fat, which is why the baby of a diabetic mother is typified by being a fat baby." ... This is also the most likely explanation for why children born to women who gain excessive weight during pregnancy also tend to be fatter.&lt;/blockquote&gt;I found this alarming.&lt;br /&gt;&lt;h4&gt;Diet implications&lt;/h4&gt;My impression is that none of the above is particularly controversial. But what are the dietary implications of all this, particularly for someone trying to lose weight? Here there's disagreement.&lt;br /&gt;&lt;br /&gt;First, Taubes' argument: merely restricting calories is not the most effective way to lose weight. As mentioned above, most people who are overweight have some level of insulin resistance and higher-than-normal levels of circulating insulin. We might say that the base insulin level in the blood and insulin sensitivities of the fat cells and muscles together determine an equilibrium for amount of fat stored. Restricting calories without a commensurate decrease in base insulin levels (or increase in muscle insulin sensitivity, or decreased fat tissue insulin sensitivity) won't change this equilibrium. With no more fat available for metabolism than before but fewer calories available in the diet, the body has no way of maintaining its previous energy expenditure. The result is low energy and persistent hunger. According to this view, hunger is not the cause of obesity but merely a symptom of an underlying metabolic problem (see &lt;a href="http://en.wikipedia.org/wiki/Metabolic_syndrome"&gt;metabolic syndrome&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;It seems obvious, then, that the way to encourage fat loss is to eat less foods that promote an insulin response - that is, refined and easily digestible carbohydrates like sugar, potatoes, white bread, and other starches. This has the effect of decreasing insulin levels, allowing more stored fat to be used for energy. This fat is, to the body, indistinguishable from fat coming from the diet, and has the same effect on appetite as if it were coming from the diet. Since the body is extracting adequate energy from its fat stores, there's no increased feelings of hunger despite the restriction in dietary calories. A corollary of all this is that dietary fat does not itself make a person fat and eating a low fat diet (which implies increasing carbs assuming we keep calories constant) is probably a very bad way to lose weight, since insulin levels may actually increase under such a regime.&lt;br /&gt;&lt;br /&gt;There's more to Taubes' argument but this is the basic idea. I highly recommend reading chapters 22 and 24 of GCBC for more detail. They can be read independently from the rest of the book.&lt;br /&gt;&lt;br /&gt;I've been searching for good rebuttals to Taubes, but so far haven't found any. Many people seem annoyed that Taubes would even suggest anything contradicting the conventional wisdom of diet and health and don't engage honestly with his arguments. For instance, consider &lt;a href="http://reason.com/archives/2003/03/01/big-fat-fake"&gt;Michael Fumento's article in Reason&lt;/a&gt;, discussing the earlier (2002) article by Taubes that led to the book deal (&lt;a href="http://reason.com/archives/2003/03/01/an-exercise-in-vitriol-rather"&gt;Taubes' response&lt;/a&gt; and &lt;a href="http://reason.com/archives/2003/03/01/gary-taubes-tries-to-overwhelm"&gt;Fumento's follow up&lt;/a&gt;; there's also this &lt;a href="http://web.mit.edu/knight-science/fellows/interviews/taubes.html"&gt;excellent interview with Taubes&lt;/a&gt; discussing how the article evolved and some of his more controversial decisions). Fumento is quite angry and in my opinion misrepresents Taubes, whose goal was to get people to take seriously an alternative hypothesis of obesity. Here's &lt;a href="http://www.proteinpower.com/drmike/wp-content/uploads/2008/07/taubes-response-to-bray-ob-reviews.pdf"&gt;Taubes in his response to George Bray&lt;/a&gt; (&lt;a href="http://www.proteinpower.com/drmike/wp-content/uploads/2008/07/bray-review-of-gcbc.pdf"&gt;Bray's review&lt;/a&gt; and &lt;a href="http://www.proteinpower.com/drmike/statins/gary-taubes-responds-to-george-bray/"&gt;discussion by Michael Eades&lt;/a&gt;):&lt;blockquote&gt;One goal of GCBC is to motivate investigators in this field to take a more rigorous, strictly scientific approach to their research, rather than taking critical issues on faith because they agree with their preconceptions. The book attempts to establish that compelling evidence indeed exists for an alternative hypothesis of obesity, and that the disorder is fundamentally caused by the influence of carbohydrates on insulin and insulin on fat accumulation, not by eating too much or sedentary behaviour as has been dogma for decades.&lt;/blockquote&gt;In the afterword of GCBC, Taubes is also quite explicit about the fact that he simply wants people to take the alternative hypothesis seriously and even suggests an experimental design that would test the theory.&lt;br /&gt;&lt;br /&gt;Many responses, such as Bray's, bring up the first law of thermodynamics which states in a closed system energy can be neither created nor destroyed. This is true but irrelevant - the body is not a closed system. In response to increased consumption of calories, the body can: 1) excrete excess calories as waste 2) burn more calories by increasing activity, generating additional heat, or even simply allowing more heat to escape the skin, or 3) store more fat. Likewise, in response to decreased consumption of calories, the body can 1) lower energy expenditure in numerous ways or 2) convert fat or protein to energy. Given how much discretion the body has over these pathways, much of it involuntary or subconscious, it is implausible to assume we can hold all other variables constant while decreasing or increasing calorie intake. Robert McLeod also elaborates on this &lt;a href="http://entropyproduction.blogspot.com/2009/02/all-medical-science-is-wrong-within-95.html"&gt;in his review&lt;/a&gt;. Also see &lt;a href="http://www.proteinpower.com/drmike/weight-loss/more-braying-from-bray/"&gt;Michael Eades' discussion&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;A more substantive criticism is Taubes doesn't devote adequate attention to determining what factors promote or inhibit insulin sensitivity and fat metabolism in general, or at least emphasizing that these are critical questions. There obviously is more to the story here - in addition to the questions raised above regarding LPL activity, we know that some individuals can consume large amounts of sugar and carbs without becoming fat. What is the reason for this? Simply saying it's "genetic" or there exists a &lt;a href="http://en.wikipedia.org/wiki/Set_point"&gt;"set-point"&lt;/a&gt; is not an explanation (and Taubes trashes the whole set-point concept so I doubt he'd accept such an "explanation") - ultimately, there must be some biochemical mechanism, and if there is such a mechanism it could potentially be influenced in some way by diet or exercise.&lt;br /&gt;&lt;br /&gt;GCBC also fails to fully close the loop in the explanation of weight gain, energy levels, and hunger. We need to know the mechanisms by which the body increases or decreases energy expenditure and hunger. It seems that the hormone &lt;a href="http://en.wikipedia.org/wiki/Leptin"&gt;leptin&lt;/a&gt; plays a major role here (&lt;a href="http://www.marksdailyapple.com/leptin/"&gt;article on Mark's Daily Apple&lt;/a&gt;, &lt;a href="http://www.marksdailyapple.com/further-adventures-with-leptin/"&gt;follow up&lt;/a&gt;, &lt;a href="http://wholehealthsource.blogspot.com/2008/12/leptin-resistance-and-sugar.html"&gt;Stephen Guyenet's discussion of leptin resistance&lt;/a&gt;), but Taubes doesn't discuss it much at all.&lt;br /&gt;&lt;br /&gt;Overall, though, I highly recommend the book.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-3688679227423124520?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/3688679227423124520/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=3688679227423124520' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/3688679227423124520'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/3688679227423124520'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2010/04/what-causes-people-to-gain-or-lose.html' title='What causes people to gain or lose weight?'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-3336867801204372107</id><published>2010-06-09T18:42:00.000-07:00</published><updated>2010-06-09T19:08:15.184-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>GADTs in Scala</title><content type='html'>It looks like ordinary Scala case classes can express &lt;a href="http://en.wikibooks.org/wiki/Haskell/GADT"&gt;GADTs&lt;/a&gt;. Here's an example: &lt;pre&gt;trait Expr[A] &lt;br /&gt;case class I(i: Int) extends Expr[Int]&lt;br /&gt;case class B(b: Boolean) extends Expr[Boolean]&lt;br /&gt;case class Add(a: Expr[Int], b: Expr[Int]) extends Expr[Int]&lt;/pre&gt;We can define an evaluation function with the type &lt;code&gt;Expr[A] =&gt; A&lt;/code&gt;:&lt;pre&gt;def eval[A](e: Expr[A]): A = e match {&lt;br /&gt;  case I(i) =&gt; i&lt;br /&gt;  case B(b) =&gt; b&lt;br /&gt;  case Add(a,b) =&gt; eval(a) + eval(b)&lt;br /&gt;}&lt;/pre&gt;This all compiles and typechecks, and has the expected behavior: &lt;code&gt;Add(I(1), B(false))&lt;/code&gt; is a type error, &lt;code&gt;eval(I(22))&lt;/code&gt; results in 22, etc.&lt;br /&gt;&lt;br /&gt;I haven't played with this much, so I wonder if there's any situations where the Scala GADTs behave differently than Haskell's.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-3336867801204372107?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/3336867801204372107/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=3336867801204372107' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/3336867801204372107'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/3336867801204372107'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2010/06/gadts-in-scala.html' title='GADTs in Scala'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-4045790842172880997</id><published>2010-06-06T20:04:00.000-07:00</published><updated>2010-06-06T21:38:58.281-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>The SPJ trick for typesafe ADT annotations</title><content type='html'>I came across this technique in SPJ's book, &lt;a href="http://research.microsoft.com/en-us/um/people/simonpj/papers/pj-lester-book/"&gt;Implementing functional languages: a tutorial&lt;/a&gt; (see page 217). In retrospect I guess it's pretty obvious, but I figured I'd write it up since it's kind of nifty and I recently had occasion to use it at work.&lt;br /&gt;&lt;br /&gt;Here's a simple ADT: &lt;pre&gt;data Expr = &lt;br /&gt;  Primitive Val |&lt;br /&gt;  Variable Name |&lt;br /&gt;  App Expr Expr |&lt;br /&gt;  Lambda Name Expr&lt;br /&gt;&lt;/pre&gt;Say we're writing a compiler and we want to annotate each node with its type, or information about free variables, etc. We could add an extra constructor to hold these sorts of annotations:&lt;pre&gt;data Expr = &lt;br /&gt;  Primitive Val |&lt;br /&gt;  Variable Name |&lt;br /&gt;  App Expr Expr |&lt;br /&gt;  Lambda Name Expr |&lt;br /&gt;  Annotated Annotation Expr&lt;br /&gt;&lt;br /&gt;data Annotation = TypeAnnotation Type | FreeVariables [Name] | ...&lt;br /&gt;&lt;/pre&gt;This is okay, except that functions expecting certain annotation info have no guarantee it is actually present, or might receive annotations of the wrong type. (Also, how do we know what order the annotations are nested in, if multiple are expected to be present? The types don't tell us anything.) &lt;br /&gt;&lt;br /&gt;Okay, let's try again:&lt;pre&gt;data Expr a = &lt;br /&gt;  Primitive Val |&lt;br /&gt;  Variable Name |&lt;br /&gt;  App Expr Expr |&lt;br /&gt;  Lambda Name Expr |&lt;br /&gt;  Annotation a Expr&lt;br /&gt;&lt;/pre&gt;Slightly better... now we can at least guarantee that the annotation is of a certain type. But we still don't have any assurance that the annotation is attached to the expected nodes.&lt;br /&gt;&lt;br /&gt;Going one step further, we have:&lt;pre&gt;data Expr a = &lt;br /&gt;  Primitive a Val |&lt;br /&gt;  Variable a Name |&lt;br /&gt;  App a Expr Expr |&lt;br /&gt;  Lambda a Name Expr |&lt;br /&gt;&lt;/pre&gt;This guarantees the annotation is present, and has a definite type, but there's no uniform way to pull out the annotation info from each constructor. We could write a function, &lt;code&gt;annotation&lt;/code&gt; that pattern matches on the &lt;code&gt;Expr&lt;/code&gt; to retrieve the annotation. But, it turns out we can do one better:&lt;pre&gt;data Expr a = Expr a (ExprImpl a)&lt;br /&gt;&lt;br /&gt;data ExprImpl a = &lt;br /&gt;  Primitive Val |&lt;br /&gt;  Variable Name |&lt;br /&gt;  App (Expr a) (Expr a) |&lt;br /&gt;  Lambda Name (Expr a)&lt;br /&gt;&lt;/pre&gt;This is the technique described by SPJ. All nodes in the ADT are guaranteed to have the annotation, and access of the annotation is uniform. &lt;code&gt;Expr&lt;/code&gt; can even just be a type alias for &lt;code&gt;(a, ExprImpl a)&lt;/code&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-4045790842172880997?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/4045790842172880997/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=4045790842172880997' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/4045790842172880997'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/4045790842172880997'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2010/06/spj-trick-for-typesafe-adt-annotations.html' title='The SPJ trick for typesafe ADT annotations'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-6134675743699620724</id><published>2010-05-16T15:30:00.000-07:00</published><updated>2010-05-16T15:52:53.272-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Arguments from Personal Incredulity</title><content type='html'>Here's one of my favorite &lt;a href="http://en.wikiquote.org/wiki/Richard_Dawkins"&gt;Richard Dawkins quotes&lt;/a&gt;: &lt;blockquote&gt;Never say, and never take seriously anyone who says, "I cannot believe that so-and-so could have evolved by gradual selection". I have dubbed this kind of fallacy "the Argument from Personal Incredulity". Time and again, it has proven the prelude to an intellectual banana-skin experience.&lt;/blockquote&gt; Of course, Dawkins is talking about evolution, but I see the same sorts of "Arguments from Personal Incredulity" come up again and again in discussions about functional programming. Except, the "Personal Incredulity" argument is something along the lines of: "I cannot believe that such-and-such can be done without side effects..."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-6134675743699620724?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/6134675743699620724/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=6134675743699620724' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/6134675743699620724'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/6134675743699620724'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2010/05/arguments-from-personal-incredulity.html' title='Arguments from Personal Incredulity'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-237000280084920472</id><published>2010-03-03T14:51:00.001-08:00</published><updated>2010-03-11T22:05:18.027-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Follow up to 'Actors are not a good concurrency model'</title><content type='html'>My &lt;a href="http://pchiusano.blogspot.com/2010/01/actors-are-not-good-concurrency-model.html"&gt;recent post on actors&lt;/a&gt; generated a lot of comments. I'll try to summarize some of them here and offer my responses. &lt;br /&gt;&lt;br /&gt;To start, a number of people took issue with my working definition of actors, claiming that in implementation &lt;i&gt;X&lt;/i&gt;, actors &lt;i&gt;are&lt;/i&gt; composable. We can quibble about my  definition, but I hope the deeper point was clear - side effects and state hurt composability and usage of actors as I defined them is stateful. James Iry has also made some of the same points about the &lt;a href="http://james-iry.blogspot.com/2009/04/erlang-style-actors-are-all-about.html"&gt;statefulness of Erlang style actors&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;But, if you don't like my definition, if you'd like to claim that an "actor" is actually just a function &lt;code&gt;A =&gt; Future[B]&lt;/code&gt; and that therefore "actors" are composable, then I don't really have a problem with that (although, why call this model 'actors'? Why not call it 'functions from &lt;code&gt;A =&gt; Future[B]&lt;/code&gt;' or 'the &lt;a href="http://www.haskell.org/haskellwiki/Arrow_tutorial"&gt;Kleisli arrow&lt;/a&gt; for futures'?). But if the actor model also includes the ability to asynchronously "send a message" to another arbitrary actor, and if the expression representing this message send does not evaluate to a future containing the result sent back by the receiving actor, then my subsequent arguments about the lack of composability still apply. The takeaway from my post, even if you think my definitions are bogus, shouldn't be "Aha!! Actors are okay! There is no issue with using them, even in stateful, non-composable ways".&lt;br /&gt;&lt;br /&gt;This brings up another point - I don't think everyone commenting (both on my post and James' above) is working with the same ideas about what it means for a function or expression to be pure. The "standard" definition is that an expression (such as sending a message to an actor) is side-effect free if it is &lt;a href="http://en.wikipedia.org/wiki/Referential_transparency_(computer_science)"&gt;referentially transparent&lt;/a&gt;. And while there are some nuances to the definition of referential transparency, I think everyone familiar with the concept would agree that functions from &lt;code&gt;A =&gt; Unit&lt;/code&gt; can't possibly be RT unless they are literally the constant unit function.&lt;br /&gt;&lt;br /&gt;Looking through the various responses, I notice that no one really argued with my claim that side-effects hurt composability. I'd be interested if anyone can poke holes in my argument here (and by that I mean finding some problem with my logic, not disputing my definitions).&lt;br /&gt;&lt;br /&gt;A number of people responded to my negative offhand remarks on OOP. Pointing out problems with OOP was not really the point of my post. Obviously I'm no fan of OOP and maybe someday I'll write more about that. The general point I was making is we should be wary of adopting a "better" technology without understanding the underlying problem that technology purports to solve. Doing so inevitably leads to solutions with a lot of incidental complexity that fail to even solve the underlying problems fully. For solutions like this, you'll often see advocates unable to really give a formal argument for why that solution is superior, instead falling back on pointing to particular examples, harping on how convenient certain things are, and gushing about various intangibles like how "beautiful" it is. There's nothing wrong with such advocacy, but if you cannot formalize your argument that one solution is better than another, chances are you do not fully understand the underlying problem and are therefore ignorant of whether some more direct, simpler solution exists. &lt;br /&gt;&lt;br /&gt;Moving on, Chris Quenelle had this interesting comment, claiming that any form of explicit parallelism is unnecessary:&lt;blockquote&gt;&lt;br /&gt;If your program is purely functional, the compiler can assign threads to whichever chunks of calculation it wants to. Hence you don't need actors. Or any other form of explicit parallelism. The purpose of explicit synchronization is to manage the timing of side-effects in the presence of parallelism.&lt;/blockquote&gt;&lt;br /&gt;I'm sympathetic to this point of view but I do think there needs to be something more. I've tinkered with the &lt;code&gt;Future&lt;/code&gt; monad, which is explicit in the sense that you have decide when you are writing code in that monad, but implicit in that it only requires that you indicate dependencies between computations, not specify how those computations are scheduled out to threads, how many threads are used, etc. But I believe this breaks down for distributed computation, where the &lt;i&gt;topology&lt;/i&gt; of the concurrency is a static or semi-static structure that must be under the control of the programmer. I've also found the monadic style doesn't work so great for "pipeline" parallelism. But more on that in a later post.&lt;br /&gt;&lt;br /&gt;Ulf Wiger had a pretty interesting comment in which he argued (I think, Ulf, correct me if I'm wrong) that the compositional style often leads to excessive dependencies between components. This is bad given that "the thing that often kills large projects is dependency management". I would buy the claim that large projects can be killed by poor dependency management, although of course projects can fail due to lots of other reasons, too. But I view this as completely separate from my general arguments about how side-effects kill composability. Even when programming purely functionally, nothing stops you from duplicating programming work to avoid relying on some shared (not yet completed) dependency. For instance, even if two modules could in principle both be implemented using some shared generic code, it might be worth not building this out if the communication overhead and additional dependency would create a bottleneck for the overall team. This is analogous to the situation that often arises in parallel algorithms, where one can improve runtimes by duplicating some work but running things in parallel. In any case, nothing about purely functional programming precludes you from duplicating work like this if it makes sense. But if you do decide you'd like to reuse and compose code, you have the means to do it.&lt;br /&gt;&lt;br /&gt;Lastly, there was one commenter who disagreed with my claim that "intuitiveness" does not justify use of actors as a programming model: "The link to our intuition and hence ability to leverage prior experience is paramount in being able to comprehend complex systems."&lt;br /&gt;&lt;br /&gt;This is sort of a loaded statement. Let me unpack it a bit. First, what is really meant by intuition here? When you learn something, anything, you develop an intuition for it which of course is helpful. But if this is the case, what does it really mean to say that one technology is more intuitive than another? Assuming you understand both technologies and have an intuition for both, is one more "intuitive" than the other? With respect to what? Perhaps that simply means you are more familiar with one than the other? &lt;br /&gt;&lt;br /&gt;No, I suspect what is actually meant by 'intuitive' here is "analogous to the physical world". But is there really anything particularly magical about the intuition each of us has from our understanding of the physical world? The more you try to pin down the supposed benefits of this intuition, the more they seem to vanish. In programming, the inferences one can make by analogy to the physical world are either too vague and too unreliable to be useful, or so simple that nothing is gained by tying them back to a physical system. Intuitions can inspire you as you explore a design space, but actual programming requires much more precise thinking than the vague intuitions we all have; fuzzy models here are for the most part "worse than useless". We're better off building, understanding, and developing an intuition for some new, simpler, more precise abstraction.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-237000280084920472?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/237000280084920472/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=237000280084920472' title='11 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/237000280084920472'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/237000280084920472'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2010/03/follow-up-to-actors-are-not-good.html' title='Follow up to &apos;Actors are not a good concurrency model&apos;'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-6571196421976972675</id><published>2010-01-15T12:02:00.000-08:00</published><updated>2010-03-11T22:11:04.874-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Actors are not a good concurrency model</title><content type='html'>&lt;b&gt;Update:&lt;/b&gt; Also see &lt;a href="http://pchiusano.blogspot.com/2010/03/follow-up-to-actors-are-not-good.html"&gt;follow up post&lt;/a&gt; and &lt;a href="http://www.reddit.com/r/programming/comments/b7amz/actors_are_not_a_good_concurrency_model/"&gt;discussion on reddit&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Actors are not a good concurrency model, and neither are Erlang processes.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Wait, &lt;i&gt;what?&lt;/i&gt; Isn't Erlang the king of high-uptime distributed systems? If Erlang can do all this using the actor model (Erlang processes are identical to actors in the essential ways, discussed below), isn't there something good about it? Isn't there??&lt;br /&gt;&lt;br /&gt;Well, yes. What's good about it is it's better than the dinosaur era alternative of shared-state preemptive multithreading. But so what? Just because the actor model is better than dinosaur era technology doesn't mean we should keep using it. Remember the rise of OO? Before OO, the alternative was programming in languages like C, with pretty much zero support for polymorphism. Any approach to polymorphism was better than nothing at all, so OO's concepts of classes and subtyping was a huge improvement (my opinion is that polymorphism was the real unfilled niche that OO filled). Now that we've learned a bit more, OO has started to seem less appealing to some - parametric polymorphism exists independent of OOP, and bounded polymorphism can be implemented with either typeclasses or a combination of first-class and higher-order modules. It's possible there are some other ideas from OO worth salvaging, but more likely I think OO is just an evolutionary dead-end. &lt;br /&gt;&lt;br /&gt;And so it might be with actors. Though I don't know exactly what the replacement looks like yet (I'd like to look at some alternatives and explore ideas in a future post), I do know what's wrong with actors, and that's what I'd like to explore here. But going further, I'd like to use actors as an example to show what's also problematic with side effects and impure functions &lt;i&gt;in general&lt;/i&gt;. In doing so I'll try to avoid the usual FP evangelizing and make this a more precise, technical argument. &lt;br /&gt;&lt;br /&gt;So what's wrong with actors? The problem which dooms the actor model is that actors are fundamentally &lt;i&gt;not composable&lt;/i&gt;. I'm going to leave that term undefined for now, except to say vaguely that entities are composable if we can easily and generally combine their behaviors in some way &lt;i&gt;without having to modify the entities being combined&lt;/i&gt;. I think of composability as being the key ingredient necessary for acheiving reuse, and for achieving a combinatorial expansion of what is succinctly expressible in a programming model. (Also see this &lt;a href="http://blog.tmorris.net/why-functional-programming-matters-in-short-prose/"&gt;explanation by Tony Morris&lt;/a&gt;.) It goes without saying that code reusability is a very desireable property for programs to have, and many of the other virtues of good software engineering end up tying back to reusability - for instance, testability is nothing more than the ability to reuse a component in the context of a test. &lt;br /&gt;&lt;br /&gt;So what makes actors not composable? Well, an actor (and an Erlang process) is constructed from a function &lt;code&gt;A =&gt; Unit&lt;/code&gt;. That this function returns &lt;code&gt;Unit&lt;/code&gt; rather than a meaningful type means that it must &lt;i&gt;hardcode some action to take once the result (&lt;b&gt;whose type is not even shown&lt;/b&gt;) is available&lt;/i&gt;. It is this unfortunate property that destroys any chance we have at combining actors in any sort of general way like we would in a typical combinator library. &lt;br /&gt;&lt;br /&gt;As a simple example, consider the following two actors: 1) an actor that accepts lists of integers and turns the list into a min-heap. 2) An actor that accepts (heap,k) pairs and extracts the top k elements in order from the given heap. Can we write a function that accepts these two actors and returns an actor which accepts (list,k) pairs and pulls out the k minimum elements in sorted order? And generalizing a bit, can we write the more general combining function, the one that doesn't care whether the types are 'heap', 'int' and 'list' or &lt;code&gt;A&lt;/code&gt;, &lt;code&gt;B&lt;/code&gt; or &lt;code&gt;C&lt;/code&gt;?&lt;br /&gt;&lt;br /&gt;You just can't do this with actors since you can't make any assumptions about what the actor will do with the result - it might forward it to some other actor, write it to a file, extract from it the missile launch codes and launch the missile, etc. In fact, if you actually try to achieve the composability I'm talking about using actors, what you end up doing is having all actors return their result as a response to their sender, essentially recreating pure functions within the actor model, badly, losing type information in the process. What this should tell you is that actors, if worth anything, might be useful more as a tool for building some other higher-level abstraction. But if that's the case, perhaps we shouldn't even expose actors as a programming model and instead just program directly in the higher-level model.&lt;br /&gt;&lt;br /&gt;In practice, I suspect actors are basically only used at the top level of a program, where the damage to composability is minimized, and pure functions (which are composable and work fine at any level of the program) are used everywhere else to achieve reuse. The problem with this approach is when you find out a few months later that what you thought was going to be the top level of your program actually is going to become a very small embedded component in some larger system, and this top level now contains a large amount of unreusable complex logic that must be gutted to work as an embedded component. &lt;br /&gt;&lt;br /&gt;The problem I am talking about here is not in any way specific to actors. It applies to any functions with side effects. I claim the only way to achieve maximum composability is to program with pure functions, and this applies to concurrent programs or to any other programs you'd like to write. (Sometimes, you decide it's worth taking the composability hit and you write functions with side effects - of course I don't object to this in absolutely all cases - but I don't think the fundamental model underlying concurrent and distributed programming should have to take this hit.) &lt;br /&gt;&lt;br /&gt;What makes pure functions composable is they are &lt;i&gt;only&lt;/i&gt; the logic of the computation. A pure function makes no decisions about what actions to take with the result of the computation, and also makes no decisions about what actions to take before the computation is executed. By keeping these concerns separate, we can reuse the function elsewhere in places where we need to do something different with the result, or where we need to do something different before the computation is executed (for instance, in testing, to generate an instance of the input type rather than obtaining the input from some impure function). &lt;br /&gt;&lt;br /&gt;In fact, you can think of any impure function as having three "steps": 1) An "input" side effect, &lt;code&gt;Unit =&gt; B&lt;/code&gt;, a pure function &lt;code&gt;(A,B) =&gt; C&lt;/code&gt;, and an "output" side effect &lt;code&gt;C =&gt; Unit&lt;/code&gt;. It makes perfect software engineering sense to decouple these components - and that is exactly what is done in purely functional programming.&lt;br /&gt;&lt;br /&gt;Let's look at a very simple example. Suppose I write a &lt;code&gt;sort&lt;/code&gt; function, which sorts the input list in place. For this function, we have no input side effects. The pure core of this function is (conceptually) a sorting function that returns a new list. What is the output side effect? It is to &lt;i&gt;rebind the name that the input list is bound to in the caller to the sorted version of the list&lt;/i&gt;. So if the caller were something like &lt;pre&gt;def foo(list: ArrayList[Int]): Foo = { &lt;br /&gt;  ...&lt;br /&gt;  sort(list)&lt;br /&gt;  ...&lt;br /&gt;}&lt;/pre&gt;then &lt;code&gt;sort&lt;/code&gt; will rebind the name &lt;code&gt;list&lt;/code&gt; to the sorted version of that list. Since &lt;code&gt;list&lt;/code&gt; is itself a parameter of &lt;code&gt;foo&lt;/code&gt;, this rebinding will occur in the caller of &lt;code&gt;foo&lt;/code&gt;, and possibly its caller, all the way out to the place where that list was originally declared. The question is, should the sort function really be making the decision about whether to do this? If we let the sort function make this decision, we're allowing it to make an assumption about the caller, namely, that the caller no longer intends to maintain references to the unsorted version of the list (an assumption which, incidentally, is not even tracked by the type system).&lt;br /&gt;&lt;br /&gt;Of course, this example of an in-place sort isn't too terrible. The caller just needs to be aware of this side effect and has to adjust its calling convention accordingly. But the lack of uniformity in chaining and combining logic caused by side effects starts to add up very quickly. In practice, what actually occurs in large systems with impure functions sprinkled about is that many functions end up with so many assumptions about their callers (and these assumptions propagate to their callers, as above) that the call graph becomes very static, with functions often having only a single or small number of callers. This makes the system a lot more rigid, more difficult to test without elaborate mocking and dependency injection frameworks (which brings with it its own set of problems), and results in a lot of (often hidden) duplication of logic. &lt;br /&gt;&lt;br /&gt;Other simple examples show more obvious destruction of composability. If I write a function that reads a bunch of things from a file &lt;i&gt;and then&lt;/i&gt; performs some complex logic, we can't easily reuse that logic elsewhere (unless we want to populate a file first, which may not be what we want). If a function performs some complex logic &lt;i&gt;and then&lt;/i&gt; launches the missile, we can't easily reuse that complex logic elsewhere in places where we don't want to launch the missile. &lt;br /&gt;&lt;br /&gt;Taking a step back from all this bashing of actors and side effects, I do certainly agree that actors and side effects in general have a certain "intuitive" appeal, a rough analog to how the real world works. But that does not justify their usage as a programming model. The technical challenges of engineering large pieces of software mean that a good programming model may have to tradeoff intuitiveness for attributes like composability. But I suspect even this tradeoff is overplayed - intuitiveness is also often code language for "what system I am familiar with". Once you learn and internalize a different model your ability to reason within that model (and thus, I believe, its intuitiveness) is more a function of the features and formal structure of that model than of some inherent "intuitiveness" metric.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-6571196421976972675?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/6571196421976972675/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=6571196421976972675' title='28 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/6571196421976972675'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/6571196421976972675'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2010/01/actors-are-not-good-concurrency-model.html' title='Actors are not a good concurrency model'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>28</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-1689709648634934310</id><published>2009-12-03T10:25:00.000-08:00</published><updated>2009-12-04T07:07:57.126-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>A silly trick for implementing parametrized traits</title><content type='html'>Here's something I noticed recently. In Scala, when you implement a generic trait, specialized to some particular type arguments, you end up having to rewrite the full method signatures, substituting in the type arguments. So, for instance, if you have&lt;br /&gt;&lt;pre&gt;&lt;code&gt;trait Ring[T] {&lt;br /&gt;  def one: T&lt;br /&gt;  def zero: T&lt;br /&gt;  def plus(a: T, b: T): T&lt;br /&gt;  def negate(a: T): T&lt;br /&gt;  def times(a: T, b: T): T&lt;br /&gt;}&lt;/code&gt;&lt;/pre&gt; and go to provide a &lt;code&gt;Ring[Int]&lt;/code&gt; implementation, you might write something like:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;object IntRing extends Ring[Int] {&lt;br /&gt;  def one: Int = 0&lt;br /&gt;  def zero: Int = 1&lt;br /&gt;  def plus(a: Int, b: Int) = a + b&lt;br /&gt;  ... &lt;br /&gt;}&lt;/code&gt;&lt;/pre&gt; A little trick is to add a local type member within your implementation of the trait, aliasing &lt;code&gt;T&lt;/code&gt; to &lt;code&gt;Int&lt;/code&gt;:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;object IntRing extends Ring[Int] {&lt;br /&gt;  type T = Int // GENIUS!!!&lt;br /&gt;  def one: T = 1&lt;br /&gt;  def zero: T = 0&lt;br /&gt;  def plus(a: T, b: T) = a + b&lt;br /&gt;  ... &lt;br /&gt;}&lt;/code&gt;&lt;/pre&gt; Kind of silly, but now the implementation is literally a copy/paste of the original trait, and you only have to provide implementations for the methods. In some ways, I think it's a little clearer than writing out the specialization. &lt;br /&gt;&lt;br /&gt;It's too bad Scala forces you to repeat yourself though. The method signatures have already been supplied in the trait; unless you happen to be overloading method names, rewriting the method signatures provides no additional information. In effect, the problem of inferring these method signatures is already done. Other than not being valid syntax, it seems like I ought to be able to write something like:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;object IntRing extends Ring[Int] {&lt;br /&gt;  def one = 1&lt;br /&gt;  def zero = 0&lt;br /&gt;  def plus(a, b) = a + b&lt;br /&gt;  def times(a, b) = a*b&lt;br /&gt;  ...&lt;br /&gt;}&lt;/code&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-1689709648634934310?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/1689709648634934310/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=1689709648634934310' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/1689709648634934310'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/1689709648634934310'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2009/12/silly-trick-for-implementing.html' title='A silly trick for implementing parametrized traits'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-3752225649726951038</id><published>2009-11-12T10:52:00.000-08:00</published><updated>2009-12-10T17:35:10.235-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Another encoding of typeclasses in Scala</title><content type='html'>I've been thinking about different ways of doing typeclass-like things in Scala, using Scala's first-class modules (aka objects), and functions over these modules (aka classes). More generally, I'm interested in exploring whether there's anything as convenient as typeclasses that could be implemented in a dynamic language. &lt;br /&gt;&lt;br /&gt;Haskell's typeclasses can be encoded pretty directly in Scala, using an implicit module derived from some polymorphic type parameter: &lt;pre&gt;  def dot[A](a: List[A], b: List[A])(&lt;br /&gt;    implicit r: Ring[A]): A = ...&lt;br /&gt;&lt;br /&gt;  def sequence[M[_],A](l: List[M[A]])(&lt;br /&gt;    implicit a: Applicative[M]): M[List[A]] = ...&lt;br /&gt;&lt;/pre&gt;From what I understand, Haskell's &lt;i&gt;implementation&lt;/i&gt; of typeclasses works similarly: bounded polymorphic functions receive one or more hidden dictionary (module) parameters. Except in Haskell modules are not first class objects that the programmer can access or manipulate. (Although I'm unsure how much of a restriction this is.)&lt;br /&gt;&lt;br /&gt;Given that modules are first class in Scala, though, why don't we just pass the module in at some outer scope? &lt;pre&gt;  class ApplicativeM[M[_]](a: Applicative[M]) {&lt;br /&gt;    def sequence[A](l: List[M[A]]): M[List[A]] = ...&lt;br /&gt;    // other Applicative derived functions here&lt;br /&gt;  }&lt;/pre&gt;We can then put all the &lt;code&gt;Applicative&lt;/code&gt;-derived functions nested within the body of this class. If we then give a name to our &lt;code&gt;ApplicativeM[List]&lt;/code&gt; instance, like &lt;code&gt;ListA&lt;/code&gt;, we can have syntax like &lt;code&gt;ListA.sequence(blah)&lt;/code&gt;, which is almost the best you can do. One thing I like about this is we avoid having to restate the bound for each and every Applicative-derived function - all functions simply reference &lt;code&gt;M&lt;/code&gt;, where the bound on &lt;code&gt;M&lt;/code&gt; is stated in the outer scope. &lt;br /&gt;&lt;br /&gt;Incidentally, this sounds very similar to the encoding discussed &lt;a href="http://www.cse.unsw.edu/~chak/papers/modules-classes.pdf"&gt;here&lt;/a&gt;:&lt;br /&gt;&lt;blockquote&gt;The translation from Haskell type classes to ML modules encodes type classes as signatures and instances of type classes as functors that yield structures of these signatures.&lt;/blockquote&gt;Sidebar question for ML people: In ML, can you have higher-kinded modules like &lt;code&gt;Applicative&lt;/code&gt;, and can ML functors receive such modules as arguments? Or are you limited to accepting first-order modules like &lt;code&gt;Ring&lt;/code&gt;?&lt;br /&gt;&lt;br /&gt;Per-function implicit parameters is still more flexible, though, since it allows the functions derived from a typeclass to be spread out over multiple modules. The most logical place to put a function isn't necessarily in the module with the typeclass definition since the bound is often somewhat incidental. Per-function implicits scale better, too - you don't need to have a separate module for every combination of type bounds (although I doubt very many combinations occur in real programs). &lt;br /&gt;&lt;br /&gt;I wonder if it would be possible to get around this first restriction somehow by adding a few operators to modules - perhaps union, subtraction, and renaming?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-3752225649726951038?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/3752225649726951038/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=3752225649726951038' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/3752225649726951038'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/3752225649726951038'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2009/11/another-encoding-of-typeclasses-in.html' title='Another encoding of typeclasses in Scala'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-3553729147123761044</id><published>2009-09-13T15:55:00.000-07:00</published><updated>2009-09-22T11:09:32.485-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>A Scala success story: commercial usage of Scala at Capital IQ ClariFI</title><content type='html'>&lt;span style="font-weight:bold;"&gt;Update:&lt;/span&gt; There's some &lt;a href="http://www.reddit.com/r/programming/comments/9lmh3/a_scala_success_story_commercial_usage_of_scala/"&gt;additional comments and discussion on reddit&lt;/a&gt;. Also, made minor edits on our company background to reflect our recent rebranding. &lt;br /&gt;&lt;br /&gt;&lt;hr/&gt;&lt;br /&gt;For a while I've been meaning to put together a post discussing my experience using Scala successfully in a large commercial product. &lt;br /&gt;&lt;br /&gt;First, a bit of background: the company I work for, &lt;a href="http://capitaliq.com"&gt;Capital IQ&lt;/a&gt;, delivers fundamental and quantitative research and analytics software to the investment management, banking, corporate, and academic communities. Our flagship quant product, &lt;a href="http://clarifi.com"&gt;ClariFI&lt;/a&gt;, supports asset managers at every stage of the quantitative investment process: building and backtesting of factors, portfolio optimization, simulation of trading strategies, performance and risk attribution, overall data management (including organizing huge amounts of time series data pulled from a diverse set of raw sources), and lots more. In addition to ClariFI, though, we have started serving up some of our analytics to the Capital IQ web application.&lt;br /&gt;&lt;br /&gt;How did we end up using Scala? About a year and a half ago we had a legitimate business case for rewriting&amp;mdash;or at least substantially redesigning and refactoring&amp;mdash;the core analytics backing our portfolio attribution (PA) workflow. I was slated to be the person doing this rewriting/redesign, and as my boss Scott was describing to me how the existing ModelStation PA worked, it became clear to me that a functional language was a natural fit for the domain. I'd been pushing for a while for us to try using languages other than Java&amp;mdash;at least for some projects that were more isolated from the rest of our codebase&amp;mdash;and this project seemed like the perfect opportunity. After doing some additional research we settled on Scala.&lt;br /&gt;&lt;br /&gt;The decision to write in Scala implied this would be a true rewrite and I would be totally unencumbered by the previous architecture. Rewrites aren't always a bad idea, though. Our new Scala-backed PA exceeded the functionality of the old codebase, using about a third of the code of the old Java implementation, with comparable memory and speed footprints. The new codebase was more generic (components developed to support PA are now being reused elsewhere), more modular (the core PA engine itself is now being accessed by two completely different front ends), and much more testable. I made extensive use of &lt;a href="http://code.google.com/p/scalacheck/"&gt;ScalaCheck&lt;/a&gt;&amp;mdash;an absolutely huge boon that dramatically cut down on the time and effort needed to find and fix bugs. In developing the new PA, Scott and I had conversations in which I would ask him for a property to specify a piece of code I needed to write, and then I would transcribe his description, fairly directly, into a ScalaCheck property. Originally, we were thinking of porting over Scott's old hand-made test cases from the previous PA, but we didn't end up doing this because our properties gave such good coverage. Having such an extensive set of tests also gave me a lot more confidence to make later changes to tweak the design, knowing I was not breaking anything in the process.&lt;br /&gt;&lt;br /&gt;Part of the reason the code was so testable was that close to 100% of the Scala I wrote was purely functional, exceptions being the occasional use of local state whose effects did not escape function boundaries and "benign" uses of state that preserved referential transparency, like memoizing a function. Writing stateless code, especially within the analytics layer of our product, is extremely natural and doesn't require any contortions. And stateless code is very easy to test.&lt;br /&gt;&lt;br /&gt;Scala is an interesting language. It doesn't really impose any particular view on you of how to write software. Everything from higher-order functional code to low-level imperative programming is well-supported and moving between these paradigms is seamless. On the other hand, there's so much to the language, and the features aren't all totally orthogonal, that it sometimes feels a bit clunky to me. But it is without a doubt an exceedingly practical, powerful language. Let me highlight a couple aspects of Scala I think have been important for us:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-style:italic;"&gt;Integration between Scala and Java is trivial:&lt;/span&gt; If you were so inclined you could write "Java in Scala" and have it compile to essentially the same classfiles that Java would compile to. You can set breakpoints in Scala, step into Scala code from Java, call any Scala method as if it were a Java method and vice versa, profile Scala code using the same tools you'd use for profiling Java code, etc. This level of integration was important for us since we were interfacing our Scala code with a large Java application.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-style:italic;"&gt;Basic functional idioms:&lt;/span&gt; Without a doubt, there's simply a huge productivity boost that comes with having access to first-class functions, convenient syntax for annoymous functions, and the various list processing functions like map, filter, fold, zip, etc. Scala's Option type is extremely useful as well, as are for-comprehensions.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-style:italic;"&gt;The type system:&lt;/span&gt; It's quite good. I'm still exploring its limits. In particular, though, Scala's generics plus implicit parameters enable powerful libraries like ScalaCheck and SBinary, both of which we have used extensively.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-style:italic;"&gt;Optional non-strict evaluation:&lt;/span&gt; One of our projects involved running a whole series of memory-hungry calculations whose results were then encoded as XML and streamed across the network. We couldn't construct all the results up front as that would consume too much memory, and we didn't want to rewrite the producer and/or consumer to have to tightly coordinate with the other. By constructing the results lazily via judicious use of Scala's by-name function parameters, the producer (the calculation) and consumer (the xml writer) remained ignorant of one another, and memory usage was nearly the same as if the producer code and consumer code were explicitly interleaved. Not that this was news, but laziness really can improve modularity.&lt;/li&gt;&lt;/ul&gt;Some other things weren't as nice:&lt;ul&gt;&lt;li&gt;At the time of this first project, the only developed IDE support was the Eclipse plugin which was extremely buggy and limited in functionality. Workable, but not good. Since then the situation has improved considerably.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;In Scala 2.7.1 (the version we used up until about 6 months ago), Scala's parameterized types did not show up as Java parameterized types. This was actually rather annoying to deal with, especially given that I made frequent use of parameterized classes and methods. It wasn't until Scala 2.7.4 that generics support was actually bug-free enough for us to use. Even still, though, Java's lack of both type aliases and type inference makes dealing with generic types extremely painful.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Scala's type inference, while better than nothing, is quite limited. Method parameters always require type annotations. Higher-kinded type parameters are not inferred (although I've heard that there might be some form of inference of these parameters in 2.8). The inference algorithm seems unpredictable to me... my usual strategy is to start without type annotations, try compiling, fix errors by adding annotations, repeat (of course, I sometimes annotate types for clarity anyway). To be fair, the situation here is still much better than Java, but not nearly as nice as (say) Haskell.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Functions aren't curried by default, and I find working with curried functions kind of cumbersome in Scala. There are also some language warts that make it pretty much impossible to program in a point-free style (namely, the somewhat arbitrary distinction between methods, defined using &lt;code&gt;def&lt;/code&gt;, and function values which implement one of the &lt;code&gt;FunctionN&lt;/code&gt; traits). This might not be so terrible if you didn't have to fully annotate parameter types in method definitions, but you do, and it can get kind of annoying. Again, we are still much better off than we are in Java, but it's possible to do better (for instance, Haskell).&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The standard library is kind of uninspiring and is missing some basic stuff. It also has some warts&amp;mdash;for example, the function &lt;code&gt;zip&lt;/code&gt; is defined for List but not for all sequences, there's no &lt;code&gt;zipWith&lt;/code&gt; function, no &lt;code&gt;scanl&lt;/code&gt; function, etc. Of course, you can write these functions yourself, and we did. The collections library is also getting revamped for 2.8 and a lot of these things are getting cleaned up. There's also an interesting project, &lt;a href="http://code.google.com/p/scalaz/"&gt;Scalaz&lt;/a&gt;, that we haven't gotten around to using in any production code&amp;mdash;but it's a nicely structured library that fills in a lot of the gaps in Scala's standard lib. In general, I don't want to dwell on this issue too much since every language's standard library has warts and Scala's is still certainly much saner than Java where they overlap in functionality (sensing a theme here?).&lt;/li&gt;&lt;/ul&gt;Overall I found these shortcomings to be more annoyances than showstoppers and I would certainly recommend Scala to anyone targeting the JVM. At our company, I know we will continue to expand our Scala usage: since shipping the Scala-backed PA, we've used Scala for one other major project in production and have several others either in progress now or in the pipeline. So we're pretty much hooked.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Takeaways&lt;/h3&gt;&lt;br /&gt;Typeful programming is a huge win, but it does take some getting used to. When I started on this first Scala project, I'd used Haskell for some toy projects but otherwise didn't have much experience with statically typed functional languages. There is a certain sort of brain-rewiring that occurs as you become proficient with the type system in a language like Scala&amp;mdash;you reach a point where types become ingrained in how you think about and structure code, and the type system becomes something you work with, rather than against. Types are also an extremely concise way of communicating and documenting designs.&lt;br /&gt;&lt;br /&gt;Another takeaway is that (no surprise), good design takes a while and is extremely hard, in particular getting the details right. Scott and I had a vague sketch of a design after a few hours of whiteboarding, but nailing down the details took much longer. There are always a lot of small design decisions beyond what you uncover whiteboarding. In general, I've become a lot less trusting of whatever initial ideas pop into my head when working on a design, and I prefer now, where possible, to give myself time to marinate on a problem. Several times I've had the experience of realizing something, months after the fact, which had I realized several months prior would have saved me a ton of work and made the code much simpler. This is very humbling, and it's why I think doing prototyping and giving yourself time to explore the problem domain often yields greater productivity in the long run. &lt;br /&gt;&lt;br /&gt;Finally, I'd like to scrutinize the folk wisdom that rewrites are generally a bad idea. The argument goes that old, battle-tested code has accumulated scores of bugfixes, corner-case handling, and so forth and rewrites mean throwing away all of this instantiated knowledge. Rebuilding this knowledge in the new codebase is often more time-consuming than it's worth. &lt;br /&gt;&lt;br /&gt;The experience with rewriting the PA has led me to suspect that this argument is more applicable for code that is underspecified and difficult to test. When code is testable and well-specified, the cost of discovering bugs and handling corner cases drops substantially... to the point where they are no longer such a huge time sink. I think this myth arises because most code is not very testable&amp;mdash;such code tends to result in bugs being discovered much later in the development process when the cost of investigating and fixing them is much higher. When considering a rewrite of some codebase that has been this costly and time-consuming to battle-harden, it's easy to assume you will have to incur the same costs when developing the new codebase. But this is only true if you keep using the same old, busted development strategies.&lt;br /&gt;&lt;br /&gt;Rewrites aren't always good: if you are going to rewrite code using substantially the same design, using the same language and tools, just to "clean up the code", then you are just code polishing and are not really adding any value. But if you are redesigning the code with an eye for making future work much easier, then you are simply trading off some current throughput for an increase in long term productivity. And this is a completely sensible thing to do.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-3553729147123761044?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/3553729147123761044/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=3553729147123761044' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/3553729147123761044'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/3553729147123761044'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2009/08/scala-success-story-commercial-usage-of.html' title='A Scala success story: commercial usage of Scala at Capital IQ ClariFI'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-2947961978724638946</id><published>2009-09-08T11:52:00.000-07:00</published><updated>2009-09-09T07:34:55.165-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Encoding functional dependencies in Scala</title><content type='html'>&lt;span style="font-weight:bold;"&gt;Update:&lt;/span&gt; Nevermind, I don't think this is useful. There's no advantage that I can see to this over using plain old abstract type members. Also, nothing stops us from defining two classes with the same type parameters in the same scope, but with different type constructors for converting to the dependent types. Original post follows.&lt;br /&gt;&lt;br /&gt;&lt;hr/&gt;&lt;br /&gt;Here is a way you can encode &lt;a href="http://www.haskell.org/haskellwiki/Functional_dependencies"&gt;functional dependencies&lt;/a&gt; in Scala. I'm not sure if this technique has been described elsewhere, but I hadn't seen it so I thought I'd write it up. &lt;br /&gt;&lt;br /&gt;Sometimes you will have a type &lt;code&gt;Foo[X,Y]&lt;/code&gt;, in which the choice of &lt;code&gt;X&lt;/code&gt; uniquely determines &lt;code&gt;Y&lt;/code&gt; (or more generally, in which the choice of one or more type parameters uniquely determines other type parameters). The example given on the Haskell page for fundeps is a multiplication typeclass: if both items being multiplied are matrices, the result is also a matrix; if both are scalars, the result is a scalar, if one is a matrix and the other is a scalar, the result is a matrix, etc. The types of the two things being multiplied uniquely determines the result type.&lt;br /&gt;&lt;br /&gt;The basic idea for encoding this in Scala is to make the dependent type paramters into concrete type members of the class, dependent on an abstract type constructor that converts the supplied type parameters to the dependent type. An example:&lt;pre&gt;&lt;code&gt;object Fundep {&lt;br /&gt;  trait I[X] { &lt;br /&gt;     type XToZ[Q] &lt;br /&gt;     type Z = XToZ[X] &lt;br /&gt;     def z: Z&lt;br /&gt;  }&lt;br /&gt;  class J[X] extends I[X] { &lt;br /&gt;     type XToZ[Q] = String&lt;br /&gt;     def z = "hello"&lt;br /&gt;  }&lt;br /&gt;  class K[X] extends I[X] { &lt;br /&gt;     type XToZ[Q] = Int&lt;br /&gt;     def z = 42&lt;br /&gt;  }&lt;br /&gt;  def getZ[T&lt;:I[_]](t: T): T#Z = t.z&lt;br /&gt;}&lt;/code&gt;&lt;/pre&gt;The function &lt;code&gt;getZ&lt;/code&gt; shows how users of the type can access the types that are derived from the supplied type parameters. We can call this with &lt;code&gt;getZ(new K[Blah])&lt;/code&gt; and it will return an Int; call it with &lt;code&gt;getZ(new J[Blah])&lt;/code&gt; and we get back a String!&lt;br /&gt;&lt;br /&gt;The nice thing about this is that clients do not need to have knowledge of the type constructors for converting the known type parameters to those that are dependent. In contrast, consider the situation if we made the type constructor for doing the conversion into a type parameter of the class, like so: &lt;code&gt;trait Foo[X,XToZ[_]] { type Z = XToZ[X] }&lt;/code&gt;. If we do this, all clients of the class are forced to know about the type constructor for converting &lt;code&gt;X&lt;/code&gt; to &lt;code&gt;Z&lt;/code&gt;, which they don't really care about - they just need to know what &lt;code&gt;Z&lt;/code&gt; is, not how it is generated from &lt;code&gt;X&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;Note: this only actually works in 2.8. In 2.7.4, the example above crashes the compiler.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-2947961978724638946?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/2947961978724638946/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=2947961978724638946' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/2947961978724638946'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/2947961978724638946'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2009/09/encoding-functional-dependencies-in.html' title='Encoding functional dependencies in Scala'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-9139196333809851658</id><published>2009-07-22T11:47:00.000-07:00</published><updated>2009-07-23T10:23:05.816-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='economics'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Market failure in the EA spouse case</title><content type='html'>There's an interesting class of market failures that occur in the presence of both imperfect information and high switching costs. As an example, consider &lt;a href="http://ea-spouse.livejournal.com/274.html"&gt;the infamous EA Spouse case&lt;/a&gt; (background on &lt;a href="http://en.wikipedia.org/wiki/EA_Spouse"&gt;the wikipedia page&lt;/a&gt;). Quick synopsis: a programmer joined EA with the expectation that working conditions would be relatively sane. After starting work, it became increasingly apparent that the company was a sweatshop in perpetual crunch mode. From &lt;a href="http://ea-spouse.livejournal.com/274.html"&gt;the original EA spouse post&lt;/a&gt;:&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;Electronic Arts offered a job, the salary was right and the benefits were good, so my SO took it. I remember that they asked him in one of the interviews: "how do you feel about working long hours?" It's just a part of the game industry -- few studios can avoid a crunch as deadlines loom, so we thought nothing of it... &lt;br /&gt;&lt;br /&gt;Within weeks production had accelerated into a 'mild' crunch: eight hours six days a week. Not bad. Months remained until any real crunch would start, and the team was told that this "pre-crunch" was to prevent a big crunch toward the end; at this point any other need for a crunch seemed unlikely, as the project was dead on schedule... Weeks passed. Again the producers had given a termination date on this crunch that again they failed... Now, it seems, is the "real" crunch... the current mandatory hours are 9am to 10pm -- seven days a week -- with the occasional Saturday evening off for good behavior (at 6:30pm). This averages out to an eighty-five hour work week. Complaints that these once more extended hours combined with the team's existing fatigue would result in a greater number of mistakes made and an even greater amount of wasted energy were ignored...&lt;br /&gt;&lt;br /&gt;And the kicker: for the honor of this treatment EA salaried employees receive a) no overtime; b) no compensation time! ('comp' time is the equalization of time off for overtime -- any hours spent during a crunch accrue into days off after the product has shipped); c) no additional sick or vacation leave. The time just goes away. Additionally, EA recently announced that, although in the past they have offered essentially a type of comp time in the form of a few weeks off at the end of a project, they no longer wish to do this, and employees shouldn't expect it. Further, since the production of various games is scattered, there was a concern on the part of the employees that developers would leave one crunch only to join another. EA's response was that they would attempt to minimize this, but would make no guarantees. &lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;In discussions of this story when it became widely publicized, a very common response was to say that either the programmer "should have known" what he was getting himself into, and/or if he didn't like the working conditions he should have simply quit - no one was forcing him to work for EA. While this is technically true, it is beside the point - the employee in this case accepted the job given imperfect information about the working conditions (we can quibble about whether this was his fault, but that is not relevant). After the fact, if he wished to find a different job he would be forced to incur various switching costs. Such costs could be substantial for an individual changing jobs (searching for a job takes a lot of time and energy, may require moving expenses and so forth).&lt;br /&gt;&lt;br /&gt;In general, for a person in a job that he might not have accepted given better information, the switching costs give the status quo greater net utility than it would have on its own. This leads the employee to remain at the job longer, until conditions deteriorate (or the employee's utility function changes) to the point where the switching costs become acceptable.&lt;br /&gt;&lt;br /&gt;This is a market failure. Imperfect information causes the market to misdirect productive capacity; switching costs then sustain the resulting inefficiency. In the presence of better information or in the presence of zero switching costs, a company like EA would not be able to attract and retain programmers at the salary they offered. It is only when both factors combine that EA's strategy becomes effective. In fact, given the presence of these factors, it may be the case that EA's strategy actually gives them a competitive advantage over companies that are more up front about their working conditions.&lt;br /&gt;&lt;br /&gt;There are a lot of possible responses to this sort of market failure. Do we pass laws preventing these sort of working conditions? We could, but then we restrict legitimate market activity. Suppose EA were completely upfront about their working conditions and offered increased pay as compensation. If a programmer decides the deal is worth it to him, should he really be legally prevented from accepting the job? By having such a law we are restricting a whole class of mutually beneficial transactions.&lt;br /&gt;&lt;br /&gt;No, the ideal way to deal with this sort of market failure is to provide the market with better information. This allows legitimate, beneficial activity to continue while preventing the inefficiency that would otherwise occur when employees accept jobs under assumptions that are out of sync with reality. But how do we provide the market with better information? I'm not sure. In principle, the government can play a role here - for instance, it can force employers to provide information about working hours. But again, this is a very rigid tool - in addition to increasing the overall regulatory burden, laws like this will always lag behind the needs of the market. If a new sort of information becomes relevant to programming jobs, the law does not automatically adapt to compel the release of this new information. Involving the government means we must also consider the possibility of &lt;a href="http://en.wikipedia.org/wiki/Government_failure"&gt;government failure&lt;/a&gt; - even if government action can in principle make the market more efficient, in practice it isn't guaranteed that it will do so given the incentives of politicians and voters and the limitations of the people involved. &lt;br /&gt;&lt;br /&gt;I'm very interested in understanding when markets will, via endogenous processes, tend to endow their participants with sufficient information relevant to market decisions. That is, when is a market naturally induced to provide its own information, and when it is an equilibrium for market participants to withhold information? A related question is: if we factor in the cost of information acquisition, when do the costs of acquiring information exceed its value to participants? And how should information be valuated? A better understanding of these questions might lead to ideas for how to coax a market into providing participants with better information - self-provided information like this will I think tend to be more subtle, more adaptive, more specialized, and more responsive to the needs of market participants.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-9139196333809851658?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/9139196333809851658/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=9139196333809851658' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/9139196333809851658'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/9139196333809851658'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2009/07/market-failure-in-ea-spouse-case.html' title='Market failure in the EA spouse case'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-2482023653431264386</id><published>2009-06-14T12:20:00.000-07:00</published><updated>2009-11-24T21:11:28.580-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Perfect strictness analysis (Part 2)</title><content type='html'>Strictness analysis &lt;a href="http://pchiusano.blogspot.com/2009/06/perfect-strictness-analysis-part-1.html"&gt;fails to see through polymorphic functions&lt;/a&gt; whose strictness is in whole or in part determined by arguments supplied by callers. The  example I gave was the function &lt;code&gt;foldl&lt;/code&gt;, whose seed parameter can be made strict so long as the binary function used for the fold is also strict.&lt;br /&gt;&lt;br /&gt;Can we do better? Maybe, but I can't think of a clean solution. Here, I'll sketch out an algorithm that doesn't quite work. Note that there is a runtime cost for implementing such an approach, but I believe the runtime cost can be made less than the cost of unnecessarily allocating a thunk on the heap - which requires at the very least a handful of instructions.&lt;br /&gt;&lt;br /&gt;Before giving the algorithm, let's be clear about the goal: the goal of &lt;i&gt;perfect strictness analysis&lt;/i&gt; is to avoid allocating a thunk for any argument that could be made strict without altering the callee's semantics. We don't care whether this determination can be done statically or whether it must wait until runtime. &lt;br /&gt;&lt;br /&gt;In the case that it must wait until runtime, here is a sketch of an algorithm that seems like it might work, but doesn't. I'm going to loosely assume a stack-based machine, in which we push a function then its arguments onto a stack, then either call the function or create a thunk:&lt;ul&gt;&lt;li&gt;Each function stores (at runtime) an annotation indicating which arguments to it are strict, or, more generally, how the strictness of each argument is dependent on the strictness of other arguments.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Rather than building thunks bottom up, we build them top-down: when creating a thunk, we first push the callee onto a stack.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Before building a thunk for an argument to the callee, we ask the callee whether it is strict in that argument. If it is, we evaluate the argument. Note that the callee may examine its caller, recursively, to determine strictness, or it may examine other arguments that are being passed to the callee. In the case that the information is not yet known or the callee is lazy in that argument, we drop back to building a thunk for that argument.&lt;br /&gt;&lt;/ul&gt;Let's look out our definition for &lt;code&gt;foldl&lt;/code&gt; again, and see how this would work. Recall the definition for &lt;code&gt;foldl&lt;/code&gt;:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;foldl :: (a -&gt; b -&gt; a) -&gt; a -&gt; [b] -&gt; a&lt;br /&gt;foldl f s [] = s&lt;br /&gt;foldl f s (h:t) = foldl f (f s h) t&lt;/code&gt;&lt;/pre&gt;I'll assume we can perform some analysis to determine that &lt;code&gt;foldl&lt;/code&gt; is strict in its third parameter (since we must always evaluate the third parameter to distinguish between the empty list case and the non-empty list case), lazy in its first parameter (since we won't necessarily apply that function to any arguments), and strict in its second parameter iff the first parameter is a binary function strict in both its arguments. Notice that the strictness of &lt;code&gt;f&lt;/code&gt; actually depends not just on the type of the third argument, but on its runtime value - if the list is empty, &lt;code&gt;f&lt;/code&gt; will not be evaluated; if it is nonempty, &lt;code&gt;f&lt;/code&gt; will be evaluated. This is going to cause problems, as we'll see in a moment.&lt;br /&gt;&lt;br /&gt;In the recursive call, &lt;code&gt;foldl f (f s h) t&lt;/code&gt;, we would push &lt;code&gt;foldl&lt;/code&gt; onto a stack. Now we are at the point where we are passing the first argument to &lt;code&gt;foldl&lt;/code&gt;, so we would ask &lt;code&gt;foldl&lt;/code&gt; whether it is strict in this argument. It responds 'no', so we leave &lt;code&gt;f&lt;/code&gt; unevaluated. Next we are at the point we we are about to pass the second argument to &lt;code&gt;foldl&lt;/code&gt;. We again query &lt;code&gt;foldl&lt;/code&gt;, asking if it is strict in its second argument. &lt;code&gt;foldl&lt;/code&gt; checks its first argument - except if &lt;code&gt;f&lt;/code&gt; is unevaluated, its strictness is unknown, so we have to fall back to thunking the second argument. We're back where we started - the strictness of the function f won't become known until we reach the end of the list, so we end up with the same behavior as before. Ouch! Note that if &lt;code&gt;f&lt;/code&gt; were already evaluated, &lt;code&gt;foldl&lt;/code&gt; would be able to determine that the second parameter could be evaluated before being supplied.&lt;br /&gt;&lt;br /&gt;It seems we would need another level of analysis to determine that the first parameter, &lt;code&gt;f&lt;/code&gt;, can be made strict if the list being folded over is nonempty. In principle, I don't see why we couldn't do this sort of analysis, but this is starting to get ugly. In addition, we have the problem that when we go to supply &lt;code&gt;f&lt;/code&gt; to &lt;code&gt;foldl&lt;/code&gt;, the list argument hasn't been supplied yet, so we can't inspect the list to determine if it is nonempty. This might not be a fundamental problem since we obviously have a reference to the list even if we haven't formally supplied to &lt;code&gt;foldl&lt;/code&gt; yet. But still, this is getting a lot more complex than I'd initially hoped.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-2482023653431264386?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/2482023653431264386/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=2482023653431264386' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/2482023653431264386'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/2482023653431264386'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2009/06/perfect-strictness-analysis-part-2.html' title='Perfect strictness analysis (Part 2)'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-7016827163135884381</id><published>2009-06-07T19:30:00.000-07:00</published><updated>2009-11-24T21:10:41.338-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Perfect strictness analysis (Part 1)</title><content type='html'>Previously, I talked about how &lt;a href="http://pchiusano.blogspot.com/2009/05/optional-laziness-doesnt-quite-cut-it.html"&gt;strict-by-default results in less modular code&lt;/a&gt;. Now, I want to talk about problems with lazy-by-default languages, and how these problems could be addressed by better strictness analysis. This will be in multiple parts. (&lt;a href="http://pchiusano.blogspot.com/2009/06/perfect-strictness-analysis-part-2.html"&gt;Part 2&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;Let's look at a simple example where laziness causes problems. Here's a definition for a left fold in Haskell:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;foldl :: (a -&gt; b -&gt; a) -&gt; a -&gt; [b] -&gt; a&lt;br /&gt;foldl f s [] = s&lt;br /&gt;foldl f s (h:t) = foldl f (f s h) t&lt;/code&gt;&lt;/pre&gt;If you try evaluating &lt;code&gt;foldl (+) 0 [1..1000000]&lt;/code&gt;, you'll get a stack overflow. The stack overflow occurs because &lt;code&gt;foldl&lt;/code&gt; is lazy in the parameter &lt;code&gt;s&lt;/code&gt; (the 'seed' parameter); each iteration of the fold creates a thunk with one more level of nesting than the current seed. When the result of the fold is finally evaluated it requires stack space proportional to the length of the list folded over. If you want to sum a list in constant space, you can either use a hand-coded loop, or you can use &lt;code&gt;foldl'&lt;/code&gt;, which is strict in its seed argument. (Update: there's another way, too, see below.)&lt;br /&gt;&lt;br /&gt;But wait a minute - if you look at the definition for &lt;code&gt;foldl&lt;/code&gt;, you can see that if the operator, &lt;code&gt;f&lt;/code&gt; is strict in its  arguments, then the seed could also be made strict without any change in semantics. Strictness analysis has failed us: since &lt;code&gt;foldl&lt;/code&gt; is polymorphic, the compiler can't assume anything about the strictness of the operator and must therefore treat the seed parameter as lazy. In principle the compiler could produce specialized versions of &lt;code&gt;foldl&lt;/code&gt; automatically and apply the usual strictness analysis to the specialized versions. However, I doubt this works as a general solution to the problem as it's possible to construct examples where this strategy leads to an explosion in code size. (Open question: are said examples really that common in practice?)&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Update&lt;/b&gt;: &lt;code&gt;foldl&lt;/code&gt;, &lt;a href="http://www.nabble.com/Re%3A-Runtime-strictness-analysis-for-polymorphic-HOFs--p24039326.html"&gt;as it's written in GHC&lt;/a&gt;, will actually do the right thing assuming you compile with strictness analysis turned on. However, the recursive definition I gave above still yields a stack overflow.&lt;br /&gt;&lt;br /&gt;Optimizations like strictness analysis that can't be counted on to be applied in all relevant cases are leaky abstractions. And like all leaky abstractions, strictness analysis fails to form a new semantic level upon which we can build new abstractions and modes of thinking; rather, it is an ad hoc piece of machinery that must be explicitly unpacked and considered when writing code. This places additional conceptual burden on the programmer.&lt;br /&gt;&lt;br /&gt;Contrast this with laziness itself: laziness is pervasively and predictably applied, not an optimization you hope gets triggered. Laziness lets you program in a different style than you would in a strict language. &lt;br /&gt;&lt;br /&gt;What would it look like to have perfect strictness analysis, which worked predictably, correctly handling polymorphic functions like &lt;code&gt;foldl&lt;/code&gt;? I'll detail a possible solution in my next post.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Update:&lt;/b&gt; It appears I was overly optimistic about the difficulty of this problem. &lt;a href="http://pchiusano.blogspot.com/2009/06/perfect-strictness-analysis-part-2.html"&gt;Read on&lt;/a&gt; for the details.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-7016827163135884381?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/7016827163135884381/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=7016827163135884381' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/7016827163135884381'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/7016827163135884381'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2009/06/perfect-strictness-analysis-part-1.html' title='Perfect strictness analysis (Part 1)'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-5073789662040614229</id><published>2009-05-27T18:47:00.000-07:00</published><updated>2009-11-24T21:09:57.783-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Optional laziness doesn't quite cut it</title><content type='html'>Strict languages that support optional laziness just don't provide the same modularity and composability that you get with a language that is lazy by default. In a strict language with optional laziness, callers (direct and indirect) all may need to be aware of the laziness of any potential callee (direct or indirect). This is a devastating handicap that makes these functions much less useful and composable than they would be otherwise.&lt;br /&gt;&lt;br /&gt;Here's an example, in Scala. Suppose we create a method, foo, lazy in its second argument:&lt;pre&gt;&lt;code&gt;def foo[A,B](a: A, b: =&gt; B): B = ...&lt;/code&gt;&lt;/pre&gt;If we want to call &lt;code&gt;foo&lt;/code&gt;, no problem, just do &lt;code&gt;foo(a, hugeFunction())&lt;/code&gt;. Scala automatically creates a thunk wrapping the &lt;code&gt;hugeFunction()&lt;/code&gt; call, ensuring that it won't be executed before &lt;code&gt;foo&lt;/code&gt; gets called. The problem comes when we consider a caller of &lt;code&gt;foo&lt;/code&gt;:&lt;pre&gt;&lt;code&gt;def bar[A,B,C](a: A, b: B, c: C): B = {&lt;br /&gt;   // big long function&lt;br /&gt;   foo(a, b)&lt;br /&gt;}&lt;/code&gt;&lt;/pre&gt;This isn't quite right - &lt;code&gt;bar&lt;/code&gt; is strict in its second argument, but when that argument is actually used, in the call to &lt;code&gt;foo&lt;/code&gt;, it is lazy. If &lt;code&gt;foo&lt;/code&gt; depends on the laziness of &lt;code&gt;b&lt;/code&gt; for its semantics or performance, we probably should adjust the signature of &lt;code&gt;bar&lt;/code&gt; to also be lazy:&lt;pre&gt;&lt;code&gt;def bar[A,B,C](a: A, b: =&gt; B, c: C): B = ...&lt;/code&gt;&lt;/pre&gt;Starting to see a problem yet? What about callers of &lt;code&gt;bar&lt;/code&gt;? These need fixing, too. In principle we might have to adjust the signatures of a long chain of callers into &lt;code&gt;foo&lt;/code&gt;, until we get to the function that generates the instance of &lt;code&gt;B&lt;/code&gt; that gets passed ultimately to &lt;code&gt;foo&lt;/code&gt;. The problem is especially bad when writing generic code that can appear in several locations in the call graph. Very often these functions will not have full knowledge of who they call; the only way for them to be fully general is for them to be lazy by default (since some callee may depend on this laziness). If you program in a style where you make plentiful use of HOFs, or you make use of a lot of generic interfaces, you start to see what a PITA it can be to deal with strictness. A function that appears deep in the call graph needing to be made lazy might result in the programmer having to propagate laziness information to a huge number of dependent functions.&lt;br /&gt;&lt;br /&gt;As an example, I recently built an experimental combinator library for defining parallel computations. I won't go into details here, but you can build a nice set of combinators all derived from just three primitives: &lt;code&gt;unit&lt;/code&gt;, &lt;code&gt;map&lt;/code&gt;, and &lt;code&gt;join&lt;/code&gt;. As it turns out, these three functions form a monad, and the combinator library I was going to write was just going to be the usual combinators that can be derived given the monad interface. I figured I'd just save myself the trouble and just use implementation of the monad combinators from &lt;a href="http://code.google.com/p/scalaz/"&gt;scalaz&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;... except, the monad interface in &lt;a href="http://code.google.com/p/scalaz/"&gt;scalaz&lt;/a&gt; defined &lt;code&gt;unit&lt;/code&gt; to be a strict function (this is being fixed in an upcoming release), and my particular implementation of the monad interface needed &lt;code&gt;unit&lt;/code&gt; to be lazy. Short of modifying scalaz, my only option was to rewrite all the functions that called into &lt;code&gt;unit&lt;/code&gt;. Unfortunately, the &lt;code&gt;unit&lt;/code&gt; function has a lot of dependents - the functions that can be derived given a monad have a nice, layered structure, where each layer builds on the layer below, with the bottom layer being just the three functions that define a monad. Changing the signature of &lt;code&gt;unit&lt;/code&gt; in the interface causes a cascade of changes to all the layers above.&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;There are certainly some problems with laziness being the default, but I've started to think the problems are of the type that can be solved or have already been solved by better technology (I'll talk more about this in &lt;a href="http://pchiusano.blogspot.com/2009/06/perfect-strictness-analysis-part-1.html"&gt;a future post&lt;/a&gt;). In contrast, the above problems with strict-by-default are fundamental.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-5073789662040614229?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/5073789662040614229/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=5073789662040614229' title='18 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/5073789662040614229'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/5073789662040614229'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2009/05/optional-laziness-doesnt-quite-cut-it.html' title='Optional laziness doesn&apos;t quite cut it'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>18</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-7107619264891926095</id><published>2008-12-16T22:07:00.000-08:00</published><updated>2009-04-13T12:36:16.964-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mathematics'/><title type='text'>Objections to the blue-eyed islander problem</title><content type='html'>I've always found the classical answer to the blue-eyed islander problem vaguely unsatisfying. For background, here's &lt;a href="http://xkcd.com/blue_eyes.html"&gt;one formulation of the puzzle&lt;/a&gt;:&lt;br /&gt;&lt;blockquote&gt;A group of people with assorted eye colors live on an island. They are all perfect logicians -- if a conclusion can be logically deduced, they will do it instantly. No one knows the color of their eyes. Every night at midnight, a ferry stops at the island. Any islanders who have figured out the color of their own eyes then leave the island, and the rest stay. Everyone can see everyone else at all times and keeps a count of the number of people they see with each eye color (excluding themselves), but they cannot otherwise communicate. Everyone on the island knows all the rules in this paragraph.&lt;br /&gt;&lt;br /&gt;On this island there are 100 blue-eyed people, 100 brown-eyed people, and the Guru (she happens to have green eyes). So any given blue-eyed person can see 100 people with brown eyes and 99 people with blue eyes (and one with green), but that does not tell him his own eye color; as far as he knows the totals could be 101 brown and 99 blue. Or 100 brown, 99 blue, and he could have red eyes.&lt;br /&gt;&lt;br /&gt;The Guru is allowed to speak once (let's say at noon), on one day in all their endless years on the island. Standing before the islanders, she says the following:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;"I can see someone who has blue eyes."&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Who leaves the island, and on what night?&lt;/blockquote&gt;The classical answer is that all blue-eyed islanders leave on the 100th night. The proof proceeds inductively - here's &lt;a href="http://terrytao.wordpress.com/2008/02/05/the-blue-eyed-islanders-puzzle/#comment-25874"&gt;a nice explanation by a commenter on Terry Tao's blog.&lt;/a&gt; Note that this explanation uses slightly different formulation of the puzzle in which islanders must commit suicide (!!)  if they learn their eye color (rather than simply leave the island).&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Let’s use the letter k to represent the number of blue-eyed islanders, of which you will be one of. In the argument “I” is used so that you may read this aloud to yourself as if it were your own dilema.&lt;/p&gt; &lt;p&gt;If k is 1, then I cannot see any blue-eyed people on the island. This was fine until the visitor told us that they exist. But now, knowing that they do exist and not being able to see any, I can only conclude that it must be me! Thus, I must die one day from now.&lt;/p&gt; &lt;p&gt;If k is 2, then I can see only one other blue-eyed person, Joe. I know that Joe is as logical as I am so by the reasoning when k is 1, he’s going to have to kill himself. I check on him tomorrow and see that he hasn’t, but how can that be? The only logical conclusion is that the above reasoning did not apply for Joe because when he looked around he must have seen someone else with blue eyes. But I don’t see such a person so it means that it must be me! Thus the following day (two days after the information)I must die. But since Joe is as logical as I am and has been faced with the same situation, he must also have discovered, exactly as I did, that he has blue eyes. Thus he will kill himself at the same time as I do.&lt;/p&gt; &lt;p&gt;If k is 3, then I can see see two blue-eyed people, Joe and Ted. Now, here is where you see the argument continue to depend on the earlier cases (this is what is meant by induction). If Joe and Mike are truly the only two people on the island, then based on the exact same reasoning when k is 2, they will both be dead after two days exactly. However, after two days I see that both are still living. Again, the only logical conlusion is that when Joe and Ted looked around they must have seen someone other than each other with blue eyes. Since I can’t see such a person it means that it must be me! Thus I must kill myself the next day (now three days after the information). But, since Joe and Ted are equally logical, also have blue eyes, and have been part of the same situation, they must have learned exactly as I did that they too have blue eyes. They will join me at the suicide ritual on the same day.&lt;/p&gt; &lt;p&gt;At this point you can see the repetitive nature of the argument. For each higher value of k the argument is quickly formed in terms of our previous argument for k-1. The induction proof is merely a concise formation of this which (correctly) reasons in terms of a general value of k. The visitor’s information is essential because it sets up the “base case” where k is 1.&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;/p&gt;The induction certainly "works" in the sense that if start with your base case (&lt;span style="font-style: italic;"&gt;k=1&lt;/span&gt;) and reason inductively, you are led to the theorem that after &lt;span style="font-style: italic;"&gt;k&lt;/span&gt; days, all &lt;span style="font-style: italic;"&gt;k&lt;/span&gt; blue-eyed people leave the island. But note what you must do to start off this inductive process - you must consider what would happen &lt;span style="font-style: italic;"&gt;if k were 1. &lt;/span&gt;But in reality, no matter your eye color, you are certain it is impossible for &lt;span style="font-style: italic;"&gt;k &lt;/span&gt;to be 1. I claim that considering the &lt;span style="font-style: italic;"&gt;k=1&lt;/span&gt; case is an arbitrary thing to do. Maybe it is &lt;span style="font-style: italic;"&gt;natural&lt;/span&gt; that you happen to consider the &lt;span style="font-style: italic;"&gt;k=1&lt;/span&gt; case in light of the guru's announcement, but you are not &lt;span style="font-style: italic;"&gt;logically compelled&lt;/span&gt; to do so. It is only if we assume the announcement will cause everyone to consider &lt;span style="font-style: italic;"&gt;k=1&lt;/span&gt;, and further, that everyone knows the announcement will have this effect, that the proof goes through. As given, the problem is unclear.&lt;br /&gt;&lt;br /&gt;Note that we could do the exact same proof to argue that all &lt;span style="font-style: italic;"&gt;j &lt;/span&gt;brown-eyed islanders would leave after &lt;span style="font-style: italic;"&gt;j &lt;/span&gt;days, provided we assume that everyone will consider the &lt;span style="font-style: italic;"&gt;j=1 &lt;/span&gt;&lt;span&gt;case&lt;/span&gt;&lt;span style="font-style: italic;"&gt;, &lt;/span&gt;and that everyone knows everyone else will consider this case. The only effect of the announcement (since it reveals no new information) is simply to act as a coordination device to determine which chain of reasoning is to be considered. In the absense of some way of coordinating which chain of reasoning to use, no one can be convinced that others will follow through with the reasoning, and the induction cannot proceed.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-7107619264891926095?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/7107619264891926095/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=7107619264891926095' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/7107619264891926095'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/7107619264891926095'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2008/12/objections-to-blue-eyed-islander.html' title='Objections to the blue-eyed islander problem'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-1716448024507101947</id><published>2008-10-12T13:40:00.000-07:00</published><updated>2008-11-02T15:13:24.979-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='economics'/><title type='text'>Is the CDS market just a big casino?</title><content type='html'>The notional amount of all outstanding credit default swap (CDS) contracts &lt;a href="http://en.wikipedia.org/wiki/Credit_default_swap#Market"&gt;is estimated to approach the worldwide GDP&lt;/a&gt;. Part of the reason this figure is so high is that people can buy CDS contracts on debt they don't even own - it is common for the notional value of CDS contracts for some credit entity to exceed the total value of bonds issued by that entity. This implies people are using CDS contracts to speculate on a credit entity's risk of default, or even further, to profit from fluctuations in market expectations of credit risk. Does this mean the CDS market is just a glorified casino?&lt;br /&gt;&lt;br /&gt;In theory, the CDS market allows risk to be spread among more participants in the economy. But when a CDS is bought by somebody who isn't actually exposed to the risk of the underlying debt, this clearly doesn't transfer any risk from buyer to seller - it is merely a zero-sum bet made by the two parties that actually &lt;span style="font-style: italic;"&gt;creates&lt;/span&gt;&lt;span style="font-style: italic;"&gt; previously nonexistent risk&lt;/span&gt;. Arguably, such a transaction generates no real wealth. In contrast, in the non-zero-sum exchange where the CDS buyer owns the underlying debt, both parties benefit: the buyer reduces his risk, the seller increases his risk in exchange for a premium, and the total amount of risk remains unchanged.&lt;br /&gt;&lt;br /&gt;CDS speculators can still increase efficiency by 1) providing liquidity, and 2) by contributing to the process of price discovery. The additional liquidity provided by speculators allows more buyers and sellers to find one another. As a simple example, suppose I own some CDS contracts and would like to sell them now. A speculator, Bob, may buy these contracts from me and later resell them to Mary. I benefit, by getting a higher price for the contracts than I would have received from a nonspeculative buyer, and by not having to hold onto an asset I no longer want. Mary benefits, as she obtains an asset she wants at a price she feels is appropriate. And Bob benefits, we hope, by profiting from the difference in prices from when I sold to him versus when he sold to Mary.&lt;br /&gt;&lt;br /&gt;When a CDS is bought by someone not exposed to the underlying risk, the transaction also contributes to the process of price discovery, and this benefits all people who DO actually bear the risk of the underlying debt, and probably others as well. If you assume that price discovery takes resources, it also means that others besides those who own the underlying debt can do the work of valuating this debt and be compensated for the work. This is efficient, and it's better than having a few anointed ratings companies rate debt. Clearly, ratings companies paid by those selling the debt do not have the proper incentives to rate it accurately.&lt;br /&gt;&lt;br /&gt;In fact, in a public CDS market, a "ratings company" becomes just another market participant. Those skilled at evaluating risk would be able to profit in this market and could even start "CDS funds" that would attract capital from outside investors. Unlike current ratings companies who do not &lt;span style="font-style: italic;"&gt;really &lt;/span&gt;bear the cost of poorly evaluating credit risk, these hypotetical CDS funds would literally thrive or go bankrupt based on their rating abilities. So, even if the CDS market is a casino, creating risk which did not previously exist, it at least gives participants a strong financial incentive to come to an accurate valuation of the underlying debt.&lt;br /&gt;&lt;br /&gt;The problem with this story is that if CDS contracts aren't traded on a public exchange whose prices anyone can see, the benefits of the market's game of price discovery aren't realized by those on the outside. Why isn't there a public exchange for CDS contracts, anyway? According to &lt;a href="http://en.wikipedia.org/wiki/Credit_default_swap#Market"&gt;the Wikipedia article&lt;/a&gt;, "In 2007 the Chicago Mercantile Exchange set up a federally regulated, exchange-based market to trade CDSs. So far, it hasn't worked because it's been boycotted by banks, which prefer to continue their trading privately."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-1716448024507101947?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/1716448024507101947/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=1716448024507101947' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/1716448024507101947'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/1716448024507101947'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2008/10/is-cds-market-just-big-casino.html' title='Is the CDS market just a big casino?'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2267747408703654731.post-5854086733030803137</id><published>2008-09-24T22:01:00.000-07:00</published><updated>2008-10-12T14:00:27.351-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='economics'/><title type='text'>Why inflation is appropriate</title><content type='html'>Is inflation really so bad? One argument that gets thrown around a lot is that growing the money supply is an insidious and immoral form of wealth redistribution. The reasoning goes something like this: to see why inflation is redistributive, consider what would happen if the money supply were increased by the government issuing an edict saying all dollars were now worth twice as much. Clearly this would not affect anyone's real wealth, and prices would quickly adjust to the new amount of currency in the economy. But this isn't how increase in the money supply occurs. When the Fed makes an open market purchase and injects new currency into the economy, that money goes first to the banks and the people they loan to, then to wherever they spend the money, and so on. These people who receive the new money first get to spend it before prices have adjusted to the new supply of currency - as they spend it, prices start to rise. Meanwhile, people holding cash or whose wage is a certain cash value find their spending power starts to diminish. Effectively, they have transferred some of their wealth to the people who received the newly generated currency.&lt;br /&gt;&lt;br /&gt;This is the argument, but now let's imagine the alternative of a fixed money supply, where the number of units of currency in circulation is fixed. Suppose you become very rich in 1970 and come to have under your control 1/10th of the total number of dollars in circulation. At the time (1970), with the GDP being whatever it is, this 1/10th of the total dollars has a certain real value. Now say you keep these dollars in a vault for 40 years. Fast forward to 2010 - society is a lot wealthier and more productive now, has a much higher GDP, etc. &lt;span style="font-style: italic;"&gt;It would not make sense for the money you kept in the vault for 40 years to have appreciated in value - your dollars sitting in a vault should not be generating wealth out of thin air.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Another way to think of it is if you used your money from 1970 to buy a bunch of 1970s goods, then transported these goods instantly to  2010 and sold them, they would be worth a lot less. We can do things more efficiently now, at lower (real) cost than we could in 1970 - that's the nature of economic growth! An inflating money supply therefore just reflects the reality that new wealth is being created over time. It therefore makes perfect sense that as the economy grows, exisiting cash assets decline in real value and prices therefore rise.&lt;br /&gt;&lt;br /&gt;Therefore, the argument about inflation being redistributive is missing the point. If you take out a loan and pay it back, it means you have generated new wealth which did not previously exist. Of course you (and the bank that loaned it to you) receive the benefit of that new money moreso than others - you are generating an amount of wealth equal to the loan amount plus interest. With more overall wealth in the economy, it makes sense that everyone else's existing stored wealth in the form of cash is devalued slightly.&lt;br /&gt;&lt;br /&gt;Now think about the money sitting in a bank vault for 40 years. In a fixed currency, that money will become more valuable just by sitting there. If that money is generating real value out of thin air, it must be the case that others are indirectly subsidizing this growth. And the people subsidizing it are all those people who actually do generate new wealth over the 40 year period and have to pay higher real interest rates because of a deflating currency (which is the inevitable result of a fixed money supply and a growing economy). Compare this to the more fair situation now, where if you want your cash to gain in value, you need to either use it yourself to generate wealth, or provide capital to somebody else who is doing that. A fixed currency is actually the opposite of what is truly fair!&lt;br /&gt;&lt;br /&gt;So, the money supply should grow as new wealth is created. The next questions are: can the Fed really grow the money supply in step with the creation of new wealth? And what are the consequences of not growing (or shrinking) the money supply at the appropriate rate?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2267747408703654731-5854086733030803137?l=pchiusano.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pchiusano.blogspot.com/feeds/5854086733030803137/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2267747408703654731&amp;postID=5854086733030803137' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/5854086733030803137'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2267747408703654731/posts/default/5854086733030803137'/><link rel='alternate' type='text/html' href='http://pchiusano.blogspot.com/2008/09/why-inflation-is-appropriate.html' title='Why inflation is appropriate'/><author><name>Paul Chiusano</name><uri>http://www.blogger.com/profile/04844651950877109501</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry></feed>
