Tuesday, February 8, 2011

Comonads for stream processing... not so great

The problem with comonads is the extend operation. The signature tells you all you need to know:
extend :: (f a -> b) -> f a -> f b
What's wrong with this signature? Well, it means the function passed to extend will have access to the full context of each element of the stream, and moreover, the extent of this context is globally determined by the comonad's implementation of extend. So, in the case of causal stream functions, where the context is say, the list of previous values in the stream, we can, at any point after the fact, reach arbitrarily far back into the past to inspect some values. This implies we must somehow hold onto these values, even if we discover (dynamically) we didn't need them.

If we attempt to provide some more limited notion of context, that limited notion is global, and it restricts the access of every single node in our dataflow graph that might want to use such information.

In a decent API for causal stream functions, it is an absolute requirement that each individual node be capable of deciding how much context it needs to hold onto, not after the fact, but as it receives each element. And importantly, we want these decisions be made locally. Thus, the stream processors are consumers to which we push data, rather than having them pull certain data from history, maintained globally by the producer.

There's a nice comment on the LtU discussion of "The Essence of Dataflow Programming" that sums it up for me:
Perhaps the fact that it can be done at all with a comonad (regardless of how simply) is of theoretical interest, but that is not the slant Uustalu and Vene and put on it. They claim (over-ambitiously I think) that comonadic semantics capture the "essence" of dataflow semantics...

My personal take on the whole thing is they succumbed to the understandable excitement of finding an application for comonads and confused a task that *can* be done with a comonad for one that *should* be done with a comonad.
... although I disagree with the author of that comment that applicative functors capture more of the 'essence' of dataflow programming.

No comments: