Friday, July 22, 2011

Unrestricted effects and nondeterminism in purely functional code

This post is partially a continuation of Referentially transparent nondeterminism.

In a program's outermost input and output layers, unrestricted effects and nondeterminism do not break referential transparency since there is no enclosing program around to observe the effects. This gives us a lot of freedom to do things we wouldn't ordinarily be able to do without breaking purity. Just how far can we push this?

Consider the mealy machine datatype:

data Mealy a b = Mealy (a -> (b, Mealy a b))

Clearly, we can feed a Mealy a b from an IO a producer. But we can also combine two IO a producers nondeterministically. It is only when we sample the output that we observe the effect of the nondeterminism. Here is a typeclass expressing this:

class (MonadPlus s) => Source s where
  -- <|> has the effect of merging two sources, nondeterministically
  -- perfectly okay, since when we sample later 
  -- (at the end of the universe), we get IO
  (<|>) :: s a -> s a -> s a
  transform  :: s a -> Mealy a b -> s b
  drain :: s a -> (a -> IO b) -> IO b
  -- deterministic merge - explicit interleaving
  zip   :: s a -> s b -> s (a,b)
  filter :: (a -> Bool) -> s a -> s a
  -- etc

This gives us nondeterminism in our input streams, which are nicely first class. Our choice of stream transformers, Mealy, are also first class. Only when we observe the output of a source (in drain) do we pay for the side effects we may have accumulated. Note that our (<|>) does not require us to pick a deterministic interleaving of the two sources - imagine we have a Source for keypresses of the letter a and another Source of keypresses of b. If our source representation were something like data Source a = Source { sample :: IO a }, our implementation of (<|>) would be required to pick a deterministic interleaving (wait for an a, then wait for a b, or vice versa). Our implementation can simply pass through either event when it occurs. Effects are paid for only when the stream is observed.

We can allow for more composition in our output handlers with a type like:

type Sink a b = Iteratee a IO b

-- replace drain signature in Source with:
drain :: s a -> Sink a b -> IO b

We then compose Sink values like we would any monadic Iteratee. What we lack, though, is the ability to pass along the nondeterminism from the merging we did with Source. Meaning, if we call drain (a <|> b) i, we ought to have the option (since this is the end of the universe), to evaluate this like drain a i OR drain b i, meaning we do not specify how the effects produced by the two separate instances of i are interleaved. To acheive this, we can add another constructor to Sink:

data Sink a b = One (Iteratee a IO b) | Many (Sink a b)

Many is used to signal to the Source evaluator than any effects <|>'d together may be fed to the given Sink concurrently. Again, since no one is around to observe it, this nondeterminism does not break RT. I don't know how often this sort of output nondeterminism is really useful, but it's interesting that it is possible.

So, we so far have allowed unrestricted side effects and nondeterminism at both the input to our programs as well as the output of our programs. What else can we do? Well, we can allow commutative effects in the implementation of our stream transformer:

data Mealy m a b = Mealy (a -> m (a, Mealy a b))

Here m must be some commutative monad. This ensures that we do not need to pick a global ordering on our effects. One interesting addition is to explicitly separate zip from the monad interface and require only that zip commute. This lets Mealy pipelines see effects from earlier pipeline stages, but not from parallel pipelines:

class Monad m => Commutative m where 
  -- subject to zip a b == fmap (snd *** fst) (zip b a)
  zip :: m a -> m b -> m (a,b)

As a final thought, I wonder: is it possible to make an instance of Commutative for something like State? Clearly, the implementation of zip should feed the same input s into both rather than sequencing the state as usual, but what do we do with the two different output states? We require either some sort of commutative state merging function s -> s -> s to combine the two new states, or we are forced to propagate both values along somehow, without committing to an order!

Related posts:

No comments: