Week 7 Exercises

This week we will study the essence of recursion and the classification of recursive functions in terms of recursion schemes. The difficulty-to-simplicity ratio of these concepts is perhaps the greatest you will ever come across.

Exercise 1

Consider the following function.

fix :: (a -> a) -> a
fix f = f (fix f)

This function is known as the fixed point combinator and it is basically recursion in the smallest box that will fit. Despite its unassuming appearance, the fixed point combinator plays an important role in programming language theory, because it is strong enough to generate any recursive function from a nonrecursive one.

Redefine the following functions by applying the fixed point combinator to nonrecursive generators. That is, for each function f, implement a nonrecursive function fF and use it to define f by partially applying fix fF.

Generator for id :: a -> a.
Generator for (++) :: [a] -> [a] -> [a].
Generator for reverse :: [a] -> [a].
Generator for repeat :: a -> [a].
Generator for foldr :: (a -> b -> b) -> b -> [a] -> b.
Generator for unfoldr :: (b -> Maybe (a, b)) -> b -> [a].
Generator for fix :: (a -> a) -> a itself.

If you find yourself needing some helper functions, make sure they are not recursive either.

module Week7.Exercise1 where

import Data.Function (fix)

Exercise 2

Since the fixed point combinator is so simple, we can actually take it from the level of values to the level of types. The definition is the same, even though the syntax looks quite different.

newtype Fix m = Fix {unFix :: m (Fix m)}

Redefine the following types by applying the fixed point combinator to nonrecursive types. That is, for each type A, implement a nonrecursive type AF and use it to define A by partially applying Fix AF.

Generator for Bool.
Generator for Maybe a.
Generator for Either a b.
Generator for ().
Generator for [] a.
Generator for Void.
Generator for Identity a.
Generator for Stream a from the streams package.
Generator for Tree a from the containers package.
Generator for Expr from week 4 exercise 3.
Generator for Free m a from the free package.
Generator for Cofree m a from the free package.
Generator for Fix m itself.

Remember to test your generators to make sure that they really generate the right types.

It helps to derive a few type class instances and write pattern synonyms for your types.

$(deriveFoldable ''ExprF)
$(deriveFunctor ''ExprF)
$(deriveShow1 ''ExprF)
$(deriveTraversable ''ExprF)

pattern Add' :: Expr' -> Expr' -> Expr'
pattern Add' x y = Fix (AddF x y)

We will need them to do exercise 4 anyway.

{-# LANGUAGE CPP, TemplateHaskell #-}
{-# LANGUAGE PatternSynonyms #-}

module Week7.Exercise2 where

import Data.Foldable.Deriving (deriveFoldable)
import Data.Functor.Classes
import Data.Functor.Deriving (deriveFunctor)
import Data.Traversable.Deriving (deriveTraversable)
import Text.Show.Deriving (deriveShow1)

newtype Fix m = Fix {unFix :: m (Fix m)}

-- | We can derive a `Show` instance for `Fix` by
--
-- * adding the `deriving Show` clause to the `newtype` declaration,
-- * observing the error messages,
-- * replacing the `deriving Show` clause with the inferred
--   `deriving instance Show (m (Fix m)) => Show (Fix m)` declaration,
-- * observing the error messages,
-- * turning on the `StandaloneDeriving` extension,
-- * observing the error messages,
-- * turning on the `UndecidableInstances` extension,
-- * observing the worrying lack of error messages,
-- * turning on the `-ddump-deriv` option,
-- * observing the dump messages,
-- * replacing the `deriving instance Show (m (Fix m)) => Show (Fix m)`
--   declaration with the dumped instance,
-- * turning off the `StandaloneDeriving` and
--   `UndecidableInstances` extensions,
-- * changing the `Show (m (Fix m))` constraint to the `Show1 m` constraint,
-- * replacing the `showsPrec` call with the `showsPrec1` call,
-- * observing the error messages,
-- * importing the `Data.Functor.Classes` module,
-- * observing the reassuring lack of error messages,
-- * observing the excessively verbose mess that the instance produces,
-- * writing less verbose variations that break the `Show` laws,
-- * using splices to make it possible to choose one of the variations,
-- * observing the error messages,
-- * turning on the `TemplateHaskell` extension,
-- * observing the error messages,
-- * replacing the splices with preprocessing directives,
-- * turning off the `TemplateHaskell` extension,
-- * observing the error messages,
-- * turning on the `CPP` extension,
-- * observing the eventual lack of error messages and
-- * sacrificing a child to Cheibriados.
--
-- We can then obtain a `Show` instance for `Fix F`
-- (or any partial application thereof) by
--
-- * turning on the `TemplateHaskell` extension,
-- * splicing `deriveShow1 ''F` to derive a `Show1` instance for `F` and
-- * letting the compiler figure out the rest.
--
-- You may choose your preferred variation of `Show` based on your alignment.
#define NEUTRAL_GOOD

#ifdef LAWFUL_GOOD
instance Show1 m => Show (Fix m) where
  showsPrec n (Fix x) = showParen (n >= 11) (showString "Fix {" .
    showString "unFix = " . showsPrec1 0 x . showString "}")
#else
#ifdef NEUTRAL_GOOD
instance Show1 m => Show (Fix m) where
  showsPrec n (Fix x) = showParen (n >= 11)
    (showString "Fix " . showsPrec1 11 x)
#else
#ifdef CHAOTIC_GOOD
instance Show1 m => Show (Fix m) where
  showsPrec n (Fix x) = showsPrec1 n x
#else
#error Alignment
#endif
#endif
#endif

Exercise 3

In week 4 exercises 3 and 4 we looked at an evaluator and a parser for expressions in a semiring. We shall now complement them with a collection of other utilities.

Define at least half of the following utility functions. That is, fulfill at least 6 of the following 12 requirements, whichever you find most interesting.

The function isSimple of type Expr -> Bool should check that the expression does not contain variable bindings or references.
The function breadth of type Expr -> Int should find the number of subexpressions in the expression. The resulting size should be unity for a constant.
The function depth of type Expr -> Int should find the nesting depth of the expression.
The function assocAdd of type Expr -> Expr should use the associativity laws of the sum semigroup to simplify the expression. The resulting expression should not contain any addition chains with mixed associativity. The function assocMul of type Expr -> Expr should work analogously for the product semigroup.
The function commAdd of type Expr -> Expr should use the commutativity law of the semiring to simplify the expression. The resulting expression should not contain any additions with their arguments in lexicographically decreasing order.
The function unifyAddZero of type Expr -> Expr should use the unit laws of the sum monoid to simplify the expression. The resulting expression should not contain any additions of zero. The function unifyMulOne of type Expr -> Expr should work analogously for the product monoid.
The function codistAddMul of type Expr -> Expr should use the distributivity laws of the semiring to simplify the expression. The resulting expression should not contain any distributed subexpressions.
The function absorbZeroMul of type Expr -> Expr should use the absorption laws of the semiring to simplify the expression. The resulting expression should not contain any multiplications by zero.
The function elimDead of type Expr -> Expr should eliminate dead code by removing all the unused bindings from the expression. The resulting expression should not contain any unused bindings, but may still contain used ones.
The function foldConst of type Expr -> Expr should fold constants by expanding all the bound variables in the expression. The resulting expression should not contain any used bindings, but may still contain unused ones.
The function substVar of type String -> Expr -> Expr -> Expr should substitute every occurrence of a variable in the expression. The resulting expression should not mention the old variable, unless it is shadowed by another variable with the same name or present in the substituted expressions. The function substVars of type Map String Expr -> Expr -> Expr should work analogously for several variables.
The function renameVar of type String -> String -> Expr -> Expr should rename a free variable in the expression. The resulting expression should not mention the old variable, unless it is shadowed by another variable with the same name. The function renameVars of type Map String String -> Expr -> Expr should work analogously for several variables.

Try to write your definitions using direct recursion instead of mutual recursion or folds. This keeps the utilities easy to change and makes it more convenient to do exercise 4 as well.

Once you are done, you can put your definitions together to produce a simple peephole optimizer.

optimize :: Expr -> Expr
optimize = (appEndo . foldMap Endo) [elimDead,
  assocAdd, commAdd, unifyAddZero,
  assocMul, unifyMulOne, codistAddMul, absorbZeroMul]

optimizePasses :: Int -> Expr -> Expr
optimizePasses n = (appEndo . stimesMonoid n . Endo) optimize

If you manage to implement most of the utilitity functions correctly, your optimizer should be able to find an optimal representation of the given deeply-embedded closed term representing 31 (built suboptimally according to OEIS sequence A005245).

{-# LANGUAGE ScopedTypeVariables #-}

module Week7.Exercise3 where

import Data.Char (intToDigit)
import Data.Foldable (find)
import Data.Map (Map (..))
import qualified Data.Map as Map
import Data.Semigroup (Endo (..), stimesMonoid)
import Data.Set (Set (..))
import qualified Data.Set as Set
import Week4.Exercise3
import Week4.Exercise4

closedStringBad :: String
closedStringBad = "let \
  \ two = 1 + 1 in let \
  \ three = 1 * (1 + two * 1) in let \
  \ five = let \
    \ four = two * two in \
    \ 1 + four in let \
  \ six = let \
    \ seven = 1 + six in \
    \ two * three in let \
  \ eight = two * four in let \
  \ nine = 0 + three * (three + 0) in \
  \ (0 * five + 1) + (three * (1 + nine) + eight * 0)"

closedDeepBad :: Expr
closedDeepBad = case parseExpr closedStringBad of
  Left e -> error (show e)
  Right x -> x

-- | An improved version of the `showIntAtBase` function
-- from the `Numeric` module of the `base` package.
showIntAtBase' :: forall a. Integral a => a -> (Int -> Char) -> a -> ShowS
showIntAtBase' n f = let
  g :: a -> ShowS
  g p = case p `divMod` n of
    (0, 0) -> id
    (0, m) -> showChar (f (fromEnum m))
    (d, m) -> g d . showChar (f (fromEnum m)) in
  g

-- | A dual of the `keysSet` function
-- from the `Data.Map` module of the `containers` package.
elemsSet :: Ord a => Map k a -> Set a
elemsSet xs = Set.fromList (Map.elems xs)

-- | A copy of the `thenCmp` function
-- from the `Language.Haskell.TH.Syntax` module
-- of the `template-haskell` package.
thenCmp :: Ordering -> Ordering -> Ordering
thenCmp EQ y = y
thenCmp x _ = x

namePrefixes :: [String]
namePrefixes = fmap pure ['x' .. 'z']

freshName :: Set String -> String
freshName css = case find (`Set.notMember` css)
  [(showString ds . showIntAtBase' 10 intToDigit n) mempty |
    n <- [0 :: Int ..], ds <- namePrefixes] of
  -- There are `on (*) toInteger maxBound (length namePrefixes)`
  -- possible names and they are generated in a sorted order
  -- to keep the behavior of the renamer predictable.
  Nothing -> error "Out of names"
  Just cs -> cs

findFree :: Expr -> Set String
findFree = let
  f :: Expr -> Set String
  f (Add x y) = f x <> f y
  f Zero = mempty
  f (Mul x y) = f x <> f y
  f One = mempty
  f (Let cs x y) = f x <> Set.delete cs (f y)
  f (Var cs) = Set.singleton cs in
  f

compareExpr :: Expr -> Expr -> Ordering
compareExpr = let
  f :: Map String String -> Map String String -> Expr -> Expr -> Ordering
  f css dss (Add x y) (Add z w) = f css dss x z `thenCmp` f css dss y w
  f _ _ Zero Zero = EQ
  f css dss (Mul x y) (Mul z w) = f css dss x z `thenCmp` f css dss y w
  f _ _ One One = EQ
  f css dss (Let cs x y) (Let ds z w) = f css dss x z `thenCmp` let
    es = freshName (findFree y <> findFree w) in
    f (Map.insert cs es css) (Map.insert ds es dss) y w
  f css dss (Var cs) (Var ds) = case (Map.lookup cs css, Map.lookup ds dss) of
    (Nothing, Nothing) -> cs `compare` ds
    -- We must not compare free variables to bound variables,
    -- because their ordering may be changed by the renamer.
    -- This is fine, because free variables are never renamed.
    (Nothing, Just _) -> LT
    (Just _, Nothing) -> GT
    (Just es, Just fs) -> es `compare` fs
  -- The remaining cases are chosen in such a way that
  -- "more constant" terms come before "more varying" ones.
  -- This kind of convention may seem arbitrary,
  -- but it comes up frequently in functional programming.
  f _ _ Add {} Zero {} = GT
  f _ _ Add {} Mul {} = LT
  f _ _ Add {} One {} = GT
  f _ _ Add {} Let {} = LT
  f _ _ Add {} Var {} = LT
  f _ _ Zero {} Add {} = LT
  f _ _ Zero {} Mul {} = LT
  f _ _ Zero {} One {} = LT
  f _ _ Zero {} Let {} = LT
  f _ _ Zero {} Var {} = LT
  f _ _ Mul {} Add {} = GT
  f _ _ Mul {} Zero {} = GT
  f _ _ Mul {} One {} = GT
  f _ _ Mul {} Let {} = LT
  f _ _ Mul {} Var {} = LT
  f _ _ One {} Add {} = LT
  f _ _ One {} Zero {} = GT
  f _ _ One {} Mul {} = LT
  f _ _ One {} Let {} = LT
  f _ _ One {} Var {} = LT
  f _ _ Let {} Add {} = GT
  f _ _ Let {} Zero {} = GT
  f _ _ Let {} Mul {} = GT
  f _ _ Let {} One {} = GT
  f _ _ Let {} Var {} = LT
  f _ _ Var {} Add {} = GT
  f _ _ Var {} Zero {} = GT
  f _ _ Var {} Mul {} = GT
  f _ _ Var {} One {} = GT
  f _ _ Var {} Let {} = GT in
  f mempty mempty

instance Eq Expr where
  x == y = compareExpr x y == EQ

instance Ord Expr where
  compare = compareExpr

Exercise 4

This is a bonus exercise, which means that you can get stars from it just like you could from any other exercise, but the stars have not been taken into account when calculating the maximum number of stars you can get. Thus, this exercise will be ignored in the unlikely event that not ignoring it would make your grade overflow.

If you were to integrate your peephole optimizer into a real compiler, you would find yourself having to deal with the following problems.

There is a lot of repetitive pattern matching. This is not only verbose, but also makes the code hard to change. Adding a new constructor to the expression type, for instance, would require modifying every single utility function.
There are no termination guarantees. One mistake in any of the utility functions might cause the program to never terminate. Problems like this are especially annoying to diagnose if they only tend to happen deep inside the optimization pipeline.
The performance scales poorly. As the breadth of the expressions or the number of passes increases, so do the resource requirements of the optimizer. In the worst case, every call to every utility function has to reconstruct the entire expression.

Luckily, these issues can be addressed with the clever use of recursion schemes.

Redefine your utility functions by applying a suitable recursion scheme to a nonrecursive generator. That is, for each function f, implement a nonrecursive function, such as fAlg or fCoalg, and use it to define f by partially applying the appropriate recursion scheme, for example via cata fAlg or ana fCoalg. You may have to weaken some of the requirements of the utility functions, so that you can choose simpler recursion schemes for the job.

Once you are done, you can put your definitions together to produce a more advanced peephole optimizer.

optimizeAlg' :: Algebra ExprF Expr'
optimizeAlg' = (appEmbed . foldMap Embed) [elimDeadAlg',
  assocAddAlg', commAddAlg', unifyAddZeroAlg',
  assocMulAlg', unifyMulOneAlg', codistAddMulAlg', absorbZeroMulAlg']

optimize' :: Expr' -> Expr'
optimize' = cata optimizeAlg'

optimizePassesAlg' :: Int -> Algebra ExprF Expr'
optimizePassesAlg' n = (appEmbed . stimesMonoid n . Embed) optimizeAlg'

optimizePasses' :: Int -> Expr' -> Expr'
optimizePasses' n = cata (optimizePassesAlg' n)

The trick is in fusing the algebras together in optimizePassesAlg' and performing the recursion afterwards in optimizePasses'.

{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}
{-# LANGUAGE FlexibleInstances, TypeSynonymInstances #-}

module Week7.Exercise4 where

import Control.Arrow
import Data.Function (on)
import Data.Map (Map (..))
import qualified Data.Map as Map
import Data.Semigroup (Endo (..), stimesMonoid)
import Data.Set (Set (..))
import qualified Data.Set as Set
import Week4.Exercise3
import Week7.Exercise2
import Week7.Exercise3

-- | Free-monadic version of the `(|||)` function
-- from the `Control.Arrow` module of the `base` package.
(||||) :: (a -> b) -> (m (Free' m a) -> b) -> Free' m a -> b
f |||| g = let
  h (Pure' x) = f x
  h (Free' xs) = g xs in
  h
infixr 2 ||||

-- | Cofree-comonadic version of the `(&&&)` function
-- from the `Control.Arrow` module of the `base` package.
(&&&&) :: (a -> b) -> (a -> m (Cofree' m b)) -> a -> Cofree' m b
f &&&& g = let
  h x = Cofree' (f x) (g x) in
  h
infixr 3 &&&&

type Algebra m a = m a -> a

cata :: Functor m => Algebra m a -> Fix m -> a
cata a = let
  c = cata a in
  a . fmap c . unFix

-- | Algebraic version of the `Endo` type
-- from the `Data.Monoid` module of the `base` package.
newtype Embed m = Embed {appEmbed :: Algebra m (Fix m)}

instance Semigroup (Embed m) where
  Embed g <> Embed f = Embed (g . unFix . f)

instance Monoid (Embed m) where
  mempty = Embed Fix

type Coalgebra m a = a -> m a

ana :: Functor m => Coalgebra m a -> a -> Fix m
ana c = let
  a = ana c in
  Fix . fmap a . c

hylo :: Functor m => Algebra m b -> Coalgebra m a -> a -> b
-- Factored version of `hylo a c = cata a . ana c`.
hylo a c = let
  h = hylo a c in
  a . fmap h . c

-- | Coalgebraic version of the `Endo` type
-- from the `Data.Monoid` module of the `base` package.
newtype Project m = Project {appProject :: Coalgebra m (Fix m)}

instance Semigroup (Project m) where
  Project g <> Project f = Project (g . Fix . f)

instance Monoid (Project m) where
  mempty = Project unFix

type ProductAlgebra m a = m (Fix m, a) -> a

para :: Functor m => ProductAlgebra m a -> Fix m -> a
para a = let
  p = id &&& para a in
  a . fmap p . unFix

type SumCoalgebra m a = a -> m (Either (Fix m) a)

apo :: Functor m => SumCoalgebra m a -> a -> Fix m
apo c = let
  a = id ||| apo c in
  Fix . fmap a . c

hypo :: Functor m => ProductAlgebra m b -> SumCoalgebra m a -> a -> b
hypo a c = para a . apo c

type CofreeAlgebra m a = m (Cofree' m a) -> a

histo :: Functor m => CofreeAlgebra m a -> Fix m -> a
histo a = let
  h = histo a &&&& fmap h . unFix in
  a . fmap h . unFix

type FreeCoalgebra m a = a -> m (Free' m a)

futu :: Functor m => FreeCoalgebra m a -> a -> Fix m
futu c = let
  f = futu c |||| Fix . fmap f in
  Fix . fmap f . c

chrono :: Functor m => CofreeAlgebra m b -> FreeCoalgebra m a -> a -> b
chrono a c = histo a . futu c

-- While we could reimplement all the old functions we have for `Expr`
-- to obtain new equivalent functions for `Expr'`,
-- that would be tedious and, in part,
-- vain due to our poor choice of representation (not de Bruijn).
-- Thus, we opt to merely derive an isomorphism and
-- transport the old functions along it.

fixExprCoalg' :: Coalgebra ExprF Expr
fixExprCoalg' = let
  f :: Expr -> ExprF Expr
  f (Add x y) = AddF x y
  f Zero = ZeroF
  f (Mul x y) = MulF x y
  f One = OneF
  f (Let cs x y) = LetF cs x y
  f (Var cs) = VarF cs in
  f

fixExpr' :: Expr -> Expr'
fixExpr' = ana fixExprCoalg'

unFixExprAlg' :: Algebra ExprF Expr
unFixExprAlg' = let
  f :: ExprF Expr -> Expr
  f (AddF x y) = Add x y
  f ZeroF = Zero
  f (MulF x y) = Mul x y
  f OneF = One
  f (LetF cs x y) = Let cs x y
  f (VarF cs) = Var cs in
  f

unFixExpr' :: Expr' -> Expr
unFixExpr' = cata unFixExprAlg'

showsPrecExpr' :: Int -> Expr' -> ShowS
showsPrecExpr' n = showsPrecExpr n . unFixExpr'

showsExpr' :: Expr' -> ShowS
showsExpr' = showsPrecExpr' 0

showExpr' :: Expr' -> String
showExpr' = flip showsExpr' mempty

closedDeepBad' :: Expr'
closedDeepBad' = fixExpr' closedDeepBad

closedDeep' :: Expr'
closedDeep' = fixExpr' closedDeep

instance Eq Expr' where
  (==) = on (==) unFixExpr'

instance Ord Expr' where
  compare = on compare unFixExpr'