Template Haskell

Gergely Risko (errge@nilcons.com)

2015-05-30

http://tiny.cc/nilcons-th

© 2015 Gergely Risko, All rights reserved.

1 About me

  • From Hungary, Eötvös Loránd university, CS MSc
  • Haskeller since 2007
  • Contributed TH features and patches to GHC 7.8
  • 2008 – 2010: Nokia Maemo
  • 2010 – 2013: Googler, SRE and later SWE
  • 2014 – : []: (Nilcons), own consulting and training company

2 TH Goals, features

  • Provide a clean macro system for Haskell
  • Generating expressions and top-level declarations
  • With the possibility of reification
    (looking up info about the source code)
  • To automate code generation, e.g.:
    • lenses
    • command line parsing

3 Versions

  • TH is shipped with GHC: ghc-pkg list | grep template-haskell
  • The compiler exposes an API with the Q monad
  • The template-haskell package contains the client side of that API and some extra helper functions
  • Have to use the template-haskell package version that shipped, no upgrade
    • GHC 7.8: TH 2.9 (that's what I'm using for this talk)
    • GHC 7.10: TH 2.10, GHC 7.10 is not deployable:
      • in 2015, with distributed buildfarms everywhere, it's no good to generate package ids from /dev/urandom
      • (BTW, we also shouldn't use predictable names in /tmp)

Docs: http://hackage.haskell.org/package/template-haskell-2.9.0.0

4 TH concepts

Look into the TH hackage docs.

  • Names
  • Reification
  • Locations
  • AddTopDecls

The last one sounds promising for generating code, but we're missing the real stuff:

  • quotations
  • splices

5 Quotations

Convert an inline piece of Haskell into an AST of:

  • type Exp if quoting with [e| ... |] or [| ... |]
  • type Type if quoting with [t| ... |]
  • type [Dec] if quoting with [d| ... |]
  • type Pat if quoting with [p| ... |]
> :set -XTemplateHaskell
> import Language.Haskell.TH
> let pprintQ q = putStrLn . pprint =<< runQ q
> pprintQ [| 1 + 2 |]
> runQ [| 1 + 2 |]
something :: [Dec]
something = [FunD f_19 [Clause [VarP x_20]
               (NormalB (InfixE (Just (LitE (IntegerL 1)))
                                (VarE GHC.Num.+)
                                (Just (InfixE (Just (LitE (IntegerL 2)))
                                      (VarE GHC.Num.*)
                                      (Just (VarE x_20)))))) []]]

6 Splices

The opposite of quotation.

If we already have a Q Exp, Q Pat, Q Type or Q [Dec], then we can use splicing to insert that piece of code (so we can actually use TH as a macro language).

> :set -ddump-splices
> $(return $ LitE $ IntegerL 42)

What is the return for? That's just for putting this constant expression into the Q monad, because the splicing API expects us to do that.

If we leave off the splicing we will just print out in IO, so this is an easy way to test our AST producing functions:

> return $ LitE $ IntegerL 42

7 AST builders

So far we have built up our ASTs with the data constructors (LitE, IntegerL) directly and then used return to lift into the Q monad.

There are also builder functions in TH that can be useful for building up bigger expressions.

> $(uInfixE (litE $ integerL 42) (varE '(+)) (litE $ integerL 42))

We could have done the same with constructor names, but we will see that there are some operations (new name generation, reification) that are in the Q monad. And once you're in, you have to stay in.

Also note the apostrophe before (+) for name lookup:

> :t '(+)
> :t 'something
> :t ''Int

8 Splicing in quotes in splicing in quotes…

> let funExp = [| \x -> x + 2 |]
> let valExp = [| 40 |]
> let thrice = [| $funExp $([| $funExp $([| $funExp $valExp |]) |]) |]

So quotation and splicing is recursively embeddable, this example is of course equivalent to the much simpler:

> pprintQ [| $funExp ($funExp ($funExp $valExp)) |]

Hmm, so many parantheses, we should use $, right?

> pprintQ [| $funExp $ $funExp $ $funExp $valExp |]

Or should we?

9 Exercise

Implement a generic tuple getter.

This is invalid Haskell with type errors:

tupleget 3 3 ("a", "b", "c") ===> "c"
tupleget 4 2 ("a", "b", "c", "d") ===> "b"

But we can do this with TH:

$(tupleget 3 3) ("a", "b", "c") ===> "c"
$(tupleget 4 2) ("a", "b", "c", "d") ===> "b"

Reverse live coding time! Post your gist links with solutions to:
https://titanpad.com/ZuriHac2015-TH

Hints:

  • idea 1: quote in [| \(_, m) -> m |] and disassemble (pattern match) and reassemble (constructor calls) it
  • idea 2: build up the correct lambda expression from scratch
    (but then you have to learn about newName)

9.1 Solution 1

We cheat and look into the basic case:

> runQ [| \(_, m) -> m |]
LamE [TupP [WildP,VarP m]] (VarE m)

Makes sense: a lambda expression using a tuple pattern with wildcards and variables.

Now we're ready to just disassemble this piece of data and reassemble it in our liking:

{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH

tupleget :: Int -> Int -> ExpQ
tupleget n i = do
  -- LamE [TupP [WildP,VarP m]] (VarE m)
  LamE [TupP [tplWild, tplMatch]] lamBody <- [| \(_, m) -> m |]
  let wildsBefore = replicate (i - 1) tplWild
  let wildsAfter = replicate (n - i) tplWild
  return $ LamE [TupP $ wildsBefore ++ [tplMatch] ++ wildsAfter] lamBody

9.2 Solution 2

If we don't want to do the whole disassembly-reassembly dance, we need to generate names on our own.

We can use newName for that.

{-# LANGUAGE TemplateHaskell #-}

import Control.Monad
import Language.Haskell.TH

tupleget :: Int -> Int -> Q Exp
tupleget n i = do
  when (i > n) $ reportError "i > n"
  var <- newName "m"
  let wildsBefore = replicate (i - 1) WildP
  let wildsAfter = replicate (n - i) WildP
  return $ LamE [TupP $ wildsBefore ++ [VarP var] ++ wildsAfter] (VarE var)

10 Reification

For generating code based on existing code (e.g. lenses based on a record type), we need some infrastructure to "look around" programatically in the code base.

This is reification.

10.1 Docs and testing

The API is documented in
Langauge.Haskell.TH -> Querying the compiler.

If we try to test this with our current debug mechanism, then we fail, because reification of course can't work in the IO monad, it can only work in the real Q monad, where we have access to the compiler backend.

> pprintQ $ reify '(+)
Template Haskell error: Can't do `reify' in the IO monad
*** Exception: user error (Template Haskell failure)

Then how do we use the REPL? We have to do a splice, so that reify runs in the Q monad, not in IO.

> import Control.Applicative
> $(pprint <$> reify '(+) >>= runIO . putStrLn >> [| () |])
> $(show <$> reify '(+) >>= runIO . putStrLn >> [| () |])

Exercise: how does this work?

10.2 Exercise

Given these data types:

data Tree a = Branch (Tree a) (Tree a)
data Maybe a = Nothing | Just a

Implement a function that determines if a datatype is data recursive!

So it should return true for Tree, but false for Maybe.

Caveats:

  • the list type is special, so although it's recursive, it's fine if your implementation says false, the same for other tricky builtin types (if any),
  • it's OK if your program only works for the easy cases, we know that the Haskell type system with all the extensions is hairy and complicated.

10.3 Solution

{-# LANGUAGE TemplateHaskell #-}

import Control.Applicative
import Control.Monad
import Language.Haskell.TH

isRecursive :: Name -> Q Bool
isRecursive n = isRecursive' n n

isRecursive' :: Name -> Name -> Q Bool
isRecursive' nOrig nNew = do
  info <- reify nNew
  case info of
    TyConI (NewtypeD _ _ _ cons _) -> checkConstructor nOrig cons
    TyConI (DataD _ _ _ conss _) -> or <$> (sequence $ map (checkConstructor nOrig) conss)
    TyConI (TySynD _ _ typ) -> checkType nOrig typ
    _ -> pure False

checkConstructor :: Name -> Con -> Q Bool
checkConstructor typName (NormalC _ ts) = or <$> (sequence $ map (checkType typName . snd) ts)
checkConstructor typName (RecC _ ts) = or <$> (sequence $ map (\(_, _, x) -> checkType typName x) ts)
checkConstructor typName (InfixC (_, typ1) _ (_, typ2)) = (||) <$> checkType typName typ1
                                                               <*> checkType typName typ2
checkConstructor typName (ForallC _ _ con) = checkConstructor typName con

checkType :: Name -> Type -> Q Bool
checkType typName (ForallT _ _ t) = checkType typName t
checkType typName (AppT t1 t2) = (||) <$> checkType typName t1 <*> checkType typName t2
checkType typName (SigT t _) = checkType typName t
checkType typName (ConT n) | typName == n = return True
                           | otherwise = isRecursive' typName n
checkType _ _ = pure False

11 A bigger example: HFlags

HFlags is a library to help handling command line arguments. It's basically a Haskell knockoff of the superb gflags library by Google.

The main idea is that you can declare command line flags anywhere in any module and HFlags automagically gathers them together in your main module and generates you a parser and --help output.

To implement the trickery we needed additional features in GHC regarding reification around modules and annotations. And just wanted to say that this was my first GHC contribution and it's definitely doable to contribute new features and ideas to GHC. Yes, it takes forever, yes they will try to just ignore you, but if you are persistent and a little bit pushy, it will work out!

HFlags is released on Hackage, it works and it's usable, but hopefully it will see major improvements during the Hackathon!

12 Missing features, current issues

  • No cross-compilation, being solved thanks to Luite (GHCJS)
  • No programmatic export/import of symbols from TH
  • Buggy reification behavior for current module since GHC 7.8
    (with the famous $(return []) workaround)
  • But as I said, patches to GHC are always welcome
    (at least welcome, even if not accepted) :)

13 Thank you!

Questions, comments?