From Term to Script: How PlutusType Drives Plutarch

23 Jun

Introduction

PlutusType forms a critical part of Plutarch: without it, almost no other mechanisms in Plutarch can work. We can clearly see this in our previous article, where in order to talk about PLiftable, we had to assume that all of our examples were already PlutusType instances. Clearly, everything must begin with PlutusType.

At the same time, PlutusType is by far the most cryptic part of Plutarch. Despite its centrality, understanding what PlutusType does and why we need it is far from easy. Due to how closely it ties into implementation choices made as part of Plutarch's design, as well as Plutarch's internals, to understand PlutusType is to understand Plutarch itself: no small task already. However, to make matters even more challenging, due to the fairly advanced uses of Haskell in Plutarch, and the many different ways PlutusType is implemented throughout its codebase, trying to understand PlutusType by way of reading its use sites isn't going to prove particularly enlightening unless you already know what to look for. Lastly, and most confusingly, PlutusType instances are almost never written 'by hand', instead using a range of via-deriving helpers. However, without at least understanding what those helpers are replacing, and why you need to use which one, it is unsurprising that PlutusType is seen as deeply magical and confusing by most application developers who use Plutarch[0].

This article aims to demystify PlutusType for application developers. First, we will dive to the bottom of the pool, starting with what a Plutarch Term 'really is'. We will then discuss PlutusType and its parallels with Monad, introducing the motivations behind PlutusType in the process. To make its usage clear, we will then demonstrate three example PlutusType instances, using the examples from our PLiftable article to illustrate how PlutusType ties into the notion of representation we talked about in that same article. Lastly, we will show how PlutusType instances should be defined in practice, by discussing the various via-derivation helpers provided by Plutarch to make the definition of such instances faster and less tedious.

We recommend that readers be familiar with our PLiftable article before reading this one. The main reason for this is that the notion of representations talked about regarding PLiftable applies to PlutusType as well, though in a more distant way. On top of that, as we are re-using the same examples for this article as for the PLiftable article, the example instances of PlutusType we are about to write will feel more familiar.

'But what is a Term, really?'

Before we can discuss PlutusType at all, we must get to the bottom of a critical, but not clearly connected, component of Plutarch: Term. Term is the glue that holds together the entire language, and thus, by its nature, is quite low-level, fiddly and a bit obscure. We want to note that, for regular Plutarch use by application developers, understanding, or being able to manipulate, Terms at the level we are about to discuss is thankfully not necessary. However, to properly understand PlutusType and its role in the language, we must take a deep dive into Term, its exact definition, meaning and purpose.

Conceptually, Term s a in Plutarch means 'a computation that, when executed, will either produce a result of type a, or an error'[1]. In some ways, this is similar to IO in regular Haskell: just like IO a doesn't necessarily have an a 'inside' it, a Term s a doesn't necessarily have an a 'inside' it either. Instead, Term, like IO, allows us to sequence computations according to certain rules and capabilities. Much like IO, Term is normally kept 'closed' to avoid breaking those rules; however, much like IO, there is a way to 'look under the hood' to see what's really happening, which we will do next.

So, what is a Term, really?

newtype Term (s :: S) (a :: S -> Type) = 
   -- Or, if you prefer, ReaderT Word64 TermMonad TermResult
   Term {asRawTerm :: Word64 -> TermMonad TermResult}

This only raises more questions. What is a TermMonad? What is a TermResult? What does this Word64 mean? To address these, we show the definitions for TermMonad and TermResult:

newtype TermMonad (a :: Type) = 
   TermMonad {runTermMonad :: ReaderT (InternalConfig, Config) (Either Text) a}
   deriving (Functor, Applicative, Monad) via (ReaderT (InternalConfig, Config) (Either Text))
data TermResult = TermResult {
   getTerm :: RawTerm,
   getDeps :: [HoistedTerm]
   }

Thus, 'in reality', we can imagine Term to be:

newtype Term s a = Term (ReaderT (Word64, InternalConfig, Config) (Either Text) (RawTerm, [HoistedTerm]))

Before we go on to what RawTerm and HoistedTerm mean, it's worth considering what we can see already. A Term takes some environment, mostly involving configurations, and then either errors out with a textual error message, or produces some combination of 'raw' and 'hoisted' computations. This essentially makes Term a compilation environment, which is unsurprising given its stated purpose.
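
To make this shape concrete, the following is a minimal sketch in plain Haskell that mimics it with toy stand-ins. Every name in it (ToyRaw, ToyTerm, toyLit, toyDep, toyFail, toyPair) is hypothetical and not part of Plutarch; configurations are left out entirely, and dependencies are modelled as plain names.

```haskell
import Data.Word (Word64)

-- Toy stand-in for RawTerm: just enough structure to combine things.
data ToyRaw
  = ToyLit Integer
  | ToyPair ToyRaw ToyRaw
  deriving (Eq, Show)

-- Toy stand-in for Term: read a level from the environment, then either
-- fail with a message or produce raw code plus named dependencies.
newtype ToyTerm = ToyTerm {runToyTerm :: Word64 -> Either String (ToyRaw, [String])}

-- A literal has no dependencies and cannot fail.
toyLit :: Integer -> ToyTerm
toyLit n = ToyTerm $ \_ -> Right (ToyLit n, [])

-- Record that a term depends on a named (hypothetical) shared computation.
toyDep :: String -> ToyTerm -> ToyTerm
toyDep name t = ToyTerm $ \level -> do
  (raw, deps) <- runToyTerm t level
  Right (raw, name : deps)

-- A term whose code generation always fails.
toyFail :: String -> ToyTerm
toyFail msg = ToyTerm $ \_ -> Left msg

-- Combine two terms: both see the same level, errors short-circuit,
-- and the dependencies of both sides are kept.
toyPair :: ToyTerm -> ToyTerm -> ToyTerm
toyPair l r = ToyTerm $ \level -> do
  (rawL, depsL) <- runToyTerm l level
  (rawR, depsR) <- runToyTerm r level
  Right (ToyPair rawL rawR, depsL <> depsR)
```

In this sketch, pairing anything with toyFail "boom" yields Left "boom" regardless of the level supplied, which is the 'errors out with a textual error message' half of the description above.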

We will skip the exact definitions of InternalConfig and Config, as they don't really matter for our purposes, and move on to the definition of RawTerm. This is where the internals of Plutarch are really laid bare:

data RawTerm
 = RVar Word64
 | RLamAbs Word64 RawTerm
 | RApply RawTerm [RawTerm]
 | RForce RawTerm
 | RDelay RawTerm
 | RConstant (Some (ValueOf PLC.DefaultUni))
 | RBuiltin PLC.DefaultFun
 | RCompiled (UPLC.Term UPLC.DeBruijn UPLC.DefaultUni UPLC.DefaultFun ())
 | RError
 | RHoisted HoistedTerm
 | RPlaceHolder Integer
 | RConstr Word64 [RawTerm]
 | RCase RawTerm [RawTerm]

This would look familiar to anyone who's implemented any kind of DSL in Haskell: it is an abstract syntax tree. In fact, it closely follows another abstract syntax tree, this time from UntypedPlutusCore:

data Term name uni fun ann
   = Var !ann !name
   | LamAbs !ann !name !(Term name uni fun ann)
   | Apply !ann !(Term name uni fun ann) !(Term name uni fun ann)
   | Force !ann !(Term name uni fun ann)
   | Delay !ann !(Term name uni fun ann)
   | Constant !ann !(Some (ValueOf uni))
   | Builtin !ann !fun
   | Error !ann
   | Constr !ann !Word64 ![Term name uni fun ann]
   | Case !ann !(Term name uni fun ann) !(Vector (Term name uni fun ann))

Thus, at the heart of everything, Plutarch Terms are 'stand-ins' for the UPLC they would generate when compiled. We see direct parallels:

RVar gets translated to Var
RLamAbs gets translated to LamAbs
RApply gets translated to Apply
RForce gets translated to Force
RDelay gets translated to Delay
RConstant gets translated to Constant
RBuiltin gets translated to Builtin
RError gets translated to Error
RConstr gets translated to Constr
RCase gets translated to Case

We can see that these 'arms' of RawTerm are structurally similar to their UPLC Term equivalents, even if they track slightly different information. As part of this, we can see why Term requires Word64 in its environment: because UPLC uses De Bruijn indexing, a Term must track how many 'scopes deep' we are in order to be able to compile references to variables correctly. Plutarch uses a Word64 to do this, hence the requirement to have this in our environment[2].
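
The role of the level can be sketched with a toy term type in plain Haskell. This is an illustration under simplifying assumptions, not Plutarch's implementation: the names IxTerm, LevelTerm, lit, app, lam and compileToy are all hypothetical, there is no error handling or hoisting, and indices are 0-based (0 refers to the innermost binder).

```haskell
import Data.Word (Word64)

-- Target syntax using De Bruijn indices (0 = innermost enclosing lambda).
data IxTerm
  = IxVar Word64
  | IxLam IxTerm
  | IxApp IxTerm IxTerm
  | IxLit Integer
  deriving (Eq, Show)

-- A toy term: given how many lambdas enclose us, produce target syntax.
newtype LevelTerm = LevelTerm {runLevel :: Word64 -> IxTerm}

lit :: Integer -> LevelTerm
lit n = LevelTerm (const (IxLit n))

-- Application introduces no binder, so the level is passed on unchanged.
app :: LevelTerm -> LevelTerm -> LevelTerm
app f x = LevelTerm $ \level -> IxApp (runLevel f level) (runLevel x level)

-- Lambdas are written as Haskell functions. The bound variable remembers
-- the level at which it was bound; wherever it is used, its index is the
-- number of binders introduced between its binder and the use site.
lam :: (LevelTerm -> LevelTerm) -> LevelTerm
lam body = LevelTerm $ \level ->
  let var = LevelTerm $ \here -> IxVar (here - (level + 1))
   in IxLam (runLevel (body var) (level + 1))

-- A closed term starts with no enclosing lambdas.
compileToy :: LevelTerm -> IxTerm
compileToy t = runLevel t 0
```

For instance, compileToy (lam (\x -> lam (\_ -> x))) produces IxLam (IxLam (IxVar 1)) in this sketch: the use of x sits one binder away from the lambda that introduced it.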

This leaves a few Plutarch-specific parts:

RCompiled allows direct embedding of a UPLC term into Plutarch without having to 'run it through' Plutarch first;
RHoisted allows us to share computations via 'hoisting' (more on this later);
RPlaceHolder enables branch detection by lookahead.

These parts exist for a combination of convenience and performance: Plutarch could work without them, but would not provide anywhere near the performance guarantees in its generated code that we see.

To finish our description, we need to talk about hoisting. Hoisting is a key technique Plutarch uses to give good performance, as it allows sharing common terms. To see how this works, we need the definition of HoistedTerm:

data HoistedTerm = HoistedTerm (Digest Blake2b_224) RawTerm

We see that this associates a RawTerm with a hash, which is used to uniquely identify it. While the full details of hoisting are somewhat involved, the short version is as follows:

Every time we extend the computation represented by a Term, we track what subcomputations it depends on.
Each time something becomes a subcomputation, we hash it. That hash becomes that computation's unique identifier.
When it comes to generating code, we track all compiled subcomputations by associating them with their hash. If we encounter the same subcomputation again, we don't compile it again; instead, we refer back to it by using a variable.
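
The three steps above can be sketched on a toy expression language. This is only an illustration of the idea, not how Plutarch implements hoisting: the types Expr and Out and the function generate are hypothetical, and a String key stands in for the hash, on the assumption that equal keys always label equal subcomputations.

```haskell
-- Source expressions; Shared marks a subcomputation with an identifying key.
data Expr
  = Lit Integer
  | Add Expr Expr
  | Shared String Expr

-- Generated code; ORef refers back to an already generated subcomputation.
data Out
  = OLit Integer
  | OAdd Out Out
  | ORef String
  deriving (Eq, Show)

-- Generate code for an expression. The result is a list of shared bindings
-- (each generated exactly once, in dependency order) and the main body.
generate :: Expr -> ([(String, Out)], Out)
generate expr =
  let (seen, body) = go [] expr
   in (reverse seen, body)
  where
    go :: [(String, Out)] -> Expr -> ([(String, Out)], Out)
    go seen (Lit n) = (seen, OLit n)
    go seen (Add l r) =
      let (seen1, l') = go seen l
          (seen2, r') = go seen1 r
       in (seen2, OAdd l' r')
    go seen (Shared key body)
      -- Already generated: refer back to it instead of generating it again.
      | any ((== key) . fst) seen = (seen, ORef key)
      -- First encounter: generate it once and remember it under its key.
      | otherwise =
          let (seen1, body') = go seen body
           in ((key, body') : seen1, ORef key)
```

Note that the source expression still contains the shared subcomputation at every use site; only the generated output refers to it by name.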

To be precise, hoisting does not eliminate code duplication in Terms themselves[4]. As per the definition of HoistedTerm, we still have to have the dependent RawTerm available, as when we generate UPLC, we still need something to compile the first time we see it. However, the generated UPLC after we compile our Terms will not be duplicated the same way, as we will not generate code that re-does dependent computations many times.

This nicely brings us around to that mysterious s parameter Terms have. This serves a similar purpose to the s parameter in ST, hence the naming. In ST, the s ensures that we never have situations like this:

-- Extremely bad, as it puts an STRef outside of ST!
runST (newSTRef 10) :: forall s . STRef s Int

This is done by ensuring that runST is not parametric over the s parameter of ST, only the a. This means that we cannot 'leak' mutability outside of ST at all[3].
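
For contrast, a well-behaved use of ST keeps the reference entirely inside the computation and returns only a plain value. The function name sumST below is our own, chosen for illustration:

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Sum a list using a mutable accumulator that never escapes runST.
sumST :: [Int] -> Int
sumST xs =
  runST
    ( do
        ref <- newSTRef 0
        mapM_ (\x -> modifySTRef' ref (+ x)) xs
        -- Only the final Int leaves; the STRef itself stays inside.
        readSTRef ref
    )
```

Because the result type Int does not mention s, this type checks, whereas returning ref itself would not.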

In the context of Terms, in order to support both hoisting and compilation as a whole, we need a notion of closed Terms, which are computations without free variables. This is unsurprising: we cannot give a sensible meaning to a computation containing free variables, and thus, we cannot sensibly compile such a computation either. Thus, the interface to both hoisting and compilation ensures that we cannot be parametric in s, which means that Term also cannot 'leak':

compile :: forall (a :: S -> Type) . Config -> (forall (s :: S) . Term s a) -> Either Text Script
phoistAcyclic :: forall (a :: S -> Type) (s :: S) . HasCallStack => (forall (s' :: S) . Term s' a) -> Term s a

compile looks similar to runST, in that it eliminates the s completely. phoistAcyclic is more interesting, as here, we must show that the argument computation does not depend on the result at all: this is why phoistAcyclic is polymorphic in s (the variable in the result Term), but not s' (the variable in the argument Term). This allows us to hoist safely as long as these rules are never broken; thus, Term, like ST, must remain 'closed'.

Putting all the above together, we now see what exactly a Term is. Specifically, it is a code generation computation that takes a De Bruijn 'level' and a configuration, and then produces one of:

A compilation error; or
A RawTerm, together with a list of its hoisted dependencies.

Knowing this, we can now discuss PlutusType and its implementation.

The PlutusType type class

Before we discuss PlutusType and its role, we need to make one observation about Term, which at first seems surprising. As we mentioned previously, Term represents computations, in a similar way to IO or ST. We also previously mentioned that IO and ST are 'closed' types: we do not have access to their internals, and must instead operate them via an interface. The most important such interface happens to be Monad[6], and unsurprisingly, both IO and ST are instances of Monad. However, Term cannot be a Monad; in fact, it cannot even be a Functor. This is no accident: in general, it makes no sense to talk about a Term s a where a is an arbitrary type, as there may be no sensible onchain computation for a. Partly for this reason, Plutarch tends to operate over types of kind S -> Type, rather than Type, as it lets us control what can, and can't, be 'lifted' into Terms.

However, if Plutarch is to be a useful DSL for onchain scripts, we must have capabilities similar to that of Monad for Terms. Without this, we wouldn't be able to sequence computations at all, which would render Plutarch unusable. It is exactly these capabilities that PlutusType provides. Essentially, for a given type a of kind S -> Type, a PlutusType instance specifies:

What result onchain a term of type a corresponds to; and
How to sequence computations that produce an a with other computations inside Terms.

This is why we need PlutusType: without it, we would have no way of relating our Haskell-level definitions to actual onchain computations, or indeed of combining computations at all. It is also why we had to spend time discussing Term, as without understanding Term, it is hard to see the need for PlutusType.

To help us understand PlutusType, it helps to remind ourselves of the interface provided by Monad. Our version is not the same as the one provided by base, but it serves our needs, and has a simpler presentation:

class Monad (m :: Type -> Type) where
   pure :: forall a . a -> m a
   bind :: forall a b . m a -> (a -> m b) -> m b

We can see that Monad is instantiated for computation environments which describe some kind of effect. The interface specifies how, given any kind of result, we can 'lift' that result into the computation environment, as well as how to sequence computations in that environment, regardless of what results they have. To ensure sensible behaviour, Monad also follows several laws:

bind (pure x) f = f x
bind x pure = x
bind x (\x' -> bind (f x') g) = bind (bind x f) g

The first law states that pure cannot introduce any effects. The second law states that bind can only introduce the effects its function argument produces. The third law shows how bind composes. With these laws alone, we can safely use any Monad instance as part of abstractions, without concern for precisely how pure and bind may be implemented for it. This enables the rich functionality found in Control.Monad, as well as many other modules and libraries.
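
As a small sanity check, the three laws can be tested on sample values for Maybe. The class below mirrors the simplified interface above, with the names MonadLite, pureL and bindL chosen by us to avoid clashing with the Prelude; the helper functions halve and decrement are likewise only for illustration.

```haskell
-- The simplified interface from above, renamed to avoid Prelude clashes.
class MonadLite m where
  pureL :: a -> m a
  bindL :: m a -> (a -> m b) -> m b

instance MonadLite Maybe where
  pureL = Just
  bindL Nothing _ = Nothing
  bindL (Just x) f = f x

-- Two effectful (possibly failing) steps to sequence.
halve :: Int -> Maybe Int
halve n
  | even n = Just (n `div` 2)
  | otherwise = Nothing

decrement :: Int -> Maybe Int
decrement n
  | n > 0 = Just (n - 1)
  | otherwise = Nothing

-- bind (pure x) f = f x
leftIdentity :: Int -> Bool
leftIdentity x = bindL (pureL x) halve == halve x

-- bind x pure = x
rightIdentity :: Maybe Int -> Bool
rightIdentity m = bindL m pureL == m

-- bind x (\x' -> bind (f x') g) = bind (bind x f) g
associativity :: Maybe Int -> Bool
associativity m =
  bindL m (\x' -> bindL (halve x') decrement) == bindL (bindL m halve) decrement
```

Checking a handful of inputs does not prove the laws, but it shows what each equation demands of an instance.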

Our needs for PlutusType are somewhat different. The computation environment is fixed in our case, as it is always Term. Instead, we need to specify for a particular result, how it can be lifted into a Term, as well as how computations that produce that result can be sequenced with other Term computations. This gives us the following[7]:

class PlutusType (a :: S -> Type) where
   type PInner a :: S -> Type
   pcon' :: forall (s :: S) . a s -> Term s (PInner a)
   pmatch' :: forall (s :: S) (b :: S -> Type) . Term s (PInner a) -> (a s -> Term s b) -> Term s b

Disregarding PInner for the moment, we can see that in many ways, PlutusType mirrors Monad. Indeed, pcon' looks a lot like pure, and pmatch' looks a lot like bind. The biggest difference is the change in focus: whereas for Monad, the environment is the type of interest, and the results are arbitrary, for PlutusType, the environment is fixed, and the result is the type of interest.

As PlutusType is analogous to Monad in a way, its laws must be similarly analogous to those of Monad. Indeed, this is the case:

pmatch' (pcon' x) f = f x
pmatch' x pcon' = x
pmatch' x (\x' -> pmatch' (f x') g) = pmatch' (pmatch' x f) g

This has an interesting implication regarding pcon' specifically, as it means that pcon' cannot introduce a Term that errors. This means that computations that are representable must be valid for anything that is an instance of PlutusType.

Lastly, we need to consider PInner. Conceptually, this is the 'underlying form' of the type, or put differently, a Plutarch type to which we can coerce without loss. The reason we need this notion is twofold[11]:

For types that represent computations with Data, allowing definitions of pcon' and pmatch' against PData; and
For newtypes wrapping Terms, allowing definitions of pcon' and pmatch' that only unwrap newtypes and do nothing more.

These are rather particular to both the way the onchain environment works and how Plutarch chooses to interact with Haskell. In practice, we rarely need to concern ourselves with PInner, as in almost any situation we are likely to see as application developers, PInner will be one of three things:

The underlying type of a newtype
PData
POpaque

To that end, we don't typically use pcon' and pmatch' in application code, preferring instead to use their more accessible forms below:

-- The definitions don't matter here
pcon :: forall (a :: S -> Type) (s :: S) . PlutusType a => a s -> Term s a
pmatch :: forall (a :: S -> Type) (s :: S) (b :: S -> Type) . PlutusType a => Term s a -> (a s -> Term s b) -> Term s b

These functions call pcon' and pmatch' underneath but hide PInner from us so we don't have to concern ourselves with it.

PlutusType versus PLiftable

At this stage, it is worth bringing up the related notion of representation, and the PLiftable type class that embodies it. PlutusType and PLiftable appear to overlap: both seem to relate to how a definition in Haskell translates to Plutus and the chain. Then why have both?

The answer lies in what aspect of this translation we focus on: PlutusType is concerned with how a given Haskell definition fits into Plutarch as such, whereas PLiftable concerns itself with how specifically the Haskell, Plutarch and Plutus 'universes' connect. Thus, PlutusType must concern itself with representations at least somewhat, as ultimately, any computation result onchain must be represented onchain somehow. However, most of the concerns about the specific connections between the 'universes' are left in PLiftable, allowing PlutusType to focus more on being an analogy to Monad in the context of Terms. This separation allows us not only more focused instances and simpler laws, but also the possibility that a type can be used in Plutarch Terms, but lacks any representation onchain or in the Haskell 'universe'.

It might seem like an odd thing for us to want: why would we possibly want something that has no onchain representation, but yet can inhabit Plutarch Terms and be sequenced in them? However, there is at least one case where we need this separation: the Plutarch function type :-->. We definitely want to be able to have Plutarch function-typed Terms, or we wouldn't be able to do anything. Yet, an instance of PLiftable for a :--> b poses significant problems. To see why, consider what AsHaskell (a :--> b) and PlutusRepr (a :--> b) would have to be - no choice we could make works!

Such a claim requires some examination. Even if we assume PLiftable instances for a and b, the only possible choice for AsHaskell (a :--> b) would be AsHaskell a -> AsHaskell b. Given this, no matter what we chose for PlutusRepr (a :--> b), the PLiftable instance in question would have to, given any function between any types that have Plutus 'universe' representations, construct something in the Plutus 'universe' that corresponds to that function in a reversible way. This idea falls apart at the first hurdle if we consider ByteString, as its Haskell 'universe' capabilities are far greater than those of its Plutus 'universe' equivalent, by design. Thus, a general instance of PLiftable (a :--> b) is completely impossible.

Thus, for these two reasons, PLiftable and PlutusType are kept separate. At the same time, we have to keep representations in mind with PlutusType as we will show in our upcoming examples.

Examples

To help see how PlutusType instances work in practice, as well as how they connect to representations, we will redo our three examples from the PLiftable article. In order, these types will use the builtin, SOP and Data representations. For clarity, we will repeat their definitions as they come up.

Builtin representation

For builtin representations, the instance is straightforward:

newtype PByteVector (s :: S) = PByteVector (Term s PByteString)
instance PlutusType PByteVector where
   type PInner PByteVector = PByteString
   -- expanded type signatures for clarity
   -- pcon' :: forall s . PByteVector s -> Term s PByteString
   pcon' (PByteVector t) = t
   -- pmatch' :: forall s b. Term s PByteString -> (PByteVector s -> Term s b) -> Term s b 
   pmatch' t f = f (PByteVector t)

Our goal is to delegate the PlutusType implementation to the type we're newtypeing around. Here, we see one of the use cases for PInner, as it enables exactly this kind of delegation. This is quite similar to how PLiftable works for such definitions. Thus, the method is entirely cookbook:

Set PInner to whatever type we are wrapping
Have pcon' unwrap the newtype
Have pmatch' rewrap the newtype before applying the function argument

If our goal is a builtin representation, this is the approach we take for defining PlutusType instances[5].

SOP representation

Next, we can consider a SOP-represented type:

data PThese (a :: S -> Type) (b :: S -> Type) (s :: S) = 
   PThis (Term s a) |
   PThat (Term s b) |
   PThese (Term s a) (Term s b)

As we want PThese to have an SOP representation, we must ensure that the implementation for pcon' and pmatch' produces the right UPLC primitives. This means we must build up Terms by hand, which will put our knowledge of Term to the test.

First, we must decide what PInner (PThese a b) should be. Unlike for PByteVector, this choice is far from obvious, as SOP representations ultimately fall back to SOP primitives. For this (and similar) cases, Plutarch provides a type corresponding to 'some onchain thing', called POpaque. You cannot do very much with this type (safely): it is, in some sense, a 'most general type', similar to Any in Haskell. However, in our case, due to our limited setting, we can be sure that we never do anything inappropriate with POpaques handed to us as part of this instance's methods.

This gives us the following initial implementation:

instance PlutusType (PThese a b) where
   type PInner (PThese a b) = POpaque
   -- expanded type signatures for clarity
   -- pcon' :: forall s . PThese a b s -> Term s POpaque
   pcon' = _
   -- pmatch' :: forall s b . Term s POpaque -> (PThese a b s -> Term s b) -> Term s b
   pmatch' = _

To define pcon', we must determine how to take a Haskell-level representation of a Plutarch computation, and lift it into a Term that, when computed, will produce 'some onchain thing'. We first start with a case analysis:

-- Repeated for clarity
instance PlutusType (PThese a b) where
   type PInner (PThese a b) = POpaque
   -- expanded type signatures for clarity
   -- pcon' :: forall s . PThese a b s -> Term s POpaque
   pcon' = \case
       PThis t1 -> _
       PThat t2 -> _
       PThese t1 t2 -> _
   -- pmatch' :: forall s b . Term s POpaque -> (PThese a b s -> Term s b) -> Term s b
   pmatch' = _

As our goal is to use an SOP representation, we must work in the RConstr constructor of RawTerm, as this represents SOP introductions. If we examine the RConstr data constructor, we see it has two arguments:

A Word64, corresponding to a data constructor index in a sum (or an 'arm'); and
A list of RawTerms, corresponding to the fields of the specified constructor.

The choice of constructor indices is quite important, as without the right choices here, we can't get pmatch' to work correctly. Specifically, constructor indices must begin at 0 and must be consecutive. While we can order the indices whichever way we like, it's easiest to assign index 0 to the first data constructor listed in PThese's data type declaration, and then 'count up' from there. Thus, we will use 0 for PThis, 1 for PThat and 2 for PThese.

The RawTerms that go into the list argument for RConstr in each case must come from t1 or t2, depending on which 'arm' of PThese we're in. However, t1 and t2 are Terms, not RawTerms, and thus, our new Term needs to take into account their respective environments and dependencies and sequence the computations properly inside of the TermMonads they wrap.

If we consider the different parts of the environment of t1 and t2, we can see that we don't really have to do much with either of the environments:

The De Bruijn level does not need to change, as we do not introduce any lambdas;
The configuration can stay exactly the same, as we need not change anything (and in fact, very much shouldn't).

At the same time, we need to make sure that our result Term has all the dependencies of any Term whose RawTerm components we wish to embed. This makes sure that hoisting can work correctly. Thus, to define the rest of pcon', we must do the following for each match in the case statement:

Create a fresh Term wrapper, using a lambda to name the De Bruijn level argument from the environment.
For each field in the 'arm' of PThese we are currently in, we must 'unpack' the Term using asRawTerm together with the named level argument from step 1.
Collect together all the dependencies of 'unpacked' Terms in step 2.
Build a TermResult using the RawTerms from step 2 and dependencies from step 3, and putting them into an RConstr with the appropriate index.

This gives us the following:

-- Repeated for clarity
instance PlutusType (PThese a b) where
   type PInner (PThese a b) = POpaque
   -- expanded type signatures for clarity
   -- pcon' :: forall s . PThese a b s -> Term s POpaque
   pcon' = \case
       PThis t1 -> Term $ \level -> do
                             TermResult rawT1 depsT1 <- asRawTerm t1 level
                             pure . TermResult (RConstr 0 [rawT1]) $ depsT1
       PThat t2 -> Term $ \level -> do
                             TermResult rawT2 depsT2 <- asRawTerm t2 level
                             pure . TermResult (RConstr 1 [rawT2]) $ depsT2
       PThese t1 t2 -> Term $ \level -> do
                                 TermResult rawT1 depsT1 <- asRawTerm t1 level
                                 TermResult rawT2 depsT2 <- asRawTerm t2 level
                                 pure . TermResult (RConstr 2 [rawT1, rawT2]) $ depsT1 <> depsT2
   -- pmatch' :: forall s b . Term s POpaque -> (PThese a b s -> Term s b) -> Term s b
   pmatch' = _

We see here that each 'arm' of PThese is just an application of the four-step process we gave above.

Implementing pmatch' is a bit more complicated. Before we begin, it is worth looking at the RCase data constructor. Its first field is a RawTerm, which is intended to be the SOP value to scrutinize; its second field is a list of RawTerms, intended as 'handlers' for each possible 'arm' of the SOP value we are scrutinizing. More specifically, the 'arm' given index i when using RConstr will be resolved using the 'handler' at position i in the list given to RCase: this is why our choice of indexes when defining pcon' is important.
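
The relationship between constructor indices and handler positions can be sketched with a toy evaluator. This is a simplified model written for this article, not the evaluator used onchain: the types Code and Value and the functions eval and apply are hypothetical, variables use 0-based De Bruijn indices, and costs, builtins and delays are ignored.

```haskell
import Control.Monad (foldM)
import Data.Word (Word64)

data Code
  = Lit Integer
  | Var Word64 -- De Bruijn index, 0 = innermost binder
  | Lam Code
  | App Code Code
  | Constr Word64 [Code]
  | Case Code [Code]
  deriving (Eq, Show)

data Value
  = VInt Integer
  | VClosure [Value] Code
  | VConstr Word64 [Value]
  deriving (Eq, Show)

eval :: [Value] -> Code -> Either String Value
eval _ (Lit n) = Right (VInt n)
eval env (Var ix) = case drop (fromIntegral ix) env of
  (v : _) -> Right v
  [] -> Left "unbound variable"
eval env (Lam body) = Right (VClosure env body)
eval env (App f x) = do
  fv <- eval env f
  xv <- eval env x
  apply fv xv
eval env (Constr ix fields) = VConstr ix <$> traverse (eval env) fields
eval env (Case scrutinee handlers) = do
  sv <- eval env scrutinee
  case sv of
    -- The constructor index selects the handler at the same list position;
    -- the handler is then applied to the fields, one at a time, in order.
    VConstr ix fields -> case drop (fromIntegral ix) handlers of
      (handler : _) -> do
        hv <- eval env handler
        foldM apply hv fields
      [] -> Left "no handler for this constructor index"
    _ -> Left "scrutinee is not a constructor value"

apply :: Value -> Value -> Either String Value
apply (VClosure env body) arg = eval (arg : env) body
apply _ _ = Left "applied a non-function"

-- Handlers for a These-like sum, listed in index order 0, 1, 2.
theseHandlers :: [Code]
theseHandlers =
  [ Lam (Var 0) -- index 0: one field, return it
  , Lam (Lit 0) -- index 1: one field, ignore it
  , Lam (Lam (Var 0)) -- index 2: two fields, return the second
  ]
```

In this model, a scrutinee built with index 2 and two fields runs the third handler, and an index with no handler at that position is an error, which is why the indices must be gap-free and agree between construction and matching.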

To see what we need to provide in a more familiar context, consider this function from Data.These:

these :: forall a b c . (a -> c) -> (b -> c) -> (a -> b -> c) -> These a b -> c
these handleThis handleThat handleThese = \case
   This x -> handleThis x
   That y -> handleThat y
   These x y -> handleThese x y

What we want to supply to RCase is effectively the RawTerm equivalent to the arguments provided above. The main differences are that, instead of providing a literal PThese or literal Haskell (or rather, Plutarch) functions, we want to supply the RawTerm equivalents of the same, and in a slightly different order. We also have to ensure that hoisting dependencies are handled properly as we do so.

Let's begin by defining the 'handlers', and constructing their relevant RawTerms and dependencies. For this purpose, we have the (PThese a b s -> Term s b) argument to pmatch', but as we can see, this produces a Term. Thus, we have to use a combination of plam and asRawTerm, inside a freshly produced term with a lambda naming the De Bruijn level:

-- Repeated for clarity
instance PlutusType (PThese a b) where
   type PInner (PThese a b) = POpaque
   -- expanded type signatures for clarity
   -- pcon' :: forall s . PThese a b s -> Term s POpaque
   pcon' = \case
       PThis t1 -> Term $ \level -> do
                             TermResult rawT1 depsT1 <- asRawTerm t1 level
                             pure . TermResult (RConstr 0 [rawT1]) $ depsT1
       PThat t2 -> Term $ \level -> do
                             TermResult rawT2 depsT2 <- asRawTerm t2 level
                             pure . TermResult (RConstr 1 [rawT2]) $ depsT2
       PThese t1 t2 -> Term $ \level -> do
                                 TermResult rawT1 depsT1 <- asRawTerm t1 level
                                 TermResult rawT2 depsT2 <- asRawTerm t2 level
                                 pure . TermResult (RConstr 2 [rawT1, rawT2]) $ depsT1 <> depsT2
   -- pmatch' :: forall s b . Term s POpaque -> (PThese a b s -> Term s b) -> Term s b
   pmatch' t f = Term $ \level -> do
                   TermResult handleThis depsThis <- asRawTerm (plam $ \x -> f (PThis x)) level
                   TermResult handleThat depsThat <- asRawTerm (plam $ \y -> f (PThat y)) level
                   TermResult handleThese depsThese <- asRawTerm (plam $ \x y -> f (PThese x y)) level
                   _

It is worth looking at what we just did more closely. Instead of trying to manually juggle the De Bruijn levels required to build up Plutarch lambdas ourselves, we use Plutarch itself to construct the Terms, then 'unpack' them immediately using the De Bruijn level argument we just named. We use f inside these new handlers, together with the appropriate 'wrapper' in the form of a PThese data constructor, to always produce a Term s b as a result. This works because De Bruijn indices indicate how many enclosing scopes away an argument comes from: this allows us to plam fearlessly, as long as we don't use any arguments from outside scopes. As we don't ever need to do this in a handler, this saves us a lot of time and confusion.

The next steps are similar to what we had to do for pcon': 'unpack' t, collect together all dependencies (including the dependencies of every handler), then build an RCase inside a TermResult, together with our collected dependencies:

-- Repeated for clarity
instance PlutusType (PThese a b) where
   type PInner (PThese a b) = POpaque
   -- expanded type signatures for clarity
   -- pcon' :: forall s . PThese a b s -> Term s POpaque
   pcon' = \case
       PThis t1 -> Term $ \level -> do
                             TermResult rawT1 depsT1 <- asRawTerm t1 level
                             pure . TermResult (RConstr 0 [rawT1]) $ depsT1
       PThat t2 -> Term $ \level -> do
                             TermResult rawT2 depsT2 <- asRawTerm t2 level
                             pure . TermResult (RConstr 1 [rawT2]) $ depsT2
       PThese t1 t2 -> Term $ \level -> do
                                 TermResult rawT1 depsT1 <- asRawTerm t1 level
                                 TermResult rawT2 depsT2 <- asRawTerm t2 level
                                 pure . TermResult (RConstr 2 [rawT1, rawT2]) $ depsT1 <> depsT2
   -- pmatch' :: forall s b . Term s POpaque -> (PThese a b s -> Term s b) -> Term s b
   pmatch' t f = Term $ \level -> do
                   TermResult handleThis depsThis <- asRawTerm (plam $ \x -> f (PThis x)) level
                   TermResult handleThat depsThat <- asRawTerm (plam $ \y -> f (PThat y)) level
                   TermResult handleThese depsThese <- asRawTerm (plam $ \x y -> f (PThese x y)) level
                   TermResult rawT depsT <- asRawTerm t level
                   let allDeps = depsThis <> depsThat <> depsThese <> depsT
                   pure . TermResult (RCase rawT [handleThis, handleThat, handleThese]) $ allDeps

We note here that the exact order in which we combine the hoisting dependencies isn't important as long as we don't skip any. Furthermore, duplicate dependencies are not a problem either: as long as we indicate that we have a dependency, having it multiple times poses no issues.

Data representation

Lastly, let us look at a Data representation type:

data PTheseData (a :: S -> Type) (b :: S -> Type) (s :: S) = 
   PDThis (Term s (PAsData a)) |
   PDThat (Term s (PAsData b)) |
   PDThese (Term s (PAsData a)) (Term s (PAsData b))

Fortunately, for Data-represented types such as these, we have an easier time: we don't have to use the underlying representation of Term, as we did for SOP representations. Here, we make use of the other purpose of PInner by setting PInner (PTheseData a b) to PData. This gives us the following initial implementation:

instance PlutusType (PTheseData a b) where
   type PInner (PTheseData a b) = PData
   -- expanded type signatures for clarity
   -- pcon' :: forall s . PTheseData a b s -> Term s PData
   pcon' = _
   -- pmatch' :: forall s b . Term s PData -> (PTheseData a b s -> Term s b) -> Term s b
   pmatch' = _

By convention, sum types like PTheseData are represented using the Constr data constructor of Data, with a unique Integer index for each 'arm', and their fields, encoded as data, packed into a list. While this is not always the most efficient encoding, we will follow this convention here.

In order to build the right PData, we need to use pconstrData, together with pforgetData. This can then be made into PData with punsafeCoerce, which, despite its name, is safe here. Furthermore, we have to convert the PAsData fields of each 'arm' back into PData using pforgetData. Otherwise, the process is quite similar to what we defined previously for the non-Data version:

-- Repeated for clarity
instance PlutusType (PTheseData a b) where
   type PInner (PTheseData a b) = PData
   -- expanded type signatures for clarity
   -- pcon' :: forall s . PTheseData a b s -> Term s PData
   pcon' = punsafeCoerce . pforgetData . \case
       PDThis t1 -> pconstrData # 0 # pcon (PCons (pforgetData t1) PNil)
       PDThat t2 -> pconstrData # 1 # pcon (PCons (pforgetData t2) PNil)
       PDThese t1 t2 -> pconstrData # 2 # pcon (PCons (pforgetData t1) (PCons (pforgetData t2) PNil))
   -- pmatch' :: forall s b . Term s PData -> (PTheseData a b s -> Term s b) -> Term s b
   pmatch' = _

What about pmatch'? Here, we are taking apart PData with increasing assumptions. We know that we must have a Constr (as we are the only ones who could have put it there), but we need to check the index to determine what 'arm' we're in. Furthermore, we also know that once we've established our 'arm', the field(s) must be valid a or b, wrapped in PAsData. Once all this is established, we construct the appropriate PTheseData, and feed it to the function argument of pmatch' to finish the process:

-- Repeated for clarity
instance PlutusType (PTheseData a b) where
   type PInner (PTheseData a b) = PData
   -- expanded type signatures for clarity
   -- pcon' :: forall s . PTheseData a b s -> Term s PData
   pcon' = punsafeCoerce . pforgetData . \case
       PDThis t1 -> pconstrData # 0 # pcon (PCons (pforgetData t1) PNil)
       PDThat t2 -> pconstrData # 1 # pcon (PCons (pforgetData t2) PNil)
       PDThese t1 t2 -> pconstrData # 2 # pcon (PCons (pforgetData t1) (PCons (pforgetData t2) PNil))
   -- pmatch' :: forall s b . Term s PData -> (PTheseData a b s -> Term s b) -> Term s b
   pmatch' t f = pmatch (pasConstrData # t) $ \p -> 
                   plet (pfstBuiltin # p) $ \i -> 
                     plet (psndBuiltin # p) $ \xs -> 
                       plet (phead # xs) $ \h -> 
                         pif (i #== 0)
                             (f $ PDThis (punsafeCoerce @(PAsData a) h))
                             (pif (i #== 1)
                                  (f $ PDThat (punsafeCoerce @(PAsData b) h))
                                  (f $ PDThese (punsafeCoerce @(PAsData a) h)
                                               (punsafeCoerce @(PAsData b) $ phead #$ ptail # xs)))

Once again, we use punsafeCoerce here, this time to transform the list elements directly into their corresponding type. This is once again safe: the only PData we will see in pmatch' is whatever we put there with pcon', and thus, we don't have to worry about it being wrong.
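
The safety argument rests on a round-trip property: whatever pcon' builds, pmatch' can take apart again. The following sketch models this in plain Haskell, outside of Plutarch. The Data type here is a cut-down, hypothetical version with only the constructors we need, and These fixes both fields to Integer to keep things short. Unlike the onchain version, decode is checked and returns Nothing on unexpected input, which lets us state the property directly:

```haskell
-- A cut-down model of Plutus Data; only the constructors we need here.
data Data
  = Constr Integer [Data]
  | I Integer
  deriving (Eq, Show)

-- Plain Haskell analogue of PTheseData, with both fields fixed to Integer.
data These = This Integer | That Integer | These Integer Integer
  deriving (Eq, Show)

-- Analogue of pcon': choose the index of the 'arm', pack fields into a list.
encode :: These -> Data
encode (This x) = Constr 0 [I x]
encode (That y) = Constr 1 [I y]
encode (These x y) = Constr 2 [I x, I y]

-- Analogue of pmatch', but checked: where the onchain version assumes the
-- input is valid, this version returns Nothing on anything unexpected.
decode :: Data -> Maybe These
decode (Constr 0 [I x]) = Just (This x)
decode (Constr 1 [I y]) = Just (That y)
decode (Constr 2 [I x, I y]) = Just (These x y)
decode _ = Nothing

-- The property that justifies skipping the checks onchain.
roundTrips :: These -> Bool
roundTrips v = decode (encode v) == Just v
```

Since decode only ever fails on inputs that encode cannot produce, dropping the checks (as pmatch' does with punsafeCoerce) loses nothing as long as every input came from the matching encoder.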

Helpers from Plutarch

Much like for PLiftable instances, the code for PlutusType instances is mechanical, repetitive and low-level. If we had to write PlutusType instances the way we just did every time, Plutarch would be quite hard to use, and unnecessarily so. Thus, for the same reasons as for PLiftable, Plutarch provides a range of via-deriving helpers to automatically derive PlutusType.

As part of the PLiftable article, we already saw three common via-deriving helpers, each designed for one of the three representations. As these are common choices for many types, we will discuss them first; afterwards, we will show other, more focused helpers that are better in some circumstances.

The first of these is DeriveNewtypePlutusType. We can replace our (admittedly short) instance of PlutusType from before with a few derivations:

newtype PByteVector (s :: S) = PByteVector (Term s PByteString)
   deriving stock (Generic)
   deriving anyclass (SOP.Generic)
   deriving PlutusType via (DeriveNewtypePlutusType PByteString)

We use Generic to allow a derivation of Generic from generics-sop[8]. This allows us to via-derive PlutusType from a helper newtype, which we will see is a common pattern. DeriveNewtypePlutusType will effectively generate the same code as we did in our manual instance, involving manually unwrapping and rewrapping the newtype. Furthermore, the derivation will check whether we are actually a newtype over the declared type or not, which can be helpful to protect you from accidental representation changes.
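
As a rough analogy in plain Haskell (not Plutarch's implementation), the unwrap and rewrap steps look like the sketch below. Bytes and ByteVector are hypothetical stand-ins for PByteString and PByteVector. The coerce version only typechecks because ByteVector really is a newtype over Bytes, which mirrors the kind of check the helper performs:

```haskell
import Data.Coerce (coerce)

-- Hypothetical stand-in for the inner type (PByteString in the article).
newtype Bytes = Bytes [Int]
  deriving (Eq, Show)

-- Hypothetical stand-in for the wrapper (PByteVector in the article).
newtype ByteVector = ByteVector Bytes
  deriving (Eq, Show)

-- Analogue of pcon': unwrap the newtype to get at the inner value.
toInner :: ByteVector -> Bytes
toInner (ByteVector b) = b

-- Analogue of pmatch': rewrap the inner value, then hand it to the
-- continuation.
withByteVector :: Bytes -> (ByteVector -> r) -> r
withByteVector b k = k (ByteVector b)

-- The same unwrapping via coerce. This compiles only because ByteVector
-- is a newtype over Bytes; change the wrapped type and it is rejected.
toInnerCoerced :: ByteVector -> Bytes
toInnerCoerced = coerce
```

Both toInner and toInnerCoerced do no work at runtime, which is why a newtype representation costs nothing beyond what its inner type costs.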

Next, we will consider DeriveAsSOPStruct, which can be used to derive a PlutusType instance for anything that you want to use a SOP representation for. Once again, the derivation becomes easy:

data PThese (a :: S -> Type) (b :: S -> Type) (s :: S) = 
   PThis (Term s a) |
   PThat (Term s b) | 
   PThese (Term s a) (Term s b)
   deriving stock (Generic)
   deriving anyclass (SOP.Generic)
deriving via DeriveAsSOPStruct (PThese a b) instance PlutusType (PThese a b)

We use a standalone derivation here, but the intent is the same. The code that will be generated will look identical to our example, but for clarity, DeriveAsSOPStruct will do the following:

Assign a consecutive index to each 'arm' of the type definition, starting at 0. For PThese, it will assign PThis an index of 0, PThat an index of 1, and PThese an index of 2.
For pcon', it will construct Terms which 'unpack' all fields into their RawTerms and dependencies, then build an RConstr using the index corresponding to the 'arm' we are in, and the RawTerms from the fields, along with the union of all their dependencies.
For each of the 'arms', pmatch' will construct a 'handler' function which takes an argument for each field, then packs them into the 'arm' and calls the function argument to pmatch' on them. It will then unpack the Term argument, collect all dependencies, and build a Term calling RCase. The 'handler' RawTerms will be put in the same order as the indexes they correspond to.
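
The relationship between indices and handler order can be sketched in plain Haskell. This is a deliberately simplified model, not Plutarch's RawTerm: Raw, Handler and rcase are hypothetical names, and handlers are ordinary Haskell functions rather than compiled lambdas. What it shows is that the handler chosen is the one whose position matches the constructor index:

```haskell
-- Hypothetical, heavily simplified stand-in for a raw term.
data Raw
  = RInt Integer
  | RConstr Int [Raw]
  deriving (Eq, Show)

-- A 'handler' receives the fields of one 'arm' and produces a result.
type Handler r = [Raw] -> r

-- Stand-in for evaluating a case expression: select the handler whose
-- position equals the constructor index, then apply it to the fields.
-- Non-constructors and out-of-range indices give Nothing.
rcase :: Raw -> [Handler r] -> Maybe r
rcase (RConstr ix fields) handlers
  | ix >= 0 && ix < length handlers = Just ((handlers !! ix) fields)
rcase _ _ = Nothing

-- Handlers listed in the same order as the indices assigned to the 'arms'.
handlers :: [Handler String]
handlers =
  [ const "PThis"  -- index 0
  , const "PThat"  -- index 1
  , const "PThese" -- index 2
  ]

whichArm :: Raw -> Maybe String
whichArm raw = rcase raw handlers
```

Swapping two entries of handlers would silently route one 'arm' to another's code, which is one reason to let the derivation maintain this ordering.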

The last of the three 'typical' helpers is DeriveAsDataStruct. The name similarity to DeriveAsSOPStruct is intentional, as the behaviour is quite similar, as is its use. Our example would look like this:

data PTheseData (a :: S -> Type) (b :: S -> Type) (s :: S) = 
   PDThis (Term s (PAsData a)) |
   PDThat (Term s (PAsData b)) |
   PDThese (Term s (PAsData a)) (Term s (PAsData b))
   deriving stock (Generic)
   deriving anyclass (SOP.Generic)
deriving via DeriveAsDataStruct (PTheseData a b) instance PlutusType (PTheseData a b)

The code that gets generated will, once again, be identical to what we wrote manually, but again, for clarity, DeriveAsDataStruct-based derivations will do the following:

Assign a consecutive index to each 'arm' of the type definition, starting at 0.
For pcon', it will call pconstrData with the index of the 'arm' we are in, and a list of the fields, all of which have pforgetData called on them.
For pmatch', it will unpack to a Constr, then check the index against all available 'arm' indices. When it finds a match, it will unpack the list of fields, punsafeCoerce them back into the right type, then construct a value and give it to the function argument.

These three via-deriving helpers are likely enough to define PlutusType in almost all cases. This alone is much easier and less error-prone than the manual definitions we wrote above. There is also an added advantage: we cannot accidentally use the wrong helper, as the helpers check whether what we're asking for is even possible. For example, if you tried to use DeriveAsDataStruct on PThese, it would error, saying that your fields are not Data-represented.

However, these are not the only available options for automatically deriving PlutusType instances. Plutarch provides several others for more specific cases, usually for reasons of efficiency. We will examine two of these, and discuss when you may want to use them.

The first such helper we will look at is DeriveAsDataRec[9]. This helper is designed for Data-represented types that look like records: that is, a single 'arm' with some fields. Because there is only one possible data constructor, we can encode such types as Plutus lists instead of Constr. This leads to a smaller encoding (no need for a constructor index), and also less code to work with such a type (as we don't have to check the index).
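
The difference between the two encodings can be sketched in plain Haskell. As before, Data is a cut-down hypothetical model, and Point is an illustrative record type that does not appear elsewhere in this article:

```haskell
-- A cut-down model of Plutus Data; only the constructors we need here.
data Data
  = Constr Integer [Data]
  | List [Data]
  | I Integer
  deriving (Eq, Show)

-- Hypothetical record-like type: a single 'arm' with two fields.
data Point = Point Integer Integer
  deriving (Eq, Show)

-- The general convention: a Constr with an index that carries no
-- information, as there is only one 'arm'.
encodeAsConstr :: Point -> Data
encodeAsConstr (Point x y) = Constr 0 [I x, I y]

-- The record-style encoding: just the list of fields.
encodeAsList :: Point -> Data
encodeAsList (Point x y) = List [I x, I y]

-- Decoding the list form needs no index check at all.
decodeFromList :: Data -> Maybe Point
decodeFromList (List [I x, I y]) = Just (Point x y)
decodeFromList _ = Nothing
```

The list form drops the index on the way in and the index comparison on the way out, which is where the size and code savings described above come from.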

The second is DeriveAsTag. This is designed for use with 'enum'-style types: sums with no fields in any 'arm'. These can be treated as if they were a PInteger, with the first 'arm' of the type definition being given the value 0, the second 'arm' being given the value 1, etc. This allows us to represent such types onchain as a builtin, which is the simplest, and fastest, possible encoding. While not often useful, at least one major user base of Plutarch[10] found this approach to be worthwhile enough to warrant such a deriving helper, making it available to all in Plutarch.
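
The numbering scheme is the same one Haskell's stock Enum deriving uses, so a plain Haskell sketch is short. Colour, toTag and fromTag are hypothetical names chosen for illustration and are not part of Plutarch:

```haskell
-- Hypothetical 'enum'-style type: no 'arm' has any fields.
data Colour = Red | Green | Blue
  deriving (Eq, Show, Enum, Bounded)

-- Each 'arm' maps to its position in the definition, starting at 0.
toTag :: Colour -> Integer
toTag = toInteger . fromEnum

-- Going back is only defined for tags that correspond to an 'arm'.
fromTag :: Integer -> Maybe Colour
fromTag n
  | n >= lo && n <= hi = Just (toEnum (fromInteger n))
  | otherwise = Nothing
  where
    lo = toTag (minBound :: Colour)
    hi = toTag (maxBound :: Colour)
```

Note that reordering the 'arms' of such a type changes every tag after the moved 'arm', so the order of the definition becomes part of the onchain encoding.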

Other helpers exist besides these five, but they're either no longer useful and provided only for backward compatibility or should only be used internally. Just with these five via-derivation helpers, we can define almost all PlutusType instances quickly and conveniently, instead of having to go through the manual and low-level process of defining PlutusType ourselves. This is a convenience of Plutarch that application developers should use at every opportunity: there is almost never a reason to define PlutusType manually[12].

Conclusion

Like PLiftable, PlutusType is a core part of Plutarch, and by extension, every script written in it. However, what exactly it does, and why we need it, can be difficult to follow, especially for someone who isn't familiar with DSLs or how they're implemented in Haskell. In this article, we examined PlutusType, as well as multiple adjacent concepts, that rest at the heart of Plutarch. Together, these hold the language together and enable it to do everything we know and love in Plutarch. We also put this knowledge to use, by demonstrating how PlutusType instances would be implemented, using three different examples. Lastly, we demonstrated the options Plutarch provides to allow constructing PlutusType instances without all the attendant fiddliness and tedium.

This kind of deep dive into the internals of Plutarch is not without its challenges, and in practice, knowing how PlutusType, much less Term, 'really works' is not necessary to use Plutarch effectively. However, examining the way these concepts work and the reasons for their existence helps us in two ways:

It can help us see whether we are making the right choices, even if we are using helpers or abstractions; and
If we ever encounter a case where we need more control, we have the knowledge to do this work the way we need it.

By demystifying Plutarch, we believe it will become easier to learn and work with. Due to PlutusType and its related concepts being at the heart of what Plutarch is and how it works, demystifying them is the basis upon which all other understanding can be built. While we don't have to (or want to) write PlutusType ourselves, knowing how it works can help us use the helpers Plutarch provides more mindfully and with known intent, rather than blindly.

We hope this deep dive has given you a clear picture of how PlutusType powers every step of the Plutarch workflow. By leaning on PlutusType’s Monad-like interface, you can write richly typed, composable onchain logic tailored to your application needs. If you’re developing on Cardano and would like assistance integrating these patterns, please don’t hesitate to get in touch.

[0] At times, even the maintainers get confused. The author of this article is a Plutarch maintainer, and even then learned something new in the process of writing it!
[1] To be very specific, the Plutus-level equivalent of whatever a would be, or an error, while possibly also producing some logging.
[2] The reason we have this in a Reader rather than a State is because scopes mirror the behaviour of local, making this convenient.
[3] This is also how IO works. In fact, IO is secretly ST RealWorld.
[4] In this regard, hoisting differs to hash consing: hash consing can eliminate duplication even in the pre-compiled representation, as it builds a DAG directly. However, this is more complex, and Plutarch does not perform hash consing (yet).
[5] This is technically not true, as this would make certain 'ground' types in Plutarch undefinable. However, we chose not to discuss the alternative for two reasons: application developers in Plutarch would never have to define such types, and the article is long and complicated enough as it is! However, if you are interested in how we do this, look at the PlutusType definitions for types such as POpaque and PBool.
[6] Functor and especially Applicative play a role here too, but Monad is what allows computations to be sequenced and for us to control order of effects, which becomes extremely critical when mutation of state is concerned. We definitely want to make sure mutations happen in the right order, or chaos will ensue.
[7] This is slightly simplified from the PlutusType we actually have in Plutarch at the moment, as it disregards PCovariant, PContravariant and PVariant. These don't really change the presentation or intent of PlutusType, and are likely to go away in a future version of Plutarch, but do add some additional complications. Thus, we decided to ignore them.
[8] As we mentioned regarding PLiftable, you could also derive Generic from generics-sop more directly using Template Haskell as well.
[9] There is also a corresponding DeriveAsSOPRec, which is meant to be used on the same types, but for SOP representations instead. At the moment, however, there is no advantage to using DeriveAsSOPRec over DeriveAsSOPStruct.
[10] Specifically, Liqwid Labs, where this approach was developed.
[11] Technically, there is a third case: definitions of 'ground' Plutarch types, such as PInteger and POpaque. However, since these types can only be defined by Plutarch itself, we will not discuss them here.
[12] Unless you're a Plutarch maintainer.


Koz Ross
