From Term to Script: How PlutusType Drives Plutarch
Introduction
PlutusType
forms a critical part of Plutarch: without it, almost no other mechanisms in Plutarch can work. We can clearly see this in our previous article, where in order to talk about PLiftable
, we had to assume that all of our examples were already PlutusType
instances. Clearly, everything must begin with PlutusType
.
At the same time, PlutusType
is by far the most cryptic part of Plutarch. Despite its centrality, understanding what PlutusType
does and why we need it is far from easy. Due to how closely it ties into implementation choices made as part of Plutarch's design, as well as Plutarch's internals, to understand PlutusType
is to understand Plutarch itself: no small task already. However, to make matters even more challenging, due to the fairly advanced uses of Haskell in Plutarch, and the many different ways PlutusType
is implemented throughout its codebase, trying to understand PlutusType
by way of reading its use sites isn't going to prove particularly enlightening unless you already know what to look for. Lastly, and most confusingly, PlutusType
instances are almost never written 'by hand', instead using a range of via
-deriving helpers. However, without at least understanding what those helpers are replacing, and why you need to use which one, it is unsurprising that PlutusType
is seen as deeply magical and confusing by most application developers who use Plutarch[0].
This article aims to demystify PlutusType
for application developers. First, we will dive to the bottom of the pool, starting with what a Plutarch Term
'really is'. We will then discuss PlutusType
and its parallels with Monad
, introducing the motivations behind PlutusType
in the process. To make its usage clear, we will then demonstrate three example PlutusType
instances, using the examples from our PLiftable
article to illustrate how PlutusType
ties into the notion of representation we talked about in that same article. Lastly, we will show how PlutusType
instances should be defined in practice, by discussing the various via
-derivation helpers provided by Plutarch to make the definition of such instances faster and less tedious.
We recommend that readers be familiar with our PLiftable
article before reading this one. The main reason for this is that the notion of representations talked about regarding PLiftable
applies to PlutusType
as well, though in a more distant way. On top of that, as we are re-using the same examples for this article as for the PLiftable
article, the example instances of PlutusType
we are about to write will feel more familiar.
'But what is a Term, really?'
Before we can discuss PlutusType
at all, we must get to the bottom of a critical, but not clearly connected, component of Plutarch: Term
. Term
is the glue that holds together the entire language, and thus, by its nature, is quite low-level, fiddly and a bit obscure. We want to note that, for regular Plutarch use by application developers, understanding, or being able to manipulate, Term
s at the level we are about to discuss is thankfully not necessary. However, to properly understand PlutusType
and its role in the language, we must take a deep dive into Term
, its exact definition, meaning and purpose.
Conceptually, Term s a
in Plutarch means 'a computation that, when executed, will either produce a result of type a
, or an error'[1]. In some ways, this is similar to IO
in regular Haskell: just like IO a
doesn't necessarily have an a
'inside' it, a Term s a
doesn't necessarily have an a
'inside' it either. Instead, Term
, like IO
, allows us to sequence computations according to certain rules and capabilities. Much like IO
, Term
is normally kept 'closed' to avoid breaking those rules; however, much like IO
, there is a way to 'look under the hood' to see what's really happening, which we will do next.
So, what is a Term
, really?
newtype Term (s :: S) (a :: S -> Type) =
  -- Or, if you prefer, ReaderT Word64 TermMonad TermResult
  Term {asRawTerm :: Word64 -> TermMonad TermResult}
This only raises more questions. What is a TermMonad
? What is a TermResult
? What does this Word64
mean? To address these, we show the definitions for TermMonad
and TermResult
:
newtype TermMonad (a :: Type) =
  TermMonad {runTermMonad :: ReaderT (InternalConfig, Config) (Either Text) a}
  deriving (Functor, Applicative, Monad) via (ReaderT (InternalConfig, Config) (Either Text))

data TermResult = TermResult {
  getTerm :: RawTerm,
  getDeps :: [HoistedTerm]
  }
Thus, 'in reality', we can imagine Term
to be:
newtype Term s a = Term (ReaderT (Word64, InternalConfig, Config) (Either Text) (RawTerm, [HoistedTerm]))
Before we go on to what RawTerm
and HoistedTerm
mean, it's worth considering what we can see already. A Term
takes some environment, mostly involving configurations, and then either errors out with a textual error message, or produces some combination of 'raw' and 'hoisted' computations. This essentially makes Term
a compilation environment, which is unsurprising given its stated purpose.
We will skip the exact definitions of InternalConfig
and Config
, as they don't really matter for our purposes, and move on to the definition of RawTerm
. This is where the internals of Plutarch are really laid bare:
data RawTerm
  = RVar Word64
  | RLamAbs Word64 RawTerm
  | RApply RawTerm [RawTerm]
  | RForce RawTerm
  | RDelay RawTerm
  | RConstant (Some (ValueOf PLC.DefaultUni))
  | RBuiltin PLC.DefaultFun
  | RCompiled (UPLC.Term UPLC.DeBruijn UPLC.DefaultUni UPLC.DefaultFun ())
  | RError
  | RHoisted HoistedTerm
  | RPlaceHolder Integer
  | RConstr Word64 [RawTerm]
  | RCase RawTerm [RawTerm]
This would look familiar to anyone who's implemented any kind of DSL in Haskell: it is an abstract syntax tree. In fact, it closely follows another abstract syntax tree, this time from UntypedPlutusCore
:
data Term name uni fun ann
  = Var !ann !name
  | LamAbs !ann !name !(Term name uni fun ann)
  | Apply !ann !(Term name uni fun ann) !(Term name uni fun ann)
  | Force !ann !(Term name uni fun ann)
  | Delay !ann !(Term name uni fun ann)
  | Constant !ann !(Some (ValueOf uni))
  | Builtin !ann !fun
  | Error !ann
  | Constr !ann !Word64 ![Term name uni fun ann]
  | Case !ann !(Term name uni fun ann) !(Vector (Term name uni fun ann))
Thus, at the heart of everything, Plutarch Term
s are 'stand-ins' for the UPLC they would generate when compiled. We see direct parallels:
RVar gets translated to Var
RLamAbs gets translated to LamAbs
RApply gets translated to Apply
RForce gets translated to Force
RDelay gets translated to Delay
RConstant gets translated to Constant
RBuiltin gets translated to Builtin
RError gets translated to Error
RConstr gets translated to Constr
RCase gets translated to Case
We can see that these 'arms' of RawTerm
are structurally similar to their UPLC Term
equivalents, even if they track slightly different information. As part of this, we can see why Term
requires Word64
in its environment: because UPLC uses DeBruijn indexing, a Term
must track how many 'scopes deep' we are in order to be able to compile references to variables correctly. Plutarch uses a Word64
to do this, hence the requirement to have this in our environment[2].
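To make the role of the level argument concrete, here is a minimal sketch of the idea in plain Haskell. It is a toy model, not Plutarch's source: the names Raw, asRaw, lam and app are hypothetical, the configuration and error parts of the environment are dropped, and RLamAbs is simplified to take only a body. Each variable remembers the level at which it was bound, and turns that into an index once it learns the level at which it is used.

```haskell
import Data.Word (Word64)

-- Toy stand-in for RawTerm, keeping only variables, lambdas and application.
-- Variables hold De Bruijn indices (0 = innermost enclosing lambda).
data Raw
  = RVar Word64
  | RLamAbs Raw
  | RApply Raw Raw
  deriving (Eq, Show)

-- Toy stand-in for Term: given the current level, produce a Raw.
newtype Term = Term {asRaw :: Word64 -> Raw}

-- Build a lambda. The bound variable records the level of its binder;
-- when it is used at some deeper level, the difference gives its index.
lam :: (Term -> Term) -> Term
lam f = Term $ \level ->
  let var = Term $ \useLevel -> RVar (useLevel - (level + 1))
   in RLamAbs (asRaw (f var) (level + 1))

-- Application introduces no binder, so the level is passed through unchanged.
app :: Term -> Term -> Term
app f x = Term $ \level -> RApply (asRaw f level) (asRaw x level)

-- \x -> \y -> x y
example :: Term
example = lam (\x -> lam (\y -> app x y))
```

In this sketch, the user writes ordinary Haskell lambdas and never sees an index; the only place the counter changes is inside lam.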
This leaves a few Plutarch-specific parts:
RCompiled allows direct embedding of a UPLC term into Plutarch without having to 'run it through' Plutarch first;
RHoisted allows us to share computations via 'hoisting' (more on this later);
RPlaceHolder enables branch detection by lookahead.
These parts exist for a combination of convenience and performance: Plutarch could work without them, but would not provide anywhere near the performance guarantees in its generated code that we see.
To finish our description, we need to talk about hoisting. Hoisting is a key technique Plutarch uses to give good performance, as it allows sharing common terms. To see how this works, we need the definition of HoistedTerm
:
data HoistedTerm = HoistedTerm (Digest Blake2b_244) RawTerm
We see that this associates a RawTerm
with a hash, which is used to uniquely identify it. While the full details of hoisting are somewhat involved, the short version is as follows:
Every time we extend the computation represented by a Term, we track what subcomputations it depends on.
Each time something becomes a subcomputation, we hash it. That hash becomes that computation's unique identifier.
When it comes to generating code, we track all compiled subcomputations by associating them with their hash. If we encounter the same subcomputation again, we don't compile it again; instead, we refer back to it by using a variable.
To be precise, hoisting does not eliminate code duplication in Term
s themselves[4]. As per the definition of HoistedTerm
, we still have to have the dependent RawTerm
available, as when we generate UPLC, we still need something to compile the first time we see it. However, the generated UPLC after we compile our Term
s will not be duplicated the same way, as we will not generate code that re-does dependent computations many times.
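The following sketch illustrates the sharing idea on a toy expression language. It makes several simplifying assumptions: all names (Raw, Hoisted, Out, hoist, deps, withRefs, compile) are hypothetical, the 'hash' is just the output of show rather than a cryptographic digest, and the output language has explicit let-bindings, which is not how Plutarch emits UPLC. What it does show is that a hoisted term may appear many times in the input, yet is compiled once and referred to by name afterwards.

```haskell
import Data.Function (on)
import Data.List (nubBy)

-- Toy input language with a marker for hoisted subterms.
data Raw
  = RLit Integer
  | RAdd Raw Raw
  | RHoisted Hoisted
  deriving (Eq, Show)

-- A hoisted term: an identifying key plus the term it stands for.
data Hoisted = Hoisted String Raw
  deriving (Eq, Show)

-- Mark a term as hoisted. `show` stands in for hashing in this toy.
hoist :: Raw -> Raw
hoist r = RHoisted (Hoisted (show r) r)

-- Toy output language: let-bindings and references to them.
data Out
  = OLit Integer
  | OAdd Out Out
  | ORef String
  | OLet String Out Out
  deriving (Eq, Show)

-- Every hoisted subterm, duplicates included, inner dependencies first.
deps :: Raw -> [Hoisted]
deps raw = case raw of
  RLit _ -> []
  RAdd x y -> deps x <> deps y
  RHoisted h@(Hoisted _ inner) -> deps inner <> [h]

-- Translate a term, replacing each hoisted subterm by a reference to its key.
withRefs :: Raw -> Out
withRefs raw = case raw of
  RLit n -> OLit n
  RAdd x y -> OAdd (withRefs x) (withRefs y)
  RHoisted (Hoisted key _) -> ORef key

keyOf :: Hoisted -> String
keyOf (Hoisted key _) = key

-- Bind each distinct hoisted term exactly once, then emit the main body.
compile :: Raw -> Out
compile raw = foldr bind (withRefs raw) unique
  where
    unique = nubBy ((==) `on` keyOf) (deps raw)
    bind (Hoisted key inner) rest = OLet key (withRefs inner) rest

shared :: Raw
shared = hoist (RAdd (RLit 1) (RLit 2))

-- The shared term is used twice, and so appears twice in the input.
program :: Raw
program = RAdd shared shared
```

Here deps program lists the shared term twice, mirroring the duplication inside Terms, while compile program contains a single binding for it.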
This nicely brings us around to that mysterious s
parameter Term
s have. This serves a similar purpose to the s
parameter in ST
, hence the naming. In ST
, the s
ensures that we never have situations like this:
-- Extremely bad, as it puts an STRef outside of ST!
runST (newSTRef 10) :: forall s . STRef s Int
This is done by ensuring that runST
is not parametric over the s
parameter of ST
, only the a
. This means that we cannot 'leak' mutability outside of ST
at all[3].
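For contrast, this is what a well-behaved use of ST looks like. The function name sumST is ours, chosen for illustration; the library functions are the standard ones from base. The mutable reference is created, used and discarded entirely inside runST, so only the final Int escapes.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Sum a list using a mutable accumulator that never leaves ST.
sumST :: [Int] -> Int
sumST xs = runST $ do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref

-- By contrast, the following is rejected by the type checker, because the
-- result type would have to mention the `s` that runST quantifies over:
--
-- leak = runST (newSTRef (10 :: Int))
```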
In the context of Term
s, in order to support both hoisting and compilation as a whole, we need a notion of closed Term
s, which are computations without free variables. This is unsurprising: we cannot give a sensible meaning to a computation containing free variables, and thus, we cannot sensibly compile such a computation either. Thus, the interface to both hoisting and compilation ensures that we cannot be parametric in s
, which means that Term
also cannot 'leak':
compile :: forall (a :: S -> Type) . Config -> (forall (s :: S) . Term s a) -> Either Text Script
phoistAcyclic :: forall (a :: S -> Type) (s :: S) . HasCallStack => (forall (s' :: S) . Term s' a) -> Term s a
compile
looks similar to runST
, in that it eliminates the s
completely. phoistAcyclic
is more interesting, as here, we must show that the argument computation does not depend on the result at all: this is why phoistAcyclic
is polymorphic in s
(the variable in the result Term
), but not s'
(the variable in the argument Term
). This allows us to hoist safely as long as these rules are never broken; thus, Term
, like ST
, must remain 'closed'.
Putting all the above together, we now see what exactly a Term
is. Specifically, it is a code generation computation that takes a De Bruijn 'level' and a configuration, and then produces one of:
A compilation error; or
A RawTerm, together with a list of its hoisted dependencies.
Knowing this, we can now discuss PlutusType
and its implementation.
The PlutusType type class
Before we discuss PlutusType
and its role, we need to make one observation about Term
, which at first seems surprising. As we mentioned previously, Term
represents computations, in a similar way to IO
or ST
. We also previously mentioned that IO
and ST
are 'closed' types: we do not have access to their internals, and must instead operate them via an interface. The most important such interface happens to be Monad
[6], and unsurprisingly, both IO
and ST
are instances of Monad
. However, Term
cannot be a Monad
; in fact, it cannot even be a Functor
. This is no accident: in general, it makes no sense to talk about a Term s a
where a
is an arbitrary type, as there may be no sensible onchain computation for a
. Partly for this reason, Plutarch tends to operate over types of kind S -> Type
, rather than Type
, as it lets us control what can, and can't, be 'lifted' into Term
s.
However, if Plutarch is to be a useful DSL for onchain scripts, we must have capabilities similar to that of Monad
for Term
s. Without this, we wouldn't be able to sequence computations at all, which would render Plutarch unusable. It is exactly these capabilities that PlutusType
provides. Essentially, for a given type a
of kind S -> Type
, a PlutusType
instance specifies:
What result onchain a term of type a corresponds to; and
How to sequence computations that produce an a with other computations inside Terms.
It is for this reason that we need PlutusType
. Without it, we would have no way of relating our Haskell-level definitions to actual onchain computations, or indeed, even combining any computations at all. It is for this reason we had to spend time discussing Term
. Without understanding Term
, it is hard to see the need for PlutusType
.
To help us understand PlutusType
, it helps to remind ourselves of the interface provided by Monad
. Our version is not the same as the one provided by base
, but it serves our needs, and has a simpler presentation:
class Monad (m :: Type -> Type) where
  pure :: forall a . a -> m a
  bind :: forall a b . m a -> (a -> m b) -> m b
We can see that Monad
is instantiated for computation environments which describe some kind of effect. The interface specifies how, given any kind of result, we can 'lift' that result into the computation environment, as well as how to sequence computations in that environment, regardless of what results they have. To ensure sensible behaviour, Monad
also follows several laws:
bind (pure x) f = f x
bind x pure = x
bind x (\x' -> bind (f x') g) = bind (bind x f) g
The first law states that pure
cannot introduce any effects. The second law states that bind
can only introduce the effects its function argument produces. The third law shows how bind
composes. With these laws alone, we can safely use any Monad
instance as part of abstractions, without concern for precisely how pure
and bind
may be implemented for it. This enables the rich functionality found in Control.Monad
, as well as many other modules and libraries.
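As a quick sanity check, the simplified interface and its three laws can be written down and tested for Maybe. To avoid clashing with the Prelude, this sketch renames the class and its methods to Monad', pure' and bind'; those names, and the helper functions halve and safeDiv, are ours.

```haskell
-- The simplified interface from the text, renamed to avoid Prelude clashes.
class Monad' m where
  pure' :: a -> m a
  bind' :: m a -> (a -> m b) -> m b

instance Monad' Maybe where
  pure' = Just
  bind' Nothing _ = Nothing
  bind' (Just x) f = f x

-- Two effectful steps to sequence: each can fail.
halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv n d = Just (n `div` d)

-- bind (pure x) f = f x
leftIdentity :: Int -> Bool
leftIdentity x = bind' (pure' x) halve == halve x

-- bind x pure = x
rightIdentity :: Maybe Int -> Bool
rightIdentity m = bind' m pure' == m

-- bind x (\x' -> bind (f x') g) = bind (bind x f) g
associativity :: Maybe Int -> Bool
associativity m =
  bind' m (\x -> bind' (halve x) (safeDiv 100))
    == bind' (bind' m halve) (safeDiv 100)
```

Checking a handful of inputs is not a proof, but it does show what each law demands of an instance.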
Our needs for PlutusType
are somewhat different. The computation environment is fixed in our case, as it is always Term
. Instead, we need to specify for a particular result, how it can be lifted into a Term
, as well as how computations that produce that result can be sequenced with other Term
computations. This gives us the following[7]:
class PlutusType (a :: S -> Type) where
  type PInner a :: S -> Type
  pcon' :: forall (s :: S) . a s -> Term s (PInner a)
  pmatch' :: forall (s :: S) (b :: S -> Type) . Term s (PInner a) -> (a s -> Term s b) -> Term s b
Disregarding PInner
for the moment, we can see that in many ways, PlutusType
mirrors Monad
. Indeed, pcon'
looks a lot like pure
, and pmatch'
looks a lot like bind
. The biggest difference is the change in focus: whereas for Monad
, the environment is the type of interest, and the results are arbitrary, for PlutusType
, the environment is fixed, and the result is the type of interest.
As PlutusType
is analogous to Monad
in a way, its laws must be similarly analogous to those of Monad
. Indeed, this is the case:
pmatch' (pcon' x) f = f x
pmatch' x pcon' = x
pmatch' x (\x' -> pmatch' (f x') g) = pmatch' (pmatch' x f) g
This has an interesting implication regarding pcon'
specifically, as it means that pcon'
cannot introduce a Term
that errors. This means that computations that are representable must be valid for anything that is an instance of PlutusType
.
Lastly, we need to consider PInner
. Conceptually, this is the 'underlying form' of the type, or put differently, a Plutarch type to which we can coerce without loss. The reason we need this notion is twofold[11]:
For types that represent computations with Data, allowing definitions of pcon' and pmatch' against PData; and
For newtypes wrapping Terms, allowing definitions of pcon' and pmatch' that only unwrap newtypes and do nothing more.
These are rather particular to both the way the onchain environment works and how Plutarch chooses to interact with Haskell. In practice, we rarely need to concern ourselves with PInner
, as in almost any situation we are likely to see as application developers, PInner
will be one of three things:
The underlying type of a newtype
PData
POpaque
To that end, we don't typically use pcon'
and pmatch'
in application code, preferring instead to use their more accessible forms below:
-- The definitions don't matter here
pcon :: forall (a :: S -> Type) (s :: S) . PlutusType a => a s -> Term s a
pmatch :: forall (a :: S -> Type) (s :: S) (b :: S -> Type) . PlutusType a => Term s a -> (a s -> Term s b) -> Term s b
These functions call pcon'
and pmatch'
underneath but hide PInner
from us so we don't have to concern ourselves with it.
PlutusType versus PLiftable
At this stage, it is worth bringing up the related notion of representation, and the PLiftable
type class that embodies it. PlutusType
and PLiftable
appear to overlap: both seem to relate to how a definition in Haskell translates to Plutus and the chain. Then why have both?
The answer lies in what aspect of this translation we focus on: PlutusType
is concerned with how a given Haskell definition fits into Plutarch as such, whereas PLiftable
concerns itself with how specifically the Haskell, Plutarch and Plutus 'universes' connect. Thus, PlutusType
must concern itself with representations at least somewhat, as ultimately, any computation result onchain must be represented onchain somehow. However, most of the concerns about the specific connections between the 'universes' are left in PLiftable
, allowing PlutusType
to focus more on being an analogy to Monad
in the context of Term
s. This separation allows us not only more focused instances and simpler laws, but also the possibility that a type can be used in Plutarch Term
s, but lacks any representation onchain or in the Haskell 'universe'.
It might seem like an odd thing for us to want: why would we possibly want something that has no onchain representation, but yet can inhabit Plutarch Term
s and be sequenced in them? However, there is at least one case where we need this separation: the Plutarch function type :-->
. We definitely want to be able to have Plutarch function-typed Term
s, or we wouldn't be able to do anything. Yet, an instance of PLiftable
for a :--> b
poses significant problems. To see why, consider what AsHaskell (a :--> b)
and PlutusRepr (a :--> b)
would have to be - no choice we could make works!
Such a claim requires some examination. Even if we assume PLiftable
instances for a
and b
, the only possible choice for AsHaskell (a :--> b)
would be AsHaskell a -> AsHaskell b
. Given this, no matter what we chose for PlutusRepr (a :--> b)
, the PLiftable
instance in question would have to, given any function between any types that have Plutus 'universe' representations, construct something in the Plutus 'universe' that corresponds to that function in a reversible way. This idea falls apart at the first hurdle if we consider ByteString
, as its Haskell 'universe' capabilities are far greater than those of its Plutus 'universe' equivalent, by design. Thus, a general instance of PLiftable (a :--> b) is completely impossible.
Thus, for these two reasons, PLiftable
and PlutusType
are kept separate. At the same time, we have to keep representations in mind with PlutusType
, as we will show in our upcoming examples.
Examples
To help see how PlutusType
instances work in practice, as well as how they connect to representations, we will redo our three examples from the PLiftable
article. In order, these types will use the builtin, SOP and Data
representations. For clarity, we will repeat their definitions as they come up.
Builtin representation
For builtin representations, the instance is straightforward:
newtype PByteVector (s :: S) = PByteVector (Term s PByteString)
instance PlutusType PByteVector where
  type PInner PByteVector = PByteString
  -- expanded type signatures for clarity
  -- pcon' :: forall s . PByteVector s -> Term s PByteString
  pcon' (PByteVector t) = t
  -- pmatch' :: forall s b. Term s PByteString -> (PByteVector s -> Term s b) -> Term s b
  pmatch' t f = f (PByteVector t)
Our goal is to delegate the PlutusType
implementation to the type we're newtype
ing around. Here, we see one of the use cases for PInner
, as it enables exactly this kind of delegation. This is quite similar to how PLiftable
works for such definitions. Thus, the method is entirely cookbook:
Set PInner to whatever type we are wrapping
Have pcon' unwrap the newtype
Have pmatch' rewrap the newtype before applying the function argument
If our goal is a builtin representation, this is the approach we take for defining PlutusType
instances[5].
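To see that this cookbook recipe respects the first two PlutusType laws, we can replay it in a toy model that compiles with nothing but base. The model is an assumption-laden simplification: Term here is just a phantom-tagged expression tree with no s parameter, no environment and no hoisting, and Expr and double are hypothetical names. Only the shape of the class and of the instance is taken from the text.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Toy expression tree standing in for generated code.
data Expr
  = EBytes String
  | EAppend Expr Expr
  deriving (Eq, Show)

-- Toy Term: an expression tagged with the type of result it computes.
newtype Term a = Term Expr
  deriving (Eq, Show)

-- Same shape as PlutusType, minus the `s` parameter.
class PlutusType a where
  type PInner a
  pcon' :: a -> Term (PInner a)
  pmatch' :: Term (PInner a) -> (a -> Term b) -> Term b

-- Type-level tag for bytestring results; it has no values of its own.
data PByteString

newtype PByteVector = PByteVector (Term PByteString)

-- The cookbook instance: unwrap in pcon', rewrap in pmatch'.
instance PlutusType PByteVector where
  type PInner PByteVector = PByteString
  pcon' (PByteVector t) = t
  pmatch' t f = f (PByteVector t)

-- A sample continuation: append the vector to itself.
double :: PByteVector -> Term PByteString
double (PByteVector (Term e)) = Term (EAppend e e)

-- pmatch' (pcon' x) f = f x
law1 :: PByteVector -> Bool
law1 x = pmatch' (pcon' x) double == double x

-- pmatch' x pcon' = x
law2 :: Term PByteString -> Bool
law2 t = pmatch' t (pcon' :: PByteVector -> Term PByteString) == t
```

In this model both laws hold simply because unwrapping and rewrapping a newtype cancel out, which is the point of the recipe.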
SOP representation
Next, we can consider a SOP-represented type:
data PThese (a :: S -> Type) (b :: S -> Type) (s :: S) =
  PThis (Term s a) |
  PThat (Term s b) |
  PThese (Term s a) (Term s b)
As we want PThese
to have an SOP representation, we must ensure that the implementation for pcon'
and pmatch'
produces the right UPLC primitives. This means we must build up Term
s by hand, which will put our knowledge of Term
to the test.
First, we must decide what PInner (PThese a b)
should be. Unlike for PByteVector
, this choice is far from obvious, as SOP representations ultimately fall back to SOP primitives. For this (and similar) cases, Plutarch provides a type corresponding to 'some onchain thing', called POpaque
. You cannot do very much with this type (safely): it is, in some sense, a 'most general type', similar to Any
in Haskell. However, in our case, due to our limited setting, we can be sure that we never do anything inappropriate with POpaque
s handed to us as part of this instance's methods.
This gives us the following initial implementation:
instance PlutusType (PThese a b) where
  type PInner (PThese a b) = POpaque
  -- expanded type signatures for clarity
  -- pcon' :: forall s . PThese a b s -> Term s POpaque
  pcon' = _
  -- pmatch' :: forall s r . Term s POpaque -> (PThese a b s -> Term s r) -> Term s r
  pmatch' = _
To define pcon'
, we must determine how to take a Haskell-level representation of a Plutarch computation, and lift it into a Term
that, when computed, will produce 'some onchain thing'. We first start with a case analysis:
-- Repeated for clarity
instance PlutusType (PThese a b) where
  type PInner (PThese a b) = POpaque
  -- expanded type signatures for clarity
  -- pcon' :: forall s . PThese a b s -> Term s POpaque
  pcon' = \case
    PThis t1 -> _
    PThat t2 -> _
    PThese t1 t2 -> _
  -- pmatch' :: forall s r . Term s POpaque -> (PThese a b s -> Term s r) -> Term s r
  pmatch' = _
As our goal is to use an SOP representation, we must work in the RConstr
constructor of RawTerm
, as this represents SOP introductions. If we examine the RConstr
data constructor, we see it has two arguments:
A Word64, corresponding to a data constructor index in a sum (or an 'arm'); and
A list of RawTerms, corresponding to the fields of the specified constructor.
The choice of constructor indices is quite important, as without the right choices here, we can't get pmatch'
to work correctly. Specifically, constructor indices must begin at 0 and must be consecutive. While we can order the indexes whichever way we like, it's easier to assign the first data constructor listed in PThese
's data type declaration the index 0, and then 'count up' from there. Thus, we will use 0
for PThis
, 1
for PThat
and 2
for PThese
.
The RawTerm
s that go into the list argument for RConstr
in each case must come from t1
or t2
, depending on which 'arm' of PThese
we're in. However, t1
and t2
are Term
s, not RawTerm
s, and thus, our new Term
needs to take into account their respective environments and dependencies and sequence the computations properly inside of the TermMonad
s they wrap.
If we consider the different parts of the environment of t1
and t2
, we can see that we don't really have to do much with either of the environments:
The DeBruijn level does not need to change, as we do not introduce any lambdas;
The configuration can stay exactly the same, as we need not change anything (and in fact, very much shouldn't).
At the same time, we need to make sure that our result Term
has all the dependencies of any Term
whose RawTerm
components we wish to embed. This makes sure that hoisting can work correctly. Thus, to define the rest of pcon'
, we must do the following for each match in the case
statement:
1. Create a fresh Term wrapper, using a lambda to name the DeBruijn-level argument from the environment.
2. For each field in the 'arm' of PThese we are currently in, we must 'unpack' the Term using asRawTerm together with the named level argument from step 1.
3. Collect together all the dependencies of 'unpacked' Terms in step 2.
4. Build a TermResult using the RawTerms from step 2 and dependencies from step 3, and putting them into an RConstr with the appropriate index.
This gives us the following:
-- Repeated for clarity
instance PlutusType (PThese a b) where
  type PInner (PThese a b) = POpaque
  -- expanded type signatures for clarity
  -- pcon' :: forall s . PThese a b s -> Term s POpaque
  pcon' = \case
    PThis t1 -> Term $ \level -> do
      TermResult rawT1 depsT1 <- asRawTerm t1 level
      pure . TermResult (RConstr 0 [rawT1]) $ depsT1
    PThat t2 -> Term $ \level -> do
      TermResult rawT2 depsT2 <- asRawTerm t2 level
      pure . TermResult (RConstr 1 [rawT2]) $ depsT2
    PThese t1 t2 -> Term $ \level -> do
      TermResult rawT1 depsT1 <- asRawTerm t1 level
      TermResult rawT2 depsT2 <- asRawTerm t2 level
      pure . TermResult (RConstr 2 [rawT1, rawT2]) $ depsT1 <> depsT2
  -- pmatch' :: forall s r . Term s POpaque -> (PThese a b s -> Term s r) -> Term s r
  pmatch' = _
We see here that each 'arm' of PThese
is just an application of the four-step process we gave above.
Implementing pmatch'
is a bit more complicated. Before we begin, it is worth looking at the RCase
data constructor. Its first field is a RawTerm
, which is intended to be the SOP value to scrutinize; its second field is a list of RawTerm
s, intended as 'handlers' for each possible 'arm' of the SOP value we are scrutinizing. More specifically, the 'arm' given index i
when using RConstr
will be resolved using the 'handler' at position i
in the list given to RCase
: this is why our choice of indexes when defining pcon'
is important.
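The relationship between constructor indices and handler positions can be sketched with a small evaluator. This is a toy model under our own assumptions: Raw keeps only a handful of constructors (with RLit and RAdd added so there is something to compute), Value, eval, apply, handlers, matchThese and runInt are hypothetical names, and real UPLC evaluation is considerably more involved. The rule it implements is the one described above: a constructor with index i selects the handler at position i, which is then applied to the constructor's fields in order.

```haskell
import Data.Foldable (foldlM)

-- Toy term language with just enough to show Constr and Case.
data Raw
  = RLit Integer
  | RAdd Raw Raw
  | RVar Int -- De Bruijn index, 0 = innermost binder
  | RLam Raw
  | RApply Raw Raw
  | RConstr Int [Raw]
  | RCase Raw [Raw]

data Value
  = VInt Integer
  | VClosure [Value] Raw
  | VConstr Int [Value]

eval :: [Value] -> Raw -> Either String Value
eval env raw = case raw of
  RLit n -> Right (VInt n)
  RAdd x y -> do
    vx <- eval env x
    vy <- eval env y
    case (vx, vy) of
      (VInt a, VInt b) -> Right (VInt (a + b))
      _ -> Left "RAdd: expected integers"
  RVar i -> case drop i env of
    (v : _) -> Right v
    [] -> Left "unbound variable"
  RLam body -> Right (VClosure env body)
  RApply f x -> do
    vf <- eval env f
    vx <- eval env x
    apply vf vx
  RConstr ix fields -> VConstr ix <$> traverse (eval env) fields
  RCase scrutinee hs -> do
    v <- eval env scrutinee
    case v of
      -- Pick the handler at the constructor's index, then feed it the fields.
      VConstr ix fields -> case drop ix hs of
        (h : _) -> eval env h >>= \vh -> foldlM apply vh fields
        [] -> Left "RCase: no handler for this index"
      _ -> Left "RCase: expected a constructor"

apply :: Value -> Value -> Either String Value
apply (VClosure env body) arg = eval (arg : env) body
apply _ _ = Left "not a function"

-- One handler per 'arm', in index order.
handlers :: [Raw]
handlers =
  [ RLam (RVar 0) -- index 0: \x -> x
  , RLam (RAdd (RVar 0) (RLit 100)) -- index 1: \y -> y + 100
  , RLam (RLam (RAdd (RVar 1) (RVar 0))) -- index 2: \x y -> x + y
  ]

matchThese :: Raw -> Raw
matchThese scrutinee = RCase scrutinee handlers

runInt :: Raw -> Either String Integer
runInt raw = do
  v <- eval [] raw
  case v of
    VInt n -> Right n
    _ -> Left "expected an integer result"
```

If the indices used when constructing and the positions used when matching disagree, the wrong handler fires, or none does; in this sketch that shows up as a wrong answer or a Left.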
To see what we need to provide in a more familiar context, consider this function from Data.These
:
these :: forall a b c . (a -> c) -> (b -> c) -> (a -> b -> c) -> These a b -> c
these handleThis handleThat handleThese = \case
  This x -> handleThis x
  That y -> handleThat y
  These x y -> handleThese x y
What we want to supply to RCase
is effectively the RawTerm
equivalent to the arguments provided above. The main differences are that, instead of providing a literal PThese
or literal Haskell (or rather, Plutarch) functions, we want to supply the RawTerm
equivalents of the same, and in a slightly different order. We also have to ensure that hoisting dependencies are handled properly as we do so.
Let's begin by defining the 'handlers', and constructing their relevant RawTerm
s and dependencies. For this purpose, we have the (PThese a b s -> Term s b)
argument to pmatch'
, but as we can see, this produces a Term
. Thus, we have to use a combination of plam
and asRawTerm
, inside a freshly produced term with a lambda naming the De Bruijn level:
-- Repeated for clarity
instance PlutusType (PThese a b) where
  type PInner (PThese a b) = POpaque
  -- expanded type signatures for clarity
  -- pcon' :: forall s . PThese a b s -> Term s POpaque
  pcon' = \case
    PThis t1 -> Term $ \level -> do
      TermResult rawT1 depsT1 <- asRawTerm t1 level
      pure . TermResult (RConstr 0 [rawT1]) $ depsT1
    PThat t2 -> Term $ \level -> do
      TermResult rawT2 depsT2 <- asRawTerm t2 level
      pure . TermResult (RConstr 1 [rawT2]) $ depsT2
    PThese t1 t2 -> Term $ \level -> do
      TermResult rawT1 depsT1 <- asRawTerm t1 level
      TermResult rawT2 depsT2 <- asRawTerm t2 level
      pure . TermResult (RConstr 2 [rawT1, rawT2]) $ depsT1 <> depsT2
  -- pmatch' :: forall s r . Term s POpaque -> (PThese a b s -> Term s r) -> Term s r
  pmatch' t f = Term $ \level -> do
    TermResult handleThis depsThis <- asRawTerm (plam $ \x -> f (PThis x)) level
    TermResult handleThat depsThat <- asRawTerm (plam $ \y -> f (PThat y)) level
    TermResult handleThese depsThese <- asRawTerm (plam $ \x y -> f (PThese x y)) level
    _
It is worth looking at what we just did more closely. Instead of trying to manually juggle the De Bruijn levels required to build up Plutarch lambdas ourselves, we use Plutarch itself to construct the Term
s, then 'unpack' them immediately using the De Bruijn level argument we just named. We use f
inside these new handlers, together with the appropriate 'wrapper' in the form of a PThese
data constructor, to always produce a Term s b
as a result. This works because De Bruijn indices indicate how many enclosing scopes away an argument comes from: this allows us to plam
fearlessly, as long as we don't use any arguments from outside scopes. As we don't ever need to do this in a handler, this saves us a lot of time and confusion.
The next steps are similar to what we had to do for pcon'
: 'unpack' t
, collect together all dependencies (including the dependencies of every handler), then build an RCase
inside a TermResult
, together with our collected dependencies:
-- Repeated for clarity
instance PlutusType (PThese a b) where
  type PInner (PThese a b) = POpaque
  -- expanded type signatures for clarity
  -- pcon' :: forall s . PThese a b s -> Term s POpaque
  pcon' = \case
    PThis t1 -> Term $ \level -> do
      TermResult rawT1 depsT1 <- asRawTerm t1 level
      pure . TermResult (RConstr 0 [rawT1]) $ depsT1
    PThat t2 -> Term $ \level -> do
      TermResult rawT2 depsT2 <- asRawTerm t2 level
      pure . TermResult (RConstr 1 [rawT2]) $ depsT2
    PThese t1 t2 -> Term $ \level -> do
      TermResult rawT1 depsT1 <- asRawTerm t1 level
      TermResult rawT2 depsT2 <- asRawTerm t2 level
      pure . TermResult (RConstr 2 [rawT1, rawT2]) $ depsT1 <> depsT2
  -- pmatch' :: forall s r . Term s POpaque -> (PThese a b s -> Term s r) -> Term s r
  pmatch' t f = Term $ \level -> do
    TermResult handleThis depsThis <- asRawTerm (plam $ \x -> f (PThis x)) level
    TermResult handleThat depsThat <- asRawTerm (plam $ \y -> f (PThat y)) level
    TermResult handleThese depsThese <- asRawTerm (plam $ \x y -> f (PThese x y)) level
    TermResult rawT depsT <- asRawTerm t level
    let allDeps = depsThis <> depsThat <> depsThese <> depsT
    -- Handlers are listed in constructor-index order: 0, 1, 2
    pure . TermResult (RCase rawT [handleThis, handleThat, handleThese]) $ allDeps
We note here that the exact order in which we combine the hoisting dependencies isn't important as long as we don't skip any. Furthermore, duplicate dependencies are not a problem either: as long as we indicate that we have a dependency, having it multiple times poses no issues.
Data representation
Lastly, let us look at a Data
representation type:
data PTheseData (a :: S -> Type) (b :: S -> Type) (s :: S) =
  PDThis (Term s (PAsData a)) |
  PDThat (Term s (PAsData b)) |
  PDThese (Term s (PAsData a)) (Term s (PAsData b))
Fortunately, for Data
represented types such as these, we have an easier time and don't have to use the underlying representation of Term
like for SOP representations. Here, we make use of the other purpose of PInner
by setting PInner (PTheseData a b)
to PData
. This gives us the following initial implementation:
instance PlutusType (PTheseData a b) where
  type PInner (PTheseData a b) = PData
  -- expanded type signatures for clarity
  -- pcon' :: forall s . PTheseData a b s -> Term s PData
  pcon' = _
  -- pmatch' :: forall s r . Term s PData -> (PTheseData a b s -> Term s r) -> Term s r
  pmatch' = _
By convention, sum types like PTheseData
are represented using the Constr
data constructor of Data
, with a unique Integer
index for each 'arm', and their fields, encoded as data, packed into a list. While this is not always the most efficient encoding, we will follow this convention here.
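Stripped of Plutarch, the convention looks like this. The sketch uses a cut-down stand-in for the Data type (the real one has further constructors for maps and bytestrings), fixes both type parameters to Integer for brevity, and uses the hypothetical names toData and fromData. Each 'arm' gets its own index, and its fields become a list of encoded values.

```haskell
-- Cut-down stand-in for Plutus's Data type.
data Data
  = Constr Integer [Data]
  | List [Data]
  | I Integer
  deriving (Eq, Show)

data These a b = This a | That b | These a b
  deriving (Eq, Show)

-- Encode: one Constr index per 'arm', fields packed into a list.
toData :: These Integer Integer -> Data
toData t = case t of
  This x -> Constr 0 [I x]
  That y -> Constr 1 [I y]
  These x y -> Constr 2 [I x, I y]

-- Decode: check the index first, then the shape of the fields.
fromData :: Data -> Maybe (These Integer Integer)
fromData d = case d of
  Constr 0 [I x] -> Just (This x)
  Constr 1 [I y] -> Just (That y)
  Constr 2 [I x, I y] -> Just (These x y)
  _ -> Nothing
```

Decoding is partial by nature: an arbitrary Data value need not follow the convention, which is why fromData returns a Maybe.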
In order to build the right PData, we need to use pconstrData, together with pforgetData. This can then be made into PData with punsafeCoerce, which, despite its name, is safe here. Furthermore, we have to convert the PAsData fields of each 'arm' back into PData using pforgetData. Otherwise, the process is quite similar to what we defined previously for the non-Data version:
-- Repeated for clarity
instance PlutusType (PTheseData a b) where
  type PInner (PTheseData a b) = PData
  -- expanded type signatures for clarity
  -- pcon' :: forall s . PTheseData a b s -> Term s PData
  pcon' = punsafeCoerce . pforgetData . \case
    PDThis t1 -> pconstrData # 0 # pcon (PCons (pforgetData t1) PNil)
    PDThat t2 -> pconstrData # 1 # pcon (PCons (pforgetData t2) PNil)
    PDThese t1 t2 -> pconstrData # 2 # pcon (PCons (pforgetData t1) (PCons (pforgetData t2) PNil))
  -- pmatch' :: forall s b . Term s PData -> (PTheseData a b s -> Term s b) -> Term s b
  pmatch' = _
What about pmatch'? Here, we are taking apart PData with increasing assumptions. We know that we must have a Constr (as we are the only ones who could have put it there), but we need to check the index to determine what 'arm' we're in. Furthermore, we also know that once we've established our 'arm', the field(s) must be valid a or b, wrapped in PAsData. Once all this is established, we construct the appropriate PTheseData, and feed it to the function argument of pmatch' to finish the process:
-- Repeated for clarity
instance PlutusType (PTheseData a b) where
  type PInner (PTheseData a b) = PData
  -- expanded type signatures for clarity
  -- pcon' :: forall s . PTheseData a b s -> Term s PData
  pcon' = punsafeCoerce . pforgetData . \case
    PDThis t1 -> pconstrData # 0 # pcon (PCons (pforgetData t1) PNil)
    PDThat t2 -> pconstrData # 1 # pcon (PCons (pforgetData t2) PNil)
    PDThese t1 t2 -> pconstrData # 2 # pcon (PCons (pforgetData t1) (PCons (pforgetData t2) PNil))
  -- pmatch' :: forall s b . Term s PData -> (PTheseData a b s -> Term s b) -> Term s b
  pmatch' t f = pmatch (pasConstrData # t) $ \p ->
    plet (pfstBuiltin # p) $ \i ->
      plet (psndBuiltin # p) $ \xs ->
        plet (phead # xs) $ \h ->
          pif (i #== 0)
            (f $ PDThis (punsafeCoerce @(PAsData a) h))
            (pif (i #== 1)
              (f $ PDThat (punsafeCoerce @(PAsData b) h))
              (f $ PDThese (punsafeCoerce @(PAsData a) h)
                           (punsafeCoerce @(PAsData b) $ phead #$ ptail # xs)))
Once again, we use punsafeCoerce here, this time to transform the list elements directly into their corresponding type. This is once again safe: the only PData we will see in pmatch' is whatever we put there with pcon', and thus, we don't have to worry about it being wrong.
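The shape of this encoding can be sketched in plain Haskell, without any Plutarch machinery. The following is only a toy model under stated assumptions: the Data type is cut down to two constructors, and the names TheseData, encode and decodeWith are hypothetical stand-ins for PTheseData, pcon' and pmatch' respectively.

```haskell
-- A cut-down, hypothetical model of Plutus Data: just enough for this sketch.
data Data
  = Constr Integer [Data]
  | I Integer
  deriving (Eq, Show)

-- Haskell-level analogue of PTheseData, with fields already encoded as Data.
data TheseData
  = DThis Data
  | DThat Data
  | DThese Data Data
  deriving (Eq, Show)

-- Analogue of pcon': pick the index of the 'arm', pack the fields into a list.
encode :: TheseData -> Data
encode (DThis x) = Constr 0 [x]
encode (DThat y) = Constr 1 [y]
encode (DThese x y) = Constr 2 [x, y]

-- Analogue of pmatch': check the index, unpack the fields, then run the
-- continuation. Unlike the Plutarch version, this sketch reports unexpected
-- input as Nothing instead of assuming it cannot occur.
decodeWith :: Data -> (TheseData -> r) -> Maybe r
decodeWith (Constr 0 [x]) f = Just (f (DThis x))
decodeWith (Constr 1 [y]) f = Just (f (DThat y))
decodeWith (Constr 2 [x, y]) f = Just (f (DThese x y))
decodeWith _ _ = Nothing

main :: IO ()
main = do
  let value = DThese (I 1) (I 2)
  print (encode value)
  -- Decoding what we encoded gives back the original value.
  print (decodeWith (encode value) id == Just value)
```

The round trip through encode and decodeWith mirrors the argument made above: as long as the only encoded values we ever see were produced by encode, decoding cannot go wrong.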
Helpers from Plutarch
Much like for PLiftable instances, the code for PlutusType instances is mechanical, repetitive and low-level. If we had to write PlutusType instances the way we just did every time, Plutarch would be quite hard to use, and unnecessarily so. Thus, for the same reasons as for PLiftable, Plutarch provides a range of via-deriving helpers to automatically derive PlutusType.
As part of the PLiftable article, we already saw three common via-deriving helpers, each designed for one of the three representations. As these are common choices for many types, we will discuss them first; afterwards, we will show other, more focused helpers that are better in some circumstances.
The first of these is DeriveNewtypePlutusType. We can replace our (admittedly short) instance of PlutusType from before with a few derivations:
newtype PByteVector (s :: S) = PByteVector (Term s PByteString)
deriving stock (Generic)
deriving anyclass (SOP.Generic)
deriving PlutusType via (DeriveNewtypePlutusType PByteString)
We derive the stock Generic so that we can, in turn, derive Generic from generics-sop[8]. This allows us to via-derive PlutusType from a helper newtype, which we will see is a common pattern. DeriveNewtypePlutusType will effectively generate the same code as we did in our manual instance, involving manually unwrapping and rewrapping the newtype. Furthermore, the derivation will check whether we are actually a newtype over the declared type or not, which can be helpful to protect you from accidental representation changes.
Next, we will consider DeriveAsSOPStruct, which can be used to derive a PlutusType instance for anything that you want to use a SOP representation for. Once again, the derivation becomes easy:
data PThese (a :: S -> Type) (b :: S -> Type) (s :: S) =
PThis (Term s a) |
PThat (Term s b) |
PThese (Term s a) (Term s b)
deriving stock (Generic)
deriving anyclass (SOP.Generic)
deriving via DeriveAsSOPStruct (PThese a b) instance PlutusType (PThese a b)
We use a standalone derivation here, but the intent is the same. The code that will be generated will look identical to our example, but for clarity, DeriveAsSOPStruct will do the following:
Assign a consecutive index to each 'arm' of the type definition, starting at 0. For PThese, it will assign PThis an index of 0, PThat an index of 1, and PThese an index of 2.
For pcon', it will construct Terms which 'unpack' all fields into their RawTerms and dependencies, then build an RConstr using the index corresponding to the 'arm' we are in, and the RawTerms from the fields, along with the union of all their dependencies.
For each of the 'arms', pmatch' will construct a 'handler' function which takes an argument for each field, then packs them into the 'arm' and calls the function argument to pmatch' on them. It will then unpack the Term argument, collect all dependencies, and build a Term calling RCase. The 'handler' RawTerms will be put in the same order as the indexes they correspond to.
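The three steps above can be sketched with a toy model in plain Haskell. This is not how Plutarch implements them: Value, caseOf and the mk functions are hypothetical names, and ordinary Haskell functions stand in for the 'handler' RawTerms. The sketch only shows how constructor indices and handler order line up.

```haskell
-- Hypothetical toy value type: a constructor index plus its fields.
data Value
  = VInt Integer
  | VConstr Int [Value]
  deriving (Eq, Show)

-- Analogue of RCase: select the handler whose position equals the index,
-- then give it the fields. Handlers must be listed in index order.
caseOf :: Value -> [[Value] -> r] -> Maybe r
caseOf (VConstr ix fields) handlers
  | ix >= 0 && ix < length handlers = Just ((handlers !! ix) fields)
caseOf _ _ = Nothing

-- Indices follow declaration order: This = 0, That = 1, These = 2.
mkThis, mkThat :: Value -> Value
mkThis x = VConstr 0 [x]
mkThat y = VConstr 1 [y]

mkThese :: Value -> Value -> Value
mkThese x y = VConstr 2 [x, y]

-- One handler per 'arm', in the same order as the indices.
describe :: Value -> Maybe String
describe v =
  caseOf
    v
    [ \_ -> "this"
    , \_ -> "that"
    , \_ -> "these"
    ]

main :: IO ()
main = do
  print (describe (mkThis (VInt 1)))
  print (describe (mkThese (VInt 1) (VInt 2)))
```

If the handlers were listed in a different order from the indices, every value would still be handled, but by the wrong handler, which is why the derivation keeps the two in lockstep.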
The last of the three 'typical' helpers is DeriveAsDataStruct. The name similarity to DeriveAsSOPStruct is intentional, as the behaviour is quite similar, as is its use. Our example would look like this:
data PTheseData (a :: S -> Type) (b :: S -> Type) (s :: S) =
PDThis (Term s (PAsData a)) |
PDThat (Term s (PAsData b)) |
PDThese (Term s (PAsData a)) (Term s (PAsData b))
deriving stock (Generic)
deriving anyclass (SOP.Generic)
deriving via DeriveAsDataStruct (PTheseData a b) instance PlutusType (PTheseData a b)
The code that gets generated will, once again, be identical to what we wrote manually, but again, for clarity, DeriveAsDataStruct-based derivations will do the following:
Assign a consecutive index to each 'arm' of the type definition, starting at 0.
For pcon', it will call pconstrData with the index of the 'arm' we are in, and a list of the fields, all of which have pforgetData called on them.
For pmatch', it will unpack to a Constr, then check the index against all available 'arm' indices. When it finds a match, it will unpack the list of fields, punsafeCoerce them back into the right type, then construct a value and give it to the function argument.
These three via-deriving helpers are likely enough to define PlutusType in almost all cases. This alone is much easier and less error-prone than the manual definitions we wrote above. However, there is an added advantage that we cannot accidentally use the wrong helper, as the helpers check whether what we're asking for is even possible. For example, if you tried to use DeriveAsDataStruct on PThese, it would error, saying that your fields are not Data-represented.
However, these are not the only available options for automatically deriving PlutusType instances. Plutarch provides several others for more specific cases, usually for reasons of efficiency. We will examine two of these, and discuss when you may want to use them.
The first such helper we will look at is DeriveAsDataRec[9]. This helper is designed for Data-represented types that look like records: that is, a single 'arm' with some fields. Because there is only one possible data constructor, we can encode such types as Plutus lists instead of Constr. This leads to a smaller encoding (no need for a constructor index), and also less code to work with such a type (as we don't have to check the index).
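The difference between the two encodings can be sketched in plain Haskell. This is a toy model only: the cut-down Data type, the Point record and the functions below are hypothetical and are not what DeriveAsDataRec generates.

```haskell
-- A cut-down, hypothetical model of Plutus Data: just enough for this sketch.
data Data
  = Constr Integer [Data]
  | List [Data]
  | I Integer
  deriving (Eq, Show)

-- Hypothetical record type with a single 'arm' and two fields.
data Point = Point Integer Integer
  deriving (Eq, Show)

-- Conventional sum-style encoding: constructor index 0, then the fields.
asConstr :: Point -> Data
asConstr (Point x y) = Constr 0 [I x, I y]

-- Record-style encoding: only the fields, as a list.
asList :: Point -> Data
asList (Point x y) = List [I x, I y]

-- Decoding the record-style encoding needs no index check.
pointFromList :: Data -> Maybe Point
pointFromList (List [I x, I y]) = Just (Point x y)
pointFromList _ = Nothing

main :: IO ()
main = do
  print (asConstr (Point 3 4))
  print (asList (Point 3 4))
  print (pointFromList (asList (Point 3 4)) == Just (Point 3 4))
```

Both encodings carry the same fields; the list form simply omits the index, which is always the same for a single-constructor type.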
The second is DeriveAsTag. This is designed for use with 'enum'-style types: sums with no fields in any 'arm'. These can be treated as if they were a PInteger, with the first 'arm' of the type definition being given the value 0, the second 'arm' being given the value 1, etc. This allows us to represent such types onchain as a builtin, which is the simplest, and fastest, possible encoding. While not often useful, at least one major user base of Plutarch[10] found this approach to be worthwhile enough to warrant such a deriving helper, making it available to all in Plutarch.
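The numbering scheme can be sketched in plain Haskell using the standard Enum and Bounded classes. The Colour type and the functions toTag and fromTag are hypothetical; the sketch shows the idea of the encoding, not the code DeriveAsTag produces.

```haskell
-- Hypothetical 'enum'-style type: a sum with no fields in any 'arm'.
data Colour = Red | Green | Blue
  deriving (Eq, Show, Enum, Bounded)

-- Each 'arm' is numbered in declaration order, starting at 0.
toTag :: Colour -> Integer
toTag = toInteger . fromEnum

-- Going back requires a range check, as not every integer is a valid tag.
fromTag :: Integer -> Maybe Colour
fromTag n
  | n >= lo && n <= hi = Just (toEnum (fromInteger n))
  | otherwise = Nothing
  where
    lo = toTag (minBound :: Colour)
    hi = toTag (maxBound :: Colour)

main :: IO ()
main = do
  print (map toTag [Red, Green, Blue])
  print (fromTag 1)
  print (fromTag 3)
```

Since the whole value is a single integer, comparing two such values is just an integer comparison, with no structure to take apart.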
Other helpers exist besides these five, but they're either no longer useful and provided only for backward compatibility or should only be used internally. Just with these five via-derivation helpers, we can define almost all PlutusType instances quickly and conveniently, instead of having to go through the manual and low-level process of defining PlutusType ourselves. This is a convenience of Plutarch that application developers should use at every opportunity: there is almost never a reason to define PlutusType manually[12].
Conclusion
Like PLiftable, PlutusType is a core part of Plutarch, and by extension, every script written in it. However, what exactly it does, and why we need it, can be difficult to follow, especially for someone who isn't familiar with DSLs or how they're implemented in Haskell. In this article, we examined PlutusType, as well as multiple adjacent concepts, that rest at the heart of Plutarch. Together, these hold the language together and enable it to do everything we know and love in Plutarch. We also put this knowledge to use, by demonstrating how PlutusType instances would be implemented, using three different examples. Lastly, we demonstrated the options Plutarch provides to allow constructing PlutusType instances without all the attendant fiddliness and tedium.
This kind of deep dive into the internals of Plutarch is not without its challenges, and in practice, knowing how PlutusType, much less Term, 'really works' is not necessary to use Plutarch effectively. However, examining the way these concepts work and the reasons for their existence helps us in two ways:
It can help us see whether we are making the right choices, even if we are using helpers or abstractions; and
If we ever encounter a case where we need more control, we have the knowledge to do this work the way we need it.
By demystifying Plutarch, we believe it will become easier to learn and work with. Due to PlutusType and its related concepts being at the heart of what Plutarch is and how it works, demystifying them is the basis upon which all other understanding can be built. While we don't have to (or want to) write PlutusType ourselves, knowing how it works can help us use the helpers Plutarch provides more mindfully and with known intent, rather than blindly.
We hope this deep dive has given you a clear picture of how PlutusType powers every step of the Plutarch workflow. By leaning on PlutusType’s Monad-like interface, you can write richly typed, composable onchain logic tailored to your application needs. If you’re developing on Cardano and would like assistance integrating these patterns, please don’t hesitate to get in touch.
[0] At times, even the maintainers get confused. The author of this article is a Plutarch maintainer, and even then learned something new in the process of writing it!
[1] To be very specific, the Plutus-level equivalent of whatever a would be, or an error, while possibly also producing some logging.
[2] The reason we have this in a Reader rather than a State is because scopes mirror the behaviour of local, making this convenient.
[3] This is also how IO works. In fact, IO is secretly ST RealWorld.
[4] In this regard, hoisting differs from hash consing: hash consing can eliminate duplication even in the pre-compiled representation, as it builds a DAG directly. However, this is more complex, and Plutarch does not perform hash consing (yet).
[5] This is technically not true, as this would make certain 'ground' types in Plutarch undefinable. However, we chose not to discuss the alternative for two reasons: application developers in Plutarch would never have to define such types, and the article is long and complicated enough as it is! However, if you are interested in how we do this, look at the PlutusType definitions for types such as POpaque and PBool.
[6] Functor and especially Applicative play a role here too, but Monad is what allows computations to be sequenced and for us to control order of effects, which becomes extremely critical when mutation of state is concerned. We definitely want to make sure mutations happen in the right order, or chaos will ensue.
[7] This is slightly simplified from the PlutusType we actually have in Plutarch at the moment, as it disregards PCovariant, PContravariant and PVariant. These don't really change the presentation or intent of PlutusType, and are likely to go away in a future version of Plutarch, but do add some additional complications. Thus, we decided to ignore them.
[8] As we mentioned regarding PLiftable, you could also derive Generic from generics-sop more directly using Template Haskell.
[9] There is also a corresponding DeriveAsSOPRec, which is meant to be used on the same types, but for SOP representations instead. At the moment, however, there is no advantage to using DeriveAsSOPRec over DeriveAsSOPStruct.
[10] Specifically, Liqwid Labs, where this approach was developed.
[11] Technically, there is a third case: definitions of 'ground' Plutarch types, such as PInteger and POpaque. However, since these types can only be defined by Plutarch itself, we will not discuss them here.
[12] Unless you're a Plutarch maintainer.