Unlifted Datatypes¶
This proposal is about enabling users to define new unlifted, boxed data types in source Haskell, like
data SMaybe a :: TYPE (BoxedRep Unlifted) where
SJust :: !a -> SMaybe a
SNothing :: SMaybe a
SMaybe a
has the same inhabitants as Maybe a
, with the exception that
it deprives itself and its arguments of ⊥. The advantage compared to using
bangs everywhere (and hence -XStrict
) is that code generation can assume
that every pointer is correctly tagged, as well as not having to write the
bangs in the first place.
There’s a WIP implementation which works reasonably well already.
Motivation¶
Consider the following implementation of a set of Int
s, not unlike
Data.Set.Strict
specialised for Int
:
data IntSet
= Branch IntSet !Int IntSet
| Leaf
member :: Int -> IntSet -> Bool
member !k Leaf = False
member k (Branch l k' r)
| k == k' = True
| k < k' = member k l
| otherwise = member k r
The data structure is strict in the integer field so that it can be unboxed, to
squeeze out the last bit of performance. Yet if we look at the resulting C–,
we find that each pattern-match on IntSet
is preceded with the following
check:
if (R1 & 7 != 0) goto c1fi; else goto c1fj;
c1fj: // global
call (I64[R1])(R1) returns to c1fi, args: 8, res: 8, upd: 8;
c1fi: // global
What does it do? It tests whether the scrutinee pointer in R1 has its tag bits
(the lower 3 bits) set. If not, the code will enter the heap object to
possibly evaluate it and get to know the constructor tag (i.e. 1 for Branch
and 2 for Leaf
). Now, most of the time, the heap object has already been
evaluated and the pointer is in fact properly tagged (the GC for example will
tag every pointer to an evaluated heap object).
Nevertheless, we end up with the code above (which we loosely refer to as the
“zero tag check”, as in
#16820), because a call
site like member k undefined
is possible! Obviously, undefined
is
neither evaluated nor tagged. Even making IntSet
spine-strict by adding
bangs to the Branch
constructor doesn’t get rid of the problem: member k
undefined
is still well-typed, so we have to generate code for it.
Fundamentally, ⊥ is an inhabitant of every data type in Haskell, so every data
declaration introduces a lifted type of kind TYPE (BoxedRep Lifted)
.
So omission of the zero tag check is only possible if ⊥ is not an inhabitant of
the type. This is exactly the condition for a type to be unlifted! Since we
are still talking about heap objects, we need unlifted, but boxed types, which
have kind TYPE (BoxedRep Unlifted)
. Indeed, for types of this kind GHC will
already omit the zero tag check today! So all that is left is to extend GHC in
a way that
data IntSet :: TYPE (BoxedRep Unlifted) where
Branch :: IntSet -> !Int -> IntSet -> IntSet
Leaf :: IntSet
member :: Int -> IntSet -> Bool
member !k Leaf = False
member k (Branch l k' r)
| k == k' = True
| k < k' = member k l
| otherwise = member k r
does the Right Thing, namely disallowing unevaluated boxes to be passed around by using call-by-value. This way we can have the guarantee that all unlifted boxes are always properly tagged.
Preliminary benchmarks suggest that the unlifted variant is 5-10% faster, just due to omission of the zero tag check.
Proposed Change Specification¶
Henceforth, data type declaration refers to both data type and data family instance declarations.
The entire proposal assumes that the pointer rep proposal and its amendment have landed.
Static semantics¶
Kind signatures for regular data type declarations must have a return kind of
TYPE (BoxedRep Lifted)
. Activating -XUnliftedDatatypes
will lift this
restriction to allow any return kind that unifies with TYPE (BoxedRep _)
. A
data type explicitly may have levity polymorphic kind
TYPE (BoxedRep l)
for some type variable l
. Example:
data List a :: TYPE (BoxedRep Lifted) where
Nil :: List a
Cons :: a -> List a -> List a
data SList a :: TYPE (BoxedRep Unlifted) where
SNil :: SList a
SCons :: a -> SList a -> SList a
data PList a :: TYPE (BoxedRep l) where
PNil :: PList a
PCons :: a -> PList a -> PList a
Note that Haskell98-style data declarations can use standalone kind signatures to specify the return kind. Example:
type SList :: Type -> TYPE (BoxedRep Unlifted)
data SList a = SNil | SCons a (SList a)
If the user provides no kind signature, the default
remains that the data type has lifted return kind (TYPE (BoxedRep Lifted)
).
Specifying a kind signature is the only way to declare an unlifted or
levity-polymorphic data type.
The same applies to data family instances, in which case the data family
application’s result kind must reduce to TYPE (BoxedRep _)
. Example:
data family DF a :: TYPE (BoxedRep l)
data instance DF Int :: TYPE (BoxedRep Unlifted) where
TInt :: Int -> DF Int -- unlifted!
data instance DF Char :: TYPE (BoxedRep Lifted) where
TChar :: Char -> DF Char -- lifted!
See
the section on data families in the UnliftedNewtypes proposal
and Note [Implementation of UnliftedNewtypes]
for details involving
type-checking the parent data family.
As usual for types of unlifted kind, values of unlifted data types may not be declared at the top-level:
-- rejected, `SList a` is not lifted
nil :: SList a
nil = SNil
-- accepted, `a -> SList a -> SList a` is a function, thus lifted
cons :: a -> SList a -> SList a
cons x xs = SCons x xs
The top-level restriction is not fundamental (see #17521), but best discussed in a separate proposal.
Dynamic Semantics¶
There is no change in GHC’s existing dynamic semantics, namely
Values of unlifted type are always computed eagerly
Only values of types with a concrete
RuntimeRep
can be let-bound
Example:
f x = let y = if odd 42 then SNil else SCons 42 SNil
in ... y ...
Since the binding for y
is unlifted, the let
binding (is legal and) is
evaluated eagerly, without building a thunk.
This proposal simply allows data declarations to have kinds other than
TYPE (BoxedRep Lifted)
and the existing dynamic semantics of GHC takes care
of the rest.
Examples¶
Declarations¶
Here are a few example declarations that should all be accepted:
data PList a :: TYPE (BoxedRep l) where
PNil :: PList a
PCons :: a -> PList a -> PList a
-- alternative using a SAKS:
-- type PList :: Type -> TYPE (BoxedRep l)
-- This one is with visible quantification in a SAKS
type DF :: forall l. Type -> TYPE (BoxedRep l)
data family DF
data instance DF Unlifted a where
Unlifted (call-by-value) semantics¶
Example:
data UPair a b :: TYPE (BoxedRep Unlifted) where
UPair :: a -> b -> UPair a b
When occuring in a constructor field (e.g.
data T = MkT (UPair Int Bool)
), the semantics are identical to a strict field (data T = MkT !(Int, Bool)
).In an application
f (if odd 42 then UPair a a else UPair b b)
the unlifted argument is evaluated before the application is beta reduced. So call-by-value instead of call-by-need.
In a let binding
let x = UPair a b in e
let x = if odd 42 then UPair a a else UPair b b in e
the unlifted right-hand side of
x
is evaluated before the bodye
.
The Strict data type¶
We get to define Strict
data Strict a :: TYPE (BoxedRep Unlifted) where
Force :: !a -> Strict a
that deprives itself and its argument of ⊥.
Strict
is the very essence of this proposal: Every unlifted data type can
be defined in terms of lifted data types and Strict
, at the cost of an
additional indirection.
Whether we could define Strict
as a newtype
– which would then mean we
coerce between kinds, meaning coercions aren’t zero-cost – is left for a
separate proposal.
Low-level code¶
Consider the following rather low-level, performance sensitive code:
{-# LANGUAGE MagicHash #-}
module Lib where
import GHC.Exts
pack :: Bool -> Bool -> Int#
pack False False = 0#
pack False True = 1#
pack True False = 2#
pack True True = 3#
The programmer manually unboxed the resulting Int
in a desperate endeavour
of squeezing out the last bit of performance. This is the generated Core, which
looks good enough:
pack
= \ (ds_d11d :: Bool) (ds1_d11e :: Bool) ->
case ds_d11d of {
False ->
case ds1_d11e of {
False -> 0#;
True -> 1#
};
True ->
case ds1_d11e of {
False -> 2#;
True -> 3#
}
}
STG looks similar. Now look what happens in C–:
c1fp: // global
if ((Sp + -16) < SpLim) (likely: False) goto c1fq; else goto c1fr;
c1fq: // global
R3 = R3;
R2 = R2;
R1 = Lib.pack_closure;
call (stg_gc_fun)(R3, R2, R1) args: 8, res: 0, upd: 8;
c1fr: // global
I64[Sp - 16] = c1fi;
R1 = R2;
P64[Sp - 8] = R3;
Sp = Sp - 16;
if (R1 & 7 != 0) goto c1fi; else goto c1fj; <-- Zero tag check
c1fj: // global
call (I64[R1])(R1) returns to c1fi, args: 8, res: 8, upd: 8; <-- Dead enter if argument was always evaluted
c1fi: // global
_s1fa::P64 = P64[Sp + 8];
if (R1 & 7 != 1) goto c1fn; else goto c1fm;
c1fn: // global
I64[Sp + 8] = c1fJ;
R1 = _s1fa::P64;
Sp = Sp + 8;
if (R1 & 7 != 0) goto c1fJ; else goto c1fL; <-- Zero tag check
c1fL: // global
call (I64[R1])(R1) returns to c1fJ, args: 8, res: 8, upd: 8; <-- Dead enter if argument was always evaluted
c1fJ: // global
if (R1 & 7 != 1) goto c1fV; else goto c1fR;
c1fV: // global
R1 = 3;
Sp = Sp + 8;
call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
c1fR: // global
R1 = 2;
Sp = Sp + 8;
call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
c1fm: // global
I64[Sp + 8] = c1fu;
R1 = _s1fa::P64;
Sp = Sp + 8;
if (R1 & 7 != 0) goto c1fu; else goto c1fw; <-- Zero tag check
c1fw: // global
call (I64[R1])(R1) returns to c1fu, args: 8, res: 8, upd: 8; <-- Dead enter if argument was always evaluted
c1fu: // global
if (R1 & 7 != 1) goto c1fG; else goto c1fC;
c1fG: // global
R1 = 1;
Sp = Sp + 8;
call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
c1fC: // global
R1 = 0;
Sp = Sp + 8;
call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
Wow, that’s quite a mouthful, all due to the lifted representation of Bool
!
Assuming that the call site can prove evaluatedness at a lower cost than
pack
, we can define a new unlifted datatype SBool
:
{-# LANGUAGE MagicHash #-}
module Lib where
import GHC.Exts
data SBool :: TYPE (BoxedRep Unlifted) where
STrue :: SBool
SFalse :: SBool
pack :: SBool -> SBool -> Int#
pack SFalse SFalse = 0#
pack SFalse STrue = 1#
pack STrue SFalse = 2#
pack STrue STrue = 3#
And then after removing dead code (by hand, so no liability assumed) and freeing up stack space the C– would water down to:
c1fr: // global
R1 = R2;
if (R1 & 7 != 1) goto c1fn; else goto c1fm;
c1fn: // global
R1 = R3;
if (R1 & 7 != 1) goto c1fV; else goto c1fR;
c1fV: // global
R1 = 3;
call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
c1fR: // global
R1 = 2;
call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
c1fm: // global
R1 = R3;
if (R1 & 7 != 1) goto c1fG; else goto c1fC;
c1fG: // global
R1 = 1;
call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
c1fC: // global
R1 = 0;
call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
Much better! A decent backend should be able to turn this into a couple of bitshifts on the tags.
Data structure re-use¶
With levity polymorphism, we can even re-use currently lifted-only data structures:
data PList a :: TYPE (BoxedRep l) where
PNil :: PList a
PCons :: a -> PList a -> PList a
mapLifted :: (a -> b) -> PList a -> PList b -- assumes defaulting of Levity to @Lifted
mapLifted f PNil = PNil
mapLifted f (PCons x xs) = PCons (f x) (mapLifted f xs)
mapUnlifted :: (a -> b) -> PList @Unlifted a -> PList @Unlifted b
mapUnlifted f PNil = PNil
mapUnlifted f (PCons x xs) = PCons (f x) (mapUnlifted f xs)
There no chance of sharing the map
definition (not this one anyway)
currently, because we don’t have levity polymorphism in expressions yet, which
should be tackled in a separate proposal.
Effect and Interactions¶
Introduction of user-defined unlifted data types means we can finally write code processing data types that can be compiled as if we were in a strict language.
Strict constructor fields share considerable overlap with Strict
, yet they
proved insufficient for encoding invariants for efficient code generation.
This proposal consciously left out further work like a new specification for levity polymorphic code. Similar to data types, functions can be levity polymorphic, too. There’s #15532, which wants to weaken the restrictions we have in place for runtime-rep polymorphic code.
Pattern match checking with unlifted types will be weird in some edge cases. Consider the following example:
data SVoid :: TYPE (BoxedRep Unlifted)
f :: SVoid -> ()
f _ = ()
Should this program be accepted without warning? It would be accepted for
lifted Void
, because f (error "boom")
is a valid call and would
evaluate to ()
. But with unlifted Void
this doesn’t make sense anymore:
Because of call-by-value, the error
thunk will be evaluated before entering
f
, resulting in a crash. In that regard, it’s similar to the situation with
data Void
f :: Void -> ()
f !_ = ()
And should probably elicit an inaccessible RHS warning. I guess this is accurate for unlifted functions as well as long as we don’t allow functions without bindings.
-XStrict/-XStrictData could implicitly turn all data declarations into unlifted ones. I see two potential problems:
If a data type is exported, it’s now an unlifted type. That’s a breaking change.
For data family instances, this is only possible if the parent data family was kind polymorphic. Plus it’s a strange thing to do change kinds of a declaration just by switching on a language extension.
So rather dreadfully, we probably shouldn’t “augment” -XStrict
.
Lazy pattern matches let ~t = ... in ...
should not be allowed if t
is unlifted. That is exactly the behavior that is currently implemented in GHC.
Concerning GHC’s implementation (only) of Data con wrappers: The wrappers of nullary constructors carrying some constraint, like
data T a :: TYPE (BoxedRep Unlifted) where
TInt :: T Int
will become unlifted top-level bindings in Core:
-- data constructor worker:
TInt :: a ~ Int => T Int
-- data constructor wrapper:
$WInt :: T Int
$WInt = TInt $d
which GHC currently forbids in source syntax. I think these will always be properly tagged, though, so it’s just an implementation detail. The Core formalism should only be minimally affected by such unlifted bindings at the top-level. Either regard them as evaluated on start up or just allow normal forms (modulo lifted fields) See #17521. Another way forward would be to turn wrappers of nullary unlifted constructors into functions by adding a Void# argument.
Costs and Drawbacks¶
Thanks to previous work on unlifted types and -XUnliftedNewtypes
, this
proposal seems rather easy to implement, with the majority of changes happening
in the parser and type-checker. Notably the backend is not only affected at
all. Very good cost to benefit ratio.
As for the risk of making the language harder to learn: Beginners won’t come in
touch with unlifted datatypes at all. Unless they crave for better performance
in a custom data structure, at which point I wouldn’t consider them beginners
anymore. There’s precedent in going from unlifted to lifted by
Idris or
OCaml with
their Lazy
data type.
Alternatives¶
Implement the Strict data type only. Doing so provides the same semantics at the cost of more syntactic overhead, plus we can’t get rid of the additional box.
Implement strict unboxed tuples instead. Rules out the promising direction of levity polymorphism in the future, though.
Introduce unlifted
as a new contextual keyword (“special id”) after
data
, like in a previous state of this proposal. Apart from the negligible
disadvantage that it steals syntax (data unlifted a => T = ...
could parse
unlifted
as unlifted :: Type -> Constraint
) it seems inconsistent
compared to -XUnliftedNewtypes
, which uses kind signatures and would not
support the syntax. Also, levity polymorphic data type declarations would be
impossible.
Unresolved Questions¶
Should we allow data family instances without kind signatures when the return kind can be inferred to be
TYPE (BoxedRep _)
? We do so forType
and unlifted newtype instances. The user would need to write the kind signature according to this proposal.
Implementation Plan¶
I will implement the changes. There is a working implementation at !2218, but it may still be a little rough around the edges.