1 HasField redesign
This proposal recommends improvements to the design of the HasField
typeclass (differing from those planned under #158). In particular,
the proposed design supports updates using a new SetField class, adds
support for unlifted datatypes and fields, and specifies laws for the classes.
In order to keep this proposal simple, it does not yet propose adding support for type-changing update, which is left for a future proposal.
1.1 Motivation
Following proposal #6, GHC 8.2 introduced a special built-in typeclass HasField
in the GHC.Records module, defined thus:
class HasField (x :: k) r a | x r -> a where
  getField :: r -> a
When the constraint solver sees a constraint of the form HasField "foo" T a,
where T is a concrete datatype and foo is a symbol corresponding to one
of its fields, and this field is in scope, the constraint will be solved
automatically with a dictionary derived from the record selector function for
the field.
This makes it possible to get a form of type-directed name resolution for field
selection: given the expression getField @"foo" t, the inferred type of t
can be used to determine which foo field is meant, even if there are
multiple foo fields in scope and hence the expression foo t would be
ambiguous. (This arises in particular with the DuplicateRecordFields
extension, which has a somewhat ad hoc mechanism for disambiguating such
expressions that has been removed in GHC 9.4, following proposal #366.)
GHC 9.2 includes support for using “record dot syntax” for selection with the
OverloadedRecordDot extension, e.g. t.foo can be used as syntactic sugar
for getField @"foo" t. This is described in the accepted proposal #282 (as
modified by proposal #405).
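As a minimal sketch of the behaviour described above, the following uses getField as exported by GHC.Records and assumes GHC 9.2 or later. The Person and Pet types and the two helper functions are hypothetical names invented for illustration; the two records deliberately share the field name name.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE OverloadedRecordDot #-}
{-# LANGUAGE TypeApplications #-}

import GHC.Records (HasField (getField))

-- Two hypothetical record types that both declare a field called "name".
data Person = Person { name :: String, age :: Int }
data Pet = Pet { name :: String }

-- Explicit form: the argument type Person determines which "name" is meant.
personName :: Person -> String
personName p = getField @"name" p

-- Sugared form: with OverloadedRecordDot, p.name stands for getField @"name" p.
petName :: Pet -> String
petName p = p.name
```

In both functions the field is resolved from the type of the argument, so neither needs the record selector name to be unambiguous.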
However, while the proposals describe both OverloadedRecordDot and another
extension OverloadedRecordUpdate which allows type-based disambiguation of
record update expressions, only the selection part is fully implemented so far.
The accepted proposal #158 plans to change the definition of HasField
to support updates, which is necessary for the full implementation of the
OverloadedRecordUpdate extension.
An implementation of proposal #158 is available as GHC merge request !3257, but has not yet
been merged, because the compile-time performance cost of the selected
implementation strategy is unacceptably high. Such costs were not really
considered in previous discussions, but it is not appropriate to slow down
compilation of all programs with records for the benefit only of those using
HasField.
In the light of experience implementing these proposals, and discussion arising
from proposal #405, it seems worth systematically re-evaluating the design
choices surrounding HasField and type-directed name resolution for field updates.
1.1.1 Recap: Planned changes to HasField
The accepted proposal #158 plans to change the definitions in GHC.Records
to the following:
class HasField (x :: k) r a | x r -> a where
  hasField :: r -> (a -> r, a)

getField :: forall x r a . HasField x r a => r -> a
getField = snd . hasField @x

setField :: forall x r a . HasField x r a => r -> a -> r
setField = fst . hasField @x
This makes it possible to both get and set fields, based on a single class. The
OverloadedRecordDot extension would continue to desugar field selection
syntax to call getField, while the OverloadedRecordUpdate extension
would desugar record update syntax to call setField.
Since setField has not yet been added to a released compiler, the version of
OverloadedRecordUpdate in GHC 9.2 requires RebindableSyntax to be
enabled and a user-defined setField function to be in scope. It provides no
standard definition of this function.
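To make the planned single-class design concrete, here is a self-contained sketch that copies the definitions above into a local module and adds one hand-written instance. It is an illustration under stated assumptions, not the implementation from merge request !3257: the class is a local copy rather than the one in GHC.Records, the (x :: k) annotation is omitted in favour of PolyKinds to keep the visible type applications simple, and Person and birthday are hypothetical.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

-- Local copy of the single-class design from proposal #158 (assumption:
-- this mirrors the recap above; it is not imported from GHC.Records).
class HasField x r a | x r -> a where
  hasField :: r -> (a -> r, a)

getField :: forall x r a . HasField x r a => r -> a
getField = snd . hasField @x

-- Note the #158 argument order: record first, then the new value.
setField :: forall x r a . HasField x r a => r -> a -> r
setField = fst . hasField @x

-- Hypothetical record with a hand-written instance for its "age" field.
data Person = Person { name :: String, age :: Int }

instance HasField "age" Person Int where
  hasField p = (\v -> p { age = v }, age p)

-- Reads and writes the same field through the single class.
birthday :: Person -> Person
birthday p = setField @"age" p (getField @"age" p + 1)
```

Because both operations come from one method, every instance has to supply a setter as well as a getter, which is one reason existing getter-only instances would need to change under this design.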
1.1.2 Design highlights
The essence of the new design is captured in the following definitions, which
will replace the existing contents of GHC.Records. For a complete picture
of the new contents of this module, including auxiliary definitions, see the
Proposed Change Specification.
type HasField :: forall {k} {r_rep} {a_rep} . k -> TYPE r_rep -> TYPE a_rep -> Constraint
class HasField x r a | x r -> a where
  getField :: r -> a

type SetField :: forall {k} {r_rep} {a_rep} . k -> TYPE r_rep -> TYPE a_rep -> Constraint
class SetField x r a | x r -> a where
  modifyField :: (a -> a) -> r -> r
  setField :: a -> r -> r
  {-# MINIMAL modifyField | setField #-}
These are the key points of the new design. Detailed justification for each point is deferred to subsequent sections.
- The existing HasField x r a class continues to have a single method for record field selection, getField :: r -> a.
- There is a new class SetField x r a for updates, rather than combining both selection and update into the HasField class (as in proposal #158).
- SetField x r a has two methods, setField :: a -> r -> r and modifyField :: (a -> a) -> r -> r.
- The order of arguments to setField :: a -> r -> r is reversed compared to the status quo: it takes the new field value first, followed by the record being updated.
- The classes are representation-polymorphic, allowing support for unlifted fields and datatypes.
- The classes are polymorphic in the kind k of field labels.
- Functional dependencies are used to allow type inference to determine the field type from the record type and field name.
- As noted above, type-changing update is not being considered in this proposal, but may be addressed in a follow-up proposal.
1.1.3 Motivation for changing the accepted design
Why change the accepted design from proposal #158?
- Defining getField and modifyField in separate classes is a better design:
  - It gives more flexibility to users, in particular to define read-only or write-only virtual fields (cf. proposal #286), and it leaves open the possibility of devising modifiers to mark particular fields as read-only or write-only.
  - Types can be used to indicate whether particular definitions need read-only, write-only or read-write access to named fields. For example, a function of type (HasField "foo" r Int, SetField "bar" r Bool) => r -> r can only read the foo field and write the bar field.
  - GHC can emit more precise warnings when partial fields are used, indicating whether they are being used for selection or update.
- Keeping HasField essentially unchanged is more backwards-compatible, rather than forcing HasField users to change their code unnecessarily.
- A separate SetField class should make it easier to add type-changing update in a future proposal (though this is a controversial point, and this proposal does not commit to doing so).
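The point about types documenting access can be sketched as follows. This is a simplified illustration, not the proposed module: SetField here is a local, lifted-only stand-in in which modifyField is mandatory and only setField has a default, while HasField is the existing class from GHC.Records. Config and enableBarIfFoo are hypothetical names.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

import GHC.Records (HasField (getField))

-- Simplified local stand-in for the proposed SetField class (assumption:
-- lifted types only, no default for modifyField).
class SetField x r a | x r -> a where
  modifyField :: (a -> a) -> r -> r
  setField :: a -> r -> r
  setField v = modifyField @x (\_ -> v)

-- Hypothetical record; GHC solves HasField "foo" Config Int automatically.
data Config = Config { foo :: Int, bar :: Bool }

-- Hand-written here because the local class has no built-in solver support.
instance SetField "bar" Config Bool where
  modifyField g c = c { bar = g (bar c) }

-- The constraints state exactly what is needed: read "foo", write "bar".
enableBarIfFoo :: (HasField "foo" r Int, SetField "bar" r Bool) => r -> r
enableBarIfFoo r = setField @"bar" (getField @"foo" r > 0) r
```

Nothing in the signature of enableBarIfFoo grants write access to foo or read access to bar, so the function body cannot use them through these classes.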
1.2 Proposed Change Specification
This proposal involves both changes to existing definitions in base, and
adding new definitions. As per the plan agreed with CLC,
the latter should first be added to the forthcoming ghc-experimental
package. Thus it affects two modules: the existing GHC.Records and the new
GHC.Records.Experimental.
The GHC.Records module (in the base package) will be defined as follows:
{-# LANGUAGE AllowAmbiguousTypes #-}     -- for type of getField
{-# LANGUAGE FunctionalDependencies #-}  -- for HasField class

module GHC.Records
  ( HasField(getField)
  ) where

import GHC.Types (Constraint, TYPE)

-- | Constraint representing the fact that a field @x@ of type @a@ can be
-- selected from the record type @r@.
--
-- This will be solved automatically for built-in records where the field is
-- in scope, but manual instances may be provided as well.
--
type HasField :: forall {k} {r_rep} {a_rep} . k -> TYPE r_rep -> TYPE a_rep -> Constraint
class HasField x r a | x r -> a where
  -- | Selector function to extract the field @x@ from the record @r@.
  getField :: r -> a
The GHC.Records.Experimental module (in the ghc-experimental package)
will be defined as follows:
{-# LANGUAGE AllowAmbiguousTypes #-}     -- for type of setField
{-# LANGUAGE DefaultSignatures #-}       -- for setField/modifyField
{-# LANGUAGE FunctionalDependencies #-}  -- for SetField class

module GHC.Records.Experimental
  ( HasField(getField)
  , SetField(setField, modifyField)
  , Field
  ) where

import GHC.Records (HasField(getField))
import GHC.Types (Constraint, TYPE)

-- | Constraint representing the fact that a field @x@ of type @a@ can be
-- updated in the record type @r@.
--
-- This will be solved automatically for built-in records where the field is
-- in scope, but manual instances may be provided as well.
--
-- Instances of this class are subject to the following laws, for every record
-- value @r@ and field @x@:
--
-- > modifyField @x id r === r or ⊥
-- > (modifyField @x g . modifyField @x f) r === modifyField @x (g . f) r
-- > setField @x v r === modifyField @x (\ _ -> v) r
--
-- Where a 'HasField' instance is available as well as an instance of this
-- class, they must together satisfy the laws defined on 'Field'.
--
type SetField :: forall {k} {r_rep} {a_rep} . k -> TYPE r_rep -> TYPE a_rep -> Constraint
class SetField x r a | x r -> a where
  -- | Change the value stored in the field @x@ of the record @r@.
  modifyField :: (a -> a) -> r -> r
  default modifyField :: (r_rep ~ LiftedRep, a_rep ~ LiftedRep, HasField x r a) => (a -> a) -> r -> r
  modifyField f r = setField @x (f (getField @x r)) r

  -- | Update function to set the field @x@ in the record @r@.
  setField :: a -> r -> r
  default setField :: a_rep ~ LiftedRep => a -> r -> r
  setField v = modifyField @x (\ _ -> v)

  {-# MINIMAL modifyField | setField #-}

-- | Constraint representing the fact that a field @x@ of type @a@ can be
-- selected from or updated in the record @r@.
--
-- Where both 'HasField' and 'SetField' instances are defined for the
-- same type, they must satisfy the following laws:
--
-- For every @r@ which has the field @x@
-- (that is, wherever 'getField @x r' is defined):
--
-- > getField @x (setField @x v r) === v
-- > setField @x (getField @x r) r === r
--
-- For every @r@ which does not have the field @x@
-- (that is, wherever 'getField @x r' is not defined):
--
-- > getField @x (setField @x v r) === ⊥
-- > setField @x (getField @x r) r === r or ⊥
type Field :: forall {k} {r_rep} {a_rep} . k -> TYPE r_rep -> TYPE a_rep -> Constraint
type Field x r a = (HasField x r a, SetField x r a)
See the Design highlights for a brief summary of the changes in this design relative to the previously-accepted proposal #158. There are many possible alternative choices of detail here, which are explored in the Alternatives section.
1.2.1 Automatic constraint solving
Constraint solving for HasField constraints is essentially unchanged from
the behaviour of existing GHC versions, as described in the GHC user’s guide.
The only change is the introduction of representation-polymorphism, so that
getField may be used even if the types involved are unlifted.
A constraint SetField x r a will be solved automatically if and only if the
corresponding constraint HasField x r a would be solved
automatically. Specifically, this occurs when r is a concrete record type,
x is a Symbol naming one of the fields of the record, the field is in
scope and is not existentially quantified or higher-rank.
When a constraint is solved automatically, GHC will generate a dictionary with
an implementation of modifyField, as if an instance for SetField
existed. It will not actually generate instances of SetField, however,
because instances have global scope whereas SetField constraints are solved
automatically only if the field is in scope. (This is identical to the
behaviour for HasField.)
If R x y is a record type with a field f :: T x belonging to
constructors MkR1 and MkR2 but not MkR3, the generated dictionary
for SetField "f" (R x y) a will be equivalent to:
instance a ~ T x => SetField "f" (R x y) a where
  modifyField :: (T x -> T x) -> R x y -> R x y
  modifyField g MkR1{f=x, ..} = MkR1{f=g x, ..}
  modifyField g MkR2{f=x, ..} = MkR2{f=g x, ..}
  modifyField g MkR3{..} = throw (RecUpdError ...)
That is, where a record type has a partial field, the generated definition of
modifyField @x f r will throw an exception if and only if getField @x r
will throw an exception.
1.2.2 User-defined instances
Current GHC versions impose restrictions on when users may define their own
instances of HasField. Proposal #515 seeks to lift these
restrictions, but at the time of writing has not yet been accepted. For
consistency, SetField will be subject to the same restrictions, and they
will be lifted for SetField if they are lifted for HasField.
1.2.3 Change to -Wincomplete-record-updates
Accepted proposal #516 introduces a warning flag -Wincomplete-record-selectors
that emits a warning when a HasField constraint is solved for a partial field.
For consistency with this, when a SetField constraint is solved for a
partial field, a warning will be emitted if the existing
-Wincomplete-record-updates warning flag is enabled. (This warning flag is
not enabled as part of the -Wall warning group.)
Notice that easily distinguishing between selection and update in these warnings
requires the separation of the HasField and SetField classes. Were they
a single class, it would be difficult to determine at the time of solving the
constraint whether it was being used for selection, update or both.
1.2.4 Change to OverloadedRecordUpdate
The order of arguments to setField has been changed so that the field value
comes first, followed by the record value. Correspondingly, the
OverloadedRecordUpdate extension will be changed so that it calls
setField with the arguments in the same order:
Expression | Previous interpretation | New interpretation
---|---|---
e { x = v } | setField @"x" e v | setField @"x" v e
This includes the case where RebindableSyntax is enabled, so setField
refers to whichever name is in scope, rather than to GHC.Records.Experimental.setField.
While this is a breaking change, the support for OverloadedRecordUpdate in
GHC 9.2 was explicitly advertised as experimental, so this should not
inconvenience users unexpectedly.
1.3 Examples
For the first field of each example datatype, we describe the behaviour of the constraint solver by giving the corresponding instances (though GHC does not actually generate these instances).
1.3.1 Simple record
data Person = Person { name :: String, age :: Int }

instance a ~ String => HasField "name" Person a where
  getField = name

instance a ~ String => SetField "name" Person a where
  modifyField g (Person name age) = Person (g name) age
1.3.2 Partial field
data T = MkT1 { f1 :: Int } | MkT2 { g2 :: Bool }

instance a ~ Int => HasField "f1" T a where
  getField = f1

instance a ~ Int => SetField "f1" T a where
  modifyField g (MkT1 f1) = MkT1 (g f1)
  modifyField g (MkT2 _) = throw (RecUpdError ...)
1.3.3 Representation polymorphism
With an unlifted field:
data U = MkU { f :: Int# }

instance a ~ Int# => HasField "f" U a where
  getField = f

instance a ~ Int# => SetField "f" U a where
  modifyField g (MkU f) = MkU (g f)
  setField v (MkU f) = MkU v
With UnliftedDatatypes:
type V :: UnliftedType -> UnliftedType
data V x = MkV { f :: x }

instance a ~ x => HasField "f" (V x) a where
  getField = f

instance a ~ x => SetField "f" (V x) a where
  modifyField g (MkV f) = MkV (g f)
  setField v (MkV f) = MkV v
1.4 Effect and Interactions
1.4.1 Record dot syntax
This proposal does not significantly affect OverloadedRecordDot, as the
HasField class is essentially unchanged. It will allow
OverloadedRecordDot to be used for unlifted datatypes and fields.
This proposal will make it easier to fully implement OverloadedRecordUpdate,
which depends on having setField implemented. As noted above, there is a
change to OverloadedRecordUpdate which may be noticed by users who are using
it already via RebindableSyntax.
1.4.2 Virtual fields
A “virtual field” is an instance of the HasField or SetField classes
that is defined explicitly by the user, and which does not correspond to an
existing record datatype. For example:
data V = MkV Int

instance HasField "foo" V Int where
  getField (MkV i) = i

instance SetField "foo" V Int where
  modifyField f (MkV i) = MkV (f i)
Even though V is not defined as a record, the presence of these instances
means foo can be used as a field, e.g. let e = MkV i in e.foo is
accepted with OverloadedRecordDot.
Splitting HasField into separate HasField and SetField classes means
it is possible to define get-only or set-only virtual fields (although set-only
fields must still have the ability to define modifyField).
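A get-only virtual field can already be written against the existing GHC.Records.HasField class, as in the following minimal sketch. The Rect type, the area label and the describe function are hypothetical; the instance is permitted under the current restrictions only because Rect has no real field called area.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedRecordDot #-}

import GHC.Records (HasField (getField))

-- Hypothetical record with two stored fields.
data Rect = Rect { width :: Double, height :: Double }

-- Get-only virtual field: "area" is computed rather than stored, so only a
-- selection instance is given and no update instance exists for it.
instance HasField "area" Rect Double where
  getField r = width r * height r

-- The virtual field is selected with the same dot syntax as a real field.
describe :: Rect -> String
describe r = "area = " ++ show r.area
```

Under the split design, leaving out a SetField instance for area is enough to make the field read-only; under the single-class design of proposal #158 the instance would also have to supply a setter.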
Unlike the automatic constraint solving, which takes account of whether the
field name is in scope, normal instance declarations are globally scoped and
cannot be hidden at module boundaries. This means that once a virtual field is
defined, its existence cannot be hidden from client code, which may be
undesirable as it may expose internal implementation details.
Virtual fields are sometimes useful for backwards compatibility after a field
has been refactored, since pattern synonym fields do not lead to automatic
constraint solving for HasField.
It is sometimes useful to define virtual HasField instances that are
polymorphic in the field name, to give a specific datatype a convenient syntax
using OverloadedRecordDot. For example, this is used by esqueleto.
Various more general virtual field HasField instances have been proposed,
some of which (to be non-orphan) would need to live in GHC.Records, such as:
- Instances for tuples with numeric field names (currently available in the tuple-fields package).
- Unit datatypes with virtual fields based on MonadReader or MonadWriter.
While these are undoubtedly convenient in some cases, some of them may lead to
code that cannot be easily understood in terms of field selection and update,
and (having been designed for RecordDotSyntax) they may or may not interact
well with uses of HasField/SetField in optics libraries. Thus we do not
propose to add such instances to GHC.Records for now, pending further
experimentation. In some cases it may be more appropriate to define new
operators, rather than overloading . with yet more potential
interpretations. The intent of HasField/SetField is to allow type
information to help resolve otherwise ambiguous field names from Haskell
records, not to be a general abstraction over all possible notions of record or
uses of dot syntax.
1.5 Costs and Drawbacks
The costs of this proposal should be no greater than those of the previously accepted proposal #158:
- This will require moderate development effort, but does not seem like it will introduce a substantial maintenance burden.
- Novice users may find HasField, SetField and overloaded record dot/update syntax more complex to reason about than traditional Haskell record syntax.
1.6 Backward Compatibility
This proposal is more limited in its backward compatibility impact than the
previously accepted design (which would break all user-defined HasField
instances).
Users relying on OverloadedRecordUpdate plus RebindableSyntax will need
to follow the change to the order of arguments to setField. This is a
breaking change, but OverloadedRecordUpdate has been explicitly advertised
as experimental and subject to change.
Otherwise, this proposal does not break backward compatibility. Existing code
importing GHC.Records is unaffected because the module does not expose the
new definitions. While HasField has been generalised to support
representation polymorphism, GHC’s existing defaulting support for
RuntimeRep should ensure that user code continues to compile unchanged.
1.7 Alternatives
There are many alternative designs possible for HasField and related
classes, which is part of the reason progress in this area has been slow. This
proposal attempts a detailed discussion of each individual design choice, but
there are many variations possible.
- Proposal #158 used a design with a single HasField class, no type-changing update, functional dependencies. This is the current accepted design, although the implementation is not yet merged into GHC HEAD.
- Proposal #286 suggests splitting HasField into two classes and switching to type families in place of functional dependencies. It gives a rather larger definition for the SetField class, including GetField as a superclass.
- Proposal #510 adds support for overloaded variants alongside the existing support for overloaded records.
Another possibility is to abandon the plan to generalise HasField to support
updates and deprecate the OverloadedRecordUpdate extension, perhaps in
favour of another approach.
- Optics libraries provide various options for working with record types, and they do not necessarily need HasField, although some use cases could directly benefit from it.
- Proposal #310 suggests adding a syntax for record update that would explicitly specify the type, thereby avoiding the need for type-directed field resolution.
- It would be possible to extend name resolution so that datatype names could be used like module qualifiers, somewhat along the lines of proposal #283 on local modules. (See discussion #506 for more background on this idea.) This would not allow updates that are polymorphic in the record type, but it would make it easier to disambiguate selectors/updates to uniquely refer to a single type.
This proposal does not address support for anonymous records. There are many
design choices around different ways to integrate anonymous records with
Haskell, and the right way forward is not obvious. HasField is designed to
reflect the capabilities of existing Haskell records. It may be useful for some
libraries implementing anonymous records as they can provide HasField
instances in order to support record dot syntax or optics. However, it does not
attempt to add support for row polymorphism, in contrast with e.g.
proposal #180.
Subsequent subsections discuss alternative choices for particular aspects of the design recommended by this proposal.
1.7.1 Order of arguments to setField
Proposal #158 specifies that the type of setField is:
setField :: HasField x r a => r -> a -> r
However, swapping the order of arguments so that the new field value is first means that composing multiple updates for a single record becomes simpler:
setField :: HasField x r a => a -> r -> r

example :: (HasField "age" r Int, HasField "colour" r String) => r -> r
example = setField @"age" 42 . setField @"colour" "Blue"
While we do not typically expect users to call setField directly, in cases
where they prefer to do so, this seems like a good reason to prefer this
argument order. Moreover, this order is consistent with the set function in
the lens and optics libraries. It is not clear what the rationale was
for the alternative order in the previous proposal.
Since this proposal specifies that calls to setField take the field value
first, followed by the record, it is not backward compatible with code that
relied on the previous behaviour when using OverloadedRecordUpdate with
RebindableSyntax. We could revert to the previous order of arguments to
avoid this backward incompatibility, if the committee prefers this approach.
1.7.2 Single class vs. multiple classes
Proposal #286 suggests splitting HasField into two classes, there named
GetField and SetField, permitting selection and update respectively. It was
primarily motivated by the possibility of supporting read-only (virtual) fields.
The present proposal similarly splits HasField into two classes, for the
reasons set out in Motivation for changing the accepted design.
1.7.2.1 Relationships between the classes
There are various options for the superclass relationships between the split
classes. Proposal #286 suggests having GetField be a superclass of SetField.
However, this would rule out the
possibility of write-only fields, and incur additional compile-time cost at each
overloaded update in order to generate an (often unnecessary) GetField
dictionary.
Instead we propose that HasField and SetField should be independent
classes, with no superclasses, and that Field should be a constraint synonym
for both constraints. This constraint synonym means that where both
getField and setField are used, users can write simpler types, and GHC
can use it to represent inferred types more simply.
1.7.2.2 Naming the classes
We propose to keep the name HasField for the existing class. This is
backwards-compatible with existing code, avoiding unnecessary breaking changes.
However, this will lead to a long-lasting inconsistency in naming, because
GHC.Records.Experimental will export HasField(getField) and
SetField(setField, modifyField).
An alternative would be to rename HasField (e.g. to GetField), at the
cost of breaking any code with an explicit import like HasField(getField),
or that defines a virtual field instance of HasField.
While we could use a type synonym type HasField = GetField for partial
backwards compatibility, this would not allow defining instances, and would mean
that a HasField(..) import could no longer import getField.
1.7.2.3 Downsides of keeping the classes independent
A potential disadvantage of splitting HasField into two independent classes
is that where a user defines a “virtual field” that requires indexing into a
data structure (e.g. a map), it may be possible to implement an operation that
gets and modifies a field more efficiently than defining it from getField
and modifyField. This is why proposal #158 settled on
hasField :: r -> (a -> r, a). This represents a lens, i.e. the combination
of a getter and setter into a single value, although it uses a first-order
representation that is simpler and does not compose as well as the “van
Laarhoven” or profunctor representations of lenses.
However, practical cases where the choice of hasField vs. the combination of
getField and modifyField matters are likely to be rare. In particular,
normal record types with the built-in constraint-solving behaviour do not gain
anything from hasField being a single method. Where this matters, users are
likely to be better off using an optics library. Thus we prefer the simplicity
of separate classes.
If users do wish to organise field-like lenses into a class, they can define an auxiliary class such as the following:
class Field x r a => FieldLens x r a where
  fieldLens :: Lens' r a
  -- lens expects a setter of type r -> a -> r, hence the flip
  fieldLens = lens (getField @x) (flip (setField @x))

-- | Instance will be selected by default, but can be overridden by defining an
-- instance for a specific type with a non-default 'fieldLens' implementation
instance {-# OVERLAPPABLE #-} Field x r a => FieldLens x r a
We do not propose to add such a class to GHC.Records.Experimental, since it is better
defined by specific optics libraries. (The optics library defines a class
LabelOptic that plays essentially this role.)
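For readers who want to see how a lens can be derived from the two independent classes without depending on any optics library, here is a rough sketch. Everything in it is an assumption made for illustration: SetField is a simplified local stand-in (lifted types only), Lens', view and over are minimal local definitions in the van Laarhoven style rather than imports from lens or optics, and Person, ageLens and olderBy are hypothetical.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))
import GHC.Records (HasField (getField))

-- Simplified local stand-in for the proposed SetField class.
class SetField x r a | x r -> a where
  modifyField :: (a -> a) -> r -> r
  setField :: a -> r -> r
  setField v = modifyField @x (\_ -> v)

-- Minimal local van Laarhoven lens type and two operations on it.
type Lens' s a = forall f . Functor f => (a -> f a) -> s -> f s

view :: Lens' s a -> s -> a
view l = getConst . l Const

over :: Lens' s a -> (a -> a) -> s -> s
over l g = runIdentity . l (Identity . g)

-- A lens assembled from the getter and the (value-first) setter.
fieldLens :: forall x r a . (HasField x r a, SetField x r a) => Lens' r a
fieldLens f r = (\v -> setField @x v r) <$> f (getField @x r)

-- Hypothetical record; HasField "age" is solved by GHC, SetField by hand.
data Person = Person { name :: String, age :: Int }

instance SetField "age" Person Int where
  modifyField g p = p { age = g (age p) }

ageLens :: Lens' Person Int
ageLens = fieldLens @"age"

olderBy :: Int -> Person -> Person
olderBy n = over ageLens (+ n)
```

This generic fieldLens always performs one getField and one setField per use, which is the cost the text above refers to when a combined operation could be implemented more efficiently for a particular data structure.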
1.7.3 Laws
Where HasField and SetField instances are defined we expect the lens
laws to hold. As noted in the Haddocks in the Proposed Change Specification,
the specific laws are:
- For each type with a SetField instance and every record value r and field x:
  modifyField @x id r === r or ⊥
  (modifyField @x g . modifyField @x f) r === modifyField @x (g . f) r
  This ensures that modifyField :: (a -> a) -> r -> r defines a functor. The “PutPut” lens law follows as a consequence.
- For each type with both HasField and SetField instances and every record value r which has a field x:
  getField @x (setField @x v r) === v -- PutGet
  setField @x (getField @x r) r === r -- GetPut
  or if r does not have the field x (i.e. getField @x r === ⊥):
  getField @x (setField @x v r) === ⊥
  setField @x (getField @x r) r === r or ⊥
Where the constraint solver automatically solves one of these constraints, the laws will be satisfied.
Where a field is absent, that is where getField is undefined,
the laws permit modifyField to be defined (to be a no-op) or undefined.
However it may not change the constructor so that the field is present.
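The laws can be spot-checked for a particular instance by instantiating them at sample values, as in the sketch below. This checks a handful of concrete cases and proves nothing in general. SetField is again a simplified local stand-in for the proposed class, and Person, lawsHold and the chosen functions (+ 1) and (* 2) are hypothetical.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

import GHC.Records (HasField (getField))

-- Simplified local stand-in for the proposed SetField class.
class SetField x r a | x r -> a where
  modifyField :: (a -> a) -> r -> r
  setField :: a -> r -> r
  setField v = modifyField @x (\_ -> v)

data Person = Person { name :: String, age :: Int }
  deriving (Eq, Show)

instance SetField "age" Person Int where
  modifyField g p = p { age = g (age p) }

-- Each list element instantiates one law at the "age" field for the given
-- record r and value v.
lawsHold :: Person -> Int -> Bool
lawsHold r v = and
  [ modifyField @"age" id r == r                           -- identity
  , (modifyField @"age" (* 2) . modifyField @"age" (+ 1)) r
      == modifyField @"age" ((* 2) . (+ 1)) r              -- composition
  , setField @"age" v r == modifyField @"age" (\_ -> v) r  -- setField via modifyField
  , getField @"age" (setField @"age" v r) == v             -- PutGet
  , setField @"age" (getField @"age" r) r == r             -- GetPut
  ]
```

A property-testing library could generate r and v instead of fixing them, but that is outside what the proposal specifies.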
A disadvantage of independent classes is that it is slightly unsatisfactory to have typeclass laws relating them (as the instances may be defined in separate modules). This is unlikely to cause practical problems, however. It would be more of an issue in a language where the laws were enforced as part of the class.
1.7.4 Functional dependencies
The existing HasField class expresses the relationship between the record
type and the field type using a functional dependency:
class HasField x r a | x r -> a
That is, the field label and record type should together determine the field type. This is necessary to allow good type inference. In particular, it allows the type of a composition of field selectors to be inferred:
getField @"foo" . getField @"bar"
  :: (HasField "foo" b c, HasField "bar" a b) => a -> c
The middle type b appears only in the context, so it would be ambiguous in
the absence of the functional dependency.
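A small sketch with the existing GHC.Records.HasField class shows this inference at concrete types. The Inner and Outer records and the function fooOfBar are hypothetical; note that the intermediate type Inner is never written in the definition.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeApplications #-}

import GHC.Records (HasField (getField))

-- Hypothetical nested records.
data Inner = Inner { foo :: Int }
data Outer = Outer { bar :: Inner }

-- The middle type of the composition is determined from the record type
-- Outer and the label "bar"; no annotation for it is needed.
fooOfBar :: Outer -> Int
fooOfBar = getField @"foo" . getField @"bar"
```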
Instead of using a functional dependency, it is also possible to express this using a type family (associated or otherwise), like so:
class HasField x r where
  type FieldType x r :: Type
  getField :: r -> FieldType x r
With this definition, we obtain:
getField @"foo" . getField @"bar"
  :: (HasField "foo" (FieldType "bar" a), HasField "bar" a) =>
     a -> FieldType "foo" (FieldType "bar" a)
Introducing such a type family would give more options to optics library implementers and other power users, and proposal #286 suggests making this change.
However, we propose to retain the use of functional dependencies in the class definitions, for the following reasons:
- The functional dependency approach generally leads to simpler inferred types because unsolved constraints look like HasField x r a which has a natural reading “r has a field x of type a”. In contrast, the type family approach ends up with unsolved HasField x r constraints (meaning r has a field x of unspecified type) and equalities including FieldType. (See previous discussion on proposal #158.)
- Supporting representation polymorphism with the type family approach would introduce extra complexity, because we would need another type family to determine the RuntimeRep of the field, and it would be difficult to hide this type family from users. In contrast, supporting them is relatively straightforward with functional dependencies, and GHC will automatically hide unused representation polymorphism.
- If we wish to extend SetField to support type-changing update in the future, it is desirable that either the original or updated types may be used to infer the other. This can be achieved using multiple functional dependencies, something like this:
  class SetField (x :: k) s t a b | x s -> a, x t -> b, x s b -> t, x t a -> s
  A similar effect is possible to achieve with type families (e.g. see the SameModulo approach by @effectfully) but requires additional complexity. While we do not propose type-changing update for now, we wish to leave the door open for adding it in a follow-up proposal.
- It is desirable to permit user-defined HasField instances that may not strictly be consistent with the automatic constraint-solving behaviour in some corner cases (see proposal #515). This is relatively harmless with functional dependencies, because the worst that can happen is the equivalent of incoherent instance resolution (risking the results of type inference being confusing, but not threatening type soundness). In contrast, type family consistency checks are crucial to type soundness, so more care would be needed to ensure the FieldType type family could not reduce to inconsistent values as a result of user-defined instances interacting with the built-in constraint solver.
Functional dependencies do not carry evidence. This means that from the given
constraints (HasField x r a, HasField x r b) it would not be possible to
conclude that a ~ b. However this does not seem like a significant
practical limitation in the HasField context.
1.7.5 Higher-rank fields¶
Consider the following:
data Rank1 = Rank1 { identity :: forall a . a -> a }
data Rank2 = Rank2 { withIdentity :: (forall a . a -> a) -> Bool }
In the first definition, the field has a rank-1 type, but this means the
selector function has a type with a forall
to the right of an arrow.
Similarly, in the second definition, a rank-2 field type leads to a higher-rank
selector function type:
identity :: Rank1 -> forall a . a -> a -- NOT forall a . Rank1 -> a -> a (in recent GHCs)
withIdentity :: Rank2 -> (forall a . a -> a) -> Bool
Should it be possible to solve HasField
or SetField
constraints
involving such fields? Unfortunately it is not feasible to solve for
“impredicative” constraints such as
HasField "identity" Rank1 (forall a . a -> a)
,
even with the recent introduction of Quick Look Impredicativity (following
proposal #274).
Bidirectional type inference, on which both RankNTypes
and
ImpredicativeTypes
(now) rely, requires that instantiations of
forall
-bound variables be determined while traversing the term, prior to the
constraint solver being invoked.
On the other hand, it would be possible in principle to solve constraints such
as HasField "identity" Rank1 (a -> a)
for arbitrary a
, making it appear
as if the field has an infinite family of types. However, this would not extend
to SetField
, because there we really need the value being set to be
polymorphic. Moreover, it would violate the functional dependency x r -> a
on the HasField
class, leading to a violation of confluence: given wanteds
HasField "identity" r (α -> α)
and HasField "identity" r (β -> β)
,
applying the fundep forces α ~ β
; whereas if we were first to learn r ~
Rank1
then we could solve both constraints without requiring α ~ β
.
Accordingly, we propose that HasField
or SetField
constraints
involving fields with higher-rank types should not be solved automatically.
(This is the existing behaviour for HasField
in current GHC versions.)
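Users who need such a field to be accessible via `HasField` can wrap the polymorphic type in a newtype, so that the field type itself is monomorphic. The following is a minimal sketch of that workaround using the existing `GHC.Records` module; the names `PolyId`, `Wrapped`, `wrappedIdentity` and `applyBoth` are hypothetical.

```haskell
{-# LANGUAGE DataKinds, RankNTypes, TypeApplications #-}

import GHC.Records (HasField (..))

data Rank1 = Rank1 { identity :: forall a . a -> a }

-- Not solved automatically, because the field type is polymorphic:
--
-- bad = getField @"identity" (Rank1 id)

-- Wrapping the polymorphic function gives the field a monomorphic type.
newtype PolyId = PolyId (forall a . a -> a)

data Wrapped = Wrapped { wrappedIdentity :: PolyId }

-- The HasField constraint is solved as usual, and unwrapping the newtype
-- recovers the polymorphic function.
applyBoth :: Wrapped -> (Int, Bool)
applyBoth w = case getField @"wrappedIdentity" w of
  PolyId f -> (f 1, f True)
```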
1.7.6 Partial fields¶
In Haskell2010
it is permitted to define partial fields, i.e. fields that
do not belong to every constructor of the datatype. This means that traditional
record selection and update may throw runtime exceptions, as in these examples:
data T = MkT1 { partial :: Int } | MkT2
t = MkT2
oops1 = partial t
oops2 = t { partial = 0 }
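The runtime failures can be observed directly by forcing each expression and catching the resulting exception. This is a small sketch; the helper `throwsWhenForced` is a hypothetical name introduced for the illustration.

```haskell
import Control.Exception (SomeException, evaluate, handle)

data T = MkT1 { partial :: Int } | MkT2

t :: T
t = MkT2

-- | True if forcing the argument to weak head normal form throws.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = handle handler (evaluate x >> pure False)
  where
    handler :: SomeException -> IO Bool
    handler _ = pure True

-- Selection and update on the wrong constructor both fail at runtime.
selectionFails :: IO Bool
selectionFails = throwsWhenForced (partial t)

updateFails :: IO Bool
updateFails = throwsWhenForced (t { partial = 0 })
```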
Many Haskell programmers prefer not to define partial fields, as part of a general desire to avoid unnecessary partiality (see for example proposal #351).
Partial fields may be identified at definition sites via the existing
-Wpartial-fields
warning. However, this is somewhat conservative: it is
perfectly safe to define partial fields provided they are used only via
record construction and pattern-matching, not via selection or update.
Users have asked for the ability to prevent unsafe uses while permitting
datatype definitions,
because giving field names can help with readability when a datatype has many
constructors and many fields. The accepted proposal #516 adds a
new warning -Wincomplete-record-selectors
when HasField
constraints are
solved with a partial selector function, and this proposal adds the
corresponding feature for SetField
. This relies on the fact that
HasField
and SetField
are distinct classes, so GHC can emit an
appropriate warning for selection and update.
1.7.6.1 Updates could ignore partial fields¶
In principle, it is not necessary for setField
or modifyField
to emit a
runtime error if used with a field that is not present in the value's constructor; they
could silently return the value unchanged instead. This behaviour may be more
convenient in some circumstances, but may also mask errors, and would not be
consistent with traditional record updates.
We could imagine giving the option to the user, e.g. via some modifier on the
datatype definition. Somewhat related is proposal #535, which suggests an
extension MaybeFieldSelectors
to control whether partial fields can lead to
runtime exceptions.
1.7.6.2 Refrain from solving partial fields?¶
Another option would be for GHC to refrain from solving HasField
or
SetField
constraints automatically where the fields involved are partial.
This would allow users to define virtual fields with the behaviour they want,
without conflicting with the automatic solutions. See this comment from
@pnotequalnp
for more motivation for this idea.
However, this would make getField
and setField
less consistent with
traditional record selectors and record updates. Moreover it would lead to
backwards incompatibility for HasField
.
1.7.6.3 Affine traversals¶
Optics libraries in principle have a better story to tell here. Partial fields
give rise to affine traversals, where the accessor function returns a
Maybe
value and the setter leaves the value unchanged if it does not mention
the field (rather than throwing a runtime exception).
We could consider supporting this using built-in classes like the following:
class GetPartialField x r a | x r -> a where
  getPartialField :: r -> Maybe a
class SetPartialField x r a | x r -> a where
  modifyPartialField :: (a -> a) -> r -> r
class FieldTotal x r a (is_total :: Bool) | x r -> a is_total
Note that modifyField
will throw an exception on missing fields, whereas
modifyPartialField
would return the value unchanged. The FieldTotal
class would allow an optics library to determine whether a particular field was
total and hence whether it should produce a lens or an affine traversal.
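To make the intended behaviour concrete, here is a sketch of what instances of the first two classes could look like if written by hand for a datatype with a partial field. The classes are the hypothetical ones above (restricted to `Symbol` labels for simplicity), and nothing here would be solved automatically.

```haskell
{-# LANGUAGE AllowAmbiguousTypes, DataKinds, FlexibleInstances,
             FunctionalDependencies, KindSignatures, TypeApplications #-}

import GHC.TypeLits (Symbol)

class GetPartialField (x :: Symbol) r a | x r -> a where
  getPartialField :: r -> Maybe a

class SetPartialField (x :: Symbol) r a | x r -> a where
  modifyPartialField :: (a -> a) -> r -> r

data T = MkT1 { partial :: Int } | MkT2
  deriving (Eq, Show)

-- Selection returns Nothing when the constructor lacks the field.
instance GetPartialField "partial" T Int where
  getPartialField (MkT1 n) = Just n
  getPartialField MkT2     = Nothing

-- Modification leaves the value unchanged when the field is absent.
instance SetPartialField "partial" T Int where
  modifyPartialField f (MkT1 n) = MkT1 (f n)
  modifyPartialField _ MkT2     = MkT2
```

With these instances, `getPartialField @"partial" MkT2` evaluates to `Nothing` and `modifyPartialField @"partial" (+ 1) MkT2` evaluates to `MkT2`, rather than throwing an exception.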
For now we propose not to include support for partial fields through classes
like this, in the interests of minimizing complexity, and because it is not
clear how they could be used together with OverloadedRecordDot
.
1.7.7 Setting vs modification¶
A previous iteration of the design supported only setField :: a -> r -> r
and not
modifyField :: (a -> a) -> r -> r
. The latter generalises setField
to
allow modifying any a
values in the datatype (of which there may be none).
It is easy to implement setField
in terms of modifyField
, but not vice
versa, because we would need to define:
modifyFieldAlt :: forall x r a . (HasField x r a, SetField x r a) => (a -> a) -> r -> r
modifyFieldAlt f r = setField @x (f (getField @x r)) r
This imposes an additional HasField
constraint, and will necessarily be
partial if getField
is partial (whereas modifyField
can in principle be
total, although this will not be the case for automatically solved constraints,
as discussed above).
Thus we propose to include both modifyField
and setField
as class
methods. Default implementations can be provided such that users implementing
virtual field instances typically need implement only one (except where
representation polymorphism is in use, or where there is no HasField
instance).
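The following sketch shows how the two default implementations fit together. It is a simplified, lifted-only version of the class restricted to `Symbol` labels, written against the existing `GHC.Records.HasField`; the `Person` type and its instances are hypothetical, and the instances are written by hand rather than solved automatically.

```haskell
{-# LANGUAGE AllowAmbiguousTypes, DataKinds, DefaultSignatures,
             FlexibleInstances, FunctionalDependencies, KindSignatures,
             ScopedTypeVariables, TypeApplications #-}

import Data.Char (toUpper)
import GHC.Records (HasField (..))
import GHC.TypeLits (Symbol)

-- Simplified sketch of the class (no representation polymorphism).
class SetField (x :: Symbol) r a | x r -> a where
  modifyField :: (a -> a) -> r -> r
  -- The default needs HasField in order to read the old value.
  default modifyField :: HasField x r a => (a -> a) -> r -> r
  modifyField f r = setField @x (f (getField @x r)) r

  setField :: a -> r -> r
  setField v = modifyField @x (const v)

  {-# MINIMAL modifyField | setField #-}

data Person = Person { firstName :: String, lastName :: String }

-- Defines only setField; modifyField comes from the default, using the
-- automatically solved HasField "firstName" Person String constraint.
instance SetField "firstName" Person String where
  setField v p = p { firstName = v }

-- Defines only modifyField; setField comes from the default.
instance SetField "lastName" Person String where
  modifyField f p = p { lastName = f (lastName p) }

shout :: Person -> Person
shout = modifyField @"firstName" (map toUpper)
```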
A consequence of this is that it is not possible to use SetField
for types
that are “write-only”, e.g. where they do not contain a value for the field at
all, and hence modifyField
cannot be defined.
Another possibility would be to define setField
at the top level, rather
than being a class method. This would make the SetField
dictionary smaller
(and lead to it being represented as a newtype). However, from the perspective
of a user defining instances of SetField
it seems preferable to be able to
define either setField
or modifyField
(or both, if there is some runtime
performance advantage to doing so). Moreover, the presence of representation
polymorphism would require this definition to be given a “compulsory unfolding”,
meaning that setField x
would be inlined at every call site (at which point
the representation of the argument is necessarily fixed). See previous
discussion on the ghc-devs mailing list.
1.7.8 Kind of field labels¶
When HasField
was originally introduced in proposal #6, the kind of the
parameter x
representing the field label was polymorphic:
class HasField (x :: k) r a | x r -> a where ...
While the class allows k :: Type
to vary freely, HasField
constraints
will be solved only if it is instantiated to Symbol
. Moreover,
OverloadedRecordDot
and OverloadedRecordUpdate
will only ever generate
constraints using Symbol
. Other possibilities were permitted in order to
support hypothetical anonymous records libraries, which might support different
kinds of fields, e.g. drawn from explicitly-defined enumerations.
In principle it would be possible to simplify the class by specialising it to
use Symbol
rather than k
. However we propose to retain the poly-kinded
definition in the interests of generality and compatibility. For example,
the record-hasfield
library makes use of the possibility to define label
kinds other than Symbol
, allowing tuples of labels to be used for
composition of fields. In particular, it defines an instance like:
instance (HasField x1 r1 r2, HasField x2 r2 a2)
    => HasField '(x1, x2) r1 a2
which means getField @'("foo", "bar")
will be treated like the composition
getField @"bar" . getField @"foo"
.
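A self-contained sketch of this idea follows. It uses a locally defined poly-kinded class, here called `Has` with method `get` (hypothetical names), because GHC rejects user-defined instances of the built-in `HasField` class whose record type is a bare type variable. The `Inner` and `Outer` types are likewise illustrative.

```haskell
{-# LANGUAGE AllowAmbiguousTypes, DataKinds, FlexibleInstances,
             FunctionalDependencies, PolyKinds, ScopedTypeVariables,
             TypeApplications, UndecidableInstances #-}

-- The kind of x is inferred to be polymorphic, as for HasField.
class Has x r a | x r -> a where
  get :: r -> a

data Inner = Inner { bar :: Int }
data Outer = Outer { foo :: Inner }

instance Has "bar" Inner Int where
  get = bar

instance Has "foo" Outer Inner where
  get = foo

-- A label of tuple kind selects x1 first and then x2.
instance (Has x1 r1 r2, Has x2 r2 a2) => Has '(x1, x2) r1 a2 where
  get = get @x2 . get @x1

nested :: Outer -> Int
nested = get @'("foo", "bar")
```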
Another potential application could be to use labels of kind Nat
to index
into a tuple:
instance HasField 1 (x,y) x where
  getField = fst
1.7.9 Representation polymorphism¶
The existing definition of HasField
does not support unlifted fields or
datatypes, such as in the following example:
data T = MkT { foo :: Int# }
type R :: forall (l :: Levity) . TYPE (BoxedRep l) -> TYPE (BoxedRep l)
data R a where
  MkR :: { bar :: a } -> R a
The constraints HasField "foo" T Int#
and HasField "bar" (R a) a
are not
even well-kinded, because the field type and record type are required to be
lifted.
At the time HasField
was introduced, it was not possible to define type
classes over potentially unlifted types. However, thanks to representation polymorphism
in more recent GHC versions, this is now relatively straightforward. In
particular, we can define:
type HasField :: forall {k} {r1 :: RuntimeRep} {r2 :: RuntimeRep} .
                 k -> TYPE r1 -> TYPE r2 -> Constraint
class HasField x r a | x r -> a where
  -- | Selector function to extract the field from the record.
  getField :: r -> a
This makes it possible to formulate and solve constraints such as HasField
"foo" T Int#
.
See #22156 for a request
for this feature.
Observe that the RuntimeRep
parameters are inferred rather than specified
(hence the curly braces in the kind signature). This means that when
getField
is used with explicit type application, the RuntimeRep
parameters are skipped.
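As a sketch of how this looks in practice, the following defines a representation-polymorphic class with the same shape and a hand-written instance for an unlifted field. The names `HasFieldRep`, `getFieldRep` and `fooBoxed` are hypothetical, chosen to avoid clashing with the existing `GHC.Records` definitions, and inferred binders in a standalone kind signature are assumed to be available (GHC 9.0 or later).

```haskell
{-# LANGUAGE AllowAmbiguousTypes, DataKinds, ExplicitForAll,
             FlexibleInstances, FunctionalDependencies, MagicHash,
             PolyKinds, StandaloneKindSignatures, TypeApplications #-}

import Data.Kind (Constraint)
import GHC.Exts (Int (I#), Int#, RuntimeRep, TYPE)

type HasFieldRep :: forall {k} {r1 :: RuntimeRep} {r2 :: RuntimeRep} .
                    k -> TYPE r1 -> TYPE r2 -> Constraint
class HasFieldRep x r a | x r -> a where
  getFieldRep :: r -> a

data T = MkT { foo :: Int# }

-- The field type Int# is unlifted, which the kind signature permits.
instance HasFieldRep "foo" T Int# where
  getFieldRep (MkT n) = n

-- The type application supplies the label; the inferred kind and
-- RuntimeRep parameters are skipped.
fooBoxed :: T -> Int
fooBoxed t = I# (getFieldRep @"foo" t)
```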
The default implementation of setField
in terms of modifyField
(and vice
versa) works only when the representation is constrained via a default signature
to be LiftedRep
. This is currently necessary for the default definition to
typecheck, because there is no other way to express the requirement that at each
instance the representation should be concrete. It may be possible to lift this
restriction in the future (see #14917), but for the moment,
users defining their own SetField
instances for unlifted types will need to
define both setField
and modifyField
.
1.7.10 Linear types¶
Rather like representation polymorphism, it is possible to make the definition
of HasField
multiplicity-polymorphic, so that it could be used with the
LinearTypes
extension, like this (kind and representation polymorphism
omitted for clarity):
type HasField :: Multiplicity -> Symbol -> Type -> Type -> Constraint
class HasField m x r a | ... where
  getField :: r %m -> a
type SetField :: Multiplicity -> Multiplicity -> Symbol -> Type -> Type -> Type -> Constraint
class SetField m1 m2 x s t b | ... where
  setField :: b %m1 -> s %m2 -> t
The constraint solver would set the Multiplicity
parameters appropriately
when solving a HasField
or SetField
constraint for a particular concrete
record type and field.
However, this introduces extra complexity. Moreover, the current implementation of
LinearTypes
does not yet support linear record projection (#18570) or multiplicity annotations
on fields (#18462),
and it has various limitations on solving constraints involving Multiplicity
.
Thus we do not propose to support multiplicity-polymorphic HasField
or SetField
constraints for the time being.
1.7.11 Visible foralls¶
At the time of writing, GHC supports “visible foralls” (visible dependent
quantification) in kinds, but not in the types of terms. The accepted proposal #281 allows
the types of terms to use visible foralls. This is desirable for getField
and similar functions, because it is always necessary to supply the field name
using a type application.
We currently have:
getField :: forall {k} (x :: k) r a . HasField x r a => r -> a
which at use sites must use an explicit type application, e.g. getField
@"foo"
. If the type application is omitted, an ambiguity error will result,
because there is no way to infer the field label from the record type or field
type.
If and when support for visible foralls is added, the type of getField
could
change to:
getField :: forall r a {k} . forall (x :: k) -> HasField x r a => r -> a
meaning that we could instead use getField "foo"
at use sites. (Per the
visible forall proposal, here "foo"
is a type-level Symbol
even though
it syntactically resembles a String
literal.)
This would be a breaking change, and visible dependent quantification is not yet
fully implemented, so changing getField
and setField
to use it is not
part of the present proposal.
1.7.12 Pattern synonyms¶
An infelicity with the current constraint solving behaviour for HasField
is
that it does not work for record pattern synonyms. Thus where
OverloadedRecordDot
or similar is used, replacing a datatype with an
equivalent record pattern synonym may require declaring manual HasField
and
SetField
instances.
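For instance, with the existing `GHC.Records` module, a record pattern synonym over pairs can be made to work with `OverloadedRecordDot` by writing the instances by hand. This is a sketch only; `swapPair` is a hypothetical name, and `OverloadedRecordDot` is assumed to be available (GHC 9.2 or later).

```haskell
{-# LANGUAGE DataKinds, FlexibleInstances, MultiParamTypeClasses,
             OverloadedRecordDot, PatternSynonyms #-}

import GHC.Records (HasField (..))

pattern MyPair :: a -> b -> (a, b)
pattern MyPair {car, cdr} = (car, cdr)

-- Manual instances: GHC does not solve HasField for pattern synonym fields.
instance HasField "car" (x, y) x where
  getField = car

instance HasField "cdr" (x, y) y where
  getField = cdr

-- Record dot syntax now works on pairs via the instances above.
swapPair :: (x, y) -> (y, x)
swapPair p = MyPair { car = p.cdr, cdr = p.car }
```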
It would be relatively easy to extend the automatic behaviour to support single record pattern synonyms. For example, given the declaration:
pattern MyPair{car,cdr} = (car, cdr)
it would be possible to solve a constraint like:
HasField "car" (x, y) x
and hence a declaration like this would be accepted:
swap :: (x, y) -> (y, x)
swap p = MyPair { car = p.cdr, cdr = p.car }
However, the fact that pattern synonyms can be added for arbitrary types (in
this example, for the built-in type of pairs) means that such behaviour can give
rise to incoherent solutions to HasField
constraints (cf. proposal #515). For example, if
another module defined:
pattern MyPair2{car,cdr} = (cdr, car)
then the constraint HasField "car" (x, x) x
would be solved differently
depending on whether car
from MyPair
or from MyPair2
was in scope.
Moreover, it is unclear how to extend the automatic treatment of pattern synonyms to handle multiple-constructor types. For example, given the declarations:
pattern MyLeft{val} = Left val
pattern MyRight{val} = Right val
we would ideally generate a solution to HasField "val" (Either a a) a
that
used both patterns, as in:
get_val :: Either a a -> a
get_val MyLeft{val} = val
get_val MyRight{val} = val
However, it is not clear how to do this in general, since pattern synonyms are
not necessarily grouped and may overlap in arbitrarily complex ways. (While
COMPLETE
pragmas do give a notion of grouping for pattern synonyms, their
purpose is currently limited to the pattern-match completeness checker, and it is
not clear that they should have a semantic impact.)
1.8 Unresolved Questions¶
Changing SetField
to support type-changing update is deliberately left out
of this proposal, so that it can be considered in detail as a subsequent
proposal.
1.9 Implementation Plan¶
Support with the implementation of this proposal would be welcome. The
implementation of setField
(in some form) is
currently blocking the full implementation of OverloadedRecordUpdate
(proposal #282).
1.9.1 Compile-time performance¶
It is important that the implementation of this proposal should not regress compile-time (or runtime) performance. This was a problem for the previous implementation of proposal #158 (GHC merge request !3257).
The existing implementation of HasField
benefits from being able to reuse
the record selector functions that GHC already generates ahead of time for every
field of every datatype. Since HasField
has a single method corresponding
to this function, the constraint solver is able to construct a dictionary merely
by casting the existing selector function.
The previous implementation attempt followed this, generating additional
functions ahead of time for every field of every datatype. However, this can
add a significant cost when defining large record datatypes, especially if
SetField
is not subsequently used. Thus a better implementation strategy
would probably be to generate the dictionaries on-the-fly in the constraint
solver (much as when GHC compiles a traditional record update it generates and
type-checks a suitable case expression).
If necessary, we could imagine adding flags to allow the user to control whether
to generate the needed functions at datatype definition sites (which may be more
efficient if SetField
is used frequently) or at use sites (which may be more
efficient if records are large and SetField
is used rarely).
1.9.2 Default methods¶
The default definitions of setField
and modifyField
as written above are
not currently accepted by GHC, which appears to be a bug (see #23884). If it turns out to be
difficult to resolve this, we may wish to revisit the design.