Hexadecimal Floats in Haskell¶
Currently, Haskell only allows writing floating-point numbers in the decimal format. Unfortunately, writing floats in decimal/scientific format is not always the best option: To write finite floats precisely one might need an extraordinary number of decimal digits, for instance.
As an alternative, there’s the so called “hexadecimal floating point” format, described in p57-58 of http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
Another source for the same format is the IEEE-754 spec itself: Section 5.12.3 of https://www.csee.umbc.edu/~tsimo1/CMSC455/IEEE-754-2008.pdf
The format is rather simple, unambiguous, and relatively easy to implement. And it’s been around for about 10 years by now. Some examples are:
0x1p+1
0x1p+8
0x1.b7p-1
0x1.fffffffffffffp+1023
0X1.921FB4D12D84AP-1
It would be nice if the Haskell standard was changed to include such literals. But in the meantime,
perhaps GHC can support such literals via a pragma, such as LANGUAGE HexadecimalFloats
or similar.
Eventually, the change should make it into the Haskell report.
Note that there’s already a feature request filed for GHC: https://gitlab.haskell.org/ghc/ghc/issues/13126
Motivation¶
Floating-point is always tricky, due to loss of precision during computation. This starts from the reading of such values, as decimal notation is not sufficient for expressing floats precisely anc concisely. The hexadecimal notation provides a concise notation without losing precision, and conforms to the new standards as implemented by gcc.
Proposed Change Specification¶
The changes are rather simple, and follows that of the other languages with some Haskell specific deviations:
Introduce a new pragma
LANGUAGE HexadecimalFloats
or similar.
- Follow the grammar given in p57-58 of http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
Exception: We do not need the suffix
F
orL
as types would be enough to do the appropriate conversion.Exception: We will make the exponent (the
p...
part) optional, just like the regular e is optional.Exception: If a literal contains a dot, then it must have some hex digits before and after it
Provide the corresponding pretty-printer (showHFloat) in the Numeric package.
Read instance for floats-doubles should be changed to support the new format.
Note that the original allows 0x.Ap4
as a literal, which we disallow in Haskell. This is similar to the case
for .5
or 5.
that is allowed by other languages, but not by Haskell.
Desugaring¶
Ordinarily, a literal such as 2.5
is overloaded in Haskell, and inhabits all Fractional
values.
It desugars to fromRational (25 % 10)
.
The new notation makes no changes to this semantics. One thing to note, however, is that since precision is the primary motive for this notation, we should be more vigilant in issuing warnings for underflow/overflow cases. (See below.)
Note that the .
and p
are both optional in the notation. If either exist, we desugar through fromRational
.
If neither exists, then it’s already a hexadecimal literal that desugars as usual via fromInteger
. Various
cases to consider:
0xAB
: No dots, no exponents: Regular literal. Desugars viafromInteger
.
0x1a.3
: Dot, no exponent. Floating point literal: Desugars viafromRational
.
0x1p-4
: No dot, exponent. Floating point literal: Desugars viafromRational
.
0x1.2p3
: Both dot and exponent. Floating point literal. Desugars viafromRational
.
So, the rule is simple: If .
or p
is present: Desugar through fromRational
. Otherwise use fromInteger
.
Effect and Interactions¶
None. The addition is orthogonal, and the changes to the grammar is unambiguous by design. No significant complexity to any part of the compiler anticipated.
Costs and Drawbacks¶
This proposal should be fairly simple to implement. Perhaps about a day of coding and test cases for someone familar with the code base. Even if it’s tackled as an intern/summer-of-code idea, it should not take more than a few days to flesh it out at the worst case. Also, some code reuse is possible as the idea is already implemented as a library. See below.
Drawbacks: It was pointed out that the Read
instance would break backwards compatibility. Consider:
Prelude> reads "0x1p3" :: [(Double, String)]
[(1.0,"p3")]
With the new implementation, this would return: [(8.0, "")]
instead. While this is a change in behavior, I think
it’s an acceptable one given the new syntax for floats. The drawback here is that we cannot guard against this using
a language pragma.
Alternatives¶
The obvious alternative is to use quasi-quoting to implement this in a library. Indeed, there is already a hackage package that implements this as a quasi-quoter, together with the pretty printer: http://hackage.haskell.org/package/FloatingHex
Unfortunately, the “library” solution is really not ideal:
It relies on the rather heavy mechanism for quasi-quotes
Usage requires importing a new module
Usage requires a pragma (
QuasiQuotes
)Most imporantly: Usage requires dependency on a hackage package
This is indeed a lot of requirements and heavy machinery to be able to write literals! With this proposal, we will
reduce the dependency to one pragma (HexadecimalFloats
); and when the Haskell standard catches up, even that
will become unnecessary.
Overflow/Underflow¶
The format allows for specifying numbers that are larger or smaller than what the underlying type can represent. For instance
a number like 0x1p5000
would not fit in a Double
and thus would have the special value Infinity
.
(Similar to 1/0
). In the other direction, a number like 0x1p-5000
is too small to be represented, and would round to
the correct value based on the rounding-mode, which is by default round-to-nearest-ties-to-even in Haskell. This is really
no different than how decimal floats are treated in Haskell today.
I think the right thing to do when the literal is too large/small is to print a warning, similar to what we already have for other literals:
Prelude Data.Word> 200000::Word16
<interactive>:3:1: warning: [-Woverflowed-literals]
Literal 200000 is out of the Word16 range 0..65535
3392
However, I’ll note that GHC currently doesn’t provide a similar warning for decimal floats (such as 2E20000
).
Indeed, the recommended practice section of
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf on page 58 says:
The implementation should produce a diagnostic message if a hexadecimal constant cannot be represented exactly in its evaluation format; the implementation should then proceed with the translation of the program.
I think GHC should follow the same practice, and issue warnings for all float values when the coversion
would cause undeflow/overflow,
controlled by the -Woverflowed-literals
flag.
Unresolved Questions¶
None
Implementation Plan¶
Iavor Diatchki (@yav) has a Phabricator patch that implements the proposal (https://phabricator.haskell.org/D3066). which
requires minimal amount of work to be complete. (Essentially the read
instance and the pretty-printer are missing;
as of Feb 20 2017.)