1 Qualified Strings
This proposal replicates -XQualifiedDo for string literals, enabling syntax that is more ergonomic and more powerful than OverloadedStrings. It can also be viewed as replicating -XRebindableSyntax for string literals, but only within a local scope.
1.1 Motivation
1.1.1 Problems with Type Class-driven overloading
OverloadedStrings works by desugaring to a type-class-overloaded function. That leads to certain general shortcomings:
It is a module-wide setting.
Anecdotally, people would rather avoid OverloadedStrings than deal with overloaded strings in the entire module. It’s possible that this is one reason this extension isn’t a default in GHC202X language editions, despite being in GHC for a long time.
Type inference ambiguity.
Consider the following code:
import Data.Text qualified as T
output :: IsString s => s -> IO ()
main = do
  -- Rejected by typechecker if OverloadedStrings is enabled
  output "hello"
  output "world"
  -- -- Requires OverloadedStrings
  -- output $ T.replace " " "_" input
This originally works with no extensions, due to the string literals being typed to concrete String. But if the developer wants to use Text literals with T.replace, adding OverloadedStrings would cause ambiguity to the existing locations because they are now no longer concretely String.
This proposal would allow using a module qualifier to say precisely which function to desugar to, rather than using type classes, in a similar manner as -XQualifiedDo. This would allow writing the previous code as
{-# LANGUAGE QualifiedStrings #-}
main = do
  output "hello"
  output "world"
  output $ T.replace T." " T."_" input
The existing locations would continue working as String, while the new line would unambiguously desugar to T.replace (T.pack " ") (T.pack "_") input.
1.2 Proposed Change Specification
Introduce -XQualifiedStrings that desugars literal string syntax to function calls in a similar way to -XQualifiedDo (docs, proposal).
As long as the desugared expressions/patterns type check, users are free to define these functions however they want. No whitespace is allowed between the . and the module name / literal.
Currently, string literals have the following desugaring:
Expression | Enabled extensions | Desugared expression syntax
With -XQualifiedStrings, we gain the following syntaxes:
New expression syntax | Additional extensions | Desugared expression syntax
New pattern syntax | Additional extensions | Desugared pattern syntax
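As a minimal sketch of this desugaring, the single-file program below writes out by hand what a qualified literal would expand to. The Name type, its fromString, and isAdmin are hypothetical, and the module refers to its own top-level function as Main.fromString only so that the example fits in one file. The view-pattern form used for the pattern case is an assumption based on equality-style literal matching, not a statement of the exact rule.

```haskell
{-# LANGUAGE ViewPatterns #-}
module Main where

import Data.Char (toLower)

-- Hypothetical string-like type, not taken from any library.
newtype Name = Name String
  deriving (Eq, Show)

-- The function that a qualified literal Main."..." would call.
-- It is an ordinary function, so it may normalise its input.
fromString :: String -> Name
fromString = Name . map toLower

-- Expression position: hand-written expansion of Main."Admin".
admin :: Name
admin = Main.fromString "Admin"

-- Pattern position (assumed expansion): compare against the
-- desugared expression through a view pattern.
isAdmin :: Name -> Bool
isAdmin ((== Main.fromString "Admin") -> True) = True
isAdmin _ = False

main :: IO ()
main = do
  print admin
  print (isAdmin admin)
  print (isAdmin (Name "guest"))
```

Because fromString here is a plain function rather than a class method, no IsString instance is involved and the result type of each literal is fixed by the chosen module.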
It is highly recommended that every module providing a type with an IsString instance also export a top-level fromString function, so that users can choose locally scoped literals instead of -XOverloadedStrings:
module Data.MyString where
import Data.String qualified as S
data MyString = ...
instance S.IsString MyString where
  fromString = ...
-- Alternatively, this can be defined in another
-- module like Data.MyString.Qualified
fromString :: String -> MyString
fromString = S.fromString
Qualified multiline strings are only allowed if -XMultilineStrings is enabled. Qualified multiline strings are desugared to single line strings first, then desugared as a qualified string literal. See Multiline Strings for more information.
1.2.1 Lexical Structure
Section 10.2 of the Haskell 2010 report defines:
literal → integer | float | char | string
Proposal #569 extended this to:
literal → integer | float | char | string | multilineString
This proposal further extends it to add modid . string and modid . multilineString:
literal → integer | float | char | string | multilineString | modid . string | modid . multilineString
1.2.2 Module name resolution
Module names are resolved immediately, when parsing a quote. This matches the behavior of resolving modules in normal qualified values in quotes.
module A where
import OneImpl qualified as M
-- Immediately resolves to OneImpl."foo"
-- Errors if M is not in scope
foo = [| M."foo" |]
1.3 Proposed Library Change Specification
1.3.1 Template Haskell
We intentionally avoid adding a “qualified string” constructor to Template Haskell (at least for now), since it’s simply syntax sugar that can be represented in Template Haskell as an explicit appE (varE 'MyMod.fromString) (litE $ stringL "..."). This is also consistent with MultilineStrings, which doesn’t have a TH constructor.
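The explicit Template Haskell form can be tried today. The sketch below is only an illustration: it uses Data.String, imported as S, as a stand-in for MyMod, so the splice builds the same expression that S."hello" would denote under this proposal.

```haskell
{-# LANGUAGE ImportQualifiedPost #-}
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Data.String qualified as S
import Language.Haskell.TH (appE, litE, stringL, varE)

-- Hand-built equivalent of the qualified literal S."hello":
-- apply S.fromString to an ordinary string literal.
hello :: String
hello = $(appE (varE 'S.fromString) (litE (stringL "hello")))

main :: IO ()
main = putStrLn hello
```

The name quote 'S.fromString is resolved where the quote is written, so the generated code does not depend on what S means at the splice site.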
1.4 Examples
1.4.1 ByteString
It’s a known issue that ByteString has surprising IsString behavior, due to ambiguity in how to handle Unicode characters.
With QualifiedStrings, bytestring could define the following modules:
module Data.ByteString.Qualified.Ascii where
-- truncates unicode
fromString :: String -> ByteString
fromString = Char8.pack
module Data.ByteString.Qualified.Utf8 where
-- encodes unicode
fromString :: String -> ByteString
fromString = BS.toStrict . BS.toLazyByteString . BS.stringUtf8
Users would then be forced to decide what behavior they want (and can switch between the two!):
import Data.ByteString qualified as BS
import Data.ByteString.Qualified.Ascii qualified as Ascii
import Data.ByteString.Qualified.Utf8 qualified as Utf8
main = do
  -- [98,108,97,158]
  print $ BS.unpack Ascii."bla語"
  -- [98,108,97,232,170,158]
  print $ BS.unpack Utf8."bla語"
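The two behaviours can be checked with the current bytestring package by calling the conversion functions directly. In this sketch, asciiFromString and utf8FromString are hypothetical local stand-ins for the fromString functions of the two proposed modules, which do not exist in bytestring today.

```haskell
{-# LANGUAGE ImportQualifiedPost #-}
module Main where

import Data.ByteString (ByteString)
import Data.ByteString qualified as BS
import Data.ByteString.Builder qualified as Builder
import Data.ByteString.Char8 qualified as Char8
import Data.ByteString.Lazy qualified as Lazy

-- Stand-in for Data.ByteString.Qualified.Ascii.fromString:
-- keeps only the low 8 bits of each code point.
asciiFromString :: String -> ByteString
asciiFromString = Char8.pack

-- Stand-in for Data.ByteString.Qualified.Utf8.fromString:
-- encodes each code point as UTF-8.
utf8FromString :: String -> ByteString
utf8FromString = Lazy.toStrict . Builder.toLazyByteString . Builder.stringUtf8

main :: IO ()
main = do
  print (BS.unpack (asciiFromString "bla語"))  -- [98,108,97,158]
  print (BS.unpack (utf8FromString "bla語"))   -- [98,108,97,232,170,158]
```

The character 語 is U+8A9E, so truncation keeps the byte 0x9E (158), while UTF-8 encoding yields the three bytes 0xE8 0xAA 0x9E.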
1.5 Effect and Interactions
With QualifiedStrings, there’s no more typeclass ambiguity; e.g. the text library could provide a module like:
module Data.Text.Qualified where
import Data.Text
fromString :: String -> Text
fromString = pack
and users can do
import Data.Text.Qualified qualified as T
main = print T."asdf"
The equivalent code with OverloadedStrings would have failed to compile with -Wall -Werror enabled (due to type defaulting).
1.5.1 Interactions with other extensions
This proposal is related to QualifiedLists and QualifiedNumerics, but all three proposals are orthogonal to each other.
Qualified multiline strings are allowed when -XMultilineStrings is enabled, as mentioned in the specification.
Allow arbitrary identifiers as fields in OverloadedRecordDot has similar syntax to the proposed qualified string literal, but as M.bar is parsed as a qualified identifier even with OverloadedRecordDot, it makes sense that M."bar" is also parsed as a qualified literal.
Allow native string interpolation syntax proposes adding string interpolation syntax with s"...". If both proposals are accepted, this syntax could provide a mechanism similar to JavaScript’s tagged template literals. See the other proposal for more details.
1.6 Costs and Drawbacks
Development and maintenance should be low effort, as the core implementation is in the renamer step, and typechecking would proceed as normal.
The syntax is approachable for novice users and should not pose an extra barrier to understanding.
1.7 Backward Compatibility
No breakage, as the new syntax is only enabled with the extension.
Furthermore, turning on the extension will generally not break existing code. Without the extension, existing code written as M."asdf" is parsed as function composition between a data constructor M and a string literal, which typechecks only if there is an IsString instance for a function type.
1.8 Alternatives
Use PatternSynonyms for string literals in patterns.
The view pattern more closely matches Section 3.17.2 of the Haskell 2010 Report.
1.8.1 Future work
Some literals are not supported yet (Chars, unboxed literals) due to a lack of use cases, but support for them could be added in the future.
Future work could allow compile-time logic, e.g. $M."hello" => $(M.fromString [|"hello"|]), but that is out of scope for this proposal.
1.9 Implementation Plan
Brandon Chinn will volunteer to implement.