1 Qualified Strings

This proposal replicates -XQualifiedDo for string literals, enabling syntax that is more ergonomic and more powerful than OverloadedStrings. Another way to view it is as -XRebindableSyntax for string literals, but limited to a local scope.

1.1 Motivation

1.1.1 Problems with Type Class-driven overloading

OverloadedStrings works by desugaring to a type-class-overloaded function. That leads to certain general shortcomings:

  • It is a module-wide setting.

    • Anecdotally, people would rather avoid OverloadedStrings than deal with overloaded strings in the entire module.

    • It’s possible that this is one reason this extension isn’t a default in GHC202X language editions, despite being in GHC for a long time.

  • Type inference ambiguity.

    • Consider the following code:

      import Data.Text qualified as T
      
      output :: IsString s => s -> IO ()
      
      main = do
        -- Rejected by typechecker if OverloadedStrings is enabled
        output "hello"
        output "world"
      
        -- -- Requires OverloadedStrings
        -- output $ T.replace " " "_" input
      

      This works with no extensions enabled, because the string literals are typed as the concrete type String. But if the developer wants to use Text literals with T.replace, enabling OverloadedStrings makes the existing call sites ambiguous, because their literals are no longer concretely String.

This proposal would allow using a module qualifier to say precisely which function to desugar to, rather than relying on type classes, in a similar manner to -XQualifiedDo. The previous code could then be written as

{-# LANGUAGE QualifiedStrings #-}

main = do
  output "hello"
  output "world"

  output $ T.replace T." " T."_" input

The existing locations would continue working as String, while the new line would unambiguously desugar to T.replace (T.pack " ") (T.pack "_") input.
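
Since QualifiedStrings is not implemented yet, the desugared form can be tried today by writing it out by hand. The following is a minimal sketch: the body of output (and its extra Show constraint) and the value of input are assumptions made only so the example runs, since the original snippet leaves both unspecified.

```haskell
{-# LANGUAGE ImportQualifiedPost #-}

import Data.String (IsString)
import Data.Text qualified as T

-- Assumed body: the proposal gives only a signature for `output`.
-- The Show constraint is added here so there is something to print.
output :: (IsString s, Show s) => s -> IO ()
output = print

-- Hypothetical input value; the proposal leaves `input` undefined.
input :: T.Text
input = T.pack "hello world"

-- Hand-written desugaring of: T.replace T." " T."_" input
replaced :: T.Text
replaced = T.replace (T.pack " ") (T.pack "_") input

main :: IO ()
main = do
  -- No OverloadedStrings, so these literals are plain String.
  output "hello"
  output "world"
  -- The Text call site picks its conversion function explicitly.
  output replaced
```

Because OverloadedStrings is not enabled, the first two calls resolve s to String without any defaulting, and only the last call involves Text.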

1.2 Proposed Change Specification

Introduce -XQualifiedStrings that desugars literal string syntax to function calls in a similar way to -XQualifiedDo (docs, proposal).

As long as the desugared expressions/patterns type check, users are free to define these functions however they want. No whitespace is allowed between the . and the module name / literal.

Currently, string literals have the following desugaring:

  Expression     Enabled extensions     Desugared expression syntax
  -------------- ---------------------- -----------------------------
  "hello"        (none)                 "hello"
  "hello"        -XOverloadedStrings    GHC.Exts.fromString "hello"
  """hello"""    -XMultilineStrings     "hello"

With -XQualifiedStrings, we gain the following syntaxes:

  New expression syntax    Additional extensions    Desugared expression syntax
  ------------------------ ------------------------ -----------------------------
  M."asdf"                 (none)                   M.fromString "asdf"
  M."""asdf"""             -XMultilineStrings       M.fromString "asdf"

  New pattern syntax    Additional extensions    Desugared pattern syntax
  --------------------- ------------------------ ------------------------------------
  M."asdf"              (none)                   ((== M.fromString "asdf") -> True)
  M."""asdf"""          -XMultilineStrings       ((== M.fromString "asdf") -> True)
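
The pattern desugaring listed above can be exercised today with -XViewPatterns. The sketch below is illustrative only: fromStringText is a hypothetical stand-in for M.fromString, and isGreeting is an invented function showing what a clause such as isGreeting M."hello" = True would expand to.

```haskell
{-# LANGUAGE ImportQualifiedPost #-}
{-# LANGUAGE ViewPatterns #-}

import Data.Text qualified as T

-- Hypothetical stand-in for M.fromString.
fromStringText :: String -> T.Text
fromStringText = T.pack

-- Hand-written desugaring of the pattern M."hello":
-- compare the scrutinee against the converted literal and match on True.
isGreeting :: T.Text -> Bool
isGreeting ((== fromStringText "hello") -> True) = True
isGreeting _ = False

main :: IO ()
main = do
  print (isGreeting (T.pack "hello"))   -- True
  print (isGreeting (T.pack "goodbye")) -- False
```

Note that this desugaring only requires an Eq instance on the result type of the chosen fromString function, in the same way that overloaded literal patterns rely on (==).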

It is highly recommended that all types with IsString instances also export a top-level fromString function, so that users can opt into locally scoped overloading instead of -XOverloadedStrings:

module Data.MyString where

import Data.String qualified as S

data MyString = ...

instance S.IsString MyString where
  fromString = ...

-- Alternatively, this can be defined in another
-- module like Data.MyString.Qualified
fromString :: String -> MyString
fromString = S.fromString

Qualified multiline strings are only allowed if -XMultilineStrings is enabled. Qualified multiline strings are desugared to single line strings first, then desugared as a qualified string literal. See Multiline Strings for more information.

1.2.1 Lexical Structure

Section 10.2 of the Haskell 2010 report defines:

literal → integer | float | char | string

Proposal #569 extended this to:

literal → integer | float | char | string | multilineString

This proposal further extends it to add modid . string and modid . multilineString:

literal → integer | float | char | string | multilineString | modid . string | modid . multilineString
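
To make the new modid . string production concrete, here is a small recognizer written with ReadP from base. It is only a sketch of the grammar rule, not GHC's lexer: all function names are invented for illustration, multiline strings are not handled, and the string body is parsed by reusing the Read instance for String.

```haskell
import Data.Char (isAlphaNum, isUpper)
import Text.ParserCombinators.ReadP

-- conid: an uppercase letter followed by letters, digits, underscores, or primes.
conid :: ReadP String
conid = (:) <$> satisfy isUpper <*> munch isIdChar
  where
    isIdChar c = isAlphaNum c || c == '_' || c == '\''

-- modid: one or more conids separated by dots, e.g. Data.Text.
modid :: ReadP [String]
modid = sepBy1 conid (char '.')

-- string: reuse `reads` for String, but reject leading whitespace,
-- which `reads` would otherwise silently skip.
stringLit :: ReadP String
stringLit = do
  rest <- look
  case rest of
    '"' : _ -> readS_to_P (reads :: ReadS String)
    _ -> pfail

-- modid . string, with no whitespace around the dot.
qualifiedString :: ReadP ([String], String)
qualifiedString = (,) <$> modid <* char '.' <*> stringLit

-- Succeeds only if the whole input is exactly one qualified string literal.
parseQualified :: String -> Maybe ([String], String)
parseQualified s = case readP_to_S (qualifiedString <* eof) s of
  [(result, "")] -> Just result
  _ -> Nothing

main :: IO ()
main = do
  print (parseQualified "Data.Text.\"a b\"") -- Just (["Data","Text"],"a b")
  print (parseQualified "M. \"x\"")          -- Nothing: whitespace after the dot
```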

1.2.2 Module name resolution

Module names are resolved immediately when a quote is parsed. This matches how module qualifiers on ordinary qualified names are resolved inside quotes.

module A where

import OneImpl qualified as M

-- Immediately resolves to OneImpl."foo"
-- Errors if M is not in scope
foo = [| M."foo" |]

1.3 Proposed Library Change Specification

1.3.1 Template Haskell

We intentionally avoid adding a “qualified string” constructor to Template Haskell (at least for now), since it’s simply syntax sugar that can be represented in Template Haskell as an explicit appE (varE 'MyMod.fromString) (litE $ stringL "…"). This is also consistent with MultilineStrings, which doesn’t have a TH constructor.
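
As a rough illustration of that explicit representation, the splice below builds the application by hand. Data.Text.pack is used here as an assumed stand-in for MyMod.fromString, and greeting is a name invented for the example.

```haskell
{-# LANGUAGE ImportQualifiedPost #-}
{-# LANGUAGE TemplateHaskell #-}

import Data.Text qualified as T
import Language.Haskell.TH (appE, litE, stringL, varE)

-- Builds the expression `T.pack "hello"` at compile time, which is the
-- shape a qualified string literal would take in Template Haskell.
greeting :: T.Text
greeting = $(appE (varE 'T.pack) (litE (stringL "hello")))

main :: IO ()
main = print greeting
```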

1.4 Examples

1.4.1 ByteString

It’s a known issue that ByteString has surprising IsString behavior, due to ambiguity in how to handle Unicode characters.

With QualifiedStrings, bytestring could define the following modules:

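module Data.ByteString.Qualified.Ascii where

import Data.ByteString (ByteString)
import Data.ByteString.Char8 qualified as Char8

-- Truncates each character to its low 8 bits
fromString :: String -> ByteString
fromString = Char8.pack

module Data.ByteString.Qualified.Utf8 where

import Data.ByteString (ByteString)
import Data.ByteString.Builder qualified as Builder
import Data.ByteString.Lazy qualified as Lazy

-- Encodes each character as UTF-8
fromString :: String -> ByteString
fromString = Lazy.toStrict . Builder.toLazyByteString . Builder.stringUtf8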

Users would then be forced to decide what behavior they want (and can switch between the two!):

import Data.ByteString qualified as BS
import Data.ByteString.Qualified.Ascii qualified as Ascii
import Data.ByteString.Qualified.Utf8 qualified as Utf8

main = do
  -- [98,108,97,158]
  print $ BS.unpack Ascii."bla語"

  -- [98,108,97,232,170,158]
  print $ BS.unpack Utf8."bla語"
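
The two behaviors can be checked today by calling the underlying conversion functions directly. In this sketch, asciiFromString and utf8FromString are hypothetical names standing in for the fromString functions of the two proposed modules, which do not exist in bytestring.

```haskell
{-# LANGUAGE ImportQualifiedPost #-}

import Data.ByteString qualified as BS
import Data.ByteString.Builder qualified as Builder
import Data.ByteString.Char8 qualified as Char8
import Data.ByteString.Lazy qualified as BL

-- Stand-in for Data.ByteString.Qualified.Ascii.fromString:
-- keeps only the low 8 bits of each character.
asciiFromString :: String -> BS.ByteString
asciiFromString = Char8.pack

-- Stand-in for Data.ByteString.Qualified.Utf8.fromString:
-- encodes each character as UTF-8.
utf8FromString :: String -> BS.ByteString
utf8FromString = BL.toStrict . Builder.toLazyByteString . Builder.stringUtf8

main :: IO ()
main = do
  print (BS.unpack (asciiFromString "bla語")) -- [98,108,97,158]
  print (BS.unpack (utf8FromString "bla語"))  -- [98,108,97,232,170,158]
```

The character 語 is U+8A9E, so truncation keeps only the byte 0x9E (158), while UTF-8 encoding produces the three bytes 0xE8 0xAA 0x9E.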

1.5 Effect and Interactions

With QualifiedStrings, there’s no more typeclass ambiguity; e.g. the text library could provide a module like:

module Data.Text.Qualified where

import Data.Text

fromString :: String -> Text
fromString = pack

and users can do

import Data.Text.Qualified qualified as T

main = print T."asdf"

The equivalent code with OverloadedStrings would have failed to compile with -Wall -Werror enabled (due to type defaulting).

1.5.1 Interactions with other extensions

1.6 Costs and Drawbacks

Development and maintenance should be low effort, as the core implementation is in the renamer step, and typechecking would proceed as normal.

The syntax is approachable for novice users and shouldn’t add an extra barrier to understanding code.

1.7 Backward Compatibility

No breakage, as the new syntax is only enabled with the extension.

Furthermore, turning on the extension will generally not break existing code. Any existing code written as M."asdf" would be parsed as function composition between a data constructor and a literal, which would only typecheck if someone adds an IsString instance for a function type.
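
For illustration, the contrived program below is the kind of existing code that would change meaning. It is a sketch under stated assumptions: the orphan IsString instance for functions and the name legacy are invented for the example, and OverloadedStrings must be enabled for the literal to take a function type at all.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- Invented instance: a string literal becomes a constant function.
instance IsString b => IsString (a -> b) where
  fromString s = const (fromString s)

-- Today this lexes as `Just . "asdf"`: composition of the data
-- constructor Just with a literal of type () -> String.
legacy :: () -> Maybe String
legacy = Just."asdf"

main :: IO ()
main = print (legacy ())
```

Without an instance like this one, Just."asdf" is a type error today, so enabling the extension cannot silently change the meaning of code that currently compiles.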

1.8 Alternatives

  • Use PatternSynonyms for string literals in patterns

    • The view pattern more closely matches Section 3.17.2 of the Haskell 2010 Report.

1.8.1 Future work

  • Some literals (Chars, unboxed literals) are not supported yet due to a lack of use cases, but support could be added in the future.

  • Future work could be done to allow compile time logic, e.g. $M."hello" => $(M.fromString [|"hello"|]), but that is out of scope of this proposal.

1.9 Implementation Plan

Brandon Chinn will volunteer to implement.