Nikita Volkov bio photo

Nikita Volkov

Email Twitter LinkedIn Github Stackoverflow

Today I’m releasing the “record” library, which is an API of just two quasi-quoters, providing a full-scale solution to the notorious records problem of Haskell!

Contents

First, a bit of a background on the problem.

The records problem

The current record system is notorious for its major flaws:

  1. It does not solve the namespacing problem. I.e., you cannot have two records share a field name in a single module. E.g., the following won't compile:

    data A = A { field :: String }
    data B = B { field :: String }
  2. It's partial. The following code results in a runtime error despite compiling without any warnings:

    data A = A1 { field1 :: String } | 
             A2 { field2 :: String }
    
    main = print $ field1 $ A2 "abc"
  3. It does not allow to use the same field name for different types across constructors:

    data A = A1 { field :: String } | 
             A2 { field :: Int }
  4. Updating nested records is exponentially tedious:

    addManStk team = team {
      manager = (manager team) {
        diet = (diet (manager team)) {
          steaks = steaks (diet (manager team)) + 1
        }
      }
    }

Turns out, a lot of proposals have been made to solve this problem in the past two decades, even I published a proposal a year ago. But nothing got done. Well, except for the 4th flaw, which was perfectly solved by a plethora of Lens libraries.

So, you must be wondering now: so many attempts by giants like SPJ, for Christ’s sake, yet nothing gets implemented for more than 15 years and then somebody just jumps at it with a library? Well, turns out, a solution like the one I’m presenting you with only became possible since GHC 7.6, which introduced the type-level string literals.

Alright, alright, you must be anxious now. Here comes the

Introduction to the “record” library

In the mentioned proposal I express the ideas that I’ve tried to implement in this library. Following is what is achieved.

Type declarations

type Person = 
  [r| {name :: String, 
       birthday :: {year :: Int, month :: Int, day :: Int},
       country :: Country} |]

type Country =
  [r| {name :: String, 
       language :: String} |]

Notice a few things: we can safely use the field name in different records, the records themselves are not declarations like ADT but first class types, meaning that you can use them in any context, where type is usable: as a parameter to an ADT, in a function signature, inside the record itself as is done with the birthday field.

Value expressions

Pretty straight-forward as well:

person :: Person
person =
  [r| {name = "Yuri Alekseyevich Gagarin", 
       birthday = {year = 1934, month = 3, day = 9},
       country = {name = "Soviet Union", language = "Russian"}} |]

Value manipulation

As stated earlier, this problem is already solved perfectly with the Lens abstraction. So I didn’t reinvent any wheels in that area and just provided an interface to records:

getPersonBirthdayYear :: Person -> Int
getPersonBirthdayYear =
  view ([l|birthday|] . [l|year|])

Note that for convenience it’s possible to perform the lens composition from inside of the quotation:

setPersonBirthdayYear :: Int -> Person -> Person
setPersonBirthdayYear =
  set [l|birthday.year|]

That’s it. It’s pretty much everything you need to be able to use records! Following is a detailed explanation of the library’s features.

Features

No name collisions

Yes! Finally no name collisions. Use whatever field names you want!

No need to import any modules to access the fields of a datastructure

Even with the nifty “lens” library your lens compositions will often contain qualified module references like the following:

view (A.value . B.value)

Not only does “record” just not inherit this problem, but there is nothing to namespace in it above all. Field names are represented not by definitions, but by type level values, and as such they have no namespace and hence the problem of collisions simply does not pertain to them.

So wherever those A and B originate from, you’ll always refer to their lenses the same way:

view ([l|value|] . [l|value|])

Both record values and types are first class citizens

No need to predefine any types or functions. You can construct and access record values and types with any fields from any place. Just like with tuples!

No memory footprint penalties

Records are as lightweight as tuples or single-constructor datatypes! All their functionality is implemented using type-level features and requires no runtime overhead.

This is because there’s actually not much trickery that the quasi-quoters hide from you, all the functionality is achieved using type-level string literals and a predefined set of flat record types for different arities, like the following:

data Record2 (n1 :: Symbol) v1 (n2 :: Symbol) v2 =
  Record2 v1 v2  

The above is a definition of a predefined polymorphic record type with arity of 2. n1 and n2 are type-level string literals used for field names, v1 and v2 are the normal types of their associated values. As you can see the field names are represented by phantom types and do not get stored in the constructor, hence they occupy no memory during runtime.

As you might have guessed, there exists a limited set of predefined records for different aritites. There’s 24 of them. It should be more than enough for any sane use case.

No performance penalties

That’s right! No lookups or traversals need to be done. The field gets selected at compile time at type-level.

Field names are type-checked

You can’t refer to an inexistent field.

It’s impossible to assign the same field twice

The quasi-quoter performs checking on that matter as well.

Compiler error messages are comprehendable

E.g.,

functionOnARecord :: [r|{name :: String, age :: Int}|] -> Int
functionOnARecord =
  view [lens|ageasdf|]

results in

Could not deduce (Record.Types.FieldOwner
                    "ageasdf" Int (Record.Types.Record2 "age" Int "name" String))
  arising from a use of ‘Record.Types.lens’
  ...

No problems with type inference

Yep. You heard me.

Order of fields does not matter

[r|{year = 1958, month = 1, day = 18}|] 
  == 
[r|{month = 1, day = 18, year = 1958}|]

How? Simple. The quasi quoters always reorder the fields by name. Since they also ensure that you don’t specify duplicate fields, you’re guaranteed that there’s no ambiguity possible on that matter.

Tuples are records too!

You can manipulate any member of a tuple of an arbitrary arity. The API is the same as with records. Tuple fields are accessed using their integer indexes (starting from 1).

E.g.,

getY :: [r|{name :: String, coordinates :: (Int, Int)}|] -> Int
getY =
  view [l|coordinates.2|]

Identity is like a single-element tuple

runIdentity' :: Identity a -> a
runIdentity' =
  view [l|1|]

This might turn out to be useful in case of lens composition.

Named function parameters

Yes, being first class certainly has its benefits!

connect :: [r| {host :: ByteString,
                port :: Int,
                user :: ByteString,
                password :: ByteString} ] -> 
           IO Connection

instead of the opaque

connect :: ByteString ->
           Int ->
           ByteString ->
           ByteString -> 
           IO Connection

You can still restrict record types

Just wrap them in newtype:

newtype A = A [r| {x :: Int, y :: Int} |]
newtype B = B [r| {x :: Int, y :: Int} |]

Tagged unions are now more flexible

data A =
  A1 [r| {a :: Int} |] |
  A2 [r| {a :: Char} |]

Sorry for a dull example. The point is that constructors have no effect on what’s going on inside of a record.

Project status

Unfortunately since the introduced syntax is not supported by “haskell-src-exts” and since neither does that library expose a parser API, Haskell syntax parsing needs to be reimplemented in the “record” library. This is not a trivial task, so currently the quasi-quoters do not support all of the Haskell syntax. All the basic stuff is supported however: variables, literals, tuple and list syntax. Don’t worry though, if you’ll try to use something unsupported, you’ll get notified during compilation.

Now let’s make this a community effort guys! Please come with your contributions! Maybe some marvelous day we’ll have pushed this far enough to integrate it into the native Haskell syntax.