[ANN] Enhanced Ergonomics for Record Types

TypeScript’s Pick relies on compile-time reflection over record keys:

type Pick<T, K extends keyof T> = {
    [P in K]: T[P];
};

I wonder how a feature like this would affect ReScript’s compilation times.

There have been attempts to mimic omit or pick with a PPX.
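
For context, here is a minimal sketch (the type and function names are invented) of what a hand-written "pick" looks like in current ReScript, without language or PPX support: a second record type plus an explicit projection function.

type cat = {name: string, age: int, food: string}

// The "picked" type has to be written out by hand today.
type catName = {name: string}

// And the value-level projection is an ordinary function.
let pickName = (c: cat) => {
  let picked: catName = {name: c.name}
  picked
}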

1 Like

Interested in the type-level strings here too.

To me, these operations (add/remove/pick) would be better exposed as a library module like Belt.Array / Belt.Option, rather than as customizations to the language/compiler.
Something like Belt.Record. The code I imagine would look like this:

module Record = Belt.Record
type catFood =
  | Milk
  | Fish
type catBase = {
    name: string,
    age: int,
    food: catFood
}
type catWithId = catBase->Record.add({id: int})
type catWithoutFood = catBase->Record.omit(["food"])
type catNameAndFood = catBase->Record.pick(["name", "food"])

// More imaginary operations
let doesCatHaveFood = catWithoutFood->Record.find("food") // false
let isCatWithoutFoodSubType = catWithoutFood->Record.compareWithField(catBase, "food") // -1 or 0 or 1

This also means we would need to justify having first-class types (which works against the goal of not adding features to the language), so that we could pass types as arguments to functions.

I want ReScript to stay small and simple; but with TS features being looked at, we might need to think through the possible options for keeping the language lean.

2 Likes

How do you implement those library functions? They would still need language support internally, so you’re back in the same place, but now you’ve also added type-level functions.

You are correct, and I did mention that it works against the goal of not adding features to the language.

My point was simply this: If we have library modules for data structures like Option and Array, it makes sense to follow the same tradition for Record as well.

I realize that the add/remove/pick operations for records are more like pattern matching/destructuring for arrays, in the sense that they operate on the structure of the type rather than on the data of that type.

// for an array arr,
// this requires language support to destructure
let description = switch arr {
| [] => "empty"
| [_] => "single item"
| _ => "multiple items"
}
// whereas this is a library function
let length = arr->Belt.Array.length

// for record type catBase
// this would require language support
catBase & {id: int}
// whereas for data of type catBase
let neko = {
  name: "Neko",
  age: 4,
  food: Fish
}
// this would be a library function
let fields = neko->Belt.Record.keys

With the current set of language features, this is an important distinction that I missed. Thank you for pointing me to look at what needs language support.

Indeed, there are many little features like pick that one can come up with, and they all add up easily to more complexity.
If, on the other hand, there are more core, powerful features that open up a number of different applications, then those are more likely to be adopted.

For example, a more core feature that makes pick and 5 other new things expressible, while you only have to learn 1 thing, could be interesting.

2 Likes

It is interesting to note that the pick operation is redundant, since it can be replicated with omit/remove. It can get tedious, though, if you had to pick one field out of 10: you would either have to remove the other 9 fields or create a new type without them.

I think the add and remove operations might be sufficient:

type catFood = Milk | Fish
type catBase = {
    name: string,
    age: int,
    food: catFood
}
type catWithId = catBase & {id: int}
type catWithoutFood = catBase ! {food: catFood}
type catNameAndFood = catBase ! {age: int}

// And the ability to chain them:
type person = {
  name: string,
  age: int
}
type catWithOutFoodButWithOwner = catBase ! {food: catFood} & {owner: person}
// generates
// type catWithOutFoodButWithOwner = {
//   name: string,
//   age: int,
//   owner: person
// }

Note: I did not illustrate this with the spread (…) operator because I could not find a good remove syntax.
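
For comparison, here is a sketch of how far you can get today, assuming ReScript’s record type spread: the "add" half of the proposal is already expressible, while "remove" is not.

type catFood = Milk | Fish
type catBase = {
  name: string,
  age: int,
  food: catFood,
}

// catBase & {id: int} in the proposed syntax:
type catWithId = {
  ...catBase,
  id: int,
}

// There is no spread-based equivalent of catBase ! {food: catFood};
// today the remaining fields have to be written out by hand:
type catWithoutFood = {
  name: string,
  age: int,
}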

I think we need to clearly distinguish between compile-time and run-time features:

  • defining or manipulating types is only possible at compile time, since type information is not present in the compiled JS
  • working with data of those types is compiled to JS and only executed at run time
  • accessing object keys (Js.Obj.keys) at run time only returns information about the actual data structure being inspected: optional record fields that were not set on the record are omitted (see the sketch after this list). It is not possible to programmatically retrieve all keys defined in a record type at run time.
  • creating generic code that defines run-time behaviour based on type information is currently not (easily) possible and has so far been solved with PPXes, which are discouraged: e.g. deriving a toString function for any type
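
A small sketch of that third point (the record and value names are invented): optional record fields that are not set simply do not exist on the compiled JS object, so runtime key inspection cannot recover the full type.

type cat = {
  name: string,
  age?: int,
}

let neko: cat = {name: "Neko"}

// Obj.magic is used here only to reuse the record's compiled JS-object
// representation as a Js.t value, purely for illustration.
let keys = Js.Obj.keys(Obj.magic(neko)) // ["name"]; "age" is absent at run time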

Probably the main difference (when coming from JS) is the type guarantees ReScript is able to provide. Most of the need for reflection is covered by the current type system as is. Instead of using reflection in run-time code, you would just tag the record with a variant and pattern match on that.
Furthermore, if you really don’t care about the specific types anymore, you can convert any record to a Js.t and use e.g. Js.Obj.keys.
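
As a rough sketch of the "tag with a variant and pattern match" approach mentioned above (all type names are invented):

type cat = {name: string}
type dog = {name: string, breed: string}

type animal =
  | Cat(cat)
  | Dog(dog)

// No runtime reflection needed: the variant tag tells us which record we have.
let describe = animal =>
  switch animal {
  | Cat({name}) => `a cat called ${name}`
  | Dog({name, breed}) => `a ${breed} called ${name}`
  }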

Therefore I see 2 features that could make sense to me:

  1. more fine-grained control when using the spread operator in records:

    • overwrite existing label
    • spread just a part of an existing record type

    This is already possible by adapting the types (decomposing the complex record type into the necessary smaller “subtypes”; see the sketch after this list). So the question remains: is it necessary for ReScript to mimic the way TS defines data models? Or do we just need to provide more information on how someone coming from TS would achieve specific things in ReScript?

  2. Compiler/library support to enable code generation: currently you either use/write a PPX (which is discouraged) or use the ReScript parser as a library and work on the parsetree to generate code. I’d like to see more support or guidance on this topic from the core team. I still need to follow up on my promise and publish a new topic describing our current approach to code generation in more detail.
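
Returning to point 1, a sketch of the decomposition idea (assuming record type spread; all names are invented): start from the small piece you would otherwise want to pick, and spread it into the larger types.

type catName = {name: string}

type catBase = {
  ...catName,
  age: int,
  food: string,
}

// The "picked" shape is just the small piece plus whatever else it needs.
type catLabel = {
  ...catName,
  id: int,
}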

1 Like

I don’t know if this is a trivial example, but I’d argue the real issue is that CatBase turns out not to be appropriate as a base interface after all.

I think the new approach holds more merit, i.e. build from atoms rather than creating a data structure that becomes too broad (data clumps).

As an aside, personally, when I have to reach for something like Pick, Partial etc, I tend to pause and contemplate whether there’s something else amiss. I think there are valid use cases, e.g. maybe a React component that wraps the properties of a primitive or similar, but in most cases I think one would benefit from approaching the data structures from a different angle.

Going back to the example, while trivial, if we look at the domain, here’s my interpretation.

We have a cat; it may be chipped with an id, and we potentially know its favourite food and/or age.

So to me, a simpler representation might be:

type catFood = | Fish | Milk

type cat = {
  name: string,
  age: option<int>,
  food: option<catFood>,
  id: option<int>,
}
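
A short usage sketch of that shape (the function name is invented), since the options are consumed with ordinary pattern matching:

let describeFood = cat =>
  switch cat.food {
  | Some(Fish) => "likes fish"
  | Some(Milk) => "likes milk"
  | None => "food unknown"
  }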

It’d be interesting to see a real-world use case, I’d personally find it easier to reason about!

2 Likes

Is the use case mostly bindings or some code that the developer has control over?
In the case of bindings, there’s the reality of how things have been set up, and the question of how to map to that given representation in a way that is reasonably efficient to express in the language.

This kind of reminds me of the Wet Codebase talk from Dan Abramov: Dan Abramov The wet codebase - YouTube.

The repetition here might not be a bad thing, or CatBase could just be the wrong abstraction.

Without a real world example it’s hard to give an alternative, but what @lessp suggested could work… I tend to try to avoid option types in my models, unless the options encode all possible valid states.

For CatBase and CatWithId, I’d probably redo them as:

type catFood = | Fish | Milk
type cat = {
  name: string,
  age: int,
  food: catFood,
}

type catEntity = {
  id: int,
  value: cat,
}

Alternatively as a tuple.

type catEntity = (int, cat)

Or if you have many types of values + IDs

type entity<'a> = {
  id: int,
  value: 'a,
}
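
A usage sketch for the generic entity<'a> shape (the values here are invented):

let neko: entity<cat> = {
  id: 1,
  value: {name: "Neko", age: 4, food: Fish},
}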

If you absolutely needed the model you described, this is how I’d do it. Defining the types algebraically like this also composes better in ReScript’s type system.

type catFood = | Fish | Milk
type catDetails = {
  name: string,
  food: option<catFood>,
}
type cat = {
  age: int,
  details: catDetails,
}

1 Like

Your example is bottom-up, which means building bigger types from smaller ones. That is only possible if you are the one building the types. When working with 3rd-party libraries it’s common to go top-down: you are given an interface which you need to break into smaller pieces to operate on in different places. For example, in your component hierarchy you may pass a few props, which are fragments of a bigger type, across the component tree until they are merged in a leaf component down the tree.

That 3rd-party interface may be a big one, like a rich-text-editor state, where you don’t want to manually retype its constituent parts.

Even for that use case, it’s still better to explicitly type out your contract (the types) rather than try to dynamically pick them out of a 3rd-party library. This way you can develop your interface in terms of what you actually need, rather than in terms of whatever happens to be part of the 3rd-party library, and you also become decoupled from that library. While you get the short-term benefit of piggy-backing off the library’s types, you end up coupled to it in the type system. This approach is also recommended even if you’re using TypeScript.
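
A sketch of that boundary mapping (all type and function names here are invented; the third-party shape is hypothetical):

// What *we* actually need from the editor.
type editorContent = {content: string}

// Hypothetical third-party editor state with many more fields.
type thirdPartyEditorState = {
  content: string,
  selectionStart: int,
  history: array<string>,
}

// Map the third-party shape into our contract at the boundary.
let fromThirdParty = (s: thirdPartyEditorState) => {
  let ours: editorContent = {content: s.content}
  ours
}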

4 Likes

We just landed support for type variables in record type spread, both uninstantiated and instantiated ones. Here’s a (contrived) example of what you can now do: rescript-compiler/jscomp/test/record_type_spread.res at ad9e0319e59b083f75134739b84b9c2c1dc42218 · rescript-lang/rescript-compiler · GitHub

Remember that the semantics are still “copy paste”. It’s not in a published release yet, but it will be in the next one. You can try it right now via the package CI builds if you want.
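
For reference, a sketch of the kind of thing the linked test exercises, under the “copy paste” semantics mentioned above:

type t<'a, 'b> = {
  a: 'a,
  b: 'b,
}

// Spreading with instantiated type variables:
type concrete = {
  ...t<int, int>,
  c: string,
}

// Spreading with uninstantiated type variables:
type generic<'a, 'b> = {
  ...t<'a, 'b>,
  c: string,
}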

4 Likes

When targeting a 3rd-party library, the code-reuse-vs-abstraction-complexity dilemma is not really a design decision: of course you glue them together, but the 3rd-party types and abstractions are fixed, and manual duplication can simply break on library updates.

Another thing is that in a nominal type system, extracting types from types may be a good fit where just matching a type shape may not be enough.

If the third-party library API has changed, it’s going to break your code regardless, no?

Why do we want this

type d = {
  ...t<int, int>,
}

and not this?

type d = t<int, int>

Is it so that d is not the same type as t?

No, it is to make d a supertype of t and extend it with some more props. The initial record type spread implementation did not account for spreading types with parameters.
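
In other words (a sketch, with t defined inline for completeness):

type t<'a, 'b> = {x: 'a, y: 'b}

// An alias: d1 *is* t<int, int>, and no fields can be added to it.
type d1 = t<int, int>

// A spread: d2 is a new record type that copies t's fields and can add more.
type d2 = {
  ...t<int, int>,
  z: string,
}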

2 Likes