What stable type system features we are going to preserve from OCaml

In general, we try to preserve the ML type system as much as we can.

We only remove (or decline to support) a feature under the following conditions:

  • It does not translate well to existing JS semantics.
  • The maintenance overhead is nontrivial compared with the benefit (we are a small team and want development to stay focused).

If a feature meets both conditions, it will be dropped. In the foreseeable future (the next two years), there are two
type system features that we plan to remove:

  • OCaml style classes

OCaml-style OO never translated well into JS. The maintenance overhead is non-trivial because we made many modifications to the runtime encoding, while in the native compiler the class backend uses a lot of Obj.magic-style code whose invariants are hard to preserve with our own patches. Worse, bugs of this kind don’t always show up; they only appear when you actually hit that code path in some niche case.

The good news is that we adopted a subset of OCaml-style classes that can be compiled to both JS and native code (it is there but not yet explored), so we still get structural typing for free.

This removal is under way and will be delivered in the next major release. (Thanks to the cleanup, the structural typing experience will be even better; we will cover this in the release notes.)

  • Format/Printf

When you use Printf for a hello world, the generated code size is huge. The underlying encoding of Format is fairly complex; it is interesting from a type system point of view, but it is not pragmatic when you have to pay so much for a type-safe “hello world” on the JS backend. I made a demo here to show the cost: it is around 128KB for a hello world.

The other issue is that some complex ad-hoc logic is employed to keep the type checker happy, so it is effectively a language feature rather than a library. That makes cross-compiling difficult when the internals of Format change.
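To illustrate the point about type checking, here is a minimal sketch (the names `fmt` and `describe` are made up for this example). In OCaml a format string literal is not an ordinary string: the type checker assigns it a format type derived from the conversions it contains, which is part of why Printf and Format behave more like a language feature than a plain library.

```ocaml
(* The literal below is typed as a format, not as a string: the checker
   reads "%d" and "%s" and derives the argument types int and string. *)
let fmt : (int -> string -> string, unit, string) format = "%d items in %s"

(* Printf.sprintf uses that type to decide which arguments to accept. *)
let describe (count : int) (where : string) : string =
  Printf.sprintf fmt count where

(* Prints "3 items in cart". *)
let () = print_endline (describe 3 "cart")
```

Swapping the two arguments of `describe`, or changing `%d` to `%s` without updating the annotation, is rejected at compile time.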

Since removing this module would cause some breakage, we are going to add a warning first and remove it a few releases later. (If you do use it today, remove the usage and check the bundle size; you will gain a lot.)

These two are the main ML type system features we are not going to support. Your feedback is welcome!

22 Likes

I think this all makes sense. I’ve been bitten by the printf bloat previously - and didn’t notice that was the culprit for a while.

I will say one of the things I’ve really liked about OCaml is that every feature was considered with performance as a requirement (although this was for native code). I think removing the ones that don’t perform well in JS is the only way to stay true to that ethos. I know there are people who are vocal about their favourite features, but there is also a silent majority who think the whole ReScript team is doing fantastic work. Keep it up!

11 Likes

For a bit of context, last week I was tasked with making some improvements to an internal application that was built using Reason more than a year ago. I’ve not kept up with goings-on in Reason-land, so everything that seems to have happened in the interim has come as a bit of a shock. As much as I’ve come to accept the churn in e.g. Javascript APIs, it’s rare (in my experience) to have one’s chosen language pursue the same approach.

That said, I was able to make the needed improvements using a recent Rescript by the end of last week. There was a fair bit of churn to cope with; yes, we used objects (for good reason IMO), as well as some now disallowed infix operators, but those aspects were relatively minor, and I could get along otherwise; one day of work turned into a full five, but fine.

Seeing this posted yesterday leaves me quite upset though:

The application in question is not small (~22K LOC), and it makes heavy and fairly sophisticated use of Format. But even leaving our particular circumstances out of it, the proposed excision makes no sense:

  1. The rationale given is that Printf-ing hello world yields an unreasonable generated code footprint. So what? Format and Printf are not there for printing trivial strings; they exist to provide a modular, extensible, and composable pretty-printing solution. Unless a novel alternative that fills the same space is being offered, removing them is really irresponsible.
  2. The great thing about Reason Rescript is that you only pay for what you use: an app that doesn’t touch Format doesn’t pay for its weight, but an app that does, does. That seems like the fairest thing in the world. It seems like the sentiment now is that Format/Printf are being removed to save hapless programmers from stumbling into using them for trivial things, and having to pay the commensurate cost.

Finally:

I cannot disagree more. I chose Reason for this project originally because it had the perfect mix of capabilities: easily targeting browsers with all of the power of OCaml (including type-safe, composable pretty-printing, something I knew we would lean on heavily). I’m sure many are concerned about bundle size, but that is something I have literally never thought about once for this project, and something I never thought would be a criterion for recommending breaking stable APIs, given the “pay for what you use” dynamic I already mentioned.

I’ve seen suggestions along the lines of “if you don’t like our direction, go use JSOO”; if that were so straightforward, I’d have done so already. But I guess that may end up being the only option, after much work, and probably spending as long as possible pinned to the last non-Rescripted build of Reason.


Finally-finally, a quote from last August’s https://rescript-lang.org/blog/bucklescript-is-rebranding:

Will BuckleScript (now ReScript) break my existing code?

No. The new syntax & tools sit alongside the existing code. We won’t remove OCaml and Reason support from ReScript for a long time.

:thinking: :thinking: :thinking: :thinking:

1 Like

Since I wrote my first reply, I learned of https://anmonteiro.com/2021/03/on-ocaml-and-the-js-platform/, which might be a great way forward for our purposes, and maybe for others, too. I’ll have to disentangle the recent enhancements I’ve made from my removal of the object system usage, etc., but that would absolutely be easier than e.g. moving to JSOO (at least, given my past experiments with it where using e.g. npm modules and certain JS APIs was quite difficult).

@clmill why couldn’t you just find a printf package on npm? There’s plenty there

They wouldn’t be type-safe like OCaml’s implementation, and it would still be a different API because Format supports custom formatters, among other things.
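For readers unfamiliar with custom formatters, here is a minimal sketch of what they look like with the standard Format module. The `point` type and the `pp_point`, `pp_points`, and `show_points` names are invented for this example and do not come from any project discussed in this thread.

```ocaml
type point = { x : int; y : int }

(* A printer for one value; any function of this shape can be plugged into %a. *)
let pp_point (out : Format.formatter) (p : point) : unit =
  Format.fprintf out "(%d, %d)" p.x p.y

(* Printers compose: a list printer is built from an element printer. *)
let pp_points (out : Format.formatter) (ps : point list) : unit =
  let pp_sep out () = Format.pp_print_string out "; " in
  Format.fprintf out "[%a]" (Format.pp_print_list ~pp_sep pp_point) ps

(* Render to a string instead of a channel. *)
let show_points (ps : point list) : string =
  Format.asprintf "%a" pp_points ps

(* Prints "[(1, 2); (3, 4)]". *)
let () = print_endline (show_points [ { x = 1; y = 2 }; { x = 3; y = 4 } ])
```

Passing a printer of the wrong type to `%a` (say, an `int` printer for a `point`) is a compile-time error, which is the property an untyped npm printf package would not give you.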

Another option would be for someone to package up Printf and Format as a separate npm package for ReScript, unless the language-level support (parsing format strings and GADTs) is also being removed, in which case that wouldn’t work.

At some point we’re gonna pull out https://github.com/rescript-lang/syntax/blob/f27e1079433430cd9017f16c905887d7e6e27f87/src/res_doc.mli and advertise it. This is a much better alternative. That entire little thing is a superset of Prettier and others while being much cleaner and powering our entire code formatting.

3 Likes

Hi, I am sorry this causes you some pain.
That’s exactly why we are giving a heads-up notice: to give you time to prepare for the transition (if you want).

If it were indeed “you only pay for what you use”, there would be no reason to remove it. The reality is that it breaks a property we desire: cross-compiling against any version of the native compiler, so that our compiler will always work with the latest native compiler.

Our community is small and bad things could happen; for example, some day we might not have the resources for full-time developers to work on this project. With this property in place, we can always benefit from upstream enhancements.

I hope I can have your trust that everything I have done is for the health of this project. It may introduce some pain, but it is for long-term healthy growth.

10 Likes

Thanks for the explanation, the additional details are helpful

I think this would be a good detail to explicitly include in your original post. It might seem trivial, but to me this seems like a much greater benefit than just:

Do you mind clarifying this?

Do you mean the compiler internals using OCaml classes have been removed or support for them is being removed?

1 Like

Because Format et al. are type-safe, and many of the formatters we have are generated and type-directed. We’re not using Format for interpolating e.g. scalar values into strings.

But this kind of mindset gets you pretty quickly to, “why not just use Typescript?”

Sure, and such notice is appreciated. However, the whole effort is really at odds with the stated answer to the “Will BuckleScript (now ReScript) break my existing code?” FAQ.

…except if those enhancements are in the N areas which Rescript just doesn’t support anymore. It seems clear that Rescript really is a new language; maybe inspired by Reason and OCaml and using a lot of the latter’s facilities, but when tentpole capabilities are removed wholesale, the result is something else.

Trust is a very personal thing; the questions here are just about utility. Rescript is a (functionally IMO) new thing that looks like it won’t offer the set of things that we looked for when we first built our project. That’s okay! I just wish in hindsight I had understood that the OCaml lineage was really being severed as much as it seems set to be.

2 Likes

Note that this not only helps us reduce the maintenance overhead and get cross-compiling for free; it also encourages good practice.

Here I rewrote the Arg module so that it no longer depends on Format. It is just a couple of lines of changes, while the reduction in JS size is huge (not just the code removed in the diff, but also ./printf.js and its huge dependencies).
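The flavour of such a refactoring can be sketched as follows. This is not the actual diff; both functions are hypothetical and only show the general idea of replacing a format-string call with plain string concatenation when the message is simple.

```ocaml
(* Before: pulls in Printf (and its format interpreter) for one message. *)
let unknown_option_printf (prog : string) (opt : string) : string =
  Printf.sprintf "%s: unknown option '%s'." prog opt

(* After: the same text built with plain concatenation. *)
let unknown_option_concat (prog : string) (opt : string) : string =
  prog ^ ": unknown option '" ^ opt ^ "'."

(* Both versions produce identical output. *)
let () =
  assert (unknown_option_printf "app" "-x" = unknown_option_concat "app" "-x")
```

The trade-off is that concatenation only handles strings directly, so numbers need an explicit `string_of_int` or similar.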

Now the Arg module is of much higher quality than the code before refactoring. This also explains why we were not very attracted to existing OCaml libraries. The native backend and the JS backend have very different requirements, and it is very hard to make high-quality libraries when you have to worry about two such different platforms at the same time.

4 Likes

I’d like to chime in to say that we care about and prioritize not just making things work, but making things work well. This is philosophically important for a language that’s geared toward shipping and impressing enough to reach a wider audience and stick around. Although we already try our hardest to aim for some existing folks’ cross-platform needs, sometimes some pieces are simply too different and we’d be better off guiding folks toward making dedicated small pieces for platform-specific needs than a catch-all solution, especially one here that has hidden costs that aren’t clear until users accidentally ship it.

Basically, if one wants to do cross-platform, it’s better to aim for the intersection of good features on both platforms and write that as shared code, plus platform-specific code, instead of aiming for the union of both platforms and trying to abstract over that while thinking that one knows better than either platform’s maintainers. Of course, this assumes that said intersection of features is large and useful enough, which is very much still the case. It’s not like we’re removing arrays.

Concretely, there are much better alternatives than Format for most tasks, though admittedly this warrants more and earlier promotions from us.

8 Likes

Obviously everyone behind Rescript believes it is heading towards being a better language. Great!

But you can do that without posting things like compatibility statements that are quite obviously false:

Will BuckleScript (now ReScript) break my existing code?

No. The new syntax & tools sit alongside the existing code. We won’t remove OCaml and Reason support from ReScript for a long time.

Yes, I’m repeating the quote from https://rescript-lang.org/blog/bucklescript-is-rebranding. In large part, Rescript is defined by how it breaks from OCaml and Reason.

This is all academic to me at this point, but it is strange to have returned to this space to see compatibility assurances paired with what seems like an exploratory approach to programming language design. Absent divergent forks (I’m reminded of Perl 5 vs 6), you can’t do both.

1 Like

Well it says “for a long time” and nothing has been removed yet. Defining “a long time” is as personal as trust I guess. Also it won’t break your code as you don’t have to update to the new version. Current version is quite stable at least to my taste.

I would like to also address the “why not just use Typescript?” sentiment as well. And as with the above it’s my personal position (I’m not affiliated with the rescript team in any way). I would use Typescript instead of reason if it wasn’t so damn awkward and complex as a type system. And that’s what I want from rescript. I started with ReasonML like a lot of folks out here. I switched to rescript for one of our projects a couple of months ago. I don’t want to go back to ReasonML. Yes, async is still annoying and there are some things I’m not a fan of. But as a “compile-to-js” language it’s the best on the market.

Again, my opinion only

4 Likes

Just popping in to offer that I was recently profiling my webpack bundle, and bs-platform alone takes up 500k (ungzipped).

Format seems to be a significant part of this.

format.js = about 50k
camlinternalFormat = about 224k
caml_format = about 20k.

Total 294k.

9 Likes

If you don’t use it, it won’t be contributing to the bundle size. There are a lot of use cases where bundle size isn’t an issue (nodejs, electron, react-native). So it’s not entirely useless.

5 Likes

For the record, we recently wanted to generalize type char = int to fix some subtle bugs and pave the way for proper unicode support.

Everything went quite smoothly until we hit the bootstrapping issue with camlinternalFormatBasics. This has bitten other people too: https://www.dhil.net/blog/posts/2017-11-21-bootstrapping-caml-with-format.html

To conclude, the format module is tightly coupled with how the type system works, so even minor changes become quite subtle.

1 Like

But the good news is that unicode support should be close once we remove the format module.
The unicode support will be put on hold for a while.
:ship:

6 Likes

This is interesting. Would it mean that functions like Js.String.codePointAt would essentially return a char? And, on a language level, that unicode chars could compile? (e.g. '😺'). That seems potentially useful for implementing a parser in ReScript.