Runtime-free, type-safe array access for JavaScript interop

danielo515 · August 16, 2021, 8:42pm

First of all, sorry for the long example. I already stripped it down a bit, but I am not sure how to provide a good example.

Today I had my first bad experience with arrays in ReScript. I just happily used what I am used to, which is [0], and because the compiler didn’t complain I thought it was all good. After I got a runtime error about an index out of bounds, I checked and found that for that basic array access the default OCaml standard library was being imported and used. A very bad surprise, I would say.

Without checking the forum, I looked at the documentation and found that the only get available in the Js namespace is unsafe_get. So that is what I went for, and I have to say that I’m quite happy with the outcome despite its verbosity. Because unsafe_get is, well, unsafe, I decided to add the type safety manually and declare that the external JS code returns an array of optional strings:

type t = {args: array<option<string>>}
@module("commander") @new external command: unit => t = "Command"
@send external parse: (t, array<string>) => t = "parse"
@module("path") external resolve: string => string = "resolve"

let getArguments = () => {
  let program = command()->parse(Node_process.argv)

  let fileArg = program.args->Js.Array2.unsafe_get(0)
  let fileToMove = switch fileArg {
  | Some(path) => resolve(path)
  | _ => {
      Js.log(`No file specified. Please provide a file name to move`)
      Node.Process.exit(1)
    }
  }

  {
    "fileToMove": fileToMove,
  }
}

Despite its “unsafe” nature, it forces me to check the content, and the produced JS is very close to native JavaScript (despite the weird formatting):

  var fileArg = program.args[0];
  var fileToMove = fileArg !== undefined ? Path.resolve(fileArg) : (console.log("No file specified. Please provide a file name to move"), Process.exit(1));
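
The plain JavaScript behaviour this relies on can be sketched as follows. This is a minimal illustration written by hand, not compiler output, and firstOrFallback is a hypothetical helper name used only here.

```javascript
// Hypothetical helper that mirrors the shape of the generated check above.
function firstOrFallback(args, fallback) {
  // Indexing past the end of a JS array does not throw; it yields undefined.
  var first = args[0];
  return first !== undefined ? first : fallback;
}

console.log(firstOrFallback(["a.txt"], "none")); // a.txt
console.log(firstOrFallback([], "none"));        // none
```

Because the missing element shows up as undefined, typing the array elements as option makes the compiler insist on exactly this check.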

Later I checked the forum and saw that the recommended way is to open Belt and use the "native" method to access the array. So I changed it like this:

type t = {args: array<string>}
@module("commander") @new external command: unit => t = "Command"
@send external parse: (t, array<string>) => t = "parse"
@module("path") external resolve: string => string = "resolve"

let getArguments = () => {
  open Belt
  let program = command()->parse(Node_process.argv)

  let fileArg = program.args[0]
  let fileToMove = switch fileArg {
  | Some(path) => resolve(path)
  | _ => {
      Js.log(`No file specified. Please provide a file name to move`)
      Node.Process.exit(1)
    }
  }

  {
    "fileToMove": fileToMove,
  }
}

To my surprise, the generated code is almost identical, with the only difference being that it includes a runtime dependency on Belt:

  var fileArg = Belt_Array.get(program.args, 0);
  var fileToMove = fileArg !== undefined ? Path.resolve(fileArg) : (console.log("No file specified. Please provide a file name to move"), Process.exit(1));

To be honest, I prefer my solution because it doesn’t introduce any runtime dependency and has the same safety and the same JS guarantees.
Is this a wrong approach? Why do people recommend Belt instead?

Regards

yawaramin · August 17, 2021, 12:19am

How about:

{
  "fileToMove": switch program.args {
    | [] => invalid_arg("No file specified. Please provide a file name to move")
    | args => resolve(args[0])
  }
}

danielo515 · August 17, 2021, 5:41am

That is indeed a more functional approach.
I also thought about returning an Error message in case of the missing parameter, but at first I wanted to mimic the JS API.

But your solution doesn’t address my concern, right? Namely, not using any runtime dependency just to access an array index. Or am I missing something?

  return {
          fileToMove: args.length !== 0 ? Path.resolve(Caml_array.get(args, 0)) : Pervasives.invalid_arg("No file specified. Please provide a file name to move")
        };
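
For illustration, a throwing, bounds-checked getter could look roughly like the sketch below. throwingGet and describeFirst are hypothetical names, and the error type and message are assumptions; this is not the actual source of Caml_array.get.

```javascript
// Hypothetical stand-in for a bounds-checked getter that throws.
// NOT the real Caml_array.get; error type and message are assumptions.
function throwingGet(arr, i) {
  if (i < 0 || i >= arr.length) {
    throw new RangeError("index out of bounds");
  }
  return arr[i];
}

// Hypothetical caller that reports the failure instead of crashing.
function describeFirst(args) {
  try {
    return throwingGet(args, 0);
  } catch (e) {
    return "error: " + e.message;
  }
}

console.log(describeFirst(["a.txt"])); // a.txt
console.log(describeFirst([]));        // error: index out of bounds
```

In the output above, the length check runs before the getter is called, so the throwing path would not be reached there; the helper module is still imported.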

kevanstannard · August 17, 2021, 12:19pm

IMO your approach looks good considering your goals of no runtime dependencies.

People usually recommend Belt for its convenient type safety.

Btw, there may be a typo in your example code? Js.Array2.unsafe_get() returns a string, but in your switch you’re treating it as an option. Maybe you meant something like this?

type t = {args: array<string>}
@module("commander") @new external command: unit => t = "Command"
@send external parse: (t, array<string>) => t = "parse"
@module("path") external resolve: string => string = "resolve"

let getArguments = () => {
  let program = command()->parse(Node_process.argv)
  if Js.Array2.length(program.args) > 0 {
    let path = program.args->Js.Array2.unsafe_get(0)
    {
      "fileToMove": resolve(path),
    }
  } else {
    Js.log(`No file specified. Please provide a file name to move`)
    Node.Process.exit(1)
  }
}

danielo515 · August 17, 2021, 1:37pm

Yes, there is indeed a typo which in fact defeats the entire message. Thanks for spotting it. The actual type should be like this:

type t = {args: array<option<string>>}

Which is what gives that type safety without runtime.
So the whole point is to treat any JS array as an array of options, forcing you to check for the existence of content even when using unsafe_get.
That is my actual proposal, which is virtually identical to using Belt.Array.get but without the runtime cost. That is what I was asking for opinions about.

yawaramin · August 17, 2021, 1:55pm

May I ask why ReScript’s use of its internal libraries is such a big concern? If it is performance, have you considered that, next to file I/O, the performance impact of calling a couple of functions is negligible?

danielo515 · August 17, 2021, 2:06pm

It is more the bundle size that worries me. The generated JS is not very tree-shaking friendly (or that is how it looks to me):

import * as Belt_Array from "./stdlib/belt_Array.js";

If belt_Array contains only what is required then it’s OK, but I guess it will have a lot of other utility functions that I don’t need.

Also, sure, in this particular scenario it doesn’t make much difference and I usually don’t mind, but… why introduce it if it is not needed at all? It has virtually no benefit.

johnj · August 17, 2021, 2:55pm

The Belt modules should be very tree-shakable since they’re mostly (entirely?) pure functions. AFAIK, the import * as ... syntax does not affect tree-shaking for any modern JS bundler.

If you really need to avoid any extra overhead, though, then you can use Belt.Array.getUndefined. It compiles to plain JS a[i] and returns a Js.undefined<'a> type which can be converted to an option<'a> type with no runtime.

(option and Js.undefined are very similar but have subtly different semantics that make them not 100% interchangeable. Someone who knows more about the compiler can probably explain why this is better than I can.)

ReScript:

let f = a =>
  switch Belt.Array.getUndefined(a, 0)->Js.Undefined.toOption {
  | None => Js.log("nothing")
  | Some(x) => Js.log(x)
  }

JavaScript output:

function f(a) {
  var x = a[0];
  if (x !== undefined) {
    console.log(x);
  } else {
    console.log("nothing");
  }
}

But unless you’re working in a hot loop or really need to compress your output by a few extra bytes, then the regular Belt.Array.get will almost always be better to use.
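
One plain-JavaScript case where an undefined check and a bounds check diverge can be sketched like this. It is only an illustration of the JS side, not a statement about how the compiler represents option; hasValueAt and isInBounds are hypothetical helper names.

```javascript
// An explicit undefined element and a missing slot look the same to `!== undefined`.
function hasValueAt(arr, i) {
  return arr[i] !== undefined;
}

// Hypothetical alternative: decide by bounds instead of by value.
function isInBounds(arr, i) {
  return i >= 0 && i < arr.length;
}

var withUndefined = [undefined];
console.log(hasValueAt(withUndefined, 0)); // false
console.log(isInBounds(withUndefined, 0)); // true
console.log(hasValueAt([], 0));            // false
console.log(isInBounds([], 0));            // false
```

For an array of strings coming from a CLI parser this distinction rarely matters, but it can for arrays that may legitimately contain undefined.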

yawaramin · August 17, 2021, 4:50pm

Is bundle size a huge concern on the server side? If you’re deploying to Node, strictly speaking, you might not even need to bundle.

danielo515 · August 17, 2021, 4:55pm

Node is just for this small experiment. My main target is going to be the web. But I think Belt is a reasonable dependency.
Bundling is nice even for Node because it speeds up start time, and my project is a little CLI, so bundling makes sense in my context.