How to write a post-increment (self add) more easily in ReScript?

Mng12345 · November 16, 2022, 3:37am

In JS or TS, we can easily use the post-increment operator (++) to get or set a byte of a typed array in one line.

type Decoder = {
  data: Uint8Array,
  p: number
}
const len = 1000000
const decoder = {
  data: new Uint8Array(len),
  p: 0
}
decoder.data[decoder.p++] = 0
decoder.data[decoder.p++] = 1
decoder.data[decoder.p++] = 2
if (decoder.data[decoder.p++] === 0 && decoder.data[decoder.p++] === 1 && decoder.data[decoder.p++] === 2) {
  // do something
}

Using the post-increment operator when getting or setting is very simple, but in ReScript we need to write a lot of code to do the same thing.

type decoder = {
  data: Js.TypedArray2.Uint8Array.t,
  mutable p: int
}
let len = 1000000
let decoder = {
  data: Js.TypedArray2.Uint8Array.fromLength(len),
  p: 0
}
// get 
decoder.data->Js.TypedArray2.Uint8Array.unsafe_get({
  let p = decoder.p
  decoder.p = p + 1
  p
})
// or this
let value = decoder.data->Js.TypedArray2.Uint8Array.unsafe_get(decoder.p)
decoder.p = decoder.p + 1
// if case
if (decoder.data->Js.TypedArray2.Uint8Array.unsafe_get({
  let p = decoder.p
  decoder.p = decoder.p + 1
  p
}) === 0 && decoder.data->Js.TypedArray2.Uint8Array.unsafe_get({
  let p = decoder.p
  decoder.p = decoder.p + 1
  p
}) === 1 && decoder.data->Js.TypedArray2.Uint8Array.unsafe_get({
  let p = decoder.p
  decoder.p = decoder.p + 1
  p
}) === 2) {
  // do something
}

This is too terrible!

hoichi · November 16, 2022, 11:26am

This example looks like very low-level, almost assembly-like pointer magic, not like operations on arrays suitable for higher-level languages. What exactly are you trying to do?

For instance, do the first 6 bytes you traverse have any semantic meaning? You can go let [foo, bar, ...] = decoder.data; and work from that.
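A minimal TypeScript sketch of that idea: pull the leading bytes out in one step and compare them as a group, instead of advancing a cursor byte by byte. The helper name startsWith and the sample bytes are made up for illustration and do not come from the original post.

```typescript
// Hypothetical helper: checks whether `data` begins with the bytes in `expected`.
function startsWith(data: Uint8Array, expected: number[]): boolean {
  if (data.length < expected.length) return false;
  return expected.every((byte, i) => data[i] === byte);
}

// Sample buffer; the values are arbitrary.
const data = new Uint8Array([0, 1, 2, 9, 9]);

// Destructure the first three bytes at once (subarray is a view, not a copy).
const [first, second, third] = Array.from(data.subarray(0, 3));

const headerOk = startsWith(data, [0, 1, 2]);
```

Whether this fits depends on the format being parsed; it only helps when the leading bytes have a fixed layout.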

Mng12345 · November 16, 2022, 12:16pm

I’m writing a GIF encoder and decoder in ReScript.

glennsl · November 16, 2022, 12:53pm

The usual way to abstract away repeated code is to create a function for it:

let incr = decoder => {
  let p = decoder.p
  decoder.p = p + 1
  p
}

Now you can just do decoder->incr in place of decoder.p++. That’s not too bad, I think.

The function can also be defined in raw JavaScript to use the ++ operator directly, which may result in the JS engine being able to optimize it better:

let incr = %raw("function (decoder) { return decoder.p++ }")
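For comparison, here is the same helper idea as a minimal TypeScript sketch. The names readByte and writeByte are hypothetical and not from this thread; the Decoder type mirrors the one in the first post.

```typescript
type Decoder = { data: Uint8Array; p: number };

// Hypothetical helper: returns the byte at the cursor, then advances the cursor.
function readByte(decoder: Decoder): number {
  return decoder.data[decoder.p++];
}

// Hypothetical helper: stores a byte at the cursor, then advances the cursor.
function writeByte(decoder: Decoder, value: number): void {
  decoder.data[decoder.p++] = value;
}

const decoder: Decoder = { data: new Uint8Array(8), p: 0 };
writeByte(decoder, 0);
writeByte(decoder, 1);
writeByte(decoder, 2);

decoder.p = 0; // rewind before reading the bytes back

// && short-circuits left to right, so the three reads happen in order.
const matches =
  readByte(decoder) === 0 && readByte(decoder) === 1 && readByte(decoder) === 2;
```

The call sites stay about as short as the raw decoder.p++ version, and the cursor update lives in one place.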

Mng12345 · November 16, 2022, 12:59pm

This would cause a performance issue if the JS JIT compiler doesn’t inline the function incr. With the @inline decorator in ReScript, incr does get inlined, but the inlined code has a bug; see the issue here

Ryan · November 16, 2022, 4:32pm

I looked at your bug report and the playground you linked there, but I don’t think it is a bug with the inlining.

(Note too, that if you inline the readNextByte function into readNextTwoByte by hand, you get the same generated JS as letting the compiler do the inlining.)

If ReScript is like OCaml in the way it evaluates functions, then the order in which the function expression and its arguments are evaluated is unspecified. (See here.)

(Note: I’m assuming that ReScript works the same way as OCaml here. I couldn’t find info about it in the ReScript docs, and when that is the case, I refer back to how OCaml works.)

If the evaluation order of your function arguments is unspecified, you should not rely on the side effects that may occur in the evaluation of those arguments being ordered in any certain way. And your readNextByte function is a side-effecting function, so calling it more than once among the arguments of a single expression will lead to unspecified behavior.

Rather you should do explicit sequencing yourself. E.g.,

@inline
let readNextTwoByte = decoder => {
  let x = readNextByte(decoder)
  let y = readNextByte(decoder)
  lor(x, lsl(y, 8))
}

which gives

function readNextTwoByte(decoder) {
  var p = decoder.p;
  var x = decoder.data[decoder.p = p + 1 | 0, p];
  var p$1 = decoder.p;
  var y = decoder.data[decoder.p = p$1 + 1 | 0, p$1];
  return x | (y << 8);
}
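To see why the order of the two reads matters, here is a minimal TypeScript sketch. JavaScript itself evaluates operands left to right, so the second variant swaps the reads by hand to imitate what the opposite evaluation order would produce. readNextByte and readNextTwoByte follow the names in this thread; readTwoBytesReversed and the sample bytes are made up for illustration.

```typescript
type Decoder = { data: Uint8Array; p: number };

// Side-effecting read: returns the current byte and advances the cursor.
function readNextByte(decoder: Decoder): number {
  return decoder.data[decoder.p++];
}

// Explicit sequencing: the low byte is read first, then the high byte.
function readNextTwoByte(decoder: Decoder): number {
  const x = readNextByte(decoder);
  const y = readNextByte(decoder);
  return x | (y << 8);
}

// Hypothetical variant: same final expression, but the reads run in the opposite order.
function readTwoBytesReversed(decoder: Decoder): number {
  const y = readNextByte(decoder);
  const x = readNextByte(decoder);
  return x | (y << 8);
}

const bytes = new Uint8Array([0x34, 0x12]);
const inOrder = readNextTwoByte({ data: bytes, p: 0 }); // 0x1234
const reversed = readTwoBytesReversed({ data: bytes, p: 0 }); // 0x3412
```

Both functions consume the same two bytes, but the results differ, which is why the sequencing has to be spelled out with let bindings rather than left to argument evaluation order.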

Here is a playground link that has all this hand-inlined and sequenced stuff.

Edit: Found some issues for you to look at

Mng12345 · November 17, 2022, 12:05am

@Ryan Thanks for the reply. I think inlining an impure function is a trap.