Is using a functor for a fetch binding overkill or just right?

+1. Js.Null.toOption is something I am relying on as well for transforming JSON data into ReScript records. Is this an inferior approach?
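For concreteness, a minimal sketch of what I mean (the shapes below are invented for illustration): the raw type mirrors the JSON with its nullable field, and Js.Null.toOption turns it into a regular option.

// Hypothetical shapes; the binding simply asserts the JSON already looks
// like rawUser, and Js.Null.toOption converts the nullable field.
type rawUser = {id: int, nickname: Js.Null.t<string>}
type user = {id: int, nickname: option<string>}

let toUser = (raw: rawUser): user => {
  id: raw.id,
  nickname: raw.nickname->Js.Null.toOption,
}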

I can explain a little and hopefully not do too much damage. JSON over the network is just a string, and when that string is parsed into an object (of your preferred language), it's an extra pain when that language is typed.

When you parse JSON into an object in JS using JSON.parse, since JS is dynamically typed, it happily makes an object without any typed structure (just a bag of nulls, strings, and numbers). You deal with the typing later, when your app blows up.

But something like ReScript, TypeScript, or Go wants a type right away, dammit. So your option is to create a type and bind it to the data coming over the network.
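For example, a minimal (and very hedged) sketch of that "create a type and bind the data" idea; the person type is made up, and nothing checks it at runtime:

// Bind JSON.parse so it "returns" whatever record type we declare.
type person = {a: int, b: string}
@scope("JSON") @val external parseAs: string => person = "parse"

let p = parseAs(`{"a":1,"b":"2"}`)
Js.log(p.b) // prints "2" only because the string really had that shape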

Looks like parseExn isn't very impressive; it's just JSON.parse:

file.res

let j = `{"a":1,"b":"2"}`
let p = Js.Json.parseExn(j)
Js.log(p)

file.mjs

// Generated by ReScript, PLEASE EDIT WITH CARE
var j = "{\"a\":1,\"b\":\"2\"}";
var p = JSON.parse(j);
console.log(p);

I'm a little baffled by Js.Json.deserializeUnsafe; it seems to produce the same result but is undocumented. Looks like it has an extra step in there for validation or something.

file.res

let j = `{"a":1,"b":"2"}`
// let p = Js.Json.parseExn(j)
let p = Js.Json.deserializeUnsafe(j)
Js.log(p)

file.mjs

// Generated by ReScript, PLEASE EDIT WITH CARE
import * as Js_json from "rescript/lib/es6/js_json.js";
var j = "{\"a\":1,\"b\":\"2\"}";
var p = Js_json.deserializeUnsafe(j);
console.log(p);

And Js_json.deserializeUnsafe(j) is just

function deserializeUnsafe(s) {
  return patch(JSON.parse(s));
}
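If you want to see what that extra patch step actually does, a quick experiment (input made up) is to feed both functions a value that contains null and compare what gets logged:

// Compare how the two functions treat a null field.
let j = `{"a":null}`
Js.log(Js.Json.parseExn(j))
Js.log(Js.Json.deserializeUnsafe(j))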

Looks like JSON decoders like decco/spice will take a ReScript object and its type and encode it to JSON (a string), which can then be decoded back into a ReScript type with all the ReScript goodness preserved, like variants, etc. Not sure how that works; does it use reflection? I have no idea if they do validation too.

But what I'm doing is just pretending the underlying data is already the correct type, as an added bonus of using a binding (since bindings are very gullible about returned types). It's really just a total fantasy, but because I have total control over the data I don't see why not. The reason for the post was that I was just surprised ReScript figured out what type I wanted without me having to tell it, just from using the binding itself, but I guess ReScript is gonna ReScript.
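To make that concrete, here is a hedged sketch of what such a gullible binding looks like (the todo type and URL are invented, and it assumes a recent ReScript with async/await; nothing validates the response):

// The external claims response.json() already yields a todo, so no decoder runs.
type todo = {id: int, title: string}
type response
@val external fetch: string => promise<response> = "fetch"
@send external json: response => promise<todo> = "json"

let load = async () => {
  let res = await fetch("https://example.com/todos/1")
  let t = await res->json
  Js.log(t.title) // typed as string purely on the binding's say-so
}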

Ok I’ll stop before I make you as clueless as me on this subject.


In my opinion, there are very few situations where using something like decco or spice is not the best option. I am on my mobile phone, so I can't give a very good answer, but I have a public REPL with decco and spice set up. I'll share it with you with some examples of code and what you get on the JS side.


@kswope In a more formalized way, here is my workflow for parsing JSON of a certain type w with nullable fields. I have been using this for the past 6 months with no hiccups.

Critiques of this technique are welcome, since I would like to understand why it should not be used compared to decco/spice. @danielo515 I'm waiting for your REPL with the decco/spice setup for comparison.

I highly appreciate the Js.Nullable module, which helps me work with old REST APIs and convert the HTTP responses to ReScript records without relying on additional NPM packages.

To pile onto the case for decoding: every other fully typed language, including C#, Java, and Go, requires a decoding step (or "unmarshalling" in Go) to safely convert a JSON string into the runtime's respective data types.

We have the luxury of not having to do this for the JavaScript runtime, a blessing in my opinion. It's totally fine that you pretend the underlying data is correct, though, since you do own the data. It's the same question as whether someone decides to decode the data coming from a database they control.

Matter of time vs safety


The biggest problem with this setup compared to using Js.Json or decoding libraries is that it's missing validation of the incoming data.
I have personally saved hours of debugging by not trusting any external data coming into my application and failing fast when I get something I don't expect. The same goes for data I own, because there might be a typo, a breaking release, or just a bug. That way there's no way of ending up in an invalid application state like "true" instead of true in a bool field.
To prevent this, you either need a shared schema with the backend and codegen of types, or you pass all the data through decoders.
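For reference, a hand-rolled fail-fast decoder can look like this (a hedged sketch with an invented user type, using only Js.Json and Belt helpers):

// Classify the parsed JSON and fail fast on anything unexpected.
type user = {id: float, name: string}

let decodeUser = (json: Js.Json.t): result<user, string> =>
  switch Js.Json.classify(json) {
  | Js.Json.JSONObject(dict) =>
    switch (
      Js.Dict.get(dict, "id")->Belt.Option.flatMap(Js.Json.decodeNumber),
      Js.Dict.get(dict, "name")->Belt.Option.flatMap(Js.Json.decodeString),
    ) {
    | (Some(id), Some(name)) => Ok({id, name})
    | _ => Error("expected {id: number, name: string}")
    }
  | _ => Error("not an object")
  }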


Being the creator of rescript-struct, I'm a little biased about decco/spice and prefer to have decoding and data mapping in one go, not to mention the other benefits. But the convenience of creating a decoder is definitely top-notch. Just by writing:

@decco.decode
type data = {foo: string, num: float, bool: bool}

You can decode any Js.Json.t to the data type with:

json->data_decode

And it'll generate the following JS code:

function data_decode(v) {
  var dict = Js_json.classify(v);
  if (typeof dict === "number") {
    return Decco.error(undefined, "Not an object", v);
  }
  if (dict.TAG !== /* JSONObject */2) {
    return Decco.error(undefined, "Not an object", v);
  }
  var dict$1 = dict._0;
  var foo = Decco.stringFromJson(Belt_Option.getWithDefault(Js_dict.get(dict$1, "foo"), null));
  if (foo.TAG === /* Ok */0) {
    var num = Decco.floatFromJson(Belt_Option.getWithDefault(Js_dict.get(dict$1, "num"), null));
    if (num.TAG === /* Ok */0) {
      var bool = Decco.boolFromJson(Belt_Option.getWithDefault(Js_dict.get(dict$1, "bool"), null));
      if (bool.TAG === /* Ok */0) {
        return {
                TAG: /* Ok */0,
                _0: {
                  foo: foo._0,
                  num: num._0,
                  bool: bool._0
                }
              };
      }
      var e = bool._0;
      return {
              TAG: /* Error */1,
              _0: {
                path: ".bool" + e.path,
                message: e.message,
                value: e.value
              }
            };
    }
    var e$1 = num._0;
    return {
            TAG: /* Error */1,
            _0: {
              path: ".num" + e$1.path,
              message: e$1.message,
              value: e$1.value
            }
          };
  }
  var e$2 = foo._0;
  return {
          TAG: /* Error */1,
          _0: {
            path: ".foo" + e$2.path,
            message: e$2.message,
            value: e$2.value
          }
        };
}
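In practice you'd consume the generated decoder like this (a hedged sketch; it assumes data_decode returns the usual decco result, carrying path/message/value on failure):

let json = Js.Json.parseExn(`{"foo":"bar","num":1,"bool":true}`)
switch json->data_decode {
| Ok(data) => Js.log(data.foo)
| Error(e) => Js.log(e) // path, message, and the offending value
}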

Do you do this for database schemas too, as @dangdennis pointed out? Somebody might change a field name! On the other hand, this could be guarded against by simple tests, which should be there anyway.

Have you considered stripping nulls from the remote source and using optional record fields to handle options automatically? (A ReScript-side sketch follows the Go example below.) Disclaimer: I haven't tried it yet in a real stack.

For example, up at the server, in JS all that is needed is the optional "replacer" parameter:

let str = JSON.stringify(data, (k, v) => v ?? undefined)

and in Go you just need to tag the struct field with:

type ColorGroup struct {
    ID     int `json:",omitempty"`
    Name   string
    Colors []string
}
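And on the ReScript side, a hedged sketch of the optional-field idea (assumes a recent ReScript with optional record fields; the type and the %identity cast are just for illustration):

type colorGroup = {
  id: int,
  name: string,
  colors?: array<string>, // key simply absent when stripped at the server
}

// pretend the parsed JSON already has this shape, as discussed above
external asColorGroup: Js.Json.t => colorGroup = "%identity"

let g = Js.Json.parseExn(`{"id":1,"name":"Reds"}`)->asColorGroup
Js.log(g.colors) // undefined here, i.e. None on the ReScript side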

I wrote a whole thread about looking for a type-safe database lib in ReScript. Without codegen or schema inference, decoding from the database is necessary for type safety. This incurs a runtime cost. I'd love to see something like zapatos (TypeScript) for ReScript. Or sqlc for ReScript.

The typical "low-level", non-codegen approach is that each row in the queried results gets decoded to its respective data model (usually an object). It's usually position-based instead of keyed off the actual field name. Libraries like node-postgres and others usually do this work of mapping values back to their field names.

Ultimately the codegen approach is best, imo. C# and F# have LINQ. Great library.

Okay done talking to myself :joy:


I’m glad you posted an example, because now we have some common ground to work with.
There is nothing wrong with your way of doing things. At the very beginning of my ReScript journey I was in much the same boat and wanted to do everything myself. However, the JSON parsing was always frustrating because of the amount of work it required for even the smallest piece of data, I was still getting runtime errors and unexpected outcomes, and I still didn't want to use any ppx or annotation. This made my development of any app that involved JSON (almost all of them) slow, and eventually I just abandoned ReScript for a couple of years.

Your example parses just a very small piece of data, and it is still a lot of code, not only to write but also to maintain! Every time you add a little property you need to write new parsers and update the existing ones. Add something as simple as a nested record and you will have to expand your (already long) example from 30 lines of code to probably 50 or 60.
I'm pretty sure it took you a decent amount of time to write that code (which is nothing bad if you enjoy it, really, but it is a lot of work). Compare it with the 30 seconds it took me to write a spice parser:

@spice
type t = {
  id: int,
  name: string,
  description: option<string>
}

That's it! That is all I need to do to get not just the same level of parsing safety as the one you wrote manually, but even better.

I actually ran your example adding these lines:

let w = `{"id":99, "name": "bro", "description": 55 }`->s2w->w2t;
Js.log(w);

Guess what I got on the console?

{ id: 99, name: 'bro', description: 55 }

Yes, a correctly parsed invalid value.

Compare it with the error I get using the parser spice wrote for me:

let w = `{"id":99, "name": "bro", "description": 55 }`->t_decode;
Js.log(w);

Which correctly fails with:

{
  TAG: 1,
  _0: { path: '.description', message: 'Not a string', value: 55 },
  [Symbol(name)]: 'Error'
}

Path, meaningful error message, and even the original value that is incorrect. All of this is valuable information when you are trying to understand why X failed.
Again, this is just a very small example and things already went wrong. Imagine dozens of fields and nested records. Things can get wild quite fast.
Having this speed at writing JSON decoders and encoders, almost as fast as just using plain JS but with the safety and explicit errors of ReScript, is a godsend.
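The encoder direction is the same amount of work. A hedged sketch, assuming the @spice type t above generates t_encode as usual:

let json = {id: 1, name: "bro", description: None}->t_encode
Js.log(Js.Json.stringify(json))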
This is the REPL if you want to play with it (reason syntax): Reason Node.js - Replit

I don't know about you, but I am the one I trust the least. I have already been bitten so many hundreds of times by a new field on a record, or by me changing the type of a (deeply) nested field, that I don't even want to try to remember them all. It happened to me a lot when I was using Firebase, a NoSQL database where, guess what? The data is schema-less, unstructured, as wild as it can be.
Nah, I don't trust the database, I don't trust the wire, and above all those things, I don't trust myself :grinning_face_with_smiling_eyes:

Tests? Yes, they are valuable. But no test will protect you against runtime errors because the API you are contacting changed overnight, or the value parsed from local storage was altered, or you thought you knew your data well enough and manually input the wrong field (guess who suffered this several times :slightly_smiling_face:).

On top of all the things I wrote: if using decco/spice were significantly more work, we could have a discussion, but the thing is that it is even less effort, less code to write and maintain, for a better outcome. For me it is just a no-brainer.

:point_up: I have been there hundreds of times @DZakh. On top of that, the very well written article "Parse, don't validate" has changed the way I see data (and my life as a programmer).


Sorry, but there's your problem. Not trusting data from a NoSQL database is not even remotely the same thing as not trusting data from a SQL database.

In what way? Off the top of my head, a SQL db has a schema, but what are the guarantees that the schema always matches the one your code expects? (Outside of the codegen scenarios.)


Isn't one of the "advantages" of NoSQL that it has a very flexible schema? So flexible, I think, that records don't even have to share the same structure. A properly designed SQL database schema is the opposite of flexible, and sure, you can change it, but all the data has to conform to that new schema.

I'm not trying to convince anyone, nor waste anyone's time in a discussion about how to handle your data.
If you think you are safe that way, sure, go for it. You will still have to send the data over the wire, so if you also want to trust the wire, again, go for it.


Yes, but that schema can (and likely will) be changed without updating the code, especially if the schema is owned by a different team. So now you have to think about syncing schema migrations with code updates, about deployment order, and about failing gracefully when your client version doesn't match the API version. And to fail gracefully, it's a good idea to fail early, as DZakh mentioned.


If one team is changing the db schema without the other team knowing about it, and, worse, your deployment pipeline isn't catching it, you've got bigger problems than validating data at the client.

I don't know which companies you have worked in, but in my experience this kind of communication problem is more common than not, and I would not call it "a bigger problem than using a validation library", especially because that sentence suggests very bad things IMO.

Please, put the pitchforks and torches back into the barn - where they belong… :wink:

A case can be made for both sides, I agree. But I believe there is no single right decision (using some serialization lib or not); it all depends on the requirements, the actual reality of the team(s) working on it, the tradeoffs the team is willing to make, and finally simply preferences.

In the end (in both approaches), you can't show/process the data if you receive something different than expected. The difference is in how you do error handling, and in performance.

I found the thread "ReScript decoding libraries benchmark" interesting with regard to the performance aspect.
We (a colleague of mine at work and I) are currently researching the codegen approach (slowly, as a side project) because I want to get rid of all ppxes in our code base in the long run. I believe this should hit quite a sweet spot regarding dev experience and performance. (Currently we rely heavily on decco, but if we had fewer and simpler data structures I'd be tempted to write my own simplest possible decoder.)
