Better way to parse this variant-y JSON?

Given JSON strings like these:

{"hello": ["B", "world"]}
{"hello": ["A"]}

And this type to represent in ReScript the info from the JSON:

type thing = A | B(string)

Here’s some bs-json code to parse the JSON into thing values. I do expect this kind of code to be more involved in a strongly typed language than it would be in a language like JavaScript, but can anybody suggest any improvements?

type t = {hello: thing}

let decodeA = json => {
  if json == Js.Json.stringArray(["A"]) {
    A
  } else {
    raise(Json.Decode.DecodeError("Not an A"))
  }
}

let decodeB = {
  open Json.Decode
  pair(string, string) |> map(((code, value)) =>
    if code == "B" {
      B(value)
    } else {
      raise(DecodeError("Not a B"))
    }
  )
}

let decodeThing = {
  open Json.Decode
  either(decodeA, decodeB)
}

let decodeHellow = json => {
  open Json.Decode
  {
    hello: json |> field("hello", decodeThing),
  }
}

let parseHellow = json => json |> Json.parseOrRaise |> decodeHellow

let x = parseHellow("{\"hello\": [\"B\", \"world\"]}")
Js.log2("parsed value", switch(x.hello) {| A => "A" | B(value) => "B " ++ value})

The input JSON here comes from OCaml values of essentially the same type (so there’s not much chance of them being invalid):

type thing = A | B of string [@@deriving yojson]
type t = { hello : thing } [@@deriving yojson]

let hello = { hello = B "world" }
let jsony = yojson_of_t hello
let json = Yojson.Safe.to_string jsony

By the way, I did notice roddyyaga/bs-yojson on GitHub (“Low-level JSON parsing and pretty-printing library for OCaml”), but even if that still works with ReScript now, I’m not sure it’s going to keep working.

I would simplify like this:

let decodeThing = json => {
  open Json.Decode

  switch json |> array(string) {
  | ["A"] => A
  | ["B", str] => B(str)
  | _
  | exception DecodeError(_) => raise(DecodeError("Expected A | B(string)"))
  }
}

Also, it might be worth mentioning that you have this quick & dirty option as well:

type thing = A | B(string)

let toJsonString = (t: thing) => {
  Js.Json.serializeExn(t)
}

let fromJsonString = (json): thing => {
  Js.Json.deserializeUnsafe(json)
}

Source: Parsing JSON into variant based on the data shape - #3 by ryyppy

Interesting, these functions are actually not documented in the API docs: Js.Json | ReScript API (unless I went totally blind :sweat_smile: )

But looking at the source code (js_json.mli in rescript-lang/rescript-compiler on GitHub, at commit fe15e6efc27004da114710884a92052861b9d761), the doc comment says:

It is unsafe in two aspects

  • It may throw during parsing
  • when you cast it to a specific type, it may have a type mismatch

Looking at the source code, deserializeUnsafe uses a function called patch (js_json.ml in rescript-lang/rescript-compiler, at commit fe15e6efc27004da114710884a92052861b9d761), which looks super complicated. I’m not sure what it’s doing, so let’s try it out on the JSON data we know we’ll get:

$ node
> const patch = function (json) {...
> patch({'hello': ['A']})
{ hello: [ 'A' ] }
> patch({'hello': ['B', 'world']})
{ hello: [ 'B', 'world' ] }

This actually seems to be a no-op on this data. I think it makes a best-effort attempt to ‘patch’ the JSON data so that it ‘looks’ like ReScript’s expected representation. But ReScript’s representation is not compatible with the ppx_deriving_yojson representation. Here’s ReScript:

// Input ReScript:

let test1 = {hello: A}
let test2 = {hello: B("world")}

// Output JS:

var test1 = {
  hello: /* A */0
};

var test2 = {
  hello: /* B */{
    _0: "world"
  }
};

Definitely this mismatch would cause runtime type errors. I think it’s safer to stick with bs-json.
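
To make the mismatch concrete, here is a minimal JavaScript sketch that converts the array shape produced on the OCaml side into the object shape shown in the compiled output above. The function name yojsonThingToRuntime is hypothetical. The target shape (0 for A, { _0: value } for B) is copied from that output and is a compiler implementation detail, so treat this as an illustration of the gap rather than a recommended decoder.

```javascript
// Hypothetical helper: maps the ["A"] / ["B", value] arrays from the OCaml side
// onto the runtime shapes seen in the compiled output above.
function yojsonThingToRuntime(json) {
  if (!Array.isArray(json)) {
    throw new TypeError("Expected an array");
  }
  if (json.length === 1 && json[0] === "A") {
    return 0; // A compiled to the number 0 in the output above
  }
  if (json.length === 2 && json[0] === "B" && typeof json[1] === "string") {
    return { _0: json[1] }; // B("world") compiled to { _0: "world" }
  }
  throw new TypeError('Expected ["A"] or ["B", string]');
}

const wire = JSON.parse('{"hello": ["B", "world"]}');
// Without conversion, wire.hello is a plain array, not a { _0: ... } object.
console.log(Array.isArray(wire.hello));
console.log(yojsonThingToRuntime(wire.hello));
```

A cast such as deserializeUnsafe skips this step entirely, which is why a switch on the resulting value would see an array where it expects 0 or an object with a _0 field.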


It looks like patch and serializeExn are both mainly (only?) concerned with converting ReScript None values into a JSON representation ({RE_PRIVATE_NONE : true}), since undefined isn’t allowed in JSON.
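
The underlying problem is easy to see in plain JavaScript: JSON.stringify drops object properties whose value is undefined and turns undefined array elements into null, so such a value cannot round-trip without some placeholder. The sketch below only demonstrates that behaviour, with made-up field names. It does not reproduce what patch or serializeExn actually do.

```javascript
// Made-up record with one field set to undefined.
const record = { name: "x", nickname: undefined };

const text = JSON.stringify(record);       // the nickname property is dropped
const roundTripped = JSON.parse(text);

console.log(text);
console.log("nickname" in roundTripped);   // false
console.log(JSON.stringify([undefined]));  // undefined array elements become null
```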


That’s the correct answer. It’s a really ad hoc API.

That is vastly better and now seems absurdly obvious, thanks.

I found I had to do this:

let decodeThing2 = json => {
  open Json.Decode

  switch json |> array(string) {
  | ["A"] => A
  | ["B", str] => B(str)
  | _ => raise(DecodeError("Expected A | B(string)"))
  | exception DecodeError(_) => raise(DecodeError("Expected A | B(string)"))
  }
}

or else I get:

Exception patterns must be at the top level of a match case.

Obviously I could pull out the raise into a function but I wonder if I’m missing something about the syntax and I could use fall-through somehow as your suggested code does? I don’t know what “top level” refers to in the error: if I try fall-through with the exception listed first, I get the same error:

  | exception DecodeError(_)
  | _ => raise(DecodeError("Expected A | B(string)"))

Hmm, that’s weird, I’m using exactly that pattern in a newer version of OCaml. Perhaps ReScript’s version doesn’t support it yet. Actually, come to think of it, since we’re just rethrowing the exception here, it might not even be worth catching it at all.
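
The control flow of that last suggestion can be sketched in plain JavaScript (not ReScript or bs-json). DecodeError and stringArray below are hypothetical stand-ins for bs-json’s DecodeError and array(string), and the { kind: ... } result objects are only for illustration. The point is that the inner decoder’s exception propagates unchanged, so only the shape mismatch needs its own raise.

```javascript
// Hypothetical stand-ins for bs-json's DecodeError and array(string).
class DecodeError extends Error {}

function stringArray(json) {
  if (!Array.isArray(json) || !json.every((item) => typeof item === "string")) {
    throw new DecodeError("Expected array of strings");
  }
  return json;
}

function decodeThing(json) {
  // No try/catch: a DecodeError thrown by stringArray propagates as-is.
  const parts = stringArray(json);
  if (parts.length === 1 && parts[0] === "A") {
    return { kind: "A" };
  }
  if (parts.length === 2 && parts[0] === "B") {
    return { kind: "B", value: parts[1] };
  }
  // Only a well-typed array of the wrong shape needs its own error.
  throw new DecodeError("Expected A | B(string)");
}

console.log(decodeThing(["B", "world"]));
```

The trade-off is that input which is not an array of strings reports the inner decoder’s message rather than the “Expected A | B(string)” one.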