[Beginner-Help] Typing & Confusing Errors

I’ve been mostly working on scala for the past few years and completely lost all my RS chops here (rewrote most of my orgs FE with reason-react) and there and I’m trying (slowly) pick it back up, and I was like “OH THIS FEELS SO FAMILIAR” as I was writing it until the compiler told me what was in my brain was wrong.

@module("shelljs")
external cd: (string) => string = "cd"
external exec: (string) => string = "exec"

let cloneInTemp = (temp: string): string => {
  cd(temp)
}

I get no error.

let cloneInTemp = (temp: string): string => { 
  cd(temp)
  exec("git clone git@github.com:myorg/myrepo.git")
}

Now throws an error

 9 │ 
10 │ let cloneInTemp = (temp: string): string => { 
11 │   **cd(temp)**
12 │   exec("git clone git@github.com:myorg/myrepo.git")
13 │ }

This has type: string
Somewhere wanted: unit

Playground Link here.

“Somewhere wanted: unit” just doesn’t make much sense to me because cd: string => string and exec: string => string, and cloneInTemp: string => string.

Any help would be useful, it just seems whenever I have 2 statements after each other, they seem to not know how to handle it.

(also as an aside: I know that error handling has been something that’s changed since I last picked up RS, is there a corollary to “finally”?)

1 Like

The error is trying to say that cd(temp) is an intermediate statement expression in the body of the function cloneInTemp and so must return unit.

If you don’t want to use the value returned by cd, then you have to ignore it with

cd(temp)->ignore

or

let _ = cd(temp)

2 Likes

Thanks so much! Is this in the documentation anywhere? I haven’t been able to point to it anywhere after learning about this? Is there any other ways around it in regards to the external to get past that? Internal functions, I’m assuming still don’t have that constraint?

Going like this, makes it compile, but lose the typing that cd would return (a string vs undefined)

external cd: (string) => unit = "cd"

The sense that it has to be assigned just feels awkward?

In my experience, ML-like languages (at least) use the last expression in a function body as the return value, while the returned value of the intermediate expressions are to be used or ignored.

I thought Scala would throw a compile-time error too, but apparently not: Scastie - An interactive playground for Scala.

Internal functions, I’m assuming still don’t have that constraint?

Anyway, this is a general rule applied to all functions, not just external. I wouldn’t label them ‘internal’ though, just because the FFIs are external. :sweat_smile:

The sense that it has to be assigned just feels awkward?

Its more that it has to be used than assigned.

May be someone else can explain it with better/correct terms.

3 Likes

It is quite idiomatic to pretend to the type system that cd returns unit.

If you still need it to return string sometimes, you can also keep the other binding for that case and just name it differently, e.g.

external cdReturns: string => string = "cd"

I suck at naming, though.

1 Like

Is there any other ways around it in regards to the external to get past that?

Basically, what you discovered (typing return value as unit) and what @fham suggested (multiple bindings).

Though I personally would stick to the correct return type in the binding, and ignore the returned value if I am not using it. So, if cd does return string, then

// binding
external cd: string => string = "cd"

// usage
let _ = cd(dir)

Most languages I use (ruby, js, scala, erlang/elixir, lisp), all tend to return the last expression, but don’t expect that side effects consume the values of the intermediate expressions. even in ocaml, you’re looking at in to chain the local scope bindings here… It’s just that it’s a silly trip-up here that I wasn’t expecting because I didn’t see any notation in the documentation, specifically because I was out here rewriting typescript into rescript.

It’s a nice guarantee that we’re not using the values, but also it’s ugly as hell imo :wink:.

It is exactly the same as in OCaml.

If you literally translate your example, it would look like this:

let cloneInTemp (temp : string) =
  let _ = cd temp in
  exec ("git clone git@github.com:myorg/myrepo.git")

OCaml playground link

EDIT: Sorry, the OCaml playground link does not work for some reason, it leads to the vanilla playground :man_shrugging:

2 Likes

OCaml would force you to do the same thing. So, in

(* OCaml code *)
let cloneInTemp temp = 
  let x = cd temp in
  exec "git clone git@github.com:myorg/myrepo.git"

you are still binding the return value. The alternatives

(* OCaml code *)
let cloneInTemp temp = 
  let _ = cd temp in
  exec "git clone git@github.com:myorg/myrepo.git"
(* OCaml code *)
let cloneInTemp temp = 
  (* cd temp; *) (* The commented expression is incorrect: https://forum.rescript-lang.org/t/beginner-help-typing-confusing-errors/3962/10 *)
  cd temp |> ignore;
  exec "git clone git@github.com:myorg/myrepo.git"

still make you choose to ignore the return value.

It’s just that it’s a silly trip-up here that I wasn’t expecting because I didn’t see any notation in the documentation, specifically because I was out here rewriting typescript into rescript.

Agreed. The error looks cryptic without the pre-requisite knowledge.

It’s a nice guarantee that we’re not using the values, but also it’s ugly as hell imo

While ugly is subjective, all I can say is that you will get used to it. If you see it as a con, then it will look like a small trade-off against all the advantages of ReScript.

1 Like

Just to note, cd temp; (with the semicolon) won’t work gives a warning in OCaml. ; is an expression separator, but your cd is returning string not unit. So it is trying to put a non-unit type expression on the left-hand side of a sequence, which won’t work.

You would still need to ignore it if you want to use the semicolon: cd temp |> ignore; or ignore @@ cd temp; (or like you do with let _ = blah in ...).

Edit: see below, but it gives a compiler warning, which can be treating as an error in certain situations.

2 Likes

True. My bad. I forgot the semantics.

2 Likes

Actually, you were correct, you can use the semicolon with a non-unit returning function like you show. It isn’t an error, rather a warning (warning 10), but dune, the build system, promotes warnings to errors in certain profiles.

1 Like

Here’s where you are going wrong I think. cd(temp) and exec(...) are expressions, not statements.

Let me just write it like this so it’s clear:

let someValue = {
  expression1
  expression2
  expression3
}

In that let-binding, there is a scope delimited with {}. In that scope there is a sequence of three expressions, expression1, expression2, and expression3. And in a sequence of expressions, they are evaluated in order and the value of the final expression is returned (docs, in OCaml, but there if you need it). Note that the expression sequencing is implicit, meaning that you don’t use an expression separator to separate them as in OCaml, so you need the brackets to define the scope of the expression as a whole.

In ReScript there are certain rules you have to follow when sequencing multiple expressions like that. Any expression on the left hand side of a sequence must evaluate to unit. That means in the above example, expression1, and expression2 must have a value of unit, whereas expression3 can be any value. Note that the value of the expression sequence is the last expression in the sequence (thus that mention in the docs about the value of the last expression is implicitly returned in a block/scope).


I actually don’t think it’s a silly trip up on your part…unless you’re coming from OCaml, I don’t think it’s obvious those rules about sequencing. And unless you dig through ReScript compiler github issues, even if you know how expression sequencing works in OCaml, you may not realize that ReScript is more strict with what it allows.

I also couldn’t find this in the docs anywhere, but here are some quotes/links from ReScript compiler github issues:

In general the typing rules are more strict expr1; expr2, the expr1 is expected to have type unit

(link)

in recent releases we already do a strict check over sequences of expressions: e0 ; e1, it will assume e0 to have type unit

(link)

In OCaml, if the value of an expression on the left-hand side of a sequence is not unit, it will be a warning (warning 10), and as far as I can tell, this was the case in older versions of Bucklescript too (see here). However, now in ReScript, it is always an error (link).


So back to your example.

You have a let binding for a functional value:

In that let-binding there is a sequence of two expressions, cd(temp) and exec(...). We now from the ReScript rules shown above that left-hand side expressions in a sequence must return unit. However, the cd function returns string. Thus the error: Somewhere wanted: unit.

1 Like

Just to add that the rationale for this is that if the return value is not unit you’re throwing away a value. Throwing away the return value of pure (side-effect-free) functions is always an error, because why would you evaluate a pure function if you’re not going to use its result? And almost all of your code should be pure, because throwing side-effects around left and right is bad juju. It makes your code hard to reason about, and therefore prone to bugs.

Also for that reason, functions that (primarily) perform side-effects should return unit so that you can easily distinguish them from pure functions, and get the benefit of this warning if you do make this kind of mistake, which we all do now and then.

This doesn’t make it any easier to use JavsScript functions that have been created in a very different environment, but it does make perfect sense internally in the ReScript environment.

3 Likes

I agree, that the feature is nice, but would be nice to have a more guiding error message.

glenn how do you mean effectful functions should return unit… How about network requests, async, etc?

Sorry, I could have phrased that better. What I mean is just that functions whose main output is a side-effect should avoid also returning something that can usually be ignored, for convenience typically. That way, throwing away the output isn’t just a chore, but something meaningful that should be communicated in the code. For example, if it’s a function that can fail, and returns a result to indicate that, explicitly throwing away the result communicates that this call is thought to never fail.

For functions whose main purpose is to retrieve some value through a side-effect, like network requests and other async functions, they should typically return a promise to indicate that. But there are exceptions to this too of course, like functions that generate random numbers that are returned immediately. Either way, in the context of this discussion these are all functions whose return values should not just be thrown away.

There’s a natural insertion point for such a custom error message.
A “statement” is an expression either on the lhs of a sequence, or the body of a for/while loop.
The insertion point is where this is unified with unit.

What’s less clear, is how give an error message phrased is a way that is helpful to someone new to the language. What phrasing would have helped?

I suck at making error messages myself :new_moon_with_face:. But I remember one tweet. I think the recommendations are good and applicable to our case as well.

I think we should describe what caused the error and also provide some code examples of how to make the code work.

How verbose do we want to go? Off the top of my head:

A value of type string is being ignored. If you mean it, please use cd(temp)->ignore to make it explicit

(I’ guessing here ignore is more idiomatic than let _).

Not sure if this is going to be enough for all the newcomers though. So maybe ideally, we need actual error codes, like TypeScript, or short links like those included in (some of?) React error messages. But that’s beyond the scope of this thread, of course.

EDIT: Grammar

1 Like