Weak Type Variables

fweth · July 21, 2023, 5:39pm

Hello, I just stumbled accross the peculiarity that this doesn’t compile:

let f = (x, y) => x
let g = f(3)

In OCaml, it would compile but produce a very weird thing called weak type variables. I haven’t fully digested the examples given on that page yet, but I wonder if they would be relevant for ReScript as well, or if ReScript could just infer the type ('a) => int for g?

Ryan · July 23, 2023, 2:55am

That article you linked is very good and explains why you can’t just give it that type (in OCaml at least, but most applies to rescript as well).

The main problem is that rescript (and ocaml) allow mutability.

Upon partial application, the f you show will construct a pure function. And that could in theory be given the type 'a => int.

let f: ('x, 'y) => 'x = (x, _y) => x
// Pretend then rescript will allow this.
let g: 'a => int = f(3)
let _ = g(1234) // will return 3
let _ = g("yo") // will return 3

Okay that looks fine. But let’s write a function that when partially applied will return a function with that same type signature as your original g ('a => int). (Again pretend rescript would allow it to happen.)

let f: (unit, 'x, 'y) => 'x = () => {
  let vals = []

  (x, y) => {
    Array.push(vals, y)
    x
  }
}

// Again, just pretend rescript will allow this and it's a polymorphic function.
let g': 'a => int = f'((), 3)

let _ = g'(1234) // returns 3, and puts 1234 into the array
let _ = g'("yo") // returns 3, and puts "yo" into the array...BOOM!

If rescript allowed that type to really be 'a => int and not put a weak type in there, then as you see, I can easily violate the type of the array (ie that it is a thing filled with one type of value). And you can do lot’s of other zany stuff as well (that linked article has some more fun examples). But luckily it can’t happen like that, and we can live in type-safe land even with mutability because the typechecker gives it the weak type.

Let me just put a nice quote from that article here for reference:

The result of a function application is a runtime value called a closure. Since this value will be created only at runtime, the typechecker can’t inspect it, so it is left beyond its reach. The typechecker needs to be very pessimistic (if not paranoid!) and must assume that the closure may create arbitrary references and could use them to breach type soundness.

(Of course, OCaml and Rescript don’t share the same runtime…so somebody more familiar with the compiler/typechecker will have to say if that above doesn’t quite apply the same way in rescript…but the above type issues are still there.)

By the way, your original example doesn’t compile in OCaml either. I’m guessing you probably ran that in utop, or another ocaml toplevel, in which you can sort of get it…

# let f x y = x ;;
val f : 'a -> 'b -> 'a = <fun>
# let g = f 3;;
val g : '_weak1 -> int = <fun>

But if you’re compiling the source code, it will give an error.

$ cat yo.ml 
let f x y = x
let g = f 3
$ ocamlc yo.ml 
File "yo.ml", line 2, characters 4-5:
2 | let g = f 3
        ^
Error: The type of this expression, '_weak1 -> int,
       contains type variables that cannot be generalized

fweth · July 23, 2023, 6:30am

Thanks a lot for that awesome reply! Yes, makes sense to me now!

I ran the example in OCaml Playground. But it’s very weird, because

let f x y = x
let g = f 1
let a = g "hello"

compiles but

let f x y = x
let g = f 1
let a = g "hello"
let b = g true

because g filled the weak variable with string in the 3rd line. I mean it makes much more sense to me now that you gave this nice example, but I think I prefer ReScript’s behaviour of just declining altogether

Ryan · July 23, 2023, 2:50pm

Apologies if I’m misinterpreting what you mean here, but the Rescript behavior is the same as OCaml in this case you show.

let f: ('a, 'b) => 'a = (x, _y) => x
let g: string => int = f(1)
let a: int = g("hello") // This gives info to infer type of g

let f: ('a, 'b) => 'a = (x, _y) => x
let g: string => int = f(1)
let a: int = g("hello")
let b: int = g(true) // Error! (This type has bool, somewhere wanted string)

fweth · July 23, 2023, 3:08pm

True, but if you don’t put in any types or type variables, ReScript refuses to compile:

let f = (x, y) => x
let g = f(1)

while OCaml (at least version 5) allows this.

But I like ReScript’s way better. Even after understanding OCaml’s intend a bit better, I think functions should not change their type (or set a type variable) after calling them. The idea that g wants only string after being fed string once and wants only int after being fed int once is not intuitive.

EDIT: Oh, I just saw that ReScript does that too. In your example, you set the type of g explicitly to string => int but I guess that was a mistake?

Ryan · July 23, 2023, 3:27pm

In the example you show, that will be the type of g whether you annotate it or not, because after typechecker sees let a = g("hello"), it can fill in the weak type with a non-weak type.

Edit: the type annotation was just for clarity for reading

Ryan · July 23, 2023, 3:33pm

The weak type variable is a delayed concrete type. It’s “weakly” polymorphic ie it can be any type, but only any single type. It’s weak until rescript/ocaml has enough info to give it a concrete type, and this must happen before the end of the compilation unit or you get the error.

So, as soon as it sees let a = g("hello"), then it will “know” that type of g must be string => int.

Here are some relavent parts to some articles:

This gives us yet another insight into what a weak type variable is — it is a delayed concrete type. However, the compiler may not delay the decision indefinitely, it must determine a ground type before the end of the file, a.k.a. the compilation unit. Indeed, because OCaml allows for separate compilation, other files cannot provide information to infer the type of a variable defined in our file.

^ Weak Type Variables | OCamlverse

The underscore in the type variable '_weak1 tells us that the variable is only weakly polymorphic, which is to say that it can be used with any single type…OCaml will convert a weakly polymorphic variable to a concrete type as soon as it gets a clue as to what concrete type it is to be used…

^ Imperative Programming - Real World OCaml

fweth · July 23, 2023, 3:34pm

Thanks a lot for the explanations!

Ryan · July 23, 2023, 3:36pm

Yeah it’s a little tricky…if you’re comfortable with OCaml at all, that imperative programming chapter in Real World OCaml I linked above has a lot of good info about this stuff.