Read cmt file and load env to use Env.find_type

adrianhunter · April 22, 2021, 10:06pm

Hey,

I want to read the generated .cmt file from a rescript project and do some analysis of the types. I found an example:


    let get_initial_env = (file_name: string): Env.t => {
      let cmt = readCmt(file_name);
      switch (cmt) {
      | {Cmt_format.cmt_initial_env: env, cmt_loadpath} =>
        Config.load_path := cmt_loadpath;

        /* We call [env_of_only_summary] else the environment is empty
              (contains only the summary) and cannot be searched.
           */
        try(Envaux.env_of_only_summary(env)) {
        | Envaux.Error(error) =>
          Envaux.report_error(Format.str_formatter, error);
          failwith(Format.flush_str_formatter());
        };
      | _ => failwith("Cannot extract cmt data.")
      };
    };

But it doesn’t work for code that uses/imports other modules. I always get something like:

    FAILED: src/html/Html.cmj
    Fatal error: exception Env.Error(_)

So, my question is: is this actually supposed to work, or should I somehow use the Env module from https://github.com/rescript-lang/ocaml/tree/24bd8f7f4ee641ea78c536ef557e25396cb4d537 ?

It looks like it’s trying to load the cmj files and fails.

I also had a quick look at reanalyze, but I could not make sense of it. They somehow define an Env module here: https://github.com/rescript-association/reanalyze/blob/6335ce4399ecb305e653561d2d01912fd6c8f917/src/Il.ml , but manually add all the stuff.

So to sum it up, I want to load a cmt file plus the environment so that I can traverse the typedtree and look up types with Env.find_type.

cristianoc · April 22, 2021, 10:53pm

Env does not contain info on modules other than the current file.
Not sure it can help much with what you need. Reanalyze and genType don’t use it.
References across files are indeed tricky and have several special cases and idiosyncrasies.
Would you expand a bit on what you’d like to build? A good starting point would be the analysis for dead types in reanalyze, as it needs to e.g. track all the type references.
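
For the single-file case, the type declarations can be read straight from the typedtree stored in the cmt, without going through Env at all. The following is a minimal sketch in plain OCaml, not how reanalyze does it. It assumes compiler-libs from the same compiler version that produced the cmt (otherwise `Cmt_format.read_cmt` rejects the file) and the `Tstr_type (rec_flag, decls)` constructor shape used since OCaml 4.03. The function names `constructors_of_decl` and `variant_types_of_cmt` are made up for illustration.

```ocaml
(* Build against the compiler-libs.common library of the compiler that
   produced the .cmt file. *)

(* Return the name and constructor names of a variant type declaration,
   or None when the declaration is not a variant. *)
let constructors_of_decl (decl : Typedtree.type_declaration)
  : (string * string list) option =
  match decl.Typedtree.typ_kind with
  | Typedtree.Ttype_variant ctors ->
    let names =
      List.map
        (fun (c : Typedtree.constructor_declaration) ->
           c.Typedtree.cd_name.Location.txt)
        ctors
    in
    Some (decl.Typedtree.typ_name.Location.txt, names)
  | _ -> None

(* Collect every top-level variant type of the implementation stored in
   [cmt_path], in source order. Only this one file is inspected: types
   from other modules are not resolved. *)
let variant_types_of_cmt (cmt_path : string) : (string * string list) list =
  let cmt = Cmt_format.read_cmt cmt_path in
  match cmt.Cmt_format.cmt_annots with
  | Cmt_format.Implementation structure ->
    List.fold_right
      (fun (item : Typedtree.structure_item) acc ->
         match item.Typedtree.str_desc with
         | Typedtree.Tstr_type (_rec_flag, decls) ->
           List.fold_right
             (fun decl acc ->
                match constructors_of_decl decl with
                | Some entry -> entry :: acc
                | None -> acc)
             decls acc
         | _ -> acc)
      structure.Typedtree.str_items []
  | _ -> []
```

Anything declared in another file would need that file's cmt to be read the same way and the references matched up by hand, which is where the complexity mentioned above comes in.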

adrianhunter · April 22, 2021, 11:40pm

Mh, ok. I will have a look at reanalyze then. Gentype should also help.

I want to build a JSON schema generator and a Bitcoin script compiler. I already got basic versions working, but just as a ppx. For the JSON schema generator I want to read the declarations of the referenced types, e.g. to create an enum from a referenced variant that lives in another file. The Bitcoin script compiler basically reads function bodies that have the basic Bitcoin opcodes as functions piped together, so like a little Forth in ReScript, with lists or tuples as the “stack”. This already works pretty well with not much code, but now I want/need more type information.
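
To make the enum case concrete, here is a rough sketch in OCaml of the last step, once the constructor names of the variant are known. It assumes the generator encodes each constructor by its name as a JSON string, which is a design choice of the generator and not something the compiler dictates. The function name `enum_schema` and the `color` variant are hypothetical.

```ocaml
(* Render a JSON Schema fragment describing a string enum.
   Assumes the names are plain identifiers (as constructor names are),
   so no JSON string escaping is performed. *)
let enum_schema (constructors : string list) : string =
  let quoted = List.map (fun name -> "\"" ^ name ^ "\"") constructors in
  "{\"type\":\"string\",\"enum\":[" ^ String.concat "," quoted ^ "]}"

let () =
  (* Hypothetical variant: type color = Red | Green | Blue *)
  print_endline (enum_schema ["Red"; "Green"; "Blue"])
```

The hard part is getting the constructor list when the variant is declared in a different file than the one being analyzed.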

I thought the cmt would already contain all the information about the types once I load the environment, or would at least load the needed file internally when I call a function that tries to find/load the type.

cristianoc · April 22, 2021, 11:59pm

Yes, so you can either take the reanalyze route and inherit all the complexity, or try to put everything in one file.
There are also ways to “pack” a project automatically so that it ends up in one file for analysis. The compiler, for example, does that.