Ppx performance benchmark

Following up on:

Pupilfirst is still on an old version, so I can’t produce a benchmark for the newest version. I am happy to be proven wrong, so if you could share actual metrics from your production codebases, measured with my benchmark script, that would be really useful.

As always, our biggest concern is managing expectations. We can’t guarantee that a ppx will always allow fast compilation speeds. It depends heavily on the setup and project size, which is why we tend to be defensive about ppx usage in general.

Is there source code? My guess is that it might be due to an overly defensive parse function in earlier versions of graphql-ppx, which produced a large Cartesian product of generated code for deeply nested queries (not uncommon for GraphQL). Fixing that was one of my first patches. Nowadays the parse function is very lean and should look like the code you would write yourself.

For GraphQL, a ppx is the best I can come up with given the language features we have at the moment. I very much agree that it’s quite literally a language extension and should be used sparingly (in our codebase we don’t use any other ppx for that reason). But perhaps we can have a few ppxes that are “sanctioned” by the community for essential use cases like GraphQL? Happy to think about this and contribute. I am not married to ppx, only to the value it provides for my company.

It is here: https://github.com/pupilfirst/pupilfirst

Would be interesting to see benchmark comparisons between the old and new version!

A clean build of that is 6.5 seconds, not bad for a 36k line codebase. Is this much more than expected?

“Lean parse” landed in 0.6.1, and I see graphql-ppx is on the most recent 0.x version, so perhaps that already solved most of the performance issues?

Hm, not a big difference when I try an older version of graphql-ppx. Do you think 6 seconds is too slow? I did some experimentation with performance, but I would be surprised if graphql-ppx were responsible for more than 1s of this compilation. (Usually the syntax transform/refmt and graphql-ppx are only a small fraction of the compile time.)

Use https://rescript-lang.org/docs/manual/latest/build-performance#profile-your-build

Full build is 14s, ppx + old reason syntax in red; more than a third of the build.

I guess my machine is a little faster (16" mbp). I’ll try to fiddle with this tool. Do you also know the split between the ppx and the old reason syntax? If the ppx is around 10%, that is in the right ballpark, because it also does a lot of work (always good to see if we can optimize even more, but at least it looks like there are no major performance problems).

Given how pervasive and important graphql_ppx is in a code base, 30% doesn’t look unjustified indeed

This isn’t GraphQL related, but even a seemingly simple PPX can be expensive. The team I’m working on decided to dip our toes into PPX with decco and clean build time increased by 40-50% (from around 6 seconds to around 9) on a 2020 MacBook Pro 13". We don’t need to do clean builds much so this was deemed acceptable.
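As a quick sanity check on those numbers, here is the arithmetic as a tiny Python sketch. The function name is just for illustration; 6 and 9 are the rounded timings quoted above, so the result lands at the top of the 40-50% range.

```python
def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Rounded clean-build times from the post: about 6 s before, about 9 s after.
print(percent_increase(6, 9))  # 50.0
```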

The decco performance is so bad because the ppx is invoked via a shell script, see https://github.com/reasonml-labs/decco/blob/master/ppx, i.e. a shell needs to be forked for every file that is compiled.
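To get a feel for what an extra shell costs, here is a rough Python sketch that times a no-op command started directly versus through `sh -c`. It assumes a POSIX system with `true` and `sh` on the PATH, and it only measures process start-up, not what the decco wrapper actually does, so treat the numbers as an order-of-magnitude hint.

```python
import subprocess
import time


def mean_spawn_seconds(cmd, runs=20):
    """Average wall-clock seconds needed to start `cmd` and wait for it to exit."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(cmd, check=True)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    direct = mean_spawn_seconds(["true"])               # binary started directly
    wrapped = mean_spawn_seconds(["sh", "-c", "true"])  # extra shell in between
    print(f"direct:  {direct * 1000:.2f} ms per call")
    print(f"wrapped: {wrapped * 1000:.2f} ms per call")
```

The difference between the two figures is paid once per compiled file, which is why it adds up on a clean build.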

Compare with the approach in e.g. graphql-ppx where the correct binary is copied into place by a post install script: https://github.com/reasonml-community/graphql-ppx/blob/master/copyPlatformBinaryInPlace.js.
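The general idea of such a post install step can be sketched in Python as follows. This is not the graphql-ppx script: the file names in `BINARIES` and the `ppx` target name are hypothetical, chosen only to show the selection happening once at install time instead of on every compiler invocation.

```python
import platform
import shutil
from pathlib import Path

# Hypothetical file names; a real package would ship its own.
BINARIES = {
    "Linux": "ppx-linux.exe",
    "Darwin": "ppx-darwin.exe",
    "Windows": "ppx-windows.exe",
}


def binary_for(system):
    """Return the prebuilt binary name for an OS name as given by platform.system()."""
    try:
        return BINARIES[system]
    except KeyError:
        raise RuntimeError(f"no prebuilt binary for {system!r}")


def install(package_dir, target_name="ppx"):
    """Copy the matching binary to a fixed path so the build can call it directly."""
    package_dir = Path(package_dir)
    source = package_dir / binary_for(platform.system())
    target = package_dir / target_name
    shutil.copy2(source, target)  # copy2 keeps the executable permission bits
    return target
```

After `install(...)` has run once, the build tool can point at the fixed `ppx` path and never needs a wrapper script to pick a platform.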

Interesting, I wonder if ppx_deriving does it correctly. Could be used with the yojson plugin.

Yeah, I suspected as much. Unfortunately the binary can’t be used directly: when I tried that manually, it gave me a weird error. The way it’s been built means it requires the extra --as-ppx argument :neutral_face:

Definitely not, it uses a JS script to determine which binary to execute. That costs several hundred milliseconds for every file that is compiled.
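A back-of-the-envelope estimate of how that adds up, as a Python sketch. The file count, the 200 ms figure and the number of parallel jobs are all made-up inputs for illustration, and dividing evenly by the job count is a simplification of how a real build schedules work.

```python
def wrapper_overhead_seconds(file_count, per_file_ms, parallel_jobs=1):
    """Rough total time spent in a per-file wrapper, spread over parallel jobs."""
    if parallel_jobs < 1:
        raise ValueError("parallel_jobs must be at least 1")
    return file_count * per_file_ms / 1000 / parallel_jobs


# Hypothetical project: 500 files, 200 ms of wrapper start-up per file.
print(wrapper_overhead_seconds(500, 200))                   # 100.0
print(wrapper_overhead_seconds(500, 200, parallel_jobs=8))  # 12.5
```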
