Any dead value/dead code?

shapespeare · July 29, 2022, 2:36am

Hello.

I’m currently working on static analysis for dead value/dead code in ReScript. Even though ReScript supports dead code elimination on its own, I wonder if there are coding habits or frequently used idioms that could leave uncaught scraps behind. It would be a great pleasure if you could share any such experiences.

zth · July 29, 2022, 6:44am

Side note, but have you seen https://github.com/rescript-association/reanalyze? It does project wide dead code analysis too.

shapespeare · July 29, 2022, 7:12am

Yes, thank you for the comment! I’ve already checked it and am considering it as a reference, but thanks for the reminder.

cristianoc · July 29, 2022, 2:26pm

@shapespeare are you from SNU?

shapespeare · July 29, 2022, 9:41pm

Don’t have any clue how you knew it, but yes.

cristianoc · July 29, 2022, 10:17pm

Context: @shapespeare belongs to a research group at Seoul National University looking into static analysis for ReScript.
One of the first aspects being looked at is dead code/value analysis, but the group might explore others in the future.
This is a parallel effort to reanalyze, and might complement its use case, depending on how things develop.
Any feedback on where reanalyze currently falls short could help the group direct attention to where the important problems to solve are. Also, suggestions for specific open source code bases to use as targets are welcome.

zth · July 30, 2022, 8:16am

Here’s a brain dump of things I’ve thought about in this area, both when using JS tooling that does dead code elimination, as well as reanalyze. Hoping it can be helpful in some way. I’ve focused a lot on dead code elimination of object properties in JS, because I believe that’s one of the more important things for the size of the JS generated by ReScript, as modules are modeled as JS objects when emitted.

Dead code elimination in JS

Standard JS dead code elimination is typically pretty good at eliminating the most basic cases of dead code. This includes analyzing what exports are actually used from a file, and removing anything that is never accessed. It also includes removing code internal to the file that’s never exported, etc. The basic stuff.

But, because of the dynamic nature of JS, and the numerous ways to do the same thing (like accessing properties of an object), my experience is it falls short for anything that’s not the most basic cases. This includes things like removing unused object properties. It’s just too easy to write overly dynamic JS code that’s hard to analyze, or impossible to optimize because of its dynamic nature.

One example is that Rollup (a popular bundler with good tree shaking capabilities) has recently done a lot of work in this area, to try and tree shake object properties. I believe some of it has landed and can be used, but the long thread of PRs and bugs that has surfaced really highlights the complexity of doing this type of static analysis for JS: https://github.com/rollup/rollup/pull/4520

Dead code analysis in ReScript

One upside of ReScript being simple as compared to JS/TS etc, is that I don’t think we suffer from a lot of the difficulties that JS/TS has. One example is that it’s perfectly fine to do this in JS:

var someObject = { someMethod: () => ..., otherMethod: (someVar) => ... };
var myPropName = somethingComputed(someVariable);
var result = someObject[myPropName](someOtherVariable);

Since property access here is dynamic, there’s just no way for JS dead code eliminators to remove object properties from the underlying someObject, because there’s no easy way to guarantee what properties are and aren’t accessed at runtime.
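To make that concrete, here is a minimal runnable sketch of the same pattern. The method bodies and the helper pickMethodName are made up for illustration; they only stand in for the elided parts of the snippet above.

```javascript
// Hypothetical object; the method bodies are placeholders for illustration.
const someObject = {
  someMethod: () => "some",
  otherMethod: (someVar) => `other:${someVar}`,
};

// Hypothetical helper: the property name is only known at runtime.
function pickMethodName(input) {
  return input.length > 3 ? "otherMethod" : "someMethod";
}

function callDynamic(input) {
  const myPropName = pickMethodName(input);
  // Which property is read here depends on the runtime value of `input`,
  // so a static tool has to assume both methods may be used and keep them.
  return someObject[myPropName](input);
}
```

Both methods end up reachable depending on the input, so neither can be dropped without knowing every value callDynamic will ever receive.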

The good part for us is that it’s impossible to write that type of code in idiomatic ReScript. This should give us a much, much simpler outlook than what JS currently has, since the number of cases one needs to analyze should be almost orders of magnitude fewer than in JS (where, as said before, it’s often possible to do the same thing in 10 different ways).

Side note: The fact that ReScript is so much simpler than JS, and that it’s impossible to do a lot of the “weird” things you can do in stock JS, is really something we should keep highlighting as an advantage of the language. It’s what opens the door for tooling like reanalyze.

Reanalyze

I don’t understand enough of the technicals around this, but dreaming freely I’ve often thought about how powerful extending ReScript’s dead code elimination could be, if we could leverage for example Reanalyze to remove all dead code program wide before even emitting the JS. That would mean we could do all of the hard stuff in ReScript, the stuff a regular JS bundler currently struggles with for dead code elimination, and then let the JS bundlers just do what they’re good at.

For example, Reanalyze can track what properties of a record are actually accessed in the code, and how they’re accessed (read/write). Maybe that could be leveraged to simply never emit the unused properties in the first place. Same goes for anything in a module - reanalyze can track what’s used and not used. Don’t emit what’s not used. Or emit markers that make it trivial for the JS bundler to remove code marked as not used. Or something else. The potential is there at least IMO.
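As a rough illustration of the idea of tracking reads and writes per field, here is a toy sketch. It is not how reanalyze is implemented; the access log format, the function name findDeadFields, and the rule "a field that is never read is dead" are all simplifying assumptions made up for this example.

```javascript
// Toy model, not reanalyze's actual implementation: each access is a
// { field, kind } entry, where kind is "read" or "write".
function findDeadFields(declaredFields, accesses) {
  const readFields = new Set(
    accesses.filter((a) => a.kind === "read").map((a) => a.field)
  );
  // Assumption: a field that is never read is dead, even if it is written to.
  return declaredFields.filter((field) => !readFields.has(field));
}

// Hypothetical record with three fields and a hypothetical access log.
const declared = ["name", "age", "cachedLabel"];
const accessLog = [
  { field: "name", kind: "read" },
  { field: "age", kind: "write" },
  { field: "cachedLabel", kind: "write" },
  { field: "age", kind: "read" },
];

const deadFields = findDeadFields(declared, accessLog);
```

In this toy log cachedLabel is only ever written, so it is the one field reported as dead and, in principle, a candidate for never being emitted.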

Misc

It might also be interesting to take a look at the various optimizations the Closure JS compiler does. That project has a ton of effort put into it over a long period of time. Been a while since I’ve used it actively, but I believe it does fairly advanced dead code elimination.

Sorry for the rambling, hoping it might help in some way.

shapespeare · July 30, 2022, 9:36am

Thanks a lot! JS is somewhat… enigmatic, in a bad way.