How to speed up ReScript on a CI server?

JesterXL · January 3, 2022, 9:10pm

Running npm ci with ReScript as a dev dep takes 23 minutes, but 30 seconds locally.

More details:

If I run rm -rf node_modules then time npm ci, my 4 libraries take about 30 seconds to install.
If I npm i rescript -D, then nuke node_modules, then run time npm ci again, it takes 30 seconds again.

I’ve already installed ReScript in the past on my machine, so I reckon it may have already cached all the Python/Ninja stuff.

When I run on my CI server (Gitlab + some EC2 things):

with the 4 deps, it takes about 90 seconds.
If I install ReScript as a dev dep, and run npm ci there, it takes 23 minutes.

My only guess is it cannot cache the whole Python/Ninja install thing so each time has to run it.

Is there anything I can do to speed this up? Pre-install on a Docker container somehow? Or is there something else I’m overlooking?

yawaramin · January 3, 2022, 9:29pm

Small suggestion, ReScript should not be a devDep, it should be a normal dep. It has runtime libraries that are used in consuming applications (like Curry, etc.).

Beyond that–23 minutes seems pretty excessive. What is it spending all that time on, mostly?

JesterXL · January 3, 2022, 9:31pm

I thought that was what https://github.com/rescript-lang/std was for; I’ve been putting that in my deps.
That’s a great question and impossible for me to tell; the npm log does that “blah blah blah uuid@v1 is deprecated” and just sits there, so not sure where in the npm install drama it is hanging. Is there a way to do more verbose logging to see?

JesterXL · January 3, 2022, 9:35pm

Case in point, there’s a 20 minute gap between the warning and the “added 954”.

npm WARN deprecated request@2.88.2: request has been deprecated, see https://github.com/request/request/issues/3142
added 954 packages, and audited 955 packages in 20m

yawaramin · January 3, 2022, 9:41pm

Try disabling npm’s fancy progress bar with npm set progress=false (this tells it to just print progress one per line), then running in verbose with npm --verbose ci, to force it to print more info.

JesterXL · January 3, 2022, 10:18pm

Here’s a snippet with the beast in the middle:

npm timing reifyNode:node_modules/rescript Completed in 2934ms
npm timing reify:unpack Completed in 2935ms
npm timing reify:unretire Completed in 1ms
npm timing build:queue Completed in 3ms
npm timing build:link:node_modules/rescript Completed in 6ms
npm timing build:link Completed in 7ms
npm info run rescript@9.1.4 postinstall node_modules/rescript node scripts/install.js

hangs on the above command 20 minutes, and then...

npm info run rescript@9.1.4 postinstall { code: 0, signal: null }
npm timing build:run:postinstall:node_modules/rescript Completed in 633669ms
npm timing build:run:postinstall Completed in 633669ms
npm timing build:deps Completed in 633679ms
npm timing build Completed in 633680ms
npm timing reify:build Completed in 633681ms
npm timing reify:trash Completed in 0ms
npm timing reify Completed in 636645ms
added 1 package, and audited 2 packages in 11m
found 0 vulnerabilities
npm timing command:ci Completed in 636691ms
npm verb exit 0
npm timing npm Completed in 636964ms
npm info ok 
Saving cache for successful job
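
To spot the slow step without reading the whole log, the `npm timing` lines can be sorted by duration. This is a minimal sketch, assuming the verbose output was saved to a file. The function name `slowest_npm_steps` and the file name `npm-ci.log` are made up for illustration.

```shell
# Print the slowest npm timing entries from a verbose log read on stdin.
# Usage: slowest_npm_steps [count] < npm-ci.log
slowest_npm_steps() {
  count="${1:-5}"
  # Timing lines look like: "npm timing <step> Completed in <N>ms".
  # Emit "<N> <step>" for each one, then sort numerically, largest first.
  awk '$1 == "npm" && $2 == "timing" && $NF ~ /^[0-9]+ms$/ {
         ms = $NF
         sub(/ms$/, "", ms)
         print ms, $3
       }' | sort -rn | head -n "$count"
}
```

For example, `npm --verbose ci 2>&1 | tee npm-ci.log` followed by `slowest_npm_steps 10 < npm-ci.log`. On the log above, every entry that wraps the `postinstall` script (roughly 633,000 to 637,000 ms) lands at the top, far ahead of the unpack and link steps.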

yawaramin · January 3, 2022, 10:25pm

Ah, there we go, it’s running ReScript install scripts for 20 minutes. You might be using a build agent OS for which ReScript doesn’t provide prebuilt binaries? You can try switching to something standard, like ubuntu-latest or something like that. Or if that doesn’t help, your remaining option is to Dockerize the build and use that image for all builds going forward.

JesterXL · January 3, 2022, 10:43pm

I’m trying that now. I’m installing it as the last step on the image, installing them globally. My hope is, when my code goes to install ReScript, it should be fast “because it’s already in global with the same version”… I hope?

BTW, did my rescript/std make sense or no?

yawaramin · January 3, 2022, 10:50pm

Yeah, that makes sense, but why have two dependencies when you can have just one? I assume you’re building a Node app, and don’t need to distribute a lightweight library. So depending on rescript as a normal dep doesn’t really hurt.

Unless you are building a library, in which case ignore my suggestion

JesterXL · January 3, 2022, 10:52pm

Got it.

Ok, running into same speed problem. Do I just remove rescript from package.json now that it’s global or something? I mean it should already be on the container; I updated the hash. Blarg…

JesterXL · January 3, 2022, 10:55pm

Like in Docker I cd’d to /usr/bin and just npm i -g rescript; I saw it put it globally, which is cool, but for some reason my build using the latest Docker published image hash is attempting to build ReScript again.

yawaramin · January 3, 2022, 11:03pm

My suggestion is that you don’t need to install rescript globally, you can still use npm ci to install it, but to use a specially-built Docker image in your pipeline. E.g., How to Build a Docker Image and Push it to the GitLab Container Registry from a GitLab CI pipeline | by Valentin Despa | DevOps with Valentine | Medium

The idea is to prepare a Docker image with the npm ci part already done, i.e. rescript already installed. Then on each pipeline build, use that image in the pipeline. Then, doing npm ci again will just install the diff of any changed dependencies, and then a full clean build should be fairly fast.
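
One caveat: `npm ci` deletes any existing node_modules before it installs, so a prebuilt image only saves time if the job reuses the installed modules rather than reinstalling them. Below is a minimal sketch of that reuse step. It assumes the image build copied package.json and package-lock.json into a directory and ran `npm ci` there. The path `/opt/prebuilt` and the function name `restore_or_install` are hypothetical, and this is not what anyone in the thread actually ran.

```shell
# Reuse node_modules baked into the CI image when the lockfile is unchanged.
# Run from the project checkout. The default path /opt/prebuilt is an
# assumption: wherever the image build ran `npm ci`.
restore_or_install() {
  prebuilt_dir="${1:-/opt/prebuilt}"
  if [ -d "$prebuilt_dir/node_modules" ] &&
     cmp -s package-lock.json "$prebuilt_dir/package-lock.json"; then
    # Same lockfile: copy the modules that were already installed (and whose
    # postinstall scripts already ran) instead of installing again.
    rm -rf node_modules
    cp -R "$prebuilt_dir/node_modules" node_modules
    echo "reused prebuilt node_modules"
  else
    # Lockfile differs or nothing was prebuilt: do a normal clean install.
    npm ci && echo "ran npm ci"
  fi
}
```

Because the function runs from the current directory, the image does not need to know where GitLab checks out the project. Comparing lockfiles also means a dependency change falls back to a full install instead of silently using stale modules.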

DZakh · January 3, 2022, 11:31pm

And don’t use alpine based image for this. I’ve written more about it in the issue https://github.com/rescript-lang/rescript-compiler/issues/3666#issuecomment-923378525

JesterXL · January 4, 2022, 4:46pm

My challenge here is directories. Like, in Docker, where do I cd to? I believe Gitlab does a git clone and then cd’s to something like /builds/your/gitlab/project/path. My base Docker image won’t know that, and the CI_BUILDS_DIR environment variable seems to be only available in the gitlab-ci.yml file.

My entire CI team uses Alpine, so I’ll see if I can pawn this off to one of the Ops crew. I’ve tried all morning to get node:bullseye to work, but Alpine commands vs. Debian are completely different and my Google Fu isn’t strong enough to figure out bugs (like why is Alpine curl -o fine, but Debian is like “wtf is -o?”). I’ll keep you posted.

DZakh · January 6, 2022, 12:17am

You don’t need to know where GitLab CI checks out your project. If you do, then you are probably using absolute paths, and it’s better to replace them with relative ones.

As for curl, you can figure it out by creating a small job with the script curl --help and comparing the output.

JesterXL · January 6, 2022, 6:59pm

Sorry, Omicron + food poisoning had me crushed for 2 days. Still have the brain fog.

Figured it out; stupid \ trailing slash lelz. Docker continues to be the worst thing on the planet. However, using it + cache + artifacts, POW, 90 seconds vs 20 minutes, BOOOYYAAAA
ReScript in node_modules is 200 megs. That would fly for EC2, but not for serverless deploy into Lambda, lol, sorry, back to rescript/std.

Odd thing, too; it’s a 250 meg bundle size for my 3 Lambdas in CI, but locally it’s 80. wat.

JesterXL · January 6, 2022, 7:12pm

Weird, even with rescript/std it’s 173 meg on CI, hrm…

yawaramin · January 6, 2022, 8:06pm

Hmm, thinking about this a bit more, it makes sense for the Docker image that actually does the build to be large. It runs in your CI pipeline and caches all the build dependencies (download + build of ReScript, all npm dependency packages).

To really cut down on the output size, it makes sense to have the above pipeline produce a single minified, tree-shaken, bundled JS file. This single file should be pretty easy to deploy anywhere.

DZakh · January 6, 2022, 10:36pm

It’s ok to have a 200mb Debian docker image. If it was alpine, it would be something like 15mb. It’s not related to the js output, but to the system and its dependencies.

JesterXL · January 7, 2022, 1:57am

Naw, I mean the “yourcode.zip” that the Serverless framework generates from your ReScript compiled JS files + the package.json dependencies (not devDependencies) that it then uploads to S3 so it can then go “CloudFormation, deploy that zip”. Lambdas can be like 250 meg, but they really shouldn’t be above 10 meg unless you’re doing beast work. Since APIs need to be fast, you want 'em in the kilobyte range, really, but hard to do with our current tooling. I don’t care about the Docker file size, lol, that works fine.

When I’m feeling better tomorrow I’ll take a look at what’s actually getting into the zip file locally vs. on the ci server. I reckon it’s my horrible “cache vs. artifacts” skills in the gitlab-ci.yml file or perhaps npm is installing different things remotely.
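
One way to compare the local and CI packages is to list the largest entries in each and look for what differs. This is a minimal sketch, assuming the zip has been unpacked into a scratch directory (or pointed straight at node_modules). The function name `largest_entries` is made up for illustration.

```shell
# List the largest entries directly under a directory, biggest first.
# Sizes are in kilobytes as reported by du.
largest_entries() {
  dir="${1:-node_modules}"
  count="${2:-10}"
  du -sk "$dir"/* | sort -rn | head -n "$count"
}
```

Running `largest_entries node_modules 20` locally and in a CI job, then diffing the two outputs, should show whether devDependencies or an extra copy of rescript are ending up in the CI bundle.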