Dependency resolution and some other build tasks take a long time but essentially do the same thing every time. If toolchains could write to a directory tree that would be present at future invocations, they could be made to run a lot faster.
Examples:
- `mvn install` can easily download hundreds of megabytes per `pom.xml` file, and a single repository can have many `pom.xml` files.
- `go get` in a repository with multiple Go packages has to download dependencies once per package (source unit) it appears in.
If each repo could have a "src unit" or "src module" cache, individual dependencies could be identified by their unique URLs.
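As a rough sketch of that keying scheme (the `unitCachePath` helper and the cache layout are my own invention here, not anything that exists in srclib):

```go
package cache

import (
	"crypto/sha256"
	"encoding/hex"
	"path/filepath"
)

// unitCachePath maps a dependency's unique URL to a stable directory
// under cacheRoot, so repeated depresolve runs can reuse prior downloads.
func unitCachePath(cacheRoot, depURL string) string {
	sum := sha256.Sum256([]byte(depURL))
	return filepath.Join(cacheRoot, hex.EncodeToString(sum[:8]))
}
```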
The challenge I see here is that each toolchain often leverages the language's default package manager to do dependency resolution, which raises two problems: (1) each package manager has to support caching itself, and (2) the directory structure has to be compatible with the package manager and is sometimes non-configurable. In this sense things are very tc-dependent, and maybe the srclib level is the wrong place to start solving this.
Maybe what srclib could do is leverage this when toolchains are run in Docker. When a tc announces its depresolve step, srclib could commit the container state, name it something like `src-depresolve-<REPOID>-<COMMITID>` (where REPOID is a hash of, say, the URL + name of the repo), and then pick up that image and retag it on the next build. Old build data would truly be cached for each respective tc, and yet each tc would still be sandboxed from the others, since only the relevant build data for each tc is stored in its fork of the Docker image. Reviving an independent depresolve state would then mean picking up an old image and just running the depresolve step again, assuming the tc leverages stateful caching.
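A minimal sketch of that commit-and-retag flow, shelling out to the `docker` CLI; the image naming follows the scheme above, but the helper functions themselves are hypothetical:

```go
package depcache

import (
	"crypto/sha1"
	"fmt"
	"os/exec"
)

// depresolveImage derives the cache image name for a repo+commit,
// i.e. src-depresolve-<REPOID>-<COMMITID>, where REPOID is a hash of
// the repo's URL + name.
func depresolveImage(repoURL, repoName, commitID string) string {
	repoID := fmt.Sprintf("%x", sha1.Sum([]byte(repoURL+repoName)))[:12]
	return fmt.Sprintf("src-depresolve-%s-%s", repoID, commitID)
}

// commitDepresolveState snapshots the container's filesystem right after
// the toolchain's depresolve step finishes.
func commitDepresolveState(containerID, image string) error {
	return exec.Command("docker", "commit", containerID, image).Run()
}

// retagForNextBuild re-points the cache tag at the newest snapshot so the
// next build starts from it.
func retagForNextBuild(srcImage, targetImage string) error {
	return exec.Command("docker", "tag", srcImage, targetImage).Run()
}
```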
My first guess:
- srclib-javascript would work fine, since `node_modules` would be cached.
- srclib-python wouldn't leverage this as-is, because it tells pip to download to tmp folders that are cleaned up after each execution.
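If srclib-python wanted to opt in, one option would be to point pip at a persisted directory via pip's standard `PIP_CACHE_DIR` environment variable instead of tmp folders (the `/srclib/cache` mount point here is hypothetical):

```go
package toolchain

import (
	"os"
	"os/exec"
)

// pipInstall runs pip with its cache redirected to a persisted directory
// so downloaded packages survive across invocations. PIP_CACHE_DIR is
// pip's environment-variable equivalent of --cache-dir.
func pipInstall(requirementsFile string) error {
	cmd := exec.Command("pip", "install", "-r", requirementsFile)
	cmd.Env = append(os.Environ(), "PIP_CACHE_DIR=/srclib/cache/pip")
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	return cmd.Run()
}
```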