This repository has been archived by the owner on May 29, 2018. It is now read-only.

Support caching build/dep directories #93

Open
sqs opened this issue Dec 20, 2014 · 1 comment

Comments

@sqs
Member

sqs commented Dec 20, 2014

Dependency resolution and some other build tasks take a long time but essentially do the same thing every time. If toolchains could write to a directory tree that would be present at future invocations, they could be made to run a lot faster.

Examples:

  • mvn install can easily download hundreds of megabytes per pom.xml file, and a single repository can have many pom.xml files.
  • go get in a repository with multiple Go packages has to download each dependency once per package (source unit) it appears in.
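
A minimal sketch of what "a directory tree that would be present at future invocations" could look like, assuming a hypothetical cache root and a hypothetical layout of one directory per toolchain and repository. None of these names come from srclib; `cache_dir` is made up for illustration.

```python
import hashlib
from pathlib import Path


def cache_dir(cache_root: Path, repo_url: str, toolchain: str) -> Path:
    """Return a directory that persists across invocations for one
    (toolchain, repository) pair. The layout is an assumption, not srclib's."""
    # Hash the repo URL so the directory name is filesystem-safe and stable.
    repo_id = hashlib.sha256(repo_url.encode("utf-8")).hexdigest()[:16]
    path = cache_root / toolchain / repo_id
    # exist_ok=True makes the call idempotent: later runs reuse the same tree.
    path.mkdir(parents=True, exist_ok=True)
    return path
```

A toolchain could then point its package manager's download directory at the returned path, so a second run finds whatever the first run fetched.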

Prior art:

@xizhao
Contributor

xizhao commented Dec 20, 2014

Related: #65

If each repo could have a "src unit" or "src module" cache, you could identify individual dependencies by their unique URLs.
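
Roughly, a URL-keyed cache might look like the sketch below. It assumes a hypothetical `download` callback supplied by the toolchain and a cache directory named after a hash of the dependency URL; `dep_cache_path` and `fetch_once` are made-up names, not srclib APIs.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Callable


def dep_cache_path(cache_root: Path, dep_url: str) -> Path:
    """Map a dependency URL to a stable directory under the cache root."""
    digest = hashlib.sha256(dep_url.encode("utf-8")).hexdigest()
    return cache_root / digest


def fetch_once(cache_root: Path, dep_url: str,
               download: Callable[[str, Path], None]) -> Path:
    """Download dep_url only if no source unit has fetched it before."""
    target = dep_cache_path(cache_root, dep_url)
    if target.exists():
        return target  # cache hit: skip the download
    target.mkdir(parents=True)
    try:
        download(dep_url, target)
    except Exception:
        # Don't leave a half-filled entry that would look like a cache hit.
        shutil.rmtree(target, ignore_errors=True)
        raise
    return target
```

With this, two source units in the same repo that depend on the same URL would trigger a single download. The sketch ignores concurrent builds and version changes behind an unchanged URL.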

The challenge I see here is that each toolchain often relies on a default package manager (PM) for dependency resolution. That means (1) each PM has to support caching itself, and (2) the directory structure has to be compatible with the PM, and is sometimes non-configurable. In that sense things are very tc-dependent, and maybe the srclib level is the wrong place to start solving this.

Maybe srclib could take advantage of this when toolchains run in docker containers. When a tc announces its depresolve step, srclib could commit the container state, name the image something like src-depresolve-<REPOID>-<COMMITID> (where REPOID is a hash of, say, the URL + name of the repo), and then pick up that image and retag it on the next build. Old build data would stay cached for each respective tc, yet the tcs would remain sandboxed from each other, since each fork of the docker image stores only the build data relevant to its tc. Reviving an independent depresolve state then just means picking up an old image and running the depresolve step again, assuming the tc leverages stateful caching.
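
To make the naming scheme concrete, here is a small sketch. The SHA-256 hash, the 12-character truncation, and the function names are assumptions for illustration; the comment above only says REPOID is "a hash of say the URL + name of repo".

```python
import hashlib


def repo_id(repo_url: str, repo_name: str) -> str:
    """Hash the repo URL and name into a short, stable identifier."""
    h = hashlib.sha256()
    h.update(repo_url.encode("utf-8"))
    h.update(b"\0")  # separator so ("ab", "c") and ("a", "bc") hash differently
    h.update(repo_name.encode("utf-8"))
    return h.hexdigest()[:12]  # truncation length is an arbitrary choice


def depresolve_image_name(repo_url: str, repo_name: str, commit_id: str) -> str:
    """Build a name of the form src-depresolve-<REPOID>-<COMMITID>."""
    return f"src-depresolve-{repo_id(repo_url, repo_name)}-{commit_id}"
```

The resulting string is all lowercase hex plus dashes, so it should be usable as an image name with `docker commit` after depresolve, and with `docker tag` when the next build retags it for a new commit.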

My first guess:

  • srclib-javascript would work fine, as node_modules would be cached
  • srclib-python wouldn't leverage this, as it tells pip to download to tmp folders that are cleaned up after each execution
