
Binder API for specifying, launching, and pooling notebook servers. #5

Closed
rgbkrk wants to merge 27 commits

Conversation

rgbkrk (Member) commented Sep 27, 2015

The temporary notebook system (tmpnb) was put together to solve some immediate
demands, the primary of which were:

  • Instantaneous launching of a brand new notebook server
  • Fully reproducible environment (installation, code, data, notebooks, etc.)
  • Multi-tenant
  • Launch as a user hits a URL (no interaction)
  • Launch via a simple API (POST to /api/spawn), as used by e.g. thebe (sketched below)
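
For reference, tmpnb's spawn call is a single unauthenticated POST. A rough sketch of the exchange (the response shape is illustrative, not a spec):

POST /api/spawn HTTP/1.1
Host: tmpnb.example.org

HTTP/1.1 200 OK
Content-Type: application/json

{"url": "/user/dAgJqqXdWWlG"}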

In order to make running a multi-tenant service like this simple to operate,
maintain, and extend we need a REST API that assists three classes of users:

  • Front end developers developing against the API (e.g. Thebe and associated contexts)
  • Operators (e.g. codeneuro, try.jupyter.org, mybinder)
  • Users (consuming kernels as developers, readers, scientists, researchers)

There are four main actions:

  • build - build an image from the contents of a GitHub repository (or possibly some other specification)
  • stage - make one or more images ready for deployment, including specifying any additional services, and resource allocation
  • deploy - deploy a named environment, and provide status about running versions of that environment
  • pool - pre-allocate and view details about current pools of running environments

The four resources that support these actions are:

  • builds
  • stagings
  • servers
  • pools

Some of these operations should have authorization, depending on their usage.
These are assumed to be run on an API endpoint (e.g. api.mybinder.org) or
potentially with a leading /api/ path.
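
To make the four actions concrete against the four resources, here is a hypothetical request sequence (routes and payloads are illustrative only; the README and eventual swagger spec are authoritative):

POST /builds              build an image from a repository specification
POST /stagings            stage built image(s), declaring services and resource allocation
POST /servers/{name}      deploy a named, staged environment
GET  /servers/{name}      status of running versions of that environment
POST /pools/{name}        pre-allocate a pool of running environments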

The README is the main source to review right now, and we can work toward a full spec in swagger.

This was also brought up in binder-project/binder#8 and in https://groups.google.com/forum/#!msg/jupyter/2K2Wuem1HB8/r4dQ_6FbEAAJ.


* Front end developers developing against the API (e.g. Thebe and associated contexts)
* Operators (e.g. codeneuro, try.jupyter.org, mybinder)
* Users (consuming kernels as developers, readers, scientists, researchers)
A reviewer (Member) commented on the lines above:

Links to each of these examples would be nice to use for context.

minrk (Member) commented Sep 28, 2015

@rgbkrk it's probably a good idea to go into some detail about how this relates to the kernel-providers work. We don't want to end up building and maintaining two similar-but-not-quite-the-same projects.

betatim (Member) commented Sep 28, 2015

Below some questions, no answers.

Should we rename services to resources? It's mainly semantics, but my thinking is that you might want to specify something like this as a service:

{
  ...
  "services": [
    {
      "name": "gpu",
      "params": {
        ...
      }
    },
    {
      "name": "a-very-large-storage-system",
      "params": {
        ...
      }
    }
  ]
}

to specify that the notebook needs a GPU as well as access to a resource called "a-very-large-storage-system". Neither is something that needs to be started, but both need to be available and both influence the environment in which the kernel is started.

How does the API know how to set up a service called "postgres"? Should the meaning of "postgres" be global or (potentially) specific to each provider of resources? Global is much, much harder to do, and I'm undecided how much value it would add.

Thoughts on either?

minrk (Member) commented Sep 28, 2015

@betatim the definitions of resources/services could live in repos themselves. A single name like 'spark' could expand to github.com/binder/services-spark, but other services could be defined via full git URLs (git://bitbucket.com/me/myservice). At this point, how much does specifying a service have in common with specifying a binder?
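
A hypothetical spec fragment under that scheme, where a bare name expands to a well-known repo (e.g. github.com/binder/services-spark) and anything else is a full git URL:

{
  "services": [
    "spark",
    "git://bitbucket.com/me/myservice"
  ]
}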

@freeman-lab commented:

Great thoughts @betatim @minrk! Just discussed this with @andrewosh; a couple of comments re: services / resources:

  • The specification of services currently acts more or less as a lookup table against this folder (https://github.com/binder-project/binder/tree/master/services). Those pieces could easily be split off into separate repos, in which case a service could be specified via a git repo and some configuration files, more like a binder. The big difference now is that the spec for a binder leverages existing, generic formats (e.g. requirements.txt), whereas the spec of a service right now is tightly wedded to kubernetes and involves custom parameterized configs we more or less invented. One option would be to try to specify services with a repo + a more generic config (docker-compose.yml might work) and have binder translate those into deployable images.
  • We'd probably distinguish between services (external processes / databases) and resources (configuration of the underlying hardware the environment will run on), though we haven't yet exposed the latter. Maybe these could live as two separate items in the binding spec? (Sketched below.)
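
A sketch of that split, with entirely hypothetical field names:

{
  "services": [
    {"name": "postgres", "params": {}}
  ],
  "resources": {
    "gpu": true,
    "memory": "4g"
  }
}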

@andrewosh commented:

The API looks really nice, some comments:

Some of the fields in the specification file have changed (we need to update the examples in the binder repo too). Since we want to support a growing number of dependency files, we changed the requirements key to a dependencies key, which then takes a list of file names. We also got rid of the notebooks key, instead saying that the root of the repo is the notebook location.

That being said, if we're going to pull notebooks from a variety of sources (like @rgbkrk suggested in the Binder issue), we would want this field to exist, but with a more fleshed out schema (i.e. a NotebookLocation schema). Binder is currently using a spec that's had some fields removed because we're very tied to GitHub, which might not be what we want here.

Might still have a few issues with word choice =) In binder-project/binder#8 I think we were debating between environments and builds in place of what's currently bindings. Seems like we should remove references to Binder as much as possible to keep it implementation-agnostic.

And as discussed above, other key issues to tackle are:

  1. How services are specified and where they're stored (@minrk)
  2. How to specify computational resources (@betatim)
  3. How to specify data
  4. How more diverse front-ends, beyond the Jupyter notebook, might be specified, or if they should be included in this file at all.

Maybe part of this should be creating a detailed spec.json schema, similar to the REST API description in swagger.yaml, that lays these issues out in detail? Or maybe that's where this starts to overlap with the kernel API spec proposal?

* `bindings` - create a new binding which is a specification/template for a collection of resources
* `binders` - spawn a binder by `bindingID|Name` as specified by the binding template, list currently running binders
* `pools` - pre-allocate and view details about current pools of running binders

A reviewer (Member) commented on the lines above:

It would be nice to get a definition and an example of the concepts these resources represent. The binders resource "spawns a binder", but what's a binder? A container with a notebook server? A container with anything? Nothing to do with containers?

@rgbkrk (author) replied:

This is a container that has a notebook server as the main front end, with some number of attached services (Spark, Postgres, etc.) running alongside it. We should lay out the discussions we had today about this, as it's much easier to reason about without these attached services.

parente (Member) commented Sep 29, 2015

@rgbkrk it's probably a good idea to go into some detail about how this relates to the kernel-providers work. We don't want to end up building and maintaining two similar-but-not-quite-the-same projects.

Agreement here. It seems like binders and bindings could become more generic concepts and obviate the need for at least the container-spawning parts of the kernel gateway proposal (KGP?). But I'm not quite sure how the API spec'ed here would support the use cases stated in the KGP (and maybe it's not supposed to).

Taking one for consideration: writing new web apps that use remote kernels (dashboards, interactive books, etc.).

Say that the web app creator could define a binding to create a container environment with specific libraries preinstalled for their kernel language of choice. Say that one of the libraries stated for inclusion is a (yet-to-be-written) websocket-to-0mq bridge. The backend of the web app could then use a new binder client lib to talk to the binder server and request instantiation of the binding as a binder, passing whatever auth tokens are necessary to get an instance. After getting an instance, the app could use jupyter-js-services to talk to the kernel in the launched binder container.
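
In rough API terms, that flow might look like the following (route, auth scheme, and response fields are all hypothetical):

POST /applications/my-dashboard-binding HTTP/1.1
Authorization: token <app-token>

HTTP/1.1 201 Created
Content-Type: application/json

{"id": "42", "location": "https://binder.example.org/apps/42"}

The app would then point jupyter-js-services at the returned location to start and talk to a kernel.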

In this particular scenario, it feels a bit clunky to have to talk to one API to get a kernel container and another to talk to the kernel itself. Contrast this with using jupyter-js-services to both request and communicate with the kernel. But I suspect this is the tradeoff of a generic container-launching service versus separate kernel-launching and notebook-launching services. (Which is binder intended to become?)

rgbkrk (author) commented Oct 2, 2015

Should we rename services to resources?

@betatim Interestingly, I keep calling them resources in my head while thinking that term was too generic. Since I participated in plenty of bikeshedding on this today 😄, I ended up not stating it. 😉

rgbkrk (author) commented Oct 2, 2015

Today @andrewosh, @freeman-lab, @odewahn, @zischwartz, and I iterated on this spec (and names, lots of names), ending up respeccing it into four main actions with corresponding resources:

  • build - build an image from the contents of a GitHub repository (or possibly some other specification)
  • stage - make one or more images ready for deployment, including specifying any additional services, and resource allocation
  • deploy - deploy a named environment, and provide status about running versions of that environment
  • pool - pre-allocate and view details about current pools of running environments

A little bit of iteration happened in a hackpad: https://jupyter.hackpad.com/Thebe-1012015-cwvDHMfWqJG, but we should probably come back to write up some of our reasoning. At the very least, the commits are in from our collaboration and we can formalize this enhancement proposal a bit more.

rgbkrk (author) commented Oct 2, 2015

It's worth noting that a given implementation of this doesn't necessarily have to expose build and could use the stagings endpoint to stage bare Docker images for pooling and insta-deployment.

rgbkrk (author) commented Oct 2, 2015

One thing that seemed to get some consensus over 🍕 was that a binder.yml largely based on .travis.yml (or Heroku Procfiles) might be the best way to avoid dealing with the combinatorial explosion of all the different requirements.txt flavors across all the different languages.
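
A minimal sketch of what such a binder.yml might look like, borrowing travis-style keys (every key here is a guess at this stage, not a settled format):

language: python
python: "3.5"
dependencies:
  - requirements.txt
services:
  - postgres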

minrk (Member) commented Oct 2, 2015

@rgbkrk I think something .travis.yml-based is a great idea, since it provides an escape hatch for running code without resorting to starting from scratch with a custom Dockerfile. There's also the conda-style meta.yaml and build.sh, which is similar but separates the declarative bits from the build script. Of course, if the format is .travis.yml-like, you can have both, as with Travis, by making your build step sh build.sh.
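
And the escape hatch described above, keeping the declarative bits while delegating anything unusual to a script, might reduce to (again hypothetical):

language: python
install:
  - sh build.sh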

POST /builds/repos HTTP 1.1
Content-Type: application/json

{

A reviewer commented:

It probably makes sense for the body here to be a little more generic, similar to the proposed binder.yml, and let other translation tools handle logic like "if you specify a repo and a requirements.txt file, it actually means you want to do a pip install".

@rgbkrk (author) replied:

The problem is mostly in that combinatorial explosion of what "dependencies" means, as well as which Python, Ruby, Go, node, etc. a user means. We're in a bit of an insular world if we encode for Python here.

The reviewer replied:

Yup, exactly what we were thinking. If we instead make this generic, other modules can handle the more language-specific translation, and the format becomes more open to extension by others. If the body here were like our binder.yml sketch, which was language-neutral, plus a few extra fields (repository, contents, ...), might that work?

@rgbkrk (author) replied:

Ah, yeah that makes sense. The YAML should directly map to JSON as well.
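
So a generic build body could be the binder.yml content mapped to JSON, plus the extra fields mentioned above (field names are still hypothetical):

POST /builds/repos HTTP/1.1
Content-Type: application/json

{
  "repository": "https://github.com/someuser/somerepo",
  "contents": ".",
  "language": "python",
  "dependencies": ["requirements.txt"]
}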

@freeman-lab commented:

@rgbkrk the revision of the template is nice. A couple of fields like cull-timeout probably refer to the frontend container and the services together, but in general having as much as possible relegated under frontend-container makes a lot of sense to me.

rgbkrk (author) commented Oct 5, 2015

Yeah, I imagine cull-timeout is for the entire bundle, ahem, pod of services started together, and governs when they get shut down.

parente (Member) commented Oct 5, 2015

Sorry I wasn't in NY to work on this with you. It's shaping up nicely.

Given that the register API now allows for an arbitrary image and start command, I wonder if it removes the need for the kernel gateway incubator project completely. What do you think? Would it help if I tried to map the narratives from the kernel gateway proposal over to this API to see if they fit? For example, walk through what would happen in the "Jupyter notebook launches remote kernel" use case, complete with the API calls made?

If the binder proposal covers them all, then I'll wholeheartedly help out here instead of hacking away on the kernel gateway as a separate thing.

I'm still a bit lost on how the binder.yml file maps to the API. Is that intended to be part of the proposal too? Or just a declarative implementation of how binder-build turns GitHub repos into API calls?

rgbkrk (author) commented Oct 5, 2015

Given that the register API now allows for an arbitrary image and start command, I wonder if it removes the need for the kernel gateway incubator project completely.

My sense was that the kernel gateway incubator project was also about exploring ways of mapping the current Kernel APIs but with an auth layer. No extra specs on top, but certainly a way to do discovery of kernelspecs, etc. There's certainly overlap with this proposal.

What do you think? Would it help if I tried to map the narratives from the kernel gateway proposal over to this API to see if they fit? For example, walk through what would happen in the "Jupyter notebook launches remote kernel" use case, complete with the API calls made?

That would be wonderful.

I'm still a bit lost on how the binder.yml file maps to the API. Is that intended to be part of the proposal too? Or just a declarative implementation of how binder-build turns GitHub repos into API calls?

The second one. We didn't scope it out here yet (mostly because we ran out of time in NYC).

If the binder proposal covers them all, then I'll wholeheartedly help out here instead of hacking away on the kernel gateway as a separate thing.

Something tells me there are going to be pieces of each that we need, particularly things like the drop-in binaries to add to Docker images. When we started talking about these months ago, I recognized there was overlap between the proposals but wasn't sure where things would go. I imagined that a kernel gateway would actually operate in front of an API like the one spec'ed out here.

minrk (Member) commented Oct 5, 2015

I think it would be great if we can get binder to provide the spec part of deployments. There's still work to do on the kernel-provider side of actually making the kernel-only services, and things like auth/cors-related configuration that will be part of a particular deployment. But I think it would be great if we could make that a use case for the specs proposed here, rather than a competing spec.

Related: is it an important assumption that binder provides something that's at least a human-facing web server (e.g. the notebook application)? Is it abusing binder to spin up 'headless' services like a kernel provider? That affects things like whether binder can redirect users to the container, etc.

parente (Member) commented Oct 5, 2015

Side question: because this is a JEP, does that mean the binder project is going to move into the jupyter org? I ask because, unfortunately, where the project lives impacts whether some of us can contribute. :/

rgbkrk (author) commented Oct 5, 2015

@freeman-lab stated a willingness to eventually move binder under Jupyter.

For the time being, we can break up some of these things into projects within the jupyter incubator. More than happy to move the binder-registry repo over, though it is nice to have a Go namespace like github.com/binder-project/registry. Makes me wish we could call Jupyter the umbrella and github.com/binder-project one of the projects (with many of its own repos).

parente (Member) commented Oct 6, 2015

Would it help if I tried to map the narratives from the kernel gateway proposal over to this API to see if they fit?

That would be wonderful.

Working on it. Will submit as a PR against the PR. 😱

parente (Member) commented Oct 7, 2015

I adapted the narratives from the kernel gateway proposal to the binder spec proposal. I did not add the specific REST request/response payloads to the text, pending consensus that they even belong in the proposal.

Along the way, I took these notes (which was the point of the exercise, I think).

  • It feels like the API as spec'ed can be used (abused?) beyond Jupyter. It builds container images, launches containers, pools containers, and proxies to containers. What's in the container is not dictated. For example, I can register a template that runs Apache Zeppelin in a container and Binder will happily launch it. Is this the right scope?
  • Related to the above, is there a way to avoid implementing binder-db, binder-logs, binder-registry and other general purpose PaaS-like components by adopting an existing container platform that already provides them like http://deis.io/, https://flynn.io/, http://lattice.cf/, etc.?
  • Some of this API will inevitably need UI (e.g., admin tasks). If that's intended to be a separate enhancement or left undefined, this proposal might call that out so folks know to implement their own UI or collaborate on an "official" one.
  • Binder must be proxying HTTP connections to applications it launches. What provides this proxy capability? binder-launch? How does it discover the port to which to proxy? Is the assumption that each template exposes a single port?
  • Is GET /templates missing? It should probably be here to list all available templates for client use.
  • Templates define an image-name field. The contents of said image are not discoverable. How does a client (human or programmatic) learn the contents of the image? A provenance / source URL to the image on Docker Hub or some such might do.
  • If deploy is async, and the app is not "already in the pool" per the spec, what reflects the status in the response of GET /applications/{template-name}/{id}? Does lack of the location attribute on repeated polling mean "still not ready", or should this be more explicit?
  • I think @minrk said this above, but the location field has different meanings if the application process is a web UI vs a headless kernel gateway API vs something else entirely. I guess the client would need to know what was spawned to use it properly. But it would be nice if this was somehow indicated in the response too.
  • Not really an impact on binder, but does it make sense for Thebe to talk to Binder to launch a container, and then talk to a /api/kernels implementation within that container to get a kernel? Or should Thebe talk to Binder to launch a container and then assume there's a websocket endpoint in that container already open and listening? The first fits what Thebe already does, but why take that extra step in the Binder world where the template can include a command to launch a kernel by default?
  • How are failures represented? What HTTP error codes / JSON bodies?

odewahn commented Oct 7, 2015

Sorry for jumping in so late -- this is all shaping up really nicely.

One thing not captured here is how to connect data sources to the container. The yml file lists services, so maybe there should also be a data-sources section that maps to the --volumes-from option when the pod is started.

We'd talked about a number of other data options, but this at least seems like something we could do quickly and that would allow some semblance of sharing data.

rgbkrk (author) commented Oct 7, 2015

How are failures represented? What HTTP error codes / JSON bodies?

I'd like to standardize on either GitHub API-style errors or JSON API errors. In the current registry PR, I'm using GitHub API-style errors.
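
For reference, GitHub-style error bodies look roughly like this (the resource and field values here are invented for binder):

HTTP/1.1 422 Unprocessable Entity
Content-Type: application/json

{
  "message": "Validation Failed",
  "errors": [
    {"resource": "Template", "field": "image-name", "code": "missing_field"}
  ]
}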

I'll come back later today to discuss the others.

rgbkrk (author) commented Oct 7, 2015

In response to @parente:

Related to the above, is there a way to avoid implementing [various binder services] and other general purpose PaaS-like components by adopting an existing container platform that already provides them like http://deis.io/, https://flynn.io/, http://lattice.cf/, etc.?

binder-db,

I can't recall what binder-db was.

binder-logs,

To me this is wholly unrelated to the API spec, but there are several classes of logs and metrics in production here:

  • User containers (e.g. notebook servers)
  • Logs from the orchestration services here
  • Metrics about our own deployment (# active users, etc.), which go beyond plain container metrics (e.g. cAdvisor)

I'm personally just going to use a Docker logging driver to forward logs to logstash and deal with splitting these later.

binder-registry

The registry is pretty lightweight. It's a glorified whitelist of images for the deploy/pool/whatever layer (which builds atop kubernetes or swarm) to rely on for pulling and running.
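
A registry entry, then, might be little more than the following (shape hypothetical; field names echo the template spec discussed above):

{
  "template-name": "example-requirements",
  "image-name": "binder/example-requirements:latest",
  "command": "jupyter notebook --no-browser --port {port}",
  "cull-timeout": 3600
}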

As for exploring the general purpose PaaS solutions, I think that should be done. There's probably some thin layer above that allows you to hit the main goals:

  • Instantaneous launching of a brand new notebook server
  • Fully reproducible environment (installation, code, data, notebooks, etc.)
  • Multi-tenant
  • Launch as a user hits a URL (no interaction)
  • Launch via a simple API (e.g. POST to /api/spawn), as used by e.g. thebe

If someone wants to explore that, I'm certainly open to it. For the time being I'm going to be running with some of the ideas here as prototypes to deal with our currently running infrastructure but I would love to maintain less code.

rgbkrk (author) commented Oct 7, 2015

@odewahn what's the difference between baking data into the container versus volumes-from? Are you expecting many of the frontend containers to have the same dependencies but use different data sources?

If that's the case, I'd suggest building off the same base image and adding the data (as well as notebooks) directly. Otherwise you also take on the operational burden of dealing with Docker volumes, which don't always clean up well (you now have more steps to account for: always launching these two containers together for users, and always making sure to delete the volume artifacts with docker rm -fv when tearing down the container).

rgbkrk (author) commented Oct 7, 2015

It feels like the API as spec'ed can be used (abused?) beyond Jupyter. It builds container images, launches containers, pools containers, and proxies to containers. What's in the container is not dictated. For example, I can register a template that runs Apache Zeppelin in a container and Binder will happily launch it. Is this the right scope?

As long as it can be routed on a path, I think that's an accidental feature (which tmpnb has). We do make specific decisions to cater to running things like the notebook, which would impact the feasibility of running other applications. Generally speaking, this is designed to be a service where a development environment is provisioned for a user on demand.

On a side note, I tried running RStudio on tmpnb. It didn't work well and I couldn't figure out how to tell it to route down the assigned path.

rgbkrk (author) commented Oct 7, 2015

Some of this API will inevitably need UI (e.g., admin tasks). If that's intended to be a separate enhancement or left undefined, this proposal might call that out so folks know to implement their own UI or collaborate on an "official" one.

It would be nice to collaborate on an official one; we hadn't really thought about that. We were mostly aiming for CLI tools here.

rgbkrk (author) commented Oct 7, 2015

Binder must be proxying HTTP connections to applications it launches. What provides this proxy capability? binder-launch?

https://github.com/jupyter/configurable-http-proxy provides the actual proxy, which launch uses directly.

How does it discover the port to which to proxy? Is the assumption that each template exposes a single port?

The way tmpnb currently works is to use the first EXPOSEd port, allocating a random host port for it. It's definitely an assumption.

@freeman-lab commented:

@parente re:

Templates define an image-name field. The contents of said image are not discoverable.

Good call. I like adding source to the template spec; in practice it would probably be either whatever Docker registry binder-build is talking to, or Docker Hub.

rgbkrk (author) commented Oct 7, 2015

Good call. I like adding source to the template spec; in practice it would probably be either whatever Docker registry binder-build is talking to, or Docker Hub.

👍

rgbkrk (author) commented Feb 6, 2016

This thread is a goldmine for reflecting on use cases.

rgbkrk (author) commented Aug 10, 2016

Closing: no resolution yet, though the discussion has been high quality.

Thanks all. Happy to re-open later.

rgbkrk closed this on Aug 10, 2016