overview issue for proof-of-concept #1

Open
2 of 5 tasks
cdepillabout opened this issue Jul 28, 2022 · 7 comments

cdepillabout commented Jul 28, 2022

I wanted to do a small write-up on what I think we'll need for an initial proof-of-concept release. Hopefully with this list it will be easier for us to have a conversation about how to split up the work.

  • A web api. We'll need a web api that can accept http requests like binplz.dev/stack, and know how to parse out that the user actually wants the stack binary.

    I'm imagining doing this as a Haskell servant API, but I imagine our API will be simple enough that it doesn't really matter what we use.

    In the future I'd also like to support additional URL forms such as binplz.dev/stack.tar.gz for downloading a whole package (instead of just a single binary), or potentially a whole closure, as well as URL parameters like the following: binplz.dev/procps?arch=aarch64-linux&binary=ps&nixpkgs_commit=abcde1234&linking=dynamic. We definitely don't need to support these in our initial proof-of-concept release, but we should at least try to implement the proof-of-concept in a way that doesn't make it impossible to add these features later.

    Another worry I have here is whether we have to worry about HTTP timeouts. Building a binary could potentially take tens of minutes, and I'm worried that curl will time out.

    edit: completed in Server #6

  • Determine package from bin name
    We also need to decide whether we'd like the path (like stack or procps in the example above) to refer to a Nixpkgs top-level attribute name, or an actual binary. Having it refer to an actual binary would be easier for end-users, but significantly harder for us. For instance, even for the ps binary, there are a bunch of packages that provide it:

    $ sqlite3 ~/.nix-defexpr/channels/nixos-unstable/programs.sqlite
    sqlite> select * from Programs where name = "ps" and system = "x86_64-linux";
    ps|x86_64-linux|busybox
    ps|x86_64-linux|cope
    ps|x86_64-linux|procps
    ps|x86_64-linux|ps
    ps|x86_64-linux|toybox
    ps|x86_64-linux|unixtools.procps

    We don't really have a good way to know that ps should come from procps.

  • A function to take a Nixpkgs top-level attribute (like stack, xterm, haskellPackages.weeder, etc), fork off a nix-build process, monitor the nix-build process, and return either a binary when nix-build is finished, or an error.

    I think this is relatively straightforward, except that we want to make sure the user can't build some arbitrary Nix code that includes the output of builtins.readFile /etc/shadow or something weird. I think there are nix-build options you can pass to disable these types of built-ins, but we'd need to research it.

    We also likely want to cache negative build results. We'd ideally like to cache negative build results per derivation, but I don't know how easy that would be.

    We probably also want to make sure that we run builds until they finish (so that the derivation output is cached), even if a user presses Ctrl-C on curl and disconnects from the api.

  • Some sort of static documentation website / landing-page. I'm imagining that accessing the top-level binplz.dev/ will redirect to docs.binplz.dev, and we can host the docs on github or something. I think https://nixery.dev/ is really nice, and we should basically just copy what they do.

  • Deployment notes #3
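
As a rough illustration of the nix-build bullet above, here is a minimal Python sketch (the proof-of-concept itself is planned in Haskell). It only accepts plain attribute paths, which is one way to keep user input from being evaluated as arbitrary Nix code; it does not replace researching nix-build's own evaluation restrictions. The function names, the regular expression, and the ("ok" | "error", value) return shape are all assumptions made for this example, and it returns a store path rather than the binary itself.

```python
import re
import subprocess

# Dot-separated Nix identifiers, e.g. "stack" or "haskellPackages.weeder".
# Anything with spaces, slashes, or a leading "-" is rejected.
_ATTR_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_'-]*(\.[A-Za-z_][A-Za-z0-9_'-]*)*")


def nix_build_command(attr):
    """Build the nix-build argv for a Nixpkgs attribute (hypothetical helper)."""
    if not _ATTR_RE.fullmatch(attr):
        raise ValueError(f"not a plain attribute path: {attr!r}")
    # No shell is involved, so "<nixpkgs>" is passed to nix-build literally.
    return ["nix-build", "<nixpkgs>", "--no-out-link", "-A", attr]


def build(attr):
    """Run nix-build and return ("ok", store_path) or ("error", message)."""
    try:
        cmd = nix_build_command(attr)
        proc = subprocess.run(cmd, capture_output=True, text=True)
    except (ValueError, OSError) as exc:
        return ("error", str(exc))
    if proc.returncode != 0:
        return ("error", proc.stderr.strip())
    lines = proc.stdout.strip().splitlines()
    if not lines:
        return ("error", "nix-build printed no output path")
    # nix-build prints the resulting store path(s) on stdout.
    return ("ok", lines[-1])
```

Because subprocess.run blocks until nix-build exits, a server would presumably call build from a worker thread that is not tied to the HTTP connection, so the build keeps running if the client disconnects.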

@cdepillabout

Oh and here are two things that I don't think are necessary for an initial release, but it would be nicer to have them:

  • An easy way to run a full development environment locally with a single command. Maybe something like docker-compose or arion would be nice here.

  • A Nixpkgs overlay that gets more static stuff working. For instance, if I remember correctly, in Nixpkgs postgresql_14 can't be compiled statically, but postgresql_12 can. Also, systemd can't be built statically, but it is a dependency of a lot of packages. We'd like an overlay that does things like aliasing postgresql to postgresql_12, disabling systemd support in a lot of packages, and other helpful things like that.

    There are a bunch of good ideas in https://github.com/nh2/static-haskell-nix/blob/master/survey/default.nix and in nh2/static-haskell-nix#111 (Bump to nixpkgs-21.11).

kayhide commented Jul 28, 2022

The "A web api" item seems to contain two different things: how to implement the server, and how to determine the package from the command name.

The timeout issue looks like it belongs to the server part.

@cdepillabout

@kayhide Yeah, I think you're right.

@kayhide @jonascarpay Also, please feel free to directly edit the comments in this issue to clean things up or better separate tasks! I imagine we might want to eventually split out each of these bullet points to separate issues when we really start working on each one.

@jonascarpay

Do you know if it's possible to easily distinguish a curl query vs a browser query? I'd like it if accessing binplz.dev/foo with a browser could give you some information, maybe the build log, and a download link, but with curl it would download the binary. Although maybe there's a better way to go about this.

Also, I want to get some experience with IaC, do any of you have experience with this?

@cdepillabout

> Do you know if it's possible to easily distinguish a curl query vs a browser query? I'd like it if accessing binplz.dev/foo with a browser could give you some information, maybe the build log, and a download link, but with curl it would download the binary.

I was actually thinking the exact same thing! I think this would be a really neat feature.

Although I imagine we should probably only work on this after the initial release, since I could see this potentially being a lot of work.

In order to determine whether the user is accessing through curl or through the browser, you can look at the user agent. For instance, curl sets its user agent to curl/ followed by its version:

$ curl --verbose google.com
> GET / HTTP/1.1
> Host: google.com
> User-Agent: curl/7.81.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 301 Moved Permanently
< Location: http://www.google.com/
...

It's the User-Agent: curl/7.81.0 line. There are probably a bunch of sites that will show you the user-agent for your browser. Here's the first one that came up on google: https://www.whatismybrowser.com/detect/what-is-my-user-agent/
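
A minimal Python sketch of that check, assuming the server has the raw User-Agent header value available as a string (or None when the header is absent). The function name, the prefix list, and the choice to treat a missing header as a command-line client are assumptions for illustration, not a decided design.

```python
# Prefixes sent by common command-line clients (compared case-insensitively).
CLI_PREFIXES = ("curl/", "wget/")


def is_cli_client(user_agent):
    """Return True if the User-Agent header looks like curl or wget.

    A missing or empty header is treated as a command-line client, which is
    an assumption of this sketch.
    """
    if not user_agent:
        return True
    return user_agent.lower().startswith(CLI_PREFIXES)
```

With this, is_cli_client("curl/7.81.0") is True, while a typical browser string beginning with "Mozilla/5.0" gives False. The header is entirely client-controlled (curl can override it with -A), so a check like this is only a heuristic.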

> Also, I want to get some experience with IaC, do any of you have experience with this?

We're currently using Terraform at work, and it is pretty nice. I was going to suggest we do deployments to AWS (including the ec2 instances or docker containers, dynamo db creation, route53 registration(?), s3 bucket creation, cloudflare cdn setup(?)) with something like Terraform.

kayhide commented Jul 29, 2022

> I'd like it if accessing binplz.dev/foo with a browser could give you some information, maybe the build log, and a download link, but with curl it would download the binary. Although maybe there's a better way to go about this.

I am kind of against this idea. Changing responses depending on the user agent is often more confusing than beneficial.
What if a user tries to fetch the HTML with curl after seeing the browser version?

Or have you had any good experiences with user-agent switching behavior?

Rather than that, I would suggest different URLs for the browser versions, like binplz.dev/web/vim or binplz.web.dev/vim.

> Also, I want to get some experience with IaC, do any of you have experience with this?

I have experience with serverless and Kubernetes, and I don't think either of them fits our case (at least in the beginning).
I am interested in Terraform.

@cdepillabout

Here's what we talked about during our meeting today:

  • Maybe we should use https://nixbuild.net/ for the builder? It might just be cheaper than trying to run an always-on build machine on AWS. Or maybe we could just preemptively build everything in Nixpkgs and push it to some sort of shared cache on S3 or something.

  • Negative caching: we do need negative caching, but not necessarily positive caching. We'll let the Nix store handle positive caching. Let's use sqlite for the initial MVP. For every query, we store a boolean flag for whether it failed, and a count of how many times it has been requested.

  • Specification of the URL API for our MVP: /PROGRAM_NAME or /NIXPKGS_ATTR_PATH/PROGRAM_NAME. When only PROGRAM_NAME is given, first look for a NIXPKGS_ATTR_PATH with the same name, and if it doesn't exist, take the first Nixpkgs package that provides the given PROGRAM_NAME.

  • @jonascarpay will work on putting together a proof of concept in Haskell for the web API and Nix build watcher thread thing. I will look into putting together a documentation site. @kayhide will do a little investigation of how we can deploy to AWS.
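
The two mechanical decisions in these notes, the sqlite table for negative caching and the URL resolution rule, could look roughly like the following Python sketch. The table name queries, its column names, and all function names are hypothetical. The has_attr and providers arguments are injected stand-ins for real lookups against Nixpkgs and something like programs.sqlite.

```python
import sqlite3

# Hypothetical schema: one row per query, with a failure flag and a counter.
SCHEMA = """
CREATE TABLE IF NOT EXISTS queries (
    query         TEXT PRIMARY KEY,
    failed        INTEGER NOT NULL DEFAULT 0,
    request_count INTEGER NOT NULL DEFAULT 0
)
"""


def open_cache(path=":memory:"):
    """Open (or create) the cache database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_request(conn, query):
    """Count one more request for this query, creating its row if needed."""
    with conn:  # commits on success
        conn.execute("INSERT OR IGNORE INTO queries (query) VALUES (?)", (query,))
        conn.execute(
            "UPDATE queries SET request_count = request_count + 1 WHERE query = ?",
            (query,),
        )


def record_result(conn, query, failed):
    """Store whether the build for this query failed."""
    with conn:
        conn.execute("INSERT OR IGNORE INTO queries (query) VALUES (?)", (query,))
        conn.execute(
            "UPDATE queries SET failed = ? WHERE query = ?", (int(failed), query)
        )


def is_known_failure(conn, query):
    """Return True only if a failure has been recorded for this query."""
    row = conn.execute(
        "SELECT failed FROM queries WHERE query = ?", (query,)
    ).fetchone()
    return bool(row and row[0])


def resolve(path, has_attr, providers):
    """Map a request path to (attr_path, program_name) using the MVP rule.

    has_attr(name) -> bool and providers(name) -> list of attribute paths
    are caller-supplied lookups (assumptions of this sketch).
    """
    segments = [s for s in path.split("/") if s]
    if len(segments) == 2:
        return segments[0], segments[1]
    if len(segments) != 1:
        raise ValueError(f"expected /PROGRAM or /ATTR/PROGRAM, got {path!r}")
    program = segments[0]
    if has_attr(program):
        return program, program
    candidates = providers(program)
    if not candidates:
        raise LookupError(f"no package provides {program!r}")
    return candidates[0], program
```

Keying the cache on the raw query string is the simplest reading of the notes; caching per derivation, as discussed earlier in the thread, would need a different key.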
