Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move README rendering off the web server #1745

Merged
merged 2 commits into from
May 30, 2019

Conversation

sgrif
Copy link
Contributor

@sgrif sgrif commented May 13, 2019

⚠️ This PR contains destructive migrations. The commits included must be deployed separately. ⚠️

Rendering the README is a process that can take an arbitrarily long amount of time, and ends with uploading a file to S3 -- which also can take an arbitrarily long period of time. Since there are no README contents which we consider invalid (which I've indicated by changing the return type of the rendering functions to String instead of Result), there's no reason for us to do any of this work on the web server.

Just like the index updates, we can offload this to our worker to be completed at some point in the future. This does mean that a crate will render in our UI before the README has finished rendering. During this brief period, we will behave the same as if the crate had no README file, showing the description instead.

Eventually I think it'd make sense to have this job behave like the render-readmes worker, where it pulls the tar file down and reads the actual readme file. This would mean that Cargo can stop sending us the README as its own field when its already sending us a tar file which contains the README. I've left it alone for now though, since there's no benefit while Cargo is already sending us the README, and we can't have it stop doing that until we move to publish v2.

A few implementation notes:

  • The readme column on crates was never used for anything in Rust, but is used on the DB side to generate the search index. I've left the column in place, but removed most uses of it on the Rust side.
  • The readme_file column on crates never used for anything.
  • background_jobs::Environment gained two fields that are just straight up copied from App. I've opted to do this rather than just giving it App, since I do want to separate "the stuff you need to do background jobs" from "the stuff you need to be the web server". Since we do eventually plan on not having s3 uploads happen on the server, I want to bundle all of this into a single field and just have it live on background_jobs::Environment eventually.
  • I did some general cleanup of krate::publish while I was in there, removed a lot of pre-1.0 style code.
  • I've left the render-readmes binary alone for now since it's untested. I'll change it to just queue up jobs once we get to the point where jobs are doing all the tar file work as well.

@bors
Copy link
Contributor

bors commented May 22, 2019

☔ The latest upstream changes (presumably #1686) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Member

@jtgeibel jtgeibel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested this in staging and confirmed that the readme is uploaded correctly. (Although I'm currently missing CORS headers in my S3 bucket to allow the readme to display in the page.)

This needs rebased and I have a few minor comments on the doc comments. Otherwise, this looks good to me.

src/uploaders.rs Show resolved Hide resolved
src/controllers/krate/publish.rs Outdated Show resolved Hide resolved
sgrif added 2 commits May 30, 2019 11:11
Rendering the README is a process that can take an arbitrarily long
amount of time, and ends with uploading a file to S3 -- which also can
take an arbitrarily long period of time. Since there are no README
contents which we consider invalid (which I've indicated by changing the
return type of the rendering functions to `String` instead of `Result`),
there's no reason for us to do any of this work on the web server.

Just like the index updates, we can offload this to our worker to be
completed at some point in the future. This does mean that a crate will
render in our UI before the README has finished rendering. During this
brief period, we will behave the same as if the crate had no README
file, showing the description instead.

Eventually I think it'd make sense to have this job behave like the
`render-readmes` worker, where it pulls the tar file down and reads the
actual readme file. This would mean that Cargo can stop sending us the
README as its own field when its already sending us a tar file which
contains the README. I've left it alone for now though, since there's no
benefit while Cargo is already sending us the README, and we can't have
it stop doing that until we move to publish v2.

A few implementation notes:

- The `readme` column on `crates` was never used for anything in Rust,
  but is used on the DB side to generate the search index. I've left the
  column in place, but removed most uses of it on the Rust side.
- The `readme_file` column on `crates` never used for anything. This
  commit does not include the migration, which must be deployed
  separately. This commit only includes code which works with or without
  the migration for deployment.
- `background_jobs::Environment` gained two fields that are just
  straight up copied from `App`. I've opted to do this rather than just
  giving it `App`, since I do want to separate "the stuff you need to do
  background jobs" from "the stuff you need to be the web server". Since
  we do eventually plan on not having s3 uploads happen on the server, I
  want to bundle all of this into a single field and just have it live
  on `background_jobs::Environment` eventually.
- I did some general cleanup of `krate::publish` while I was in there,
  removed a lot of pre-1.0 style code.
- I've left the `render-readmes` binary alone for now since it's
  untested. I'll change it to just queue up jobs once we get to the
  point where jobs are doing all the tar file work as well.
⚠️ This commit contains a destructive migration. ⚠️
⚠️ It must be deployed separately from the previous commit. ⚠️

This column was never used. It's no longer referenced in Rust code, so
we can just remove it.
@sgrif sgrif force-pushed the sg-async-readme-renderings branch from 08aa788 to b631302 Compare May 30, 2019 17:11
@sgrif
Copy link
Contributor Author

sgrif commented May 30, 2019

@bors r=jtgeibel

@bors
Copy link
Contributor

bors commented May 30, 2019

📌 Commit b631302 has been approved by jtgeibel

@bors
Copy link
Contributor

bors commented May 30, 2019

⌛ Testing commit b631302 with merge e8130f3...

bors added a commit that referenced this pull request May 30, 2019
Move README rendering off the web server

:warning: This PR contains destructive migrations. The commits included must be deployed separately. :warning:

Rendering the README is a process that can take an arbitrarily long amount of time, and ends with uploading a file to S3 -- which also can take an arbitrarily long period of time. Since there are no README contents which we consider invalid (which I've indicated by changing the return type of the rendering functions to `String` instead of `Result`), there's no reason for us to do any of this work on the web server.

Just like the index updates, we can offload this to our worker to be completed at some point in the future. This does mean that a crate will render in our UI before the README has finished rendering. During this brief period, we will behave the same as if the crate had no README file, showing the description instead.

Eventually I think it'd make sense to have this job behave like the `render-readmes` worker, where it pulls the tar file down and reads the actual readme file. This would mean that Cargo can stop sending us the README as its own field when its already sending us a tar file which contains the README.  I've left it alone for now though, since there's no benefit while Cargo is already sending us the README, and we can't have it stop doing that until we move to publish v2.

A few implementation notes:

- The `readme` column on `crates` was never used for anything in Rust, but is used on the DB side to generate the search index. I've left the column in place, but removed most uses of it on the Rust side.
- The `readme_file` column on `crates` never used for anything.
- `background_jobs::Environment` gained two fields that are just straight up copied from `App`. I've opted to do this rather than just giving it `App`, since I do want to separate "the stuff you need to do background jobs" from "the stuff you need to be the web server". Since we do eventually plan on not having s3 uploads happen on the server, I want to bundle all of this into a single field and just have it live on `background_jobs::Environment` eventually.
- I did some general cleanup of `krate::publish` while I was in there, removed a lot of pre-1.0 style code.
- I've left the `render-readmes` binary alone for now since it's untested. I'll change it to just queue up jobs once we get to the point where jobs are doing all the tar file work as well.
@bors
Copy link
Contributor

bors commented May 30, 2019

☀️ Test successful - checks-travis
Approved by: jtgeibel
Pushing e8130f3 to master...

@bors bors merged commit b631302 into rust-lang:master May 30, 2019
@sgrif sgrif deleted the sg-async-readme-renderings branch August 14, 2019 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants