blog: remote optimization post #1451

Closed
pmrowla wants to merge 2 commits

Conversation

pmrowla
Contributor

@pmrowla pmrowla commented Jun 19, 2020

You may disregard these recommendations if you used the Edit on GitHub button from dvc.org to improve a doc in place.

❗ Please read the guidelines in the Contributing to the Documentation list if you make any substantial changes to the documentation or JS engine.

🐛 Please make sure to mention Fix #issue (if applicable) in the description of the PR. This causes GitHub to close it automatically when the PR is merged.

Please choose to allow us to edit your branch when creating the PR.

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

Initial draft for the remote optimization write up

TODO

  • improve introduction
  • needs conclusion
  • update placeholder image
  • update placeholder date

@pmrowla pmrowla self-assigned this Jun 19, 2020
@pmrowla
Contributor Author

pmrowla commented Jun 19, 2020

Not sure if the initial draft is too in depth/technical.

@andronovhopf I'd appreciate it if you can take a look at this and give some suggestions on how to make it more interesting/applicable for users from an ML perspective

@@ -0,0 +1,174 @@
---
title: Optimizing DVC Remotes
date: 2020-06-29
Contributor Author

placeholder date

Contributor

I think this can be whatever you prefer.

date: 2020-06-29
description: |
An overview of how syncing data to and from remote storage is optimized in DVC.
picture: 2020-05-04/owl.png
Contributor Author

placeholder image

Contributor

I have the impression if you leave it blank it uses a default img BTW.

Contributor Author

I was having issues running the dev server (via yarn develop) when picture was unset, maybe that's just some problem with my local environment though?

Contributor

TBH I'm not sure exactly how the blog engine works! You can create a bug report though and Ivan or Roger will probably answer to that 🙂

@@ -0,0 +1,174 @@
---
title: Optimizing DVC Remotes
Contributor Author

probably needs a more interesting title

Contributor

From the intro, I think this post is more about "Optimization improvements in DVC 1.0"

@skshetry skshetry added A: docs Area: user documentation (gatsby-theme-iterative) and removed A: docs Area: user documentation (gatsby-theme-iterative) labels Jun 19, 2020
Comment on lines 7 to 13
author: peter_rowlands
---

One of the key features provided by DVC is the ability to efficiently sync
versioned datasets between a user's local machine and
[remote storage](https://dvc.org/doc/command-reference/remote), and version 1.0
includes several performance optimizations related to syncing data with remotes.
Contributor

I would start if possible with something like "Our users have presented the need for optimizing remotes blah blah" and give some examples e.g. Discord message screenshots.

Contributor

Also, minor: I personally prefer "synchronizing" or "syncing". The pronunciation of the latter is questionable, no?

Contributor

@jorgeorpinel jorgeorpinel left a comment

Quick review of blog intro. Some of these suggestions can probably be applied to other places in the blog.

content/blog/2020-06-29-optimizing-dvc-remotes.md Outdated
3. Determine the difference between the two sets of files

Commonly used cloud sync utilities, such as [rclone](https://rclone.org/), must
be generalized to support any arbitrary file structure, which can come at the
Contributor

Suggested change
- be generalized to support any arbitrary file structure, which can come at the
+ be generalized to support any file structure, which can come at the

Comment on lines 32 to 35
operations (i.e. `status -c`,
[push](https://dvc.org/doc/command-reference/push),
[pull](https://dvc.org/doc/command-reference/pull),
[fetch](https://dvc.org/doc/command-reference/fetch)). In DVC version 1.0, these
Contributor

Suggested change
- operations (i.e. `status -c`,
- [push](https://dvc.org/doc/command-reference/push),
- [pull](https://dvc.org/doc/command-reference/pull),
- [fetch](https://dvc.org/doc/command-reference/fetch)). In DVC version 1.0, these
+ operations (i.e. `dvc status -c`,
+ `dvc push`,
+ `dvc pull`,
+ `dvc fetch`). In DVC version 1.0, these

@jorgeorpinel
Contributor

jorgeorpinel commented Jun 19, 2020

@pmrowla very nice! Please note on this repo we don't mind if you push a branch directly to upstream, in fact that's usually better because it fires up a review app automatically. I created one manually for this PR, you can see your post here: https://dvc-landing-blog-remote-uhiudf.herokuapp.com/blog/optimizing-dvc-remotes Cheers

@shcheklein shcheklein temporarily deployed to dvc-landing-blog-remote-uhiudf June 22, 2020 05:45 Inactive
@pmrowla pmrowla closed this Jun 22, 2020
@pmrowla pmrowla deleted the blog-remote-optimization branch June 22, 2020 05:48
@pmrowla pmrowla mentioned this pull request Jun 22, 2020
4 tasks
@shcheklein shcheklein temporarily deployed to dvc-landing-blog-remote-rbug3z June 22, 2020 05:50 Inactive
4 participants