Add documentation around the DataHub RFC process. #1754

Merged: 1 commit, Jul 28, 2020
1 change: 1 addition & 0 deletions docs/README.md
@@ -13,3 +13,4 @@ DataHub is LinkedIn's generalized metadata search & discovery tool. To learn mor
* [Generalized Metadata Service](https://github.com/linkedin/datahub/tree/master/gms)
* [Metadata Ingestion](https://github.com/linkedin/datahub/tree/master/metadata-ingestion)
* [Metadata Processing Jobs](https://github.com/linkedin/datahub/tree/master/metadata-jobs)
* [The RFC Process](rfc.md)
123 changes: 123 additions & 0 deletions docs/rfc.md
@@ -0,0 +1,123 @@
# DataHub RFC

## What is an RFC?

The "RFC" (request for comments) process is intended to provide a consistent and controlled path for new features,
significant modifications, or any other significant proposal to enter DataHub and its related frameworks.

Many changes, including bug fixes and documentation improvements, can be implemented and reviewed via the normal GitHub
pull request workflow.

Some changes, though, are "substantial", and we ask that these go through a design process and reach consensus among
the DataHub core teams.

## The RFC life-cycle

An RFC goes through the following stages:

- *Discussion* (Optional): Create an issue with the "RFC" label to have a more open-ended, initial discussion around
your proposal (useful if you don't have a concrete proposal yet). Consider posting to #rfc in [Slack](../slack.md)
for more visibility.
- *Pending*: when the RFC is submitted as a PR. Please add the "RFC" label to the PR.
- *Active*: when an RFC PR is merged and undergoing implementation.
- *Landed*: when an RFC's proposed changes are shipped in an actual release.
- *Rejected*: when an RFC PR is closed without being merged.

[Pending RFC List](https://github.com/linkedin/datahub/pulls?q=is%3Apr+is%3Aopen+label%3Arfc+)
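
The stages above can be read as a small state machine. The following is a minimal sketch of that reading, not part of any DataHub tooling: the names `RfcStage`, `TRANSITIONS`, and `can_move` are hypothetical, and the allowed transitions are an interpretation of the stage descriptions.

```python
from enum import Enum


class RfcStage(Enum):
    """Stages of the RFC life-cycle (hypothetical names for illustration)."""

    DISCUSSION = "discussion"  # optional issue carrying the "RFC" label
    PENDING = "pending"        # RFC submitted as a PR
    ACTIVE = "active"          # RFC PR merged, undergoing implementation
    LANDED = "landed"          # proposed changes shipped in a release
    REJECTED = "rejected"      # RFC PR closed without being merged


# Assumed transitions, inferred from the stage descriptions.
TRANSITIONS = {
    RfcStage.DISCUSSION: {RfcStage.PENDING},
    RfcStage.PENDING: {RfcStage.ACTIVE, RfcStage.REJECTED},
    RfcStage.ACTIVE: {RfcStage.LANDED},
    RfcStage.LANDED: set(),
    RfcStage.REJECTED: set(),
}


def can_move(current: RfcStage, target: RfcStage) -> bool:
    """Return True if the assumed life-cycle allows moving from current to target."""
    return target in TRANSITIONS[current]
```

For example, `can_move(RfcStage.PENDING, RfcStage.ACTIVE)` is true under this reading, while a rejected RFC has no outgoing transitions.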

## When to follow this process

You need to follow this process if you intend to make "substantial" changes to any components in the DataHub git repo,
their documentation, or any other projects under the purview of the DataHub core teams. What constitutes a "substantial"
change is evolving based on community norms, but may include the following:

- A new feature that creates new API surface area, and would require a feature flag if introduced.
- The removal of features that already shipped as part of the release channel.
- The introduction of new idiomatic usage or conventions, even if they do not include code changes to DataHub itself.

Some changes do not require an RFC:

- Rephrasing, reorganizing or refactoring
- Addition or removal of warnings
- Additions that strictly improve objective, numerical quality criteria (e.g. speedup)

If you submit a pull request to implement a new, major feature without going through the RFC process, it may be closed
with a polite request to submit an RFC first.

## Gathering feedback before submitting

It's often helpful to get feedback on your concept before diving into the level of API design detail required for an
RFC. You may open an issue on this repo to start a high-level discussion, with the goal of eventually formulating an RFC
pull request with the specific implementation design. We also highly recommend sharing drafts of RFCs in #rfc on the
[DataHub Slack](../slack.md) for early feedback.

## The process

In short, to get a major feature added to DataHub, one must first get the RFC merged into the DataHub repo as a markdown
file. At that point the RFC is 'active' and may be implemented with the goal of eventual inclusion into DataHub.

- Fork the DataHub repository.
- Copy the `docs/rfc/templates/000-template.md` template file to `docs/rfc/active/000-my-feature.md`, where `my-feature`
is more descriptive. Don't assign an RFC number yet.
- Fill in the RFC. Put care into the details. *RFCs that do not present convincing motivation, demonstrate understanding
of the impact of the design, or are disingenuous about the drawbacks or alternatives tend to be poorly received.*
- Submit a pull request. As a pull request the RFC will receive design feedback from the larger community, and the
author should be prepared to revise it in response.
- Update the pull request to add the number of the PR to the filename and add a link to the PR in the header of the RFC.
- Build consensus and integrate feedback. RFCs that have broad support are much more likely to make progress than those
that don't receive any comments.
- Eventually, the DataHub team will decide whether the RFC is a candidate for inclusion.
- RFCs that are candidates for inclusion will enter a "final comment period" lasting 7 days. The beginning of this
period will be signaled with a comment and tag on the pull request. Furthermore, an announcement will be made in the
\#rfc Slack channel for further visibility.
- An RFC can be modified based upon feedback from the DataHub team and community. Significant modifications may trigger
a new final comment period.
- An RFC may be rejected by the DataHub team after public discussion has settled and comments have been made summarizing
the rationale for rejection. The RFC will enter a "final comment period to close" lasting 7 days. At the end of the "FCP
to close" period, the PR will be closed.
- An RFC author may withdraw their own RFC by closing it themselves. Please state the reason for the withdrawal.
- An RFC may be accepted at the close of its final comment period. A DataHub team member will merge the RFC's associated
pull request, at which point the RFC will become 'active'.
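
The file-naming steps and the 7-day final comment period can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the directory is taken to be `docs/rfc/active` (the path used in the "Implemented RFCs" section), and the PR number is assumed to replace the `000` placeholder, since the exact filename format is not specified here.

```python
from datetime import date, timedelta
from pathlib import PurePosixPath

# Assumed location of in-progress RFC documents.
RFC_ACTIVE_DIR = PurePosixPath("docs/rfc/active")
PLACEHOLDER_PREFIX = "000-"
FINAL_COMMENT_PERIOD = timedelta(days=7)


def draft_rfc_path(feature_slug: str) -> PurePosixPath:
    """Path for a new RFC before a PR number is known."""
    return RFC_ACTIVE_DIR / f"{PLACEHOLDER_PREFIX}{feature_slug}.md"


def numbered_rfc_path(draft: PurePosixPath, pr_number: int) -> PurePosixPath:
    """Replace the 000 placeholder with the PR number (assumed naming format)."""
    if not draft.name.startswith(PLACEHOLDER_PREFIX):
        raise ValueError(f"{draft} does not use the {PLACEHOLDER_PREFIX} placeholder")
    rest = draft.name[len(PLACEHOLDER_PREFIX):]
    return draft.with_name(f"{pr_number}-{rest}")


def final_comment_period_end(start: date) -> date:
    """Date on which a final comment period that began on `start` ends."""
    return start + FINAL_COMMENT_PERIOD
```

With a hypothetical PR number of 1234, `numbered_rfc_path(draft_rfc_path("my-feature"), 1234)` yields `docs/rfc/active/1234-my-feature.md`, and a final comment period announced on 2020-07-28 would end on 2020-08-04.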


## Details on Active RFCs

Once an RFC becomes active, authors may implement it and submit the feature as a pull request to the DataHub repo.
Becoming 'active' is not a rubber stamp, and in particular still does not mean the feature will ultimately be merged; it
does mean that the core team has agreed to it in principle and is amenable to merging it.

Furthermore, the fact that a given RFC has been accepted and is 'active' implies nothing about what priority is assigned
to its implementation, nor whether anybody is currently working on it.

Modifications to active RFCs can be made in follow-up PRs. We strive to write each RFC so that it reflects the final
design of the feature, but the nature of the process means that we cannot expect every merged RFC to reflect what the
end result will be at the time of the next major release. We therefore try to keep each RFC document somewhat in sync
with the feature as planned, tracking such changes via follow-up pull requests to the document.

## Implementing an RFC

The author of an RFC is not obligated to implement it. Of course, the RFC author (like any other developer) is welcome
to post an implementation for review after the RFC has been accepted.

An active RFC should have the link to the implementation PR(s) listed, if there are any. Feedback to the actual
implementation should be conducted in the implementation PR instead of the original RFC PR.

If you are interested in working on the implementation for an 'active' RFC, but cannot determine if someone else is
already working on it, feel free to ask (e.g. by leaving a comment on the associated issue).

## Implemented RFCs

Once an RFC has finally been implemented, first off, congratulations! And thank you for your contribution! Second, to
help track the status of the RFC, please make one final PR to move the RFC from `docs/rfc/active` to
`docs/rfc/finished`.
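
As a rough sketch of that final move, the snippet below computes the destination path and the corresponding `git mv` invocation. The helper names and the example filename are hypothetical; only the two directory names come from the text above.

```python
from pathlib import PurePosixPath

ACTIVE_DIR = PurePosixPath("docs/rfc/active")
FINISHED_DIR = PurePosixPath("docs/rfc/finished")


def finished_path(active_file: PurePosixPath) -> PurePosixPath:
    """Map an RFC file in the active directory to its finished location."""
    if active_file.parent != ACTIVE_DIR:
        raise ValueError(f"{active_file} is not directly inside {ACTIVE_DIR}")
    return FINISHED_DIR / active_file.name


def git_mv_command(active_file: PurePosixPath) -> list[str]:
    """Build the `git mv` argument list that performs the move."""
    return ["git", "mv", str(active_file), str(finished_path(active_file))]
```

The returned list could be passed to `subprocess.run` from the repository root, or simply printed and run by hand before opening the final PR.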

## Reviewing RFCs

Most of the DataHub team will attempt to review some set of open RFC pull requests on a regular basis. If a DataHub
team member believes an RFC PR is ready to be accepted into active status, they can approve the PR using GitHub's
review feature to signal their approval of the RFC.



*DataHub's RFC process is inspired by many others, including [Vue.js](https://github.com/vuejs/rfcs) and
[Ember](https://github.com/emberjs/rfcs).*
65 changes: 65 additions & 0 deletions docs/rfc/templates/000-template.md
@@ -0,0 +1,65 @@
- Start Date: (fill me in with today's date, YYYY-MM-DD)
- RFC PR: (after opening the RFC PR, update this with a link to it and update the file name)
- Discussion Issue: (GitHub issue this was discussed in before the RFC, if any)
- Implementation PR(s): (leave this empty)

# <RFC title>

## Summary

> Brief explanation of the feature.

## Basic example

> If the proposal involves a new or changed API, include a basic code example. Omit this section if it's not applicable.

## Motivation

> Why are we doing this? What use cases does it support? What is the expected outcome?
>
> Please focus on explaining the motivation so that if this RFC is not accepted, the motivation could be used to develop
> alternative solutions. In other words, enumerate the constraints you are trying to solve without coupling them too
> closely to the solution you have in mind.

## Detailed design

> This is the bulk of the RFC.

> Explain the design in enough detail for somebody familiar with the framework to understand, and for somebody familiar
> with the implementation to implement. This should get into specifics and corner-cases, and include examples of how the
> feature is used. Any new terminology should be defined here.

## How we teach this

> What names and terminology work best for these concepts and why? How is this idea best presented? As a continuation
> of existing DataHub patterns, or as a wholly new one?

> What audience or audiences would be impacted by this change? Just DataHub backend developers? Frontend developers?
> Users of the DataHub application itself?

> Would the acceptance of this proposal mean the DataHub guides must be re-organized or altered? Does it change how
> DataHub is taught to new users at any level?

> How should this feature be introduced and taught to existing audiences?

## Drawbacks

> Why should we *not* do this? Please consider the impact on teaching DataHub, on the integration of this feature with
> other existing and planned features, on the impact of the API churn on existing apps, etc.

> There are tradeoffs to choosing any path; please attempt to identify them here.

## Alternatives

> What other designs have been considered? What is the impact of not doing this?

> This section could also include prior art, that is, how other frameworks in the same domain have solved this problem.

## Rollout / Adoption Strategy

> If we implement this proposal, how will existing users / developers adopt it? Is it a breaking change? Can we write
> automatic refactoring / migration tools? Can we provide a runtime adapter library for the original API it replaces?

## Unresolved questions

> Optional, but suggested for first drafts. What parts of the design are still TBD?
6 changes: 6 additions & 0 deletions docs/slack.md
@@ -0,0 +1,6 @@
# Slack

The DataHub team has a Slack workspace where the open source community can discuss development and get support.

[Sign up](https://join.slack.com/t/datahubspace/shared_invite/zt-dkzbxfck-dzNl96vBzB06pJpbRwP6RA) or [log in with an
existing account](https://datahubspace.slack.com/) to `datahubspace` on Slack.