# Dataset

This is the **Myriad People** dataset. All of these files were generated by running `script.py` in the `../mining` folder. Each file is described below.

`all_loggedin_contributors.json`: list of all logged-in (i.e. not anonymous) GitHub contributors, with:
- `type`: type of contributor, `User` or `Bot`
- `id`: GitHub username
- `contributions`: list of repositories they contributed to, with:
  - `repo_name`: name of the repository
  - `contributions`: number of contributions that they made to that repository
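
As a rough illustration of how this structure can be consumed, the sketch below sums one contributor's contributions across repositories and filters out bots. The function names (`load_contributors`, `total_contributions`, `human_contributors`) are hypothetical helpers, not part of `script.py`, and the file path passed to the loader is whatever location the file has on your machine.

```python
import json
from pathlib import Path


def load_contributors(path):
    """Read all_loggedin_contributors.json and return its list of records."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def total_contributions(contributor):
    """Sum one contributor's contributions over every repository listed."""
    # The outer `contributions` is a list; each entry carries its own
    # numeric `contributions` count for one repository.
    return sum(entry["contributions"] for entry in contributor["contributions"])


def human_contributors(contributors):
    """Keep only records whose `type` is `User`, dropping `Bot` accounts."""
    return [c for c in contributors if c["type"] == "User"]
```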

`categories_info.json`: list of categories, with:
- `category`: name of the category
- `repos`: list of names of repositories in that category
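
Since this file maps categories to repositories, a reverse lookup (repository to category) is often handy. The sketch below builds one from the already-parsed list; `repo_to_category` is a hypothetical helper, and it assumes each repository appears in a single category (if one appeared twice, the last category would win).

```python
def repo_to_category(categories):
    """Build a {repository name: category name} lookup.

    `categories` is the parsed content of categories_info.json:
    a list of {"category": str, "repos": [str, ...]} records.
    """
    lookup = {}
    for entry in categories:
        for repo in entry["repos"]:
            # ASSUMPTION: a repository belongs to one category only;
            # a duplicate would silently overwrite the earlier value.
            lookup[repo] = entry["category"]
    return lookup
```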

`repos_info.json`: list of all repositories for which the GitHub API managed to fetch the data, with:
- `name`: name of the repository
- `category`: category it belongs to
- `exclusivity`: either the name of an artwork if this repository was exclusively used in that artwork, or `null` if it was used in at least two artworks
- `created_at`: creation date of the repository, in the Python `datetime` format
- `total_contributions`: total number of contributions
- `anonymous_contributors`: number of anonymous contributors
- `loggedin_contributors`: number of logged-in contributors
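
The fields above can be used, for example, to separate exclusive repositories from shared ones and to turn `created_at` back into a `datetime` object. This is only a sketch: the helper names are hypothetical, and it assumes `created_at` is stored as an ISO-like string (such as the output of `str()` or `isoformat()` on a `datetime`), which should be checked against the actual file.

```python
from datetime import datetime


def parse_created_at(repo):
    """Convert a repository record's `created_at` string to a datetime."""
    # ASSUMPTION: the value is an ISO-like string, e.g.
    # "YYYY-MM-DD HH:MM:SS" or "YYYY-MM-DDTHH:MM:SS".
    return datetime.fromisoformat(repo["created_at"])


def split_by_exclusivity(repos):
    """Return (exclusive, shared) lists of repository records.

    JSON `null` is loaded as Python `None`, so a `None` exclusivity
    means the repository was used in at least two artworks.
    """
    exclusive = [r for r in repos if r["exclusivity"] is not None]
    shared = [r for r in repos if r["exclusivity"] is None]
    return exclusive, shared


def total_contributors(repo):
    """Number of contributors, anonymous and logged-in combined."""
    return repo["anonymous_contributors"] + repo["loggedin_contributors"]
```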

`gh_api_failures.json`: list of repositories for which the GitHub API failed (because they are too big), with `name`, `category` and `exclusivity`, as in `repos_info.json`

`individual_repos` folder: one file per repository, in the format `owner&name.json`, with:
- `repo_name`: name of the repository
- `contributors`: list of contributors, with:
  - `type`: type of contributor, `User` or `Bot`
  - `id`: GitHub username
  - `contributions`: number of contributions that they made to this project

In all files, repository name attributes are in the format `owner/name`.
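
Because repository names use `owner/name` while the files in `individual_repos` are named `owner&name.json`, moving between the two takes a small conversion. The sketch below shows one way to do it; `repo_file` and `repo_name_from_file` are hypothetical helpers, and the folder argument is whatever path `individual_repos` has locally.

```python
from pathlib import Path


def repo_file(folder, repo_name):
    """Map an `owner/name` repository name to its per-repository JSON file."""
    owner, name = repo_name.split("/", 1)
    return Path(folder) / f"{owner}&{name}.json"


def repo_name_from_file(path):
    """Recover the `owner/name` form from an `owner&name.json` file path."""
    # `stem` drops only the final `.json` suffix, so dots inside the
    # repository name are preserved.
    owner, name = Path(path).stem.split("&", 1)
    return f"{owner}/{name}"
```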