OSCI is an open source project that aims to track and measure open source activity on GitHub by commercial organizations. It allows organizations, communities, analysts and individuals involved in open source to get insights into contribution trends among commercial organizations by providing access to up-to-date data through an intuitive interface.
- How does OSCI work?
- How are commit authors linked to commercial organizations?
- How can I submit my company for ranking?
- How can I contribute to OSCI?
- Quick Start
- OSCI Versioning
- License
- Contact Us
To create this index, the system processes GitHub push event data from GH Archive.
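As a rough illustration of what processing GH Archive push events involves, the sketch below builds the hourly archive URLs for one day and flattens a push event into per-commit records. It is not taken from the OSCI code base: the function names and the sample event are hypothetical, and the event layout (`payload.commits[].author.email`) is an assumption based on the public GH Archive event format.

```python
from __future__ import annotations

from datetime import date

# GH Archive publishes one gzipped JSON-lines file per hour,
# named YYYY-MM-DD-H.json.gz with an unpadded hour (0..23).
GH_ARCHIVE_URL = "https://data.gharchive.org/{day}-{hour}.json.gz"


def hourly_archive_urls(day: date) -> list[str]:
    """Return the 24 hourly GH Archive URLs for one calendar day."""
    return [GH_ARCHIVE_URL.format(day=day.isoformat(), hour=hour) for hour in range(24)]


def commits_from_push_event(event: dict) -> list[dict]:
    """Flatten one push event into per-commit records (hypothetical helper).

    Events of any other type yield an empty list.
    """
    if event.get("type") != "PushEvent":
        return []
    repo = event.get("repo", {}).get("name")
    records = []
    for commit in event.get("payload", {}).get("commits", []):
        author = commit.get("author", {})
        records.append(
            {
                "repo": repo,
                "sha": commit.get("sha"),
                "author_name": author.get("name"),
                "author_email": author.get("email"),
            }
        )
    return records


# Made-up event in the assumed layout, for illustration only.
sample_event = {
    "type": "PushEvent",
    "repo": {"name": "example/repo"},
    "payload": {
        "commits": [
            {"sha": "abc123", "author": {"name": "Dev", "email": "dev@example.com"}}
        ]
    },
}

urls = hourly_archive_urls(date(2020, 1, 1))
records = commits_from_push_event(sample_event)
```

Each downloaded file would still need to be decompressed and parsed line by line before events reach a helper like `commits_from_push_event`.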
OSCI tracks two measures for each organization:
- Active contributors, the number of people who authored 10 or more commits over a period of time
- Total community, the number of people who made at least one commit over a period of time
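The two measures can be sketched in a few lines of Python. This is a minimal illustration, not the OSCI implementation: it assumes commits are already attributed to a company and that a person is identified by their commit email, and the function name and sample data are made up.

```python
from collections import Counter, defaultdict

# "10 or more commits", per the definition of an active contributor above.
ACTIVE_THRESHOLD = 10


def company_measures(commits, threshold=ACTIVE_THRESHOLD):
    """Compute both measures per company.

    `commits` is an iterable of (company, author_email) pairs, one pair per
    commit within the period of interest.
    """
    per_company = defaultdict(Counter)
    for company, author in commits:
        per_company[company][author] += 1
    return {
        company: {
            # everyone with at least one commit
            "total_community": len(counts),
            # everyone with `threshold` or more commits
            "active_contributors": sum(1 for n in counts.values() if n >= threshold),
        }
        for company, counts in per_company.items()
    }


# Made-up data: one author with 10 commits, one with 3, one with a single commit.
commits = (
    [("Acme", "a@acme.example")] * 10
    + [("Acme", "b@acme.example")] * 3
    + [("Globex", "c@globex.example")]
)
measures = company_measures(commits)
```

With this sample data, Acme has a total community of 2 and 1 active contributor, while Globex has a total community of 1 and no active contributors.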
The system uses the email domain of the commit author to identify the organization. Is your organization missing from the ranking? Feel free to add it to the list.
Note: OSCI does not rank open source activity contributed by universities, research institutions and individual entrepreneurs.
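To make the domain-based linking concrete, here is a minimal sketch that matches an author email against entries shaped like the Facebook example from the match list described later in this document. The function name and the matching order (exact domains first, then regular expressions) are assumptions for illustration; the real matching logic may differ.

```python
import re

# One entry in the shape used by the company domain match list.
MATCH_LIST = [
    {
        "company": "Facebook",
        "domains": ["fb.com", "facebook.com"],
        "regex": [r"^.*\.fb\.com$", r"^.*\.facebook\.com$"],
    },
]


def match_company(email, match_list=MATCH_LIST):
    """Return the company name for an author email, or None if nothing matches."""
    if "@" not in email:
        return None
    domain = email.rpartition("@")[2].lower()
    for entry in match_list:
        # An empty `domains:` or `regex:` key loads as None from YAML, hence `or []`.
        if domain in (entry.get("domains") or []):
            return entry["company"]
        for pattern in entry.get("regex") or []:
            if re.match(pattern, domain):
                return entry["company"]
    return None
```

For example, `match_company("dev@fb.com")` matches through the exact domain, while `match_company("dev@eng.facebook.com")` only matches through the regular expression for subdomains.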
- Check whether the organization you propose to add matches the OSCI definition:
  - not an educational, governmental, non-profit or research institution;
  - registered, commercial organization;
  - sells goods or services for the purpose of making a profit.
- Create a new pull request.
- Go to the company domain match list (company_domain_match_list.yaml).
- Double check that the organization you want to add is not already listed.
Add the email domain of the company and the company name to the list. For example:

      - company: Facebook
        domains:
          - fb.com
        regex:

- If the company has more than one email domain for its employees, add all of them to the domains block (or to the regex block, to use regular expressions). For example:

      - company: Facebook
        domains:
          - fb.com
          - facebook.com
        regex:
          - ^.*\.fb\.com$
          - ^.*\.facebook\.com$
- Select the industry to which your company belongs from the following list:
  - Automotive;
  - Banking, Insurance & Financial Services;
  - Education;
  - Energy & Utilities;
  - Entertainment;
  - Healthcare and Pharma;
  - Professional Services;
  - Public Sector;
  - Retail & Hospitality;
  - Technology;
  - Media & Telecoms;
  - Travel & Transport;
  - Other (please specify).

  For example:

      - company: Facebook
        domains:
          - fb.com
          - facebook.com
        regex:
          - ^.*\.fb\.com$
          - ^.*\.facebook\.com$
        industry: Media & Telecoms
Our team will review your pull request and merge it if everything is correct.
Note: since OSCI processes the data for the previous month, you'll see your organization's rank at the beginning of the next month.
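Before opening a pull request, the submission steps above could be checked locally with something like the sketch below. It is purely hypothetical and not part of OSCI: `check_entry` and its messages are made up, and it assumes an entry has already been loaded into a Python dict with the keys shown in the examples (`company`, `domains`, `regex`, `industry`).

```python
import re

# Industries listed in the submission steps. "Other" is left out on purpose,
# because it requires a free-text specification that cannot be checked here.
INDUSTRIES = {
    "Automotive",
    "Banking, Insurance & Financial Services",
    "Education",
    "Energy & Utilities",
    "Entertainment",
    "Healthcare and Pharma",
    "Professional Services",
    "Public Sector",
    "Retail & Hospitality",
    "Technology",
    "Media & Telecoms",
    "Travel & Transport",
}


def check_entry(entry, existing_companies=()):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    name = entry.get("company")
    if not name:
        problems.append("missing company name")
    elif name.lower() in {c.lower() for c in existing_companies}:
        problems.append("company is already listed")
    domains = entry.get("domains") or []
    patterns = entry.get("regex") or []
    if not domains and not patterns:
        problems.append("no email domain or regex given")
    for pattern in patterns:
        try:
            re.compile(pattern)
        except re.error:
            problems.append(f"invalid regex: {pattern}")
    if entry.get("industry") not in INDUSTRIES:
        problems.append("industry is not one of the listed options (fine for a specified 'Other')")
    return problems


entry = {
    "company": "Facebook",
    "domains": ["fb.com", "facebook.com"],
    "regex": [r"^.*\.fb\.com$", r"^.*\.facebook\.com$"],
    "industry": "Media & Telecoms",
}
```

Calling `check_entry(entry)` on the example above reports no problems, while passing `existing_companies=["facebook"]` flags the entry as already listed.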
See CONTRIBUTING.md for details on the contribution process.
OSCI is deployed to the Azure cloud using Azure Data Factory, Azure Functions and Azure Databricks. However, the code available on GitHub does not require Azure. Run the application from the command line using the instructions below.
- Clone repository
git clone https://github.com/epam/OSCI.git
- Go to project directory
cd OSCI
- Install requirements
pip install -r requirements.txt
- Create a file local.yml in the directory osci/config/files (by default this file is listed in .gitignore). A sample file, default.yml, is included; please don't change the values in that file.
- Run script to download data from archive (for example for 01 January 2020)
python3 osci-cli.py get-github-daily-push-events -d 2020-01-01
- Run script to add company field (matched by domain) (for example for 01 January 2020)
python3 osci-cli.py process-github-daily-push-events -d 2020-01-01
- Run script to build the daily OSCI rankings (for example for 02 January 2020)
python3 osci-cli.py daily-osci-rankings -td 2020-01-02
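The three CLI calls above can also be assembled programmatically, for example to repeat them for several days. The sketch below only builds the argument lists shown in this section and offers an optional runner; the helper names are hypothetical, and it assumes the script is started from the project directory with requirements installed.

```python
from __future__ import annotations

import subprocess
from datetime import date

# Interpreter and entry point exactly as used in the steps above.
CLI = ["python3", "osci-cli.py"]


def quick_start_commands(day: date, to_day: date) -> list[list[str]]:
    """Build the three quick-start commands for the given dates."""
    return [
        CLI + ["get-github-daily-push-events", "-d", day.isoformat()],
        CLI + ["process-github-daily-push-events", "-d", day.isoformat()],
        CLI + ["daily-osci-rankings", "-td", to_day.isoformat()],
    ]


def run_quick_start(day: date, to_day: date) -> None:
    """Run the commands in order, stopping at the first failure."""
    for command in quick_start_commands(day, to_day):
        subprocess.run(command, check=True)


commands = quick_start_commands(date(2020, 1, 1), date(2020, 1, 2))
```

Calling `run_quick_start(date(2020, 1, 1), date(2020, 1, 2))` would reproduce the example dates used in the steps above.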
OSCI versions follow the scheme <year>.<month>.<patch number>, e.g. 2021.05.0. We expect regular monthly updates, including releases associated with the submission of new companies for ranking.
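A version string in this scheme can be parsed and compared as a (year, month, patch) tuple. The sketch below is illustrative only; it assumes a four-digit year and a zero-padded two-digit month, as in the 2021.05.0 example, and the names `OsciVersion` and `parse_version` are hypothetical.

```python
import re
from typing import NamedTuple


class OsciVersion(NamedTuple):
    year: int
    month: int
    patch: int


# Assumed layout: 4-digit year, 2-digit month, patch number of any length.
_VERSION_RE = re.compile(r"^(\d{4})\.(\d{2})\.(\d+)$")


def parse_version(text: str) -> OsciVersion:
    """Parse '<year>.<month>.<patch>' into a comparable tuple."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"not a <year>.<month>.<patch> version: {text!r}")
    year, month, patch = (int(part) for part in match.groups())
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in version: {text!r}")
    return OsciVersion(year, month, patch)
```

Because named tuples compare field by field, `parse_version("2021.10.0") > parse_version("2021.05.3")` holds, which a plain string comparison would not guarantee for every patch number.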
OSCI is licensed under the GNU General Public License v3.0.
For support or help using OSCI, please contact us at [email protected].