"The Science of Science (SciSci) is based on a transdisciplinary approach that uses large data sets to study the mechanisms underlying the doing of science—from the choice of a research problem to career trajectories and progress within a field"[1].
The pySciSci package offers a unified interface to analyze several of the most common bibliometric databases used in the Science of Science, including:
The pySciSci package also provides efficient implementations of recent metrics developed to study scientific publications and authors, including:
Publication Metrics | |
---|---|
Measure | Example |
Interdisciplinarity - Simpson's Index | Example of Interdisciplinarity |
Interdisciplinarity - Shannon's Index | Example of Interdisciplinarity |
Interdisciplinarity - Rao-Stirling Index | Example of Interdisciplinarity |
Disruption Index | Example Publication Citations |
Sleeping Beauty Coefficient | Example Publication Citations |
Novelty & Conventionality | Example Novelty |
Long Term Citation | Example Publication Citations |
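
To make the interdisciplinarity measures above concrete, here is a minimal, standalone sketch of the Simpson and Shannon indices computed over the fields of a publication's references. It is not the pySciSci implementation: the function names, the input format (a plain list of field labels), and the choice of the Gini-Simpson form `1 - sum(p_i^2)` are assumptions made for illustration, and the package may use different conventions.

```python
import math
from collections import Counter


def field_shares(fields):
    """Return the fraction of references that fall in each field."""
    if not fields:
        raise ValueError("at least one field label is required")
    counts = Counter(fields)
    total = sum(counts.values())
    return [n / total for n in counts.values()]


def simpson_diversity(fields):
    """Gini-Simpson form: 1 - sum(p_i^2). A value of 0 means a single field."""
    return 1.0 - sum(p * p for p in field_shares(fields))


def shannon_entropy(fields):
    """Shannon entropy in nats: -sum(p_i * ln(p_i))."""
    return -sum(p * math.log(p) for p in field_shares(fields))


# Hypothetical field labels for the references cited by one publication.
reference_fields = ["physics", "physics", "biology", "cs"]
print(simpson_diversity(reference_fields))  # 1 - (0.5^2 + 0.25^2 + 0.25^2) = 0.625
print(shannon_entropy(reference_fields))
```

Both functions grow as references spread more evenly over more fields. The Rao-Stirling index additionally weights each pair of fields by a distance between them, which requires a field-similarity matrix and is omitted here.
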
Author Metrics | |
---|---|
Measure | Example |
H-index | Example Career Analysis |
G-index | Example Career Analysis |
Q-factor | Example Career Analysis |
Annual productivity trajectories | Example Career Analysis |
Author Pagerank | Example of Scientific Credit |
Collective credit allocation | Example of Credit Allocation |
Career Topic Switching | Example Career Topic Switching |
HotStreak | Example Career Analysis |
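
As a quick reference for two of the author measures listed above, the following sketch computes the H-index and G-index from a plain list of per-paper citation counts. It uses the standard textbook definitions and only the Python standard library. The function names and input format are illustrative assumptions and do not reflect pySciSci's own interface, which operates on pandas dataframes.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers together have at least g^2 citations.

    Assumption: g is capped at the number of papers, a common convention.
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts for one author's five papers.
paper_citations = [10, 8, 5, 4, 3]
print(h_index(paper_citations))  # 4
print(g_index(paper_citations))  # 5
```

The G-index is never smaller than the H-index, because it credits the extra citations of highly cited papers that the H-index ignores.
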
Advanced tools for constructing and analyzing network objects (both static and temporal):
Network Analysis | |
---|---|
Measure | Example |
Citation Network | |
Author Citation Network | Example of Diffusion of Scientific Credit |
Co-citation network | Example of Cocitation Network |
Co-authorship network | |
Co-mention network | Example of Coword Mention Network |
Graph2vec network embedding | Example_Node2vec |
Multiscale Backbone | Example of Cocitation Network |
Career Topic Switching | Example Career Topic Switching |
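
To illustrate what one of these network objects contains, here is a small sketch of a co-citation network built with the standard library only. Two publications are linked whenever a third publication cites both, and the edge weight counts how many publications do so. The data layout (a dict from citing publication to its reference list) and the function name are hypothetical and are not the pySciSci API.

```python
from collections import Counter
from itertools import combinations


def cocitation_weights(reference_lists):
    """Count, for every pair of cited works, how many publications cite both."""
    weights = Counter()
    for refs in reference_lists.values():
        # Deduplicate and sort so each undirected pair has a single key.
        for a, b in combinations(sorted(set(refs)), 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical citing publication -> list of cited publications.
citing_to_cited = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
}
for pair, weight in sorted(cocitation_weights(citing_to_cited).items()):
    print(pair, weight)
```

The resulting weighted edge list can be loaded into any graph library for further analysis, such as backbone extraction or embedding.
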
Natural Language Processing
- Publication matching
- Author matching
Visualization
- Career Timelines (see Example Career Analysis)
Install the latest release from PyPI:
pip install pyscisci
Install the latest development version directly from GitHub:
pip install git+https://github.com/SciSciCollective/pyscisci
- To enable all extra functionality run: pip install pyscisci[nlp,hdf]
- HDF tables are no longer required, so the dependency on tables is now an optional extra: pip install pyscisci[hdf]
- Advanced NLP dependencies can be installed by running: pip install pyscisci[nlp]
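
After installing, it can be useful to confirm which optional pieces are present in the current environment. The sketch below uses only the standard library to test whether a module can be found without importing it. The helper name `has_module` is hypothetical, and only the `tables` dependency of the hdf extra is checked, since that is the one named above.

```python
from importlib.util import find_spec


def has_module(name):
    """Return True if a top-level module can be located in this environment."""
    return find_spec(name) is not None


print("pyscisci installed:", has_module("pyscisci"))
print("hdf extra (tables) available:", has_module("tables"))
```
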
Currently, pySciSci is built on top of pandas and keeps entire dataframes in working memory. We have found that most large-scale analyses can be performed on a personal computer with extended RAM. If you don't have enough memory, consider a smaller database (DBLP or APS) or running on a cloud computing platform (Google Cloud, Microsoft Azure, Amazon Web Services, etc.).
We also support basic Dask implementations for multiprocessing. An example notebook can be found here.
See the contributing guide for detailed instructions on how to get started with our project.
- HTML documentation is available on Read the Docs.
- Email: Alex Gates ([email protected])
[1] Fortunato et al. (2018). Science of Science. Science, 359(6379), eaao0185.
[2] Wang & Barabasi (2021). Science of Science. Cambridge University Press.
pySciSci was originally written by Alexander Gates and has been developed with the help of many others. Thanks to everyone who has improved pySciSci by contributing code, bug reports (and fixes), documentation, and input on design and features.
Original Author
- Alexander Gates, GitHub: ajgates42
Contributors
Optionally, add your desired name and include a few relevant links. The order is an attempt at historical ordering.
- Jisung Yoon, GitHub: jisungyoon
- Kishore Vasan, GitHub: kishorevasan
pySciSci and those who have contributed to pySciSci have received support throughout the years from a variety of sources. We list them below.
If you have provided support to pySciSci and a support acknowledgment does not appear below, please help us remedy the situation. Similarly, please let us know if you'd like something modified or corrected.
Research Groups
pySciSci was developed with full support from the following:
- School of Data Science, University of Virginia, Charlottesville, VA; PI: Alexander Gates
- Network Science Institute, Northeastern University, Boston, MA; PI: Albert-Laszlo Barabasi
Funding
pySciSci acknowledges support from the following grants:
- Air Force Office of Scientific Research Award FA9550-19-1-0354
- Templeton Foundation Contract 61066