v0.10.3

J535D165 released this 22 Sep 14:07
· 29 commits to main since this release
fe3908b

What's Changed

  • Update the benchmark dataset and run a larger benchmark.

Coverage report

The benchmark was run on 1000 randomly selected records from DataCite.

Percentages

Percentage of datasets supported: 27.4%
Percentage of datasets not supported: 69.9%
Percentage of datasets with error: 2.7%
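
As a quick sanity check, the percentages can be reproduced from record counts. The sketch below assumes counts of 274, 699 and 27, which are back-calculated from the reported percentages and the sample size of 1000. The count of 27 matches the number of rows in the error table, but the counts themselves are not stated in this release, and the variable names are illustrative.

```python
# Assumed counts, back-calculated from the reported percentages and the
# stated sample size of 1000 records (not taken from the benchmark output).
counts = {"supported": 274, "not supported": 699, "with error": 27}

total = sum(counts.values())

# Share of each outcome as a percentage, rounded to one decimal place.
percentages = {status: round(100 * n / total, 1) for status, n in counts.items()}

for status, pct in percentages.items():
    print(f"Percentage of datasets {status}: {pct}%")
```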

Table with unexpected errors

index id type url service error
9 10.48448/kgfs-s492 dois https://underline.io/lecture/50210-findings-thai-nested-named-entity-recognition-corpus nan 500 Server Error: Internal Server Error for url: https://underline.io/lecture/50210-findings-thai-nested-named-entity-recognition-corpus
52 10.18730/v7c2= dois https://glis.fao.org/glis/doi/10.18730/V7C2= nan '10.18730/v7c2=' is not a correct resource identifier (e.g. a URL, DOI, Handle)
73 10.20345/digitue.1029.61 dois http://idb.ub.uni-tuebingen.de/opendigi/litrdsch_1902#p=141 nan 500 Server Error: Internal Server Error for url: https://idb.ub.uni-tuebingen.de/opendigi/litrdsch_1902#p=141
81 10.7916/d8-qcx3-yp94 dois https://dlc.library.columbia.edu/resolve/10.7916/d8-qcx3-yp94 nan 500 Server Error: Internal Server Error for url: https://dlc.library.columbia.edu/catalog/10.7916/d8-qcx3-yp94
96 10.17876/plate/dr.2/plates/201_33742 dois https://www.plate-archive.org/objects/dr.2/plates/201_33742 nan 500 Server Error: Internal Server Error for url: https://www.plate-archive.org/objects/dr.2/plates/201_33742/
119 10.18430/m3.irrmc.4168 dois https://proteindiffraction.org/project/SETDB1-x122 nan 'NoneType' object has no attribute 'find'
133 10.14469/ch/8676 dois https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/to-8701 nan HTTPSConnectionPool(host='spectradspace.lib.imperial.ac.uk', port=8443): Max retries exceeded with url: /dspace/handle/10042/to-8701 (Caused by SSLError(SSLError(1, '[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:1006)')))
201 10.17188/1652700 dois https://www.osti.gov/servlets/purl/1652700/ nan HTTPSConnectionPool(host='www.osti.gov', port=443): Max retries exceeded with url: /servlets/purl/1652700/ (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f14556469d0>, 'Connection to www.osti.gov timed out. (connect timeout=3)'))
362 10.14469/ch/1303 dois https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/to-1328 nan HTTPSConnectionPool(host='spectradspace.lib.imperial.ac.uk', port=8443): Max retries exceeded with url: /dspace/handle/10042/to-1328 (Caused by SSLError(SSLError(1, '[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:1006)')))
373 10.25716/hfmdk-505 dois https://hfmdk.hebis.de/jspui/handle/123456789/507 nan HTTPSConnectionPool(host='hfmdk.hebis.de', port=443): Read timed out. (read timeout=10)
397 10.17876/plate/dr.2/envelopes/201_50873 dois https://www.plate-archive.org/objects/dr.2/envelopes/201_50873 nan 500 Server Error: Internal Server Error for url: https://www.plate-archive.org/objects/dr.2/envelopes/201_50873/
400 10.23725/akhp-6959 dois https://ors.datacite.org/doi:/10.23725/akhp-6959 nan HTTPSConnectionPool(host='ors.datacite.org', port=443): Max retries exceeded with url: /doi:/10.23725/akhp-6959 (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7f14556577d0>: Failed to resolve 'ors.datacite.org' ([Errno -2] Name or service not known)"))
452 10.14469/ch/129258 dois https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/134211 nan HTTPSConnectionPool(host='spectradspace.lib.imperial.ac.uk', port=8443): Max retries exceeded with url: /dspace/handle/10042/134211 (Caused by SSLError(SSLError(1, '[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:1006)')))
458 10.14469/ch/41814 dois https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/48213 nan HTTPSConnectionPool(host='spectradspace.lib.imperial.ac.uk', port=8443): Max retries exceeded with url: /dspace/handle/10042/48213 (Caused by SSLError(SSLError(1, '[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:1006)')))
483 10.18730/12n7m$ dois https://glis.fao.org/glis/doi/10.18730/12N7M$ nan '10.18730/12n7m$' is not a correct resource identifier (e.g. a URL, DOI, Handle)
501 10.14456/stj.2019.4 dois http://doi.nrct.go.th/?page=resolve_doi&resolve_doi=10.14456/stj.2019.4 nan HTTPSConnectionPool(host='doi.nrct.go.th', port=443): Read timed out. (read timeout=10)
505 10.14469/ch/175982 dois https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/180406 nan HTTPSConnectionPool(host='spectradspace.lib.imperial.ac.uk', port=8443): Max retries exceeded with url: /dspace/handle/10042/180406 (Caused by SSLError(SSLError(1, '[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:1006)')))
551 10.17876/plate/dr.2/plates/201_35722 dois https://www.plate-archive.org/objects/dr.2/plates/201_35722 nan 500 Server Error: Internal Server Error for url: https://www.plate-archive.org/objects/dr.2/plates/201_35722/
581 10.48550/arxiv.2309.02963 dois https://arxiv.org/abs/2309.02963 nan HTTPSConnectionPool(host='arxiv.org', port=443): Read timed out. (read timeout=10)
625 10.17182/hepdata.60582.v1/t187 dois https://www.hepdata.net/record/61173 nan HTTPSConnectionPool(host='www.hepdata.net', port=443): Read timed out. (read timeout=10)
683 10.20379/dbaud-1041 dois http://webdatenbank.grass-medienarchiv.de/receive/ggrass_mods_00001019 nan 503 Server Error: Service Unavailable for url: https://webdatenbank.grass-medienarchiv.de/receive/ggrass_mods_00001019
757 10.18730/q3s0= dois https://glis.fao.org/glis/doi/10.18730/Q3S0= nan '10.18730/q3s0=' is not a correct resource identifier (e.g. a URL, DOI, Handle)
782 10.20372/nadre:1554185535.13 dois https://nadre.ethernet.edu.et/record/3238?ln=en nan HTTPSConnectionPool(host='nadre.ethernet.edu.et', port=443): Max retries exceeded with url: /record/3238?ln=en (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f145582d490>, 'Connection to nadre.ethernet.edu.et timed out. (connect timeout=3)'))
816 10.14469/ch/90617 dois https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/97675 nan HTTPSConnectionPool(host='spectradspace.lib.imperial.ac.uk', port=8443): Max retries exceeded with url: /dspace/handle/10042/97675 (Caused by SSLError(SSLError(1, '[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:1006)')))
822 10.48550/arxiv.2309.02836 dois https://arxiv.org/abs/2309.02836 nan HTTPSConnectionPool(host='arxiv.org', port=443): Read timed out. (read timeout=10)
894 10.5287/bodleianjpcy.2 dois https://databank.ora.ox.ac.uk/ww1archives/datasets/ww1-3945?version=2 nan HTTPSConnectionPool(host='databank.ora.ox.ac.uk', port=443): Max retries exceeded with url: /ww1archives/datasets/ww1-3945?version=2 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f14555ab890>, 'Connection to databank.ora.ox.ac.uk timed out. (connect timeout=3)'))
993 10.7916/d8-47rs-s759 dois https://dlc.library.columbia.edu/resolve/10.7916/d8-47rs-s759 nan 500 Server Error: Internal Server Error for url: https://dlc.library.columbia.edu/catalog/10.7916/d8-47rs-s759
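
Read by message, the 27 rows above seem to fall into a handful of groups: 8 HTTP 5xx responses, 6 SSL handshake failures, 5 read timeouts, 3 connect timeouts, 3 rejected identifiers, 1 name resolution failure and 1 internal `'NoneType'` error. A minimal sketch of how such a grouping could be automated is shown below. The category names, the patterns and the `classify_error` helper are hypothetical and not part of the benchmark; the sample messages are copied from the table.

```python
import re
from collections import Counter

# Hypothetical categories; the first matching pattern wins.
ERROR_PATTERNS = [
    ("ssl handshake failure", re.compile(r"SSLError")),
    ("connect timeout", re.compile(r"ConnectTimeoutError")),
    ("read timeout", re.compile(r"Read timed out")),
    ("name resolution failure", re.compile(r"NameResolutionError")),
    ("server error (5xx)", re.compile(r"\b5\d\d Server Error")),
    ("invalid identifier", re.compile(r"is not a correct resource identifier")),
]


def classify_error(message: str) -> str:
    """Return the first matching category label, or 'other' if none match."""
    for label, pattern in ERROR_PATTERNS:
        if pattern.search(message):
            return label
    return "other"


# A few error messages copied from the table above.
sample_errors = [
    "500 Server Error: Internal Server Error for url: https://underline.io/lecture/50210-findings-thai-nested-named-entity-recognition-corpus",
    "'10.18730/v7c2=' is not a correct resource identifier (e.g. a URL, DOI, Handle)",
    "'NoneType' object has no attribute 'find'",
    "HTTPSConnectionPool(host='hfmdk.hebis.de', port=443): Read timed out. (read timeout=10)",
    "503 Server Error: Service Unavailable for url: https://webdatenbank.grass-medienarchiv.de/receive/ggrass_mods_00001019",
]

summary = Counter(classify_error(message) for message in sample_errors)

for label, count in summary.most_common():
    print(label, count)
```

Grouping this way separates failures on the remote side (5xx responses, timeouts, SSL and DNS problems) from the few that may point at the tool itself, such as the rejected identifiers ending in `=` or `$` and the `'NoneType'` error.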

Table with unsupported repositories

netloc count
pid.geoscience.gov.au 103
app.geosamples.org 79
doi.plutof.ut.ee 60
www.gbif.org 57
glis.fao.org 30
www.e-periodica.ch 26
ba.e-pics.ethz.ch 22
dlc.library.columbia.edu 19
bacdive.dsmz.de 18
rgdoi.net 16
digitallibrary.usc.edu 14
www.ccdc.cam.ac.uk 14
www.lfi.ch 11
nakala.fr 9
catalog.paradisec.org.au 8
www.osti.gov 8
www.plate-archive.org 7
doi.library.ubc.ca 7
digital.ucd.ie 7
architekturmuseum.ub.tu-berlin.de 6
doi.nrct.go.th 6
www.die-bonn.de 6
spectradspace.lib.imperial.ac.uk:8443 6
ntnu.tind.io 6
straininfo.dsmz.de 5
dis.iodp.pangaea.de 5
dadosdepesquisa.fiocruz.br 5
digi.ub.uni-heidelberg.de 5
publikationen.bibliothek.kit.edu 5
hdl.handle.net 4
era.library.ualberta.ca 4
www.rvdata.us 4
data.neotomadb.org 4
apex.ipk-gatersleben.de 3
statisticaldatasets.data-planet.com 3
epos.myesr.org 3
www.boldsystems.org 3
repository.edition-topoi.org 3
sage.figshare.com 3
journals.ub.uni-heidelberg.de 3
sr.ethz.ch 3
ageconsearch.umn.edu 3
www.hepdata.net 3
doi.ala.org.au 3
hasp.ub.uni-heidelberg.de 2
d.lib.msu.edu 2
core.tdar.org 2
arxiv.org 2
www.e-gs.ethz.ch 2
www.e-manuscripta.ch 2
search.rads-doi.org 2
bib-pubdb1.desy.de 2
pqr.pitt.edu 2
147.156.5.176:8080 2
cocoon.huma-num.fr 2
ikee.lib.auth.gr 2
springernature.figshare.com 2
gdac.broadinstitute.org 2
biosys.e-pics.ethz.ch 2
doi.roper.center 2
scholarworks.wm.edu 2
classiques-garnier.com 2
cyberleninka.ru 2
data.caltech.edu 1
archiviostorico.fondazione1563.it 1
resume.uni.lu 1
www.icpsr.umich.edu 1
databank.ora.ox.ac.uk 1
encyclopedia.1914-1918-online.net 1
epub.uni-regensburg.de 1
proteindiffraction.org 1
archiv.ub.uni-heidelberg.de 1
ad.e-pics.ethz.ch 1
ads.nipr.ac.jp 1
data.oceannetworks.ca 1
www.sozialpolitik.ch 1
www.openaccessrepository.it 1
qatest.labarchives.com 1
ap.elte.hu 1
www.bindingdb.org 1
cdr.lib.unc.edu 1
depositonce.tu-berlin.de 1
deepblue.lib.umich.edu 1
esdcdoi.esac.esa.int 1
psyarxiv.com 1
dataverse.callisto.calmip.univ-toulouse.fr 1
ascomycete.org 1
b2share.eudat.eu 1
resolver.caltech.edu 1
www.openagrar.de 1
ojs.utlib.ee 1
tecnoscienza.unibo.it 1
www.repository.cam.ac.uk 1
daac.ornl.gov 1
www.tib.eu 1
doi.ciser.cornell.edu 1
academiccommons.columbia.edu 1
bl.iro.bl.uk 1
journals.open.tudelft.nl 1
tuprints.ulb.tu-darmstadt.de 1
idb.ub.uni-tuebingen.de 1
www.archaeolog.ru 1
webdatenbank.grass-medienarchiv.de 1
rockstore.csiro.au 1
rucore.libraries.rutgers.edu 1
dlc.mpg.de 1
www.crd.york.ac.uk 1
nadre.ethernet.edu.et 1
www.psycharchives.org 1
underline.io 1
cwm-archiv.gbv.de 1
publica.fraunhofer.de 1
theses.gla.ac.uk 1
www.jamstec.go.jp 1
drops.dagstuhl.de 1
dataservices.gfz-potsdam.de 1
boris.unibe.ch 1
ors.datacite.org 1
www.e-rara.ch 1
hfmdk.hebis.de 1
elib.spbstu.ru 1
resolver.tudelft.nl 1
campagnes.flotteoceanographique.fr 1
archive.materialscloud.org 1
www.worldpop.org.uk 1
nsidc.org 1
archaeologydataservice.ac.uk 1
didomena.ehess.fr 1
www.elibrary.ru 1
cyberdoi.ru 1
opus.bibliothek.uni-wuerzburg.de 1
www.zora.uzh.ch 1
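
The `netloc` column corresponds to the network location part of a URL (host plus optional port), which is why `spectradspace.lib.imperial.ac.uk:8443` keeps its port. The table also appears to include records that ended in an error, since that host is listed six times here and six times in the error table. The sketch below shows one way such a tally could be produced with the Python standard library. It is an assumption about the approach, not the benchmark's actual code, and the URL list is a small sample copied from the error table.

```python
from collections import Counter
from urllib.parse import urlparse

# Sample URLs copied from the error table; the real benchmark input is larger.
urls = [
    "https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/to-8701",
    "https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/to-1328",
    "https://arxiv.org/abs/2309.02963",
    "https://www.osti.gov/servlets/purl/1652700/",
]

# urlparse(...).netloc keeps the port when one is present in the URL.
netloc_counts = Counter(urlparse(url).netloc for url in urls)

# Print hosts from most to least frequent, as in the table above.
for netloc, count in netloc_counts.most_common():
    print(netloc, count)
```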