Detect and honor all licenses #57
Comments
This was one of the first test repos I uploaded. I have at least some code that should filter out books with a copyright status other than public domain, but AFAICT I wrote it after I had uploaded this test repo. The repo that you mention has been deleted.
The only books in the PG archives with copyright that I am interested in are 400 books that were released on a CD in the early 90s that claim copyright via sweat-of-brow digitization. I would like to get in touch with the copyright owner, and request they remove their copyright claim, but this is a low priority. Until then I'll leave them out of the repos.
The NYPL folks did a heroic survey of ALL the copyright statements in PG, which they have shared with us. A significant number of the non-PD books can be cleared. Some can't. Amusingly, the largest number of non-PD "books" are text-to-speech versions of PD books; the copyright claims on these are bogus, but the quality of the text-to-speech is laughable today, and they are not of interest as books.
@eshellman Is their survey in the repo somewhere?
I was checking if I had permission to share it, and it seems I do. Where should I put it? I'll give you access in a minute.
Thanks. That provides a good summary of the messiness of the non-public
domain works in PG, but I'm not sure how much it helps with determining
which works could be distributed. It really needs to be summarized in a
binary Yes/No column or a small number of choices (perhaps aligned with CC
terminology?). The things with a copyright and no license are clearly
out. Many of the ones which do allow redistribution seem to basically be
the equivalent of CC-ND-NC. Of course the ones which are already CC licensed
are already categorized.
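The Yes/No-or-a-few-choices column suggested above could be sketched roughly as follows. This is only an illustration, not code from this repo or from the NYPL survey: the `Redistribution` categories, the `classify` function, and the `rights` and `license_text` inputs are all hypothetical names, and the matching rules are assumptions that would need checking against the real copyright statements.

```python
import re
from enum import Enum


class Redistribution(Enum):
    """Hypothetical three-way summary column."""
    YES = "yes"                # free to redistribute
    RESTRICTED = "restricted"  # redistributable only under extra conditions
    NO = "no"                  # leave out of the repos


def classify(rights: str, license_text: str = "") -> Redistribution:
    """Reduce a rights statement plus optional license to one category.

    Assumed rules (not taken from the survey):
    - a statement starting with "public domain" is a YES
    - copyrighted with no license at all is a NO
    - a CC license without NC or ND terms is a YES
    - anything else carrying a license needs conditions, so RESTRICTED
    """
    rights_lc = rights.strip().lower()
    # Split the license into lowercase alphanumeric tokens, e.g.
    # "CC BY-NC-ND 4.0" -> {"cc", "by", "nc", "nd", "4", "0"}.
    tokens = set(re.split(r"[^a-z0-9]+", license_text.lower())) - {""}

    if rights_lc.startswith("public domain"):
        return Redistribution.YES
    if not tokens:
        return Redistribution.NO
    if "cc" in tokens and not tokens & {"nc", "nd"}:
        return Redistribution.YES
    return Redistribution.RESTRICTED
```

Anything landing in `RESTRICTED` would still need a human to read the actual terms; the point is only to shrink the free-text statements down to a column that a filter can act on.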
OK. I will create a "no-distribution" list in this repo, and make sure the copyright statements line up with what's in metadata.
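A minimal sketch of how such a "no-distribution" list might be applied, assuming a plain-text format with one PG book number per line and optional `#` comments. The file format, the function names, and the book numbers in the usage lines are all made up for illustration; the actual list in the repo may look different.

```python
from typing import Iterable, List, Set


def load_no_distribution(lines: Iterable[str]) -> Set[int]:
    """Parse book numbers, ignoring blank lines and '#' comments."""
    blocked = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip()  # drop any trailing comment
        if entry:
            blocked.add(int(entry))
    return blocked


def distributable(book_ids: Iterable[int], blocked: Set[int]) -> List[int]:
    """Keep only the books that are not on the no-distribution list."""
    return [book_id for book_id in book_ids if book_id not in blocked]


# Hypothetical list contents and book numbers, for illustration only.
blocked = load_no_distribution([
    "# books held back pending clearance",
    "90001  # copyrighted, no license",
    "",
    "90002",
])
kept = distributable([10, 90001, 20, 90002], blocked)
```

Keeping the list as data rather than hard-coding it in the upload script means a cleared book can be released by deleting one line.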
The "launch" email suggests that 1363 books were omitted, so perhaps this has been addressed?
Most of the 1363 books are audio or data. The books that are not in the public domain or openly licensed are also included in this list. The books that are not public domain in Germany and are the subject of the lawsuit against PG are denoted with a I need to go back and double check whether the books that are not public domain but ARE openly licensed have been correctly handled. It's been so long...
Project Gutenberg does not consist entirely of public domain works. It also includes works which have been licensed to them for distribution and which are not necessarily redistributable.
It's probably too much work to go and get permission from all the authors, so the safest thing to do is just exclude them. An example which is not public domain is Zen and the Art of the Internet https://github.com/GITenberg/Zen-and-the-Art-of-the-Internet_34 (it actually is legal to redistribute it, including modified copies, but it needs to include the specified license, which makes the PG license in the repo misleading).