Detect and honor all licenses #57

Open
tfmorris opened this issue May 27, 2013 · 10 comments

Comments

@tfmorris
Contributor

Project Gutenberg does not consist entirely of public domain works. It also includes works which have been licensed to them for distribution and which are not necessarily redistributable.

It's probably too much work to go and get permission from all the authors, so the safest thing to do is just exclude them. One example which is not public domain is Zen and the Art of the Internet https://github.com/GITenberg/Zen-and-the-Art-of-the-Internet_34 (it actually is legal to redistribute it, including modified copies, but the repo needs to include the specified license, so the PG license currently in the repo is misleading).

@sethwoodworth
Contributor

This was one of the first test repos I uploaded. I have at least some code that should filter out books with a copyright status other than public domain, but AFAICT I wrote it after I had uploaded this test repo. The repo that you mention has been deleted.

@sethwoodworth
Contributor

The only books in the PG archives with copyright that I am interested in are 400 books that were released on a CD in the early 90s that claim copyright via sweat-of-brow digitization. I would like to get in touch with the copyright owner, and request they remove their copyright claim, but this is a low priority. Until then I'll leave them out of the repos.

@eshellman
Contributor

The NYPL folks did a heroic survey of ALL the copyright statements in PG, which they have shared with us. A significant number of the non-PD books can be cleared; some can't. Amusingly, the largest number of non-PD "books" are text-to-speech versions of PD books. The copyright claims on these are bogus, but the quality of the text-to-speech is laughable today, and they are not of interest as books.

@tfmorris
Contributor Author

tfmorris commented Oct 7, 2015

@eshellman Is their survey in the repo somewhere?

@eshellman
Contributor

I was checking if I had permission to share it, and it seems I do. Where should I put it? I'll give you access in a minute.

@eshellman
Contributor

@tfmorris
Contributor Author

tfmorris commented Oct 7, 2015 via email

@eshellman
Contributor

OK. I will create a "no-distribution" list in this repo, and make sure the copyright statements line up with what's in the metadata.
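
A rough sketch of how such a list could be applied and cross-checked against the metadata. Everything here is an assumption made for illustration: the `Book` record, its field names, the "starts with public domain" test, and the sample IDs are hypothetical and are not taken from the GITenberg code or from the NYPL survey.

```python
from dataclasses import dataclass


@dataclass
class Book:
    # Hypothetical record; field names are assumptions, not GITenberg's schema.
    book_id: int
    title: str
    rights: str  # free-text copyright statement from the metadata


def is_public_domain(book: Book) -> bool:
    """Treat a book as public domain only if its rights statement says so."""
    return book.rights.strip().lower().startswith("public domain")


def split_distributable(books, no_distribution_ids):
    """Split books by the no-distribution list and flag cases to review.

    Returns (distributable, excluded, to_review). A book lands in to_review
    when the list and the rights statement disagree: it is listed but reads
    as public domain, or it is not listed but does not read as public domain.
    """
    distributable, excluded, to_review = [], [], []
    for book in books:
        listed = book.book_id in no_distribution_ids
        (excluded if listed else distributable).append(book)
        if listed == is_public_domain(book):
            to_review.append(book)
    return distributable, excluded, to_review


# Made-up sample data, for illustration only.
sample_books = [
    Book(1, "Example A", "Public domain in the USA."),
    Book(2, "Example B", "Copyrighted. Read the notice inside this book."),
    Book(3, "Example C", "Copyrighted. Read the notice inside this book."),
]
sample_no_distribution = {2}

distributable, excluded, to_review = split_distributable(
    sample_books, sample_no_distribution
)
```

A book in `to_review` is not necessarily an error. An openly licensed work that is not public domain would be flagged here and could still be redistributable with its proper license, so the output is a checklist for a human rather than a verdict.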

@tfmorris
Contributor Author

The "launch" email suggests that 1363 books were omitted, so perhaps this has been addressed?

@eshellman
Contributor

Most of the 1363 books are audio or data. The books that are not in the public domain or openly licensed are also included in this list. The books that are not public domain in Germany and are the subject of the lawsuit against PG are denoted with a rights_note, which is referenced in the read-me.

I need to go back and double check whether the books that are not public domain but ARE openly licensed have been correctly handled. It's been so long...
