Handling of posterior licenses #230
Comments
Should all models here use the same permissive license?
@avehtari will check whether the Aalto data support team can help us solve this, including whether we should have different licences for data and code.
Was there ever a licensing decision? I want to import the models elsewhere, but can't do that without a license. Also, this repo needs to cite the source and license of any model retrieved from elsewhere. This might be hard as I think some of them came from the P.S. The reason I'm asking is that we are building out a database of models at Flatiron Institute with just the model implementation, data, and draws, but we can't do that without a license in place. The main motivation is that (a) we can distribute big data sets through our cluster, and (b) we can strip down the complexity of the R and Python packages so that the distribution is just the Stan programs, data, and draws. This is another thing I don't want to do through the Stan project because I don't want to have to compromise with a bunch of people on the goals or contents.
So, I opened up this discussion with the Aalto lawyers in 2020, but the pandemic struck, and everything stopped. We need to know how to do this in a good way. As you say, most models are from within the community, and for some data I have gotten the okay to put it in posteriordb, but we didn't discuss the licence. Do we have someone who knows about licensing of data and this type of thing?
I know the basics, but we also have access to IP lawyers for free through NumFOCUS if there are more complicated issues. Copyright is automatically assigned to whoever writes code or text. The author can reassign copyright. For example, most faculty contracts at American universities stipulate that all copyright for code is reassigned to the university, but all text is owned by the author. I have no idea what contracts in Sweden are like. The copyright owner can choose to distribute their copyrighted work with a license. Once the copyright holder does that, you can use it according to the license. For example, other projects can use stan-dev/stan and stan-dev/math and stan-dev/stanc3 code under the BSD-3 license without further permission from the Stan team. (Our name and logo are trademarked, which is a different branch of IP law.) When you redistribute the copyrighted works of others, you are legally required to respect the licensing terms. They almost always require you to cite the copyright holder and the license under which the copyrighted work is used. If you try to combine code with multiple licenses into a single project, there's an issue of license compatibility and copyleft. Some licenses are fundamentally incompatible, like Apache 2 and GPL 2, but others are compatible, like GPL 3 and BSD-3. If all you do is redistribute each contribution under its own license, it makes it harder for people to use the project (they have to scan the license for everything they use), but it's otherwise OK to do that. If you have more complicated questions, we'll have to get help from a real lawyer.
So the idea I think @avehtari had was to set the licence for each model and dataset in the database. So maybe the easiest would be to do that. Then you could filter for everything that has a licence you are OK with?
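A minimal sketch of what per-entry licensing plus filtering could look like. The `Entry` record, its field names, and the example entries are hypothetical and are not posteriordb's actual schema; the license strings are SPDX identifiers, chosen here only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical database entry; not the real posteriordb schema."""
    name: str
    kind: str                # "model" or "data"
    license: Optional[str]   # SPDX identifier, or None if the licence is unknown


# Made-up entries for illustration only.
ENTRIES = [
    Entry("example_model_a", "model", "BSD-3-Clause"),
    Entry("example_data_a", "data", "CC0-1.0"),
    Entry("example_model_b", "model", "GPL-3.0-only"),
    Entry("example_model_c", "model", None),
]


def filter_by_license(entries: Iterable[Entry], accepted: Iterable[str]) -> List[Entry]:
    """Keep only entries whose licence is in the accepted set.

    Entries with an unknown licence (None) are never returned.
    """
    accepted_set = set(accepted)
    return [e for e in entries if e.license in accepted_set]


if __name__ == "__main__":
    usable = filter_by_license(ENTRIES, {"BSD-3-Clause", "CC0-1.0"})
    print([e.name for e in usable])
```

A downstream project would pass in whichever licences it has decided it can accept; entries without a recorded licence drop out automatically.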
You can create a repo with each component licensed under its own license. There just can't be an overall license unless you find one that's compatible. I'm not sure what that would mean relative to your writing Python or R code that compiles against those models. That's something I'd ask the NumFOCUS IP attorneys.
Ok. You mean our code? That's mainly written by me, so that's no problem. So do I understand you correctly that you are happy with clear licences per model/posterior?
Yes, I mean the R and Python code. It has to be released under some license, but I don't know what the implications would be of it using a bunch of Stan code under different licenses. This is where license compatibility becomes an issue and also a reading of what it means to be a derivative product. If all of the models you are using are licensed under GPL v3 or BSD-3 or Apache 2 or MIT license or similar, you're OK with going with a BSD-3 license for your code. As soon as you try to include something with an incompatible license (e.g. homebrew "academic only" use license or GPL v2), I would urge you to ask NumFOCUS lawyers.
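The rule of thumb in the comment above can be sketched as a simple check that flags anything outside the licences it names. This encodes only that rule of thumb, not a real compatibility analysis. The mapping of "GPL v3", "BSD-3", "Apache 2" and "MIT" to SPDX identifiers, and the function name, are assumptions made for this sketch.

```python
from typing import Iterable, List, Optional

# Licences the comment above names as fine alongside BSD-3-licensed code.
# The SPDX spellings are an assumption of this sketch.
OK_PER_THREAD = frozenset({
    "GPL-3.0-only",
    "GPL-3.0-or-later",
    "BSD-3-Clause",
    "Apache-2.0",
    "MIT",
})


def licenses_needing_review(licenses: Iterable[Optional[str]]) -> List[str]:
    """Return the sorted, de-duplicated licences that fall outside the list.

    A missing licence (None) is reported as "UNKNOWN", since an unlicensed
    component cannot be redistributed without asking the author.
    """
    flagged = set()
    for lic in licenses:
        if lic is None:
            flagged.add("UNKNOWN")
        elif lic not in OK_PER_THREAD:
            flagged.add(lic)
    return sorted(flagged)


if __name__ == "__main__":
    # "LicenseRef-academic-only" stands in for a homebrew academic-use licence.
    print(licenses_needing_review(
        ["MIT", "GPL-2.0-only", None, "LicenseRef-academic-only"]
    ))
```

An empty result means every component falls under the licences named in the thread; anything returned is a candidate for the question to the NumFOCUS lawyers.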
There are separate repositories The different model codes in We do need to mention the licenses for each code in the repository, and remove those codes for which the licence is not clear and we can't get the original author to license them under something suitable.
That makes sense, @avehtari. I hadn't realized I think distributing a repo with a bunch of separately licensed codes is OK. I would personally stay away from anything other than the standard open source licenses, but that's your call. I haven't thought about copyright on draws. You can only copyright things produced by a human, but I don't know the status of things produced by tools operated by a human.
Is the intention to accept data with closed licenses? If not, the
CONTRIBUTING.md text should be clarified as to what licenses other
than BSD-3 are acceptable. If there are multiple licenses, there
needs to be a master list and they all need to be compatible if you're
going to put them into the same package as very few licenses are as
compatible with other licenses as BSD-3.
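One way the master list mentioned above could be produced is by grouping entries by their recorded licence. This is a sketch under assumptions: the `(name, license)` pairs and both function names are hypothetical, and nothing here reflects how the repository actually stores licence information.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def master_license_list(
    entries: Iterable[Tuple[str, Optional[str]]]
) -> Dict[str, List[str]]:
    """Group entry names by licence, with unknown licences under "UNKNOWN"."""
    grouped = defaultdict(list)
    for name, lic in entries:
        grouped[lic or "UNKNOWN"].append(name)
    # Sort licences and names so the generated list is stable across runs.
    return {lic: sorted(names) for lic, names in sorted(grouped.items())}


def format_master_list(grouped: Dict[str, List[str]]) -> str:
    """Render one line per licence, e.g. for a LICENSES summary file."""
    return "\n".join(f"{lic}: {', '.join(names)}" for lic, names in grouped.items())


if __name__ == "__main__":
    # Made-up entries for illustration only.
    example = [
        ("example_model_b", "MIT"),
        ("example_model_a", "MIT"),
        ("example_data_a", None),
    ]
    print(format_master_list(master_license_list(example)))
```

Keeping the "UNKNOWN" group visible makes it easy to see which entries still need a licence from their original author.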