Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define KpointsData via constructor #3287

Open
ltalirz opened this issue Aug 29, 2019 · 6 comments
Open

Define KpointsData via constructor #3287

ltalirz opened this issue Aug 29, 2019 · 6 comments
Labels

Comments

@ltalirz
Copy link
Member

ltalirz commented Aug 29, 2019

It should be possible to set the properties of a KpointsData directly in the constructor. Instead of having to write

kpoints = KpointsData()
kpoints.set_kpoints_mesh([4,4,4])
builder.kpoints = kpoints

one should be able to write

builder.kpoints = KpointsData(mesh=[4,4,4])
@sphuber
Copy link
Contributor

sphuber commented Aug 29, 2019

I would be all for this. There are multiple open issues related to the usability of the KpointsData class and these discussion have typically led to the conclusion that it badly needs to be redesigned. Also the fact that it is used both for meshes and explicit k-point lists leads to very unexpected behavior in the API. There were ideas of having KpointsData be sub classed into two specific classes for these respective use cases.

@ltalirz ltalirz added the good first issue Issues that should be relatively easy to fix also for beginning contributors label Feb 3, 2020
@unkcpz unkcpz added the topic/materials-science-related Issues that relate to the materials science label Nov 11, 2021
@JPchico
Copy link
Contributor

JPchico commented Mar 9, 2023

Hi! I was looking at this, and indeed it would be very nice to be able to define this via the constructor.
I agree with @sphuber about having the same datastructure for mesh and kpoints can be a bit confusing. However, I think that having two different sub-classes for this might lead to further complications on the implementation of workflows part.

Thinking about the use cases, I think besides the mesh as suggested by @ltalirz , one should see if it is also possible to define via the constructor using the k-points density, i.e. if we pass

structure = StructureData(ase=ase_structure)
kpoints = KpointsData(structure=structure, density=0.1)

Maybe it is a bit of overkill, but that kind of functionality could make aiida easier to use.

What do you think?

@mbercx
Copy link
Member

mbercx commented Apr 22, 2023

@JPchico I fully agree this would be very useful functionality to have, but in line with my rant in https://github.com/orgs/aiidateam/discussions/5976 1 I would avoid adding arguments to the constructor signature. Rather, the constructor should be complete but minimal.

Instead of adding the structure input arguments, I would add a from_structure method:

structure = StructureData(ase=ase_structure)  # <-- This should also be StructureData.from_ase, but that's a fight for another day
kpoints = KpointsData.from_structure(structure=structure, density=0.1)

A few notes:

  1. One question is whether we want to keep track of the logic that converts a StructureData and float / Float into a corresponding KpointsData. In the aiida-quantumespresso plugin package the decision was made to use a calcfunction for this, see the create_kpoints_from_distance. Note, this may be exactly the kind of provenance-purism that convolutes usage and stops people from using AiiDA, but I can see an argument for the KpointsData being connected to the original structure as well. One idea might be to make the from_structure method a calcfunction? Not sure if that's ever been tried, and probably not desirable because that would immediately store the nodes.

  2. With regard to the minimal but complete constructor: this becomes impossible if we want to maintain a single KpointsData class. A KpointsListData class would never need the mesh to be stored, and hence there will always be superfluous inputs. I'm also not sure if I agree that having different sub-classes leads to complications on the usage of KpointsData-like classes. In fact I have the feeling trying to keep one class does so. It leads to a lot of usage with try-except snippets like this [source]:

         try:
             kpoints_list = kpoints.get_kpoints()
         except AttributeError:
             kpoints_list = kpoints.get_kpoints_mesh(print_list=True)

    Which I'm frankly not a huge fan of, and also leads to confusion for users. What if it could just be kpoints_list = kpoints.get_kpoints_list(), and the class has different method implementations for dealing with the list?

Point [2] is definitely beyond the original scope of the issue though, so I won't go into more detail and further derail the conversation. I'll add my further comments on the constructor by getting involved in the review of #5942.

Footnotes

  1. which interestingly enough doesn't seem to get linked automatically even though I mentioned this issue; -1 for discussions...

@sphuber
Copy link
Contributor

sphuber commented Apr 22, 2023

Not sure if that's ever been tried, and probably not desirable because that would immediately store the nodes.

This is definitely possible. You can generate calcfunctions dynamically and run them. See for example Parser.parse_from_node that does this. Note also that you can always pass metadata.store_provenance = False to not record provenance and so not store the nodes.

Which I'm frankly not a huge fan of, and also leads to confusion for users. What if it could just be kpoints_list = kpoints.get_kpoints_list(), and the class has different method implementations for dealing with the list?

This might solve this problem for this example, but don't think this will be applicable in general. I think that having two subclasses makes perfect sense and should be the preferred solution. An interface can decide to support both by accepting the base class KpointsData, and in the implementation use isinstance check on the more specific subclass to see what methods to call. This is perfectly "normal" behavior in code that uses inheritance.

In the aiida-quantumespresso plugin package the decision was made to use a calcfunction for this, see the create_kpoints_from_distance. Note, this may be exactly the kind of provenance-purism that convolutes usage and stops people from using AiiDA, but I can see an argument for the KpointsData being connected to the original structure as well.

I agree that provenance-purism-and-zealotry is to be avoided. In this case, I think the required functionality required a function (since the implementation is not a trivial one-liner) and at that point we thought we might as well just slap the calcfunction decorator on it. I don't think this case in particular is putting people of AiiDA ^^

@mbercx
Copy link
Member

mbercx commented Apr 22, 2023

Note also that you can always pass metadata.store_provenance = False to not record provenance and so not store the nodes.

True, but that is something most users will only find out about approx 2 years in. 😉 As an avid AiiDA user, I love the idea of the KpointsData.from_structure() classmethod automatically adding this step to the provenance, connecting the StructureData, Float distance and the resulting KpointsData. I'm just worried that new users might do this a bunch of times in a loop and create a lot of nodes they don't need. Of course caching will help here 1. In a way, it could highlight a lot of cool features of AiiDA.

I think that having two subclasses makes perfect sense and should be the preferred solution.

I agree, and I think I was arguing for that? 😅 But let's not derail the constructor discussion further, I'll open an issue elsewhere to discuss the atomistic classes redesign. (Maybe I should finally make the repo, but then I will go into the "must update cookiecutter" rabbithole and there is science to be done.

I agree that provenance-purism-and-zealotry is to be avoided. In this case, I think the required functionality required a function (since the implementation is not a trivial one-liner) and at that point we thought we might as well just slap the calcfunction decorator on it. I don't think this case in particular is putting people of AiiDA ^^

Oh, of course, the create_points_from_distance calculation function is probably the one I've run the most, so it is very dear to my heart. I'm quite the purist myself to be honest, except when there is a very good reason not to (e.g. significant improvement in user-friendliness, e.g. the computers and code conundrum).

Footnotes

  1. Or having features for cleaning up the database of "lone" nodes. I have so many lonely nodes in my database.

@JPchico
Copy link
Contributor

JPchico commented Apr 26, 2023

structure = StructureData(ase=ase_structure)  # <-- This should also be StructureData.from_ase, but that's a fight for another day
kpoints = KpointsData.from_structure(structure=structure, density=0.1)

I think that the solution that you propose @mbercx is ideal, as having the provenance with the density and structure directly when you are creating the kpoints would be a much nicer solution.

Sure having it in the constructor is not the best, but with a nice method like that I think that would solve a lot of the issues when generating kpoints.

@sphuber sphuber removed the good first issue Issues that should be relatively easy to fix also for beginning contributors label May 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants