Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASE: Use guess=true while checking parser support #29

Merged
merged 14 commits into from
Nov 18, 2023

Conversation

rashidrafeek
Copy link
Contributor

Is there a reason for setting guess=false while checking parser support? Only if guess=true will ASE guess the extension based on extension and therefore can read many formats like VASP POSCARS (*.vasp, POSCAR*), QE input (*.in) etc. Currently, it shows an error: Python: UnknownFileTypeError: Could not guess file type due to setting guess=false.

@mfherbst
Copy link
Owner

Thanks for the suggestion.

Hmm I guess that was to be careful and to avoid issues when file extensions are not unique. For sure we should make this a parameter exposed to the user (eg a flag in ASEParser, but before changing it we should double check this has no negative side effects.

We could also have two ase parser objects in the default list of parsers: the first more strict and the second laxer. This would make it more obvious what's the cause in case sth. breaks in the future.

@mfherbst
Copy link
Owner

mfherbst commented Sep 22, 2023

Oh also please provide example files for edge cases / unexpected behaviour if you can. Always good to add them to the atomsio test suite, because the whole parsing stuff is indeed quite messy.

@rashidrafeek
Copy link
Contributor Author

Sorry for the late reply. I was a little busy and could not respond to you comments.

I think defining these parameters in the AseParser object with a default value using @kwdef would be better instead of defining multiple parser objects. For example, we can redefine AseParser as,

@kwdef struct AseParser <: AbstractParser 
    read::Bool = false
    guess::Bool = false
end

function AtomsIO.supports_parsing(parser::AseParser, file; save, trajectory)
    ...
        format = ase.io.formats.filetype(file; read=parser.read, guess=parser.guess)
    ...
end

Additionally, it seems that currently the read argument above (which enables reading the file to determine filetype) is mapped to the save argument of supports_parsing. AFAIU, these two flags are independent and mapping them leads to incorrect behavior. For example, even though ASE can both read and write "*.xsf" files, currently it gives an error when save=false:

julia> using AtomsIO: supports_parsing
       using AtomsIOPython: AseParser

julia> supports_parsing(AseParser(), "test.xsf"; save=true, trajectory=false)
true

julia> supports_parsing(AseParser(), "test.xsf"; save=false, trajectory=false)
ERROR: UndefVarError: `PyException` not defined
Stacktrace:
 [1] supports_parsing(::AseParser, file::String; save::Bool, trajectory::Bool)
   @ AtomsIOPython ~/GitProjects/AtomsIO/lib/AtomsIOPython/src/ase.jl:22
 [2] top-level scope
   @ REPL[46]:1

caused by: Python: UnknownFileTypeError: Could not guess file type
...

This behavior is present for "*.vasp" files also.

There are individual functions can_read and can_write in the IOFormat object of ASE to determine whether ASE supports reading/writing for a specific format. I think these should be used instead. To change this, the following lines:

else
return true
end

would need to changed to

    ioformat = ase.io.formats.ioformats[format]
    elseif save
        return ioformat.can_write
    else
        return ioformat.can_read
    end

If these changes seem fine, I will modify the PR accordingly. Let me know what you think.

@mfherbst
Copy link
Owner

mfherbst commented Oct 3, 2023

Sounds good. I'm a bit confused what the read ASE kwarg is then actually for. Is there no way we can give it a sane default and not worry about it from then on?

- Add arguments to `filetype` function in `AseParser` struct with
  default same as ASE.
- Check whether ASE can read or write format was not done. Add these
  checks using `can_read` and `can_write` functios of `ioformat`.
Copy link
Owner

@mfherbst mfherbst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks getting close. Just a small refactor ;)

lib/AtomsIOPython/src/ase.jl Outdated Show resolved Hide resolved
lib/AtomsIOPython/src/ase.jl Outdated Show resolved Hide resolved
lib/AtomsIOPython/src/ase.jl Outdated Show resolved Hide resolved
@rashidrafeek
Copy link
Contributor Author

I have implemented your suggestions. And have removed the read argument from the AseParser for now as you suggested. The issue with the read argument would not occur if guess=true is made the default. Wouldn't this default make more sense as this is the current default in ASE?

I have also added a test for xsf format which would not pass in the current master. Let me know if these changes are okay.

@mfherbst mfherbst enabled auto-merge (squash) November 8, 2023 08:24
@mfherbst mfherbst disabled auto-merge November 8, 2023 08:24
@mfherbst
Copy link
Owner

mfherbst commented Nov 8, 2023

@rashidrafeek Could you add tests in lib/AtomsIOPython/test/ase.jl for the cases you would want to have supported, just to ensure that we don't break things in the future ?

@rashidrafeek
Copy link
Contributor Author

Sorry for the delay here. I have added .xyz and .vasp to the tests, which I use regularly.

Also, there was a mistake in the wording of the warning. The bug occurs when trying to read, not write (i.e, save=false). Hope it is good to go now.

@mfherbst
Copy link
Owner

Thanks!

@mfherbst mfherbst enabled auto-merge (squash) November 18, 2023 16:02
@mfherbst mfherbst merged commit d10476e into mfherbst:master Nov 18, 2023
5 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants