support for "direct" jsonl files #50

Open
eimrek opened this issue Mar 6, 2024 · 4 comments

Comments

@eimrek
Member

eimrek commented Mar 6, 2024

I am wondering if we really need to support "direct" jsonl files in the optimade.yaml format.

Conceptually, it seems to me that the current purpose of optimake is to generate a jsonl file from other structural data formats, and optimade.yaml is something that helps to achieve this.

If we already have a jsonl file, then the only purpose I see is validation, and the optimade.yaml file does not seem strictly necessary.

But perhaps generating jsonl files and validating them is different enough to separate them?

E.g. we could have a different optimake subcommand for validation (optimake validate <jsonl-file>?)

Regarding the Materials Cloud Archive service, this change would affect it, as then we should add support for a "direct" jsonl file without any optimade.yaml file.
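
For illustration, the kind of standalone check a validation subcommand might start from can be sketched in a few lines of Python. This is only a sketch, not optimake's implementation: the function name check_jsonl is hypothetical, and it only tests that each line is a well-formed JSON object, not that the entries conform to the OPTIMADE schema.

```python
import json
from pathlib import Path


def check_jsonl(path):
    """Return a list of (line_number, message) problems found in a JSON Lines file.

    Hypothetical helper: checks well-formedness only, not OPTIMADE compliance.
    """
    problems = []
    with Path(path).open(encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                problems.append((number, "blank line"))
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((number, f"invalid JSON: {exc.msg}"))
                continue
            # Every line of the file is expected to hold one JSON object.
            if not isinstance(record, dict):
                problems.append((number, "expected a JSON object"))
    return problems
```

An empty list means every line parsed as a JSON object; a real validator would go on to check the entries themselves.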

@ml-evs
Collaborator

ml-evs commented Mar 7, 2024

Would you still want people to provide an optimade.yaml to trigger the scraper at your end? Otherwise you have to rely on file extensions and the file header, which might give many false positives. (We can also extend the yaml file to support/require the new database description field to be served in the OPTIMADE metadata.)
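
A rough sketch of what the header-based detection could look like, to make the false-positive concern concrete. It assumes the first line of an OPTIMADE JSON Lines file is a JSON object carrying an "x-optimade" key; the function name looks_like_optimade_jsonl is hypothetical and not part of any existing tool.

```python
import json


def looks_like_optimade_jsonl(path):
    """Heuristic: does the first line parse as a JSON object with an "x-optimade" key?

    Assumption: OPTIMADE JSON Lines files start with such a header line.
    """
    try:
        with open(path, encoding="utf-8") as handle:
            first_line = handle.readline()
        header = json.loads(first_line)
    except (OSError, UnicodeDecodeError, json.JSONDecodeError):
        # Unreadable, non-text, empty, or non-JSON files are not candidates.
        return False
    return isinstance(header, dict) and "x-optimade" in header
```

Any unrelated file whose first line happens to match would still be picked up, which is the argument for an explicit trigger file.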

@eimrek
Member Author

eimrek commented Mar 7, 2024

Could be that we accept either optimade.yaml or optimade.jsonl, but I agree that a single file would be preferable. But I think it's good to design optimake in a way that makes the most sense by itself.

E.g. the example

config_version: 0.1.0

database_description: >-
  This database contains some example CIFs.

entries:
  jsonl_path: example.jsonl

I think currently it doesn't do anything, right? And in the future, would it just validate the file? But maybe this makes sense, open to discuss further.

@ml-evs ml-evs pinned this issue Mar 11, 2024
@ml-evs ml-evs unpinned this issue Mar 11, 2024
@ml-evs
Collaborator

ml-evs commented Sep 27, 2024

Coming back to this (perhaps it can be closed in the original context), I would quite like optimake serve . (or optimake serve optimade.jsonl) to work without needing a config file, even if it throws errors about validation of the file.

@eimrek
Member Author

eimrek commented Oct 3, 2024

yep, I agree that optimade.jsonl doesn't need to have the config file, and we could make it optional in this case.
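
One way the "config file is optional" behaviour could be sketched, purely as an illustration: prefer optimade.yaml when it is present, otherwise fall back to a bare optimade.jsonl. The function resolve_input and its return values are hypothetical and do not reflect how optimake is actually structured.

```python
from pathlib import Path


def resolve_input(directory, config_name="optimade.yaml", jsonl_name="optimade.jsonl"):
    """Decide what to work from in a directory (hypothetical helper).

    Returns ("config", path) if the config file exists,
    otherwise ("jsonl", path) if a bare JSON Lines file exists.
    """
    directory = Path(directory)
    config = directory / config_name
    jsonl = directory / jsonl_name
    if config.is_file():
        return ("config", config)
    if jsonl.is_file():
        # No config file: work directly from the JSON Lines file.
        return ("jsonl", jsonl)
    raise FileNotFoundError(f"neither {config_name} nor {jsonl_name} found in {directory}")
```

Giving the config file precedence keeps existing setups unchanged; the fallback only applies when no optimade.yaml is present.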
