Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ID generation scheme #55

Merged
merged 4 commits into from
May 30, 2024
Merged

Improve ID generation scheme #55

merged 4 commits into from
May 30, 2024

Conversation

ml-evs
Copy link
Collaborator

@ml-evs ml-evs commented Mar 31, 2024

Blocked by #54

Addresses suggestions in #53:

  • More ergonomic IDs generation
  • immutable_id used to store current path-based ID format
  • add last modified based on last rendering of API

@ml-evs ml-evs force-pushed the ml-evs/update-ids branch from 4a74d77 to 33bfbfe Compare March 31, 2024 13:41
@ml-evs ml-evs changed the base branch from main to ml-evs/fix-tests March 31, 2024 20:04
@ml-evs ml-evs force-pushed the ml-evs/update-ids branch from 33bfbfe to 53d9ac7 Compare March 31, 2024 20:04
@ml-evs ml-evs changed the base branch from ml-evs/fix-tests to main March 31, 2024 20:06
@ml-evs ml-evs force-pushed the ml-evs/update-ids branch 2 times, most recently from 8157034 to db57310 Compare March 31, 2024 20:09
@ml-evs ml-evs force-pushed the ml-evs/update-ids branch from a5fed0f to f3a5c3b Compare April 17, 2024 23:44
@ml-evs ml-evs mentioned this pull request Apr 19, 2024
3 tasks
@ml-evs ml-evs force-pushed the ml-evs/update-ids branch from f3a5c3b to 13e8138 Compare April 28, 2024 09:16
@ml-evs ml-evs marked this pull request as ready for review April 28, 2024 13:35
@ml-evs
Copy link
Collaborator Author

ml-evs commented Apr 28, 2024

Hi @eimrek, think this is ready for review whenever you are. Probably easier to start from the changes to the tests and make sure you agree with the idea (as discussed in #53) -- happy to make any changes you see fit!

@ml-evs ml-evs requested a review from eimrek April 28, 2024 13:36
@eimrek
Copy link
Member

eimrek commented May 30, 2024

Sorry that i didn't have time before. Took a look now, looks great.

But, for the example of let's say

structures.zip:
  structures
  │   ├── molecules
  │   │   ├── benzene.xyz
  │   │   └── butadiene.xyz
  │   ├── set1
  │   │   ├── 55c564f6-ac6a-4122-b8d9-0ad9fe61e961.cif
  │   │   ├── 991bec7a-b3a8-49af-ba6d-be5afd685cd4.cif
  │   │   └── cc1a41b1-a841-4818-baf1-a6c1441dc52a.cif
  │   └── set2
  │       ├── 2aee2118-f484-4d53-90dc-0464cc6c0995.cif
  │       ├── 41b8077b-e051-45ed-88b4-57d8ecce59d0.cif
  │       └── 8b4cac08-928f-443a-8a26-5042d9970d76.cif

I think I would want to keep the subfolder as well (molecules, set1, set2) and possibly the extension, if they're different for different files. I think the IDs could be, in this case,

molecules/benzene.xyz
molecules/butadiene.xyz
set1/55c564f6-ac6a-4122-b8d9-0ad9fe61e961.cif
set1/991bec7a-b3a8-49af-ba6d-be5afd685cd4.cif
set1/cc1a41b1-a841-4818-baf1-a6c1441dc52a.cif
set2/2aee2118-f484-4d53-90dc-0464cc6c0995.cif
set2/41b8077b-e051-45ed-88b4-57d8ecce59d0.cif
set2/8b4cac08-928f-443a-8a26-5042d9970d76.cif

what do you think?

With the current implementation, i get the subfolders and extensions removed.

What i propose is to just remove any common folders/files from front or back of the path (based on splitting on forwardslash). and potentially also removing extension, if common for all the files. Do you see any issues with this?

@eimrek eimrek merged commit ec20a8d into main May 30, 2024
1 check passed
@eimrek eimrek deleted the ml-evs/update-ids branch May 30, 2024 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants