Inconsistency in the cache of GDriveFileSystem between item added during initialization and from ls #241

Closed
simone-viozzi opened this issue Nov 8, 2022 · 1 comment
Labels
bug Something isn't working fs fsspec implementation

Comments

Contributor

simone-viozzi commented Nov 8, 2022

Consider this minimal working example:

The mkdir method is from #222

import json
import posixpath

from pydrive2.fs import GDriveFileSystem


def mkdir(fs: GDriveFileSystem, path, create_parents=True):
    """Create directory entry at path"""

    if fs.exists(path):
        raise FileExistsError(path)

    dst_id = fs._get_item_id(fs._parent(path), create=create_parents)
    basename = posixpath.basename(path.rstrip("/"))
    fs._gdrive_create_dir(dst_id, basename)


# create a GDriveFileSystem instance with a path that points to an empty folder
root = "root/tmp/"
fs = GDriveFileSystem(root, auth)

# dump the cache
print(json.dumps(fs._ids_cache, indent=4))
{
    "dirs": {
        "tmp": [
            "1BClMfgL7BMV61-5SdWAvc8njvkPgicyx"
        ]
    },
    "ids": {
        "1BClMfgL7BMV61-5SdWAvc8njvkPgicyx": "tmp"
    },
    "root_id": "1BClMfgL7BMV61-5SdWAvc8njvkPgicyx"
}

# create a folder
folder = posixpath.join(root, "folder")
mkdir(fs, folder)

# the cache did not change
print(json.dumps(fs._ids_cache, indent=4))
{
    "dirs": {
        "tmp": [
            "1BClMfgL7BMV61-5SdWAvc8njvkPgicyx"
        ]
    },
    "ids": {
        "1BClMfgL7BMV61-5SdWAvc8njvkPgicyx": "tmp"
    },
    "root_id": "1BClMfgL7BMV61-5SdWAvc8njvkPgicyx"
}

# use the ls method to get the content of the folder we just created
print(fs.ls(folder))
[]

# there is a new entry in the cache: "root/tmp/folder"
print(json.dumps(fs._ids_cache, indent=4))
{
    "dirs": {
        "tmp": [
            "1BClMfgL7BMV61-5SdWAvc8njvkPgicyx"
        ],
        "root/tmp/folder": [
            "1VKRRnGmZm_Dvmn0G8xQjqwQxbZ9vW0Hf"
        ]
    },
    "ids": {
        "1BClMfgL7BMV61-5SdWAvc8njvkPgicyx": "tmp",
        "1VKRRnGmZm_Dvmn0G8xQjqwQxbZ9vW0Hf": "root/tmp/folder"
    },
    "root_id": "1BClMfgL7BMV61-5SdWAvc8njvkPgicyx"
}

As we can see, there is an inconsistency between items added in the initialization phase of the cache and items added by ls:

  • items added during the initialization of _ids_cache do not have the root/ prefix
    code https://github.com/iterative/PyDrive2/blob/b700387d05b4ef853607bc54ce0561d571456fc6/pydrive2/fs/spec.py#L247-L266
  • items added by ls have the root/ prefix
    code https://github.com/iterative/PyDrive2/blob/b700387d05b4ef853607bc54ce0561d571456fc6/pydrive2/fs/spec.py#L427-L467
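To make the mismatch easier to spot, here is a minimal sketch that partitions the cached directory paths by whether they carry the root/ prefix. It runs on a plain dict copied from the last cache dump above, so it does not need Drive access. The helper name split_by_prefix is hypothetical and not part of PyDrive2.

```python
# Cache contents copied from the last dump in the report above.
ids_cache = {
    "dirs": {
        "tmp": ["1BClMfgL7BMV61-5SdWAvc8njvkPgicyx"],
        "root/tmp/folder": ["1VKRRnGmZm_Dvmn0G8xQjqwQxbZ9vW0Hf"],
    },
    "ids": {
        "1BClMfgL7BMV61-5SdWAvc8njvkPgicyx": "tmp",
        "1VKRRnGmZm_Dvmn0G8xQjqwQxbZ9vW0Hf": "root/tmp/folder",
    },
    "root_id": "1BClMfgL7BMV61-5SdWAvc8njvkPgicyx",
}


def split_by_prefix(cache, prefix="root/"):
    """Hypothetical helper: partition cached directory paths by prefix."""
    with_prefix = sorted(p for p in cache["dirs"] if p.startswith(prefix))
    without_prefix = sorted(p for p in cache["dirs"] if not p.startswith(prefix))
    return with_prefix, without_prefix


with_prefix, without_prefix = split_by_prefix(ids_cache)
print("with root/ prefix:   ", with_prefix)
print("without root/ prefix:", without_prefix)
```

With a consistent cache, one of the two lists would be empty. Here both are populated, which is the inconsistency described above.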

Is this the expected behavior?

@shcheklein shcheklein added bug Something isn't working fs fsspec implementation labels Nov 30, 2022
@shcheklein
Member

> items added on the initialization of _ids_cache does not have the root/ prefix

This is not relevant anymore. Additional caching during init was removed.
