Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: improve file locking mechanism with atomic creation #455

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

geniusnut
Copy link

Here is TOCTOU in _FileLocker,
which casuses race-condition when using uploader.putfile concurrently.
Below is a mock snippet to reproduce the issue.

from multiprocessing.pool import ThreadPool
from qiniu import config
from qiniu.http.regions_provider import (
    get_default_regions_provider,
    Endpoint,
)

# Define your access key and bucket name
access_key = os.getenv('QINIU_ACCESS_KEY')
secret_key = os.getenv('QINIU_SECRET_KEY')
bucket_name = os.getenv('QINIU_TEST_BUCKET')


query_region_host = config.get_default("default_query_region_host")
query_region_backup_hosts = config.get_default("default_query_region_backup_hosts")
query_regions_endpoints = [
    Endpoint.from_host(h) for h in [query_region_host] + query_region_backup_hosts
]

# query_endpoints_provider = [Endpoint(qiniu_domain)]
print(query_regions_endpoints)


def mock_shrink_cache(name):
    # Get the default regions provider
    regions_provider = get_default_regions_provider(
        query_endpoints_provider=query_regions_endpoints,
        access_key=access_key,
        bucket_name=bucket_name,
        name=name,
    )
    for region in regions_provider:
        ...
    print(f"Last Shrink [{name}] At:", regions_provider.last_shrink_at)


ThreadPool(8).map(mock_shrink_cache, range(20))
The putfile initialize the cache every time, did I miss something, like global-static-once cache for the API.

@qiniu-prow qiniu-prow bot added the size/S label Dec 5, 2024
@lihsai0
Copy link
Collaborator

lihsai0 commented Dec 5, 2024

@geniusnut Thanks for your report.

I will try to reproduce this issue later. The design of _FileLocker is not good.

I planned try some other method to improve it, like fcntl, or yours, or some others.


The putfile initialize the cache every time, did I miss something, like global-static-once cache for the API.

This is a mistake within initializing the cache scope. It should use the global cache unless specified a different persist path.


Sincere apologies for the issue have caused in your usage. And thanks again for your report.

@geniusnut
Copy link
Author

@geniusnut Thanks for your report.

I will try to reproduce this issue later. The design of _FileLocker is not good.

I planned try some other method to improve it, like fcntl, or yours, or some others.

The putfile initialize the cache every time, did I miss something, like global-static-once cache for the API.

This is a mistake within initializing the cache scope. It should use the global cache unless specified a different persist path.

Sincere apologies for the issue have caused in your usage. And thanks again for your report.

Here's a python filelock Repo, suitable for all platforms, avoid using fcntl only on unix-like platforms.

@lihsai0 lihsai0 mentioned this pull request Dec 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants