Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

--nolock feature for opening locked databases #1744

Closed
simonw opened this issue May 17, 2022 · 7 comments
Closed

--nolock feature for opening locked databases #1744

simonw opened this issue May 17, 2022 · 7 comments

Comments

@simonw
Copy link
Owner

simonw commented May 17, 2022

The getting started docs currently suggest you try this to browse your Chrome history:

datasette ~/Library/Application\ Support/Google/Chrome/Default/History

But if Chrome is running you will likely get this error:

sqlite3.OperationalError: database is locked

Turns out there's a workaround for this which I just spotted on the SQLite forum:

You can do this using a URI filename:

sqlite3 'file:places.sqlite?mode=ro&nolock=1'

That opens the file places.sqlite in read-only mode with locking disabled. This isn't safe, in that changes to the database made by other corrections are likely to cause this connection to return incorrect results or crash. Read-only mode should at least mean that you don't corrupt the database in the process.

@simonw
Copy link
Owner Author

simonw commented May 17, 2022

I knocked out a quick prototype of this and it worked!

datasette ~/Library/Application\ Support/Google/Chrome/Default/History --nolock

Here's the prototype diff:

diff --git a/datasette/app.py b/datasette/app.py
index b7b8437..f43700d 100644
--- a/datasette/app.py
+++ b/datasette/app.py
@@ -213,6 +213,7 @@ class Datasette:
         config_dir=None,
         pdb=False,
         crossdb=False,
+        nolock=False,
     ):
         assert config_dir is None or isinstance(
             config_dir, Path
@@ -238,6 +239,7 @@ class Datasette:
         self.databases = collections.OrderedDict()
         self._refresh_schemas_lock = asyncio.Lock()
         self.crossdb = crossdb
+        self.nolock = nolock
         if memory or crossdb or not self.files:
             self.add_database(Database(self, is_memory=True), name="_memory")
         # memory_name is a random string so that each Datasette instance gets its own
diff --git a/datasette/cli.py b/datasette/cli.py
index 3c6e1b2..7e44665 100644
--- a/datasette/cli.py
+++ b/datasette/cli.py
@@ -452,6 +452,11 @@ def uninstall(packages, yes):
     is_flag=True,
     help="Enable cross-database joins using the /_memory database",
 )
+@click.option(
+    "--nolock",
+    is_flag=True,
+    help="Ignore locking and open locked files in read-only mode",
+)
 @click.option(
     "--ssl-keyfile",
     help="SSL key file",
@@ -486,6 +491,7 @@ def serve(
     open_browser,
     create,
     crossdb,
+    nolock,
     ssl_keyfile,
     ssl_certfile,
     return_instance=False,
@@ -545,6 +551,7 @@ def serve(
         version_note=version_note,
         pdb=pdb,
         crossdb=crossdb,
+        nolock=nolock,
     )
 
     # if files is a single directory, use that as config_dir=
diff --git a/datasette/database.py b/datasette/database.py
index 44d3266..fa55804 100644
--- a/datasette/database.py
+++ b/datasette/database.py
@@ -89,6 +89,8 @@ class Database:
         # mode=ro or immutable=1?
         if self.is_mutable:
             qs = "?mode=ro"
+            if self.ds.nolock:
+                qs += "&nolock=1"
         else:
             qs = "?immutable=1"
         assert not (write and not self.is_mutable)

@simonw
Copy link
Owner Author

simonw commented May 17, 2022

Not sure how to test this - I'd need to open my own lock against a database file somehow.

@simonw
Copy link
Owner Author

simonw commented May 17, 2022

I think I do that with fcntl.flock(): https://docs.python.org/3/library/fcntl.html#fcntl.flock

@simonw
Copy link
Owner Author

simonw commented May 17, 2022

I tried writing a test like this:

@pytest.mark.parametrize("locked", (True, False))
def test_locked_sqlite_db(tmp_path_factory, locked):
    dir = tmp_path_factory.mktemp("test_locked_sqlite_db")
    test_db = str(dir / "test.db")
    sqlite3.connect(test_db).execute("create table t (id integer primary key)")
    if locked:
        fp = open(test_db, "w")
        fcntl.lockf(fp.fileno(), fcntl.LOCK_EX)
    runner = CliRunner()
    result = runner.invoke(
        cli,
        [
            "serve",
            "--memory",
            "--get",
            "/test",
        ],
        catch_exceptions=False,
    )

But it didn't work, because the test runs in the same process - so taking an exclusive lock on that file didn't cause an error when the test later tried to access it via Datasette!

@simonw
Copy link
Owner Author

simonw commented May 17, 2022

I'm going to skip adding a test for this - the test logic would have to be pretty convoluted to exercise it properly, and it's a pretty minor and low-risk feature in the scheme of things.

@simonw
Copy link
Owner Author

simonw commented May 17, 2022

One thing to note is that the datasette-copy-to-memory plugin broke with a locked file, because it does this: https://github.com/simonw/datasette-copy-to-memory/blob/d541c18a78ae6f707a8f9b1e7fc4c020a9f68f2e/datasette_copy_to_memory/__init__.py#L27

tmp.execute("ATTACH DATABASE ? AS _copy_from", [db.path])

That would need to use a URI filename too for it to work with locked files.

@simonw
Copy link
Owner Author

simonw commented May 17, 2022

simonw added a commit that referenced this issue May 17, 2022
simonw added a commit that referenced this issue Jul 18, 2022
simonw added a commit that referenced this issue Aug 14, 2022
simonw added a commit that referenced this issue Sep 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant