Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Writable canned queries fail with useless non-error against immutable databases #1728

Closed
wragge opened this issue Apr 28, 2022 · 13 comments
Closed
Labels

Comments

@wragge
Copy link
Contributor

wragge commented Apr 28, 2022

I've been banging my head against a wall for a while and would appreciate any pointers...

  • I have a writeable canned query to update rows in the db.
  • I'm using the github-oauth plugin for authentication.
  • I have allow set on the query to accept my GitHub id and a GH organisation.
  • Authentication seems to work as expected both locally and on Cloudrun -- viewing /-/actor gives the same result in both environments
  • I can access the 'padlocked' canned query in both environments.

Everything seems to be the same, but the canned query works perfectly when run locally, and fails when I try it on Cloudrun. I'm redirected back to the canned query page and the db is not changed. There's nothing in the Cloudstor logs to indicate an error.

Any clues as to where I should be looking for the problem?

@simonw
Copy link
Owner

simonw commented Apr 28, 2022

How did you deploy to Cloud Run?

datasette publish cloudrun defaults to running databases there in -i immutable mode, because if you managed to change a file on disk on Cloud Run those changes would be lost the next time your container restarted there.

That's why I upgraded datasette-publish-fly to provide a way of working with their volumes support - they're the best option I know of right now for running Datasette in a container with a persistent volume that can accept writes: https://simonwillison.net/2022/Feb/15/fly-volumes/

@simonw
Copy link
Owner

simonw commented Apr 28, 2022

If the behaviour you are seeing is because the database is running in immutable mode then that's a bug - you should get a useful error message instead!

@simonw
Copy link
Owner

simonw commented Apr 28, 2022

Confirmed - this is a bug where immutable databases fail to show a useful error if you write to them with a canned query.

Steps to reproduce:

echo '
databases:
  writable:
    queries:
      add_name:
        sql: insert into names(name) values (:name)
        write: true
' > write-metadata.yml
echo '{"name": "Simon"}' | sqlite-utils insert writable.db names -
datasette writable.db -m write-metadata.yml

Then visit http://127.0.0.1:8001/writable/add_name - adding names works.

Now do this instead:

datasette -i writable.db -m write-metadata.yml

And I'm getting a broken error:

error

@simonw simonw changed the title Writeable canned query fails on Cloudrun Writable canned queries fail with useless non-error against immutable databases Apr 28, 2022
@simonw simonw added the bug label Apr 28, 2022
@wragge
Copy link
Contributor Author

wragge commented Apr 28, 2022

Ah, that would be it! I have a core set of data which doesn't change to which I want authorised users to be able to submit corrections. I was going to deal with the persistence issue by just grabbing the user corrections at regular intervals and saving to GitHub. I might need to rethink. Thanks!

@simonw
Copy link
Owner

simonw commented Apr 28, 2022

I've wanted to do stuff like that on Cloud Run too. So far I've assumed that it's not feasible, but recently I've been wondering how hard it would be to have a small (like less than 100KB or so) Datasette instance which persists data to a backing GitHub repository such that when it starts up it can pull the latest copy and any time someone edits it can push their changes.

I'm still not sure it would work well on Cloud Run due to the uncertainty at what would happen if Cloud Run decided to boot up a second instance - but it's still an interesting thought exercise.

@simonw
Copy link
Owner

simonw commented Apr 28, 2022

A more realistic solution (which I've been using on several of my own projects) is to keep the data itself in GitHub and encourage users to edit it there - using the GitHub web interface to edit YAML files or similar.

Needs your users to be comfortable hand-editing YAML though! You can at least guard against critical errors by having CI run tests against their YAML before deploying.

I have a dream of building a more friendly web forms interface which edits the YAML back on GitHub for the user, but that's just a concept at the moment.

Even more fun would be if a user-friendly form could submit PRs for review without the user having to know what a PR is!

@simonw
Copy link
Owner

simonw commented Apr 28, 2022

In terms of this bug, there are a few potential fixes:

  1. Detect the write to a immutable database and show the user a proper, meaningful error message in the red error box at the top of the page
  2. Don't allow the user to even submit the form - show a message saying that this canned query is unavailable because the database cannot be written to
  3. Don't even allow Datasette to start running at all - if there's a canned query configured in metadata.yml and the database it refers to is in -i immutable mode throw an error on startup

I'm not keen on that last one because it would be frustrating if you couldn't launch Datasette just because you had an old canned query lying around in your metadata file.

So I'm leaning towards option 2.

@wragge
Copy link
Contributor Author

wragge commented Apr 28, 2022

I don't think that'd work for this project. The db is very big, and my aim was to have an environment where researchers could be making use of the data, but be easily able to add corrections to the HTR/OCR extracted data when they came across problems. It's in its immutable (!) form here: https://sydney-stock-exchange-xqtkxtd5za-ts.a.run.app/stock_exchange/stocks

@simonw
Copy link
Owner

simonw commented Apr 28, 2022

Nice custom template/theme!

Yeah, for that I'd recommend hosting elsewhere - on a regular VPS (I use systemd like this: https://docs.datasette.io/en/stable/deploying.html#running-datasette-using-systemd ) or using Fly if you want to tub containers without managing a full server.

@wragge
Copy link
Contributor Author

wragge commented Apr 28, 2022

Thanks, I'll give it a try!

@wragge
Copy link
Contributor Author

wragge commented Apr 28, 2022

And in terms of the bug, yep I agree that option 2 would be the most useful and least frustrating.

@simonw simonw added this to the Datasette 0.62 milestone Aug 14, 2022
@simonw
Copy link
Owner

simonw commented Aug 14, 2022

Decision: I'm going to implement this:

Don't allow the user to even submit the form - show a message saying that this canned query is unavailable because the database cannot be written to

@simonw
Copy link
Owner

simonw commented Aug 14, 2022

Here's what it looks like:

image

I put the disabled attribute on the form submission button too. There's no visible difference in how that button is displayed but since this is only a state that should be seen by the developer I think that's OK, especially combined with the warning at the top of the page.

@simonw simonw closed this as completed in c1396bf Aug 14, 2022
simonw added a commit that referenced this issue Aug 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants