Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

database is locked #274

Closed
stevenh opened this issue Feb 21, 2016 · 39 comments
Closed

database is locked #274

stevenh opened this issue Feb 21, 2016 · 39 comments

Comments

@stevenh
Copy link

stevenh commented Feb 21, 2016

We have a little app which has multiple goroutines all running queries and it often fails with "database is locked" :(

we're running with:

db, err := sql.Open("sqlite3", "test.db?cache=shared&mode=rwc")
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;

sqlite3 -version
3.10.2 2016-01-20 15:27:19 17efb4209f97fb4971656086b138599a91a75ff9

Is this a known issue?

@mattn
Copy link
Owner

mattn commented Feb 22, 2016

did you try _busy_timeout ?

@stevenh
Copy link
Author

stevenh commented Feb 22, 2016

I wasn't aware of this, so no I hadn't.

I would however has expected busy_timeout to only effect table locks but the error says "database is locked" is the error misleading?

@mrnugget
Copy link

mrnugget commented Mar 3, 2016

Just chiming in here, because I keep running into the same problem. I created a gist that can pretty reliably demonstrate the "database is locked" error. In this program I open 1 goroutine that writes 5000 times to one table (users) and another goroutine that spawns 5000 goroutines that read from another table.

db, err := sql.Open("sqlite3", "database_file.sqlite?cache=shared&mode=rwc") doesn't fix the problem, and adding _busy_timeout makes the program really, really slow.

What helped was using db.SetMaxOpenConns(1). Which is the same as wrapping a mutex around every DB access.

I really want to understand what's going on here and what's possible with sqlite3 in Go and what's not. My guess is that the database is locked error isn't a problem in sqlite (which explicitly allows writing to and reading from different tables at the same time), but rather a combination of sqlite's thread-safety, Go's database/sql connection pooling and go-sqlite3 connection management.

@mattn
Copy link
Owner

mattn commented Mar 3, 2016

Try to use transaction.

https://gist.github.com/392999fb7472906afec8

@mrnugget
Copy link

mrnugget commented Mar 3, 2016

That still results in "database is locked" errors and I think it's even more than before.

@stevenh
Copy link
Author

stevenh commented Mar 3, 2016

Yer, I think the issue is with multiple threads and this message is equivalent to MySQL's deadlock detected, at least in our case, just MySQL's message is much clearer as it doesn't have multiple meanings which seems to be the case with sqlite.

As none of the suggestions resolved the issue we've had to move away from sqlite unfortunately.

I'll leave this bug open for others to contribute to, as its clearly wider spread.

@mrnugget
Copy link

mrnugget commented Mar 4, 2016

After some more reading and investigating, I think there is no proper solution where we can use multiple sql.DB connections in combination with sqlite3. Why?

  • The thread-safety guarantees sqlite makes only cover the case where you have one sqlite3 *db in your program and access it from different threads. sqlite can then coordinate access to the database in its own functions.
  • If we don't limit the amount of sql.DB connections (which is the database/sql default), we create multiple sqlite3 *db instances in our program. These act independently from each other. Now each connection has to do its own synchronization and they have to synchronise with the the other connections. We're now essentially in "multiple processes accessing the same file"-land. The sqlite FAQ describes this:

SQLite allows multiple processes to have the database file open at once, and for multiple processes to read the database at once. When any process wants to write, it must lock the entire database file for the duration of its update. But that normally only takes a few milliseconds. Other processes just wait on the writer to finish then continue about their business.

(Highlight by me)

The difference between sqlite and client/server databases (PostgreSQL, MySQL, ...) is that sqlite3 synchronizes access to the database file on the client-side, which is only possible if multiple threads/goroutines use the same client. If multiple clients (processes, goroutines with their own connection) use the same database file, the synchronization has to happen through the database file, in which case it's locked completely for every write action.

So I'd say that the only solution is to use db.SetMaxOpenConns(1) (or use the busy timeout).

If anything of what I just said is wrong, please feel free to point it out.

EDIT:

Basically everything I said above is wrong.

Apparently what I said about "thread-safety guarantees" only in "shared db instance across threads" is wrong. I found some sample code that shows usage of multiple db instances across multiple threads. Apparently that is the recommended way to go.

This sqlite wiki entry also shows how sqlite behaves when used in multiple threads with multiple database instances: a lot of SQLITE_BUSY/"database is locked" errors. The conclusion on the wiki page is as follows:

Make sure you're compiling SQLite with -DTHREADSAFE=1.
Make sure that each thread opens the database file and keeps its own sqlite structure.
Make sure you handle the likely possibility that one or more threads collide when they access the db file at the same time: handle SQLITE_BUSY appropriately.

Using one connection across multiple goroutines by using db.SetMaxOpenConns(1) and letting database/sql synchronize access to the shared connection with its mutexes might cause problems, since connections in database/sql are stateful, which means that a db.Begin() in one goroutine effects other goroutines.

To update the conclusion at the top of the original version of this comment:

I think there is no proper solution where we can use multiple sql.DB connections in combination with sqlite3 in Go without handling "database is locked" by retrying or by incrementing the busy_timeout. It seems like this is expected behaviour when using sqlite in a multi-threaded context.

mrnugget added a commit to applikatoni/applikatoni that referenced this issue Mar 5, 2016
The reason for this commit is this issue:

    #35

We ran into "database is locked" errors because of webhooks trying to
read from the database on every log entry they received -- all the while
are log entries being saved to the database.

This change increases the busy timeout of the 5sec default value to
30sec. We'll see how that plays out. First tests confirm that it might
slow the deployment process down but the errors disappear since the
queries are retried.

For more information, start reading here:

* #35
* mattn/go-sqlite3#274
* https://gist.github.com/mrnugget/0eda3b2b53a70fa4a894
@sqweek
Copy link
Contributor

sqweek commented Apr 19, 2016

Yer, I think the issue is with multiple threads and this message is equivalent to MySQL's deadlock detected, at least in our case, just MySQL's message is much clearer as it doesn't have multiple meanings which seems to be the case with sqlite.

sqlite's "database is locked" error doesn't have multiple meanings. It means "I tried to obtain a lock on the database but failed because a separate process/connection held a conflicting lock". It's certainly not a deadlock, and is 100% expected in code creating multiple connections to the same DB.

My guess is that the database is locked error isn't a problem in sqlite (which explicitly allows writing to and reading from different tables at the same time),

To be more specific, sqlite's locking protocol allows for many readers and exactly one writer at any given time. ie. if you attempt to start two write transactions concurrently, the second will encounter SQLITE_BUSY.

In rollback-journal mode, a read transaction will also encounter SQLITE_BUSY if it starts just as a write transaction enters its COMMIT phase - as hinted at in previous comments sqlite locks the entire database for the duration of COMMIT. This doesn't happen in WAL mode, because COMMIT doesn't need to update the main database file (just the write ahead journal).

Anyway I think you came to the same conclusion, that SQLITE_BUSY is unavoidable when using multiple connections to the same database. Shared cache mode helps by introducing finer locks (table-level instead of database-level), but still if you have two transactions trying to update the same table at once the second will get SQLITE_BUSY.

Setting the busy timeout is one way to cope with the situation - if your timeout is longer than the longest write transaction performed by your application that will mitigate many of the errors.

But even then, you can still easily get SQLITE_BUSY via a transaction along the lines of:

BEGIN;
SELECT MAX(x) FROM table1;
INSERT INTO table1 VALUES(x+1);
COMMIT;

If there is already a write transaction in progress on table1 when this INSERT statement runs, sqlite will immediately return SQLITE_BUSY, without waiting for the busy timeout.

In rollback-journal mode, this is because sqlite knows it has a read-lock (from the earlier SELECT), so waiting is just going to hold up the existing write transaction.

In WAL mode, this is because the earlier SELECT has the effect of pinning this transaction's view of the database to that point in time. sqlite knows the existing write transaction is going to update database potentially invalidating the SELECT results, and doesn't let you write to the database from this view of the past.

For this conflicting writer case, the proper course of action is for the application to ROLLBACK the transaction, and potentially try again once the other write transaction has completed.

I don't think there is any issue with go-sqlite3 here.

@Klowner
Copy link

Klowner commented Jul 5, 2016

fwiw, I stumbled across this ticket and subsequently found http://stackoverflow.com/questions/32479071/sqlite3-error-database-is-locked-in-golang

After the addition rows.Close() calls to my code I don't seem to be experiencing the database is locked issue.

@jrots
Copy link

jrots commented Jul 15, 2016

Still experiencing this issue atm..
So what is the best solution/ practices here to avoid these "locked issues" ?

  • creating a writer connection with : db.SetMaxOpenConns(1) +
  • a separate reader connection for selects without SetMaxOpenConns() ?
  • avoid using tx.Begin() and Commit() as it will block other connections?

@sqweek
Copy link
Contributor

sqweek commented Jul 15, 2016

@jrots sqlite allows multiple readers but only a single writer at any one time. If you have multiple connections to the same DB, you will inevitably run into database is locked. This is an unavoidable consequence of sqlite's locking model, and therefore something your application needs to deal with. Possible solutions include:

  1. Only ever open a single connection to the DB (but note this can still result in multiple connections to the DB if a user runs multiple instances of your application at once)
  2. Set the busy timeout to something reasonably high (which causes sqlite to retry lock acquisition for a certain amount of time before returning database is locked). In go-sqlite3 this is achieved by appending eg. ?_busy_timeout=5000 to the filename passed to sql.Open, where 5000 is the number of milliseconds to retry.

Also note that tx.Begin() and tx.Commit() are currently affected by issue #184 which can result in a connection getting stuck in an inconsistent state if SQLITE_BUSY is encountered while commit is in progress. But setting an appropriate busy timeout should avoid this situation also (unless you have excessively long lived transactions in which case you might want to rethink your design and/or ensure that you're closing all your transaction/result objects in a timely manner).

@jrots
Copy link

jrots commented Jul 15, 2016

@sqweek Thx for the thorough explanation! Are you also using db.SetMaxOpenConns(1) when initialising the connection or just avoid having that setting?

update: ok reread your answer & you probably don't use that, just one initialisation and let it underlying create multiple connections without that setting

@roxma
Copy link

roxma commented Apr 4, 2017

@mrnugget

since connections in database/sql are stateful, which means that a db.Begin() in one goroutine effects other goroutines.

I'm confused. I can't find the offical database/sql documentation on this, so here's what I found and looks sensible to me:

http://go-database-sql.org/modifying.html

In Go, a transaction is essentially an object that reserves a connection to the datastore

Which implies that the goroutine which calls the db.Begin() will own the connection, until the transaction ends with db.Commit() or db.Callback()

If it is true, how does this behavior affectes other groutines?

@grj1046
Copy link

grj1046 commented May 17, 2017

at windows 10, it can run very well, but in centos7.3 database is locked error random occur. _busy_timeout is invalid. but db.SetMaxOpenConns(1) looks run well.

@peter-kankowski
Copy link

peter-kankowski commented Jul 5, 2017

@mattn You should change this line in sqlite3.go:

#cgo CFLAGS: -DSQLITE_ENABLE_RTREE -DSQLITE_THREADSAFE

as the following:

#cgo CFLAGS: -DSQLITE_ENABLE_RTREE -DSQLITE_THREADSAFE=1

It fixes the problem. SQLITE_THREADSAFE must be 1 because a single DB connection can be used from multiple threads when using it from multiple goroutines. See https://www.sqlite.org/threadsafe.html and https://sqlite.org/compile.html#threadsafe

https://golang.org/src/database/sql/doc.txt says:

Given an *sql.DB, it should be possible to share that instance between multiple goroutines, without any extra synchronization.

@mattn mattn closed this as completed in acfa601 Jul 5, 2017
@mattn
Copy link
Owner

mattn commented Jul 5, 2017

@sqlitefan Thank you!

If you still have issue, please reopen this. 👍

@tonivj5
Copy link

tonivj5 commented Jul 6, 2017

It seems the problem isn't fixed yet... 😢

@mattn
Copy link
Owner

mattn commented Jul 7, 2017

Well, I don't make sure what is your issue but you can't avoid busy timeout. SQLite is not provide infinite blocking.

@runderwood
Copy link

Not sure if this is documented elsewhere, but I don't think the busy timeout makes any difference in my application unless I set the transaction lock parameter to "immediate" (so that the BEGIN statements are BEGIN IMMEDIATE, I think).

Once that's set, the busy timeout is in effect, and things work much better.

Maybe that's obvious to others, but it wasn't to me. So I thought I'd mention it here in case anybody else comes looking for it.

Also possible I'm misunderstanding what I'm seeing.

@rittneje
Copy link
Collaborator

rittneje commented Feb 14, 2023

@runderwood That is not the case. With a rollback journal, the busy handler will be invoked if you attempt to read while another connection is writing, or vice versa. With WAL, those situations are not generally blocking so you won't normally observe the handler getting invoked. In both modes, starting an immediate transaction while another connection holds a write lock will invoke the busy handler.

Note that there are some cases that can arise where SQLite will detect that invoking the busy handler is pointless and thus returns SQLITE_BUSY without invoking it. This in particular can happen if you use BEGIN or BEGIN DEFERRED and your transaction first reads, then another connection writes, and then your transaction tries to write. For this reason, it is best to use BEGIN IMMEDIATE for any transactions that could write.

@runderwood
Copy link

runderwood commented Feb 14, 2023

I feel like we're saying the same thing, at least mostly.

But I can confidently say that, in my application, in WAL journal mode, if my transactions are initiated with the default (plain BEGIN), then the busy timeout seems to never be used. I get many immediate "database is locked" errors with concurrent write attempts.

When I set this library's "_tx_lock" option to "immediate" (which initiates transactions with BEGIN IMMEDIATE), the busy timeout is observed and I see far, far fewer if any "database is locked" errors.

As I read your comment, particularly the last sentence, I feel like you're confirming that this is the behavior one ought to expect.

Of special interest is the scenario you describe -- which perfectly matches the situation that led me to here:

Note that there are some cases that can arise where SQLite will detect that invoking the busy handler is pointless and thus returns SQLITE_BUSY without invoking it. This in particular can happen if you use BEGIN or BEGIN DEFERRED and your transaction first reads, then another connection writes, and then your transaction tries to write.

I believe this is what was happening to trigger immediate "database is locked" responses in my application. And, as you note, reconfiguring this to use BEGIN IMMEDIATE improves the outcome immensely (insofar as the busy timeout is actually in play).

My purpose in posting here is this: reading through the docs and the comments here, it was not apparent to me that setting the busy timeout without changing the _tx_lock parameter to "immediate" would not make an appreciable difference.

@rittneje
Copy link
Collaborator

in WAL journal mode, if my transactions are initiated with the default (plain BEGIN), then the busy timeout seems to never be used

This is usually true, because WAL lifts the restrictions on concurrent reads/writes. However, if you use BEGIN (or BEGIN DEFERRED), and the very first operation is to write, then the busy handler will be invoked if some other connection currently holds a write lock.

In other words, it is not technically the case that the busy handler only has an effect when you use BEGIN IMMEDIATE. It's just (a) the other cases where you can get SQLITE_BUSY in WAL mode are less likely to occur, and (b) the situation I described above skips the handler entirely (regardless of whether you are using WAL or rollback journal). Thus it may give the false impression that it does.

Now, to be clear, there has been a suggestion in the past to use shared cache mode to avoid some "database is locked" issues, such as the aforementioned one. However, this is bad advice. Do not use shared cache mode. It only pushes the problem around, and can lead to similar looking "database table is locked" errors. The proper fix is (1) to use BEGIN IMMEDIATE for any transactions that may write to the database, and (2) use an explicit *sql.Tx in Go if you will need to write to the database while iterating through a result set.

@runderwood
Copy link

So, just to understand more clearly: why do concurrent transactions initiated with plain BEGIN and which first read then write not invoke the busy handler?

@rittneje
Copy link
Collaborator

The reason is that SQLite (the C library) is acting on limited information and making some pessimistic assumptions:

  1. the data that you read may have been changed by the other writer
  2. the data that you read may have factored into your decision of what to write

Consequently it cannot allow the transaction to succeed, since doing so would violate ACID (namely, isolation). Thus there is no point in invoking the busy handler, since the result will always be the same.

@runderwood
Copy link

That makes sense and is consistent with my experience.

Had I read the SQLite transaction docs more closely, I would have sooner realized that the default transaction mode is DEFERRED and that the behavior I've observed (and which you've described) is exactly as one should expect:

The default transaction behavior is DEFERRED...DEFERRED means that the transaction does not actually start until the database is first accessed. Internally, the BEGIN DEFERRED statement merely sets a flag on the database connection that turns off the automatic commit that would normally occur when the last statement finishes. This causes the transaction that is automatically started to persist until an explicit COMMIT or ROLLBACK or until a rollback is provoked by an error or an ON CONFLICT ROLLBACK clause. If the first statement after BEGIN DEFERRED is a SELECT, then a read transaction is started. Subsequent write statements will upgrade the transaction to a write transaction if possible, or return SQLITE_BUSY. If the first statement after BEGIN DEFERRED is a write statement, then a write transaction is started.

👍

hermitpopcorn added a commit to hermitpopcorn/decatholac-mango that referenced this issue Mar 19, 2023
vrothberg added a commit to vrothberg/libpod that referenced this issue Apr 25, 2023
According to an old upstream issue [1]: "If the first statement after
BEGIN DEFERRED is a SELECT, then a read transaction is started.
Subsequent write statements will upgrade the transaction to a write
transaction if possible, or return SQLITE_BUSY."

So let's move the first SELECT under the same transaction as the table
initialization.

[NO NEW TESTS NEEDED] as it's a hard to cause race.

[1] mattn/go-sqlite3#274 (comment)

Fixes: containers#17859
Signed-off-by: Valentin Rothberg <[email protected]>
vrothberg added a commit to vrothberg/libpod that referenced this issue May 9, 2023
As shown in containers#17831, WAL mode plays a role in causing `database is locked`
errors.  Those are errors, in theory, should not happen as the DB should
busy wait.  mattn/go-sqlite3/issues/274 has some comments indicating
that the busy handler behaves differently in WAL mode which may be an
explanation to the error.

For now, let's disable WAL mode and only re-enable it when we have
clearer understanding of what's going on.  The upstream issue along with
the SQLite documentation do not give me the clear guidance that I would
need.

[NO NEW TESTS NEEDED] - flake is only reproducible in CI.

Fixes: containers#18356
Signed-off-by: Valentin Rothberg <[email protected]>
@meetme2meat
Copy link

meetme2meat commented Jun 21, 2023

@sqweek not solve it since sqlite3 run in serialised mode when THREADSAFE=1 is used

#cgo CFLAGS: -DSQLITE_ENABLE_RTREE -DSQLITE_THREADSAFE=1

@farqis
Copy link

farqis commented Oct 28, 2023

Just chiming in here, because I keep running into the same problem. I created a gist that can pretty reliably demonstrate the "database is locked" error.

It seems the issue is with using double quotes instead of single quotes when providing string values (in your example you used "Bobby". Try 'Bobby' instead)

I was getting the same error even without any routines or threads. Using single-quotes fixed the issue.

@im-karina
Copy link

im-karina commented Oct 17, 2024

Thank you so much for your comment, Klowner. You saved me. I was considering moving off of sqlite because of this.

After swapping most of my queries away from using rows (I am fine with fetching all records at once) and auditing the stragglers to make sure they get Close()d in time, all of the issues I've been having are resolved.

speculation:

I think I remember now that sqlite can fetch row-by-row in ways that traditional databases cannot. More specifically, because sqlite can fetch a row at a time from the disk, it doesn't fetch every row from the disk unless required. So essentially, if you do a query that returns sql.Rows, the query isn't done until the sql.Rows are closed. But in my case, because I was triggering another query based on the results of each row, I was essentially guaranteeing this broken behavior, because I was trying to execute a second query before my first was finished, and they were both in the same goroutine so no amount of waiting or retries would fix this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests