Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validations::flush() doesn't use JobQueue (RIPD-1356): #2083

Closed
Closed
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
58 changes: 37 additions & 21 deletions src/ripple/app/misc/Validations.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -429,25 +429,34 @@ class ValidationsImp : public Validations
bool anyNew = false;

JLOG (j_.info()) << "Flushing validations";
ScopedLockType sl (mLock);
for (auto& it: mCurrentValidations)
{
if (it.second)
mStaleValidations.push_back (it.second);

anyNew = true;
}
mCurrentValidations.clear ();
ScopedLockType sl (mLock);
for (auto& it: mCurrentValidations)
{
if (it.second)
{
mStaleValidations.push_back (it.second);
anyNew = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good fix!

}
}
mCurrentValidations.clear ();

if (anyNew)
condWrite ();
// If there isn't a write in progress already, then write to the
// database synchronously.
if (anyNew && !mWriting)
{
mWriting = true;
doWrite (sl);
}

while (mWriting)
{
ScopedUnlockType sul (mLock);
std::this_thread::sleep_for (std::chrono::milliseconds (100));
// Handle the case where flush() is called while a queuedWrite
// is already in progress.
while (mWriting)
{
ScopedUnlockType sul (mLock);
std::this_thread::sleep_for (std::chrono::milliseconds (100));
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears the mWriting is safe to read while not under lock, so unless I'm missing something, you can end the scope of sl after doWrite, and avoid the overhead of taking and releasing the lock in this loop (not to mention fighting with the other thread for the lock).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of the standard is that any unsynchronized access (read or write) of a non-atomic by two different threads is undefined behavior. If my understanding is correct, then since mWriting is a standard bool it needs the lock. Undefined behavior is nobody's friend (despite what the optimizer folks will tell you).

However I could convert mWriting into an std::atomic<bool>and we could release the lock after the call to doWrite. I'm not convinced this would be a big win, since flush() only occurs once during the life of the program. Making mWriting an atomic has the disadvantage that it constrains the optimizer in other parts of the code. But it could certainly be done. Shall I convert mWriting to an atomic?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My vote is to leave the code as Scott has written it. Scott is correct that the read needs to be protected, and this seems better than the atomic<bool> solution considering the interaction with the other parts of the code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds legit to me. The only reason I brought it up was that I noticed yesterday that there are several other instances of mWriting being accessed when not under lock. Looking at it again, it appears that the only time it's not explicitly locked is in in condWrite, and condWrite appears to always be called under lock.

Soooooooo... Maybe do the same trick in doWrite in condWrite so that we don't make this same confusing mistake again?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS. I'll note that my original comment had the disclaimer "unless I'm missing something". I missed something. 😆

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'll pass a ScopedLockReference into condWrite() from it's callers. I thought about doing that when made created doWrite(). At the time I was leaning toward minimal changes, but I think this is a good change to make.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I think it's better to ask the question and miss something than to not ask the question. So thanks for asking.

}

JLOG (j_.debug()) << "Validations flushed";
}

Expand All @@ -458,20 +467,27 @@ class ValidationsImp : public Validations

mWriting = true;
app_.getJobQueue ().addJob (
jtWRITE, "Validations::doWrite",
[this] (Job&) { doWrite(); });
jtWRITE, "Validations::queuedWrite",
[this] (Job&) { queuedWrite(); });
}

void doWrite ()
void queuedWrite ()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can queuedWrite be inlined to the lambda body on Line 471?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's an excellent suggestion. Thanks.

{
auto event = app_.getJobQueue ().getLoadEventAP (jtDISK, "ValidationWrite");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of event? It appears unused.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it saves load/timing information for the ValidationWrite job. getLoadEventAP returns a pointer to https://github.com/ripple/rippled/blob/develop/src/ripple/core/LoadEvent.h. I think you need to hold onto the event for the duration of this block to capture that timing info. Definitely a non-intuitive design.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! Ok. I haven't delved into the LoadEvents much (at all).


std::string insVal ("INSERT INTO Validations "
ScopedLockType sl (mLock);
doWrite (sl);
}

// NOTE: doWrite() must be called with mLock *locked*. The passed
// ScopedLockType& acts as a reminder to future maintainers.
void doWrite (ScopedLockType& sl)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the purpose of passing in the scoped lock to doWrite? I'm not able to determine why its needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing in the reference to the lock is not required by the functionality. It's a way of signaling to maintainers that the lock must be held when doWrite() is called. I suppose a comment to that effect would be useful as well 😃.

{
std::string const insVal ("INSERT INTO Validations "
"(InitialSeq, LedgerSeq, LedgerHash,NodePubKey,SignTime,RawData) "
"VALUES (:initialSeq, :ledgerSeq, :ledgerHash,:nodePubKey,:signTime,:rawData);");
std::string findSeq("SELECT LedgerSeq FROM Ledgers WHERE Ledgerhash=:ledgerHash;");
std::string const findSeq("SELECT LedgerSeq FROM Ledgers WHERE Ledgerhash=:ledgerHash;");

ScopedLockType sl (mLock);
assert (mWriting);

while (!mStaleValidations.empty ())
Expand Down