-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Decoding of Messages Received from Backend #1
base: main
Are you sure you want to change the base?
Conversation
do { | ||
// Pass the buffer instead of messageSlice because .value messages continue after the first \r\n | ||
let result = try MemcacheBackendMessage.decode(from: &buffer, for: verb) | ||
// TODO: Can we make sure the message was read entirely? Difficult because we don't know the length of VA messages here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if you have ideas!
|
||
extension MemcacheBackendMessage { | ||
struct Flags: MemcacheMessagePayloadDecodable, Equatable, ExpressibleByArrayLiteral { | ||
let flags: [String] // TODO: Do we want something like (Character, token: String?) instead? Or a struct? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we try to parse the flags a bit further in this stage, we can catch malformed messages earlier and it might make parsing the meta-command-specific flags a bit easier. Flags are always a single character followed by an optional token string. Some tokens have a length limit but I don't think it makes sense to try to enforce it in this stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the flags should be an enum with associated values. And I think we should already enforce this here. I think users should be able to never decode anything after this stage and trust the values 100%.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally agree 👍 One potential issue however is that flags can have slightly different meanings depending on the sent command. For example, the N(token)
flags is described as vivify on miss, takes TTL as argument for Meta Get, and as auto create item on miss with supplied TTL for Meta Arithmetic commands. I couldn't find any collisions so we just need to choose good names for the enum cases.
One thing we can't verify here though is which flags can be part of a message. This also depends on the sent command.
Edit: Or do you think we should just be unopinionated and directly use the flag characters as the enum case names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented decoding of flags, I think, as far as possible and added checks for things like data type and length where it makes sense.
// Keep track of the reader index in case we later notice that we need more data | ||
let startReaderIndex = buffer.readerIndex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of handling indexes so much, a better way is normally to get a copy of the bytebuffer.
var peekableBuffer = buffer
// if you were able to decode a buffer, just write the new reader indexes back:
buffer = peakableBuffer
// Peek at the message to read the verb. It is before the first \r\n and before the first <space> if the message | ||
// contains one. | ||
guard let messageSlice = buffer.getCarriageReturnNewlineTerminatedSlice(at: buffer.readerIndex) else { | ||
// reader index wasn't moved, wait for more bytes | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the benefit of finding the first line, if that doesn't ensure that this will be the complete first message? IIUC you are interested in the first VERB
and you can get that by finding either a space or a \r\n
correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes correct. Though being able to read a line does not always indicate that we have a full message. For value messages (format: VA <data block size> <flags>\r\n<data block>\r\n
), we only know how long the message should be when we start parsing the part after the verb.
If we would look for a space first, we might fail to parse the following buffer for example: EN\r\nHD <flags>\r\n
.
Maybe the messageSlice
name is also a bit misleading here 😇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great progress! Added some more comments...
], | ||
targets: [ | ||
.target( | ||
name: "Memcache", | ||
dependencies: [ | ||
.product(name: "NIO", package: "swift-nio") | ||
.product(name: "NIO", package: "swift-nio"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please explicitly import NIOCore
and NIOPosix
. Plain NIO
shall go away with the next major release.
], | ||
targets: [ | ||
.target( | ||
name: "Memcache", | ||
dependencies: [ | ||
.product(name: "NIO", package: "swift-nio") | ||
.product(name: "NIO", package: "swift-nio"), | ||
.product(name: "ExtrasBase64", package: "swift-extras-base64") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH. I would vendor in the decoding/encoding part that you need. Just make sure you mention that you vendor those parts. Also make them internal.
// MARK: - | ||
|
||
extension MemcacheFlag { | ||
enum Code: Character { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need Character
here. I think using UInt8 likely gives us better performance, since we don't need to go through UTF8 validity checks.
struct NumericToken<Value: Numeric>: CustomDebugStringConvertible { | ||
var value: Value | ||
|
||
var debugDescription: String { | ||
"numeric: \(value)" | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the benefit of the NumericToken
type? Why can't we use the Numeric values directly? Add this reasoning to the type...
struct StringToken: ExpressibleByStringLiteral, CustomDebugStringConvertible { | ||
var value: String | ||
|
||
init(stringLiteral value: String) { | ||
self.value = value | ||
} | ||
|
||
var debugDescription: String { | ||
"string: \(value)" | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do we need this wrapper for?
|
||
/// Mode switch token used in Set and Arithmetic commands. | ||
enum ModeToken: RawRepresentable, CustomDebugStringConvertible { | ||
typealias RawValue = Character |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make the RawValue = UInt8
@@ -0,0 +1,59 @@ | |||
import Foundation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you need Foundation
for here?
|
||
// Peek at the message to read the verb. It is before the first \r\n and before the first <space> if the message | ||
// contains one. | ||
guard let textLine = peekableBuffer.getCarriageReturnNewlineTerminatedSlice(at: peekableBuffer.readerIndex) else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would read here instead of getting. In the error cases you can pass the buffer
that you keep unmodified for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you actually need to know the complete text line? Or is the goal really only to get the first verb to then learn how long the message will be?
let verbLength = (textLine.readableBytesView.firstIndex(of: .space) ?? textLine.writerIndex) - textLine.readerIndex | ||
|
||
guard let verbString = textLine.getString(at: textLine.readerIndex, length: verbLength) else { | ||
// If we can't read a string, the text line must be empty (i.e. no characters before the first occurence of \r\n) | ||
throw MemcacheDecodingError.emptyMessageReceived(bytes: peekableBuffer) | ||
} | ||
|
||
guard let verb = MemcacheBackendMessage.Verb(rawValue: verbString) else { | ||
throw MemcacheDecodingError.unknownVerbReceived(messageVerb: verbString, messageBytes: peekableBuffer) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would put this into an extension on ByteBuffer
, that I would call readVerb() throws -> MemcacheBackendMessage.Verb
, which of course moves the readerIndex forward.
line: UInt = #line | ||
) -> Self { | ||
MemcacheDecodingError( | ||
messageVerb: "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the messageVerb should be an optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks really good already. Fabian left some good comments here and I added a few more. If we fix them up and then take another look I think we can make some good progress.
@@ -0,0 +1,53 @@ | |||
import ExtrasBase64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used here?
import ExtrasBase64 | ||
import NIOCore | ||
|
||
struct MemcacheDecodingError: Error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably want to have a public error type at some point. I just left a comment over in the kafka repository with a common error pattern we have established. I think it would be great to adopt this here for the public error type that we use in the end. Might be done in a follow up PR but just leaving this here
self.message = value | ||
} | ||
|
||
static func decode(from buffer: inout ByteBuffer) throws -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make this a method on ByteBuffer
probably like readErrorMessage
/// The following formats can be decoded from the `buffer`: | ||
/// - `<flags>\r\n`. Flags are space-separated strings. | ||
/// - `\r\n`. No flags. | ||
static func decode(from buffer: inout ByteBuffer) throws -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
.split(separator: " ") | ||
.map { flag in | ||
guard let codeCharacter = flag.first, | ||
let code = MemcacheFlag.Code(rawValue: codeCharacter) | ||
else { | ||
throw MemcachePartialDecodingError.fieldNotDecodable(as: MemcacheFlag.Code.self, from: String(flag)) | ||
} | ||
return try .decode(from: flag.dropFirst(), for: code) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should look at that code complexity wise. It is iterating the flagsString
multiple times from what I can see which we could avoid.
/// The message can have the following formats: | ||
/// - `<size> <flags>\r\n<data block>\r\n`. Flags are space-separated strings. | ||
/// - `<size>\r\n<data block>\r\n` | ||
static func decode(from buffer: inout ByteBuffer) throws -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
Description
Implement decoding of the most common messages that can be received from memcached via the Meta Protocol. More information can be found in the protocol reference.
The code still has a few
TODO
s. Some of them just need to be implemented and some are rather questions on the design. I'll comment on the latter ones individually. Happy about all feedback! 🙌To be implemented before merging
ByteBuffer
CRLF extension