-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: initial KMP XML reader implementation #601
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very clean!
* A stream of text characters that can be processed sequentially. This stream maintains a current position (i.e., | ||
* offset in the string) from which all reading operations begin. The stream is advanced by `read` operations. The | ||
* stream is **not** advanced by `peek` operations. | ||
* @param bytes The source bytes for this stream (which will be decoded to a string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
outdated comment
/** | ||
* A stateful scanner that scans a [StringTextStream] and reads [XmlToken] elements. | ||
*/ | ||
class XmlScanner(internal val source: StringTextStream) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
This seems like it is should be the one named XmlLexer
since it doing the actual lexical analysis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, originally this was all inside what's now called XmlLexer
. Will rename this class to XmlLexer
and rename the other class to LexingXmlStreamReader
(since it's primarily an adapter for the XmlStreamReader
interface and uses a lexer as its primary delegate).
* @param length The number of characters to read. | ||
* @param errMessage A provider of an error message to include in the exception. | ||
*/ | ||
fun readOrThrow(errMessage: () -> String): Char = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this is all internal
we could probably inline all of this to avoid function call per char. I'd wait for the benchmarks of course but something to look into
val start = source.readOrThrow { "Unexpected end-of-doc while trying to read name" } | ||
if (!start.isValidForNameStart()) error("Invalid start character for name '$start'") | ||
val subsequent = source.readWhile { it.isValidForName() } | ||
return "$start$subsequent".qualify() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couldn't this just do
val start = source.peekOrThrow { "Unexepcted end-of-doc while trying to read name" }
if (!start.isValidForNameStart()) error(...)
val name = source.readWhile { it.isValidForName() }
return name.qualify()
@@ -14,7 +14,7 @@ import org.xmlpull.v1.XmlPullParserFactory | |||
import java.io.ByteArrayInputStream | |||
import java.nio.charset.Charset | |||
|
|||
actual fun xmlStreamReader(payload: ByteArray): XmlStreamReader = | |||
actual fun xmlPull(payload: ByteArray): XmlStreamReader = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this not be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet, I'm leaving temporarily it so I can include it in my benchmarking. It'll be removed in a future PR that cleans up all of XmlPull.
/** | ||
* Describes an internal state of a [XmlScanner]. | ||
*/ | ||
sealed class ScannerState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
internal?
SonarCloud Quality Gate failed. 0 Bugs No Coverage information |
Issue #
Addresses aws-sdk-kotlin#538. Closes #519.
Description of changes
This is the initial implementation of a KMP XML reader conforming to the
XmlStreamReader
interface. There are a few TODOs remaining and no performance tests have been completed. XML writing is also not yet updated.The new XML reader consists of these classes:
StringTextStream
: a character stream used for peeking and reading data from an underlying stream (usually decoded from bytes)XmlLexer
: produces XML tokens from aStringTextStream
LexingXmlStreamReader
: the top-level implementation ofXmlStreamReader
which layers support for peeking, subtrees, etc., on top of anXmlLexer
The new whitespace handling allows a few disabled protocol tests to succeed, see companion PR aws-sdk-kotlin#548.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.