-
Notifications
You must be signed in to change notification settings - Fork 867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify docs of binary and string builders #2699
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, just a minor suggestion
/// `item_capacity` is the number of items to pre-allocate space for in this builder | ||
/// Creates a new [`GenericStringBuilder`]. | ||
/// | ||
/// - `item_capacity` is the number of items to pre-allocate (the size of the buffer of offsets). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// - `item_capacity` is the number of items to pre-allocate (the size of the buffer of offsets). | |
/// - `item_capacity` is the number of items to pre-allocate (i.e. the final length of the array). |
Or something, the offsets is actually one larger than this 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good point, thanks a lot for the feedback. I added a note about the size of the buffer being one larger, I personally think this should make things very clear, but let me know if you have any other comment.
/// `data_capacity` is the number of bytes to pre-allocate space for in this builder | ||
/// Creates a new [`GenericBinaryBuilder`]. | ||
/// | ||
/// - `item_capacity` is the number of items to pre-allocate (the size of the buffer of offsets). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// - `item_capacity` is the number of items to pre-allocate (the size of the buffer of offsets). | |
/// - `item_capacity` is the number of items to pre-allocate (i.e. the final length of the array). |
Benchmark runs are scheduled for baseline = 8206f01 and contender = e646ae8. e646ae8 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
None
Rationale for this change
When reading the docs for the
with_capacity
method ofGenericBinaryBuilder
andGenericStringBuilder
it gave me the impression that sinceitem_capacity
is specified,data_capacity
could be the number of bytes per string or item, not the total data to pre-allocate. For someone who understands the internal details of how a string array is stored this may be quite obvious, but for someone new who is reading this after thewith_capacity
of for example an int type, I think it's confusing.What changes are included in this PR?
I updated the docs to clarify the above, and avoid ambiguity and anything that can mislead. Also made some related minor changes for consistency, like specifying that the
append_
methods of the builder append to the builder instead of the array (kind of the same, but having both mixed feels confusing).