PYTHON-1369 Extend driver vector support to arbitrary subtypes and fix handling of variable length types (OSS C* 5.0) #1217
Merged
Changes from 8 commits
25 commits
d153c61 Initial commit of unit test (absurdfarce)
69f54b0 What appears to be a working test now (absurdfarce)
72a27ac Some test refinements (absurdfarce)
fe7a3b5 Seems to be working now (absurdfarce)
c295a8c Added tests for string, map and vector subtype cases (absurdfarce)
d0d5983 We have most things working now. Vector of vectors still seems to be… (absurdfarce)
3534876 Fix VectorType.cql_parameterized_type() to properly handle vectors of… (absurdfarce)
a0791b0 Fix error in test (absurdfarce)
f45d7df Test fixes (absurdfarce)
daa54f1 Removing test client. This will eventually come back (in the form of… (absurdfarce)
07c86bb Remove custom exception type added in PYTHON-1371 (absurdfarce)
3363d16 Allowing user to pass in custom libev includes and libs via env vars (absurdfarce)
1cba1b9 Revert "Allowing user to pass in custom libev includes and libs via e… (absurdfarce)
44966c9 Merge branch 'master' into python1369 (absurdfarce)
dcb008f Initial sketch of what the bones of an integration test might look like (absurdfarce)
8442743 Just moving some things around (absurdfarce)
f35dcda Short is (incorrectly) marked as a variable size type on the server side (absurdfarce)
4f6bef8 Add support for Decimal types as positional params (absurdfarce)
e23e425 Passing test with basic integer and floating point types (absurdfarce)
e101213 A few more fixed size types we missed (absurdfarce)
1840a6d Passing test which covers everything except UDTs (absurdfarce)
1f7dd90 Testing for UDTs now included (absurdfarce)
c053b67 Explicitly throw ValueErrors when deserializing vectors with too litt… (absurdfarce)
16cef42 Some minor cleanup of test fn names (absurdfarce)
f996a7d Added checks for vector serialize op (and tests) (absurdfarce)
@@ -111,7 +111,6 @@ def vints_unpack(term): # noqa

    return tuple(values)


def vints_pack(values):
    revbytes = bytearray()
    values = [int(v) for v in values[::-1]]
@@ -143,3 +142,48 @@ def vints_pack(values):

    revbytes.reverse()
    return bytes(revbytes)

def uvint_unpack(bytes):
    first_byte = bytes[0]

    if (first_byte & 128) == 0:
        return (first_byte,1)

    num_extra_bytes = 8 - (~first_byte & 0xff).bit_length()
    rv = first_byte & (0xff >> num_extra_bytes)
    for idx in range(1,num_extra_bytes + 1):
        new_byte = bytes[idx]
        rv <<= 8
        rv |= new_byte & 0xff

    return (rv, num_extra_bytes + 1)

def uvint_pack(val):
    rv = bytearray()
    if val < 128:
        rv.append(val)
    else:
        v = val
        num_extra_bytes = 0
        num_bits = v.bit_length()
        # We need to reserve (num_extra_bytes+1) bits in the first byte
        # ie. with 1 extra byte, the first byte needs to be something like '10XXXXXX' # 2 bits reserved
        # ie. with 8 extra bytes, the first byte needs to be '11111111' # 8 bits reserved
        reserved_bits = num_extra_bytes + 1
        while num_bits > (8-(reserved_bits)):
            num_extra_bytes += 1
            num_bits -= 8
            reserved_bits = min(num_extra_bytes + 1, 8)
            rv.append(v & 0xff)
            v >>= 8

        if num_extra_bytes > 8:
            raise ValueError('Value %d is too big and cannot be encoded as vint' % val)

        # We can now store the last bits in the first byte
        n = 8 - num_extra_bytes
        v |= (0xff >> n << n)
        rv.append(abs(v))

    rv.reverse()
    return bytes(rv)
Adapted from vints_pack and vints_unpack above. Those functions are different enough (built-in zig-zag encoding + tuples for incoming/outgoing data) that it seemed worthwhile to live with the code duplication here.
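For readers following the encoding, here is a minimal, self-contained sketch of the unsigned vint layout that uvint_pack and uvint_unpack implement: the number of leading 1 bits in the first byte gives the number of extra bytes that follow. The names encode_uvint, decode_uvint and decode_all, and the offset parameter, are hypothetical and not part of the PR. The buffer walk in decode_all is an assumed use of the (value, bytes consumed) return shape, not something shown in this diff.

```python
def encode_uvint(value):
    """Encode a non-negative int as an unsigned vint (hypothetical helper)."""
    if value < 0:
        raise ValueError("unsigned vints cannot encode negative values")
    if value < 128:
        # Single byte: the high bit is 0, the low 7 bits carry the value.
        return bytes([value])
    # With n extra bytes the first byte starts with n one-bits, which leaves
    # max(7 - n, 0) payload bits there plus 8 * n bits in the extra bytes.
    for extra in range(1, 9):
        capacity = 8 * extra + max(7 - extra, 0)
        if value.bit_length() <= capacity:
            break
    else:
        raise ValueError("value %d is too big to encode as a vint" % value)
    prefix = (0xFF << (8 - extra)) & 0xFF       # n leading one-bits
    first = prefix | (value >> (8 * extra))     # high payload bits
    low = value & ((1 << (8 * extra)) - 1)      # remaining payload
    return bytes([first]) + low.to_bytes(extra, "big")


def decode_uvint(buf, offset=0):
    """Return (value, bytes_consumed) for the vint starting at offset."""
    first = buf[offset]
    if (first & 0x80) == 0:
        return first, 1
    # Count the leading one-bits of the first byte.
    extra = 8 - ((~first) & 0xFF).bit_length()
    value = first & (0xFF >> extra)
    for b in buf[offset + 1 : offset + 1 + extra]:
        value = (value << 8) | b
    return value, extra + 1


def decode_all(buf):
    """Walk a buffer of back-to-back vints (assumed usage, for illustration)."""
    values, offset = [], 0
    while offset < len(buf):
        value, consumed = decode_uvint(buf, offset)
        values.append(value)
        offset += consumed
    return values


packed = b"".join(encode_uvint(v) for v in [1, 300, 2 ** 40])
print(decode_all(packed))
```

Returning the consumed byte count alongside the value is what lets a caller advance through a buffer without knowing element widths in advance.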
@@ -0,0 +1,123 @@
import logging
import unittest

from cassandra.cluster import Cluster, Session

class Python1369Test(unittest.TestCase):

    def setUp(self):
        #log = logging.getLogger()
        #log.setLevel('DEBUG')

        #handler = logging.StreamHandler()
        #handler.setFormatter(logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s"))
        #log.addHandler(handler)

        self.cluster = Cluster(['127.0.0.1'])
        self.session = self.cluster.connect()
        self.session.execute("drop keyspace if exists test")
        ks_stmt = """CREATE KEYSPACE test
                      WITH REPLICATION = {
                        'class' : 'SimpleStrategy',
                        'replication_factor' : 1
                      }"""
        self.session.execute(ks_stmt)

    def _create_table(self, subtype):
        table_stmt = """CREATE TABLE test.foo (
                          i int PRIMARY KEY,
                          j vector<%s, 3>
                        )""" % (subtype,)
        self.session.execute(table_stmt)

    def _populate_table(self, data):
        for k,v in data.items():
            self.session.execute("insert into test.foo (i,j) values (%d,%s)" % (k,v))

    def _populate_table_prepared(self, data):
        ps = self.session.prepare("insert into test.foo (i,j) values (?,?)")
        for k,v in data.items():
            self.session.execute(ps, [k,v])

    def _create_and_populate_table(self, subtype="float", data={}):
        self._create_table(subtype)
        self._populate_table(data)

    def _create_and_populate_table_preapred(self, subtype="float", data={}):
        self._create_table(subtype)
        self._populate_table_prepared(data)

    def _execute_test(self, expected, test_fn):
        rs = self.session.execute("select j from test.foo where i = 2")
        rows = rs.all()
        self.assertEqual(len(rows), 1)
        observed = rows[0].j
        for idx in range(0, 3):
            test_fn(observed[idx], expected[idx])

    def test_float_vector(self):
        self.session.execute("drop table if exists test.foo")
        def test_fn(observed, expected):
            self.assertAlmostEqual(observed, expected, places=5)
        expected = [1.2, 3.4, 5.6]
        data = {1:[8, 2.3, 58], 2:expected, 5:[23, 18, 3.9]}
        self._create_and_populate_table(subtype="float", data=data)
        self._execute_test(expected, test_fn)

    def test_float_vector_prepared(self):
        self.session.execute("drop table if exists test.foo")
        def test_fn(observed, expected):
            self.assertAlmostEqual(observed, expected, places=5)
        expected = [1.2, 3.4, 5.6]
        data = {1:[8, 2.3, 58], 2:expected, 5:[23, 18, 3.9]}
        self._create_and_populate_table_preapred(subtype="float", data=data)
        self._execute_test(expected, test_fn)

    def test_varint_vector(self):
        self.session.execute("drop table if exists test.foo")
        def test_fn(observed, expected):
            self.assertEqual(observed, expected)
        expected=[1, 3, 5]
        data = {1:[8, 2, 58], 2:expected, 5:[23, 18, 3]}
        self._create_and_populate_table(subtype="varint", data=data)
        self._execute_test(expected, test_fn)

    def test_varint_vector_prepared(self):
        self.session.execute("drop table if exists test.foo")
        def test_fn(observed, expected):
            self.assertEqual(observed, expected)
        expected=[1, 3, 5]
        data = {1:[8, 2, 58], 2:expected, 5:[23, 18, 3]}
        self._create_and_populate_table_preapred(subtype="varint", data=data)
        self._execute_test(expected, test_fn)

    def test_string_vector(self):
        self.session.execute("drop table if exists test.foo")
        def test_fn(observed, expected):
            self.assertEqual(observed, expected)
        expected=["foo", "bar", "baz"]
        data = {1:["a","b","c"], 2:expected, 5:["x","y","z"]}
        self._create_and_populate_table(subtype="text", data=data)
        self._execute_test(expected, test_fn)

    def test_map_vector(self):
        self.session.execute("drop table if exists test.foo")
        def test_fn(observed, expected):
            self.assertEqual(observed, expected)
        expected=[{"foo":1}, {"bar":2}, {"baz":3}]
        data = {1:[{"a":1},{"b":2},{"c":3}], 2:expected, 5:[{"x":1},{"y":2},{"z":3}]}
        self._create_table("map<text,int>")
        for k,v in data.items():
            self.session.execute("insert into test.foo (i,j) values (%s,%s)", (k,v))
        self._execute_test(expected, test_fn)

    def test_vector_of_vector(self):
        def test_fn(observed, expected):
            self.assertEqual(observed, expected)
        expected=[[1,2], [4,5], [7,8]]
        data = {1:[[10,20], [40,50], [70,80]], 2:expected, 5:[[100,200], [400,500], [700,800]]}
        self._create_table("vector<int,2>")
        for k,v in data.items():
            self.session.execute("insert into test.foo (i,j) values (%s,%s)", (k,v))
        self._execute_test(expected, test_fn)
        #self.session.execute("drop table test.foo")
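The float tests compare with assertAlmostEqual(..., places=5) rather than assertEqual. A likely reason (an assumption, since the PR does not state it) is that a CQL float is a 32-bit IEEE 754 value, so a Python float such as 1.2 does not survive the round trip bit for bit. The sketch below mimics that narrowing with the standard struct module only, without a cluster; roundtrip_float32 is a hypothetical helper.

```python
import struct


def roundtrip_float32(value):
    """Narrow a Python float to 32 bits and widen it back (hypothetical helper)."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


expected = [1.2, 3.4, 5.6]
observed = [roundtrip_float32(v) for v in expected]

for o, e in zip(observed, expected):
    exact = (o == e)
    # assertAlmostEqual(o, e, places=5) passes when this rounds to zero.
    almost = (round(abs(o - e), 5) == 0)
    print(e, o, exact, almost)
```

None of the three expected values is exactly representable in 32 bits, so the exact comparison fails while the five-decimal-place comparison passes. The varint, text and map tests have no such narrowing and can use assertEqual.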
Do we want to throw an error when the length of elements in byts does not match the vector dimension definition? Same for serialize.
This is actually happening (at least for deserialize) with the changes above. I've also recently added support for similar tests in serialize() (along with tests for all the cases).
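A sketch of the kind of check discussed in this thread, for the fixed-size float case only. The function names and error messages are hypothetical; the choice of ValueError follows the title of commit c053b67, but the driver's actual checks are not shown in this excerpt and may differ.

```python
import struct

FLOAT_SIZE = 4  # bytes in a 32-bit float element


def serialize_float_vector(values, dimension):
    """Pack exactly `dimension` floats, big-endian (hypothetical helper)."""
    if len(values) != dimension:
        raise ValueError(
            "Expected a vector of %d elements, got %d" % (dimension, len(values)))
    return struct.pack(">%df" % dimension, *values)


def deserialize_float_vector(byts, dimension):
    """Unpack `dimension` floats, rejecting buffers of the wrong size."""
    expected_len = FLOAT_SIZE * dimension
    if len(byts) != expected_len:
        raise ValueError(
            "Expected %d bytes for a vector of dimension %d, got %d"
            % (expected_len, dimension, len(byts)))
    return list(struct.unpack(">%df" % dimension, byts))


payload = serialize_float_vector([1.0, 2.0, 3.0], 3)
print(deserialize_float_vector(payload, 3))

try:
    deserialize_float_vector(payload[:-4], 3)  # one element short
except ValueError as exc:
    print("rejected:", exc)
```

Checking the size up front gives a clear error in both directions instead of a silently truncated vector or a lower-level struct error.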