PYTHON-1369 Extend driver vector support to arbitrary subtypes and fix handling of variable length types (OSS C* 5.0) #1217

absurdfarce · 2024-07-23T06:33:39Z

No description provided.

absurdfarce · 2024-07-23T06:35:15Z

cassandra/marshal.py

+        rv.append(abs(v))
+
+    rv.reverse()
+    return bytes(rv)


Adapted from vints_pack and vints_unpack above. Those functions are different enough (built-in zig-zag encoding + tuples for incoming/outgoing data) that it seemed worthwhile to live with the code duplication here.
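For readers unfamiliar with the encoding, here is a minimal sketch of unsigned vint packing and unpacking. It assumes the layout in which the number of leading 1 bits in the first byte gives the number of extra bytes that follow. The names `pack_uvint` and `unpack_uvint` are hypothetical, and this is not the code added to cassandra/marshal.py.

```python
def pack_uvint(value):
    """Encode a non-negative integer (at most 64 bits) as an unsigned vint."""
    if value < 0 or value.bit_length() > 64:
        raise ValueError("value %d cannot be encoded as an unsigned vint" % value)
    # With n extra bytes the first byte keeps 7 - n value bits, so 7 + 7n bits fit.
    for extra in range(8):
        if value.bit_length() <= 7 + 7 * extra:
            break
    else:
        extra = 8  # first byte is all ones; the value lives in the next 8 bytes
    payload = value.to_bytes(extra + 1, "big")
    prefix = (0xFF << (8 - extra)) & 0xFF  # `extra` leading 1 bits
    return bytes([payload[0] | prefix]) + payload[1:]


def unpack_uvint(data, offset=0):
    """Decode one unsigned vint; return (value, number_of_bytes_read)."""
    first = data[offset]
    # Count the leading 1 bits of the first byte.
    extra = 8 - ((~first) & 0xFF).bit_length()
    if len(data) - offset < extra + 1:
        raise ValueError("truncated unsigned vint")
    value = first & (0xFF >> extra)
    for b in data[offset + 1:offset + 1 + extra]:
        value = (value << 8) | b
    return value, extra + 1
```

Returning the number of bytes read lets a caller advance its own index through a buffer that holds several length-prefixed elements.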

absurdfarce · 2024-07-24T19:52:47Z

cassandra/__init__.py

-    """
-    The driver was unable to deserialize a given vector
-    """
-    pass


Added in PYTHON-1371 to indicate an attempt to decode a vector of an unsupported subtype. With the other changes in this PR this exception is now completely unnecessary.

absurdfarce · 2024-07-24T19:56:53Z

cassandra/cqltypes.py

+    @classmethod
+    def serial_size(cls):
+        serialized_size = cls.subtype.serial_size()
+        return cls.vector_size * serialized_size if serialized_size is not None else None


Here (as in many other points in this PR) we're following the example of what's done on the server side.
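To illustrate the rule in the hunk above: a vector is fixed-size only when its subtype is, and `None` marks a variable-size type. This is a sketch using hypothetical stand-in classes (`FixedFloat`, `VarText`, `make_vector_type`), not the driver's real type hierarchy.

```python
class FixedFloat:
    """Hypothetical fixed-size subtype: always 4 bytes on the wire."""
    @classmethod
    def serial_size(cls):
        return 4


class VarText:
    """Hypothetical variable-size subtype: no fixed serialized size."""
    @classmethod
    def serial_size(cls):
        return None


def make_vector_type(subtype, vector_size):
    """Build a vector type whose size is derived from its subtype."""
    class Vector:
        @classmethod
        def serial_size(cls):
            size = cls.subtype.serial_size()
            # A variable-size subtype makes the whole vector variable-size.
            return cls.vector_size * size if size is not None else None

    Vector.subtype = subtype
    Vector.vector_size = vector_size
    return Vector
```

Because the rule is recursive, a vector of fixed-size vectors is itself fixed-size: `make_vector_type(make_vector_type(FixedFloat, 3), 2).serial_size()` multiplies through both levels.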

absurdfarce · 2024-07-24T19:58:00Z

tests/unit/test_types.py

+    def _normalize_set(self, val):
+        if isinstance(val, set) or isinstance(val, util.SortedSet):
+            return frozenset([self._normalize_set(v) for v in val])
+        return val


Really just about making the types line up. The set codec returns a SortedSet by default and that class doesn't have all the niceties of frozenset so we convert here to simplify the equality comparison later.
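A self-contained sketch of the same normalization idea follows. `ListBackedSet` is a hypothetical stand-in for `util.SortedSet` (an unhashable, iterable container); the real class is not imported here.

```python
class ListBackedSet:
    """Hypothetical stand-in for a set-like class that is not hashable."""
    __hash__ = None

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


def normalize_set(val):
    # Recursively convert set-like containers to frozenset so that nested
    # sets become hashable and compare equal regardless of container class.
    if isinstance(val, (set, ListBackedSet)):
        return frozenset(normalize_set(v) for v in val)
    return val
```

After normalization, a nested `ListBackedSet` and the equivalent nested `frozenset` literal compare equal with a plain `==`.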

absurdfarce · 2024-07-24T21:26:56Z

Jenkins build at this point looks good: all failures appear to be known issues, most notably PYTHON-1390 and friends.

absurdfarce · 2024-08-28T23:12:40Z

@SiyaoIsHiding the integration test is now complete and covers both tuples and UDTs. At this point I think I've covered everything we intended to cover so this PR should be done (plus or minus changes coming out of review). Would you mind taking another pass at this?

absurdfarce · 2024-08-28T23:21:06Z

cassandra/encoder.py

@@ -217,3 +219,6 @@ def cql_encode_ipaddress(self, val):
        is suitable for ``inet`` type columns.
        """
        return "'%s'" % val.compressed
+
+    def cql_encode_decimal(self, val):
+        return self.cql_encode_float(float(val))


Fixing a bug discovered during testing. When using the driver's support for positional parameters in execute(), we weren't correctly expanding decimal types.
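As a rough illustration of the dispatch involved, the sketch below shows a type-keyed encoder that routes `Decimal` through the float encoder before positional parameters are interpolated. `MiniEncoder` and `bind_params` are hypothetical, and the float formatting is simplified compared with whatever the driver does.

```python
from decimal import Decimal


class MiniEncoder:
    """Hypothetical, much-reduced encoder keyed on the Python type."""

    def __init__(self):
        self.mapping = {
            int: self.cql_encode_object,
            float: self.cql_encode_float,
            Decimal: self.cql_encode_decimal,
        }

    def cql_encode_object(self, val):
        return str(val)

    def cql_encode_float(self, val):
        # Simplified: no special handling of inf or nan.
        return repr(val)

    def cql_encode_decimal(self, val):
        # Mirrors the one-line change in the hunk above.
        return self.cql_encode_float(float(val))

    def encode(self, val):
        return self.mapping.get(type(val), self.cql_encode_object)(val)


def bind_params(query, params, encoder):
    """Interpolate positional parameters into a %s-style query string."""
    return query % tuple(encoder.encode(p) for p in params)
```

Note that converting through `float` can drop digits for decimals with more precision than a double can hold.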

SiyaoIsHiding

Apart from the type mismatch error handling, looks good to me!

SiyaoIsHiding · 2024-08-28T23:26:30Z

cassandra/cqltypes.py

+            idx += bytes_read
+            rv.append(cls.subtype.deserialize(byts[idx:idx + size], protocol_version))
+            idx += size
+        return rv


Do we want to throw an error when the number of elements in byts does not match the vector's declared dimension? Same for serialize.

This is actually happening (at least for deserialize) with the changes above. I've also recently added similar checks in serialize() (along with tests for all the cases).
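The kind of check being discussed can be sketched for the simplest case, a vector of fixed-size 4-byte floats. The function names are hypothetical and the error messages are illustrative only; the PR's actual implementation also covers variable-size subtypes.

```python
import struct

FLOAT_SIZE = 4  # bytes per big-endian 32-bit float


def serialize_float_vector(values, dimension):
    """Pack `values` as big-endian floats, rejecting a wrong element count."""
    if len(values) != dimension:
        raise ValueError(
            "Expected %d elements for vector, got %d" % (dimension, len(values)))
    return struct.pack(">%df" % dimension, *values)


def deserialize_float_vector(byts, dimension):
    """Unpack exactly `dimension` floats, rejecting too little or too much data."""
    expected = FLOAT_SIZE * dimension
    if len(byts) != expected:
        raise ValueError(
            "Expected %d bytes for vector of dimension %d, got %d"
            % (expected, dimension, len(byts)))
    return list(struct.unpack(">%df" % dimension, byts))
```

Failing fast with a `ValueError` on both paths keeps a dimension mismatch from silently producing a truncated or padded vector.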

SiyaoIsHiding · 2024-08-28T23:46:20Z

tests/integration/__init__.py

-lessthanorequalcass40 = unittest.skipUnless(CASSANDRA_VERSION <= Version('4.0-a'), 'Cassandra version less or equal to 4.0 required')
-lessthancass40 = unittest.skipUnless(CASSANDRA_VERSION < Version('4.0-a'), 'Cassandra version less than 4.0 required')
+greaterthanorequalcass40 = unittest.skipUnless(CASSANDRA_VERSION >= Version('4.0'), 'Cassandra version 4.0 or greater required')
+greaterthanorequalcass50 = unittest.skipUnless(CASSANDRA_VERSION >= Version('5.0-beta'), 'Cassandra version 5.0 or greater required')


nit: as far as I know there isn't a branch called '5.0-beta' anymore. It's now called '5.0.0'.

The intent here was to handle the various betas and release candidate releases as we run up to Cassandra 5.0. Once we get to an actual 5.0 release this can be easily changed.

absurdfarce · 2024-09-03T06:06:24Z

I believe everything from the last go-round has been addressed. @SiyaoIsHiding would you mind taking another look?

SiyaoIsHiding · 2024-09-04T00:49:38Z

Sorry, I previously missed the Counter subtype. Counter is considered a variable-length type, even though BigInt is fixed-size. We may want to change CounterColumnType as follows:

class CounterColumnType(LongType):
    typename = 'counter'

    @classmethod
    def serial_size(cls):
        return None

I don't quite understand how vector<counter, 2> works though. I don't know how we can test it.

absurdfarce added 8 commits July 18, 2024 14:57

Initial commit of unit test

d153c61

What appears to be a working test now

69f54b0

Some test refinements

72a27ac

Seems to be working now

fe7a3b5

Added tests for string, map and vector subtype cases

c295a8c

We have most things working now. Vector of vectors still seems to be borked for some reason but the other cases are now passing.

d0d5983

Fix VectorType.cql_parameterized_type() to properly handle vectors of vectors (or other types which might have non-standard representations)

3534876

Fix error in test

a0791b0

absurdfarce added 3 commits July 23, 2024 17:36

Test fixes

f45d7df

Removing test client. This will eventually come back (in the form of an integration test) with PYTHON-1394.

daa54f1

Remove custom exception type added in PYTHON-1371

07c86bb

absurdfarce requested a review from SiyaoIsHiding July 29, 2024 22:05

absurdfarce added 11 commits August 21, 2024 17:58

Allowing user to pass in custom libev includes and libs via env vars

3363d16

Revert "Allowing user to pass in custom libev includes and libs via env vars" This reverts commit 3363d16.

1cba1b9

Merge branch 'master' into python1369

44966c9

Initial sketch of what the bones of an integration test might look like

dcb008f

Just moving some things around

8442743

Short is (incorrectly) marked as a variable size type on the server side

f35dcda

Add support for Decimal types as positional params

4f6bef8

Passing test with basic integer and floating point types

e23e425

A few more fixed size types we missed

e101213

Passing test which covers everything except UDTs

1840a6d

Testing for UDTs now included

1f7dd90

SiyaoIsHiding requested changes Aug 28, 2024

absurdfarce added 3 commits August 30, 2024 15:16

Explicitly throw ValueErrors when deserializing vectors with too little or too much data

c053b67

Some minor cleanup of test fn names

16cef42

Added checks for vector serialize op (and tests)

f996a7d

absurdfarce requested a review from SiyaoIsHiding September 3, 2024 06:06

SiyaoIsHiding approved these changes Sep 4, 2024

absurdfarce merged commit c4a808d into master Sep 4, 2024
4 of 5 checks passed

absurdfarce deleted the python1369 branch September 4, 2024 16:27
