Skip to content

Add support for SCYLLA_USE_METADATA_ID and skip metadata #457

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

andrzej-jackowski-scylladb

FOR NOW THIS IS A DRAFT PR
Used for testing a fix for scylladb/scylladb#20860

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

@andrzej-jackowski-scylladb
Copy link
Author

v2:

  • Fixed test_protocol.py unit tests

@Lorak-mmk
Copy link

So no CQLv5 after all?

@andrzej-jackowski-scylladb
Copy link
Author

andrzej-jackowski-scylladb commented Mar 14, 2025

So no CQLv5 after all?

As a first step, we started with an implementation of CQLv4 extension (see scylladb/scylladb#23292 in progress), as introducing a full support for CQLv5 will require implementing ten different features (see scylladb/scylladb#20860 (comment)) and introducing them all-at-once would be very difficult.

I created this draft PR because I needed a driver for testing my changes related to MetadataId, but you are more than welcome to productize those changes. However, keep in mind that usage of _SKIP_METADATA_FLAG was for some reason removed from Python Driver in 2019 (refer to datastax@4f3b9ab) and I needed to restore this functionality in order to really test the fix. I don't think that unconditional usage of _SKIP_METADATA_FLAG that is currently implemented here is safe (as it will expose Python Driver to https://github.com/scylladb/scylla-drivers/issues/61), and I'm not sure if the current implementation is sufficient to allow setting _SKIP_METADATA_FLAG when SCYLLA_USE_METADATA_ID extension is used (a driver expert review is needed to confirm this).

If my understanding is correct, neither gocql nor rust driver currently implement CQLv5. Therefore, implementing a support for SCYLLA_USE_METADATA_ID seems easier than supporting the entire CQLv5 support for those driver.

With java-driver the situation seems more complex, because (correct me if I'm wrong) a proper use of SCYLLA_USE_METADATA_ID requires changes in native protocol that is not forked by Scylla. If so, the only choices for Java driver is to fully implement support on ScyllaDb side (because java-driver already has such implementation) or to somehow modify the native protocol (e.g. fork it, but I think it is a huge decision to make).

@Lorak-mmk, @dkropachev, please share your thought on this.

@dkropachev
Copy link
Collaborator

So no CQLv5 after all?

As a first step, we started with an implementation of CQLv4 extension (see scylladb/scylladb#23292 in progress), as introducing a full support for CQLv5 will require implementing ten different features (see scylladb/scylladb#20860 (comment)) and introducing them all-at-once would be very difficult.

I created this draft PR because I needed a driver for testing my changes related to MetadataId, but you are more than welcome to productize those changes. However, keep in mind that usage of _SKIP_METADATA_FLAG was for some reason removed from Python Driver in 2019 (refer to datastax@4f3b9ab) and I needed to restore this functionality in order to really test the fix. I don't think that unconditional usage of _SKIP_METADATA_FLAG that is currently implemented here is safe (as it will expose Python Driver to scylladb/scylla-drivers#61), and I'm not sure if the current implementation is sufficient to allow setting _SKIP_METADATA_FLAG when SCYLLA_USE_METADATA_ID extension is used (a driver expert review is needed to confirm this).

If my understanding is correct, neither gocql nor rust driver currently implement CQLv5. Therefore, implementing a support for SCYLLA_USE_METADATA_ID seems easier than supporting the entire CQLv5 support for those driver.

With java-driver the situation seems more complex, because (correct me if I'm wrong) a proper use of SCYLLA_USE_METADATA_ID requires changes in native protocol that is not forked by Scylla. If so, the only choices for Java driver is to fully implement support on ScyllaDb side (because java-driver already has such implementation) or to somehow modify the native protocol (e.g. fork it, but I think it is a huge decision to make).

@Lorak-mmk, @dkropachev, please share your thought on this.

java-3.x, golang, rust, csharp and cpp-rust-driver it should be pretty easy, maybe rust will need some API changes.

java-4.x - there is an issue with com.datastax.(oss|dse).protocol.internal packages, not sure if we will have to fork them, one of alternatives is to extend codecs and trick them to think it is CQL5 when this feature is there.

on python, true, there is no infra on disable/enable skip metadata, I don't think it is crucial for testing to have it, but for proper production use it is better to put it back in place.

@dkropachev
Copy link
Collaborator

@Lorak-mmk , my proposal is to bring control over skip metadata back, by reverting (one way or another) datastax@4f3b9ab and merge this pr on top of it, removing pairing between SCYLLA_USE_METADATA_ID and _QueryMessage.skip_meta

SCYLLA_USE_METADATA_ID extension allows using METADATA_ID (which was
introduced in CQLv5) in CQLv4.

This commit:
 - Introduce support for SCYLLA_USE_METADATA_ID in protocol extension
   negotation
 - Reuse CQLv5 metadata id implemnetation if use_metadata_id feature is
   enabled
 - Modify existing unit tests for introduced changes
This change restores handling of skip_metadata in _QueryMessage, that
was removed in 2019 (most likely to prevent metadata inconsistencies
in prepared statements).

I'm not sure if any other changes are required, as there were many
modifications in the codebase since the flag handling was removed.
@andrzej-jackowski-scylladb
Copy link
Author

andrzej-jackowski-scylladb commented Mar 19, 2025

The PR is still a DRAFT.

v3:

  • Added missing whitespace in cassandra/protocol.py
  • Restored old line order in cassandra/protocol_features.py

@Lorak-mmk
Copy link

I created this draft PR because I needed a driver for testing my changes related to MetadataId, but you are more than welcome to productize those changes. However, keep in mind that usage of _SKIP_METADATA_FLAG was for some reason removed from Python Driver in 2019 (refer to datastax@4f3b9ab) and I needed to restore this functionality in order to really test the fix. I don't think that unconditional usage of _SKIP_METADATA_FLAG that is currently implemented here is safe (as it will expose Python Driver to scylladb/scylla-drivers#61), and I'm not sure if the current implementation is sufficient to allow setting _SKIP_METADATA_FLAG when SCYLLA_USE_METADATA_ID extension is used (a driver expert review is needed to confirm this).

If my understanding is correct, neither gocql nor rust driver currently implement CQLv5. Therefore, implementing a support for SCYLLA_USE_METADATA_ID seems easier than supporting the entire CQLv5 support for those driver.

With java-driver the situation seems more complex, because (correct me if I'm wrong) a proper use of SCYLLA_USE_METADATA_ID requires changes in native protocol that is not forked by Scylla. If so, the only choices for Java driver is to fully implement support on ScyllaDb side (because java-driver already has such implementation) or to somehow modify the native protocol (e.g. fork it, but I think it is a huge decision to make).

Most upstream drivers already support CQLv5. I actually thought all of them do, it is surprising to hear that gocql does not.
The existing support of CQLv5 is why I really hoped we would go this way, and use existing code that is compatible with upstream, instead of doubling the work and implementing new extension in all the drivers. Which, as you noticed yourself, is not really straightforward in some of them.
Regarding Rust Driver: yes it only supports CQLv4, and Scylla adding support for CQLv5 would be a very good reason for us to change that - which is another reason I hoped we would go with CQLv5.

It is also worth noting that not all users utilize our drivers - some use Datastax drivers. The adoption of new driver versions is also quite slow.
In addition, some 3rd party applications, ORMs and other libraries may also use old or datastax drivers.
If we went with CQLv5 they could all use that. With custom extension, only users of our newest drivers will be able to.

@andrzej-jackowski-scylladb
Copy link
Author

The existing support of CQLv5 is why I really hoped we would go this way, and use existing code that is compatible with upstream, instead of doubling the work and implementing new extension in all the drivers. Which, as you noticed yourself, is not really straightforward in some of them. Regarding Rust Driver: yes it only supports CQLv4, and Scylla adding support for CQLv5 would be a very good reason for us to change that - which is another reason I hoped we would go with CQLv5.

We are planning to add CQLv5 in the future. For now we are adding the CQLv4 extension,so problems related to https://github.com/scylladb/scylla-drivers/issues/61 can be mitigated if required (also in drivers that doesn't support CQLv5).

Please also refer to https://scylladb.atlassian.net/wiki/spaces/RND/pages/42238631/MetadataId+extension+in+CQLv4+Requirement+Document.

@Lorak-mmk
Copy link

Lorak-mmk commented Mar 27, 2025

We are planning to add CQLv5 in the future.

Great. Is there an issue for that?

andrzej-jackowski-scylladb added a commit to andrzej-jackowski-scylladb/scylladb that referenced this pull request Apr 11, 2025
Implement corner-cases of prepared statement metadata, as described in
scylladb#20860.

Although the purpose of the test was to verify the newly implemented
SCYLLA_USE_METADATA_ID protocol extension, the test also passes with
scylla-driver 3.29.3 that doesn't implement the support for this
extension. That is because the driver doesn't implement support for
skip_metadata flag, so fresh metadata are included in every prepared
statement response, regardless of the metadata_id.

This change:
 - Add test_changed_prepared_statement_metadata_columns to verify
   a scenario when a number of columns changes in a table used by a
   prepared statement
 - Add test_changed_prepared_statement_metadata_types to verify
   a scenario when a type of a column changes in a table used by a
   prepared statement
 - Add test_changed_prepared_statement_metadata_udt to veriy
   a scenario when a UDT changes in a table used by a prepared statement

I tested the code with a modified Python driver
(ref. scylladb/python-driver#457):
 - If SKIP_METADATA is enabled (scylladb/python-driver@c1809c1)
   but not other changes are introduced, all three test cases fail.
 - If SKIP_METADATA is disabled (no scylladb/python-driver@c1809c1) all
   test cases pass because fresh metadata are included in each reply.
 - If SKIP_METADATA is enabled (scylladb/python-driver@c1809c1)
   and SCYLLA_USE_METADATA_ID extension is included
   (scylladb/python-driver@8aba164) all test cases pass and verifies
   the correctness the implementation.
andrzej-jackowski-scylladb added a commit to andrzej-jackowski-scylladb/scylladb that referenced this pull request Apr 14, 2025
Implement corner-cases of prepared statement metadata, as described in
scylladb#20860.

Although the purpose of the test was to verify the newly implemented
SCYLLA_USE_METADATA_ID protocol extension, the test also passes with
scylla-driver 3.29.3 that doesn't implement the support for this
extension. That is because the driver doesn't implement support for
skip_metadata flag, so fresh metadata are included in every prepared
statement response, regardless of the metadata_id.

This change:
 - Add test_changed_prepared_statement_metadata_columns to verify
   a scenario when a number of columns changes in a table used by a
   prepared statement
 - Add test_changed_prepared_statement_metadata_types to verify
   a scenario when a type of a column changes in a table used by a
   prepared statement
 - Add test_changed_prepared_statement_metadata_udt to veriy
   a scenario when a UDT changes in a table used by a prepared statement

I tested the code with a modified Python driver
(ref. scylladb/python-driver#457):
 - If SKIP_METADATA is enabled (scylladb/python-driver@c1809c1)
   but not other changes are introduced, all three test cases fail.
 - If SKIP_METADATA is disabled (no scylladb/python-driver@c1809c1) all
   test cases pass because fresh metadata are included in each reply.
 - If SKIP_METADATA is enabled (scylladb/python-driver@c1809c1)
   and SCYLLA_USE_METADATA_ID extension is included
   (scylladb/python-driver@8aba164) all test cases pass and verifies
   the correctness the implementation.
andrzej-jackowski-scylladb added a commit to andrzej-jackowski-scylladb/scylladb that referenced this pull request Apr 25, 2025
Implement corner-cases of prepared statement metadata, as described in
scylladb#20860.

Although the purpose of the test was to verify the newly implemented
SCYLLA_USE_METADATA_ID protocol extension, the test also passes with
scylla-driver 3.29.3 that doesn't implement the support for this
extension. That is because the driver doesn't implement support for
skip_metadata flag, so fresh metadata are included in every prepared
statement response, regardless of the metadata_id.

This change:
 - Add test_changed_prepared_statement_metadata_columns to verify
   a scenario when a number of columns changes in a table used by a
   prepared statement
 - Add test_changed_prepared_statement_metadata_types to verify
   a scenario when a type of a column changes in a table used by a
   prepared statement
 - Add test_changed_prepared_statement_metadata_udt to veriy
   a scenario when a UDT changes in a table used by a prepared statement

I tested the code with a modified Python driver
(ref. scylladb/python-driver#457):
 - If SKIP_METADATA is enabled (scylladb/python-driver@c1809c1)
   but not other changes are introduced, all three test cases fail.
 - If SKIP_METADATA is disabled (no scylladb/python-driver@c1809c1) all
   test cases pass because fresh metadata are included in each reply.
 - If SKIP_METADATA is enabled (scylladb/python-driver@c1809c1)
   and SCYLLA_USE_METADATA_ID extension is included
   (scylladb/python-driver@8aba164) all test cases pass and verifies
   the correctness the implementation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants