Deprecate `metadata_request_timeout` #469

dkropachev · 2025-05-07T19:09:52Z

When I was testing e4a000f, I have figure that out there is another option that controls system queries timeout - Cluster.control_connection_timeout, switch to use that it instead of Cluster.metadata_request_timeout, but did not remove Cluster.metadata_request_timeout.

Effectively Cluster.metadata_request_timeout is not working, what is working is Cluster.control_connection_timeout.
To address that let's mark Cluster.metadata_request_timeout it as deprecated in favor of Cluster.control_connection_timeout, to be removed in next minor release.

Pre-review checklist

I have split my patch into logically separate commits.
All commit messages clearly explain what they change and why.
I added relevant tests for new features and bug fixes.
All commits compile, pass static checks and pass test.
PR description sums up the changes and reasons why they should be introduced.
I have provided docstrings for the public items that I want to introduce.
I have adjusted the documentation in ./docs/source/.
I added appropriate Fixes: annotations to PR description.

When I was testing scylladb@e4a000f, I have figure that out there is another option that controls system queries timeout - `Cluster.control_connection_timeout`, switched to use it instead of `Cluster.metadata_request_timeout`, but did not remove `Cluster.metadata_request_timeout`. Effectively `Cluster.metadata_request_timeout` is not working, what is working is `Cluster.control_connection_timeout`. To address that let's mark `Cluster.metadata_request_timeout` it as deprecated in favor of `Cluster.control_connection_timeout`, to be removed in next minor release.

Lorak-mmk · 2025-05-07T19:33:53Z

The way I understand it control_connection_timeout was previously responsible for client-side timeout, and is now responsible for both (but we thought that metadata_request_timeout is responsible for server-side).
Do we always want client-side and server-side timeout to be the same?
If yes, then the fix makes sense. If not, then we should not deprecate it, and instead start using it for server-side timeout.

If we decide to keep this, fix, then apart from printing a warning I'd also set control_connection_timeout to the bigger of the 2 values (control_connection_timeout, metadata_request_timeout) to prevent client-side timeout from being significantly lower.

Lorak-mmk · 2025-05-07T19:34:53Z

Btw having integration tests for this functionality would have prevented this. We should not skip them in further PRs.

dkropachev · 2025-05-07T19:56:05Z

The way I understand it control_connection_timeout was previously responsible for client-side timeout, and is now responsible for both (but we thought that metadata_request_timeout is responsible for server-side). Do we always want client-side and server-side timeout to be the same? If yes, then the fix makes sense. If not, then we should not deprecate it, and instead start using it for server-side timeout.

I don't see any reason why they should not go together, only thing that I would have changed is to set server side timeout a bit less than client side, say 100ms less to offset possibl network delay.

If we decide to keep this, fix, then apart from printing a warning I'd also set control_connection_timeout to the bigger of the 2 values (control_connection_timeout, metadata_request_timeout) to prevent client-side timeout from being significantly lower.

no reason to do so, metadata_request_timeout is not working at all.

Lorak-mmk · 2025-05-07T19:58:44Z

no reason to do so, metadata_request_timeout is not working at all.

Right, my brain stopped working for a minute, sorry.

dkropachev · 2025-05-07T19:59:09Z

Btw having integration tests for this functionality would have prevented this. We should not skip them in further PRs.

What happened is I started working on the test and found that it is not working, from there I have found that there is control_connection_timeout and switched to it, dropping all the tests.
control_connection_timeout it self have test coverage.
So, only thing that would help is me carefully going over PR again after all these changes.

mykaul · 2025-05-08T07:26:33Z

Don't deprecate a used feature. Not like that.
There's a difference between the control connection timeout and how long it should take, after the connection is established, to fetch (sometimes very large) system data, such as the schema.

dkropachev · 2025-05-08T16:07:34Z

Don't deprecate a used feature. Not like that.

Please read #469 (comment) and PR Description, metadata_request_timeout is not working and never worked, it was introduced by mistake.
As of now, control_connection_timeout controls both server-side and client-side timeout.

There's a difference between the control connection timeout and how long it should take, after the connection is established, to fetch (sometimes very large) system data, such as the schema.

control_connection_timeout is client-side timeout, metadata_request_timeout - supposed to be server-side timeout.
I don't see any good reason keep them separate, it only create space for missconfiguration which will result in having retries for completed requests.

mykaul · 2025-05-11T08:20:52Z

Don't deprecate a used feature. Not like that.

Please read #469 (comment) and PR Description, metadata_request_timeout is not working and never worked, it was introduced by mistake. As of now, control_connection_timeout controls both server-side and client-side timeout.

There's a difference between the control connection timeout and how long it should take, after the connection is established, to fetch (sometimes very large) system data, such as the schema.

control_connection_timeout is client-side timeout, metadata_request_timeout - supposed to be server-side timeout. I don't see any good reason keep them separate, it only create space for missconfiguration which will result in having retries for completed requests.

I reiterate that there was a good reason to use a separate, server-side timeout, for the schema fetch queries. There's a separate phase to the connection - which is lengthy, and you don't wish to retry the whole connection, just because of the big schema (which causes a long schema query). This is a Scylla-specific feature we can take advantage of (that Cassandra may not have - did not check it). We've implemented it in Java too and I think it's useful. If it doesn't work here, we should fix it.

dkropachev requested a review from Lorak-mmk May 7, 2025 19:09

dkropachev self-assigned this May 7, 2025

dkropachev force-pushed the dk/depricate-metadata_request_timeout branch from c0898eb to 8ef0565 Compare May 7, 2025 19:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deprecate `metadata_request_timeout` #469

Deprecate `metadata_request_timeout` #469

dkropachev commented May 7, 2025 •

edited

Loading

Lorak-mmk commented May 7, 2025

Lorak-mmk commented May 7, 2025 •

edited

Loading

dkropachev commented May 7, 2025

Lorak-mmk commented May 7, 2025

dkropachev commented May 7, 2025

mykaul commented May 8, 2025

dkropachev commented May 8, 2025 •

edited

Loading

mykaul commented May 11, 2025

Deprecate metadata_request_timeout #469

Are you sure you want to change the base?

Deprecate metadata_request_timeout #469

Conversation

dkropachev commented May 7, 2025 • edited Loading

Pre-review checklist

Lorak-mmk commented May 7, 2025

Lorak-mmk commented May 7, 2025 • edited Loading

dkropachev commented May 7, 2025

Lorak-mmk commented May 7, 2025

dkropachev commented May 7, 2025

mykaul commented May 8, 2025

dkropachev commented May 8, 2025 • edited Loading

mykaul commented May 11, 2025

Deprecate `metadata_request_timeout` #469

Deprecate `metadata_request_timeout` #469

dkropachev commented May 7, 2025 •

edited

Loading

Lorak-mmk commented May 7, 2025 •

edited

Loading

dkropachev commented May 8, 2025 •

edited

Loading