This repository was archived by the owner on May 17, 2024. It is now read-only.

Add support for Microsoft SQL Server 2016+ #168

Closed
DVAlexHiggs opened this issue Jul 17, 2022 · 11 comments
Labels
new-db-driver: Request to add a new database driver
stale: Issues/PRs that have gone stale

Comments

@DVAlexHiggs

DVAlexHiggs commented Jul 17, 2022

Hi All,

I'm very interested in this project; it looks fantastic to me!
I see a significant and probably quite common use case here where developers and the business will want to be reassured of a successful migration from an old legacy system to a cloud platform such as Snowflake.

As a consultant, I see a huge number of clients on MS SQL Server, and this is the source of the data for their migration from on-prem to the cloud.

I'm excited to use this tool, but unfortunately it does not support most of the use cases I have in mind, due to the lack of MS SQL Server support. To this end, I would like to suggest adding support for this platform.

I'm also more than happy to contribute to this, and I'm wondering if anyone can point me in the right direction for contributing please? I see the developer environment guide in the README, but what I mean is more specific information about creating a new adapter/adding new database support.

Thanks!

@nolar
Contributor

nolar commented Jul 18, 2022

Hello. Thanks for reaching out!

MS SQL is one of the databases on the near-future roadmap. It has a few performance-related challenges though.

To make a new driver, take a look at the existing ones here: https://github.com/datafold/data-diff/tree/master/data_diff/databases and follow the same pattern. Then also run the tests, as documented in [CONTRIBUTING.md](https://github.com/datafold/data-diff/blob/master/CONTRIBUTING.md).

If you have any questions, feel free to ask.

@DVAlexHiggs
Author

DVAlexHiggs commented Jul 18, 2022

> Hello. Thanks for reaching out!
>
> MS SQL is one of the databases on the near-future roadmap. It has a few performance-related challenges though.
>
> To make a new driver, take a look at the existing ones here: https://github.com/datafold/data-diff/tree/master/data_diff/databases and follow the same pattern. Then also run the tests, as documented in [CONTRIBUTING.md](https://github.com/datafold/data-diff/blob/master/CONTRIBUTING.md).
>
> If you have any questions, feel free to ask.

Thanks for your response!

I see there is a commented-out mssql module. Is there anything I should know about this?

@nolar
Contributor

nolar commented Jul 18, 2022

@DVAlexHiggs You can see more details on that in #51.

@DVAlexHiggs DVAlexHiggs reopened this Jul 18, 2022
@DVAlexHiggs
Author

Not sure how this got closed. I think I mis-clicked.

@erezsh added the new-db-driver label Jul 20, 2022
@cacondie

cacondie commented Sep 9, 2022

@DVAlexHiggs Any update on this? I also could help contribute, but don't want to pick it up if you are already actively working on it.

@erezsh
Contributor

erezsh commented Sep 10, 2022

@cacondie,

We are currently not pursuing this avenue, but we will be happy to accept contributions. However, be aware that the solution isn't likely to be simple.

SQL Server does support MD5 (which is what we use for hashing), but it is approximately 100 times slower than PostgreSQL, which makes it unusable for practical purposes.

I think the solution would have to involve implementing our own checksum, probably using only simple arithmetic operations, since SQL isn't capable of much more.

Keep in mind that whatever checksum is used for SQL Server has to be supported by all the other databases as well, so that comparisons are possible. That probably means implementing this new checksum for each one (or at least the major ones).

If you can think of a new creative solution, we'll be happy to consider it.
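To make the idea of a checksum built from "only simple arithmetic operations" concrete, here is a minimal Python sketch. It is an illustration only, not the project's method and not the solution proposed in #51. The names `row_checksum` and `table_checksum`, the modulus, the multiplier, and the `|` separator are all assumptions chosen for the example. The constants keep every intermediate value below 2^63, so the same arithmetic could in principle be expressed with 64-bit integers in SQL.

```python
from typing import Iterable, Sequence

MOD = 2**31 - 1   # assumed modulus; keeps h * BASE + code point below 2**63
BASE = 131        # assumed multiplier for the polynomial hash


def row_checksum(row: Sequence[object]) -> int:
    """Polynomial hash of the row's text form using only *, + and %."""
    # Assumption: every database renders each value as the same text.
    text = "|".join("" if v is None else str(v) for v in row)
    h = 0
    for ch in text:
        h = (h * BASE + ord(ch)) % MOD
    return h


def table_checksum(rows: Iterable[Sequence[object]]) -> int:
    """Sum of row checksums modulo MOD; independent of row order."""
    total = 0
    for row in rows:
        total = (total + row_checksum(row)) % MOD
    return total


# Same rows in a different order give the same checksum.
a = [(1, "alice"), (2, "bob")]
b = [(2, "bob"), (1, "alice")]
assert table_checksum(a) == table_checksum(b)

# A changed value gives a different checksum for this input.
c = [(1, "alice"), (2, "bob!")]
assert table_checksum(a) != table_checksum(c)
```

A checksum like this is far weaker than MD5: collisions are much more likely, and a NULL and an empty string hash identically here. It also only works across databases if each one formats values as exactly the same text, which is the cross-database constraint described above.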

@icosahedron

Would using a CLR stored procedure to compute MD5 be an option?

@masonwheeler

Probably not, as many SQL Server deployments, including just about anything hosted in the cloud, disallow CLR procs.

@erezsh
Contributor

erezsh commented Dec 13, 2022

I proposed a solution to this issue in #51.

@github-actions
Contributor

This issue has been marked as stale because it has been open for 60 days with no activity. If you would like the issue to remain open, please comment on the issue and it will be added to the triage queue. Otherwise, it will be closed in 7 days.

@github-actions bot added the stale label May 26, 2023
@github-actions
Contributor

github-actions bot commented Jun 2, 2023

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment and it will be reopened for triage.

@github-actions github-actions bot closed this as completed Jun 2, 2023

6 participants