support for asyncio (SNOW-1544406) #38
Comments
Thanks for the suggestion. Yes, we considered aiohttp as well as the async features in 3.4+. There are two concerns: 1) since the driver needs to support both 2.7+ and 3.4+, it may need a separate branch for 3.4+; 2) it is not clear how to handle the OCSP check in aiohttp. The OCSP check is a requirement from our security team, though Python's standard SSL library doesn't support it (and most other clients don't even care about certificate revocation status!). We use a monkey patch on top of https://github.com/requests/requests to intercept the SSL handshake and add the OCSP checks. So contributions would definitely be welcome, provided the above concerns are addressed. |
Checking whether there has been any progress on asyncio support for this package; the last update seems to be over 2 years ago. |
Sorry, we didn't have enough bandwidth to work on this. The plan is to add async support after dropping Python 2. |
Python 2 will stop being supported by the Python Foundation on Jan 1, 2020. Will Snowflake also drop support at that time? Just curious if any timeline has been established on when Snowflake will EOL Python 2 support. If it helps anyone interested, as a temporary workaround, I've been running Snowflake queries inside a This has been just fine for my purposes (internal reporting) and allows me to still use the other neat asyncio features while not blocking the main thread. |
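The snippet in the comment above did not survive in this copy of the thread, so the exact construct the commenter used is unknown. One common way to keep a blocking call off the event loop's thread is to hand it to a thread pool with `loop.run_in_executor`. The sketch below assumes that approach; `blocking_query` is a hypothetical stand-in for a synchronous connector call, not part of the Snowflake connector's API.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_query(sql):
    """Hypothetical stand-in for a synchronous, blocking database call."""
    time.sleep(0.1)  # simulates waiting on the network
    return f"result of {sql}"


async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Each call runs in a worker thread; the event loop stays free
        # to service other coroutines while those threads block.
        futures = [
            loop.run_in_executor(pool, blocking_query, sql)
            for sql in ("SELECT 1", "SELECT 2")
        ]
        # gather returns the results in the order the futures were passed.
        return await asyncio.gather(*futures)


results = asyncio.run(main())
print(results)
```

The underlying I/O is still blocking; it simply blocks a worker thread instead of the thread that runs the event loop.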
For Python 2 support, feel free to discuss in #107. We'll keep an eye on the usage metrics on our end to determine when to drop Python 2. |
We want this for Snowflake: |
What's the progress on aio support? Anything that uses it, such as sqlalchemy-aio cannot take advantage of this dialect+driver without it... |
Not much progress. In the planning meeting. |
Python 2.7 is deprecated as of Jan 1, 2020. Any progress on an async Snowflake connector? |
Commenting here to re-emphasise the issue. We are going to use asyncio in ongoing projects involving Python, and not having support for async is a bit frustrating. Is there any news regarding this? Thanks. |
It is very important for Snowflake to support an async connector. Otherwise, we have to work around it. It is a big bottleneck for performance optimization. Thanks. |
Hey everyone, just wanted to bump this feature request. With FastAPI becoming more and more widely used in the Python ecosystem, support for asyncio continues to become more important. Thanks! |
Bump!! We need asyncio support :) |
Please implement pgsql wire protocol for Snowflake. Hard must. |
I think what Snowflake would really benefit from is some sort of publicly documented HTTP/REST API. It would be a lot easier to build an asyncio community library on top of something that is documented than trying to reverse engineer it or come up with sub-optimal solutions. |
Snowflake is the primary database for a project I am working on. It has been an absolute struggle dealing with all of the gotchas. Having some visibility would be a good idea - however I am going to assume the security team would struggle with this. Definitely sold to our CTO as very much compatible with existing workflows, tech stacks, etc... and showed a lot of promise by offering an sqlalchemy driver. I have been mostly using usql (https://github.com/xo/usql) to fill in a lot of gaps with the web ui, the almost unusable cli offered by snowflake, and to act as a system call when I need something done async without having to deal with busted multiprocessing solutions. |
Next bump. Building a FastAPI application that needs to perform sync and async queries would be much better if the Snowflake connector supported asyncio. Yes, the SQLAlchemy connector supports this, but it lacks Snowflake-exclusive features like execute_async. @keller00 are there any updates? |
I'm sorry, but this is not planned for anytime soon. However, we do as of recently support our own Async execution feature, see documentation here: https://docs.snowflake.com/en/user-guide/python-connector-example.html#label-python-connector-asynchronous-query-examples |
While the async feature is neat, it does not offer async socket support.
Perhaps the team at snowflake can look into using httpx as a base.
On Thu, Mar 18, 2021 at 10:05 PM Mark Keller ***@***.***> wrote:
I'm sorry, but this is not planned for anytime soon.
Our codebase is built upon using urllib3 and other dependencies that use
it under the hood (boto3 comes to my mind immediately). We also monkey
patch our own OCSP verification into urllib3 for extra security.
Last time I checked urllib3 said that they will not support asyncio ever,
so to support it we'd need a complete rewrite of the library, which we have
tried, but the benchmarks didn't live up to our standards unfortunately.
However, we do as of recently support our own Async execution feature, see
documentation here:
https://docs.snowflake.com/en/user-guide/python-connector-example.html#label-python-connector-asynchronous-query-examples
I hope that this could be useful for some of you!
Shane R. Spencer
|
Why support python at all if you can't invest the time and effort to support modern Python? |
Async Python is Future Python, not Modern Python. A lot of widely used libraries are not (yet) async compatible and most Python running in the real world is not async. |
Hi, I'm from the future! 🛸 |
Aww sweet! I am not the only modern time traveler. |
Since this is easily the most frustrating topic about Snowflake for me, I thought I'd describe some use cases.
In general, asyncio is a way to achieve concurrent programming in Python. This is different from parallel computing (see here: https://stackoverflow.com/questions/1897993/what-is-the-difference-between-concurrent-programming-and-parallel-programming).
With that in mind, here are some use cases I can think of when dealing with Snowflake:
- Working with FastAPI
- Airflow 2.X Deferrable Operators
- Running 2 independent Snowflake queries at the same time
And how could it look using asyncio?
While it's only 3 lines of code, it also offers a much better and easier way to handle exceptions than any solution currently available.
Summary
I have seen quite a lot of media about Snowpark and how you want to appeal more to the Python community. Personally, I don't think you can achieve that with your standard Python client missing such a crucial feature. |
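The code snippets from the comment above were lost in this copy of the thread. As a rough illustration of the idea only, and not the commenter's original code, this is how two independent queries might be awaited together with `asyncio.gather`. `run_query` is a hypothetical coroutine that simulates a query with a sleep; no such function exists in the Snowflake connector.

```python
import asyncio


async def run_query(sql, delay=0.1):
    """Hypothetical async query; a real driver would await network I/O here."""
    await asyncio.sleep(delay)
    if "bad" in sql:
        raise ValueError(f"query failed: {sql}")
    return f"rows for {sql}"


async def main():
    # Both queries are in flight at once; the wait is roughly the slower one.
    ok = await asyncio.gather(run_query("SELECT a"), run_query("SELECT b"))

    # return_exceptions=True hands failures back as values instead of raising,
    # so one failed query does not discard the result of the other.
    mixed = await asyncio.gather(
        run_query("SELECT a"), run_query("bad"), return_exceptions=True
    )
    return ok, mixed


ok, mixed = asyncio.run(main())
```

Without `return_exceptions=True`, the first exception propagates out of the `await`, where an ordinary `try`/`except` can catch it.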
Can any of the maintainers comment whether this is on the roadmap? I know you can't commit to dates, but at least getting some acknowledgement so we know we're not screaming into a void would be hugely appreciated. |
Hi all, thanks for your input and suggestions. Adding support for asyncio is on our roadmap, and we will keep this thread posted on updates. |
Should definitely be a high priority! |
We would highly value this as well. Snowflake is heavily used by data engineers who, like it or not, are most familiar with Python. A lot of times I write code in Python simply so that it can be supported by data engineers, even though Python is far from the most performant language. However, a lot of code mostly spends its time waiting on IO for long-running tasks. Concurrency for IO is far more important than in-memory execution speed for processes which mostly perform orchestration in a data warehouse. Therefore, Python can be nearly as fast as compiled languages for these kinds of tasks, and we have a much easier time hiring engineers who know Python than other languages. Asyncio in Python is now mature and is well supported by foundational libraries for tasks such as HTTP, TCP and file IO. Yes, it is possible to run "async" queries where you get a query ID and then check back later to see if the query is done. However, the underlying IO is still blocking. Therefore, it takes a lot of fairly careful design to perform parallel queries efficiently with this pseudo-async approach. You have to run a blocking synchronous loop to get all the query IDs and then another blocking synchronous loop to check all the queries, skipping the completed ones on each pass, and return the results once they are all done. I am not even going to try to explain this to most data engineers. |
I am trying to see if I can use your code. However, I am confused by what you mean about using the threaded executor. It looks to me like you are using self._loop.run_in_executor and self.loop is from asyncio. I think I am missing something. Does partial somehow help you with this? Are you passing in a different pool which uses multithreading instead of asyncio during instantiation of SnowflakeConnection? I am looking for the place where multithreading gets involved. I was going to try to make a similar asyncio wrapper around multithreaded sync code. |
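The two-loop pattern described above can be sketched as follows. Everything here is a hypothetical stand-in: `FakeWarehouse` and its `submit`, `is_done` and `fetch` methods only simulate a service that hands back query IDs, and they are not the connector's API.

```python
import time


class FakeWarehouse:
    """Hypothetical in-memory stand-in for a service that runs queries."""

    def __init__(self):
        self._jobs = {}

    def submit(self, sql, duration):
        query_id = f"q{len(self._jobs)}"
        self._jobs[query_id] = (time.monotonic() + duration, sql)
        return query_id

    def is_done(self, query_id):
        return time.monotonic() >= self._jobs[query_id][0]

    def fetch(self, query_id):
        return f"rows for {self._jobs[query_id][1]}"


def run_all(warehouse, jobs, poll_interval=0.01):
    # First blocking loop: submit everything and collect the query IDs.
    pending = [warehouse.submit(sql, duration) for sql, duration in jobs]
    results = {}
    # Second blocking loop: poll, skipping queries that already finished.
    while pending:
        still_running = []
        for query_id in pending:
            if warehouse.is_done(query_id):
                results[query_id] = warehouse.fetch(query_id)
            else:
                still_running.append(query_id)
        pending = still_running
        if pending:
            time.sleep(poll_interval)  # the calling thread is blocked here
    return results


results = run_all(FakeWarehouse(), [("SELECT 1", 0.05), ("SELECT 2", 0.02)])
```

The calling thread is occupied for the whole duration, which is the cost the comment points out: nothing else can run on it while the loop sleeps and polls.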
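The code being asked about is not included in this copy of the thread, so the following is only a general sketch of how `run_in_executor` and `functools.partial` usually fit together. `blocking_execute` is a hypothetical function. In standard asyncio, passing `None` as the executor selects the loop's default executor, which is a thread pool, so that is where the threads come in.

```python
import asyncio
import threading
from functools import partial


def blocking_execute(sql, timeout=None):
    """Hypothetical blocking call; reports which thread ran it."""
    return sql, timeout, threading.current_thread().name


async def main():
    loop = asyncio.get_running_loop()
    # run_in_executor forwards positional arguments only, so keyword
    # arguments are bound beforehand with functools.partial.
    call = partial(blocking_execute, "SELECT 1", timeout=30)
    # Passing None selects the loop's default executor, a thread pool.
    sql, timeout, worker = await loop.run_in_executor(None, call)
    return sql, timeout, worker, threading.current_thread().name


sql, timeout, worker_thread, loop_thread = asyncio.run(main())
print(worker_thread, loop_thread)
```

The two printed thread names differ: the blocking function runs in a pool thread while the coroutine awaiting it stays on the event loop's thread.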
snowflake-connector-python v2.2.3
Any updates? 🙏 |
@sfc-gh-anugupta @sfc-gh-dszmolka any updates? |
Thank you folks for all the feedback and interest! It is too early to give an estimated timeline, but the team is busy with planning and design, so there is some progress. Speaking of which, we'll update this thread when there's any significant new information on the progress. Thank you very much for bearing with us! |
Short update to confirm this is still on the roadmap and in progress with the team. No timeline available at this moment - thank you everyone for your patience here. |
As Snowflake already provides standard REST APIs, at least for basic CRUD operations, it is technically not difficult to use them with asyncio ourselves instead of waiting for an official aio SDK from Snowflake. |
short update: internal POC in progress |
quick update: as you might have seen from the PRs :) the team is actively working on the project and towards getting out the initial alpha version of the connector which supports async. edit: we do understand there's huge interest in this feature, but at this moment (3 October 2024) there is no official ETA yet. Please stay tuned, because this thread will be updated once there's important information to post. Until then, I'm afraid you'll need to bear with us for a bit, and thank you for your patience! |
Any progress on that? |
update: initial support for asyncio is imminent with alpha release 3.13.0a1 , and will be available as |
update: apparently plans changed and with the 'imminent'-ness of the release I was overly optimistic, apologies. Current status is that the code planned for the private preview scope is complete, and Snowflake Product Management is working on the next steps. Probably this wasn't the update everyone was hoping for :( For those who are already Snowflake customers, your account team can track the internal status based on the ticket number mentioned in the issue title ( |
It would be really useful to see this! |
And the world keeps on spinning round and round 🦖 |
A lot has changed internally, which of course should absolutely not influence how projects are delivered, but unfortunately it does. From what I know, the project is still on the table at least. I understand how the uncertainty is super frustrating, too. |
The Snowflake connector for Python seems to be implemented essentially as API calls over HTTP. Using aiohttp, companion subclasses to SnowflakeConnector, SnowflakeCursor, SnowflakeRestful, etc. could be created that implement the key methods as asynchronous coroutines. Then asyncio tools could be used to run Snowflake connection routines alongside other I/O-centric or API-driven tasks.
Has this been considered? Is it a viable addition to the Snowflake Connector? If so I'm happy to contribute, I'd love to hear any requirements you folks might have in mind. Or on the other hand, is it more appropriate as a fork, or as a separate project altogether in the style of aiobotocore?