Implement async version of databricks_conn in BaseDatabricksHook #55568
Closed
BasPH wants to merge 18 commits into apache:main
ashb
reviewed
Sep 12, 2025
ashb
reviewed
Sep 12, 2025
dstandish
added a commit
to astronomer/airflow
that referenced
this pull request
Sep 19, 2025
In 2.x sometimes get_connection (which goes to the database) might be called without wrapping in sync_to_async. This did not fail, though it was not good behavior, since it can block the event loop. In 3.0, since we now route db calls through an API, triggers that do this fail. The reason is, the code to hit the API wraps the get_connection call with async_to_sync, which is forbidden in the asyncio event loop. Related: apache#55568 (cherry picked from commit f5b1eb4)
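The failure mode described in this commit message can be reproduced with the standard library alone. The sketch below is an analogy, not Airflow code: `asyncio.run` stands in for asgiref's `async_to_sync` (both refuse to start while an event loop is already running in the same thread), `asyncio.to_thread` stands in for `sync_to_async`, and `fetch_connection_via_api` and `get_connection_sync` are hypothetical names for the API-backed connection lookup.

```python
import asyncio


async def fetch_connection_via_api(conn_id: str) -> dict:
    """Hypothetical stand-in for an async call to the connection API."""
    await asyncio.sleep(0)
    return {"conn_id": conn_id, "host": "example.invalid"}


def get_connection_sync(conn_id: str) -> dict:
    """Sync wrapper that drives the coroutine on its own event loop.

    asyncio.run() plays the role of async_to_sync here: it raises
    RuntimeError if this thread already has a running event loop.
    """
    coro = fetch_connection_via_api(conn_id)
    try:
        return asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        raise


async def trigger_calling_sync_directly() -> str:
    """Calls the sync wrapper from inside the event loop, which fails."""
    try:
        get_connection_sync("databricks_default")
    except RuntimeError as exc:
        return f"failed: {exc}"
    return "ok"


async def trigger_offloading_to_thread() -> dict:
    """Runs the sync wrapper in a worker thread that has no running loop."""
    return await asyncio.to_thread(get_connection_sync, "databricks_default")


direct_result = asyncio.run(trigger_calling_sync_directly())
offloaded_result = asyncio.run(trigger_offloading_to_thread())
print(direct_result)
print(offloaded_result)
```

Off-loading to a thread keeps the event loop responsive and sidesteps the error, but where an async lookup exists, awaiting it directly is usually the simpler choice.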
kaxil
added a commit
to astronomer/airflow
that referenced
this pull request
Oct 23, 2025
When deferrable operators run in the triggerer's async event loop and synchronously access connections (e.g., via @cached_property), the `ExecutionAPISecretsBackend` failed silently. This occurred because `SUPERVISOR_COMMS.send()` uses `async_to_sync`, which raises `RuntimeError` when called within an existing event loop in a greenback portal context. Add specific RuntimeError handling in `ExecutionAPISecretsBackend` that detects this scenario and uses `greenback.await_()` to call the async versions (aget_connection/aget_variable) as a fallback. It was originally fixed in apache#55799 for 3.1.0 but apache#56602 introduced a bug. Ideally all providers handle this better and have better written Triggers. Example PR for Databricks: apache#55568 Fixes apache#57145
kaxil
added a commit
that referenced
this pull request
Oct 23, 2025
When deferrable operators run in the triggerer's async event loop and synchronously access connections (e.g., via @cached_property), the `ExecutionAPISecretsBackend` failed silently. This occurred because `SUPERVISOR_COMMS.send()` uses `async_to_sync`, which raises `RuntimeError` when called within an existing event loop in a greenback portal context. Add specific RuntimeError handling in `ExecutionAPISecretsBackend` that detects this scenario and uses `greenback.await_()` to call the async versions (aget_connection/aget_variable) as a fallback. It was originally fixed in #55799 for 3.1.0 but #56602 introduced a bug. Ideally all providers handle this better and have better written Triggers. Example PR for Databricks: #55568 Fixes #57145
kaxil
added a commit
that referenced
this pull request
Oct 23, 2025
When deferrable operators run in the triggerer's async event loop and synchronously access connections (e.g., via @cached_property), the `ExecutionAPISecretsBackend` failed silently. This occurred because `SUPERVISOR_COMMS.send()` uses `async_to_sync`, which raises `RuntimeError` when called within an existing event loop in a greenback portal context. Add specific RuntimeError handling in `ExecutionAPISecretsBackend` that detects this scenario and uses `greenback.await_()` to call the async versions (aget_connection/aget_variable) as a fallback. It was originally fixed in #55799 for 3.1.0 but #56602 introduced a bug. Ideally all providers handle this better and have better written Triggers. Example PR for Databricks: #55568 Fixes #57145 (cherry picked from commit da32b68)
Member
Ping @BasPH to rebase & resolve conflicts.
Contributor
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
Hi @BasPH, thank you for the PR, this is super useful. Could you reopen it, please?
Member
I reopened it.
Contributor
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
This is not solved, please reopen.
kosteev
pushed a commit
to GoogleCloudPlatform/composer-airflow
that referenced
this pull request
Mar 2, 2026
When deferrable operators run in the triggerer's async event loop and synchronously access connections (e.g., via @cached_property), the `ExecutionAPISecretsBackend` failed silently. This occurred because `SUPERVISOR_COMMS.send()` uses `async_to_sync`, which raises `RuntimeError` when called within an existing event loop in a greenback portal context. Add specific RuntimeError handling in `ExecutionAPISecretsBackend` that detects this scenario and uses `greenback.await_()` to call the async versions (aget_connection/aget_variable) as a fallback. It was originally fixed in apache/airflow#55799 for 3.1.0 but apache/airflow#56602 introduced a bug. Ideally all providers handle this better and have better written Triggers. Example PR for Databricks: apache/airflow#55568 Fixes apache/airflow#57145 (cherry picked from commit da32b682d1b0df5d5e2078392cf8626f8fdb00ff) GitOrigin-RevId: f969e6374daa8469938169be16a28f7c073a5ce9
Contributor
Needs rebase and resolving conflicts.
I bumped into this error when running the DatabricksSubmitRunOperator on Airflow 3.0.6 using apache-airflow-providers-databricks==7.7.1:
Searching for the key message `RuntimeError: You cannot use AsyncToSync in the same thread as an async event loop - just await the async function directly.` led me to several related issues/PRs.
I didn't test the exact version in which deferrable mode on the DatabricksSubmitRunOperator broke, but I believe it's Airflow 3.0.3.
This PR adds an async version of the `databricks_conn` method and changes all async methods to use this new `a_databricks_conn` method for fetching the connection.
Tested by fixing all tests. I don't have a real Databricks instance to test against, but I also tested this locally by monkeypatching several calls in the DatabricksHook and BaseDatabricksHook to the point where the AsyncToSync error was reached. After applying the changes from this PR, a different error was reached instead, because I don't have connectivity to a real Databricks instance.
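A minimal sketch of the pattern this description outlines: a hook keeps its sync `databricks_conn` accessor and gains an awaitable `a_databricks_conn` that async methods use instead. Everything here is illustrative and not the PR's actual code: `SketchHook`, `a_get_host`, the module-level `get_connection`/`aget_connection` stand-ins, and the caching approach are assumptions.

```python
import asyncio
from functools import cached_property
from typing import Optional


def get_connection(conn_id: str) -> dict:
    """Hypothetical blocking lookup (stand-in for a sync connection fetch)."""
    return {"conn_id": conn_id, "host": "example.invalid"}


async def aget_connection(conn_id: str) -> dict:
    """Hypothetical non-blocking lookup (stand-in for an async connection fetch)."""
    await asyncio.sleep(0)
    return {"conn_id": conn_id, "host": "example.invalid"}


class SketchHook:
    """Illustrative hook; not the real BaseDatabricksHook."""

    def __init__(self, conn_id: str = "databricks_default") -> None:
        self.conn_id = conn_id
        self._async_conn: Optional[dict] = None

    @cached_property
    def databricks_conn(self) -> dict:
        # Sync path: intended for code that is not running on an event loop.
        return get_connection(self.conn_id)

    async def a_databricks_conn(self) -> dict:
        # Async path: awaits the lookup once and caches the result.
        if self._async_conn is None:
            self._async_conn = await aget_connection(self.conn_id)
        return self._async_conn

    async def a_get_host(self) -> str:
        # Async methods use the awaitable accessor, never the sync property.
        conn = await self.a_databricks_conn()
        return conn["host"]


host = asyncio.run(SketchHook().a_get_host())
print(host)
```

Keeping both accessors lets sync operators continue to use the cached property while trigger code stays fully awaitable.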
Also: mypy was complaining about several usernames/passwords being None where a string was expected. I learned that an empty username/password is valid according to RFC 2617, so I decided to default to `""` in case it's `None`.
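As a small illustration of the `None`-to-empty-string default, the sketch below builds an HTTP Basic `Authorization` header value from optional credentials. The helper name `basic_auth_header` is hypothetical, and this is not how the Databricks provider necessarily constructs its requests.

```python
import base64
from typing import Optional


def basic_auth_header(login: Optional[str], password: Optional[str]) -> str:
    """Build an HTTP Basic Authorization header value, treating None as ""."""
    user = login or ""
    secret = password or ""
    token = base64.b64encode(f"{user}:{secret}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


header_without_password = basic_auth_header("user", None)
print(header_without_password)
```

Normalizing to `""` at the boundary keeps the downstream signature typed as `str`, which is what satisfies mypy here.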