25 changes: 25 additions & 0 deletions modules/manage/pages/iceberg/query-iceberg-topics.adoc
@@ -70,6 +70,31 @@ EOF

endif::[]

=== Grant access to query engine users

Redpanda manages the service-to-service permissions between Redpanda and the catalog (documented in each catalog integration guide). However, you are responsible for granting your end users and query engines (such as Amazon Athena, Apache Spark, Trino, or Snowflake) read access to the Iceberg data.

There are two approaches to controlling access, and you can use them together:

==== Cloud storage prefix-level access

Grant query engine roles or users read access to the Iceberg data prefix in the cluster's storage bucket. This controls who can read the underlying data and metadata files. Scope permissions to specific prefixes to restrict access to individual tables.

* AWS (S3): Use IAM policies to grant `s3:GetObject` on objects under the Iceberg prefix (for example, `<cluster-storage-bucket-name>/redpanda-iceberg-catalog/*`) and `s3:ListBucket` on the bucket, optionally restricted to that prefix with an `s3:prefix` condition. See https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-iam-policies.html[Using IAM policies with Amazon S3^].
* GCP (GCS): Use IAM conditions or bucket-level policies to grant `storage.objects.get` and `storage.objects.list` on the Iceberg prefix. See https://cloud.google.com/storage/docs/access-control/iam[GCS IAM permissions^].
* Azure (Blob Storage): Use Azure RBAC roles such as Storage Blob Data Reader scoped to the container or prefix. See https://learn.microsoft.com/en-us/azure/storage/blobs/authorize-access-azure-active-directory[Authorize access to blob data^].
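
For the AWS case, the prefix-level read access described above could be sketched as a policy document like the one built below. This is a minimal illustration, not a policy taken from the Redpanda documentation: the function name `build_iceberg_read_policy`, the statement IDs, and the bucket name `example-cluster-bucket` are hypothetical, and the prefix `redpanda-iceberg-catalog` is borrowed from the example path in the bullet above.

```python
import json


def build_iceberg_read_policy(bucket: str, prefix: str) -> dict:
    """Build a read-only IAM policy document for one S3 prefix.

    Hypothetical helper for illustration; bucket and prefix are assumptions
    supplied by the caller.
    """
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level action: applies to object ARNs under the prefix.
                "Sid": "ReadIcebergObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            },
            {
                # Bucket-level action: applies to the bucket ARN, narrowed
                # to the prefix with the s3:prefix condition key.
                "Sid": "ListIcebergPrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


if __name__ == "__main__":
    policy = build_iceberg_read_policy(
        "example-cluster-bucket", "redpanda-iceberg-catalog"
    )
    print(json.dumps(policy, indent=2))
```

Passing a longer prefix that ends at a single table's directory would narrow the same policy to that table, assuming the table's files all live under that path.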

==== Catalog-level table access

If you use a REST catalog, you can control access at the table level through the catalog's own access control layer. This is useful when query engines access tables through the catalog rather than reading files directly.

* AWS Glue: Use https://docs.aws.amazon.com/lake-formation/latest/dg/what-is-lake-formation.html[AWS Lake Formation^] to grant table-level and column-level permissions.
* Databricks Unity Catalog: See the https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/index.html[Unity Catalog privileges documentation^].
* Snowflake Open Catalog: See https://other-docs.snowflake.com/en/opencatalog/access-control[Open Catalog access control^].
* GCP BigLake: See https://cloud.google.com/bigquery/docs/manage-open-source-metadata#grant_permissions[BigLake table permissions^].
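
As one illustration of catalog-level access, a table-level read grant for the AWS Glue case might be expressed as a Lake Formation request like the one below. This is a sketch under assumptions: the helper `build_select_grant`, the role ARN, and the database and table names are hypothetical, and `SELECT` alone may not be sufficient for every query engine.

```python
def build_select_grant(principal_arn: str, database: str, table: str) -> dict:
    """Build keyword arguments for a Lake Formation table-level SELECT grant.

    Hypothetical helper for illustration; all three arguments are
    placeholders supplied by the caller.
    """
    return {
        # The IAM role or user that the query engine runs as.
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        # The Glue database and table that back the Iceberg topic.
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        # Read-only access to the table's data.
        "Permissions": ["SELECT"],
    }


if __name__ == "__main__":
    request = build_select_grant(
        "arn:aws:iam::123456789012:role/example-query-role",
        "example_database",
        "example_topic",
    )
    print(request)
```

With boto3 installed and credentials configured, a dictionary of this shape could be passed as `boto3.client("lakeformation").grant_permissions(**request)`.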
Comment on lines +91 to +94

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

`Are the following catalog provider documentation links valid and accessible?

  1. AWS Lake Formation: https://docs.aws.amazon.com/lake-formation/latest/dg/what-is-lake-formation.html
  2. Databricks Unity Catalog privileges: https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/index.html
  3. Snowflake Open Catalog access control: https://other-docs.snowflake.com/en/opencatalog/access-control
  4. GCP BigLake permissions: https://cloud.google.com/bigquery/docs/manage-open-source-metadata#grant_permissions`

💡 Result:

  1. AWS Lake Formation: Valid and accessible.
  2. Databricks Unity Catalog privileges: Valid and accessible (note: exact path is /aws/en/data-governance/unity-catalog/manage-privileges/, but content matches).
  3. Snowflake Open Catalog access control: Valid and accessible.
  4. GCP BigLake permissions: Valid and accessible (note: full page is https://cloud.google.com/bigquery/docs/manage-open-source-metadata, which covers permissions including grant_permissions section).


Fix incorrect Snowflake Open Catalog documentation link.

The Snowflake Open Catalog documentation link uses the incorrect domain other-docs.snowflake.com. The correct URL is: https://docs.snowflake.com/en/user-guide/opencatalog/access-control

The other three catalog documentation links (AWS Lake Formation, Databricks Unity Catalog, and GCP BigLake) are valid and accessible.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/manage/pages/iceberg/query-iceberg-topics.adoc` around lines 91 - 94,
Replace the incorrect Snowflake Open Catalog link used in the "Snowflake Open
Catalog: See https://other-docs.snowflake.com/en/opencatalog/access-control[Open
Catalog access control^]." bullet by updating the URL to the correct one
(https://docs.snowflake.com/en/user-guide/opencatalog/access-control) so the
"Snowflake Open Catalog" link points to the proper documentation; locate the
string "Snowflake Open Catalog" or the existing incorrect URL and substitute it
with the corrected URL.


=== Refresh table data

Some query engines may require you to manually refresh the Iceberg table snapshot (for example, by running a command like `ALTER TABLE <table-name> REFRESH;`) to see the latest data.

If your engine needs the full JSON metadata path, use the following: