The environment variable picked up by Iceberg starts with `PYICEBERG_` and then follows the yaml structure below, where a double underscore `__` represents a nested field, and the underscore `_` is converted into a dash `-`.
48
+
49
+
For example, `PYICEBERG_CATALOG__DEFAULT__S3__ACCESS_KEY_ID` sets `s3.access-key-id` on the `default` catalog.
50
+
27
51
## Tables
28
52
29
53
Iceberg tables support table properties to configure table behavior.
@@ -36,7 +60,7 @@ Iceberg tables support table properties to configure table behavior.
36
60
|`write.parquet.compression-level`| Integer | null | Parquet compression level for the codec. If not set, the level is left to PyIceberg |
37
61
|`write.parquet.row-group-limit`| Number of rows | 1048576 | The upper bound of the number of entries within a single row group |
38
62
|`write.parquet.page-size-bytes`| Size in bytes | 1MB | Set a target threshold for the approximate encoded size of data pages within a column chunk |
39
-
|`write.parquet.page-row-limit`| Number of rows | 20000 | Set a target threshold for the approximate encoded size of data pages within a column chunk |
63
+
|`write.parquet.page-row-limit`| Number of rows | 20000 | Set a target threshold for the maximum number of rows within a column chunk |
40
64
|`write.parquet.dict-size-bytes`| Size in bytes | 2MB | Set the dictionary page size limit per row group |
41
65
|`write.metadata.previous-versions-max`| Integer | 100 | The max number of previous version metadata files to keep before deleting after commit. |
42
66
@@ -161,26 +185,6 @@ Alternatively, you can also directly set the catalog implementation:
161
185
| type | rest | Type of catalog, one of `rest`, `sql`, `hive`, `glue`, `dynamodb`. Defaults to `rest` |
162
186
| py-catalog-impl | mypackage.mymodule.MyCatalog | Sets the catalog explicitly to an implementation, and will fail explicitly if it can't be loaded |
163
187
164
-
There are three ways to pass in configuration:
165
-
166
-
- Using the `~/.pyiceberg.yaml` configuration file
167
-
- Through environment variables
168
-
- By passing in credentials through the CLI or the Python API
169
-
170
-
The configuration file is recommended since that's the easiest way to manage the credentials.
The environment variable picked up by Iceberg starts with `PYICEBERG_` and then follows the yaml structure below, where a double underscore `__` represents a nested field, and the underscore `_` is converted into a dash `-`.
181
-
182
-
For example, `PYICEBERG_CATALOG__DEFAULT__S3__ACCESS_KEY_ID` sets `s3.access-key-id` on the `default` catalog.
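
The naming rule that this diff moves to the top of the page can be illustrated with a small Python sketch. It is not taken from PyIceberg's source: the helper `env_to_config` is hypothetical, the value `admin` is a placeholder, and the choice to split into at most three parts (section, catalog name, property key) and turn any remaining `__` into a `.` is an assumption inferred from the `s3.access-key-id` example.

```python
from typing import Any, Dict, Mapping

PREFIX = "PYICEBERG_"


def env_to_config(environ: Mapping[str, str]) -> Dict[str, Any]:
    """Hypothetical helper: build a nested config dict from PYICEBERG_ variables."""
    config: Dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue  # ignore unrelated environment variables
        # Assumption: at most three parts (section, catalog name, property key).
        parts = name[len(PREFIX):].lower().split("__", 2)
        # Within a part, a leftover "__" becomes "." and "_" becomes "-".
        path = [part.replace("__", ".").replace("_", "-") for part in parts]
        node = config
        for element in path[:-1]:
            node = node.setdefault(element, {})
        node[path[-1]] = value
    return config


# Placeholder credential value, for illustration only.
example = env_to_config(
    {"PYICEBERG_CATALOG__DEFAULT__S3__ACCESS_KEY_ID": "admin", "HOME": "/root"}
)
print(example)
```

Under these assumptions, the variable from the docs ends up as the key `s3.access-key-id` under `catalog` then `default`, and variables without the `PYICEBERG_` prefix are skipped.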