Closed
Description
Feature Request / Improvement
Currently, `overwrite` is implemented as a delete followed by an append:
iceberg-python/pyiceberg/table/__init__.py, lines 462 to 471 in e771190:

    self.delete(delete_filter=overwrite_filter, snapshot_properties=snapshot_properties)
    with self.update_snapshot(snapshot_properties=snapshot_properties).fast_append() as update_snapshot:
        # skip writing data files if the dataframe is empty
        if df.shape[0] > 0:
            data_files = _dataframe_to_data_files(
                table_metadata=self.table_metadata, write_uuid=update_snapshot.commit_uuid, df=df, io=self._table.io
            )
            for data_file in data_files:
                update_snapshot.append_data_file(data_file)
As an optimization, we can support dynamic overwrite for the case where an entire partition is replaced.
Here's an example from @koenvo:
https://gist.github.com/koenvo/e23bfab32c7e7810eb52f82c6304fc22
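
To make the idea concrete, here is a minimal pure-Python sketch of dynamic partition overwrite semantics. It does not use the pyiceberg API and is not taken from the gist above: the table is modeled as a plain list of files, and `DataFile` and `dynamic_partition_overwrite` are hypothetical names chosen for illustration. The assumption is that every partition present in the incoming rows is replaced wholesale, and every other partition is left untouched.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file: one partition value plus its rows."""

    partition: Any
    rows: tuple


def dynamic_partition_overwrite(existing, new_rows, partition_key):
    """Return the new file list after replacing only the partitions found in new_rows."""
    # Group the incoming rows by their partition value.
    by_partition = defaultdict(list)
    for row in new_rows:
        by_partition[row[partition_key]].append(row)

    # Keep every existing file whose partition is not being rewritten.
    kept = [f for f in existing if f.partition not in by_partition]

    # Write one new file per touched partition (a simplification for this sketch).
    added = [DataFile(partition=p, rows=tuple(rows)) for p, rows in by_partition.items()]
    return kept + added


existing = [
    DataFile("2024-01-01", ({"day": "2024-01-01", "v": 1},)),
    DataFile("2024-01-02", ({"day": "2024-01-02", "v": 2},)),
]
new_rows = [
    {"day": "2024-01-02", "v": 20},
    {"day": "2024-01-03", "v": 30},
]
result = dynamic_partition_overwrite(existing, new_rows, "day")
```

In this sketch the `2024-01-01` file survives unchanged, the `2024-01-02` file is replaced, and `2024-01-03` is added. No delete filter has to be supplied by the caller, because the set of partitions to drop is derived from the data being written. An empty input touches no partitions and so leaves the table as it was.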