audit: Trino driver for Presto SQL divergences #36

@dtsong

Problem

data_diff/databases/trino.py is a 49-line stub that inherits everything from Presto. It overrides only normalize_timestamp and normalize_uuid. Trino and Presto have diverged significantly:

  • Type precision semantics
  • Timestamp semantics (instant vs. wall-clock)
  • Function naming and behavior
  • Connection parameter handling

Trino's __init__ calls super().__init__(), which resolves to Presto's initializer, so any Presto-specific logic silently flows into Trino. The same applies to every method Trino does not override.
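To make the inheritance surface concrete, here is a minimal sketch of how the audit could enumerate which methods Trino overrides and which it takes unchanged from Presto. The `Presto` and `Trino` classes below are simplified stand-ins, not the real data_diff classes, and `normalize_number` plus the helper `split_methods` are hypothetical names used only for illustration.

```python
import inspect


class Presto:
    """Stand-in for the Presto driver (not the real data_diff class)."""

    def __init__(self):
        self.dialect = "presto"

    def normalize_timestamp(self, value: str) -> str:
        return f"presto_ts({value})"

    def normalize_uuid(self, value: str) -> str:
        return f"presto_uuid({value})"

    def normalize_number(self, value: str) -> str:
        # Hypothetical extra method, present only to show an inherited case.
        return f"presto_num({value})"


class Trino(Presto):
    """Stand-in mirroring the stub: only two overrides."""

    def __init__(self):
        super().__init__()  # runs Presto's initializer

    def normalize_timestamp(self, value: str) -> str:
        return f"trino_ts({value})"

    def normalize_uuid(self, value: str) -> str:
        return f"trino_uuid({value})"


def split_methods(child: type, parent: type):
    """Return (overridden, inherited) public method names, sorted by name."""
    overridden, inherited = [], []
    for name, _ in inspect.getmembers(parent, inspect.isfunction):
        if name.startswith("__"):
            continue  # skip dunder methods such as __init__
        # vars(child) holds only what the child class itself defines.
        (overridden if name in vars(child) else inherited).append(name)
    return overridden, inherited


overridden, inherited = split_methods(Trino, Presto)
print("overridden:", overridden)
print("inherited: ", inherited)
```

Run against the real classes, the `inherited` list would be the starting checklist for the audit: each name on it is a method whose SQL comes from Presto without any Trino-specific review.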

Scope

This is an audit, not a rewrite:

  1. Document which inherited Presto methods produce different SQL than Trino expects
  2. Identify any methods that would produce silently wrong results
  3. Add targeted overrides for critical divergences
  4. File follow-up issues for anything requiring deeper work
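The four steps above could be tracked with a small record per inherited method. This is only a sketch of one possible bookkeeping scheme: the `Finding` type, its fields, and the `triage` function are assumptions for illustration, and the method names in the example are placeholders, not actual audit results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One audited Presto method that Trino inherits (hypothetical schema)."""

    method: str
    silently_wrong: bool      # step 2: would produce wrong results without an error
    needs_deeper_work: bool   # step 4: too large for a targeted override


def triage(finding: Finding) -> str:
    """Map a finding to the action described in the scope list."""
    if finding.needs_deeper_work:
        return "follow-up issue"
    if finding.silently_wrong:
        return "targeted override"
    return "document only"


# Placeholder entries; real entries would come from the audit itself.
findings = [
    Finding("placeholder_method_a", silently_wrong=True, needs_deeper_work=False),
    Finding("placeholder_method_b", silently_wrong=True, needs_deeper_work=True),
    Finding("placeholder_method_c", silently_wrong=False, needs_deeper_work=False),
]

for f in findings:
    print(f"{f.method}: {triage(f)}")
```

Checking `needs_deeper_work` first is a deliberate choice in this sketch: a divergence that cannot be fixed with a small override is routed to a follow-up issue even if it is also silently wrong, which keeps the audit from turning into a rewrite.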

Key Files

  • data_diff/databases/trino.py
  • data_diff/databases/presto.py

Acceptance Criteria

  • Audit document or issue comments listing all divergence points
  • Critical divergences fixed with targeted overrides
  • Known limitations documented in code comments
