dwhsupport

dwhsupport is a Go library that provides standardized interfaces and models for interacting with different data warehouses (DWH). It hides the differences between warehouse systems behind a unified interface for common operations.

Core Components

Scrapper Interface

The Scrapper interface provides methods for:

  • Validating configurations
  • Querying catalog information
  • Fetching table metrics
  • Retrieving SQL definitions
  • Managing database connections
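
To illustrate why a unified interface is useful, here is a minimal sketch of application code that depends only on an interface and is exercised with an in-memory fake. Everything in it is an assumption made for illustration: `TableLister`, `TableRow`, `fakeWarehouse`, `countTables`, and `demoCount` are hypothetical names, and the real Scrapper interface has more methods and may use different signatures.

```go
package main

import (
	"context"
	"fmt"
)

// TableRow is a simplified, hypothetical stand-in for the library's TableRow model.
type TableRow struct {
	Database, Schema, Table string
}

// TableLister is a hypothetical, reduced interface used only for this sketch.
// It is not the library's Scrapper interface.
type TableLister interface {
	QueryTables(ctx context.Context) ([]TableRow, error)
	Close() error
}

// fakeWarehouse is an in-memory implementation, handy for tests.
type fakeWarehouse struct {
	tables []TableRow
}

func (f *fakeWarehouse) QueryTables(ctx context.Context) ([]TableRow, error) {
	// Honor cancellation the way a real driver call would.
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	return f.tables, nil
}

func (f *fakeWarehouse) Close() error { return nil }

// countTables depends only on the interface, so it works with any warehouse.
func countTables(ctx context.Context, w TableLister) (int, error) {
	rows, err := w.QueryTables(ctx)
	if err != nil {
		return 0, fmt.Errorf("query tables: %w", err)
	}
	return len(rows), nil
}

// demoCount runs countTables against the in-memory fake.
func demoCount() (int, error) {
	w := &fakeWarehouse{tables: []TableRow{
		{"prod", "public", "orders"},
		{"prod", "public", "users"},
	}}
	defer w.Close()
	return countTables(context.Background(), w)
}

func main() {
	n, err := demoCount()
	fmt.Println(n, err)
}
```

Because `countTables` accepts the interface rather than a concrete type, swapping the fake for a warehouse-specific implementation requires no change to the calling code.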

Data Models

The library includes comprehensive models for:

  • TableMetricsRow: Table statistics and metadata
  • CatalogColumnRow: Detailed column information
  • TableRow: Table structure and properties
  • SqlDefinitionRow: SQL definitions for views/tables
  • DatabaseRow: Database-level metadata

Scope Filtering

The scrapper/scope package provides include/exclude filtering at database, schema, and table levels with glob pattern support:

// Define a scope filter
filter := &scope.ScopeFilter{
    Include: []scope.ScopeRule{
        {Database: "prod_*", Schema: "public"},
    },
    Exclude: []scope.ScopeRule{
        {Database: "prod_staging"},
    },
}

// Option 1: Wrap a scrapper with ScopedScrapper for automatic filtering
scoped := scope.NewScopedScrapper(inner, filter)
tables, err := scoped.QueryTables(ctx) // automatically filtered

// Option 2: Inject scope via context for SQL push-down
ctx = scope.WithScope(ctx, filter)
tables, err = inner.QueryTables(ctx) // filter conditions pushed into SQL

ScopedScrapper applies filtering at two levels:

  • SQL push-down — scope conditions are injected directly into warehouse queries for efficiency
  • Post-filtering — results are filtered in-memory to guarantee compliance
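
The in-memory post-filtering step can be illustrated with a small, self-contained sketch. It does not use the library's code: `Rule`, `Filter`, `globMatch`, and `Accepts` are hypothetical names. The sketch assumes that an empty field matches anything, that exclude rules take precedence over include rules, that an empty include list accepts everything, and that globs behave like Go's `path.Match` (case-sensitive). The real scope package may differ on any of these points.

```go
package main

import (
	"fmt"
	"path"
)

// Rule is a hypothetical stand-in for scope.ScopeRule.
// An empty field is assumed to match any value.
type Rule struct {
	Database, Schema, Table string
}

// Filter is a hypothetical stand-in for scope.ScopeFilter.
type Filter struct {
	Include []Rule
	Exclude []Rule
}

// globMatch reports whether value matches pattern.
// An empty pattern matches everything; a malformed pattern matches nothing.
func globMatch(pattern, value string) bool {
	if pattern == "" {
		return true
	}
	ok, err := path.Match(pattern, value)
	return err == nil && ok
}

// matches reports whether all three levels of the rule match.
func (r Rule) matches(db, schema, table string) bool {
	return globMatch(r.Database, db) &&
		globMatch(r.Schema, schema) &&
		globMatch(r.Table, table)
}

// Accepts returns true when no exclude rule matches and either the include
// list is empty or at least one include rule matches.
func (f Filter) Accepts(db, schema, table string) bool {
	for _, r := range f.Exclude {
		if r.matches(db, schema, table) {
			return false
		}
	}
	if len(f.Include) == 0 {
		return true
	}
	for _, r := range f.Include {
		if r.matches(db, schema, table) {
			return true
		}
	}
	return false
}

func main() {
	f := Filter{
		Include: []Rule{{Database: "prod_*", Schema: "public"}},
		Exclude: []Rule{{Database: "prod_staging"}},
	}
	fmt.Println(f.Accepts("prod_eu", "public", "orders"))      // true
	fmt.Println(f.Accepts("prod_staging", "public", "orders")) // false: excluded
	fmt.Println(f.Accepts("dev", "public", "orders"))          // false: not included
}
```

Checking exclude rules first keeps the precedence explicit: a database such as `prod_staging` matches the `prod_*` include pattern but is still rejected.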

Query Stats

The exec/querystats package provides query execution statistics via context:

ctx = querystats.WithCallback(ctx, func(stats querystats.QueryStats) {
    log.Printf("Query %s: %d rows read, %d bytes, %v",
        stats.QueryID, stats.RowsRead, stats.BytesRead, stats.Duration)
})

// All queries executed with this context will report stats via the callback
tables, err := scrapper.QueryTables(ctx)
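
The general pattern of carrying a callback in a `context.Context` can be sketched as follows. This is not the querystats implementation: `Stats`, `withCallback`, `report`, `runQuery`, and `demo` are hypothetical names, and the sketch only assumes the standard library's `context.WithValue` with an unexported key type.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Stats is a hypothetical stand-in for querystats.QueryStats.
type Stats struct {
	QueryID  string
	RowsRead int64
	Duration time.Duration
}

// callbackKey is unexported so other packages cannot collide with it.
type callbackKey struct{}

// withCallback returns a child context that carries fn.
func withCallback(ctx context.Context, fn func(Stats)) context.Context {
	return context.WithValue(ctx, callbackKey{}, fn)
}

// report invokes the callback stored in ctx, if there is one.
func report(ctx context.Context, s Stats) {
	if fn, ok := ctx.Value(callbackKey{}).(func(Stats)); ok && fn != nil {
		fn(s)
	}
}

// runQuery simulates executing a query and reporting its statistics.
func runQuery(ctx context.Context, id string, rows int64) {
	start := time.Now()
	// A real implementation would execute the query here.
	report(ctx, Stats{QueryID: id, RowsRead: rows, Duration: time.Since(start)})
}

// demo runs one query with a callback and one without, and returns
// the statistics that the callback collected.
func demo() []Stats {
	var seen []Stats
	ctx := withCallback(context.Background(), func(s Stats) {
		seen = append(seen, s)
	})
	runQuery(ctx, "q1", 42)
	runQuery(context.Background(), "q2", 7) // no callback registered, nothing reported
	return seen
}

func main() {
	for _, s := range demo() {
		fmt.Printf("query %s: %d rows read\n", s.QueryID, s.RowsRead)
	}
}
```

Attaching the callback to the context means callers opt in per request, and code paths that do not care about statistics need no changes.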

Query Processing

The QueryMany type provides flexible options for:

  • Batch processing of query results
  • Post-processing of rows
  • Custom argument handling
  • Configurable batch sizes
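
As a rough illustration of batch processing with a per-row post-processing step, the sketch below groups rows into fixed-size batches. It is not the QueryMany API: `processInBatches` and its parameters are hypothetical, and it works on an in-memory slice rather than on a live query result. It requires Go 1.18 or later for generics.

```go
package main

import "fmt"

// processInBatches applies post (if non-nil) to every row, groups the rows
// into batches of at most batchSize, and passes each batch to handle.
// It stops at the first error returned by handle.
func processInBatches[T any](rows []T, batchSize int, post func(T) T, handle func([]T) error) error {
	if batchSize <= 0 {
		return fmt.Errorf("batch size must be positive, got %d", batchSize)
	}
	batch := make([]T, 0, batchSize)
	for _, row := range rows {
		if post != nil {
			row = post(row)
		}
		batch = append(batch, row)
		if len(batch) == batchSize {
			if err := handle(batch); err != nil {
				return err
			}
			// Allocate a fresh slice so handle may safely retain the previous batch.
			batch = make([]T, 0, batchSize)
		}
	}
	// Flush the final, possibly smaller, batch.
	if len(batch) > 0 {
		return handle(batch)
	}
	return nil
}

func main() {
	rows := []int{1, 2, 3, 4, 5}
	err := processInBatches(rows, 2,
		func(v int) int { return v * 10 }, // post-process each row
		func(b []int) error {
			fmt.Println(b)
			return nil
		})
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

With five rows and a batch size of two, the handler is called three times, and the last call receives the single leftover row.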

Usage

scrapper := NewYourDwhScrapper(config)
defer scrapper.Close()

// Query table metrics
metrics, err := scrapper.QueryTableMetrics(ctx, lastFetchTime)
if err != nil {
    return err
}

// Get catalog information
catalog, err := scrapper.QueryCatalog(ctx)
if err != nil {
    return err
}

// Query SQL definitions
definitions, err := scrapper.QuerySqlDefinitions(ctx)
if err != nil {
    return err
}

// Scoped queries: only return results matching the filter
filter := &scope.ScopeFilter{
    Include: []scope.ScopeRule{{Schema: "analytics"}},
}
scoped := scope.NewScopedScrapper(scrapper, filter)
tables, err := scoped.QueryTables(ctx)
if err != nil {
    return err
}

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

Copyright 2024 SYNQ

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Support

For support, please open an issue in the GitHub repository.
