
Update TPCDS data #25

Merged
comphead merged 1 commit into apache:main from comphead:main
Dec 9, 2025


comphead commented Dec 9, 2025

Update the TPCDS data so that more queries return data.

This reduces the number of queries that return 0 rows from 36 to 18.

@mbutrovich

What's different from the existing files?


comphead commented Dec 9, 2025

The TPCDS data was regenerated.

Some queries still return no rows because the query generator works from query templates like q1 below and substitutes random values at generation time, completely independently of the data. I believe we need to adjust the filters in the existing queries so that they return results.

define COUNTY = random(1, rowcount("active_counties", "store"), uniform);
define STATE = distmember(fips_county, [COUNTY], 3); 
define YEAR = random(1998, 2002, uniform);
define AGG_FIELD = text({"SR_RETURN_AMT",1},{"SR_FEE",1},{"SR_REFUNDED_CASH",1},{"SR_RETURN_AMT_INC_TAX",1},{"SR_REVERSED_CHARGE",1},{"SR_STORE_CREDIT",1},{"SR_RETURN_TAX",1});
define _LIMIT=100;

with customer_total_return as
(select sr_customer_sk as ctr_customer_sk
,sr_store_sk as ctr_store_sk
,sum([AGG_FIELD]) as ctr_total_return
from store_returns
,date_dim
where sr_returned_date_sk = d_date_sk
and d_year =[YEAR]
group by sr_customer_sk
,sr_store_sk)
[_LIMITA] select [_LIMITB] c_customer_id
from customer_total_return ctr1
,store
,customer
where ctr1.ctr_total_return > (select avg(ctr_total_return)*1.2
from customer_total_return ctr2
where ctr1.ctr_store_sk = ctr2.ctr_store_sk)
and s_store_sk = ctr1.ctr_store_sk
and s_state = '[STATE]'
and ctr1.ctr_customer_sk = c_customer_sk
order by c_customer_id
[_LIMITC];
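
To illustrate why a substituted value can miss the data entirely, here is a minimal sketch. It assumes a simplified one-filter fragment of q1, a hypothetical toy `store` table in an in-memory SQLite database, and a hypothetical `render` helper; none of these are part of the TPC-DS tooling, and the real query generator has its own substitution logic.

```python
import re
import sqlite3

# Simplified fragment of the q1 template: only the s_state filter is kept.
TEMPLATE = "select count(*) from store where s_state = '[STATE]'"


def render(template, values):
    """Replace each [NAME] placeholder with the matching entry in `values`."""
    return re.sub(r"\[(\w+)\]", lambda m: str(values[m.group(1)]), template)


def count_rows(conn, sql):
    """Run a count(*) query and return the single integer result."""
    return conn.execute(sql).fetchone()[0]


# Hypothetical toy data standing in for the generated `store` table.
conn = sqlite3.connect(":memory:")
conn.execute("create table store (s_store_sk integer, s_state text)")
conn.executemany(
    "insert into store values (?, ?)",
    [(1, "TN"), (2, "TN"), (3, "SD")],
)

# Data-independent choice: the value is picked without looking at the table,
# so it may not occur in the data at all.
blind_sql = render(TEMPLATE, {"STATE": "NM"})

# Data-aware choice: pick a value that is actually present in the table.
present_states = [
    row[0]
    for row in conn.execute("select distinct s_state from store order by s_state")
]
aware_sql = render(TEMPLATE, {"STATE": present_states[0]})

print(blind_sql, "->", count_rows(conn, blind_sql))
print(aware_sql, "->", count_rows(conn, aware_sql))
```

In this toy setup the first query matches no rows and the second matches at least one, which is the kind of filter adjustment suggested above.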


mbutrovich left a comment


Thanks @comphead!
