Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: weightedMean aggregation of percent change in DataTables doesn't handle nulls and zeros correctly #2852

Open
nickfanion opened this issue Nov 25, 2024 · 1 comment
Labels
bug Something isn't working to-review Evidence team to review

Comments

@nickfanion
Copy link
Contributor

nickfanion commented Nov 25, 2024

Describe the bug

When using totalAgg=weightedMean in DataTables to aggregate percent change, null values and zeroes in the weightCol cause the entire row to be ignored, which leads to incorrect aggregates.

Steps to Reproduce

```sql example_data
select 'Peanut Butter' AS category, 'Chunky Peanut Butter' AS item, 6678 AS sales, 1189 AS sales_ya
union
select 'Peanut Butter', 'Smooth Peanut Butter', 7155, 7651
union
select 'Peanut Butter', 'Powder Peanut Butter', 522, 9928
union
select 'Salad Dressing', 'Thousand Island Dressing', 3210, 0
union
select 'Salad Dressing', 'Ranch Dressing', 9493, 7557
union
select 'Salad Dressing', 'Italian Dressing', 9212, 8051
union
select 'Salad Dressing', 'Caeser Dressing', 2489, 9643
union
select 'Plant-Based Milk', 'Almond Milk', 2867, 5212
union
select 'Plant-Based Milk', 'Oat Milk', 7982, 2430
union
select 'Plant-Based Milk', 'Soy Milk', 9069, 5872
```

```sql table_data
select
  category,
  item,
  sum(sales) as sales,
  sum(sales_ya) as sales_ya,
  ((sum(sales) - sum(sales_ya)) / sum(sales_ya)) as sales_yoy
from ${example_data}
group by all
order by sales DESC NULLS LAST
```

```sql sales_yoy_agg
select
  ((sum(sales) - sum(sales_ya)) / sum(sales_ya)) as sales_yoy
from ${table_data}
group by all
```

<DataTable data={table_data} groupBy=category subtotals=true totalRow=true>
<Column id=item />
<Column id=sales fmt='$#,##0' />
<Column id=sales_ya fmt='$#,##0' />
<Column id=sales_yoy totalFmt='pct1' contentType=delta fmt='pct1' totalAgg=weightedMean weightCol=sales_ya />
<Column id=sales_yoy totalFmt='pct1' contentType=delta fmt='pct1' totalAgg={sales_yoy_agg[0].sales_yoy} />
</DataTable>

The 'Thousand Island Dressing' row has zero sales_ya. Because it's zero, the weightedMean aggregation ignores the row, despite there being being a value in sales.

Logs

N/A

System Info

N/A

Severity

serious, but I can work around it

Additional Information, or Workarounds

This may be an inherent limitation to weightedMean, but percent change is a common calculation, so I think there should be a way of aggregating it in all cases. I can use a custom aggregation value as a workaround, but then that breaks groupBy aggregates in a DataTable.

Another workaround is to create a weightCol with the zeros set to a very small number like 0.001.

@nickfanion nickfanion added bug Something isn't working to-review Evidence team to review labels Nov 25, 2024
@nickfanion
Copy link
Contributor Author

Seems to be the reverse issue of #2081 but I understand why handling of zeros/nulls was changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working to-review Evidence team to review
Projects
None yet
Development

No branches or pull requests

1 participant