HypDB is the first system to detect, explain, and resolve bias in decision-support OLAP queries. We show that biased queries can be perplexing and can lead to statistical anomalies such as Simpson’s paradox. We propose a novel technique for finding explanations of the bias, which helps the analyst interpret the results. We develop an automated method for rewriting a biased query into an unbiased one that correctly performs the hypothesis test the analyst had in mind. The rewritten queries compute causal effects, or the effects of hypothetical interventions. At the core of our framework lies the ability to find confounding variables. We also show that HypDB can be used to detect algorithmic unfairness post factum.
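To make the Simpson's paradox point concrete, here is a minimal sketch of how a plain group-by average and a confounder-adjusted average can disagree. It is not HypDB's implementation: the counts are synthetic, the names `ROWS`, `naive_rate`, and `adjusted_rate` are hypothetical, and the adjustment shown is the textbook standardization formula (weight each stratum's rate by that stratum's share of the data), assuming a single known confounder.

```python
from collections import defaultdict

# Synthetic aggregated rows, for illustration only:
# (treatment T, confounder Z, number of rows, rows with outcome Y = 1)
ROWS = [
    ("A", "small", 87, 81),
    ("A", "large", 263, 192),
    ("B", "small", 270, 234),
    ("B", "large", 80, 55),
]


def naive_rate(rows, t):
    """Unadjusted average of Y for treatment t, like SELECT avg(Y) ... GROUP BY T."""
    n = sum(r[2] for r in rows if r[0] == t)
    pos = sum(r[3] for r in rows if r[0] == t)
    return pos / n


def adjusted_rate(rows, t):
    """Average of Y for treatment t, standardized over the distribution of Z.

    Each stratum's rate is weighted by that stratum's share of all rows,
    rather than by its share of the rows that received treatment t.
    """
    total = sum(r[2] for r in rows)
    z_size = defaultdict(int)
    for _, z, n, _ in rows:
        z_size[z] += n
    return sum(
        (pos / n) * (z_size[z] / total)
        for tt, z, n, pos in rows
        if tt == t
    )


if __name__ == "__main__":
    for t in ("A", "B"):
        print(t, round(naive_rate(ROWS, t), 3), round(adjusted_rate(ROWS, t), 3))
```

On these synthetic counts the unadjusted comparison favors B while the adjusted comparison favors A, because A was applied mostly in the harder "large" stratum. This reversal is the kind of anomaly that a biased group-by query can produce.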
Babak Salimi, Corey Cole, Peter Li, Johannes Gehrke, Dan Suciu. HypDB: A Demonstration of Detecting, Explaining and Resolving Bias in OLAP queries. VLDB, 2018.
Babak Salimi, Johannes Gehrke, Dan Suciu. Bias in OLAP Queries: Detection, Explanation, and Removal. ACM SIGMOD, 2018.
Causal Inference in Database Systems
Jul 13, 2018