In our last article, we discussed how to determine how many people drink pumpkin spice lattes in a given time period without learning their identifying information. But suppose you would like to know the total amount spent on pumpkin spice lattes this year, or the average price of a pumpkin spice latte since 2010. You'd like to detect these trends in the data while protecting customers' privacy, that is, without being able to learn identifying information about any specific customer. To do this, you can use summation and average queries answered with differential privacy.
In this article, we will move beyond counting queries and dive into answering summation and average queries with differential privacy. Starting with the basics: In SQL, summation and average queries are specified using the SUM and AVG aggregation functions:
SELECT SUM(price) FROM PumpkinSpiceLatteSales WHERE year = 2020
SELECT AVG(price) FROM PumpkinSpiceLatteSales WHERE year > 2010
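To give a feel for what a differentially private answer to queries like these could look like, here is a minimal Python sketch of one common approach: clip each price to a fixed range, then add Laplace noise scaled to that range. This is an illustration under stated assumptions, not necessarily the method this series goes on to describe. The function names (`dp_sum`, `dp_count`, `dp_avg`), the clipping bounds, the value of epsilon, and the sample prices are all hypothetical, and the sketch assumes that neighboring datasets differ by adding or removing a single sale.

```python
import random


def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def clip(values, lower, upper):
    """Force every value into [lower, upper] so one row has bounded influence."""
    return [min(max(v, lower), upper) for v in values]


def dp_sum(values, lower, upper, epsilon, rng=random):
    """Noisy sum. Assumption: adding or removing one clipped row changes
    the sum by at most max(|lower|, |upper|), which is the sensitivity."""
    sensitivity = max(abs(lower), abs(upper))
    return sum(clip(values, lower, upper)) + laplace_noise(sensitivity / epsilon, rng)


def dp_count(values, epsilon, rng=random):
    """Noisy count. Adding or removing one row changes the count by 1."""
    return len(values) + laplace_noise(1.0 / epsilon, rng)


def dp_avg(values, lower, upper, epsilon, rng=random):
    """Noisy average as noisy sum / noisy count, splitting epsilon evenly."""
    noisy_sum = dp_sum(values, lower, upper, epsilon / 2, rng)
    noisy_count = dp_count(values, epsilon / 2, rng)
    # Guard against a tiny or negative noisy count on very small inputs.
    return noisy_sum / max(noisy_count, 1.0)


if __name__ == "__main__":
    # Hypothetical latte prices; the bounds 0 and 10 are an assumed price range.
    prices = [4.25, 4.95, 5.45, 3.75, 6.10]
    print("noisy total:", dp_sum(prices, 0.0, 10.0, epsilon=1.0))
    print("noisy average:", dp_avg(prices, 0.0, 10.0, epsilon=1.0))
```

The clipping step is what makes the noise scale finite: without an upper bound on a single price, one sale could change the sum by an arbitrary amount. The average here spends half of the privacy budget on the sum and half on the count, which is one simple design choice among several.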
…