Private measurement of single events #41
Thanks for filing this issue, Charlie, and I'm glad to see it's going to be a discussion topic at our next meeting. I guess the big question is: do we believe a "Private Measurement API" must necessarily include aggregation?
I would really love to hear what Mozilla, Brave, and WebKit think about this topic. CC: @martinthomson, @ShivanKaul, @johnwilander. If we think the answer is "Yes, a private measurement API must necessarily include aggregation", we should discuss what mechanisms we could use to try to enforce that.
Those are all the ideas that immediately come to mind, but I might be missing something. Happy to hear other ideas from other people!
CC also Luke Winstrom @winstrom, as you shared a lot of perspectives from Apple at the last PATCG F2F.
@benjaminsavage thanks for that breakdown of enforcement mechanisms. I do want to emphasize that all of these mitigations (1-3) are just that: mitigations. They can all be broken by a sophisticated enough attacker, especially one interested in targeting a small population of people.

I also want to mention that while (2) and (3) are "easy" to break if the adversary can generate fake records, as you mentioned, they can still be broken even if the adversary does not have that capability and is truly restricted to aggregating honestly generated events. E.g. for (2), if the system supports overlapping queries, issuing one query over a set of sources and a second query over the same set minus one person lets the querier difference the two results and recover that person's contribution (up to noise).

In summary, this boundary is extremely difficult to enforce rigorously. I am nervous that attempting to do so will result in less useful measurement capabilities (e.g. disallowing overlapping queries), or in making overly ambitious assumptions about the attacker's capabilities (e.g. fake record generation, auxiliary information, etc.).
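To make the overlapping-query concern concrete, here is a minimal sketch of a differencing attack (hypothetical names and parameters, not any particular proposal's API): two aggregate queries whose source sets differ by a single person can be subtracted to estimate that person's value, and repeating the pair of queries averages the per-query noise away.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-user conversion values held by an honest measurement system.
values = {f"user{i}": float(rng.integers(0, 2)) for i in range(1000)}
target = "user0"

def noisy_sum(user_ids, epsilon=1.0, sensitivity=1.0):
    """An aggregate query answered with Laplace noise (stand-in for the system's DP output)."""
    total = sum(values[u] for u in user_ids)
    return total + rng.laplace(scale=sensitivity / epsilon)

everyone = list(values)
all_but_target = [u for u in everyone if u != target]

# Difference two overlapping queries; repeating the pair washes out the noise.
estimates = [noisy_sum(everyone) - noisy_sum(all_but_target) for _ in range(500)]
print(f"true value for {target}: {values[target]}")
print(f"differencing estimate:   {np.mean(estimates):.2f}")
```

This is only illustrative, but it shows why per-query noise alone is not enough: without per-person budget accounting across all overlapping queries that include them, repeated differencing like this recovers an individual's value.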
Another mitigation, similar to (2) and (3), which creates an economic disincentive against extracting individual-level outputs but doesn't require complicated checks in the MPC, is:
FYI, I have uploaded a PR with some text that I think captures what was discussed in the May meetings, but we are looking for feedback on it and will not merge it for some time (see @AramZS for exact details on how long we'll wait for consensus).
I'm posting an issue based on a discussion that happened in patcg-individual-drafts/ipa#60. In that issue, I proposed a new privacy mechanism satisfying differential privacy that works best in the model where individual events are queried, rather than multiple events being aggregated and queried together. While this is a setting on which neither IPA nor ARA currently places restrictions, we thought it best to bubble the conversation up to this group to discuss.
With per-event output, results are typically only useful if you aggregate them in some way after the privacy mechanism / noise is applied (similar in spirit to local DP); otherwise the data is too noisy to do anything useful with. This is also usually a huge headache because it drowns your data in much more noise than central aggregation would (e.g. for simple sums, the noise standard deviation scales as O(sqrt(#users)) rather than O(1)). However, there are a few big reasons why this is useful (a quick simulation of the noise cost follows):
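As a sketch of the scaling claim above (made-up parameters, Laplace noise purely for illustration): adding independent noise to every event before summing gives a total whose noise standard deviation grows like sqrt(#users), while adding a single noise draw to a centrally computed sum keeps it constant.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users = 10_000
scale = 1.0          # Laplace scale = sensitivity / epsilon; illustrative only
trials = 2_000

# Per-event noise (local-DP style): each event gets its own Laplace draw, then we sum.
per_event_error = [rng.laplace(scale=scale, size=n_users).sum() for _ in range(trials)]

# Central noise: a single Laplace draw added to the already-aggregated sum.
central_error = rng.laplace(scale=scale, size=trials)

# Laplace(scale) has std dev scale*sqrt(2), so a sum of n draws has std dev scale*sqrt(2n).
print(f"per-event noise std: {np.std(per_event_error):7.1f}  (theory: {scale * np.sqrt(2 * n_users):.1f})")
print(f"central noise std:   {np.std(central_error):7.1f}  (theory: {scale * np.sqrt(2):.1f})")
```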
The main questions for the group are:
Personally, I think we should support this kind of privacy mechanism as long as it satisfies our high-level differential privacy bounds, given the benefits listed above, the challenges inherent in protecting this boundary, and the robustness of DP as a definition.
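For reference, the high-level guarantee being invoked here is the standard (ε, δ)-differential-privacy bound, which constrains the mechanism's output distribution regardless of whether noise is applied per event or after aggregation: a mechanism M is (ε, δ)-DP if, for every pair of neighboring datasets D and D' (differing in one person's data) and every set of outputs S,

$$
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S] + \delta .
$$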
cc @eriktaubeneck @benjaminsavage @bmcase