
Working with small data that you dare to share

AW1.120 | Day 2 | 16:00 - 16:30 | Speakers: Ulrika Vincent, Mikael Kullberg

A picture of a devroom at FOSDEM 2024


Abstract

How do you work with toxic data? In our project we work with DNS query streams, which contain a lot of data that may expose individual users and their browsing behaviour.

This talk covers how we have built a large-scale statistics platform that preserves users' privacy while still letting us find important observations. We cover which algorithms and methods we use to gather the data in a cloud platform and run advanced analytics without touching individual user data. We share how to go from big data sets to small, aggregated and minimised sets.
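The abstract does not name the algorithms used, so the following is only a minimal sketch of what "aggregating and minimising" a query stream could look like, not the project's actual method. It assumes hypothetical input records of the form (client identifier, queried domain), counts queries per domain, and suppresses any domain seen from fewer than a chosen number of distinct clients. The function name `aggregate_queries` and the `min_clients` threshold are illustrative assumptions.

```python
from collections import defaultdict


def aggregate_queries(records, min_clients=3):
    """Return query counts per domain, without any client identifiers.

    Hypothetical illustration: domains queried by fewer than `min_clients`
    distinct clients are dropped, since rare domains are the ones most
    likely to point at a single user.
    """
    clients_per_domain = defaultdict(set)
    queries_per_domain = defaultdict(int)

    for client_id, domain in records:
        # Normalise case and the optional trailing dot of a DNS name.
        domain = domain.lower().rstrip(".")
        clients_per_domain[domain].add(client_id)
        queries_per_domain[domain] += 1

    # Only the aggregated counts leave this function; the per-client
    # sets are used for the threshold check and then discarded.
    return {
        domain: count
        for domain, count in queries_per_domain.items()
        if len(clients_per_domain[domain]) >= min_clients
    }


# Made-up sample records, for illustration only.
records = [
    ("10.0.0.1", "example.org."),
    ("10.0.0.2", "example.org"),
    ("10.0.0.3", "Example.org"),
    ("10.0.0.1", "rare-site.example"),
]
summary = aggregate_queries(records)
print(summary)  # {'example.org': 3}
```

The resulting set is both smaller than the input and free of client identifiers, which is the general idea behind sharing aggregates rather than raw query logs. A real deployment would need a more careful privacy analysis than a simple threshold.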

We believe the "small data" approach is applicable to any field where you want to use and share sensitive data. We also invite the audience to audit our work and help build a privacy-first internet statistics platform that can serve as a good example.


Notice: The placeholder video image is licensed under CC BY-SA 4.0. The original image can be found here. Changes made to the image: it was cropped to a new ratio, and part of the image was cut off.