X-Europe Data Science Webinars

CANCELLED - How Private and High-Quality Can Synthetic Data Be?

ABOUT OUR SPEAKER
Victoria is a data scientist for MOSTLY AI, a synthetic data startup based in Vienna, Austria.

ABOUT THE TALK
Privacy concerns have been growing over the last 20 years as companies collect more data about their customers. Synthetic data has exploded in popularity as an anonymization tool because it preserves statistical relationships while severely reducing the risk of re-identification. Generating synthetic data is easy: many companies offer solutions, and an active research field is constantly publishing new algorithms. Once you start working with synthetic data, however, many questions arise about its accuracy and privacy-preserving characteristics.
In this talk, we will cover several topics. First, we outline why privacy is important and how synthetic data prevails over many classical anonymization techniques. Next, we look at open source tools for measuring the quality of a synthetic dataset, which leads into an introduction to VirtualDataLab, a Python open source tool developed by MOSTLY AI (me included!). Lastly, we go over some best practices for picking a synthetic data generator.



Communities

Barcelona Data Science and Machine Learning Meetup
Budapest Deep Learning Reading Seminar
Budapest Data Science Meetup
