Liina Kamm PhD defence - Privacy-preserving statistical analysis using secure multi-party computation

Video production: Maria Gaiduk, 09.03.2015, 6284 views, Computer Science


In modern society, a digital record is created the moment a person is born. From then on, the person's behaviour is constantly tracked and data are collected about different aspects of his or her life. Whether one is swiping a customer loyalty card in a store, going to the doctor, doing taxes or simply moving around with a mobile phone in one's pocket, sensitive data are being gathered and stored by governments and companies. Sometimes we permit this kind of surveillance in exchange for some benefit. For instance, we could get a discount by using a customer loyalty card. Other times we face a difficult choice: either we cannot make phone calls or our movements are tracked based on cellular data. The government tracks information about our health, education and income to cure us, educate us and collect taxes. We hope that the data are used in a meaningful way; however, we also have an expectation of privacy. This work focuses on how to perform statistical analyses in a way that preserves the privacy of the individual. To achieve this goal, we use secure multi-party computation. This cryptographic technique allows data to be analysed without seeing the individual values. Even though secure multi-party computation is a time-consuming process, we show that it is feasible even for large-scale databases. We have developed ways of using the most popular statistical analysis methods with secure multi-party computation. We introduce a privacy-preserving statistical analysis tool called Rmind that contains all of our resulting implementations. Rmind is similar to the tools that statistical analysts are used to. This allows them to carry out studies on the data without having to know the details of the underlying cryptographic protocols.
The methods described in the thesis are used in practice to prepare for running a statistical study on large-scale real-life data to find out whether Estonian students who are working during university studies are less likely to graduate in nominal time.
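
To make the idea of analysing data "without seeing the individual values" concrete, here is a minimal Python sketch of additive secret sharing, one common building block of secure multi-party computation. It is a toy illustration only: it is not taken from the thesis and does not use Rmind's interface. The ring size `MODULUS`, the three-party setting, the function names and the sample figures are all assumptions made for this example.

```python
import secrets

MODULUS = 2**32  # assumed ring size for this toy example


def share(value, parties=3):
    """Split a value into additive shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recombine shares; any proper subset of them looks uniformly random."""
    return sum(shares) % MODULUS


def secure_sum(values, parties=3):
    """Sum the inputs while each computing party only ever handles shares."""
    per_party = [0] * parties
    for v in values:
        # The data owner splits its value; party i receives only the i-th share.
        for i, s in enumerate(share(v, parties)):
            per_party[i] = (per_party[i] + s) % MODULUS
    # Only the per-party totals are combined, so only the aggregate is revealed.
    return reconstruct(per_party)


incomes = [1200, 950, 2300, 1800]  # made-up sample values
total = secure_sum(incomes)
mean = total / len(incomes)
print(total, mean)
```

In this sketch each party adds up its shares locally, so the sum (and hence the mean) can be published without any party learning a single income. Most statistical operations need more than addition, and they require interactive protocols between the parties; this is one reason the computation is slower than working on plain data.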