Supervisors:
• Assoc. Prof. Hedi Peterson, Institute of Computer Science, UT;
• Prof. Jaak Vilo, Institute of Computer Science, UT;
• Prof. Pärt Peterson, Institute of Biomedicine and Translational Medicine, UT.
Opponents:
• Dr. Jessica Da Gama Duarte, Olivia Newton-John Cancer Research Institute (Australia);
• Dr. Fridtjof Lund-Johansen, Oslo University Hospital (Norway).
Proteins are some of the most fundamental building blocks of life. These tiny molecules are responsible for almost all activities carried out in the organism. Different proteins are involved in different tasks, ranging from triggering massive immune responses when the body is fighting an infection to providing daily cell maintenance. Such complex functions require many protein molecules working together to be performed successfully. But not all proteins are equally beneficial: while the presence of some is essential for an individual's well-being, an abundance of others can be life-threatening. Hence, accurate information about the number and type of proteins active in the organism at any given moment is instrumental for understanding human biology and disease mechanisms. The protein microarray is a technology that enables us to estimate the concentration levels of thousands of proteins in human blood in parallel. However, analysing data from protein microarrays can be challenging due to the lack of simple-to-use, automated tools. In a series of studies involving protein microarrays, we have explored and implemented various data science methods for the comprehensive analysis of protein concentration data. These methods have helped us to identify and characterise proteins targeted by the autoimmune reaction in patients with the APS1 condition. The keystone of this work is the web tool PAWER. PAWER implements the relevant computational methods and provides a semi-automatic way to analyse protein microarray data online in a drag-and-drop and click-and-play style. The work that laid the foundation for this thesis has been instrumental for a number of subsequent studies of human disease and has also inspired a contribution to refining standards for the validation of machine learning methods in biology.