18 Sep 09 10:15 Michael Cariaso: Data Mining your own DNA

Please note that this week we have the pleasure of hosting an invited talk organized by one of the local student special interest groups, Lambda ry (http://lambda.fi).

HIIT seminar, Friday Sep 18, 10:15 a.m. (coffee from 10), Exactum D122

Michael Cariaso
Founder of SNPedia
Author of Promethease
Research Scientist at KeyGene NV, Netherlands

Data Mining your own DNA

Eight years ago the Human Genome Project was completed at a cost of $3 billion USD. Today, headlines announce almost daily that "Scientists discover gene for disease X", and the technology has become accessible to the general public.

For $400 you can have your spit analyzed to check your DNA status at half a million locations. By cross-referencing the results against the scientific literature you can predict many details about your ancestry, appearance, drug response, and diseases. Most dramatically, at birth we can predict which 3% of us will develop Alzheimer's. Other variations can be used to predict eye color, intelligence, creativity, lactose intolerance, and many more features. But why wait for birth, when the same techniques can be used to highlight what is possible or certain based on which mate you choose?

This talk will introduce SNPedia.com, a wiki database of DNA variations, and Promethease, a free Python client which reads an individual's DNA file. While the emphasis will be on viewing the genetic profiles of real people, some attention will be paid to the underlying technical elements. These include using a wiki as a semi-structured semantic database, using SPARQL vs. web crawling as alternative interfaces to MediaWiki, the use of Amazon.com's FPS/EC2 web services, and a small domain-specific language for representing genetic combinations.

Together these provide a glimpse of how medicine, and our understanding of ourselves, is about to change.

Michael Cariaso is a founder of SNPedia and the developer of Promethease. He is studying the genetics of large plant genomes for KeyGene in the Netherlands. He has worked as a bioinformatician on a diverse range of topics for LLNL.gov, Gene Logic, Celera, SAIC and BioTeam.

Personal homepage: www.cariaso.com
Topic homepage: www.SNPedia.com

