Chris Wiggins

Chris Wiggins is co-organizer of hackNY and an associate professor of applied mathematics at Columbia University. His research focuses on applications of machine learning to real-world data. This includes inference, analysis, and organization of naturally occurring networks; statistical inference applied to time-series data; and large-scale sequence informatics in computational biology. Before joining the faculty at Columbia, he earned his PhD at Princeton University and was a Courant Instructor at NYU. He originally moved to NYC in 1989 to attend Columbia. Since 2001 he has also held appointments as a visiting scientist at Institut Curie (Paris), the Hahn-Meitner Institut (Berlin), and the Kavli Institute for Theoretical Physics (Santa Barbara). At Columbia he serves as the faculty advisor for the Society for Industrial and Applied Mathematics (SIAM) as well as the Application Development Initiative (ADI). In 2007 he was awarded the Avanessians Diversity Award in recognition of his work enhancing diversity in departmental, school, and university programs at Columbia. He has served as a mentor during each of the Techstars NYC programs.

Data science in the natural sciences

Big data is shaping diverse fields, showing that past predictions from data-driven natural sciences are now coming to pass.

Recently I have found myself in conversations with people from increasingly diverse fields, both at Columbia and in local startups, about how their work is becoming “data-informed” or “data-driven,” and about the challenges posed by applied computational statistics or big data.

A view from health and biology in the 1990s

In discussions with New York City journalists, physicists, and even former students now working in advertising or social media analytics, I’ve been struck by how many of the technical challenges and lessons learned are reminiscent of those faced in the health and biology communities over the last 15 years. In that period these fields experienced their own data-driven revolutions and wrestled with many of the problems now faced by people in other fields of research or industry.

It was around then, as I was working on my PhD thesis, that sequencing technologies became powerful enough to reveal the entire genomes of simple organisms and, not long thereafter, the first draft of the human genome. This advance in sequencing technologies made possible the “high throughput” quantification of, for example,

  • the dynamic activity of all the genes in an organism; or
  • the set of all protein-protein interactions in an organism; or even
  • statistical comparative genomics revealing how small differences in genotype correlate with disease or other phenotypes.

These advances required formation of multidisciplinary collaborations, multi-departmental initiatives, advances in technologies for dealing with massive datasets, and advances in statistical and mathematical methods for making sense of copious natural data.
