IC Research Day 2014: Challenges in Big Data

© 2014 EPFL

Neuroscience, physics, climatology, humanities… Science increasingly depends on enormous quantities of data. How do we manage and exploit it all? What kind of results can we expect from this inflation of information?
Read the interview with Professors Anastasia Ailamaki and Karl Aberer, Chairs of the IC Research Day 2014 on the Challenges in Big Data, which takes place on 12th June.

Communication, entertainment, financial and health services, social networking and mobile services are just a few examples of how our day-to-day interactions have transformed into data exchanges… We generate, collect, process, and store data at phenomenally high rates. Our world is in the midst of a revolution in information technology, at the center of which is Big Data.

1. Science has entered a data-centric "fourth paradigm". Professors Anastasia Ailamaki and Karl Aberer, could you please explain?

“Over time, mankind has developed various approaches to generating new knowledge. A thousand years ago, science was mainly experimental and centered on the description of natural phenomena. A few hundred years ago, theoretical science emerged (Newton’s Laws, Maxwell’s Equations, etc.). More recently, computer science and simulations of natural phenomena gave rise to a third approach: computational science. Nowadays, data-intensive and data-centric science are driving a fourth approach to creating knowledge, which aims at unifying theory, experiments and simulation with analytics of massive troves of scientific data.

The fourth paradigm creates tremendous opportunities but also daunting challenges. In domain sciences, technological advancements have made it possible to generate unprecedented amounts of raw data from observations and simulations. As a result, scientists are faced with new frontiers of unexploited potential and seek new software tools to help transform their data into useful information. Computer science is at the heart of this challenge, with the new mission of bridging the gap between the data growth rate (which follows a steep exponential curve) and the rate of our ability to process data (which is much more modest). New technologies are being created which open new frontiers of scientific exploration, which in turn create more challenges for computer science.”

2. How can Big Data be a resource that can be turned into economic, cultural and scientific value?

“‘Big Data’ is at the center of the current IT revolution, in which we generate, collect, process and store data at phenomenally high rates. It is affecting every sector of engineering, science, economics and society: communication, entertainment, financial and health services, social networking and mobile services are just a few examples of how our day-to-day interactions have transformed into data exchanges. Big Data allows us to develop quantitative tools to generate, calibrate, and validate models and predictions. This enables discoveries and decisions to be made by applying empirical principles to big data sets. Data now also lies at the core of the supply chain for both products and services in modern economies, where economic value is primarily based on information and services.”

3. What are the challenges in Big Data that both EPFL and its School of Computer and Communication Sciences would like to tackle?

“An obvious challenge is the increasing demand for capacity to store, manage and process Big Data in research. EPFL is providing its researchers with services that allow them to systematically annotate and search data generated through experiments and simulations in research data archives. To handle the rapidly growing amounts of data, EPFL is also planning new kinds of data centers that are specifically designed for research needs. Since researchers and businesses do not have the same needs in terms of research data reliability and security, we hope to devise solutions that are more cost-effective than the ones currently available on the market.

The Human Brain Project is a typical example of a multifaceted set of Big Data challenges. Neuroscientists simulate the human brain at the molecular level. This has the potential to model the brain in unprecedented detail and reveal more information than any method used previously. However, a truly detailed simulation of brain functionality over the span of just a few minutes is equivalent, in data terms, to all of the data collected all over the world. This data needs to be examined and compared with patient records from hundreds of participating hospitals, which poses a set of new technical questions. The data needs to be mined efficiently to reveal correlations between causes and symptoms and to extract unique biological disease signatures. Only then can a patient be given individualized treatment.

The Venice Time Machine is another such project: the plan is to digitize the complete archives of Venice, more than 80 km of documents, and make them available to researchers in the humanities for completely new forms of scientific investigation, e.g. in history.

Within one year we will also have completed the digitization of the Montreux Jazz Archive, which will result in several petabytes of audio and video content. This will offer unprecedented opportunities to researchers in audio and video analysis.”

4. Given the rapid growth in the use of Big Data, are we facing a shortage of experts in this research area?

“A recent study by McKinsey reported that by 2018 there will be approximately 140,000 to 180,000 unfilled positions in the field of Big Data in the US alone. The demand is absolutely massive and academic institutions must respond to it urgently.”


At the 2014 Research Day, top international Big Data experts will give a series of talks exploring both opportunities and risks posed by the Big Data revolution. EPFL researchers will also provide a glimpse of EPFL research highlights relating to Big Data.
Discover the program and register at ic.epfl.ch/researchday2014.

IC Research Day 2014: Challenges in Big Data

Date: Thursday, 12th June 2014
Time: 8:30 am - 4:00 pm
Venue: École Polytechnique Fédérale de Lausanne,
SwissTech Convention Center
Registration (until 6th June):
Mandatory, at ic.epfl.ch/researchday2014
Further information:
ic.epfl.ch/researchday2014


Authors: Anastasia Ailamaki, Karl Aberer

Source: Staff Portal