Is data the DNA of the computer?

BIG DATA and the new world order

Prof. Klaus Mainzer, Technical University of Munich

It appears that those who rule the world today are those with the best information and the fastest algorithms and computers. Today's big data world emerges against the background of total digitization and gigantic computing capacities. What perspectives does this open up for a new digital economic, legal and social order? What are the opportunities and risks? What conclusions must Germany draw from the NSA debate so that this world of data, computers and automation does not get out of hand?

At the beginning of the 1950s, computer pioneers were still of the opinion that a few supercomputers worldwide would one day solve all computing tasks. As we know, developments turned out differently: in the 1980s, many small PCs (personal computers) moved into offices, and in the 1990s they were connected to the Internet. This created computer networks as the basis for global information and communication systems such as the World Wide Web (WWW). Our e-mails are broken down into small data packets and sent via router nodes distributed around the world (depending on the local load), to be reassembled at the recipient's end. Like a single computer, this global network has a common operating system and common programming languages (e.g. Java) on the basis of which a wide variety of programs can run. In fact, this worldwide network is itself a "virtual computer".
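The packet principle can be made concrete with a toy sketch in Java (the message and packet size are invented, and real Internet protocols such as TCP/IP are far more involved; this only shows the splitting and reordering idea):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy illustration of the packet principle described above: a message is
// split into numbered packets, which may arrive in any order, and is
// reassembled at the recipient.
public class PacketSketch {
    record Packet(int seq, String payload) {}

    public static void main(String[] args) {
        String message = "Our e-mails travel as small data packets";
        int size = 10; // payload size per packet (invented)

        List<Packet> packets = new ArrayList<>();
        for (int i = 0; i * size < message.length(); i++) {
            int end = Math.min((i + 1) * size, message.length());
            packets.add(new Packet(i, message.substring(i * size, end)));
        }

        Collections.shuffle(packets); // packets may take different routes

        // The recipient reorders by sequence number and reassembles.
        packets.sort((a, b) -> Integer.compare(a.seq(), b.seq()));
        StringBuilder sb = new StringBuilder();
        packets.forEach(p -> sb.append(p.payload()));
        System.out.println(sb); // original message restored
    }
}
```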

The introduction of the Internet and the WWW ushered in the first digital revolution: ever smaller devices such as cell phones and smartphones, with their apps, allow people to communicate with one another worldwide. We are currently experiencing the second digital revolution: not only people communicate with one another, but also things, via radio and sensor technology: the Internet of Things, with an enormous production of data and signals.

What is driving the development of the big data world?

Moore's law has dictated the development of computing capacity since the 1960s: on average, computing capacity doubles every 18 months while devices become smaller and cheaper. This exponential curve has brought us into the age of petaflops (peta = 10^15 arithmetic operations per second) for supercomputers. According to Moore's law, this computing power will also be realized by small computing devices in the 2020s. This means that, for example, a smartphone could match the computing capacity of our brain. As transistors are miniaturized at the same time, we reach the limits of the atomic range, where the sensitivities and disturbances of quantum physics apply. One will then have to look further, perhaps to quantum computers. In any case, this computing power is enormous.
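The arithmetic behind this exponential curve can be checked with a small sketch, assuming the 18-month doubling period stated above:

```java
// Sketch: growth of computing capacity under Moore's law,
// assuming a doubling every 18 months (1.5 years).
public class MooresLaw {
    public static void main(String[] args) {
        double doublingYears = 1.5;
        for (int years = 0; years <= 30; years += 6) {
            double factor = Math.pow(2, years / doublingYears);
            System.out.printf("after %2d years: capacity x %.0f%n", years, factor);
        }
        // e.g. after 30 years: 20 doublings, a factor of 2^20 (about a million)
    }
}
```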
The masses of data produced in this way lead to big data. Here, too, we have arrived in the peta age. Data corporations like Google nowadays process around 24 petabytes every day, i.e. roughly 6,000 times the data content of the US Library of Congress. These masses of data are amorphous: not just structured messages such as e-mails, but also sensor data from GPS devices and mobile phones.
They cannot be handled by conventional (relational) databases. This requires new algorithms such as Google's MapReduce (or Hadoop, its open-source implementation in Java). Put simply, such an algorithm divides a mass of data into subtasks ("map") in order to process them in parallel. In the next step, the partial results are combined to form the overall result ("reduce").
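A minimal sketch of the map/reduce idea, here as a parallel word count using Java streams (this mimics the two phases just described; it is not the Hadoop API itself, and the documents are invented):

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch: the map/reduce pattern as a parallel word count.
public class MapReduceSketch {
    public static void main(String[] args) {
        String[] documents = {
            "big data needs new algorithms",
            "new algorithms process big data in parallel"
        };
        Map<String, Long> counts = Arrays.stream(documents)
            .parallel()                                     // process subtasks in parallel
            .flatMap(doc -> Arrays.stream(doc.split(" "))) // "map": split data into key units
            .collect(Collectors.groupingBy(                // "reduce": combine partial results
                word -> word, Collectors.counting()));
        System.out.println(counts); // e.g. {big=2, data=2, new=2, ...}
    }
}
```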
 
What is new with big data is that forecasts are no longer statistically extrapolated from representative samples; rather, all data and signals are scoured in order to identify correlations and patterns. One could put it this way: in order to find the needle, I need the largest possible haystack, and one that is combed through completely.
What is also new is that we do not need to know the content of the messages. Rather, their significance is extracted en masse from metadata, e.g. the sender and recipient of an e-mail, or the radio signals of a mobile phone or automobile. For example, Google was able to predict the outbreak of an epidemic weeks before the health authorities, purely from patterns of user search behavior, while the authorities, as usual, had waited for reports of cases of illness and extrapolated them statistically.
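The underlying idea, prediction from metadata counts alone, can be illustrated with a toy example (the daily counts are invented and this is not Google's actual method, only the principle of flagging an unusual surge against a historical baseline):

```java
// Toy illustration of prediction from metadata alone: flag a surge in
// daily query counts against a historical baseline using a z-score.
public class SurgeDetector {
    public static void main(String[] args) {
        double[] baseline = {120, 130, 110, 125, 118, 122, 115}; // past daily counts
        double today = 310;                                      // today's count

        double mean = 0;
        for (double v : baseline) mean += v;
        mean /= baseline.length;

        double variance = 0;
        for (double v : baseline) variance += (v - mean) * (v - mean);
        double stdDev = Math.sqrt(variance / baseline.length);

        double zScore = (today - mean) / stdDev;
        if (zScore > 3.0) { // far outside normal fluctuation
            System.out.printf("Surge detected (z = %.1f): possible early signal%n", zScore);
        }
    }
}
```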

Medicine is a vivid example:

It shows how big data science is changing the way we live. First, there are the medical data stocks: in the coming year, individual patient files are expected to grow to 20 terabytes (tera = 10^12). In the 2020s, medical data stocks totaling 90 zettabytes (zetta = 10^21) are expected. Medical knowledge is becoming unmanageable: there are, for example, approx. 400,000 specialist articles on diabetes, more than a doctor can read in a human lifetime. This requires intelligent search engines to find the relevant key information.
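The core idea behind such a search engine can be sketched as simple keyword scoring (the article titles are invented; real literature search engines use far richer relevance models):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of an "intelligent" literature search: rank articles by
// how many query terms they contain, most relevant first.
public class ArticleSearch {
    public static void main(String[] args) {
        String[] titles = {
            "Type 2 diabetes therapy outcomes",
            "Insulin resistance in type 2 diabetes",
            "Cardiac imaging techniques"
        };
        String[] query = {"diabetes", "therapy"};

        Map<String, Integer> scores = new HashMap<>();
        for (String title : titles) {
            int score = 0;
            for (String term : query) {
                if (title.toLowerCase().contains(term)) score++; // count matching terms
            }
            if (score > 0) scores.put(title, score);
        }
        scores.entrySet().stream()
              .sorted((a, b) -> b.getValue() - a.getValue())     // best matches first
              .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}
```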
In order to make personalized medicine possible in the future, medical data must be taken into account down to the cellular and molecular level. A person consists of roughly 2 × 10^24 molecules. With 7 billion people, about 15 × 10^33 molecules would have to be taken into account. Even if we neglect redundant molecular processes, approx. 6 × 10^17 molecules remain. If we calculate (simplified) 1 bit per molecule (molecule switched on or off), we get numbers in the capacity range of today's or future supercomputers. This requires new types of databases such as SAP HANA ("High-Performance Analytic Appliance"), which works on fast main memory ("in-memory technology"): a molecular cancer analysis (proteomics) is reduced from 15 minutes to 40 seconds, and a DNA sequencing from 85 hours (approx. 3 ½ days) to just 5 hours.
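The back-of-the-envelope estimate can be checked in a few lines, under the same simplification as above (1 bit per molecule):

```java
// Sketch of the estimate above: 6 x 10^17 molecules at 1 bit each,
// converted into storage units.
public class MoleculeBits {
    public static void main(String[] args) {
        double bits = 6e17;                 // 1 bit per molecule, as simplified above
        double bytes = bits / 8;            // 7.5 x 10^16 bytes
        double petabytes = bytes / 1e15;    // decimal petabytes
        System.out.printf("%.1e bits = %.1e bytes = %.0f PB%n", bits, bytes, petabytes);
        // ~75 PB: within the capacity range of today's large storage systems
    }
}
```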

The next key example is the economy:

There, customer and product profiles can be computed at lightning speed using big data mining. Big data makes new business models and value chains possible: the owners of data earn money by licensing it out; then there are earnings from the know-how and skills needed to deal with masses of data; and finally from the right mindset, i.e. from new business ideas built on masses of data.