Historical story

From fossils to Facebook: everything is data

We leave physical and digital traces in the world in countless ways and in this way create data. What data and information do we consist of? “People should be a little more frugal with their data.”

Footprints in the mud look insignificant, but they contain more information about you than you think. Call in a forensic specialist and he or she will effortlessly deduce your shoe size, which shoes you were wearing, roughly when you walked there and which direction you were going. The specialist can probably also tell you how heavy you are and how you walk. When the prehistoric herbivore Orobates pabsti walked through the mud, it had no idea that, more than 250 million years later, its footsteps would lead to a reconstruction of its remarkable gait.

Footprints are just one example: you leave traces everywhere, in an incredible number of ways, and in doing so create 'data', in both the real and the digital world. Some traces disappear quickly; others persist, are stored indefinitely, and may be reused. Some data is worthless; other data is mercilessly monetized. Facebook is said to know you better than your partner does on the basis of three hundred 'likes' (a claim that is still open to debate) and uses that information to sell companies advertising space that only you then see.

The central storage of nature

Life on earth cannot do without data storage. And by that I don't mean your laptop or photo albums, nor your paper records and cloud storage. Every cell in your body contains the genetic code in which all your 'traits' are recorded, from the color of your eyes to your risk of hereditary diseases. This information is contained in the DNA molecule and is written in a language with only four letters. Your complete code is over three billion letters long. That is surprisingly short: your DNA code fits into a file of about three gigabytes, and with smart storage that shrinks to less than one gigabyte. You can easily burn the entire genetic code of a family of four to a DVD.
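The storage figures above can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: the article does not say what 'smart storage' means, so the code assumes the simplest option, packing each of the four letters into two bits. The function names pack and unpack, the rounded genome length and the 4.7 GB single-layer DVD capacity are all assumptions made for this example.

```python
# Illustrative sketch: store DNA letters at 2 bits each instead of 1 byte each.
# Assumption: 'smart storage' is modelled here as plain 2-bit packing.
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
LETTERS = "ACGT"


def pack(seq: str) -> bytes:
    """Pack a DNA string into bytes, four letters per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for letter in chunk:
            byte = (byte << 2) | CODES[letter]
        # Left-align a short final chunk so every byte is read the same way.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` letters from packed bytes."""
    letters = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            letters.append(LETTERS[(byte >> shift) & 0b11])
    return "".join(letters[:length])


# Back-of-the-envelope sizes (1 GB taken as 10**9 bytes).
GENOME_LETTERS = 3_000_000_000  # 'over three billion letters', rounded down
DVD_CAPACITY_GB = 4.7           # assumption: single-layer DVD

one_byte_per_letter_gb = GENOME_LETTERS / 1e9          # plain text file
two_bits_per_letter_gb = GENOME_LETTERS * 2 / 8 / 1e9  # packed
family_of_four_gb = 4 * two_bits_per_letter_gb

if __name__ == "__main__":
    print(one_byte_per_letter_gb, two_bits_per_letter_gb, family_of_four_gb)
    print(unpack(pack("GATTACA"), 7))
```

With these assumptions, one genome takes about 3 GB as plain text and 0.75 GB when packed, so four packed genomes come to roughly 3 GB, which fits on the assumed DVD. Real genome formats use more sophisticated compression; this is only the simplest scheme consistent with the numbers in the text.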

The length of a genetic code does not appear to say much about the complexity of an organism. The genome of the unicellular freshwater amoeba Polychaos dubium is said to be about two hundred times longer than that of a human.

Yet our body contains a lot of data. Bioinformatician Rob ter Horst studies our immune system and how factors such as gender, age and genetic information influence it. “Some of our studies involve terabytes of data. We look at many different individuals, whose genomes are read multiple times to be sure of the code, and at additional information, such as the way the DNA is folded,” he says. “This is the so-called epigenome. It consists of information about how our entire genome is folded, which also differs per individual. We look at about sixty thousand different places for this.”

Researchers now have a good idea of how nature has arranged its central storage. The code in the DNA tells your body what the various proteins, each with its own function in the body, look like. The epigenome is a kind of regulatory layer that partly determines how much of each protein is made. Yet many human characteristics do not depend on a single gene but result from the interplay of several genes and environmental factors. Height is one such characteristic. Ter Horst says that these processes are still too complex to model and predict precisely.

The Earth as a hard drive

Life has found a sophisticated way to store genetic data, but it also leaves many traces on our planet. In fact, you can think of the Earth as a kind of hard drive – a gold mine for scientists trying to reconstruct how life originated.

You can find those kinds of scientists in Leiden, in the Naturalis Biodiversity Center, which towers over the surrounding buildings. Inside that tower are countless drawers of extinct bees, boxes of ancient shark teeth, giant mammoth bones, and rocks that record the fossilized imprint of life from more than half a billion years ago. They are all pieces of information about the billion-year "story" of our planet and life on it.

According to collection manager Natasja den Ouden, the tower contains 42 million collection items that are stored under specific and constant conditions – such as humidity and temperature.

Den Ouden herself is responsible for the sub-collection of fossils, of which the oldest specimens are about 575 million years old. These are multicellular organisms from the Ediacaran period, which are difficult to classify as plant or animal because those categories did not yet exist.

How is it possible that planet Earth stores some traces for so long, while most organic traces disappear in no time? According to Den Ouden it comes down to chance. “You just have to be lucky,” she says. “Suppose you are an organism that wants to end up in a museum, then make sure that your body is quickly covered by sediments after you die. And there must be as little oxygen as possible. For example, fall into an oxygen-poor lake – that's much better than dying in a meadow, where your body decomposes quickly and is likely to be eaten.”

Everything is data

Everything is data, says theoretical physicist Erik Verlinde of the University of Amsterdam. From the DNA molecules that record our genetic information to fossils in the Earth's crust, from black holes to our thoughts. Verlinde looks at the concept of 'information' with the broadest possible view: a cosmic perspective.

The observation that everything is data is at the heart of Verlinde's theory of gravity. That theory tries to explain the movements of stars and galaxies, something that the more classical models fail to do.

Verlinde argues that gravity is only a so-called 'emergent' property of how the information of the universe behaves. An emergent property is one that arises from underlying behavior rather than being fundamental itself. Take temperature as an example: we experience it directly, but on a molecular scale it is a measure of the speed at which particles of matter move.

And what is the information Verlinde is talking about? According to him, it is the data needed to actually record the entire "state" of the universe, such as the position and mass of all particles in the universe. Verlinde also compares it with quantum numbers from quantum mechanics. Those numbers describe the state of a particle, for example the spin, which tells in which direction a particle is spinning.

The most amazing thing, Verlinde thinks, is not that this information is there, but that it appears to be a finite amount. Moreover, that amount of information does not change. “The 'book' describing the state of the entire universe is about 10^120 characters long,” he says. “It contains everything: from the distribution of matter in the universe to our thoughts, which sprout from our material brain.”

Does nothing really disappear? For years, physicists have had heated debates about whether information disappears when a black hole swallows matter. Information then seems to vanish forever into that black hole, causing the universal information book to shrink. Still, these data eventually reappear as black holes slowly 'evaporate' through the Hawking radiation proposed by famed physicist Stephen Hawking, Verlinde says. The information has thus only been temporarily 'hidden'.

Prominent place in everyday life

Data in the form of DNA, planet Earth or even the universe has been around for a while, but data also seems to be claiming an increasingly prominent place in our daily lives. Why does so much revolve around data these days? Why are the most powerful companies in the world data companies, and why does data play an essential role in all kinds of social domains, from government to science? Hugo Jonker, assistant professor at the Open University, attributes this, among other things, to the character of digital information: it usually lasts for a long time and can be copied, distributed and searched almost effortlessly. “Suppose you go to the supermarket and employees see you there. If the police ask half an hour later whether you were there, the staff may not even recognize you anymore. If the agents check a database with IP addresses of phones that connected to the local Wi-Fi point, they will pick your phone out in no time,” he says.

Many digital traces may also say more about us than traces in the 'real world', according to Jonker. We are usually much more 'efficient' on the internet. You rarely 'pass by' a website by accident, the way you might walk past a store. You visit a certain webpage because you are looking for something. That information can be used for harmless (targeted) advertisements, but just as well to influence elections.

Jonker makes a distinction between data (for example ones and zeros on a hard disk) and information (what those data mean in a certain context). For a privacy expert like him, it is about the protection of information, and then the context is extremely important. “The words 'Position Omtzigt, function elsewhere' may not mean much, unless they are in the notes of the formateur of a new cabinet and are photographed. Then they suddenly become political dynamite,” he says.

You are your data

What are you in this story? It just depends on which lens you look through. You're three billion 'letters' in a DNA molecule, you're the roughly ten terabytes your brain is said to store, you're a piece of information in the cosmic information accounting system, you might be a fossil with which a future civilization describes our species, you are the (digital) traces you leave behind in the world.

According to Jonker, we should be more careful with our data when it comes to those traces, and the 'I have nothing to hide' attitude that some adopt is nonsense. There is always some way in which your information can be used and misused. The almost classic example is that of the American retail chain Target, which the New York Times reported on in 2012 (see also the box in this NEMO Kennislink article): based on the purchasing behavior of an underage girl, the store's algorithm 'knew' that she was pregnant, much to the dismay of her father, who only found out when the store sent her related coupons.

Cave painting: an Indonesian warthog. A painting of a warthog in Indonesia is the oldest known cave painting, probably intended to tell a story about hunting trophies. The painting is still clearly visible; even the bushy hairs on the animal's back can be seen. Shelf life: about 45,000 years so far

Cuneiform: rock-hard data. About five thousand years ago, a writing system emerged in the Middle East that was scratched into clay with a reed. At first many pictograms were used; later the script became more abstract, with 'nail-like' notches. People used it for accounting, among other things. Many (hardened) clay tablets are still in fine condition. Shelf life: five thousand years so far, probably much longer

Papyrus: the oldest paper. Nowadays we know Cyperus papyrus mainly as a pond plant. Five thousand years ago, its stems were pressed into a kind of paper: papyrus. Documents from Ancient Egypt record medical practices, mathematical calculations and folktales. Unfortunately, papyrus does not tolerate moisture well: in Europe almost everything has been lost. About two thousand years ago, parchment and later paper supplanted papyrus. Shelf life: several thousand years under dry conditions

Oil painting: the snapshot of the Renaissance. Although people had been painting on buildings, ceramics and other materials long before, oil painting made its breakthrough in Europe in the fifteenth century. Painters accurately captured the world around them on canvas and experimented with techniques. Initially the work was often religious in nature; later, more everyday scenes and portraits appeared. Shelf life: five hundred years, with restoration

Punched card: the Swiss cheese of information. In 1790, the French inventor Joseph-Marie Jacquard developed a system to weave a pattern into fabric automatically on a loom. It uses a 'programmable' cardboard strip with holes in specific places. The pattern of holes dictates how the machine weaves threads together, row by row. The punched card proved a versatile information carrier over the following two hundred years: it was used in barrel organs, for example, and in census records before the computer took over. Shelf life: hundreds of years

Photo: capturing the light. The first photo with a person in it was taken in 1838 in Paris. Back then, taking a photo took minutes, so moving objects, such as people, typically blurred out of the shot. However, this gentleman (bottom left in the picture) stood still for a while, probably to have his shoes polished. Many photos discolor within decades. Shelf life: decades, depending on the printing technique

Wax cylinders: sounds from the past. Happy xylophone notes dance over the tones of a wind band. This four-minute piece of music was recorded around 1912 on a cylinder coated with a layer of wax into which the sound vibrations have been 'scratched'. It can still be played on a so-called phonograph, a device that the company of the American inventor Thomas Edison developed in 1880. Shelf life: at least a century

Magnetic tape: kilometers of particle collisions. A robot moves rapidly back and forth between long racks of magnetic tapes, stopping to pick up and read specific tapes. At the European particle laboratory CERN near Geneva, a large part of the raw data from the experiments is stored on tape. In 2019, the organization had 330,000,000 gigabytes (330 petabytes) of data. Tape is a relatively cheap way to store a lot of information. Shelf life: decades

Floppy: flexible information. They came in different sizes, but eventually the 3.5-inch variant became the most popular. On flexible floppies you could store games, research results and other important files. But the data could be lost relatively easily, because bits (the smallest units of information on the disk) could flip spontaneously. Floppies did not have a long shelf life. Now the biggest problem is that few computers can still 'read' them. Shelf life: about ten years

Cassette tape: fiddling with a pencil. For years, cassette tapes were the most popular way to play music. You could even listen to your favorite artists on the go. The tape sometimes got tangled, and a pencil often offered a way to wind everything back properly. Shelf life: up to about thirty to forty years

Compact disc: the shiny disc. After cassette tapes, CDs became extremely popular. They were handy for listening to music, but also for storing data. The shelf life varies quite a bit, depending on how the disc has been treated. Do you leave it unprotected in the sun? Then it goes downhill fast. But with good quality and careful treatment, the lifespan increases. Shelf life: decades

Hard disk: flipping bits. The hard disk is one of the most modern ways to store (holiday) photos, films and files. You would think that makes it one of the most reliable as well. Wrong! Over time, a hard drive suffers from data rot, because bits flip spontaneously. Shelf life: often about ten years

It can also be more extreme. Data can become downright dangerous. “Suppose you were a teacher at a girls' school in Afghanistan until recently. That was fine until the regime change last summer. Now you are suddenly suspicious in the eyes of the new rulers, and information about you becomes dangerous. I'm just saying, this isn't something that only happened to the Jews in World War II,” Jonker says. “And espionage isn't just something that happened in the GDR. There are spy apps that you can sneak onto your partner's phone if you don't trust him or her.”

As a privacy expert, Jonker is forced to think in doomsday scenarios; this is necessary to stay ahead of the bad guys and to keep the public and politicians on their toes. As an example, he mentions electronic chips in passports that returned a certain error message. That error message turned out to differ between passports from different countries. “You only make the news if you say that you can design a bomb that automatically detonates when a few people of a certain nationality are nearby,” he says. “Actually, it's a shame that only then is attention paid to it. People should treat their data much more the way they treat their own wallet, for example. You don't let a stranger snoop around in there either, because they simply have no business there.”