At the beginning of this summer, research funder NWO awarded 12 million euros to the CLARIAH project, which will allow Dutch humanities scholars to build a 'digital infrastructure'. What is it for, and what are the benefits? Henk Wals, director of the International Institute of Social History (IISH) and one of the initiators of the project, explains.
Big data is all the rage these days, even in fields that traditionally do little data analysis, such as the humanities. Historians, media researchers and linguists will soon be working with it too, thanks to the large-scale project Common Lab Research Infrastructure for the Arts and Humanities (CLARIAH). The project is part of a true 'revolution' that has been taking place in the humanities in recent years.
Humanities scholars work with documents from archives, but also with text, images and sound from the media. More and more of these sources can now be digitized. The advantage of digitized sources is that you can search them with specialized software. A computer can scan digital historical records at lightning speed for specific words, word combinations or changes in word usage. By comparing and combining the content of all kinds of digital sources, you can draw conclusions that are far beyond the reach of the traditional, solitary archival researcher.
But before all that can happen, a number of tools still need to be developed, and 12 million euros has now been made available for that purpose. The result is to be 'a digital infrastructure', as director Henk Wals of the International Institute of Social History (IISH) in Amsterdam calls it.
12 million is a lot of money. What exactly is it needed for? “There is a revolution going on in the humanities right now. A digital revolution, especially in the research methods that are available. In some areas that revolution has been under way for some time. But in recent years it has really begun to take shape, because more and more material is being digitized and because computer science can do more and more interesting things for humanities scholars. The trouble is that all that data, and the tools needed to analyze it, are scattered across all kinds of different places. That is why a kind of 'digital infrastructure' is needed to bring all those tools and data together.”
So that infrastructure mainly consists of computer programs to analyze data? “Among other things. Essentially, what we do is bring the data and the analysis tools together. The tools must be built in such a way that they can handle as many types of data as possible. And the data, in turn, must be standardized in such a way that as many tools as possible can work with it. So it's about data and analysis tools that can talk to each other digitally.”
“When you talk about data, there are three types involved. Firstly, the structured data that is stored in various databases. Secondly, there has recently been a growing amount of unstructured data: large quantities of digitized but unstructured texts from archives. You handle those differently from structured data. Finally, you have things like films, pictures and audio. Those have to be handled differently again.”
So really big data. That goes beyond old-fashioned manual work? “To give an example: at the IISH we have the archive of the trade union FNV. That's miles of paper. It is now being digitized bit by bit. If I want to know, for example, how that union has responded to globalization in recent decades, I would have to go through all those documents, brochures and minutes. We are now at the stage where we can create a dataset that lists all the documents relevant to such a research question, in order of relevance.”
“That doesn't mean that all the research is done right away, but it does mean that you can collect data much faster. If you can also set up visualization tools that generate graphs and map networks, that helps you a great deal as a humanities scholar in gaining new insights.”
A survey in De Groene Amsterdammer showed that many humanities scholars consider digital humanities the most important development in their field. Are traditional research methods exhausted? “You should not see this digital revolution as the replacement of one method by another. The traditional methods (interpretation, well-written stories) remain, but they can be supported by new research methods. This allows you to draw new and, above all, better substantiated conclusions. It is also a kind of efficiency boost, because you can read and research much more in the same amount of time than you could as an individual researcher. If you have a machine that reads documents for you and draws preliminary conclusions from them, then the sky is the limit.”
So it is mainly about answering the large, broad research questions… “Yes, questions that require an enormous amount of material to be analyzed. Another example is what we did at the Huygens Institute, where I used to work. We had a project there about the development of knowledge in the seventeenth century, for which we had digitized thousands of letters from seventeenth-century scholars. We then wanted to know where certain new knowledge first appeared in such a community of scholars. Where and how was new knowledge discussed, and how did it pass from one scholar to another?”
“For that you need very advanced tools that can analyze documents in the various European languages of that time, and visualization tools that can then map correspondence networks. In this way, as a humanities scholar, you can make your conclusions a lot firmer. Instead of examining the correspondence of one such seventeenth-century scholar, you can study many at the same time. Your conclusions then move closer to those of the natural sciences: they become much firmer and better substantiated.”
Is it necessary, then, to pull the humanities more towards the natural sciences? Humanities scholars are supposed to be concerned with interpretation, not quantitative research… “I'm not saying that the humanities should become more like the sciences, but that this is something they have in common. What I just said, with the examples of the FNV and the seventeenth-century scholars, is that you collect a lot of data and draw your conclusions from it. But in the end we are still humanities scholars who interpret and write beautiful stories. That story is simply better substantiated this way.”
“I think it's a pity that there now seems to be a kind of factional struggle within the humanities, with scholars on one side who support these new methods and, on the other, people who don't like them at all because they think digitization does not suit the humanities. But digitization is something that will exist alongside the traditional methods. It is a way to better substantiate your conclusions.”