Power, influence and social advancement are often based on invisible relationships. Historical network analysis can reveal such connections, but requires data that is difficult to access and scattered across numerous institutions. This is precisely where SoNAR comes in: this innovative research data marketplace brings together access to hundreds of millions of records on actors and their social relationships, thereby making a significant contribution to historical research using new digital methods.
Imagine it is the year 1434. In Florence, the city’s powerful families are locked in a bitter struggle for power. The Medici triumph, not through arms or wealth, but through a strategic network of social connections that spans the city, largely through marriage. Whilst families such as the Strozzi prefer to marry their children amongst themselves, the Medici deliberately link their sons and daughters to families from very different neighbourhoods, thereby building bridges between otherwise separate groups. In this way, the Medici become the indispensable hub of social life in Florence – and beyond.
Such hidden structures can be uncovered using a method known as historical network analysis. In this process, marital ties are recorded as in an organisational chart: each family as a node, each marriage as a line between two nodes. Suddenly, it becomes clear that the Medici were not successful by chance, but because their position within the network placed all power in their hands. Historical network analysis reveals how hidden patterns of relationships shape history.
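As a rough illustration of this idea, the following Python sketch records marriages as lines between family nodes and computes betweenness centrality, a common measure of how often a node lies on the shortest paths between other nodes. The families and marriages are invented placeholders, not historical data, and the function names are assumptions made for this example. It uses Brandes' algorithm and only the standard library.

```python
from collections import deque

# Invented toy data: two tightly intermarried clusters (A*, B*) and one
# family ("Hub") that marries into both. Not historical records.
marriages = [
    ("A1", "A2"), ("A2", "A3"), ("A1", "A3"),
    ("B1", "B2"), ("B2", "B3"), ("B1", "B3"),
    ("Hub", "A1"), ("Hub", "B1"),
]

def build_graph(edges):
    """Each family becomes a node, each marriage an undirected edge."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []                          # nodes in order of discovery
        preds = {v: [] for v in graph}      # predecessors on shortest paths
        sigma = {v: 0 for v in graph}       # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                        # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while order:                        # accumulate from farthest node back
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every pair was counted from both ends, so halve the totals.
    return {v: c / 2 for v, c in score.items()}

scores = betweenness(build_graph(marriages))
most_central = max(scores, key=scores.get)
```

In this toy network the bridging family obtains the highest score because every shortest path between the two clusters runs through it. Real analyses would typically rely on an established network library and on carefully sourced data.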
To identify such patterns, however, researchers need data – and this is where a key problem arises: the necessary information on catalogued sources such as documents, letters, books and objects is scattered across numerous digital catalogues and collections belonging to a wide variety of institutions, including archives, libraries and museums. This makes systematic research immensely difficult and comprehensive data collection almost impossible.
This is the challenge addressed by SoNAR, a new research data infrastructure. It is currently being developed at the Berlin State Library in close cooperation with the Institute of Library and Information Science at Humboldt University of Berlin. The project is funded by the German Research Foundation and will run from October 2025 to March 2028.
What is SoNAR?
The name SoNAR stands for Social Network Analysis and Related Research. SoNAR is a data marketplace that will provide several hundred million records on actors – individuals, organisations, families – and their relationships. This represents an unprecedented volume of data for historical research. The core principle lies in linking information from various archives, museums and libraries and making it accessible. Researchers can use SoNAR to search for specific data, filter it and download it, before analysing it using their digital tools. In this way, the infrastructure project responds to the growing interest within the humanities and social sciences in approaching historical questions using new digital methods.
Network analyses, for example, are used by historians, literary scholars, economists and sociologists to investigate a wide variety of networks of relationships that are often deeply hidden within the data: academic contacts, archaeological material networks, political movements or economic relationships. What links all these different research interests is a relational perspective – that is, an examination of how people, objects or ideas were connected to one another.
Data sources
One of SoNAR’s strengths lies in the systematic consolidation of heterogeneous data sets. Initially, five major data collections will be integrated. The Gemeinsame Normdatei (GND, the Integrated Authority File) is the central reference system in the German-speaking world and ensures, for example, that personal names can be unambiguously identified. Kalliope catalogues archival holdings such as estates and autograph collections – including letters, diaries and manuscripts – from libraries, archives and museums. The platform museum-digital, which already has an international presence, enables museums of all sizes to carry out professional cataloguing. Social Networks and Archival Context (SNAC), an international cooperative based in the United States, links archival sources on the basis of uniquely identified individuals, organisations and families, as well as their relationships. The Zeitschriftendatenbank (ZDB), the German union catalogue of serials, completes the offering with records of newspapers and journals spanning several centuries. Further digital collections are set to be added in the coming years.
But linking these datasets is tricky: every institution catalogues according to its own rules. Libraries do it differently from archives, and archives differently from museums. This results in very different datasets. SoNAR has therefore designed an intelligent intermediary layer. It translates these different ‘cataloguing languages’ and brings them into a uniform format. This has a major advantage: if the same person appears in multiple sources – for example, a historical figure who both wrote letters (recorded in Kalliope) and served as co-editor of a journal (noted in the ZDB) – SoNAR uses standardised data not only to recognise that this is the same person, but also to identify further connections: Who was in contact with whom and in what capacity, and where were they active?
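The principle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the identifier `EXAMPLE-0001` and the two mapping functions are hypothetical and do not reflect SoNAR's actual data model or the real export formats of Kalliope or the ZDB. The sketch shows two differently structured records being translated into one common format and then merged via a shared authority identifier, with the source of each statement retained.

```python
from collections import defaultdict

# Invented raw records; the field names are hypothetical stand-ins for
# two different cataloguing conventions.
letter_record = {"verfasser": "Example, Anna", "norm_id": "EXAMPLE-0001"}
journal_record = {"editor": "Anna Example", "authority": "EXAMPLE-0001"}

def from_letter_catalogue(rec):
    """Translate a letter-catalogue record into the common format."""
    return {
        "authority_id": rec["norm_id"],
        "name": rec["verfasser"],
        "role": "letter writer",
        "source": "letter catalogue",
    }

def from_journal_catalogue(rec):
    """Translate a journal-catalogue record into the common format."""
    return {
        "authority_id": rec["authority"],
        "name": rec["editor"],
        "role": "co-editor",
        "source": "journal catalogue",
    }

def merge_by_authority_id(records):
    """Group unified records by identifier, keeping the source of each role."""
    actors = defaultdict(lambda: {"names": set(), "roles": []})
    for rec in records:
        actor = actors[rec["authority_id"]]
        actor["names"].add(rec["name"])
        actor["roles"].append((rec["role"], rec["source"]))
    return dict(actors)

unified = [
    from_letter_catalogue(letter_record),
    from_journal_catalogue(journal_record),
]
actors = merge_by_authority_id(unified)
```

Although the two records spell the name differently, the shared identifier lets them be recognised as one actor with two documented roles.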
What SoNAR does – and what it doesn’t
SoNAR will facilitate the collection, curation, selection and processing of data, and document the source of every single finding. This makes research using this data transparent and reproducible. Others can see what has been done and replicate the studies.
But SoNAR is just one component in the research process. More data does not automatically lead to better results, and SoNAR is not a ‘one-click’ solution. The research work remains the responsibility of the researchers. They must assess which data are relevant to their research questions and likely to yield new insights, and which are not. It is also important to note that SoNAR can only work with existing data sourced from cultural heritage institutions or previous research projects. For some research interests, it will therefore still be necessary to first collect other or supplementary data.
Open Science rather than data power
SoNAR will be openly accessible, with no restrictions on use. This fundamentally distinguishes the infrastructure from the offerings of large corporations in the data economy. SoNAR is funded by public money and is not profit-driven, so researchers should be able to use the service without any barriers. The guiding principle behind SoNAR is that whoever has access to data can conduct research freely. And free research benefits everyone.













