Evelyn S. Ruppert (Goldsmiths, University of London)
3.144 billion Internet users, 957 million websites, 180 billion emails, 3.6 billion Google searches, 3 million blog posts, 679 million tweets, 7.4 billion videos viewed, 161 million photos uploaded to Instagram – all of that activity measured on one day in June 2015 and generating what is often referred to as digital traces and Big Data. But Internet statistics tell us very little about how data is made. What they don’t tell us is that numerous people, technologies, practices and actions are involved in how data is shaped, made and captured. In other words, data is a collective accomplishment and would not be possible without relations between people and technologies. So despite claims that it is natural and raw, data is the result of the decisions, priorities, interests and values of numerous actors. In the race to exploit the hidden potential of Big Data in business, government and academia to tell us truths about societies, we risk making errors of interpretation and understanding if we don’t attend to these questions of how data is socially produced.
Statistics on Big Data and these key questions were the background to an ESRC-funded project called Socialising Big Data, which sought to go beyond pronouncements about the data deluge or data revolution. Instead, the project focused on developing an understanding of the social relations that we are a part of and which come to make and be remade by Big Data. To jump to the conclusion, the project developed an understanding that Big Data is inherently social because it is a product of, and has the capacity to establish, social relations. This understanding led to a call for reframing conventional debates that typically focus on ownership towards the development of a social ethics that recognises the connectedness and interdependent relations that make up and are made up by Big Data. It was through this argument that the project – quite unexpectedly – came to address the question of who owns Big Data.
We got to this question through the identification of three kinds of relations that we term ‘data socialities.’ They are relations that come into being through Big Data, understood as the product of different actors and technologies involved in its generation (digital platforms, mobile devices, sensors, sequencers), formatting (cleaned, linked, packaged, stored, curated) and analysis (mined, visualised, correlated).
The first relation concerns how people are attached to and socialised by Big Data. Through their bodies and actions, interactions and transactions they become part of the sociotechnical arrangements that generate Big Data and through which they become data subjects. That is, people are part of, attached to, and become subjects through social media platforms, sensors and genomic sequencers, which then come to generate Big Data. Rather than being separate, data subjects and sociotechnical arrangements are both formed and changed through their mutual attachment. Platforms such as social media or search engines are calibrated and recalibrated in relation to what subjects do, and subjects adjust and change their actions and interactions in relation to those calibrations. There are, in other words, feedback loops between the two, and Big Data is an outcome of these relations. Put simply, subjects and sociotechnical arrangements do not exist without each other, and both change and modulate in relation to each other. This challenges assumptions that device and platform owners are the designers and makers of data and merely collect and thus own it.
A second relation concerns how subjects become attached and connected to each other through Big Data. People can identify affiliations and form communities of mutual support through biological and cultural data; from genomic to social media data, people can identify with and become attached to each other. Attachments between people (networks, groups, profiles and publics) can also be identified through associations and patterns in Big Data. While there can be much uncertainty about the validity, veracity, meaning and implications of these attachments, the point is that Big Data has the potential to join up and connect people socially in novel ways. At the same time, attachments can and do have an impact on people through targeted interventions or the differential treatment of identified groups.
A final relation is that which occurs when Big Data is shared, re-used, mixed, and analysed across diverse contexts and situations. In this view, Big Data connects myriad distributed people (computer scientists, data handlers, mathematicians, platform designers) and technologies (computers, devices, software, algorithms). Indeed, Big Data was the very thing that enabled us as social scientists to come together with practitioners working in the diverse contexts of national statistics, waste management and genomics. That is, Big Data connected us socially and in this way constituted a boundary object between our multiple practices, interpretations and contexts.
In these three ways data socialities can be understood as the product of attachments to and uses of Big Data, which is a collective accomplishment of connected and interdependent people and technologies. That Big Data is inherently social offers a different approach to conventional debates about who owns Big Data: corporations or data subjects? For instance, Alex Pentland proposes a ‘New Deal on Data’ to ‘rebalance the ownership of data in favor of the individual whose data is collected.’ In this proposal, data is understood as individual rather than socially interdependent, and subjects are engaged as customers and owners. Data socialities suggest neither: Big Data is a collective and social product that exists and is maintained through connected and interdependent relations. In these ways data socialities offer a different starting point to other related arguments such as open data and proposals for a data commons, which start with the social goods and benefits that sharing data can deliver rather than the social of Big Data. Taking the latter as a starting point provides an opportunity to reframe debates on ownership and related issues of privacy, confidentiality, data protection, and consent in what we are calling a social ethics of Big Data. This is the challenge that the project is now posing in a proposed ‘Social Framework for Big Data’, which will be published later this year.
Evelyn Ruppert is Professor and Director of Research in the Department of Sociology at Goldsmiths, University of London. Evelyn is a Data Sociologist with interests in the sociology of governance, specifically in relation to how different kinds of data are constituted and mobilised to enact and manage populations. She has undertaken research on how different socio-technical methods and forms of data (censuses, administrative databases, surveys, transactions) organise and make possible particular ways of constituting and governing populations and how digital devices and data are reassembling social science methods. She is currently PI of an ERC-funded Consolidator Grant project, Peopling Europe: How data make a people (ARITHMUS; 2014-19), and recently completed an ESRC-funded project, Socialising Big Data (2013-14). She is founding Editor-in-chief of a SAGE open access journal, Big Data & Society. Her book, Being Digital Citizens (with Engin Isin), was published in April 2015.