Artificial Intelligence and the Privacy Paradox of Opportunity, Big Data and The Digital Universe

🔬 Research summary by Connor Wright, our Partnerships Manager.

Overview: Thanks to the pandemic, increasing internet connectivity, and companies sharing our data more efficiently, even our most private data…isn’t. This paper explores data privacy in an AI-enabled world. Data awareness has increased since 2019, but the fear remains that Smith’s findings will stay too relevant for too long.

With access to data fuelling the use of protestor tracking in Uttar Pradesh, India, as well as in the Myanmar demonstrations, privacy continues to hold a prominent position in the AI debate.

Where does data end up when we die? Will our data outlive us? What are the implications of that? With increased processing speeds and connectivity, companies can more accurately infer information about us from our activity on other websites. Increased data awareness has slowed this down somewhat, but I fear that Smith’s findings may remain relevant for a while yet.

What may even live past our own expiry date is our data itself, so I’ll first touch upon Smith’s question of where accumulated data goes. In her op-ed about the lifecycle of AI systems, MAIEI staff member Alexandrine Royer refers to the ‘AI junkyard’, a place where now-obsolete AI algorithms, software and hardware go to rest. Given the huge amounts of data being collected due to the current pandemic, whether on coming into contact with someone who has tested positive, your medical history, or survey responses, Smith’s question about where this all goes is still very much relevant today.

Such an observation led me to wonder whether, when I eventually go to my own graveyard, my data will follow me. Smith’s observation that our data has a very high chance of outliving us can be a chilling thought, especially given how we would have literally no control if a company refused to delete our data despite our passing. Even our most private data could remain in the hands of institutions, companies and other third-party actors who, without any legal fight being put up, are unlikely to go rooting for my data in order to delete it. Furthermore, even if they do take the time, my data has potentially already helped shape model behaviour, meaning that its deletion is a mere formality. Well, at least my most private data dies with me, right?

According to Smith, this may not be the case. As encapsulated in his privacy paradox, even our most private data are not private. Given the interconnectivity of the internet, the sharing of data between different platforms (such as WhatsApp and Facebook) makes it easier to infer certain qualities from our actions on different websites. Writing in the midst of the Cambridge Analytica storm, Smith comments on the power of such companies to extract data even without our knowledge (such as the 85 million Facebook users that formed Cambridge Analytica’s database). Now, with the slow entrance of 5G from companies such as Huawei, Smith notes how this interconnectivity is only going to increase, with faster processing times facilitating such data sharing.

Smith points out that the mobile phone “is the modern-day version of the loyalty card without the perceived rewards” (p. 151). We are gracing social media clients with our data custom and having our mobile phones ‘stamped’ each time we visit, but without an end in sight to reward us for our ‘business’. This becomes even more interesting when Smith acknowledges how such social networking sites don’t actually produce that much content on their platforms. Rather, it’s the users generating the content that keeps such sites engaging and vibrant, yet such loyalty is still not ‘rewarded’ with a welcome free coffee.

With faster processing speeds and even more data being required thanks to the pandemic, the never-ending expansion of the digital universe is still very much in full swing. There is now more awareness of privacy issues thanks to the Cambridge Analytica scandal in 2018, especially with businesses now legally obliged to display their cookie policy and ask for consent to use cookies as you enter their website. While such awareness goes a long way towards creating a space for privacy in our new digital universe, it’s worrying how much of Smith’s analysis still applies to our world in 2021.

Between the lines

One message that Smith did not make crystal clear in his paper, and that is worth considering in my view, is the need to own your data. It is widely touted that data has become the new oil, with AI deprived of it basically being the mechanical equivalent of a human stranded in a barren desert. In this sense, true privacy comes through true control over your data: who gets access to it and for what purpose. Privacy efforts in 2021 would therefore do well to focus on practical ways of obtaining such control, or even just on cultivating a business environment where this is encouraged. Otherwise, I fear Smith’s privacy paradox will continue to hold steady for far too long into the future.