Social media is a ubiquitous part of our lives. For many government analysts and investigators, it is also a ubiquitous part of their workflows.
This ecosystem has become more visible over the past few months as meddling in the U.S. presidential election has come to light. These events have dramatically changed the playing field and will create major disruptions for organizations that rely on aggregated social media data for critical missions.
To maintain their “free” status, social media companies have several ways to monetize their platforms. A primary method is to target users with advertising. With detailed user behavior stats, social media platforms can place targeted ads that speak to the specific preferences or characteristics of the user.
Another opportunity, and perhaps more concerning as detailed below, is for social media companies to sell programmatic access (also known as API access) to the collected behavior data of each user. Every post, like, check-in and retweet is captured as an event, correlated with a user profile, and sold to vendors.
Those vendors then parse the data using advanced analytics to identify trends, patterns or criminal activity. A vendor may use this data directly for marketing purposes, or sell it as an aggregated feed to downstream organizations looking to conduct their own research on the user population. Social media activity feeds are often sold to law enforcement or other investigative organizations.
But this gravy train is likely to come to a halt. Several current events foreshadow the potential end of programmatic access to aggregate social media data.
In 2016, the European Union adopted the General Data Protection Regulation (GDPR), which governs how the personal data of European citizens or residents is stored and protected. It goes into effect on May 25, 2018. GDPR will profoundly change how any company that stores personally identifiable data on EU citizens or residents manages that data.
In explaining the regulation, the European Commission has described personal data as “any information relating to an individual, whether it relates to his or her private, professional or public life. It can be anything from a name, a home address, a photo, an email address, bank details, posts on social networking websites, medical information, or a computer’s IP address.”
That ‘covered’ data needs to be secured with robust, standardized information security processes. Any vendor sharing covered data with a third party remains responsible for the integrity of the data. And any vendor who receives a subject access request (SAR), in which a user asks what data is held about them, or a related request to correct or delete that data, must comply. If they don’t, the company is subject to fines of up to 4% of annual global revenue or €20 million, whichever is higher.
Put another way, under GDPR, user data belongs to the user, not to the vendors. If providers, or their downstream customers, don’t comply, they’re liable. This is true regardless of a company’s physical location.
More recently, the systemic abuse of Facebook’s API came to light through the investigation into Cambridge Analytica (CA) and the 2016 presidential election.
A whistleblower disclosed how CA programmatically grabbed data from more than 50 million people via a loophole in Facebook’s API to create complex pattern-of-life profiles to influence decision making. Legislative bodies on both sides of the Atlantic have started inquiries into Facebook and the monetization of data.
There is no doubt that in the aftermath of these revelations, social media companies will face legal restrictions on how they share and monetize user data, regardless of the terms of the EULA that people accept.
Many government organizations have come to rely on social media aggregators for mission-critical Open Source Intelligence (OSINT). Those that do know that losing access can affect the mission.
In 2016, Twitter moved to cut off third-party vendors Dataminr and Geofeedia from its API if they continued to sell to law enforcement and the Intelligence Community. While Dataminr and Geofeedia were impacted, the mission wasn’t, because dozens of other companies sold access to the data. The government had cheap alternatives. Twitter’s decision to cut API access was localized to a few vendors; the coming legislative changes are likely to affect all personal data. In other words, given current events, the next change will probably impact all social media platforms, not just a few vendors.
However, the mission continues. Government organizations will still need to access social media data to conduct OSINT. But the source data will be much harder to obtain, and will likely require manual analysis. While analysts and investigators excel at overcoming collection gaps, the loss of programmatic access to social media will be as profound a collection gap as we’ve seen in decades.
Because analysts have benefited from composite feeds and machine-learning tools that synthesize and distill data, their primary OSINT skills may have atrophied. In a post-legislative world, analysts will need to get better at first-party collection and adversary assessment, harkening back to the pre-big-data days of adversary research.
How effective will they be, and how quickly can they come up to speed? The answer depends largely on the quality of tools and training that the government is willing to provide to fill the coming void.