Twitterbots: Anatomy of a Propaganda Campaign
Posted by the editors on 07-06-2019 12:18 in News
Gillian Cleary
Senior Software Engineer

Internet Research Agency archive reveals a vast, coordinated campaign that was incredibly successful at pushing out and amplifying its messages.

Key Findings
The operation was carefully planned, with accounts often registered months before they were used – and well in advance of the 2016 U.S. presidential election. The average time between account creation and first tweet was 177 days.
A core group of main accounts was used to push out new content. These were often “fake news” outlets masquerading as regional news outlets, or accounts pretending to be political organizations.
A much larger pool of auxiliary accounts was used to amplify messages pushed out by the main accounts. These usually pretended to be individuals.
The campaign directed propaganda at both sides of the liberal/conservative political divide in the U.S., in particular the more disaffected elements of both camps.
Most accounts were primarily automated, but they would frequently show signs of manual intervention, such as posting original content or slightly changing the wording of reposted content, presumably in an attempt to make the accounts appear more authentic and reduce the risk of their deletion. Fake news accounts were set up to monitor blog activity and automatically push new blog posts to Twitter. Auxiliary accounts were configured to retweet content pushed out by the main accounts.
The most retweeted account garnered over 6 million retweets. Only a small fraction (1,850) of those retweets came from other accounts within the dataset, meaning many of the retweets could have come from genuine Twitter users.

One of the main talking points of the 2016 U.S. presidential election campaign involved attempts to surreptitiously influence public opinion using social media campaigns. In the months after the election, it quickly became apparent that a sophisticated propaganda operation had been directed against American voters.

Not surprisingly, news of these campaigns caused widespread public concern, prompting social media firms to launch investigations into whether their services had been misused. In October 2018, Twitter released a massive dataset of content posted on its service by the Internet Research Agency (IRA), a Russian company responsible for the largest propaganda campaign directed against the U.S.

The dataset consisted of 3,836 Twitter accounts and nearly 10 million tweets. These accounts had amassed almost 6.4 million followers and were following 3.2 million accounts. The sheer volume of data was enormous, more than 275 GB.

The archive has proven to be a treasure trove of information on how the IRA’s propaganda campaign operated. For example, prior to the release, many people assumed that its posts were focused on just one side of the political spectrum. Once the data was made public, it quickly became obvious that the campaign directed propaganda at both sides of the liberal/conservative political divide in the U.S., in particular the more disaffected elements of both camps. Rather than promoting one side, its main objective appeared to be sowing discord by inflaming opinions on both. This was not confined to the online sphere: several of the accounts were used to organize political rallies in the U.S., and some of the most influential accounts in the dataset were used to promote these events to the largest possible audience.

However, believing that there is a lot to learn from this data beyond its messages and target audience, we decided to carry out some in-depth analysis of the archive to learn more about how this propaganda campaign worked. What we discovered was that this was not an ad-hoc response to political events in the U.S. Instead, the evidence points to a carefully planned and coordinated operation, with the groundwork often laid months in advance.

While the tactics employed changed somewhat over time, the basic template for this operation remained the same, utilizing a small core of accounts to push out new content and a wider pool of automated accounts to amplify those messages.

Along the way, we also came across some interesting bits of information, such as what appeared to be some rogue operators using monetized link-shortening services to make some money on the side.

Different account types
Once we started analyzing the data, it became apparent that the accounts could be divided into two main categories, which we called main accounts and auxiliary accounts. Each category had different characteristics and played a different role.

Main accounts had at least 10,000 followers but followed substantially fewer accounts. They were primarily used to publish new tweets.

Auxiliary accounts had fewer than 10,000 followers, but often followed more accounts than that. Their main purpose was to retweet messages from other accounts, although they were also used to publish original tweets. Not surprisingly, the majority of accounts were auxiliary accounts. We identified 123 main accounts and 3,713 auxiliary accounts within the dataset.
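The split between the two categories can be expressed as a simple threshold rule. The following is a minimal sketch rather than the tooling actually used for the analysis: the `Account` record, its field names, and the sample handles are assumptions made for illustration, and only the 10,000-follower cut-off comes from the text. The sketch classifies on follower count alone.

```python
from dataclasses import dataclass

# Follower threshold taken from the article; everything else is illustrative.
MAIN_FOLLOWER_THRESHOLD = 10_000


@dataclass
class Account:
    # Hypothetical record layout; the real archive's column names may differ.
    handle: str
    followers: int
    following: int


def classify(account: Account) -> str:
    """Return 'main' for accounts with at least 10,000 followers, else 'auxiliary'."""
    return "main" if account.followers >= MAIN_FOLLOWER_THRESHOLD else "auxiliary"


def count_by_type(accounts: list[Account]) -> dict[str, int]:
    """Tally how many accounts fall into each category."""
    counts = {"main": 0, "auxiliary": 0}
    for account in accounts:
        counts[classify(account)] += 1
    return counts


# Made-up sample data, not taken from the dataset.
sample = [
    Account("example_news_outlet", followers=25_000, following=1_200),
    Account("example_person_1", followers=800, following=11_500),
    Account("example_person_2", followers=9_999, following=300),
]
print(count_by_type(sample))
```

Run over the full archive, a rule of this kind would be expected to reproduce the 123/3,713 split reported above, provided the follower counts are read the same way.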

Main accounts were generally “fake news” outlets masquerading as regional news outlets, or accounts pretending to be political parties or hashtag games (the popular Twitter game in which people share anecdotes or jokes based on a single theme, such as #5WordsToRuinADate). Judging by their creation dates, they were usually created individually or in small batches. The default language selected for main accounts was always either English or Russian.

Auxiliary accounts usually pretended to be individuals, spreading the content created by the main accounts by retweeting it. These accounts were usually created in batches and sometimes hundreds of auxiliary accounts were created on the same day. For example, during May 2014, seven fake news accounts were set up by the agency, along with 514 auxiliary accounts.
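One way to surface this kind of batch registration is to count accounts per creation date and flag the days that exceed some threshold. The sketch below assumes each account's creation date is available as a `date` value; the function name, the `min_batch_size` default, and the sample dates are all hypothetical.

```python
from collections import Counter
from datetime import date


def creation_batches(creation_dates: list[date], min_batch_size: int = 10) -> dict[date, int]:
    """Return the days on which at least `min_batch_size` accounts were created."""
    per_day = Counter(creation_dates)
    return {day: n for day, n in sorted(per_day.items()) if n >= min_batch_size}


# Made-up dates for illustration only.
dates = [date(2014, 5, 12)] * 12 + [date(2014, 5, 13)] * 3 + [date(2015, 11, 1)]
print(creation_batches(dates))
```

The threshold is a judgment call: a low value also catches the small batches typical of main accounts, while a high value isolates the days on which hundreds of auxiliary accounts appeared.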

Many of the accounts were created long before they were used. The average time between account creation and first tweet was 177 days. The average length of time an account remained active was 429 days.
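The dormancy figure is the mean gap between each account's creation time and the time of its first tweet. As a rough sketch, assuming both timestamps are available per account (the function names and sample values here are illustrative, not taken from the archive):

```python
from datetime import datetime
from statistics import mean


def days_dormant(created: datetime, first_tweet: datetime) -> int:
    """Whole days between account creation and the account's first tweet."""
    return (first_tweet - created).days


def average_dormancy(accounts: list[tuple[datetime, datetime]]) -> float:
    """Mean dormancy in days over (created, first_tweet) pairs."""
    return mean(days_dormant(created, first) for created, first in accounts)


# Made-up timestamps for illustration only.
pairs = [
    (datetime(2014, 5, 1), datetime(2015, 1, 1)),
    (datetime(2015, 11, 1), datetime(2015, 11, 11)),
]
print(average_dormancy(pairs))
```

The same approach, applied to first and last tweet times instead, would give the average active lifetime of an account.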

Influential accounts
Some of the Twitter accounts created by the attackers managed to be extraordinarily influential. The most retweeted account within the dataset was TEN_GOP. Created in November 2015, the account masqueraded as a group of Republicans in Tennessee. It appears to have been manually operated.

In less than two years TEN_GOP managed to rack up nearly 150,000 followers. Despite only tweeting 10,794 times, the account garnered over 6 million retweets. Only a tiny fraction (1,850) of those retweets came from other accounts within the dataset. In other words, almost all of its retweets came from accounts outside the dataset, meaning many could have been real Twitter users.
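To put that fraction in perspective, the arithmetic can be checked directly. The figures below are the ones quoted above; because the total is given only as "over 6 million", the computed share is an upper bound.

```python
# Figures quoted in the article; 6 million is a lower bound ("over 6 million").
retweets_total_lower_bound = 6_000_000
retweets_from_dataset = 1_850

# Because the true total is larger, this share is an upper bound.
internal_share = retweets_from_dataset / retweets_total_lower_bound
print(f"Internal share: at most {internal_share:.4%}")
```

That works out to roughly 0.03 percent, or about one retweet in every 3,200.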

Due to the popularity of this account, the IRA created two backup accounts, ELEVEN_GOP and realTEN_GOP, in case TEN_GOP should be discovered and shut down. Both have the same profile description: “This is our backup account in case anything happens to @TEN_GOP”.

As has already been noted, the IRA’s campaign targeted both ends of the political spectrum in the U.S. and this is reflected in the breakdown of influential accounts. The top 20 most retweeted English-language accounts were split evenly between conservative and liberal messages.

There is a similar split in themes in the breakdown of the most followed English-language accounts. Thirty-five percent pretended to support conservative causes, while 30 percent pretended to back liberal causes. The remainder masqueraded as general or political news outlets.

Fake news
Most of the fake news accounts created by the agency pretended to be local news outlets, such as “New Orleans Online”, “El Paso Top News”, or “San Jose Daily”. The majority of these accounts were created between May and August of 2014, but lay dormant until January 2015, when most of them started tweeting. This suggests that the fake news element of the operation was planned well in advance.

The vast majority (96 percent) of these fake news accounts were fully automated, using services to monitor blog activity and automatically push new blog posts to Twitter. Another two percent of these accounts queued tweets for publication at scheduled times.

Most followed the same broad pattern: activity trended upwards from the beginning of 2015 until the summer of 2016, when it fell suddenly. By August 2016, all but one fake news account had stopped tweeting.

This is likely because the fake news accounts had been using a service known as Twitterfeed to feed their blogs to Twitter. During 2016, Twitterfeed announced that it would close by October of that year and the fake news accounts began to transition to an alternative service called Twibble. The drop-off in activity during August 2016 could have been caused by technical problems during the changeover. By December the changeover was complete and the fake news accounts had resumed business as usual.
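A changeover like this can be made visible by counting tweets per month for each posting client. The sketch below assumes every tweet record carries a timestamp and the name of the client that posted it; the tuple layout, the function name, and the sample tweets are hypothetical, and the exact client strings in the archive may differ from the service names used here.

```python
from collections import Counter
from datetime import datetime


def monthly_counts_by_client(tweets: list[tuple[datetime, str]]) -> Counter:
    """Count tweets per (month, client) pair, with the month formatted as YYYY-MM."""
    return Counter((posted.strftime("%Y-%m"), client) for posted, client in tweets)


# Made-up tweets for illustration; only the service names come from the article.
tweets = [
    (datetime(2016, 6, 3), "Twitterfeed"),
    (datetime(2016, 6, 9), "Twitterfeed"),
    (datetime(2016, 8, 20), "Twitterfeed"),
    (datetime(2016, 12, 1), "Twibble"),
    (datetime(2016, 12, 2), "Twibble"),
]
for (month, client), n in sorted(monthly_counts_by_client(tweets).items()):
    print(month, client, n)
```

Months in which one client's count falls to zero while another's rises would mark the transition window.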

Prolific accounts
Not surprisingly, a majority (55 percent) of the most prolific accounts were fake news accounts. However, a significant number of prolific accounts had their identities masked by Twitter because they had fewer than 5,000 followers. Most of these accounts acted as auxiliary accounts and were automated in the same manner as most of the fake news accounts. However, while fake news accounts were automated to publish original content, these other accounts were automated to retweet content.

Following the links
Some tweets within the dataset contained links to other content. The majority of links were shortened URLs, masking the final destination. To get a picture of which sites these tweets were linking to, we followed each shortened URL tweeted during 2016 to find out the ultimate link destination. The largest number of links led to other Twitter posts, some of which belonged to other suspended accounts.
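Following a shortened link amounts to requesting it, letting the HTTP client follow any redirects, and recording where it ends up. The sketch below uses only the Python standard library and is not the tooling used for the analysis; the function names and sample URLs are hypothetical. Note that `resolve` performs a live network request, and many shortened links from that period may no longer resolve.

```python
from collections import Counter
from urllib.parse import urlsplit
from urllib.request import urlopen


def resolve(url: str, timeout: float = 10.0) -> str:
    """Follow HTTP redirects and return the final URL (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return response.geturl()


def destination_domains(final_urls: list[str]) -> Counter:
    """Count final destinations by host name, ignoring a leading 'www.'."""
    hosts = Counter()
    for url in final_urls:
        host = urlsplit(url).hostname or ""
        hosts[host.removeprefix("www.")] += 1
    return hosts


# Made-up, already-resolved URLs for illustration only.
resolved = [
    "https://twitter.com/example/status/1",
    "https://www.youtube.com/watch?v=example",
    "https://twitter.com/other/status/2",
]
print(destination_domains(resolved).most_common())
```

Keeping resolution and tallying separate means the slow, failure-prone network step can be run once and cached, while the domain counts are recomputed as needed.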

Aside from Twitter, other social media services such as YouTube, Instagram, and Facebook also figured highly in the list of websites being linked to. The rest of the list was mostly filled out by links to legitimate media outlets. This suggests that, along with delivering “fake news”, the campaign also leveraged real news stories that supported the messages it was pushing.

2016 U.S. presidential election
During the months prior to the 2016 U.S. presidential election, there was a marked increase in activity. Between January and November 2016, accounts within the dataset sent 771,954 English-language tweets, with volume rising further as November approached.

Analysis of the topics tweeted by English-language accounts during this time period reveals that content was heavily focussed on the election and quite evenly split between topics relating to either side of the political spectrum in the U.S.

Political rallies
Perhaps the most overt aspect of the propaganda campaign was the IRA’s organization of a number of political rallies in the U.S. Although the accounts consisted of fake personas and organizations, they nevertheless succeeded in mobilizing people to attend events. The rallies supported positions on both sides of the political spectrum.

One account, @March_for_Trump, advertised more events than any other account in the dataset, promoting events 47 times. Most of the other accounts involved in promoting rallies are anonymized as they have fewer than 5,000 followers.

Only nine accounts in the dataset retweeted March_for_Trump’s tweets promoting rallies, but the majority of these retweets came from the most followed or retweeted accounts in the dataset, suggesting the campaign’s operators had prioritised promoting these events.

Professional campaign
While this propaganda campaign has often been referred to as the work of trolls, the release of the dataset makes it obvious that it was far more than that. It was planned months in advance and the operators had the resources to create and manage a vast disinformation network.

It was a highly professional campaign. Aside from the sheer volume of tweets generated over a period of years, its orchestrators developed a streamlined operation that automated the publication of new content and leveraged a network of auxiliary accounts to amplify its impact.

The sheer scale and impact of this propaganda campaign are obviously of deep concern to voters in all countries, who may fear a repeat of what happened in the lead-up to the U.S. presidential election in 2016.

A growing awareness of the disinformation campaigns may help blunt their impact in future. If you’re concerned about falling victim to similar campaigns, read our blog post How to Spot a Twitter Bot, which can be found on Symantec’s Election Security blog.

About the Author
Gillian Cleary
Senior Software Engineer
Gillian is a software engineer working for Symantec's Security Technology and Response (STAR) team. She analyses emerging threat trends and develops tools for use within the department.
