#protestbaku analysis – the day after

ProtestBaku was the “official” hashtag of the protests in Baku over the past few months. It has been used by both the opposition and pro-government groups.

I’ve done a number of hashtag analyses on it.

Throughout the last few months, the pro-government tweeple have both hijacked the ProtestBaku hashtag and started a few of their own. One such example is #molotovlugenclik (Molotov Cocktail Youth), which I wrote about here.

I’m not a fan of the “impressions” or “top tweet” metrics, but I’d like to look at the networks themselves.

There are two ways to conduct Twitter analysis – one looks at followers (who follows whom) and the other looks at replies and retweets (that is, actual use of the hashtag).

I’m going to focus this analysis on use of the hashtag.

With protestbaku, there were 471 Twitter accounts that used the phrase, and 529 connections (replies and retweets).

The density (a 0-to-1 measure of how interconnected the twitterers are) is 0.001. Low interconnectedness might seem “bad,” but since the point of this hashtag is to spread information, low density is a good thing.
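For anyone curious where a density number comes from, here is a minimal sketch in Python. It assumes a directed graph where self-loops and duplicate ties are ignored; the exact figure a tool like NodeXL reports depends on its own settings, so this will not necessarily reproduce it. The function name and the toy data are made up for illustration.

```python
def directed_density(num_nodes, edges):
    """Share of possible directed ties that are actually present (0 to 1)."""
    # Assumption: duplicate ties and self-loops are dropped before counting.
    unique = {(a, b) for a, b in edges if a != b}
    possible = num_nodes * (num_nodes - 1)
    return len(unique) / possible if possible else 0.0


# Toy example with made-up accounts: 4 accounts, 3 distinct ties.
toy_edges = [("a", "b"), ("b", "c"), ("a", "b"), ("d", "a")]
print(directed_density(4, toy_edges))
```

The closer the result is to 1, the more everyone is talking to everyone else.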

The average distance between any two tweeple in the map is 4.09. You can think about it this way: for information to go from one person to another, it would typically take about four hops, X…X…X…X…X, with three people in between.
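Average distance can be sketched the same way. The version below is only an illustration: it assumes ties are treated as undirected and averages the shortest path over every pair of accounts that can reach each other. The function name and the toy chain are made up.

```python
from collections import deque


def average_distance(edges):
    """Mean shortest-path length over all connected pairs, ignoring direction."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    total, pairs = 0, 0
    for start in neighbors:
        # Breadth-first search gives the hop count from start to everyone reachable.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Toy chain of five made-up accounts: A-B-C-D-E.
chain = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
print(average_distance(chain))
```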

full report



This second analysis also takes into consideration people that follow each other. This tells you less about the spread of information and more about the relationships between people on Twitter.

Here there were 468 tweeple, with 9925 connections between them. The average distance was 2.25, so very close. Density was .05, so tighter than the model without followers, but information still flows widely.

The most connected users are:

So there are 3 main groups.

Group 1 is foreigners and anti-government.

Group 2 includes mostly pro-government Twitter users.

Group 3 focuses on diasporan Azerbaijani tweeple like muntezir and JamalAli.

full report



#protestbaku did a very good job of spreading information widely, especially compared to other similar hashtags both in Azerbaijan and in different protests globally.

Despite hijacking efforts, the hashtag was “controlled” by those who used it properly.

Hashtag analysis #protestbaku #10mart – March 10

Lots of action in Azerbaijan right now – I can barely begin to describe.

But here’s some hashtag analysis.

#protestbaku from 10 March 3am
full report


I don’t have much to say about this right now, but more will come tomorrow.

#10mart full report

10 March 3pm full report

10 March 4pm full report

10 March 5pm full report

10 March 7pm full report

10 March 9pm full report

#protestbaku got interesting again

I’ve been collecting the tweets for #protestbaku but things haven’t been interesting in a while. But let’s look at this analysis from March 8 at 5am.


full report

I don’t really get these clusters.

Groups 1 and 2 tweet in Azerbaijani mostly. And somehow I am in group 2. I am not sure what the differences between groups 1 and 2 are.

Group 3 is a mix of foreigners and locals, tweeting in English and Azerbaijani.

But this weekend is a new protest, so the hashtag should liven up. Maybe this is a good pre-protest sense of what is happening.

What are these knots of string doing all over your blog, Katy?

For the last 2 weeks, I’ve been posting a lot of images on my blog, but I haven’t taken the time to explain what or why or how these are happening.

Azerbaijan, one of the countries that I study, is experiencing some turmoil right now. Briefly, a soldier was killed in the military as a result of hazing. There was a coverup, but it was found out and photos of the soldier’s body came out on social media. As a result, a demonstration was organized, mainly on Facebook, to protest this sort of thing occurring in the military. (It is not uncommon.)

On January 12, the protest occurred.

At first I was involved in social media as I normally am – retweeting, sharing stories, etc. because I have a general interest in democracy and technology in this country. But then I had an idea to analyze the tweets. This seemed especially important to me because there was a bit of a battle occurring on the main event hashtag #protestbaku between pro-government and more democratically-inclined social media users.

So, thanks to Marc Smith, the first social network analysis of the #protestbaku hashtag was created about 3 hours after the protest started using the NodeXL program.

What is a social network analysis? Via Wikipedia:

Social network analysis (SNA) is the methodical analysis of social networks. Social network analysis views social relationships in terms of network theory, consisting of nodes (representing individual actors within the network) and ties (which represent relationships between the individuals, such as friendship, kinship, organizational position, sexual relationships, etc.) These networks are often depicted in a social network diagram, where nodes are represented as points and ties are represented as lines.

And NodeXL is a free tool that works with Microsoft Excel to create interactive network visualizations. It is fairly easy to use once you get used to it.

So with this program, you can see who follows whom on Twitter, who replies to each other, etc. And then it shows this all visually.

After the first January 12 protest, I did a new analysis of all the tweets, then again after the weekend was over. Then a week later I ran the analysis again and again.

The pro-government social media users started a counter-hashtag to shame a journalist. I noticed that there were a lot of strange twitter accounts associated with that hashtag. Analysis of that is here and here. I’m not going to leap to any conclusions, but please read for yourself.

Then on January 23, a riot began in the regional city of Ismayilli. There was much tweeting about it, mainly from people not on the ground. But once again I did analysis of the tweets. I also made a graphic of the changing dynamics of the hashtag.

Then on January 26, another protest was organized in Baku and once again the #protestbaku hashtag was used. I kept all the analysis on one page here.


So why am I doing this?

  • I am a social scientist. I like seeing patterns in things and I believe that this sort of modeling can add to understanding.
  • I have the resources (time, computing power, skills) to do this.
  • I like making analyses accessible to people that don’t have the skills that I do.
  • I believe that information (to some extent) should be free. Moreover, I imagine that people in power have tools to understand networks like this and giving this information to everyone is more egalitarian.
  • This information (social media data) is already out there in the world, just not organized in this way.
  • I believe in freedom of expression. I am deeply sad that there is little freedom of expression in Azerbaijan.
  • If this analysis can be a tool for those supporting freedom of expression, that gives me a great deal of joy. I hope that it is not also being used as a tool of suppression, but that is the price one pays for transparency and openness.
  • It is possible that at some point I will write up some of this in the form of an academic article.
  • I’ve received a lot of positive feedback from those involved in these events that this analysis has been useful to them. It isn’t often that this sort of thing can have an immediate application, so this is really cool.

In the meantime, I am happy to answer any questions about this.


#protestbaku – part 2

So there is a new protest on January 26. It started at 3pm Baku time. Here’s the analysis for 4pm Baku time.


full report

229 users with 3559 in the last 5 hours.

most common words:

protestbaku – the hashtag itself
rt – retweet
ismayilli – the hashtag for the other event of the week
police – obvious
azerbaijan – obvious
detained – telling
sahil – means coast/shore, referring to Sahil Park
eminmilli – at the protest, was detained
polis – police
baku – obvious
və – and
ruslanazad – a main tweeter
azerbaycan – Azerbaijan in Azerbaijani
muntezir – a main tweeter
protest – obvious
turanoza – a main tweeter
saxlanıldı – held
plan – tweeted “plan b” when protest moved from 1 location to another
huseynovaturkan – a main tweeter
b – from plan b
emin – common name
bağına – garden (?)
ismayil – Khadija Ismayil, journalist
milli – national and surname of Emin Milli
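A word list like the one above can be approximated with a few lines of Python. This is only a sketch under my own assumptions (links are dropped, everything is lowercased, and the "#" and "@" signs are stripped), not a description of how NodeXL counts words. The sample tweets are invented.

```python
import re
from collections import Counter


def top_words(tweets, n=10):
    """Count lowercase word tokens across tweets; links are ignored."""
    counts = Counter()
    for text in tweets:
        # Remove links first so fragments of URLs are not counted as words.
        text = re.sub(r"https?://\S+", " ", text.lower())
        # \w+ keeps letters such as ə and ı but drops '#' and '@'.
        # Caveat: the dotted capital İ is not handled specially here.
        counts.update(re.findall(r"\w+", text))
    return counts.most_common(n)


# Invented sample tweets, not taken from the real data.
sample = [
    "RT @muntezir: police at Sahil #protestbaku",
    "Plan B #protestbaku",
    "polis və police #protestbaku http://t.co/example",
]
print(top_words(sample, 2))
```

Splitting on word boundaries is also why a lone "b" shows up in the list: "plan b" becomes two separate tokens.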

Top URLs are live videostreams and photos:

As far as the groups – they are a little strange to me this time. I’m open to any interpretations/suggestions here!

Here’s the analysis for 6pm Baku time.


full report

265 users with 3217 in the last 2 hours (basically since the last analysis was run — in a few hours, I’ll combine all the hours for a full analysis).

4 groups now – group 1 is foreigners and people with a large foreign followership like Emin Milli and FuserLimon. Group 2 are people tweeting in Azerbaijani on the ground. Group 3 is news broadcasters like Muntezir. Group 4 seems to focus on Arzu Geybulla.

But as you can see, all of these people are in a pretty close looped network. They’re mainly following each other.

Here’s the analysis for 8pm Baku time.


full report

285 users with 3652 in the last 2 hours (basically since the last analysis was run — in a few hours, I’ll combine all the hours for a full analysis).

2am Jan 27 Baku time – this is the last 7 hours


full report

Users: 385
Tweets: 5651


1am January 28 update

full report

#khadijautan update

My original post is here.

It has been 2 days since I ran my first analysis, so here’s an update.

The groups this time are now MUCH clearer to me. As you can see there are 3 groups – groups 1 and 3 are anti-Khadija tweeters and group 2 are anti-anti-Khadija tweeters.


Link to full analysis

150 people used the #khadijautan hashtag with 2527 total tweets.

So, now to the other issue… what was going on with all of these repeated tweets? Out of 1245 tweets, VERY FEW were original.

In Excel, I sorted all the tweets alphabetically. I then also used conditional formatting in Excel to turn duplicate tweets red. (Also of note, if a URL shortener was used, the tweets don’t look like duplicates because they have different URL shortenings – but I hand-coded those.) (You can easily download the Excel file here and look at it yourself.)
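If you would rather do this in code than in Excel, here is a rough Python equivalent. It is a sketch, not the procedure used for this post: it assumes the tweets are available as (account, text) pairs, and it replaces every link with a placeholder, which is cruder than hand-coding because two different destinations would be treated as the same. The function names and sample rows are made up.

```python
import re
from collections import defaultdict

URL = re.compile(r"https?://\S+")


def normalize(text):
    """Lowercase, swap any link for a placeholder, collapse whitespace."""
    return " ".join(URL.sub("<url>", text.lower()).split())


def find_duplicates(tweets):
    """Map each repeated (normalized) text to the accounts that posted it."""
    groups = defaultdict(list)
    for account, text in tweets:
        groups[normalize(text)].append(account)
    return {text: accounts for text, accounts in groups.items() if len(accounts) > 1}


# Invented rows: two accounts post the same text with different short links.
sample = [
    ("user_a", "Same message #khadijautan http://t.co/aaa"),
    ("user_b", "Same message  #KhadijaUtan http://t.co/bbb"),
    ("user_c", "Something original #khadijautan"),
]
print(find_duplicates(sample))
```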

Yes, the same accounts tended to be the ones that were writing the same tweets. There are many examples of this. In fact, the majority of the hashtag was this kind of tweet.

I’m not making any conclusions, but I wanted to point out that a lot of the same people are posting the exact same tweet and that is strange. These accounts seem to be “real” in that these people have since tweeted other things unrelated to the #khadijautan hashtag.

But then I saw a strange pattern – these repeated tweets were all posted a few minutes apart.


If I were a gambler, I’d say that either one person was logging into multiple Twitter accounts or some sort of program was used.
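The timing pattern can be checked with a small script too. The sketch below assumes you have the posting times of one repeated tweet as text in a "YYYY-MM-DD HH:MM" format (that format, the function name, and the timestamps are all my own placeholders) and returns the gaps in minutes between consecutive postings.

```python
from datetime import datetime


def posting_gaps(timestamps):
    """Minutes between consecutive postings of one repeated tweet."""
    times = sorted(datetime.strptime(t, "%Y-%m-%d %H:%M") for t in timestamps)
    return [
        (later - earlier).total_seconds() / 60
        for earlier, later in zip(times, times[1:])
    ]


# Made-up timestamps for one repeated tweet, not taken from the real data.
example = [
    "2000-01-01 14:02",
    "2000-01-01 14:05",
    "2000-01-01 14:09",
    "2000-01-01 14:11",
]
print(posting_gaps(example))
```

A run of short, similar gaps across different accounts would fit the pattern described above, though on its own it would not prove automation.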

As always, I am happy to answer questions about this.


Here’s an example of a couple of Twitter accounts where the same two tweets are posted in a row by multiple accounts.



#khadijautan – something is strange here

In reaction to #protestbaku, a number of Azerbaijani tweeters, especially those associated with the pro-government youth organization, began a Twitter campaign called #khadijautan. This translates to “shame on you Khadija.” Who is Khadija? She is a journalist with Radio Free Europe, known for her investigations of government corruption. (More on Khadija here). What was so shameful? She said “there is a need for mothers in this country who don’t bargain over their son’s dead bodies [...].” More on this on Arzu Geybulla’s blog.

And those using the hashtag think that they succeeded: “With 22K people engaged, apx 200K impressions #KhadijaUtan campaign succeeded. #azerbaijan #protestBaku.”

When I was doing analysis of #protestbaku, I saw that a lot of the Twitter accounts using #khadijautan didn’t have a photo associated with them. That is sort of odd, right? Most people put a picture on their Twitter account.

[Here is a tl;dr: “Turns out... ‘successful’ #khadijautan hashtag campaign was mainly executed by a cyber-zombie army of tweeters that had 1. No profile photo 2. No followers 3. Didn't send any tweets before this campaign 4. They wrote the same message over and over again. Read this article only if you have basic knowledge of how twitter works and statistics.”]

Only 126 people used this hashtag but they tweeted using it 2198 times (that includes 557 retweets), so it was fairly easy to do analysis on this.

I looked a little closer at my social network analysis map and saw that those Tweeters without photos also tended to not have a lot of friends on Twitter. That’s also a little odd.
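The oddities mentioned so far (no profile photo, few followers, no tweets before the campaign) can be tallied per account. The sketch below is purely illustrative: the field names, the follower threshold, and the example accounts are hypothetical, and showing several signs does not prove that an account is fake.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical fields; rename them to match your own export.
    name: str
    has_photo: bool
    followers: int
    tweets_before_campaign: int


def warning_signs(account, max_followers=10):
    """List the warning signs an account shows (the threshold is arbitrary)."""
    signs = []
    if not account.has_photo:
        signs.append("no profile photo")
    if account.followers <= max_followers:
        signs.append("few followers")
    if account.tweets_before_campaign == 0:
        signs.append("no earlier tweets")
    return signs


# Invented accounts for illustration only.
accounts = [
    Account("example_1", has_photo=False, followers=2, tweets_before_campaign=0),
    Account("example_2", has_photo=True, followers=850, tweets_before_campaign=4000),
]
for acc in accounts:
    print(acc.name, warning_signs(acc))
```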

So I took a closer look. Link to the full report here.

To understand the following, let’s have a little refresher of high school statistics:

Average or mean = equal to the sum of the values divided by the number of values
Standard deviation = how much variation or “dispersion” exists from the average (mean, or expected value). A low standard deviation indicates that the data points tend to be very close to the mean; a high standard deviation indicates that the data points are spread out over a large range of values.
Mode = value that appears most often in a set of data
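To make these three measures concrete, here is a tiny Python example using made-up follower counts (not the real data). It uses the population standard deviation; I am not asserting that this is the variant behind the figures in this post.

```python
import statistics

# Made-up follower counts: many small accounts plus two very large ones.
followers = [7, 7, 7, 12, 20, 45, 300, 2500]

mean = statistics.mean(followers)      # sum of values / number of values
mode = statistics.mode(followers)      # most frequent value
spread = statistics.pstdev(followers)  # population standard deviation

print(mean, mode, round(spread, 1))
```

When a couple of big accounts pull the mean far above the mode and the standard deviation is larger than the mean, most accounts are small and a few are very large.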

Here’s the distribution of followers and following for the people on this hashtag. You can see that the vast majority of those tweeting with #khadijautan don’t have very many other people that they follow or that follow them. The average number of people that #khadijautan Tweeters follow is 194, the mode is 122, standard deviation of 242. This means that even though some people follow a lot of people, most don’t.

The number of followers for one of these #khadijautan Tweeters was on average 371; the mode was 7 and the standard deviation was 562. Again, while some of these people have a lot of followers, most don’t.


And it looked like the users of the #khadijautan hashtag didn’t Tweet a lot.


The number of tweets for these people averages 5168, with a mode of 45 and a standard deviation of 11715. Again, a lot of people that don’t tweet a lot were on this hashtag.

Then I sorted the Tweeters by the date that they joined Twitter. 14 of them joined Twitter in the last few days. That isn’t that many.

Here’s the distribution of when these people joined Twitter. As you can see, a lot of them joined recently.


This is unlike most hashtag analyses. It is odd.

But let’s look at the groups – this is essential to understanding what is going on.


Group 2, for example, are mostly people fighting AGAINST this hashtag (full disclosure, this includes myself).

Group 3 includes individuals that are regular Tweeters from the pro-government opposition group.

Groups 4 and 5 seem really strange to me. I’m not sure what’s going on there. They look like tweet aggregators.

So let’s talk about Group 1 then. The top tweeters are all in the middle, but look at all the accounts that don’t have profile pictures (the blue circles). (This is also the case for Group 3, but not as heavily.) There were 41 people in Group 1 and 34 people in Group 3. That isn’t a lot. They all follow each other. Not many people saw their hashtag.



Okay, the content of the Tweets. What were people saying on the #khadijautan hashtag?

(I’m going to summarize this, but you can download the whole file here if you want to look at it yourself.)

What ended up surprising me is that a lot of Tweets from these “no profile photo” accounts were basically the same statement over and over again. Not retweets, per se, but just the same statement.

For example, this: “Стремящаяся вести свои политические игры, пользуясь смертью невинного солдата #Khadijautan #aztwi” (Russian, roughly: “Seeking to play her political games, using the death of an innocent soldier”) was said 27 times by 21 different “no profile photo” accounts. This seems really strange to me.

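Or this tweet: “X.İsmayıl bazarlıq statusunun Samirə Qubadovaya aid olmadığını dedi. #KhadijaUtan kampaniyası məqsədinə çatmışdır! http://t.co/xCEM83kz” (Azerbaijani, roughly: “Kh. Ismayil said that the bargaining status did not refer to Samira Gubadova. The #KhadijaUtan campaign has achieved its goal!”) was said 18 times by many of the same people that were tweeting repeatedly in other cases too AND don’t have profile photos.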

This strange behavior happened a LOT.

I think that it is fair to say that there is some sort of robot set up to do these tweets.

I welcome questions on this and encourage people to open the file and look for themselves.