The emergence of large-scale social networks has led to research into approaches for classifying similar users on a network. While many such approaches use data mining techniques, recent efforts have focused on measuring the similarity of users through structural properties of the underlying graph that represents the network. In this paper, we identify the Twitter followers of the 2016 presidential candidates and classify them as Democrat, Republican or Bipartisan. We do this with a new approach to measuring structural similarity, PolRANK. PolRANK computes the similarity of a pair of users by accounting for both the number of candidates they follow from each party and the specific candidates they follow. To test our algorithm, we crawled a data set of all followers of every presidential candidate in June 2015 and then ran experiments on a random 10% subset of that data. When tested against similar algorithms, PolRANK outperforms SimRank, P-Rank and Cosine-Similarity because it is more efficient on large data sets. This efficiency is due to PolRANK’s ability to calculate the similarity of a pair of users independently of all other users. The time complexity of P-Rank is O(n^4), while the time complexity of PolRANK is O(n^3).
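As a rough illustration of the idea described above, the following Python sketch scores a pair of users from their own follow sets only, combining a party-level term (how many candidates each user follows per party) with a candidate-level term (which specific candidates they share). It is not the PolRANK formula, which the abstract does not give: the candidate handles in PARTY, the classification rule, the cosine and Jaccard terms, and the weight parameter are all assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical candidate-to-party map; the handles are placeholders,
# not data from the study.
PARTY = {"dem_a": "D", "dem_b": "D", "rep_a": "R", "rep_b": "R", "rep_c": "R"}


def party_counts(follows):
    """Return (Democratic candidates followed, Republican candidates followed)."""
    counts = Counter(PARTY[c] for c in follows)
    return counts["D"], counts["R"]


def classify(follows):
    """Label a user by the parties of the candidates they follow.

    Assumed rule: followers of one party only get that party's label;
    everyone else is labeled Bipartisan.
    """
    d, r = party_counts(follows)
    if d > 0 and r == 0:
        return "Democrat"
    if r > 0 and d == 0:
        return "Republican"
    return "Bipartisan"


def pair_similarity(a, b, weight=0.5):
    """Score two users from their own follow sets only (no other users needed).

    Party-level term: cosine similarity of the (D, R) count vectors.
    Candidate-level term: Jaccard overlap of the followed candidates.
    The 50/50 weighting is an arbitrary choice for this sketch.
    """
    a, b = set(a), set(b)
    da, ra = party_counts(a)
    db, rb = party_counts(b)
    norm = math.hypot(da, ra) * math.hypot(db, rb)
    party_sim = (da * db + ra * rb) / norm if norm else 0.0
    union = a | b
    cand_sim = len(a & b) / len(union) if union else 0.0
    return weight * party_sim + (1 - weight) * cand_sim
```

Because pair_similarity reads only the two follow sets passed to it, one pair can be scored without touching the rest of the network, which is the property the abstract credits for the efficiency gain over iterative measures such as SimRank and P-Rank.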
Paustian, William K., "Classifying Political Similarity of Twitter Users" (2015). Computer Science Summer Fellows. 1.
Available to all.