Submission Date

7-24-2015

Document Type

Paper

Department

Computer Science

Faculty Mentor

Akshaye Dhawan

Comments

Presented during the 17th Annual Summer Fellows Symposium, July 24, 2015 at Ursinus College.

Supported by a Howard Hughes Medical Institute (HHMI) grant.

Project Description

The emergence of large-scale social networks has led to research on approaches for classifying similar users on a network. While many such approaches use data mining techniques, recent efforts have focused on measuring the similarity of users using structural properties of the underlying graph that represents the network. In this paper, we identify the Twitter followers of the 2016 presidential candidates and classify them as Democrat, Republican, or Bipartisan. To do this, we designed a new approach to measuring structural similarity, PolRANK. PolRANK computes the similarity of a pair of users by accounting for both the number of candidates they follow from each party and the specific candidates they follow. To test our algorithm, we crawled a data set of all followers of every presidential candidate in June 2015 and then ran experiments on a random 10% subset of that data. When tested against similar algorithms, PolRANK outperforms SimRank[1], P-Rank[2], and cosine similarity because it is more efficient on large data sets. This efficiency comes from PolRANK’s ability to calculate the similarity of a pair of users independently of all other users. The time complexity of P-Rank is O(n^4), while the time complexity of PolRANK is O(n^3).
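The description above does not give PolRANK's formula, so the following is only a minimal sketch of the general idea: label a user by the parties of the candidates they follow, and score a pair of users from both their per-party follow counts and the specific candidates they share. Everything in it is an assumption made for illustration: the candidate handles in `PARTY`, the function names, the use of cosine similarity over party counts, the Jaccard overlap over candidate sets, and the `weight` parameter. It should not be read as the authors' method.

```python
from math import sqrt

# Hypothetical candidate-to-party table; these handles are placeholders,
# not the candidates or data used in the study.
PARTY = {
    "dem_a": "D", "dem_b": "D",
    "rep_a": "R", "rep_b": "R", "rep_c": "R",
}


def classify(follows):
    """Label a user by the parties of the candidates they follow."""
    parties = {PARTY[c] for c in follows if c in PARTY}
    if parties == {"D"}:
        return "Democrat"
    if parties == {"R"}:
        return "Republican"
    if parties == {"D", "R"}:
        return "Bipartisan"
    return "Unknown"  # follows no known candidate


def party_counts(follows):
    """Return (number of Democratic, number of Republican) candidates followed."""
    dem = sum(1 for c in follows if PARTY.get(c) == "D")
    rep = sum(1 for c in follows if PARTY.get(c) == "R")
    return dem, rep


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def jaccard(a, b):
    """Overlap of two sets of followed candidates."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_similarity(a, b, weight=0.5):
    """Blend party-count similarity with specific-candidate overlap.

    Illustrative only: the blend and its weight are assumptions,
    not the published PolRANK definition.
    """
    by_party = cosine(party_counts(a), party_counts(b))
    by_candidate = jaccard(a, b)
    return weight * by_party + (1 - weight) * by_candidate


alice = {"dem_a"}
bob = {"dem_b"}
print(classify(alice), classify(bob), pair_similarity(alice, bob))
```

Note that `pair_similarity` reads only the two users' own follow sets, which mirrors the property credited above for PolRANK's efficiency: no other user's score is needed to compute a pair's similarity.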
