"Latent Semantic Indexing in the Discovery of Cyber-bullying in Online " by Jacob L. Bigelow

Computer Science Summer Fellows

Title

Latent Semantic Indexing in the Discovery of Cyber-bullying in Online Text

Author

Jacob L. Bigelow, Ursinus College

Submission Date

7-22-2016

Document Type

Paper

Department

Computer Science

Faculty Mentor

April Kontostathis

Comments

Presented during the 18th Annual Summer Fellows Symposium, July 22, 2016 at Ursinus College.

Supported by a National Science Foundation Research at Undergraduate Institutions (NSF RUI) grant (No. 1421896).

Project Description

The rise of social media, particularly among adolescents, has created a new means of bullying. Cyber-bullying has serious consequences for young internet users, which calls for a response. To stop this problem effectively we need a verified method of detecting cyber-bullying in online text; we aim to find that method. For this project we examine thirteen thousand labeled posts from Formspring and build a bank of the words used in those posts. First, the posts are cleaned by removing punctuation, normalizing emoticons, and removing high- and low-frequency words. Because many words in online text are misspelled, whether purposefully or unintentionally, spell-check software is used to check the vocabulary so that spelling variations are accounted for. Using this word bank we create a term-by-document matrix in which each post is its own document. With Latent Semantic Indexing (LSI) implemented, a query can be run against the matrix to retrieve posts that may contain cyber-bullying content. The algorithm is then trained by adjusting our methods for cleaning posts and revising spelling corrections for particular frequently repeated words. With an established approach to pruning the word bank, we test our LSI algorithm on other data sets.
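The pipeline described above (cleaning, word bank, term-by-document matrix, LSI query) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the four toy posts, the `SPELLING` lookup standing in for spell-check software, the emoticon tokens, the frequency thresholds, and every function name are assumptions made for the example. It uses NumPy's `numpy.linalg.svd` for the decomposition and the common query fold-in q^T U_k S_k^{-1}, which may differ from the weighting the project used.

```python
import re
from collections import Counter

import numpy as np

# Hypothetical toy posts; the project itself uses labeled Formspring posts.
POSTS = [
    "u are so stoopid and ugly!!!",
    "nobody likes you, you stupid loser",
    "what is your favorite movie?",
    "I love that movie, it is my favorite :)",
]

# Hypothetical stand-in for spell-check software: variant -> canonical form.
SPELLING = {"u": "you", "stoopid": "stupid"}


def clean(post):
    """Lowercase, normalize emoticons, drop punctuation, fix spelling variants."""
    text = post.lower()
    text = re.sub(r"[:;]-?\)", " emo_smile ", text)
    text = re.sub(r":-?\(", " emo_frown ", text)
    tokens = re.findall(r"[a-z_]+", text)
    return [SPELLING.get(tok, tok) for tok in tokens]


def build_vocabulary(docs, min_df=2, max_df_ratio=0.9):
    """Word bank: keep terms that are neither too rare nor too common."""
    df = Counter(term for doc in docs for term in set(doc))
    limit = max_df_ratio * len(docs)
    return sorted(t for t, n in df.items() if min_df <= n <= limit)


def term_document_matrix(docs, vocab):
    """Rows are terms, columns are posts, entries are raw term counts."""
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for term in doc:
            if term in index:
                A[index[term], j] += 1.0
    return A


def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def lsi_scores(A, vocab, query_terms, k=2):
    """Cosine similarity of each post to the query in a rank-k latent space.

    Assumes the top k singular values are nonzero.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Uk, sk, Vk = U[:, :k], s[:k], Vt[:k, :].T
    q = np.zeros(len(vocab))
    for term in query_terms:
        if term in vocab:
            q[vocab.index(term)] += 1.0
    q_hat = (q @ Uk) / sk  # fold the query into the latent space
    return [cosine(q_hat, doc_vec) for doc_vec in Vk]


docs = [clean(p) for p in POSTS]
vocab = build_vocabulary(docs)
A = term_document_matrix(docs, vocab)
scores = lsi_scores(A, vocab, ["stupid"])
for post, score in sorted(zip(POSTS, scores), key=lambda x: -x[1]):
    print(f"{score:.2f}  {post}")
```

In this sketch, posts whose latent-space vectors point in the same direction as the query rank highest, so the two insulting toy posts score well above the two posts about movies. A real query would presumably combine many bullying-related terms, and the thresholds and rank `k` would be chosen by experiment on the labeled data.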

Recommended Citation

Bigelow, Jacob L., "Latent Semantic Indexing in the Discovery of Cyber-bullying in Online Text" (2016). Computer Science Summer Fellows. 2.
https://digitalcommons.ursinus.edu/comp_sum/2

Open Access

Available to all.

Included in

Communication Technology and New Media Commons, Computational Linguistics Commons, Databases and Information Systems Commons, Social Media Commons, Theory and Algorithms Commons
