Abstract
As enrollment in popular computer science courses grows, so does the feedback gap between individual students and course instructors. One way to address this challenge is for instructors to collect feedback from students in the form of textual reviews or unit-of-study reflections. However, manually reading these reviews is time-consuming, and self-reported Likert scale responses are noisy. Rule-based approaches to sentiment analysis such as VADER (Valence Aware Dictionary and sEntiment Reasoner) have been used to capture the sentiments conveyed in textual feedback, but they fail to capture contextual differences, since many words carry different sentiments in different contexts. In this work, I investigated supervised machine learning approaches and compared their performance with that of the lexicon-based approach VADER in predicting the sentiment of student feedback collected in large computer science classes. I found that machine learning models trained solely on student self-reported sentiment ratings performed only comparably to VADER, with a balanced accuracy of 73.8% versus 73% for VADER. However, a hybrid approach that used the VADER score as a feature and was trained on the student self-ratings performed better than VADER alone. Using better-quality labels collected through a crowdsourcing experiment led to the best machine learning model performance.
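The results above are reported as balanced accuracy, which is the mean of the per-class recall values and so weights each sentiment class equally regardless of how many examples it has. The following is a minimal sketch of that metric in plain Python; the function name and the toy labels are illustrative assumptions and are not taken from the thesis or its data.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    total = defaultdict(int)    # number of true examples per class
    correct = defaultdict(int)  # number of correctly predicted examples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical labels, made up purely for illustration.
y_true = ["pos", "pos", "pos", "pos", "neg", "neg"]
y_pred = ["pos", "pos", "pos", "neg", "neg", "pos"]

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy: {score:.3f}, plain accuracy: {plain:.3f}")
```

In this toy case the recall is 0.75 for the positive class and 0.5 for the negative class, so the balanced accuracy (0.625) is lower than the plain accuracy (about 0.667), which is dominated by the larger class.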
Committee Chair
Marion Neumann, PhD
Committee Members
Chien-Ju Ho, PhD; William Yeoh, PhD
Degree
Master of Science (MS)
Author's Department
Computer Science & Engineering
Document Type
Thesis
Date of Award
Spring 5-20-2022
Language
English (en)
DOI
https://doi.org/10.7936/2nc1-6j73
Author's ORCID
https://orcid.org/0000-0002-3739-8919
Recommended Citation
Kasumba, Robert, "Application of Crowdsourcing and Machine Learning to Predict Sentiments in Textual Student Feedback in Large Computer Science Classes" (2022). McKelvey School of Engineering Theses & Dissertations. 708.
The definitive version is available at https://doi.org/10.7936/2nc1-6j73