Explicit Content Detection in Music Lyrics Using Machine Learning

Research on automatically filtering explicit words in music lyrics using machine learning techniques.

Photo credit: Unsplash

Problem & Research Goal

Music has serious effects on children’s development, and music lyrics have become more violent and sexual over the years. However, existing systems for filtering explicit content in music often do not work reliably, and filtering properly takes a great deal of time and effort.
In this study, we propose several machine learning models that automatically detect explicit content in Korean lyrics and compare their performance.

Methods

The proposed bagging model with a selective vocabulary outperformed not only the other competing models we designed, but also filtering with a manually compiled profanity dictionary, a method widely used in the industry to detect explicit content.
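
To make the two approaches concrete, here is a minimal sketch that contrasts a dictionary filter with a bagged ensemble built on a restricted vocabulary. It is an illustration only, not the study's implementation: this summary does not specify the features, base learners, or vocabulary-selection rule, so the toy English data, the placeholder words "darn" and "heck", the rule "keep words seen more often in explicit lyrics", the class-wise bootstrap, and all function names are assumptions.

```python
import random
from collections import Counter

# Made-up toy data: (lyric, label), 1 = explicit, 0 = clean.
# "darn" and "heck" are placeholders standing in for real profanity.
TRAIN = [
    ("darn you and your heck of a night", 1),
    ("what the heck is this darn song", 1),
    ("heck yeah we burn it all", 1),
    ("darn this cold dark town", 1),
    ("i love the summer night", 0),
    ("we dance all night under the stars", 0),
    ("this song is for you", 0),
    ("sunshine on a quiet town", 0),
]


def dictionary_filter(text, banned=frozenset({"darn"})):
    """Baseline: flag a lyric if it contains any word from a hand-made list."""
    return int(any(token in banned for token in text.split()))


def select_vocabulary(samples):
    """Assumed selection rule: keep words that occur in more explicit
    lyrics than clean lyrics within the given samples."""
    explicit, clean = Counter(), Counter()
    for text, label in samples:
        (explicit if label else clean).update(set(text.split()))
    return {word for word in explicit if explicit[word] > clean[word]}


def bootstrap(samples, rng):
    """Resample with replacement inside each class, so every resample
    contains both labels (a simplification for this tiny data set)."""
    resample = []
    for label in (0, 1):
        group = [s for s in samples if s[1] == label]
        resample.extend(rng.choices(group, k=len(group)))
    return resample


def train_bagging(samples, n_models=25, seed=0):
    """Bagging: fit one small model (here, a learned word set) per bootstrap resample."""
    rng = random.Random(seed)
    return [select_vocabulary(bootstrap(samples, rng)) for _ in range(n_models)]


def predict(models, text):
    """Majority vote: a model votes 'explicit' if the lyric hits its vocabulary."""
    tokens = set(text.split())
    votes = sum(1 for vocab in models if tokens & vocab)
    return int(votes * 2 > len(models))


models = train_bagging(TRAIN)
print("dictionary:", dictionary_filter("what the heck"))
print("bagging:   ", predict(models, "what the heck"))
```

The hand-made list only catches words someone thought to include, whereas each bagged model learns its word set from labeled lyrics and the vote smooths out words that a single resample happened to over-weight. A real system would use a proper tokenizer for Korean and a stronger base learner than a word set.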

Results and Contribution

The proposed automated lyrics-screening approach makes a practical contribution to the music industry by significantly reducing the time and effort needed to censor content that is harmful to young people. The approach generalizes to other languages as long as the same kinds of data used in the study are available.

DOI

https://doi.org/10.1109/BigComp.2018.00085