Looking for Lexical Signatures in Gomorrah

Authors

Paola Dalla Torre
Maria SS. Assunta University of Rome
Paolo Fantozzi
Maria SS. Assunta University of Rome
Maurizio Naldi
Maria SS. Assunta University of Rome
Placido Pellegriti
AlmavivA (Italy)

Synopsis

Gomorrah is an Italian crime drama TV series that has won worldwide appreciation and been sold in 190 countries, despite its extensive use of Neapolitan dialect, which is hard to follow without subtitles even for most Italians. Scholars were quick to study this serial phenomenon, analysing it from different points of view and framing it within the broader context of studies on the new Italian television and its serial products. Our approach to Gomorrah takes these elements into account and adds a new perspective, character recognition: an emerging branch of research that associates dialogues with characters and identifies the characters' verbal features. We have chosen Gomorrah as a challenging dataset for character recognition. We rely on the transcripts of the series, after a pre-processing stage that standardizes the lexicon despite the vagaries of dialect and removes stopwords. A machine learning approach, based on a selection of tools, is then employed to identify characters from the lexicon they use. The problem is framed as multi-class classification. We compare several representations of texts, including simple one-hot encoding and more advanced embedding techniques. The results are presented through a confusion matrix, which can also serve to identify similarities in the linguistic profiles of characters.
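
The pipeline outlined above (stopword removal, one-hot encoding, multi-class classification, confusion matrix) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy utterances and the speaker labels `boss` and `rival` are invented and do not come from the series transcripts, the stopword list is a placeholder, and the Bernoulli Naive Bayes classifier is a stand-in chosen for brevity, since the synopsis does not name the classifiers actually used.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy utterances (NOT taken from the transcripts);
# "boss" and "rival" are placeholder speaker labels.
TRAIN = [
    ("boss", "money territory respect money"),
    ("boss", "respect family territory"),
    ("rival", "war revenge blood"),
    ("rival", "revenge war street"),
]
TEST = [
    ("boss", "territory money"),
    ("rival", "blood revenge"),
]
STOPWORDS = {"the", "a", "and"}  # placeholder stopword list


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def one_hot(tokens, vocab):
    """Binary vector: 1 if the vocabulary word occurs in the utterance."""
    present = set(tokens)
    return [1 if w in present else 0 for w in vocab]


def train_bernoulli_nb(samples, vocab):
    """Fit a Bernoulli Naive Bayes model with Laplace smoothing."""
    by_label = defaultdict(list)
    for label, text in samples:
        by_label[label].append(one_hot(tokenize(text), vocab))
    model = {}
    for label, rows in by_label.items():
        n = len(rows)
        log_prior = math.log(n / len(samples))
        # Smoothed probability that each word appears in a line of this speaker.
        probs = [(sum(col) + 1) / (n + 2) for col in zip(*rows)]
        model[label] = (log_prior, probs)
    return model


def predict(model, vector):
    """Return the speaker label with the highest log-posterior score."""
    def score(label):
        log_prior, probs = model[label]
        return log_prior + sum(
            math.log(p if x else 1 - p) for x, p in zip(vector, probs)
        )
    return max(sorted(model), key=score)


def confusion_matrix(model, samples, vocab):
    """Rows are true speakers, columns are predicted speakers."""
    labels = sorted(model)
    counts = Counter()
    for true_label, text in samples:
        predicted = predict(model, one_hot(tokenize(text), vocab))
        counts[(true_label, predicted)] += 1
    return labels, [[counts[(t, p)] for p in labels] for t in labels]


vocab = sorted({w for _, text in TRAIN for w in tokenize(text)})
model = train_bernoulli_nb(TRAIN, vocab)
labels, matrix = confusion_matrix(model, TEST, vocab)
print(labels)
for row in matrix:
    print(row)
```

In a matrix of this kind, large off-diagonal counts between two speakers would indicate that their vocabularies are hard to tell apart, which is the sense in which a confusion matrix can point to similar linguistic profiles.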

Author Biographies

Paola Dalla Torre, Maria SS. Assunta University of Rome

Paola Dalla Torre is an associate professor of Cinema, Photography, Television at LUMSA University in Rome. Her research has developed along several lines: the study of contemporary cinema in its ethical-philosophical implications; the analysis of the science fiction genre in the contemporary world; and, more recently, the analysis of TV series through a quantitative methodology. In addition, her research now focuses on the economic-cultural study of Italian film exhibition, within a nationally funded research project on Italian cinemas involving three Italian universities (Cin_Ex, PI Mariagrazia Fanchi).

Paolo Fantozzi, Maria SS. Assunta University of Rome

Paolo Fantozzi has a Master’s Degree in Engineering in Computer Science and a PhD in Computer Engineering, both obtained at Sapienza University of Rome. He is an assistant professor at LUMSA University in Rome, where he is in charge of the courses Algorithms and Data Structures, Databases and Big Data, and Data and Social Network Analysis. His research interests are machine learning, artificial intelligence, natural language processing, efficient algorithms on graphs and hypergraphs, and theoretical computer science. His Erdős number is 4. He has worked on research and development projects in many fields: agriculture, transportation, energy management, predictive maintenance, health care, disease monitoring, information retrieval, cyber security, law and regulations. He led the team that created the Voice Assistant at IBM IoT Lab in Hursley in 2018. He has been a tutor at the Italian Olympiads in Informatics since 2018.

Maurizio Naldi, Maria SS. Assunta University of Rome

Maurizio Naldi is a full professor of Computer Science at LUMSA University in Rome. He received his PhD in Telecommunications Engineering from the University of Rome Tor Vergata and his MSc in Electronic Engineering from the University of Palermo in 1988. Prior to his academic career, he pursued an industrial career in several ICT companies, ending as Head, Traffic Forecasting and Cost Analysis in WIND Telecomunicazioni in 2000. He was with the University of Tor Vergata, first as an assistant professor and then as an associate professor, from 2000 to 2019. He is Co-Editor of the Electronic Commerce Research and Applications journal, published by Elsevier. His research interests span the fields of network and service economics, e-commerce, risk analysis, and applications of machine learning.

Placido Pellegriti, AlmavivA (Italy)

Placido Pellegriti received his BSc degree magna cum laude in Computer Science for Data Management from LUMSA University in Rome in 2022, after a six-month internship at Almaviva as an apprentice. Since then, he has been with Almaviva as a Data Scientist. His research interests lie in machine learning and its applications.

Published

July 10, 2023

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Dalla Torre, P., Fantozzi, P., Naldi, M., & Pellegriti, P. (2023). Looking for Lexical Signatures in Gomorrah. In G. Avezzù & M. Rocchi (Eds.), Audiovisual Data: Data-Driven Perspectives for Media Studies. 13th Media Mutations International Conference (pp. 41-63). Media Mutations Publishing. https://doi.org/10.21428/93b7ef64.a8ccd1c0