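Study of artificial intelligence techniques applied to data collected from Twitter (original title: Estudio de técnicas de inteligencia artificial aplicadas a datos recopilados en Twitter).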

Arzola Santos, Manuel Alexander

Date

2024

URI

http://riull.ull.es/xmlui/handle/915/36520

Abstract

This master's thesis (TFM) addresses the collection of data from social networks, its processing, and the extraction of knowledge from it through several methods. The social network chosen is Twitter, and the data consists mainly of tweets posted by a range of users. The thesis explains the criteria used to select the data in light of the intended objective, and documents each step of data collection, processing and cleaning. The methods studied include content-based recommender systems, built on the one hand from the TF-IDF value associated with each word and on the other from the Word2Vec model, whose architecture relies on neural networks and thus touches on deep learning. Machine learning is also applied: supervised classification algorithms are used to classify users according to their topics of interest. Finally, a sentiment analysis is carried out on a subset of the collected tweets. It measures how positive or negative the tweets are, tracks how sentiment evolves over time, examines user activity and the percentage of negative, positive and neutral tweets, and applies classification algorithms that label tweets as positive or negative.
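To make the TF-IDF approach mentioned in the abstract concrete, the following is a minimal sketch of a content-based recommender, not the thesis's actual implementation. It assumes each user is represented by the concatenated text of their tweets, weights words with a plain TF-IDF formula (term frequency times log(N / document frequency)), and ranks other users by cosine similarity. The function names, the whitespace tokenizer and the sample `profiles` data are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; real tweet cleaning would do more
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(profiles, target, top_n=2):
    """Rank the other users by similarity of their text to the target's."""
    names = list(profiles)
    vectors = dict(zip(names, tfidf_vectors([profiles[n] for n in names])))
    scores = [
        (other, cosine(vectors[target], vectors[other]))
        for other in names
        if other != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical sample data: one string of concatenated tweets per user
profiles = {
    "user_a": "football match goal league football",
    "user_b": "league goal football transfer",
    "user_c": "recipe pasta tomato oven",
}

ranking = recommend(profiles, "user_a")
print(ranking)
```

In this toy data, the user who shares football-related vocabulary with the target is ranked above the one who shares none. A Word2Vec-based variant would swap the sparse TF-IDF vectors for dense embeddings while keeping the same similarity ranking step.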

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess