Characterization of the k-means algorithm for spectral profiles

Sola Viladesau, Eva

dc.contributor.author	Sola Viladesau, Eva
dc.date.accessioned	2024-01-19T14:27:51Z
dc.date.available	2024-01-19T14:27:51Z
dc.date.issued	2023
dc.identifier.uri	http://riull.ull.es/xmlui/handle/915/35472
dc.description.abstract	The k-means algorithm is a machine learning clustering method that has gained popularity for both its scalability and its simplicity. Its output consists of a partition of the input data into k groups together with k representative examples, one per group. The aim of this Bachelor’s Thesis is to test k-means clustering results under controlled conditions by means of an artificial dataset. The data mimic solar observations from the Interface Region Imaging Spectrograph (IRIS) in the Mg II h&k lines. The setup is made incrementally more complex, and the impact on the clustering is studied on a case-by-case basis. The goal is to consistently obtain a partition that accurately separates the different profiles in the dataset. Furthermore, the results are compared with those of hierarchical clustering methods, and the effect of two common preprocessing schemes is analyzed. The final k-means results are considered satisfactory, given that the main goal of discerning between spectral behavior patterns is achieved with very low error rates, even when the data are purposefully contaminated with defective profiles and noise. Nevertheless, when these contaminants become too widespread, masking becomes necessary, which allows the previous statistics to be recovered. The hierarchical methods are deemed equal or inferior to k-means in terms of performance, depending on the specific criterion.	es_ES
dc.language.iso	en	es_ES
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.title	Characterization of the k-means algorithm for spectral profiles	es_ES
dc.type	info:eu-repo/semantics/bachelorThesis	es_ES
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es_ES
dc.subject.keyword	Machine Learning	es_ES
dc.subject.keyword	k-means algorithm	es_ES
dc.subject.keyword	agglomerative hierarchical clustering	es_ES
dc.subject.keyword	feature scaling	es_ES
dc.subject.keyword	Principal Component Analysis	es_ES
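
The abstract describes the k-means output as a partition of the data into k groups plus k representative examples, tested on artificial profiles with added noise. The following is a minimal sketch of that idea, not the thesis's own code or dataset: the two profile shapes, the noise level, the function names make_profiles and kmeans, and the farthest-point initialisation are all assumptions chosen for illustration.

```python
import numpy as np


def make_profiles(n_per_class=100, n_wavelengths=60, noise=0.02, seed=0):
    """Build a toy dataset: two hypothetical profile shapes plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_wavelengths)
    # Hypothetical templates: one single-peaked and one double-peaked profile.
    single = np.exp(-((x / 0.3) ** 2))
    double = np.exp(-(((x - 0.25) / 0.15) ** 2)) + np.exp(-(((x + 0.25) / 0.15) ** 2))
    templates = np.stack([single, double])
    truth = np.repeat(np.arange(2), n_per_class)  # true class of each profile
    data = templates[truth] + rng.normal(0.0, noise, (truth.size, n_wavelengths))
    return data, truth


def kmeans(data, k, n_iter=100, seed=0):
    """Lloyd's iteration; returns one label per profile and k centroids."""
    rng = np.random.default_rng(seed)
    # Farthest-point initialisation (an assumption of this sketch): start from
    # a random profile, then repeatedly add the profile farthest from the
    # centroids chosen so far.
    centroids = [data[rng.integers(len(data))]]
    for _ in range(1, k):
        d = np.min([np.sum((data - c) ** 2, axis=1) for c in centroids], axis=0)
        centroids.append(data[np.argmax(d)])
    centroids = np.array(centroids)

    for _ in range(n_iter):
        # Assignment step: each profile goes to its nearest centroid.
        dist = np.sum((data[:, None, :] - centroids[None, :, :]) ** 2, axis=2)
        labels = np.argmin(dist, axis=1)
        # Update step: each centroid becomes the mean of its assigned profiles
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


data, truth = make_profiles()
labels, centroids = kmeans(data, k=2)
# Cluster indices are arbitrary, so score against both possible labelings.
error_rate = min(np.mean(labels != truth), np.mean(labels == truth))
print(f"centroids: {centroids.shape}, error rate: {error_rate:.3f}")
```

Here labels plays the role of the partition into k groups and centroids the role of the k representative profiles. Raising the noise argument or mixing in defective profiles would be one way to mimic the contaminated cases the abstract mentions, again only as an illustration.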

Files in this item

Name: SolaViladesau_Eva_TFG_GF.pdf
Size: 1.401 MB
Format: PDF

This item appears in the following collection(s)

TFG. Física

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International