RT info:eu-repo/semantics/bachelorThesis
T1 Characterization of the k-means algorithm for spectral profiles
A1 Sola Viladesau, Eva
K1 Machine Learning
K1 k-means algorithm
K1 agglomerative hierarchical clustering
K1 feature scaling
K1 Principal Component Analysis
AB The k-means algorithm is a Machine Learning clustering method that has gained popularity both for its scalability and its simplicity. The output of this method contains a distribution of the input data in k groups as well as k representative examples. The aim of this Bachelor’s Thesis is to test k-means clustering results under controlled conditions by means of an artificial dataset. The data mimic solar observations from the Interface Region Imaging Spectrograph (IRIS) in the Mg II h&k lines. The situation is made incrementally more complex and the impact on the clustering is studied on a case-by-case basis. The goal is to consistently obtain a distribution that accurately separates the different profiles in the dataset. Furthermore, the results are compared to those of hierarchical clustering methods and the effect of two common preprocessing schemes is analyzed. The k-means final results are considered satisfactory, given that the main goal of discerning between spectral behavior patterns is achieved with very low error rates, even when the data are purposefully contaminated with defective profiles and noise. Nevertheless, when these impediments become too widespread, masking becomes necessary, allowing for the previous statistics to be recovered. The hierarchical methods are deemed equal or inferior to k-means in terms of performance, depending on the specific criterion.
YR 2023
FD 2023
LK http://riull.ull.es/xmlui/handle/915/35472
UL http://riull.ull.es/xmlui/handle/915/35472
LA en
DS Repositorio institucional de la Universidad de La Laguna
RD 16-may-2024