Classification, meaning discrimination between examples belonging to different classes, is a fundamental aspect of most scientific applications. Machine Learning (ML) tools have proved to perform very well in this task, in the sense that they can achieve very high success rates. On the other hand, the “realism” and interpretability of their models are very low, often resulting in modest increases in knowledge and limited applicability. This paper describes a methodology that, by applying ML tools directly to the data, allows the formulation of new scientific models describing the actual “physics” that determines the boundary between the classes. The proposed technique consists of a stacked approach of different ML tools, each applied to a specific subtask of the scientific analysis; together they combine all the major strands of machine learning, from rule-based classifiers and Bayesian statistics to genetic programming and symbolic manipulation. To take into account the error bars of the measurements, an essential aspect of any scientific form of inference, the novel concept of the Geodesic Distance on Gaussian manifolds is adopted. The characteristics of the methodology have been investigated with a series of systematic numerical tests for different types of classification problems. The potential of the approach to handle real data has been tested with various experimental databases. The results indicate that the proposed method makes it possible to find a good trade-off between the accuracy of the classification and the complexity of the derived mathematical equations. Moreover, the derived models can be tuned to reflect the actual phenomena, providing a very useful tool to bridge the gap between data, machine learning tools and scientific theories.
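The abstract does not give the formula for the Geodesic Distance, so the following is only a minimal sketch under a stated assumption: each measurement is modelled as a univariate Gaussian whose mean is the measured value and whose standard deviation is its error bar, and the distance is the standard closed-form Fisher-Rao geodesic distance between two such Gaussians. The function name `geodesic_distance` and its parameters are hypothetical and are not taken from the paper.

```python
import math


def geodesic_distance(mu1, sigma1, mu2, sigma2):
    """Fisher-Rao geodesic distance between N(mu1, sigma1^2) and N(mu2, sigma2^2).

    Assumption (not stated in the abstract): measurements are univariate
    Gaussians, with sigma playing the role of the error bar.
    """
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be positive")
    # The Fisher metric ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2 is, after the
    # rescaling mu -> mu / sqrt(2), twice the Poincare half-plane metric,
    # which gives the closed form below.
    numerator = (mu1 - mu2) ** 2 / 2.0 + (sigma1 - sigma2) ** 2
    return math.sqrt(2.0) * math.acosh(1.0 + numerator / (2.0 * sigma1 * sigma2))


# Same gap between the measured values, different error bars:
# the pair with small error bars is much farther apart on the manifold.
precise = geodesic_distance(0.0, 0.1, 1.0, 0.1)
noisy = geodesic_distance(0.0, 1.0, 1.0, 1.0)
print(precise > noisy)
```

Unlike the Euclidean distance between the measured values, this distance shrinks as the error bars grow, which is one way a classifier could weigh uncertain measurements less heavily.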
A Syncretic Approach to Knowledge Discovery for the Natural Sciences
Murari A.; Peluso E.; Lungaroni M.; Gaudio P.; Gelfusa M.; JET Contributors
Journal: Data Mining and Knowledge Discovery, pp. 1-49
Year: 2019
ISTP Authors: Andrea Murari
Keywords: Machine learning, ML
Research Activities: JOURNAL ARTICLES
Related products
-
Monthly notices of the Royal Astronomical Society (Online) 503 (4), pp. 4815 - 4827 Year: 2021 DOI: 10.1093/mnras/stab319
Comparing turbulence in a Kelvin-Helmholtz instability region across the terrestrial magnetopause
Quijia, P.; Fraternale, F.; Stawarz, J.E.; Vasconez, C.L.; Perri, S.; Marino, R.; Yordanova, E.; Sorriso-Valvo, L.
-
Nuclear fusion 61 (7), pp. 076013-1 - 076013-15 Year: 2021 DOI: 10.1088/1741-4326/abfcdf
Prediction of temperature barriers in weakly collisional plasmas by a Lagrangian coherent structures computational tool
Di Giannatale G.; Bonfiglio D.; Cappello S.; Chacon L.; Veranda M.
-
Physical review. E (Print) 104 (2), pp. 025201-1 - 025201-13 Year: 2021 DOI: 10.1103/PhysRevE.104.025201
Transition to turbulence in a five-mode Galerkin truncation of two-dimensional magnetohydrodynamics
Carbone, Francesco; Telloni, Daniele; Zank, Gary; Sorriso-Valvo, Luca
-
Nuclear fusion (Online) 61 (4), pp. 046020-1 - 046020-12 Year: 2021 DOI: 10.1088/1741-4326/abe3c7
Onset of tearing modes in plasma termination on JET: The role of temperature hollowing and edge cooling
Pucella G.; Buratti P.; Giovannozzi E.; Alessi E.; Auriemma F.; Brunetti D.; Ferreira D.R.; Baruzzo M.; Frigione D.; Garzotti L.; Joffrin E.; Lerche E.; Lomas P.J.; Nowak S.; Piron L.; Rimini F.; Sozzi C.; Van Eester D.