
Overview of the ML challenge workflow and the datasets used for training. All preparation explicitly followed FAIR principles, with the data storage and cleaning workflows kept on GitHub. The Codabench platform and the NERSC computing cluster are used to score and rank submissions. – cs.LG
Editor’s note: When we send teams on astrobiology and astrogeology expeditions to other worlds, they will need to be rather autonomous. Indeed, much of their preliminary research may well be done without humans. Having the best, most compact tools is of paramount importance. This goes for handheld instruments used on excursions and for equipment used back at base camp. Having expert systems, AI, machine intelligence, etc. embedded in our handheld instruments, or running as standalone systems in a research base shared by multiple sensors and humans, will be of great use. Again, because researchers will be limited by bandwidth and latency in reaching human research and analytical capabilities back on Earth, anything at hand at offworld research locations will greatly speed up and focus research.
Scientific discoveries are often made by finding a pattern or object that was not predicted by the known rules of science.
Often, these anomalous events or objects that do not conform to the norms indicate that the rules of science governing the data are incomplete, and that something new must be present to explain the unexpected outliers.
The challenge of finding anomalies can be confounding because it requires codifying complete knowledge of known scientific behaviors, then projecting these known behaviors onto the data to look for deviations.
When using machine learning, this presents a particular challenge because we require that the model not only understand the scientific data perfectly, but also recognize when the data is inconsistent and outside the scope of its trained behavior.
In this paper, we present three datasets aimed at developing machine-learning-based anomaly detection for disparate scientific domains covering astrophysics, genomics and polar science.
We present the different datasets along with a scheme to make machine learning challenges around the three datasets findable, accessible, interoperable and reusable (FAIR).
In addition, we present an approach that generalizes to future machine learning challenges, enabling larger, more compute-intensive challenges that can ultimately lead to scientific discovery.
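As a rough illustration of the idea described above (codify known behavior, then look for deviations from it), here is a minimal sketch in Python. It is not the method used in the paper or in any of the challenges: the z-score criterion, the function names, the threshold of 3.0 and the toy numbers are all assumptions chosen only to make the concept concrete.

```python
import statistics


def fit_background(known_values):
    """Summarize 'known' behavior as a mean and a sample standard deviation."""
    mean = statistics.fmean(known_values)
    std = statistics.stdev(known_values)
    return mean, std


def anomaly_scores(values, mean, std):
    """Score each value by its absolute deviation from the mean, in units of std."""
    return [abs(v - mean) / std for v in values]


def flag_anomalies(values, mean, std, threshold=3.0):
    """Return the values whose score exceeds the (hypothetical) threshold."""
    scores = anomaly_scores(values, mean, std)
    return [v for v, s in zip(values, scores) if s > threshold]


# Toy measurements, invented purely for illustration.
known = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
observed = [10.1, 9.9, 14.0, 10.0]

mean, std = fit_background(known)
outliers = flag_anomalies(observed, mean, std)
print(outliers)
```

In this toy case the value 14.0 lies far outside the spread of the known sample and is the only one flagged. Real scientific data would need a much richer model of known behavior than a single mean and standard deviation, which is the difficulty the abstract points to.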
Elizabeth G. Campolongo, Yuan-Tang Chou, Ekaterina Govorkova, Wahid Bhimji, Wei-Lun Chao, Chris Harris, Shih-Chieh Hsu, Hilmar Lapp, Christopher Lawrence, Eric Moreno, Ryan Raikman, Jiaman Wu, Ziheng Zhang, Bayu Adhi, Gharehtoragh, Saúl Alonso Monsalve, Marta Babicz, Furqan Baig, Namrata Banerji, William Bardon, Tyler Barna, Tanya Berger-Wolf, Adji Bousso Dieng, Micah Brachman, Quentin Buat, David Cy Hui, Phuong Cao, Franco Cerino, Yi-Chun Chang, Shivajii Chaulagain, An-Kai Chen, Demung Chen, Eric Chen, Chia-Jeu Chou, Zih-Chen Ciou, Miles Cochran-Branson, Mariaio Ooudo Dadarlat, Peter Darch, Malina Desai, Daniel Diaz, Steven Dillmann, Javier Duarte, Isla Duporge, Urbas Ekka, Saba Entezari, Hao Fangi, Rian Flynn, Geoffrey Fox, Emily Freed, Hang Gao, Jing Gao, Julia Gonski, Mathew Graham, Abolflazla Hazelden, Joshua Henry Peterson, Duc Hoang, Wei Hu, Mirco Huennefeld, David Hyde, Vandana Janej Yunfan Kang Lee, Langhyeon Lee, Shaocheng Lee, Suzan Van der Lee, Charles Lewis, Haitong Li, Haoyang Li, Henry Liao, Mia Liu, Xiaolin Liu, Xiulong Liu, Vladimir Loncar, Fangzheng Lyu, Ilya Makarov, Abhishikth Mallampalli Cen-Yu-Yu Alexander Micah, Micah Alexand Migala, Farouk Mokhtar, Mathieu Morlighem et al. (50 additional authors not shown)
Comments: 18 pages, 6 figures, to be submitted to Nature Communications
Subjects: Machine Learning (cs.LG); Instrumentation and Methods for Astrophysics (astro-ph.IM)
Cite as: arXiv:2503.02112 [cs.LG] (or arXiv:2503.02112v1 [cs.LG] for this version)
https://doi.org/10.48550/arxiv.2503.02112
Submission history
From: Philip Harris
[v1] Mon, 3 Mar 2025 22:54:07 UTC (37,668 KB)
https://arxiv.org/abs/2503.02112