DS0708 - Big data, knowledge, decision, high-performance computing and numerical simulation

Machine learning for very large arrays in radioastronomy – MAGELLAN

Submission summary

Major breakthroughs in cosmology and astrophysics are expected from the planned new large interferometers, such as the Square Kilometer Array (SKA, around 2025) and its precursors. A large international effort is under way to design and build these facilities. Their ultimate performance, however, depends on new, massive and complex data modeling and processing. Moreover, the computing power available worldwide in 2025 will not be sufficient to fully process the whole SKA dataset. Finally, current data reconstruction algorithms do not match the expected sensitivity achievable with these interferometers.

MAGELLAN aims to develop new methodological approaches to achieve breakthroughs in these three areas: calibration complexity, massive data sets, and precision of reconstruction algorithms. Within the general context of very large linear inverse problems, MAGELLAN introduces a set of learning techniques to control several aspects of the inverse problem: calibration of the direct model; uncertainty of the calibrated model; accounting for temporal, spatial and spectral variations of the direct model; and a priori assumptions that improve the spatio-spectral image reconstruction. All of these methodological goals are pursued with the scaling needed to solve the global inverse problem in mind.

MAGELLAN is deeply embedded within the scientific context of the future radio-interferometers. The following issues are fundamental difficulties, clearly identified as the major barriers that must be overcome to achieve the full capabilities of these future facilities. First, the spatial and temporal variability of Earth’s ionosphere affects our description of the direct model and is therefore the main limiting factor on the precision achievable with these instruments. Second, even within a fully specified 2D direct model, the complex and varied morphologies of astrophysical sources challenge the generality and accuracy of the a priori assumptions made in reconstruction techniques, whether through regularization or penalty constraints. Consequently, current reconstruction codes cannot reach the target sensitivity of 60 dB that SKA may be able to achieve thanks to its large collecting surface. Finally, the explosion in the size of the inverse problem follows directly from the additional spectral dimension: SKA will produce tens of hyperspectral cubes of about 80 TB each, and such sizes rule out global inversion methods based on minimizing penalty criteria.

In this framework, MAGELLAN's objective is twofold. The first objective is methodological research aimed at solving global inverse problems in large dimensions, with two additional and challenging specificities. The first is to account for modeling errors in the direct model; the second is a strong constraint imposed on the accuracy of the resulting algorithm. Breaking from the classical approach used in radio-interferometry, which is mainly based on an a priori description of the direct model, MAGELLAN will open new avenues through a focus on machine learning techniques. Going beyond the sole use of physical models and letting the data themselves guide the processing is a powerful approach that has demonstrated its potential in the past few years. The second objective is to implement the resulting inversion strategy for the restoration of hyperspectral datacubes from the data of very large radio-interferometers. Particular attention will be paid to their instrumental specificities, and strategies based on parallel and distributed processing will be investigated. The relevance of these approaches for radioastronomy has already been established by several MAGELLAN investigators in exploratory studies on image reconstruction based on selected observables or on a priori images set from data. MAGELLAN proposes to generalize these machine learning techniques to the whole inverse problem field.

Project coordination

André Ferrari (Laboratoire JL Lagrange (OCA/CNRS/UNS))

The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.

Partners

LTCI CNRS-Telecom ParisTech Laboratoire Traitement et Communication de l'Information
SATIE Laboratoire des Systèmes et Applications des Technologies de l'Information et de l'Energie
LAGRANGE (OCA/CNRS/UNS) Laboratoire JL Lagrange (OCA/CNRS/UNS)

ANR funding: 387,766 euros
Beginning and duration of the scientific project: September 2014 - 36 months
