INFRA - Hardware and Software Infrastructures for the Digital Society

Multi-objective scheduling for large-scale parallel systems – MOEBUS

Submission summary

The continuous evolution of computing platforms leads to a highly
diversified and dynamic landscape. The most significant classes of
parallel and distributed systems are supercomputers, computational
grids, clouds and large hierarchical multi-core machines installed in
dedicated computing centers. They are all characterized by the
increasing complexity of managing jobs and resources. This
complexity stems from hardware characteristics (heterogeneous
components, many levels of hierarchy, etc.)
and from application characteristics (ever larger, more
complex, more versatile, etc.). This project focuses on the efficient
execution of parallel applications submitted by various users and
sharing resources in large-scale high-performance computing
environments.

In large computing centers, platforms are increasingly composed
of heterogeneous computing resources (e.g., standard many-core
processors coupled with specialized co-processors such as GPUs) with
efficient communication interconnects. These resources are shared by
many users and applications, each with their own practices and needs,
which may conflict with one another. In order to deliver the required
computing power simultaneously to all users and applications,
dedicated software suites have been developed to allocate jobs to
the available resources. Some software, such as batch systems, provides
mechanisms for managing and allocating the resources (i.e., submitting
jobs, processing them, and sometimes monitoring their execution),
while other software strives to better exploit the resources allocated
to a user or an application (for instance by placing the multiple
processes of an application so as to reduce communication times). Indeed,
the way resources are used may have a critical impact on the whole
system, and thus needs to be optimized. Continuous hardware
evolution introduces new characteristics that raise new and complex
challenges for managing and exploiting the resources. It makes the
dedicated pieces of software (e.g., the job manager and the
implementations of programming standards) more complicated than ever.

We propose to investigate new functionalities that can be added at low
cost to current large-scale schedulers and programming standards, for a
better use of the resources according to various objectives and
criteria. Clearly, the principle of using several priority queues in
operational batch schedulers is not the best solution, since it
arbitrarily prioritizes some jobs (or resources), which in turn may
delay other jobs. We propose to revisit the principles of existing
schedulers after studying the main factors impacted by job
submissions. Then, we will propose novel efficient algorithms for
optimizing the schedule for unconventional objectives such as energy
consumption, and design multi-objective optimization algorithms with
provable approximation guarantees for some relevant combinations of
objectives (performance, fairness, energy consumption, etc.). An important
characteristic of the project is its balance between theoretical
analysis and practical implementation. The most promising ideas will
be integrated into reference systems such as SLURM and OAR, and will
lead to new features in implementations of programming standards such
as MPI or OpenMP. We expect the MOEBUS results to influence the future
use of very large-scale parallel platforms.

Project coordination

Denis TRYSTRAM (Institut Polytechnique de Grenoble) – Denis.Trystram@imag.fr

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partners

BULL SAS
Grenoble INP Institut Polytechnique de Grenoble
Inria Bordeaux Sud-Ouest Inria Bordeaux Sud-Ouest
Inria Grenoble Rhône Alpes Institut National de Recherche en Automatique et informatique

ANR funding: 393,923 euros
Beginning and duration of the scientific project: September 2013 - 48 months
