ANR-FWF - Generic Call for Proposals 2018 - FWF

Incremental Design of Experiments – INDEX

Incremental Designs of Experiments

INDEX studies efficient incremental solutions to combinatorial optimisation problems occurring in design of computer experiments. Modern industrial processes often resort to complex simulation models whose computational cost requires substitution by a surrogate of much lesser complexity. The surrogate quality depends on the set of simulation inputs (the design) used for its construction. The objective is to propose an ordered design which is nearly optimal when stopped at any point.

Incremental construction of experimental designs, space-filling in particular, under constraints

* Construction of nested designs by greedy-type algorithms exploiting submodularity of the design criterion.
* Investigation of the dynamical properties of algorithms for sequential design.
* Construction of informative designs under constraints through the notion of privacy sets.

Dynamical systems, submodular function maximization, kernel methods and minimization of a kernel discrepancy by kernel herding (conditional gradient).
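As an illustration of the last item, the sketch below runs kernel herding over a finite candidate set to build an ordered design whose kernel discrepancy (squared maximum mean discrepancy) to the uniform measure on the candidates decreases as points are added. It is a minimal sketch under stated assumptions, not the project's implementation: the Gaussian kernel, the lengthscale, the random candidate set and the function names `kernel_herding` and `squared_mmd` are all hypothetical choices made for the example.

```python
import numpy as np


def gaussian_kernel(a, b, lengthscale=0.2):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * lengthscale ** 2))


def kernel_herding(candidates, n_points, lengthscale=0.2):
    """Return the indices of an ordered design selected by kernel herding.

    Assumption: the target measure is the uniform measure on `candidates`.
    """
    K = gaussian_kernel(candidates, candidates, lengthscale)
    potential = K.mean(axis=1)               # potential of the target measure
    kernel_sum = np.zeros(len(candidates))   # sum of k(x, x_j) over chosen x_j
    chosen = []
    for n in range(n_points):
        # Herding step: target potential minus (scaled) design potential.
        score = potential - kernel_sum / (n + 1)
        if chosen:
            score[chosen] = -np.inf          # added restriction: no repeats
        i = int(np.argmax(score))
        chosen.append(i)
        kernel_sum += K[:, i]
    return chosen


def squared_mmd(candidates, chosen, lengthscale=0.2):
    """Squared MMD between the design and the uniform measure on candidates."""
    K = gaussian_kernel(candidates, candidates, lengthscale)
    idx = np.array(chosen)
    return K.mean() - 2.0 * K[idx].mean() + K[np.ix_(idx, idx)].mean()


rng = np.random.default_rng(0)
candidates = rng.random((200, 2))            # hypothetical candidate set
design = kernel_herding(candidates, 20)
print(squared_mmd(candidates, design[:1]), squared_mmd(candidates, design))
```

Because the points are selected one at a time, any prefix `design[:n]` is itself a usable design, which is the incremental property the project is after.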

* A new symmetric design criterion for model discrimination, based on linearization of model responses and nominal sets for their parameters.
* An accelerated greedy algorithm for the minimization of mutual entropy.
* New methods for the adaptive estimation of a percentile in a Gaussian process model.
* Greedy construction of a space-filling design in high dimension using fractional-factorial designs to reduce the size of the candidate set.
* Inequalities for the elimination of inessential points in c- and A-optimal designs (construction of 0-core sets).
* An algorithm for the on-line data thinning of a sequence of experimental points.
* Investigation of the connections between Bayesian integration, (kernel) energy minimization, maximum mean discrepancy, potential theory and space-filling design.
* Approximation of singular kernels via bounded completely monotone functions.
* Construction of validation/testing designs.
* Construction of optimal block designs for estimation in copula models.

The project is progressing according to the planned schedule, although the corona crisis has caused some delays.
A new topic, which emerged at a project meeting in February 2020, concerns the construction of validation designs. In brief, the problem is to propose a set of m design points at which to evaluate a function f, in order to check the accuracy of a model of f constructed from evaluations at a distinct n-point design. Preliminary studies have been undertaken at EDF and several papers are in preparation.

Five papers published in international journals (Biometrical J., J. of Statistical Planning and Inference, SIAM/ASA J. Uncertainty Quantification, J. of Computational and Applied Mathematics, Appl. Stochastic Models Bus. Ind.), four submitted (plus communications at and submissions to several conferences).

INDEX will study efficient incremental solutions to combinatorial optimisation problems occurring in design of computer experiments.


Modern industrial processes often rely on simulation models with very high computational cost. Using the original numerical codes for engineering tasks such as design optimisation and performance assessment, which require an intensive exploration of the model input space, would take an unrealistic amount of time. The current trend is to replace the original numerical code with a surrogate model of much lower complexity, often a semi-parametric interpolator of a finite set of its outputs. The quality of the surrogate model depends on the set of simulation inputs (the design) used for its construction and, naturally, it increases with the design size.


Classical approaches to design of experiments consider the design size N as a fixed parameter and try to optimise the information in the overall set of N points. However, in many situations the model simulations are integrated progressively, and the decision to stop the learning process is made on-line, based either on the estimated quality of the surrogate model already built or, more pragmatically, on the fact that the available (time, cost) budget has been entirely consumed. In this context, it is important that the order of execution of the design points be well chosen, so that for all n<N the design formed by the first n points is as informative as possible. This new formulation of the design problem is at the core of the INDEX project.
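The requirement that every prefix of the design be informative can be illustrated with a simple space-filling rule. The sketch below orders a finite candidate set by the farthest-point ("coffee-house") greedy rule: each new point is the candidate farthest from those already selected. This is only an illustrative example under stated assumptions, not a method claimed by the project: the candidate set, the Euclidean distance, the choice of the first point and the names `coffee_house_order` and `covering_radius` are hypothetical.

```python
import numpy as np


def coffee_house_order(candidates, n_points, first=0):
    """Order candidates so that every prefix is spread over the domain."""
    order = [first]
    # Distance from each candidate to its nearest already-selected point.
    dist = np.linalg.norm(candidates - candidates[first], axis=1)
    for _ in range(n_points - 1):
        i = int(np.argmax(dist))             # farthest candidate so far
        order.append(i)
        new_dist = np.linalg.norm(candidates - candidates[i], axis=1)
        dist = np.minimum(dist, new_dist)
    return order


def covering_radius(candidates, design_idx):
    """Largest distance from a candidate to its nearest design point."""
    diff = candidates[:, None, :] - candidates[design_idx][None, :, :]
    return np.linalg.norm(diff, axis=-1).min(axis=1).max()


rng = np.random.default_rng(0)
candidates = rng.random((300, 2))            # hypothetical input domain sample
order = coffee_house_order(candidates, 30)
radii = [covering_radius(candidates, order[:n]) for n in range(1, 31)]
print(radii[0], radii[-1])
```

Stopping after any n points leaves the design `order[:n]`, and the covering radius can only decrease as n grows. For this particular rule, the covering radius of each prefix is known to be within a factor 2 of the best achievable with n points taken from the same candidate set.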

In reality, the learning set obtained by running the numerical code at the design points is used both to identify the model used for interpolation and to predict the output of the numerical code at input points not in the design (to interpolate them). These two simultaneous goals correspond to distinct design criteria: for the former it is important to be able to choose an appropriate model family and accurately estimate its parameters (in particular the spatial coherency of the output field), while the latter favours designs that spread the design points uniformly over the input domain. One way of addressing this multi-criteria problem is to formulate the design problem as a constrained optimisation problem, searching for the maximally informative design under a set of constraints on its geometry. The goal of INDEX is to define efficient incremental algorithms for this constrained optimisation problem.


Since many variants of the constrained subset-selection problem stated above are NP-hard, the goal of finding an optimal solution must necessarily be replaced by the search for good approximations. For many similar problems, algorithms with approximation guarantees (in the sense that they yield a solution for which the value of the optimised criterion is within a certain fraction of the optimal one) have been proposed in the computer science community. The INDEX consortium, which gathers experts in both design of experiments and computer science, will build upon these results. We believe that tighter approximation bounds and more efficient algorithms can be constructed by taking the specificity of the design problem into account in the study of the ergodic properties of the corresponding discrete dynamical systems.
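A classical example of such a guarantee is the greedy algorithm for maximising a monotone submodular set function under a cardinality constraint, whose value is at least a fraction 1 - 1/e of the optimum. The sketch below checks this on a small instance where the optimum can be found by exhaustive search. The facility-location criterion used here is a stand-in chosen because it is monotone and submodular; it is an assumption of the example, not one of the project's design criteria, and the function names are hypothetical.

```python
import itertools
import math

import numpy as np


def facility_location(sim, subset):
    """f(S) = sum over x of max_{s in S} sim[x, s], with f(empty set) = 0."""
    if not subset:
        return 0.0
    return float(sim[:, list(subset)].max(axis=1).sum())


def greedy(sim, k):
    """Add, k times, the candidate giving the largest criterion value."""
    chosen = []
    for _ in range(k):
        rest = [j for j in range(sim.shape[1]) if j not in chosen]
        best = max(rest, key=lambda j: facility_location(sim, chosen + [j]))
        chosen.append(best)
    return chosen


def brute_force(sim, k):
    """Exhaustive search over all subsets of size k (small instances only)."""
    subsets = itertools.combinations(range(sim.shape[1]), k)
    return max(subsets, key=lambda s: facility_location(sim, s))


rng = np.random.default_rng(0)
pts = rng.random((12, 2))                    # hypothetical candidate points
sq = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
sim = np.exp(-sq / 0.1)                      # nonnegative similarities
k = 3
g = facility_location(sim, greedy(sim, k))
opt = facility_location(sim, brute_force(sim, k))
print(g / opt, ">=", 1 - 1 / math.e)
```

The greedy solution is built incrementally, so its first n points form the greedy solution of size n; the guarantee therefore applies to every prefix, which is what makes this family of algorithms attractive for nested designs.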


Besides its intrinsic scientific interest, the research program of INDEX is also of practical interest: the need to find efficient nested families of designs has been brought to the other consortium members by EDF, motivated by the real context of exploitation of surrogate models in an industrial scenario.

Project coordination

Luc Pronzato (Laboratoire I3S, CNRS UCA)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partners

EDF SA EDF R&D SITE CHATOU
JKU Johannes Kepler Universität Linz
CNRS Laboratoire I3S, CNRS UCA

ANR funding: 249,413 euros
Beginning and duration of the scientific project: January 2019 - 36 Months
