DS0705 - Sciences et technologies logicielles

Correctness of Linux Scripts – Colis

Submission summary

This project aims at applying techniques from deductive program
verification and analysis of tree transformations to the problem of
analyzing shell scripts, in particular those that are used in software
installation. These scripts play a central role in the initial
installation of UNIX and Linux machines, as well as in recurrent
configuration, installation, and software maintenance tasks. Scripts
interact with the state of the operating system, in particular via
the file system.

Errors in scripts may have catastrophic consequences due to the fact
that they are often executed at crucial moments, like the update of
software packages that are installed on a UNIX-like machine. Scripts
must work correctly in a multitude of different situations that
probably have not been planned by the author of the script. For
instance, installation scripts have to work correctly regardless of
the choice of packages that currently are installed on the system, and
they also have to work correctly in abnormal situations, for instance
when the system administrator tries to resume an operation that
was previously interrupted.

We see three main challenges:

1) Script languages are difficult. They are Turing-complete
programming languages. Typically they provide known features from
imperative programming languages like various loop constructions
(while, for, ...), assignment to variables, and recursive functions
and procedures. Since script languages are often used for interaction
with the operating system they provide easy means to invoke UNIX
commands. However, their highly dynamic execution model is based on a
combination of expansion (e.g. variable expansion), evaluation, and
execution of system commands. They stand between macro expansion
languages and classical programming languages.

2) The data structure that is manipulated by these scripts is a
complex one, namely the file system tree that is installed on a
UNIX-like machine. Our approach is to consider the file system, as
seen by the programmer of a shell script (i.e. abstracting from the
implementation of a file system) as a tree-like object. However, we
have to cope with features like unbounded depth and width, multiple
hard links to leaves, symbolic links, ownership and access
permissions, and access dates.

3) The properties we are interested in range from the simple to the
complex. Simple properties may be that a script is not allowed to
touch certain parts of the file system tree, complex properties
involve sequences of executions of possibly different scripts, like
executing an installation script then a de-installation script, or
re-running a script after the first attempt to execute it has failed.

At the end of the project we want to produce tools that formally
analyse and attempt to verify scripts, possibly giving assurance of
correctness, or pin-pointing possible errors. This is an ambitious
goal - we certainly will have do compromises, make hypotheses on the
structure of the scripts, and do some approximations of the file
system and the behavior of system commands. These assumptions and
approximations will be guided by use cases of installation scripts from
the Debian GNU/Linux distribution.

We will start by laying out the formal models of the shell script
language and the file system data structure, and by developing a
language for expressing properties of scripts to be verified. We will
pursue two avenues of formal techniques: one based on tree transducers
which are a formal system of transformations on trees (like XML trees,
or file system trees), and the other one using deductive program
verification techniques as proposed by the Why3 environment. We will
also aim at combining these two technologies.

Project coordination

Ralf Treinen (Université Paris-Diderot, Laboratoire Preuves, Programmes et Systèmes (en train de fusion avec LIAFA))

The author of this summary is the project coordinator, who is responsible for the content of this summary. The ANR declines any responsibility as for its contents.

Partner

PPS Université Paris-Diderot, Laboratoire Preuves, Programmes et Systèmes (en train de fusion avec LIAFA)
INRIA Saclay - Ile-de-France/Equipe projet TOCCATA Inria - Centre de recherche Saclay - Ile-de-France - Equipe projet TOCCATA
INRIA Lille INRIA Lille - Nord Europe

Help of the ANR 384,111 euros
Beginning and duration of the scientific project: September 2015 - 48 Months

Useful links

Explorez notre base de projets financés

 

 

ANR makes available its datasets on funded projects, click here to find more.

Sign up for the latest news:
Subscribe to our newsletter