CE29 - Chemistry: analysis, theory, modeling

Conformational design for high throughput MS/MS reading of digital polymers – shapeNread

Molecular design for high throughput reading of digital polymers

The shapeNread project aims to optimize the three-dimensional structure of digital polymers for high throughput sequencing. Coded data will be segmented into blocks that are released in a first MS/MS step and then activated in a second MS/MS step for sequencing. Specific tags will be designed by molecular modeling so that these segments can be distinguished by their conformation, enabling high throughput sequencing by MS/MS-IMS-MS/MS.

High throughput reading of molecularly encoded information

The general scientific context of the shapeNread project is massive, long-term data storage, and the need for alternatives to the hard disks currently in use, which require energy-consuming data centers to store the quintillions of bytes of data generated every day. A key issue is to increase storage density, that is, to reduce the space occupied by each binary digit (or bit) used to code digital information. Each bit on a current hard disk occupies tens of nanometers, whereas targeted storage densities require a bit size in the nanometer range or below, that is, the atomic or molecular scale. Recent concepts propose storing digital information at the molecular level in polymers with a controlled sequence of two monomers, defined respectively as the 0-bit and the 1-bit of the ASCII code. In this context, our group has developed a variety of sequence-defined macromolecules that were produced as highly monodisperse species. Because the 0/1 coding system used in these polymers is based on the mass difference between the two units, tandem mass spectrometry (MS/MS) is systematically used as an efficient sequencing method to "read" the information "written" in the chains. Yet the reading step still needs to be improved. The limitation of MS/MS for de novo sequencing of long chains has been addressed by segmenting binary information into blocks, but decoding a whole polymer remains slow because the number of secondary activation steps increases with the number of blocks to be sequenced. Using tags to induce conformational variation of all segments, and hence their separation by ion mobility spectrometry (IMS), would solve this issue by enabling any digital polymer to be sequenced in a single MS/MS-IMS-MS/MS experiment. By providing a high throughput reading methodology, this project will help maintain our leadership in the highly competitive academic field of molecular information storage and pave the way for the development of future devices.
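The coding principle described above can be illustrated with a minimal Python sketch. It assumes 8-bit ASCII characters and, purely for illustration, blocks of one byte each; the block size actually used in the project is not stated here. The only chemistry it encodes is the mass gained when two H atoms are replaced by two CH3 groups (two CH2 units, about 28.03 Da), which is the 0-bit to 1-bit substitution of these polymers. The function names are hypothetical.

```python
# Monoisotopic masses (Da). A 1-bit unit differs from a 0-bit unit by two CH3
# groups in place of two H atoms, i.e. a net gain of two CH2 groups.
MASS_C = 12.000000
MASS_H = 1.007825
DELTA_MASS_1_BIT = 2 * (MASS_C + 2 * MASS_H)  # about 28.0313 Da


def text_to_bits(text: str) -> str:
    """Encode text as a string of 0/1 characters, 8 bits per ASCII character."""
    return "".join(format(byte, "08b") for byte in text.encode("ascii"))


def segment(bits: str, block_size: int = 8) -> list[str]:
    """Split the bit string into consecutive blocks (block_size is an assumption)."""
    return [bits[i:i + block_size] for i in range(0, len(bits), block_size)]


def block_mass_shift(block: str) -> float:
    """Mass shift of a block relative to an all-0 block of the same length."""
    return block.count("1") * DELTA_MASS_1_BIT


blocks = segment(text_to_bits("MS"))
for index, block in enumerate(blocks):
    print(index, block, round(block_mass_shift(block), 4))
```

Two blocks containing the same number of 1-bits have the same mass shift, which is one reason a tag that makes each block distinguishable is useful before the second activation step.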

In order to rationalize their synthesis, tags will be designed by molecular modeling according to their propensity to give each segment a different collision cross section (CCS), so that all segments can be separated by IMS. The success of the shapeNread project hence relies on the ability of a theoretical model to predict the conformation of segments as a function of their tag, so that only relevant candidates are selected for the synthesis of digital polymers. This predictive model will be built and evaluated on a series of available macromolecules whose CCS will be measured experimentally and compared to theoretical values. Targeted species are composed of phosphodiester units that are readily ionized in the negative ion mode, so an IMS calibration procedure for anions will first have to be developed to ensure the reliability of the experimental CCS values used to validate predictions. Finally, processing of raw data (IMS and MS/MS) by the in-house MS-DECODER software will be required for automated on-line treatment of digital data.
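The selection step described above, keeping only tags whose predicted CCS values are different enough to be separated, can be sketched as follows. This is an illustration under stated assumptions, not the project's procedure: the tag names, the CCS values (in Å²) and the 5 % minimum relative gap are all hypothetical placeholders, and the real criterion would depend on the resolving power of the IMS instrument.

```python
from itertools import combinations


def resolvable(ccs_values, min_relative_gap):
    """True if every pair of CCS values differs by at least min_relative_gap
    (expressed as a fraction of the pair's mean)."""
    return all(
        abs(a - b) / ((a + b) / 2) >= min_relative_gap
        for a, b in combinations(ccs_values, 2)
    )


def select_tag_sets(predicted_ccs, n_blocks, min_relative_gap):
    """List every combination of n_blocks tags whose predicted CCS values
    are pairwise resolvable."""
    return [
        tags
        for tags in combinations(sorted(predicted_ccs), n_blocks)
        if resolvable([predicted_ccs[t] for t in tags], min_relative_gap)
    ]


# Hypothetical predicted CCS values (Å²) for segments carrying four candidate tags.
predicted_ccs = {"tag_A": 210.0, "tag_B": 214.0, "tag_C": 235.0, "tag_D": 262.0}

print(select_tag_sets(predicted_ccs, n_blocks=3, min_relative_gap=0.05))
```

With these placeholder values, tag_A and tag_B are too close to be used together, so only the sets that contain one of them are retained.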

In order to validate the theoretical model aimed at predicting tag-induced conformational changes, CCSs calculated by the model have to be compared to experimental values measured by IMS. However, the reliability of such measurements depends on the availability of appropriate standards, whose empirical search is often tedious, particularly in the negative ion mode required by our analytes. The open-source software IMSCal, released in early 2021 by B.T. Ruotolo et al., allows experimental CCSs to be obtained with no need for standard measurements, at least in the positive ion mode. After slight modifications of this software, we have recently demonstrated that it is also robust in the negative ion mode. The blocking point regarding the need for standards has hence been circumvented, enabling the evaluation of the theoretical methods involved in the predictive model. Two models have been developed. Model 1 is based on geometry optimization and molecular dynamics: it gives more precise data but is more time-consuming. Model 2 relies on DFT-based conformational analysis without geometry optimization, and is hence less precise but faster. In both cases, the final CCS calculation was performed by the trajectory method. To test these models, we used the small 6 Å² CCS variation induced by replacing two H atoms with two CH3 groups when a 0-bit is changed to a 1-bit in the coded chain. Surprisingly, the best correlation with experimental values was obtained with model 2, which predicted a variation of 8 Å². The promising performance of model 2 is currently being evaluated for CCS changes induced by tag substitution. In parallel, alkoxyamines decorated with different tags have been prepared and used to synthesize coded macromolecules. Their MS/MS and IMS characterization has allowed pre-selection of the most promising designs.
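The comparison reported above for model 2 amounts to simple arithmetic, shown here as a small Python sketch. Only the two values quoted in the text are used (an experimental shift of 6 Å² and a predicted shift of 8 Å²); the function name is hypothetical, and no value is given for model 1 because none is reported.

```python
def ccs_shift_deviation(experimental_shift, predicted_shift):
    """Return the absolute (Å²) and relative deviation of a predicted CCS shift
    from the experimental one."""
    absolute = predicted_shift - experimental_shift
    relative = absolute / experimental_shift
    return absolute, relative


# Values quoted in the text for the 0-bit to 1-bit substitution (model 2).
absolute, relative = ccs_shift_deviation(experimental_shift=6.0, predicted_shift=8.0)
print(f"{absolute:+.1f} Å², {relative:+.0%}")
```

The same function could be applied to tag-induced CCS changes once experimental and predicted values are available for them.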

The predictive model is nearly available and will be tested by the end of 2021 on macromolecules whose CCS has already been measured. These results will help decide which tasks to focus on: development of the predictive model and optimization of the MS/MS-IMS-MS/MS coupling at the Institut de Chimie Radicalaire (ICR) in Marseille, or synthesis work conducted at the Institut Charles Sadron (ICS) in Strasbourg. Meanwhile, the collaboration between ICR and the Institut Pluridisciplinaire Hubert Curien (IPHC) in Strasbourg will consist of implementing appropriate programs in the in-house MS-DECODER software to handle the multidimensional data generated upon digital polymer sequencing. In parallel, even if model 2 performs adequately in predicting the CCS modulations induced by tags, it would be interesting from a more fundamental point of view to identify potential issues related to the geometry optimization step used in model 1. It should be mentioned, however, that the deliverables initially defined in the Gantt chart need to be rescheduled, as the lockdown and remote working periods experienced in 2020 had a negative impact on the experimental tasks of the project. A delay of a few months seems reasonable.

One article in a peer-reviewed international journal:

Design of abiological digital poly(phosphodiester)s.
L. Charles, J.-F. Lutz
Accounts of Chemical Research 2021, 54 (7) 1791-1800

One oral communication at a national congress:

Synthesis of alkoxyamine-containing mass-tags allowing optimal MS/MS sequencing of digital polymers.
T. Schutz, E. Laurent, L. Oswald, J.-L. Clément, D. Gigmes, L. Charles, J.-F. Lutz
SFC-Alsace Young Scientist Webinar, 28th June 2021

The shapeNread project aims to optimize the three-dimensional structure of encoded synthetic polymers to allow their high throughput sequencing by coupling tandem mass spectrometry with ion mobility spectrometry. These polymers will be synthesized so that binary information is contained in blocks, each labeled with a specific tag that makes it distinguishable in terms of mass and conformation. To do so, tags will be designed by molecular modeling according to their propensity to give each block a different collision cross section. Once released in a first activation stage, block-containing fragments could hence be separated by ion mobility and then individually sequenced after a second activation stage. Such a structural design would allow high throughput reading of molecularly encoded information (at least one decabyte per chain) by MS/MS-IMS-MS/MS. Appropriate development of the MS-DECODER software will enable automated data analysis.

Project coordinator

Madame Laurence CHARLES (Institut de Chimie Radicalaire)

The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.


ICR Institut de Chimie Radicalaire
ISIS Institut de science et d'ingénierie supramoléculaires
IPHC Institut Pluridisciplinaire Hubert Curien

ANR funding: 511,714 euros
Beginning and duration of the scientific project: December 2019 - 48 Months
