Reconstructing Distributions of Molecules from Cryo-EM Datasets
Reconstructing flexible molecules is one of the most critical challenges in structural biology: observing how biomolecules move provides valuable insight into their functions and mechanisms. Cryogenic electron microscopy (cryo-EM) is well suited to studying the dynamic conformational landscape (i.e., the range of motions) because it captures snapshots of entire conformational ensembles. However, the reconstruction task poses significant mathematical and computational challenges because the datasets are massive (sometimes exceeding terabytes), high-dimensional, and have a low signal-to-noise ratio.
After introducing the fundamentals of the cryo-EM reconstruction problem, I will present a framework for reconstructing the protein distribution in a dataset by representing it within a linear subspace. This process begins with an efficient method for estimating the principal components of the distribution of molecules (3D volumes) from incomplete and noisy measurements (2D images), continues with a reconstruction algorithm based on kernel regression, and concludes with algorithms for inferring molecular motions.
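To give a flavor of the kernel-regression step, here is a minimal toy sketch, not the method presented in the talk. It assumes each noisy observation already comes with a known low-dimensional coordinate (standing in for its position in the linear subspace) and uses a standard Nadaraya-Watson estimator with a Gaussian kernel. The function name `kernel_regress`, the 1D "volumes", the bandwidth, and the noise level are all illustrative assumptions; real cryo-EM data additionally involves projections, unknown poses, and microscope effects that are ignored here.

```python
import numpy as np

def kernel_regress(z_train, y_train, z_query, bandwidth=0.1):
    """Nadaraya-Watson kernel regression (hypothetical helper).

    z_train: (n, d) latent coordinates, y_train: (n, m) noisy observations,
    z_query: (q, d) coordinates at which to estimate the signal.
    """
    # Squared distances between every query and every training coordinate.
    diff = z_query[:, None, :] - z_train[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    # Gaussian kernel weights, normalized so each query's weights sum to 1.
    w = np.exp(-0.5 * d2 / bandwidth ** 2)
    w /= w.sum(axis=1, keepdims=True)
    # Each estimate is a weighted average of the noisy observations.
    return w @ y_train

def clean_signal(z, grid):
    """Toy 'molecule': a 1D bump whose position shifts with the coordinate z."""
    return np.exp(-((grid[None, :] - 0.5 - 0.2 * z) ** 2) / (2 * 0.05 ** 2))

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 32)
z = rng.uniform(-1.0, 1.0, size=(2000, 1))                   # assumed-known coordinates
y = clean_signal(z, grid) + rng.normal(0.0, 1.0, (2000, 32))  # low-SNR observations

z_query = np.array([[0.0], [0.5]])
estimate = kernel_regress(z, y, z_query, bandwidth=0.1)
truth = clean_signal(z_query, grid)
rms_error = np.sqrt(np.mean((estimate - truth) ** 2))
print(f"RMS error of the kernel estimate: {rms_error:.3f} (noise std was 1.0)")
```

Averaging many noisy observations with nearby coordinates suppresses noise at the cost of some blurring across conformations; the bandwidth controls that trade-off.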