In structural biology, Small-Angle Scattering (SAS) experiments are unique because, although they provide low-resolution data, they can be performed in closer-to-native conditions than those of X-ray crystallography. A number of questions on SAS, however, remain unresolved, particularly regarding the modelling of ensembles of conformers in solution. In this article, we study the ensemble average and covariance of SAS profiles analytically. Using this ensemble covariance, we demonstrate the hierarchical nature of SAS profiles. Furthermore, we show that the information content is not uniform and reaches its maximum in the intermediate q range.

The arguments are generalized using microsecond-scale molecular dynamics trajectories of lysozyme and an ensemble of the intrinsically disordered protein p15PAF.

We show that for highly flexible systems, the SAS profile is a representation of the ensemble of conformers in solution, and not that of one conformer in particular.

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Data Availability: All relevant data are in the paper and its Supporting Information files.

Competing interests: The authors have declared that no competing interests exist.

SAS experiments are easier to perform than X-ray crystallography, and take place in closer-to-native conditions.

Therefore, SAS is in a unique position for structural biologists, and the generalization of in-house SAXS experiments will only strengthen this position. However convenient SAS experiments are, they only provide a limited amount of information. They are therefore often combined with other experiments to reach atomic resolution [ 3 ]. It is frequent to extract a number of parameters, such as the radius of gyration, the Porod exponent, or the volume of correlation [ 4–8 ].

Simple parameters, such as those extracted from Kratky or Porod-Debye plots, can be used to assess the flexibility of a macromolecule [ 9 ]. Whether these or other parameters are independent of each other, and how they relate to the maximum number of independent points in a SAS profile [ 10–12 ], has not been studied in depth.

It is also becoming clear that SAS measures conformational diversity [ 13 ]. In these methods, the SAS profile is almost always modelled as a weighted average of the profiles of the individual conformations. The different methods differ by the way they select the weights, and the number of conformations.

These methods are best suited to describe a small number of well-defined conformations present simultaneously in solution. However, cases where conformations vary continuously from one to the other are to be treated with much more care.

As noted early on [ 14, 15 ], the obtained ensemble is then illustrative of the diversity of possible conformations. The number of conformations these methods propose is then not necessarily to be taken for granted, because the conformations are expected to have a strong internal variability [ 20 ].

In that respect, EROS [ 17 ] goes further in modelling continuous motion, because each conformation is already an average over a potentially large number of structures. Yet the number of parameters, which in essence is three times the number of atoms times the number of structures, still becomes very large for such systems, and the risk of overfitting is not negligible.

It proposes a generative model for the protein ensemble fitted on experimental SAXS data. This model therefore controls the expansion of the number of parameters. Yet it is unclear how that number of parameters can be extracted from it, and how to summarize the obtained distribution. Clearly, additional ways to represent continuous conformational variability would be welcome in the field.

Hub [ 21 ], dropping the first ns in each simulation. It corresponds to the structure in the input which is closest to the center of the cluster.

The median structure of the first simulation was taken as the center structure for all analytical calculations and Figs 1 and 2. Average SAS profile in blue dashed line, left axis, in arbitrary units, Eq 6. Smallest correlation is We refer to this as the correlated dataset. We do not expect other solvation models to be very different from the two cases presented here. Extension to intrinsically disordered proteins was performed on the p15PAF ensemble [ 28 ], available in the protein ensemble database under the accession code PED.

We used the experimental profile of p15PAF, and the structures comprised in the ensemble. In this article, we relate the SAS profiles of conformers arising naturally in solution through thermal motion. Form factors used in this formula must include volume exclusion and solvent effects. Their definition is not a trivial task and falls outside of the scope of this article.

We now treat $X$ as a random vector having $3N$ components. We discuss generalizations thereof further down. The average intensity is computed by taking the mathematical expectation of the intensity $I_X(q)$ over $X$.
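As a rough numerical illustration of this expectation, the sketch below evaluates the Debye formula for a set of coordinates and averages it over conformers drawn with independent isotropic normal displacements around a center structure. The function names, the unit form factors, and the displacement width `sigma` are assumptions made for illustration and are not taken from the article; realistic form factors would have to include volume exclusion and solvent effects.

```python
import numpy as np

def debye_intensity(coords, q, form_factors=None):
    """Debye formula: I(q) = sum_k sum_l f_k f_l sin(q d_kl) / (q d_kl).

    Assumption: form factors are q-independent (default: all equal to 1).
    """
    coords = np.asarray(coords, dtype=float)
    q = np.asarray(q, dtype=float)
    n_atoms = coords.shape[0]
    f = np.ones(n_atoms) if form_factors is None else np.asarray(form_factors, dtype=float)
    # Pairwise distances d_kl, shape (n_atoms, n_atoms); d_kk = 0 on the diagonal.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # np.sinc(x) is sin(pi x) / (pi x), hence the division by pi.
    kernel = np.sinc(q[:, None, None] * d[None, :, :] / np.pi)
    return np.einsum("k,l,qkl->q", f, f, kernel)

def ensemble_average(center, q, sigma, n_samples=200, seed=0):
    """Monte Carlo estimate of E[I_X(q)] for X = center + N(0, sigma^2) per coordinate."""
    rng = np.random.default_rng(seed)
    center = np.asarray(center, dtype=float)
    profiles = np.array([
        debye_intensity(center + rng.normal(scale=sigma, size=center.shape), q)
        for _ in range(n_samples)
    ])
    return profiles.mean(axis=0)
```

With unit form factors, the value at q = 0 equals the squared number of atoms for every conformer, so the average there does not depend on `sigma`.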

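Using the linearity of the expectation in the Debye formula (Eq 1), we have

$$\mathbb{E}\left[I_X(q)\right] = \sum_{k}\sum_{l} f_k(q)\, f_l(q)\; \mathbb{E}\left[\frac{\sin(q\, d_{kl})}{q\, d_{kl}}\right] \qquad (2)$$

where $f_k(q)$ is the form factor of atom $k$ and $d_{kl}$ is the distance between atoms $k$ and $l$. Therefore, we seek the average of $\sin(q\, d_{kl})/(q\, d_{kl})$ for any pair of atoms $k$ and $l$.

This result was obtained differently by R. James [ 32 ], as recently rediscovered by P. Moore [ 33 ], who generalized it to anisotropic motion. The fact that multiple different conformers coexist in solution can then be captured by SAS experiments. Then, assuming no interactions between A and B particles, the average intensity is a weighted sum of the intensities for A and B, each of them given by Eq 6.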
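The weighted sum for non-interacting species can be sketched as follows. The function name, the profile arrays, and the population weights are hypothetical inputs chosen for illustration; the sketch only assumes that the average profile of each species has already been computed on a common q grid.

```python
import numpy as np

def mixture_profile(profiles, weights):
    """Population-weighted average of per-species SAS profiles.

    profiles: array of shape (n_species, n_q), one average profile per species.
    weights:  non-negative populations; they are normalized to sum to 1.
    """
    profiles = np.asarray(profiles, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if profiles.ndim != 2 or weights.shape != (profiles.shape[0],):
        raise ValueError("need one weight per profile row")
    if np.any(weights < 0) or weights.sum() <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    weights = weights / weights.sum()
    return weights @ profiles
```

For two species A and B, `mixture_profile([profile_A, profile_B], [w_A, w_B])` returns the profile of the mixture under the no-interaction assumption stated above.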

At low angle, the SAS profile contains information from both conformations. Therefore, in SAS, the higher q gets, the more we focus on well-defined conformations. There can be a number of them, but they must be well-defined. On the contrary, continuous conformational variability is more likely only to be noticed at low q values. In any case, because conformations of a thermal ensemble are related, there exist a number of rules that link their SAS profiles together. The SAS profile of one such conformation cannot deviate from Eq 5 in an arbitrary way.

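This is what we now show, by computing the covariance between the SAS profile at $q_i$ and $q_j$. For this purpose, we again use the Debye formula (Eq 1). The expectation of a product of intensities is

$$\mathbb{E}\left[I_X(q_i)\, I_X(q_j)\right] = \sum_{k,l,m,n} f_k(q_i)\, f_l(q_i)\, f_m(q_j)\, f_n(q_j)\; \mathbb{E}\left[\frac{\sin(q_i\, d_{kl})}{q_i\, d_{kl}}\, \frac{\sin(q_j\, d_{mn})}{q_j\, d_{mn}}\right] \qquad (7)$$

Then, we notice that

$$\mathbb{E}\left[\frac{\sin(q_i\, d_{kl})}{q_i\, d_{kl}}\, \frac{\sin(q_j\, d_{mn})}{q_j\, d_{mn}}\right] = \mathbb{E}\left[\frac{\sin(q_i\, d_{kl})}{q_i\, d_{kl}}\right] \mathbb{E}\left[\frac{\sin(q_j\, d_{mn})}{q_j\, d_{mn}}\right] \qquad (8)$$

when $k$, $l$, $m$, $n$ describe four different atoms. First, similar to the calculation of the average intensity, the autocovariance can be given in closed form. It however leads to a formula that is numerically unstable [ 34 ]. We therefore seek an approximation to this distribution.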
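The covariance between intensities at two scattering vectors can also be estimated by brute-force sampling, which is a different technique from the analytical route taken in the text. The sketch below is such a Monte Carlo estimate under the same independent-normal-displacement assumption; the function names, unit form factors, and the width `sigma` are illustrative assumptions.

```python
import numpy as np

def debye_profile(coords, q):
    """Debye formula with unit form factors (an illustrative simplification)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # np.sinc(x) is sin(pi x) / (pi x), hence the division by pi.
    return np.sinc(q[:, None, None] * d[None, :, :] / np.pi).sum(axis=(1, 2))

def profile_covariance(center, q, sigma, n_samples=300, seed=0):
    """Sample conformers around `center` and return (mean profile, covariance matrix)."""
    rng = np.random.default_rng(seed)
    center = np.asarray(center, dtype=float)
    q = np.asarray(q, dtype=float)
    profiles = np.stack([
        debye_profile(center + rng.normal(scale=sigma, size=center.shape), q)
        for _ in range(n_samples)
    ])
    # Rows are samples, columns are q points, so variables are in columns.
    cov = np.cov(profiles, rowvar=False)
    return profiles.mean(axis=0), cov
```

The entry `cov[i, j]` estimates the covariance between the intensities at `q[i]` and `q[j]`; the square root of the diagonal gives the standard deviation along the profile.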

A certain number of approaches exist [ 3437 ], but we use a more direct one (see S1 Text). In all cases we studied, the standard deviation (SD) has the characteristic shape shown in Fig 1 (solid red line; see also S3 Text). The SD starts at zero, consistent with the fact that I(0) is proportional to the number of electrons and is not impacted by conformational changes.

It then quickly reaches a maximum, and then decreases to a plateau. On a relative scale therefore, the standard deviation represents a non-monotonically increasing proportion of the scattered intensity.

This finding is consistent with those discussed for the average intensity Eq 5, in that the conformational diversity is captured at wide angles. We do not expect different hydration models to produce significantly different standard deviations, unless they hydrate different conformers of the ensemble in a different way.

However, in the most realistic cases, changes in conformation should cause the solvent shell to rearrange. The water density would therefore be impacted. Consequently, the standard deviation of I(0) could be nonzero.

We now focus on the correlation structure of the same SAS profile (Fig 2). In all cases we studied, correlations are strong close to the diagonal, and vanish when points are far apart.

It is also frequent to observe at least one basin with negative correlations. The fact that points that are close together are highly correlated was expected. Indeed, this observation is a simple consequence of the predictable nature of SAS profiles on very short q scales. Conversely, points that are far apart seem to be largely decorrelated.
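A correlation matrix of this kind is obtained by normalizing a covariance matrix by the standard deviations of its two points. The sketch below shows that normalization; the function name is hypothetical, and the choice to return NaN where a standard deviation is zero (as can happen at q = 0) is an assumption of this sketch.

```python
import numpy as np

def covariance_to_correlation(cov):
    """Return corr[i, j] = cov[i, j] / (sd[i] * sd[j]).

    Entries involving a point with zero variance are undefined and set to NaN.
    """
    cov = np.asarray(cov, dtype=float)
    sd = np.sqrt(np.diag(cov))
    denom = np.outer(sd, sd)
    corr = np.full_like(cov, np.nan)
    # Divide only where both standard deviations are positive.
    np.divide(cov, denom, out=corr, where=denom > 0)
    return corr
```

Values close to 1 near the diagonal correspond to strongly correlated neighboring points, and negative entries correspond to the basins of negative correlation mentioned above.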

This fact demonstrates the hierarchical nature of SAS profiles [ 38 ]. Being a Fourier transform, the SAS profile describes the overall shape of the structure at low angle.

At higher angle, it starts describing the quaternary structure and so forth. These results suggest that SAS compartmentalizes these descriptions.

Although individual atoms have a nonzero scattering contribution along the whole range of q values, collectively, a different trend emerges. For example, changes in the quaternary structure that do not modify the overall shape will not affect the onset of the SAS profile. A striking feature that can be seen in Fig 2 is that the bandwidth of this correlation matrix varies along the diagonal.

Thus, neighboring points will be more or less correlated depending on their absolute position along the SAS profile. That is, the density of independent points along a SAS profile changes as q changes. In information theory, the mutual information of two random variables quantifies how much information one carries about the other.

If we take two neighboring points along the SAS profile, their mutual information is given by Eq 17. If the mutual information is high, the intensities at $q_i$ and $q_j$ are strongly related, and consequently the information content of the SAS curve is lower in that region. Therefore, the information content is not uniformly distributed along a SAS profile, and is larger when the bandwidth is smaller.
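To make the link between correlation and mutual information concrete, the sketch below uses the standard result for two jointly Gaussian variables with correlation coefficient rho, for which the mutual information is -0.5 ln(1 - rho^2). Treating neighboring intensities as jointly Gaussian is an assumption of this sketch, and the expression may differ from the one used in the article, which is not reproduced in this text. The function names are hypothetical.

```python
import numpy as np

def gaussian_mutual_information(rho):
    """Mutual information (in nats) of two jointly Gaussian variables with correlation rho."""
    rho = np.asarray(rho, dtype=float)
    with np.errstate(divide="ignore"):
        # log1p(-rho^2) = ln(1 - rho^2); perfectly correlated points give +inf.
        return -0.5 * np.log1p(-rho ** 2)

def neighbor_mutual_information(corr):
    """Mutual information between each point and its next neighbor along the profile."""
    corr = np.asarray(corr, dtype=float)
    # The first superdiagonal holds corr[i, i + 1].
    return gaussian_mutual_information(np.diag(corr, k=1))
```

Under this assumption, regions of the profile where neighboring points are weakly correlated have low mutual information and therefore a higher density of independent points.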

The analytical model described until now makes the simplifying assumption that thermalization induces independent random normal displacements for each atom.

Such an assumption has strong limitations [ 33, 40 ]. In particular, movements in solution are anisotropic, do not follow a normal distribution, and strong correlations between atoms or even protein domains can be expected. To a lesser extent, the bivariate noncentral chi distribution must be approximated to still obtain analytical results. In any case, more realistic representations of thermalization can be obtained with molecular dynamics (MD) simulations.

Hub [ 21 ], from which we calculated the variance matrix. Trends in the standard deviations are similar between correlated and independent motion (Fig 3).

In the second part of this article, the described SAS covariances are obtained through a long MD simulation.

