Understanding protein non-folding

doi:10.1016/j.bbapap.2010.01.017

Review

Biochim Biophys Acta. 2010 Jun;1804(6):1231-64. Epub 2010 Feb 1.

Vladimir N Uversky¹, A Keith Dunker

Affiliation

¹ Institute for Intrinsically Disordered Protein Research, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN 46202, USA.

PMID: 20117254
PMCID: PMC2882790
DOI: 10.1016/j.bbapap.2010.01.017

Vladimir N Uversky et al. Biochim Biophys Acta. 2010 Jun.

Abstract

This review describes the family of intrinsically disordered proteins, members of which fail to form rigid 3-D structures under physiological conditions, either along their entire lengths or only in localized regions. Instead, these intriguing proteins/regions exist as dynamic ensembles within which atom positions and backbone Ramachandran angles exhibit extreme temporal fluctuations without specific equilibrium values. Many of these intrinsically disordered proteins are known to carry out important biological functions which, in fact, depend on the absence of a specific 3-D structure. The existence of such proteins does not fit the prevailing structure-function paradigm, which states that a unique 3-D structure is a prerequisite to function. Thus, the protein structure-function paradigm has to be expanded to include intrinsically disordered proteins and alternative relationships among protein sequence, structure, and function. This shift in the paradigm represents a major breakthrough for biochemistry, biophysics and molecular biology, as it opens new levels of understanding with regard to the complex life of proteins. This review will try to answer the following questions: how were intrinsically disordered proteins discovered? Why don't these proteins fold? What is so special about intrinsic disorder? What are the functional advantages of disordered proteins/regions? What is the functional repertoire of these proteins? What are the relationships between intrinsically disordered proteins and human diseases?

Figures

**Figure 1. Time-dependent increase in the number of PubMed hits dealing with ID proteins**
The following keywords have been used to perform this search: intrinsically disordered, natively unfolded, intrinsically unstructured, intrinsically unfolded and intrinsically flexible.

**Figure 2. Peculiarities of amino acid composition of ID proteins**
A. Comparison of the mean net charge and the mean hydrophobicity for a set of 275 folded (open circles) and 91 natively unfolded proteins (gray circles). The solid line represents the border between intrinsically unstructured and native proteins (see text). B. Order/disorder composition profile. The amino acid composition of ordered proteins is compared with that of each of three databases of disordered proteins. The ordinates are (% amino acid in disordered dataset - % amino acid in ordered dataset) / (% amino acid in ordered dataset) = Δ/globular_3D. The residues are ordered according to Vihinen's flexibility scale [105]. The name of each database indicates how the disordered regions were identified. Negative values indicate that a residue is less abundant in the disordered database than in the ordered one; positive values indicate that it is more abundant.
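The ordinate defined in panel B can be computed directly from two composition tables. The following is a minimal sketch only: the function name `composition_profile` and the percentages are hypothetical and chosen for illustration; they are not data from the review.

```python
def composition_profile(disordered_pct, ordered_pct):
    """Return (disordered - ordered) / ordered for each residue.

    Both arguments map a one-letter residue code to its percentage
    in the corresponding dataset.
    """
    profile = {}
    for residue, ordered in ordered_pct.items():
        if ordered == 0:
            raise ValueError(f"ordered composition of {residue} is zero")
        profile[residue] = (disordered_pct[residue] - ordered) / ordered
    return profile


# Hypothetical percentages, for illustration only.
ordered = {"W": 1.5, "P": 4.0}
disordered = {"W": 0.75, "P": 6.0}

profile = composition_profile(disordered, ordered)
for residue, value in profile.items():
    label = "depleted" if value < 0 else "enriched"
    print(f"{residue}: {value:+.2f} ({label} in the disordered set)")
```

With these made-up inputs, a negative value marks a residue that is depleted in the disordered set and a positive value marks one that is enriched, matching the sign convention of the caption.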

**Figure 3. Time-dependent increase in the total number of IDP predictors**
The list of predictors includes: the first suggested predictor of IDPs [378]; the first formal predictor of IDPs [379]; predictor of ID in calcineurin family [380]; CH-Plot [71]; CDF [85]; PONDR^® VL-XT [84]; GlobPlot [381]; DisEMBL [382]; DISOPRED [383]; flavors of protein disorder [122]; NORSp [384]; predictor by using reduced amino acid alphabet [385]; DISOPRED2 [307]; DRIPPRED [386]; FoldUnfold [387, 388]; Softberry (http://www.softberry.com); VaZyMolO [389]; PONDR^® VL3-E [113]; IUPred [301, 390]; FoldIndex [391]; RONN [392]; DISpro [393]; PONDR^® VSL1 [394]; CDF [100]; combined CDF/CH-Plot predictor [100]; α-MoRF [114]; Prelink [395]; PONDR^® VSL2 [396]; Spritz [397]; DisPSSMP [398]; IUP predictor [399]; disorder prediction in calmodulin partners [233]; Decision trees [268]; Wiggle [400]; iPDA [401]; PrDOS [402]; SGT [403]; Ucon [404]; α-MoRF II [116]; composition profiler [109]; POODLE-L [405]; POODLE-S [406]; POODLE-W [403]; NORSnet [404]; OnD-CRF [407]; predictor by using bayesian multinomial classifier [408]; DISOclust [409]; Top-IDP [111]; DPROT [410]; hierarchical classifier [411]; MetaPrDOS [412]; MeDor [413]; Draai [414]; CDF-ALL [415]; IUPforest-L [416].

**Figure 4. Binary predictors of intrinsic disorder**
A. CDF analysis. The dashed curve located above the boundary represents the CDF curve of an ordered protein, whereas the solid line located below the boundary corresponds to the CDF curve of an IDP. Here, *δ_i* and *d_j* (where i and j range from 1 to 7) are attributed to the ordered and disordered protein, respectively, and represent the distances of points on the CDF curve from the corresponding boundary points. The averaged distance of a given CDF curve from the boundary line is calculated as $dCDF = \frac{\sum_{i = 1}^{7} \delta_{i}}{7}$ or $dCDF = \frac{\sum_{j = 1}^{7} d_{j}}{7}$. B. CH-plot analysis. The black square located above the boundary corresponds to an ordered protein; the open circle located below the boundary represents a disordered protein. C. CH-CDF analysis. The black square corresponds to the disordered protein DP00124, whereas the open circle represents the ordered protein 1EXP. X-coordinates were calculated as averaged distances of the corresponding CDF curves from the boundary (a positive dCDF distance corresponds to a protein predicted to be ordered by CDF analysis, a negative dCDF distance corresponds to a protein predicted to be disordered by CDF analysis, see plot A). Y-coordinates were obtained as the distances of the points representing the proteins from the boundary in the CH-plot. Positive and negative dCH distances correspond to proteins predicted by the CH-plot to be disordered or ordered, respectively (see plot B).
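The averaged distance in panel A is a plain mean over the seven boundary points. The sketch below assumes the signed distance at each point is the CDF curve value minus the boundary value, so that a curve above the boundary gives a positive dCDF (ordered) and a curve below gives a negative dCDF (disordered), as the caption states. The function name and all numeric values are hypothetical.

```python
def mean_cdf_distance(curve, boundary):
    """Average signed distance of a CDF curve from the boundary points."""
    if not curve or len(curve) != len(boundary):
        raise ValueError("curve and boundary must have the same nonzero length")
    return sum(c - b for c, b in zip(curve, boundary)) / len(curve)


# Hypothetical boundary and curve values at seven points, for illustration only.
boundary = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
ordered_curve = [b + 0.1 for b in boundary]      # lies above the boundary
disordered_curve = [b - 0.2 for b in boundary]   # lies below the boundary

d_ordered = mean_cdf_distance(ordered_curve, boundary)
d_disordered = mean_cdf_distance(disordered_curve, boundary)
print(f"dCDF (ordered example):    {d_ordered:+.2f}")
print(f"dCDF (disordered example): {d_disordered:+.2f}")
```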

**Figure 5. CH-CDF plot for mouse proteins**
The principles of this computational tool are described in the Approach section. Quadrants contain differently disordered proteins: red quadrant contains extended IDPs (predicted to be disordered by CDF and CH-plot analysis), pink quadrant contains native molten globules (predicted to be disordered by CDF and ordered by CH-plot), the blue quadrant contains globular proteins (predicted to be ordered by both CDF and CH-plot analyses), whereas the violet quadrant contains proteins predicted to be ordered by CDF and disordered by CH-plot.
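The quadrant assignment can be expressed as a small decision rule on the pair (dCDF, dCH), using the sign conventions given in the Figure 4 caption: negative dCDF means disordered by CDF, and positive dCH means disordered by CH-plot. This is a sketch under those conventions; the function name, the label strings, and the treatment of a distance of exactly zero as "ordered" are assumptions made for illustration.

```python
def classify_ch_cdf(d_cdf, d_ch):
    """Assign a protein to a CH-CDF quadrant from its two distances."""
    cdf_disordered = d_cdf < 0   # below the CDF boundary
    ch_disordered = d_ch > 0     # positive dCH = disordered by CH-plot

    if cdf_disordered and ch_disordered:
        return "extended IDP"
    if cdf_disordered:
        return "native molten globule"
    if ch_disordered:
        return "ordered by CDF, disordered by CH-plot"
    return "globular"


# Hypothetical (dCDF, dCH) pairs, for illustration only.
examples = [(-0.2, 0.1), (-0.2, -0.1), (0.2, -0.1), (0.2, 0.1)]
labels = [classify_ch_cdf(d_cdf, d_ch) for d_cdf, d_ch in examples]
for pair, label in zip(examples, labels):
    print(pair, "->", label)
```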

**Figure 6. Illustrative examples of ID proteins**
**Top line:** Collapsed (molten globule-like, MG) disorder; Extended (pre-molten globule-like, PMG) disorder; (coil-like, coil) disorder. Ordered globular protein of same length is also shown for comparison. Figure represents model structures of a 100 residue-long polypeptide chain. **Middle line:** Relative hydrodynamic volumes occupied by a 100 residue-long polypeptide chain in these four conformations. **Bottom line:** Relative hydrodynamic volumes occupied by a 500 residue-long polypeptide chain in these four conformations. Spheres in the middle and bottom lines show an increase in the hydrodynamic volume relative to the volume of the corresponding ordered protein.

**Figure 7. Predicted abundance of mostly disordered proteins in several proteomes**
I, *Y. pestis*; II, *E. coli*; III, *A. fulgidus*; IV, *M. thermoautotrophicum*; V, *S. cerevisiae*; VI, *A. thaliana*; VII, *M. musculus*. Analysis was performed by three predictors of mostly disordered proteins: the charge-hydropathy (CH) plot, the cumulative distribution function (CDF) of PONDR^® VL-XT score, and a consensus predictor that combines the CH-plot and CDF predictors. The main point is that eukaryotes appear to contain far more intrinsic disorder as compared to prokaryotes. This amount of predicted disorder has important functional consequences, and so proteomic experiments need to be redesigned to recognize and explore intrinsically disordered proteins.

**Figure 8. Involvement of intrinsic disorder in protein function**
Note that the classical structure-function paradigm cannot describe many of the functions that proteins perform.

**Figure 9. Example of an entropic clock**
**Top panel:** Simplified model of a Shaker-type voltage-gated K⁺ ion channel (blue) with ‘ball and chain’ timing mechanism. The ‘ball and chain’ is comprised of an inactivation, or ball, domain (yellow) that is tethered to the pore assembly by a disordered chain (red) of ∼60 residues. For simplicity, only four of the proposed ten states are shown. The cytoplasmic side of the assembly is oriented downward. A. Closed state prior to membrane depolarization. Note that conformational changes of the pore have sealed the channel and a positive charge on the cytoplasmic side of the pore assembly excludes binding of the ball domain. B. Open state following membrane depolarization. C. After depolarization, the cytoplasmic side of the pore opening assumes a negative charge that facilitates interaction with the positively charged ball domain. D. Inactivation of the channel occurs when the ball domain occludes the pore. The transition from C to D does not involve charge migration and can be modeled as a random walk of the ball domain towards the pore opening. (Portions of figure based on Antz et al. [218]). **Bottom panel:** Schematic presentation of the ‘chain’ length-dependent timing of channel inactivation. Different lengths of the ‘chain’ region of the N-terminal domain result in different rates of channel inactivation [220, 221]: a shorter ‘chain’ causes more rapid inactivation (A), whereas a longer ‘chain’ produces slower inactivation (B). Modified from [225].

**Figure 10. Polymorphism in the bound state**
Comparison of axin and FRAT binding to GSK3β. The binding sites for the axin (383–401) and FRAT (197–222) peptides are co-localized in the C-terminal domain of GSK3β. However, the two peptides have no sequence homology, adopt different conformations in their bound state, and possess different sets of interactions with GSK3β.

**Figure 11. p53 interaction with different binding partners illustrates peculiarities of one-to-many signaling**
A structure versus disorder prediction on the p53 amino acid sequence is shown in the center of the figure (up = disorder, down = order) along with the structures of various regions of p53 bound to fourteen different partners. The prediction of a structured central region flanked by disordered amino and carboxyl termini has been confirmed experimentally for p53. The various regions of p53 are color coded to show their structures in the complex and to map the binding segments to the amino acid sequence. Starting with the p53-DNA complex (top, left, magenta protein, blue DNA), and moving in a clockwise direction, the Protein Data Bank IDs and partner names are given as follows for the fourteen complexes: (1tsr – DNA), (1gzh – 53BP1), (1q2d – gcn5), (3sak – p53 (tetramerization domain)), (1xqh – set9), (1h26 – cyclinA), (1ma3 – sirtuin), (1jsp – CBP bromo domain), (1dt7 – s100ββ), (2h1l – sv40 Large T antigen), (1ycs – 53BP2), (2gs0 – PH), (1ycr – MDM2), and (2b3g – rpa70).

**Figure 12. Examples of binding regions and their positions relative to PONDR^® predicted order**
A. Eukaryotic initiation factor (blue) and the binding region of 4EBP1 (red). B. The PONDR^® VL-XT prediction for 4EBP1 with the binding region designated (blue bar). C. The B (blue) and A (yellow) subunits of calcineurin and the autoinhibitory region of the A subunit (red helix) in the midst of observed disordered sequence (red dashes). D. The PONDR^® VL-XT prediction for the last 121 amino acid residues of the A subunit with the autoinhibitory region indicated (blue bar). Modified from [114].

**Figure 13. Mechanisms of IDP regulation inside the cell**
*Regulation of ordered proteins (i) and IDPs (ii) at the transcriptional level*. mRNAs encoding ordered proteins and IDPs are transcribed with comparable rates; however, IDP-encoding mRNAs are subjected to faster degradation. Therefore, the pool of the IDP-encoding mRNAs is significantly smaller than the number of mRNAs encoding the ordered proteins. *Regulation of ordered proteins (iii) and IDPs (iv) at the translational level*. The biosynthesis of ordered proteins is noticeably faster than that of IDPs. When synthesized, IDPs are either subjected to fast degradation, to various posttranslational modifications, PTMs, (including phosphorylation as shown in the plot), or to binding with specific partners. As a result of slow transcription and fast degradation, the overall level of IDPs inside the cells is lower and their half-lives are generally shorter than those of ordered proteins. However, some IDPs can be present at high quantities and/or for long periods of time due to either specific PTMs or due to the interactions with some specific factors.

**Figure 14. Druggable p53–Mdm2 interaction**
Protein disorder features and small molecule design. The p53 peptide (in color) bound to Mdm2 (PDB 1YCR, in gray scale) is shown in (A). A close-up view of p53 (ribbon) bound to Mdm2 (globular), with the side chains of the p53 residues crucial for the interaction (Phe19, Trp23, Leu26), is shown in (B). Notice that residues Phe19, Trp23 and Leu26 of p53 are pointing into the Mdm2 binding pocket. By comparison, the small molecule nutlin-2, designed to mimic the side chains of these p53 residues, is shown in (C). The PONDR^® VL-XT plot of p53 is shown in (D), which indicates that this fragment of p53 might undergo a disorder-to-order transition upon binding to Mdm2. The purple bar represents the predicted α-MoRF region (α-helical molecular recognition feature) [114, 116], and the hollow box represents the determined binding region; the two are in good agreement. Hydrophobic cluster analysis of the binding region is shown. Figure is modified from [364].

**Figure 15. IDPs as drug targets**
Protein-protein interactions involving α-helical or β-strand portions of the partners are used to design small molecules for cancer drugs. A. A ribbon diagram of the complex of Bcl-xL and a BAK fragment was regenerated from PDB 1BXL. Small molecules were designed based on the 20-residue helix of BAK to inhibit the interaction. B. A ribbon diagram of the complex of MDM2 and a p53 fragment was regenerated from PDB 1YCR. Small molecule inhibitors were designed based on the structure of the helical fragment of p53. C. A ribbon diagram of the complex of IL-2 receptor α and IL-2 was regenerated from PDB 1Z92. Small molecules were designed based on the α-helix portion of IL-2 that interacts with the receptor. D. A ribbon diagram of the complex of β-catenin and T cell factor was regenerated from PDB 1G3J. The structure of β-catenin consists of 12 tri-helical repeats (except repeat 7, which has only two helical units). Small molecules from a natural-product library were screened and a couple of inhibitors were found. However, the binding sites for the small molecule inhibitors were not clear. E. A ribbon diagram of the complex of XIAP and a Smac fragment was regenerated from PDB 1G3F. Small molecule inhibitors were designed based on the β-strand fragment (AVPIAQKSE) of Smac.

Cited by

Variation of free-energy landscape of the p53 C-terminal domain induced by acetylation: Enhanced conformational sampling.
Iida S, Mashimo T, Kurosawa T, Hojo H, Muta H, Goto Y, Fukunishi Y, Nakamura H, Higo J. J Comput Chem. 2016 Dec 5;37(31):2687-2700. doi: 10.1002/jcc.24494. Epub 2016 Oct 13. PMID: 27735058
Genus-specific pattern of intrinsically disordered central regions in the nucleocapsid protein of coronaviruses.
Barik S. Comput Struct Biotechnol J. 2020 Jul 17;18:1884-1890. doi: 10.1016/j.csbj.2020.07.005. eCollection 2020. PMID: 32765822
Analysis of exome sequences with and without incorporating prior biological knowledge.
Namkung J, Raska P, Kang J, Liu Y, Lu Q, Zhu X. Genet Epidemiol. 2011;35 Suppl 1(Suppl 1):S48-55. doi: 10.1002/gepi.20649. PMID: 22128058
Sialylated Glycan Bindings from SARS-CoV-2 Spike Protein to Blood and Endothelial Cells Govern the Severe Morbidities of COVID-19.
Scheim DE, Vottero P, Santin AD, Hirsh AG. Int J Mol Sci. 2023 Dec 1;24(23):17039. doi: 10.3390/ijms242317039. PMID: 38069362. Review.
Highly efficient NMR assignment of intrinsically disordered proteins: application to B- and T cell receptor domains.
Isaksson L, Mayzel M, Saline M, Pedersen A, Rosenlöw J, Brutscher B, Karlsson BG, Orekhov VY. PLoS One. 2013 May 7;8(5):e62947. doi: 10.1371/journal.pone.0062947. Print 2013. PMID: 23667548

References

1. Fischer E. Einfluss der configuration auf die wirkung der enzyme. Ber Dt Chem Ges. 1894;27:2985–2993.
2. Lemieux UR, Spohr U. How Emil Fischer was led to the lock and key concept for enzyme specificity. Adv Carbohydrate Chem Biochem. 1994;50:1–20.
3. Blake CC, Koenig DF, Mair GA, North AC, Phillips DC, Sarma VR. Structure of hen egg-white lysozyme. A three-dimensional Fourier synthesis at 2 Å resolution. Nature. 1965;206:757–761.
4. Kendrew JC, Bodo G, Dintzis HM, Parrish RG, Wyckoff H, Phillips DC. A three-dimensional model of the myoglobin molecule obtained by x-ray analysis. Nature. 1958;181:662–666.
5. Kendrew JC, Dickerson RE, Strandberg BE, Hart RG, Davies DR, Phillips DC, Shore VC. Structure of myoglobin: a three-dimensional Fourier synthesis at 2 Å resolution. Nature. 1960;185:422–427.
