Single-molecule states link transcription factor binding to gene expression

Abstract

The binding of multiple transcription factors (TFs) to genomic enhancers drives gene expression in mammalian cells (ref. 1). However, the molecular details that link enhancer sequence to TF binding, promoter state and transcription levels remain unclear. Here we applied single-molecule footprinting (refs. 2,3) to measure the simultaneous occupancy of TFs, nucleosomes and other regulatory proteins on engineered enhancer–promoter constructs with variable numbers of TF binding sites for both a synthetic TF and an endogenous TF involved in the type I interferon response. Although TF binding events on nucleosome-free DNA are independent, activation domains recruit cofactors that destabilize nucleosomes, driving observed TF binding cooperativity. Average TF occupancy linearly determines promoter activity, and we decompose TF strength into separable binding and activation terms. Finally, we develop thermodynamic and kinetic models that quantitatively predict both the enhancer binding microstates and gene expression dynamics. This work provides a template for the quantitative dissection of distinct contributors to gene expression, including TF activation domains, concentration, binding affinity, binding site configuration and recruitment of chromatin regulators.

Fig. 1: SMF reveals occupancy and promoter state at an engineered reporter.
Fig. 2: Thermodynamic models reveal rTetR–VP48 competes with nucleosomes with the aid of its AD.
Fig. 3: Average rTetR–VP48 occupancy predicts nucleosome-free promoters and gene expression.
Fig. 4: SMF of a type I interferon response reporter reveals decoupling of accessibility and activation.
Fig. 5: Kinetic modelling reveals timescales for chromatin and gene regulatory changes.

Data availability

All high-throughput sequencing datasets generated in this study are available through the National Center for Biotechnology Information (NCBI) Sequencing Read Archive (BioProject PRJNA1071686) and Gene Expression Omnibus (GSE276513, GSE276514 and GSE276515). Plasmid maps are available via Zenodo at https://doi.org/10.5281/zenodo.13841007 (ref. 82). The GRCh38 human reference genome is available at the NCBI (https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000001405.26/).

Code availability

The synthetic SMF analysis and state-calling software are available via GitHub at https://github.com/GreenleafLab/amplicon-smf and via Zenodo at https://doi.org/10.5281/zenodo.13840888 (ref. 83). Additional scripts for processing RNA-seq and ATAC–seq data, as well as for reproducing the analyses in this manuscript, are available via Zenodo at https://doi.org/10.5281/zenodo.13841007 (ref. 82).

References

  1. Spitz, F. & Furlong, E. E. M. Transcription factors: from enhancer binding to developmental control. Nat. Rev. Genet. 13, 613–626 (2012).

  2. Kelly, T. K. et al. Genome-wide mapping of nucleosome positioning and DNA methylation within individual DNA molecules. Genome Res. 22, 2497–2506 (2012).

  3. Krebs, A. R. et al. Genome-wide single-molecule footprinting reveals high RNA polymerase II turnover at paused promoters. Mol. Cell 67, 411–422 (2017).

  4. Vierstra, J. et al. Global reference mapping of human transcription factor footprints. Nature 583, 729–736 (2020).

  5. Wunderlich, Z. & Mirny, L. A. Different gene regulation strategies revealed by analysis of binding motifs. Trends Genet. 25, 434–440 (2009).

  6. Giniger, E. & Ptashne, M. Cooperative DNA binding of the yeast transcriptional activator GAL4. Proc. Natl Acad. Sci. USA 85, 382–386 (1988).

  7. Pettersson, M. & Schaffner, W. Synergistic activation of transcription by multiple binding sites for NF-kappa B even in absence of co-operative factor binding to DNA. J. Mol. Biol. 214, 373–380 (1990).

  8. Thanos, D. & Maniatis, T. Virus induction of human IFN beta gene expression requires the assembly of an enhanceosome. Cell 83, 1091–1100 (1995).

  9. Mirny, L. A. Nucleosome-mediated cooperativity between transcription factors. Proc. Natl Acad. Sci. USA 107, 22534–22539 (2010).

  10. Biddie, S. C. et al. Transcription factor AP1 potentiates chromatin accessibility and glucocorticoid receptor binding. Mol. Cell 43, 145–155 (2011).

  11. Fryer, C. J. & Archer, T. K. Chromatin remodelling by the glucocorticoid receptor requires the BRG1 complex. Nature 393, 88–91 (1998).

  12. Herschlag, D. & Johnson, F. B. Synergism in transcriptional activation: a kinetic view. Genes Dev. 7, 173–179 (1993).

  13. Martinez-Corral, R. et al. Transcriptional kinetic synergy: a complex landscape revealed by integrating modeling and synthetic biology. Cell Syst. 14, 324–339 (2023).

  14. Shipony, Z. et al. Long-range single-molecule mapping of chromatin accessibility in eukaryotes. Nat. Methods 17, 319–327 (2020).

  15. Stergachis, A. B., Debo, B. M., Haugen, E., Churchman, L. S. & Stamatoyannopoulos, J. A. Single-molecule regulatory architectures captured by chromatin fiber sequencing. Science 368, 1449–1454 (2020).

  16. Sönmezer, C. et al. Molecular co-occupancy identifies transcription factor binding cooperativity in vivo. Mol. Cell 81, 255–267 (2021).

  17. Levo, M. et al. Systematic investigation of transcription factor activity in the context of chromatin using massively parallel binding and expression assays. Mol. Cell 65, 604–617 (2017).

  18. Bintu, L. et al. Transcriptional regulation by the numbers: models. Curr. Opin. Genet. Dev. 15, 116–124 (2005).

  19. Gossen, M. et al. Transcriptional activation by tetracyclines in mammalian cells. Science 268, 1766–1769 (1995).

  20. Vaisvila, R. et al. Enzymatic methyl sequencing detects DNA methylation at single-base resolution from picograms of DNA. Genome Res. 31, 1280–1289 (2021).

  21. Ackers, G. K., Johnson, A. D. & Shea, M. A. Quantitative model for gene regulation by lambda phage repressor. Proc. Natl Acad. Sci. USA 79, 1129–1133 (1982).

  22. Kim, H. D. & O’Shea, E. K. A quantitative model of transcription factor-activated gene expression. Nat. Struct. Mol. Biol. 15, 1192–1198 (2008).

  23. Neely, K. E. et al. Activation domain-mediated targeting of the SWI/SNF complex to promoters stimulates transcription from nucleosome arrays. Mol. Cell 4, 649–655 (1999).

  24. Yudkovsky, N., Logie, C., Hahn, S. & Peterson, C. L. Recruitment of the SWI/SNF chromatin remodeling complex by transcriptional activators. Genes Dev. 13, 2369–2374 (1999).

  25. Neely, K. E., Hassan, A. H., Brown, C. E., Howe, L. & Workman, J. L. Transcription activator interactions with multiple SWI/SNF subunits. Mol. Cell. Biol. 22, 1615–1625 (2002).

  26. Papillon, J. P. N. et al. Discovery of orally active inhibitors of Brahma homolog (BRM)/SMARCA2 ATPase activity for the treatment of Brahma related gene 1 (BRG1)/SMARCA4-mutant cancers. J. Med. Chem. 61, 10155–10172 (2018).

  27. Martin, B. J. E. et al. Global identification of SWI/SNF targets reveals compensation by EP400. Cell https://doi.org/10.1016/j.cell.2023.10.006 (2023).

  28. Kundu, T. K. et al. Activator-dependent transcription from chromatin in vitro involving targeted histone acetylation by p300. Mol. Cell 6, 551–561 (2000).

  29. Alerasool, N., Leng, H., Lin, Z.-Y., Gingras, A.-C. & Taipale, M. Identification and functional characterization of transcriptional activators in human cells. Mol. Cell 82, 677–695 (2022).

  30. Rada-Iglesias, A. et al. A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283 (2011).

  31. Brower-Toland, B. et al. Specific contributions of histone tails and their acetylation to the mechanical stability of nucleosomes. J. Mol. Biol. 346, 135–146 (2005).

  32. Lasko, L. M. et al. Discovery of a selective catalytic p300/CBP inhibitor that targets lineage-specific tumours. Nature 550, 128–132 (2017).

  33. Raj, A., Peskin, C. S., Tranchina, D., Vargas, D. Y. & Tyagi, S. Stochastic mRNA synthesis in mammalian cells. PLoS Biol. 4, e309 (2006).

  34. Suter, D. M. et al. Mammalian genes are transcribed with widely different bursting kinetics. Science 332, 472–474 (2011).

  35. Chong, S., Chen, C., Ge, H. & Xie, X. S. Mechanism of transcriptional bursting in bacteria. Cell 158, 314–326 (2014).

  36. Xiao, J. Y., Hafner, A. & Boettiger, A. N. How subtle changes in 3D structure can create large changes in transcription. eLife 10, e64320 (2021).

  37. Zuin, J. et al. Nonlinear control of transcription through enhancer–promoter interactions. Nature 604, 571–577 (2022).

  38. Gossen, M. & Bujard, H. Tight control of gene expression in mammalian cells by tetracycline-responsive promoters. Proc. Natl Acad. Sci. USA 89, 5547–5551 (1992).

  39. Kessler, D. S., Veals, S. A., Fu, X. Y. & Levy, D. E. Interferon-alpha regulates nuclear translocation and DNA-binding affinity of ISGF3, a multimeric transcriptional activator. Genes Dev. 4, 1753–1765 (1990).

  40. Lazear, H. M., Schoggins, J. W. & Diamond, M. S. Shared and distinct functions of type I and type III interferons. Immunity 50, 907–923 (2019).

  41. Platanitis, E. et al. A molecular switch from STAT2-IRF9 to ISGF3 underlies interferon-induced gene transcription. Nat. Commun. 10, 2921 (2019).

  42. Rengachari, S. et al. Structural basis of STAT2 recognition by IRF9 reveals molecular insights into ISGF3 function. Proc. Natl Acad. Sci. USA 115, E601–E609 (2018).

  43. Bluyssen, H. A. & Levy, D. E. Stat2 is a transcriptional activator that requires sequence-specific contacts provided by stat1 and p48 for stable interaction with DNA. J. Biol. Chem. 272, 4600–4605 (1997).

  44. Patel, M. C. et al. BRD4 coordinates recruitment of pause release factor P-TEFb and the pausing complex NELF/DSIF to regulate transcription elongation of interferon-stimulated genes. Mol. Cell. Biol. 33, 2497–2507 (2013).

  45. Cui, K. et al. The chromatin-remodeling BAF complex mediates cellular antiviral activities by promoter priming. Mol. Cell. Biol. 24, 4476–4486 (2004).

  46. Manry, J. et al. Evolutionary genetic dissection of human interferons. J. Exp. Med. 208, 2747–2759 (2011).

  47. Krause, C. D. & Pestka, S. Cut, copy, move, delete: the study of human interferon genes reveal multiple mechanisms underlying their evolution in amniotes. Cytokine 76, 480–495 (2015).

  48. Arimoto, K.-I., Miyauchi, S., Stoner, S. A., Fan, J.-B. & Zhang, D.-E. Negative regulation of type I IFN signaling. J. Leukoc. Biol. https://doi.org/10.1002/JLB.2MIR0817-342R (2018).

  49. Mostafavi, S. et al. Parsing the interferon transcriptional network and its disease associations. Cell 164, 564–578 (2016).

  50. Dogan, N. et al. Occupancy by key transcription factors is a more accurate predictor of enhancer activity than histone modifications or chromatin accessibility. Epigenetics Chromatin 8, 16 (2015).

  51. Corces, M. R. et al. Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution. Nat. Genet. 48, 1193–1203 (2016).

  52. Weirauch, M. T. et al. Evaluation of methods for modeling transcription factor sequence specificity. Nat. Biotechnol. 31, 126–134 (2013).

  53. Lee, D. Y., Hayes, J. J., Pruss, D. & Wolffe, A. P. A positive role for histone acetylation in transcription factor access to nucleosomal DNA. Cell 72, 73–84 (1993).

  54. Struhl, K. Histone acetylation and transcriptional regulatory mechanisms. Genes Dev. 12, 599–606 (1998).

  55. Narita, T. et al. Enhancers are activated by p300/CBP activity-dependent PIC assembly, RNAPII recruitment, and pause release. Mol. Cell 81, 2166–2182 (2021).

  56. Ferrie, J. J. et al. p300 is an obligate integrator of combinatorial transcription factor inputs. Mol. Cell 84, 234–243.e4 (2024).

  57. Kornberg, R. D. & Lorch, Y. Irresistible force meets immovable object: transcription and the nucleosome. Cell 67, 833–836 (1991).

  58. Boeger, H. Kinetic proofreading. Annu. Rev. Biochem. 91, 423–447 (2022).

  59. Wong, F. & Gunawardena, J. Gene regulation in and out of equilibrium. Annu. Rev. Biophys. 49, 199–226 (2020).

  60. Shelansky, R. & Boeger, H. Nucleosomal proofreading of activator-promoter interactions. Proc. Natl Acad. Sci. USA 117, 2456–2461 (2020).

  61. Mahdavi, S., Salmon, G. L., Daghlian, P., Garcia, H. G. & Phillips, R. Flexibility and sensitivity in gene regulation out of equilibrium. Preprint at bioRxiv https://doi.org/10.1101/2023.04.11.536490 (2023).

  62. Guharajan, S., Parisutham, V. & Brewster, R. C. Probing the dependence of transcription factor regulatory modes on promoter features. Preprint at bioRxiv https://doi.org/10.1101/2024.05.30.596689 (2024).

  63. Vaisvila, R. et al. Discovery of cytosine deaminases enables base-resolution methylome mapping using a single enzyme. Mol. Cell 84, 854–866 (2024).

  64. He, R. et al. Human transcription factor combinations mapped by footprinting with deaminase. Preprint at bioRxiv https://doi.org/10.1101/2024.06.14.599019 (2024).

  65. Policarpi, C., Munafò, M., Tsagkris, S., Carlini, V. & Hackett, J. A. Systematic epigenome editing captures the context-dependent instructive function of chromatin modifications. Nat. Genet. https://doi.org/10.1038/s41588-024-01706-w (2024).

  66. DelRosso, N. et al. Large-scale mapping and mutagenesis of human transcriptional effector domains. Nature 616, 365–372 (2023).

  67. Durrant, M. G. et al. Systematic discovery of recombinases for efficient integration of large DNA sequences into the human genome. Nat. Biotechnol. 41, 488–499 (2023).

  68. Tycko, J. et al. High-throughput discovery and characterization of human transcriptional effectors. Cell 183, 2020–2035 (2020).

  69. Iurlaro, M. et al. Mammalian SWI/SNF continuously restores local accessibility to chromatin. Nat. Genet. 53, 279–287 (2021).

  70. Weinert, B. T. et al. Time-resolved analysis reveals rapid dynamics and broad scope of the CBP/p300 acetylome. Cell 174, 231–244 (2018).

  71. Teague, B. Cytoflow: a Python toolbox for flow cytometry. Preprint at bioRxiv https://doi.org/10.1101/2022.07.22.501078 (2022).

  72. Pedersen, B. S., Eyring, K., De, S., Yang, I. V. & Schwartz, D. A. Fast and accurate alignment of long bisulfite-seq reads. Preprint at https://arxiv.org/abs/1401.1129 (2014).

  73. Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).

  74. Corces, M. R. et al. An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat. Methods 14, 959–962 (2017).

  75. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).

  76. Schep, A. N., Wu, B., Buenrostro, J. D. & Greenleaf, W. J. chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nat. Methods 14, 975–978 (2017).

  77. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

  78. Marinov, G. K. ChIP-seq for the identification of functional elements in the human genome. Methods Mol. Biol. 1543, 3–18 (2017).

  79. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).

  80. Feng, J., Liu, T., Qin, B., Zhang, Y. & Liu, X. S. Identifying ChIP-seq enrichment using MACS. Nat. Protoc. 7, 1728–1740 (2012).

  81. Marinov, G. K. Identification of candidate functional elements in the genome from ChIP-seq data. Methods Mol. Biol. 1543, 19–43 (2017).

  82. Doughty, B. GreenleafLab/synthetic_enhancer_footprinting_additional_materials: SMF Paper Files. Zenodo https://doi.org/10.5281/zenodo.13841007 (2024).

  83. Doughty, B. GreenleafLab/amplicon-smf: amplicon-smf v1.0.0. Zenodo https://doi.org/10.5281/zenodo.13840888 (2024).

Acknowledgements

We thank B. Liu for Tn5 and help with ATAC–seq and processing, S. Higashino and S. Allen for keeping our laboratories running, S. Nair for designing background sequence no. 2, C. Ludwig for pCL056 and help with RNA-seq, E. Metzl-Raz for the coffee corner, and all the members of the laboratories of W.J.G., L.B. and J. Engreitz for discussions and feedback. This work was supported by NSF GRFP grants DGE-1656518 (B.R.D., M.M.H. and J.M.S.) and DGE-2146755 (C.R.-M. and Y.T.), the Stanford Interdisciplinary Graduate Fellowship affiliated with Stanford Bio-X (B.R.D. and J.M.S.), the Stanford Bio-X Bowes Fellowship (A.R.T.), the Sarafan Chem-H Chemistry-Biology Interface Training Grant (A.R.T.), the National Institutes of Health (NIH) Training Program grant T32GM145402 (A.R.T.), the Stanford Graduate Fellowship (M.M.H.) and the Stanford VPGE EDGE Fellowship (M.M.H. and C.R.M.). E.M. acknowledges support from the Swedish Research Council (grant 2020-06459), the Foundation Blanceflor and the Science for Life Laboratory. This work was also supported by NIH grants UM1HG009436, R01NS128028, DP1HG013599, R01HL171611, R01HG013317, and P50HG007735 (to W.J.G.), and NIH National Institute of General Medical Sciences R35M128947 (to L.B.). W.J.G. was a Chan Zuckerberg Biohub investigator and is an Arc Innovation Investigator, and acknowledges grants 2017-174468 and 2018-182817 from the Chan Zuckerberg Initiative.

Author information

Contributions

B.R.D., M.M.H. and J.M.S. contributed equally as co-first authors, are listed alphabetically by last name, and have agreed that any author can be listed first in reporting this study. M.M.H., B.R.D., J.M.S., L.B. and W.J.G. conceived of the study. M.M.H., B.R.D. and J.M.S. designed, performed and analysed all SMF experiments, with assistance from G.K.M. and C.R.M. M.M.H. and B.R.D. optimized the SMF assay. J.M.S. and M.M.H. designed and cloned all libraries, and M.M.H., J.M.S., A.R.T. and B.R.D. made cell lines and performed tissue culture. G.K.M. and M.M.H. performed and G.K.M. and B.R.D. analysed ChIP–seq. J.M.S., M.M.H. and A.R.T. performed and J.M.S. analysed flow cytometry. B.R.D., J.M.S. and C.R.M. performed and analysed ATAC–seq, with help from D.D. A.R.T. performed and analysed the RNA-seq experiments. J.M.S. and E.M. purified rTetR and performed electrophoretic mobility shift assays. B.R.D. wrote the SMF processing code and probabilistic binding model, with substantial input from G.K.M., B.E.P. and Y.T. B.R.D. and M.M.H. wrote the simple competition and nucleosome destabilization models, J.M.S. and M.M.H. wrote the additive activation model, and J.M.S. wrote the kinetic model, all with substantial input from L.B. and W.J.G. J.M.S., B.R.D., L.B. and W.J.G. wrote the manuscript, with input from all authors. L.B. and W.J.G. jointly supervised the work.

Corresponding authors

Correspondence to Lacramioara Bintu or William J. Greenleaf.

Ethics declarations

Competing interests

W.J.G. is a consultant and equity holder for 10x Genomics, Guardant Health, Quantapore and Ultima Genomics, co-founder of Protillion Biosciences and is named on patents describing ATAC–seq. L.B. is a co-founder of Stylus Medicine and a member of its scientific advisory board. All other authors declare no competing interests.

Peer review

Peer review information

Nature thanks Steven Hahn, Arjun Raj and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Single-molecule footprinting assay controls and optimizations.

A) Gene expression (MFI) for Citrine for constructs with variable TetO number at 24 and 48 h with dox. B) ChIP-seq signal (reads overlapping the reporter per million mapped reads across the genome) for H3K27ac (left) and Pol II-S5P (right) at the reporter with and without the addition of doxycycline. C) Average methylation on a synthetic amplicon containing a single CTCF site (purple). Error bars represent the 95% CI from three biological replicates. D) Genome-wide relationship between the average ATAC-seq signal (RPM) and average methylation probability from SMF at ATAC-seq peaks. Peaks were grouped into 100 quantile bins (green dots). E) Methylation probabilities at CpGs from the fully methylated (pUC19) and fully unmethylated (Lambda phage) EM-seq controls. F) Percent of pairs of GpCs flanking the same TetO site that have identical methylation status on the same molecule. Error bars represent the 95% CI obtained by averaging all pairs of GpCs flanking all TetOs on all molecules. G) Average rTetR-VP48 occupancy across amplicons for two different lysis and methylation conditions.
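The concordance statistic in panel F can be illustrated with a minimal sketch. It assumes each molecule is stored as a list of 0/1 methylation calls and that the index pairs of GpCs flanking each TetO site are known; the function name `flank_concordance` and this data layout are hypothetical, not taken from the authors' software.

```python
def flank_concordance(molecules, flank_pairs):
    """Percent of flanking GpC pairs with identical methylation status.

    molecules: list of reads, each a list of 0/1 methylation calls
               (hypothetical layout, one entry per GpC).
    flank_pairs: list of (left_index, right_index) tuples, one per TetO site.
    """
    same = 0
    total = 0
    for read in molecules:
        for left, right in flank_pairs:
            total += 1
            # A pair is concordant when both GpCs share the same status
            # on the same molecule.
            if read[left] == read[right]:
                same += 1
    if total == 0:
        raise ValueError("no GpC pairs to compare")
    return 100.0 * same / total
```

A high value would suggest that the two GpCs flanking a site report the same underlying protection event on a molecule.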

Extended Data Fig. 2 Binding heterogeneity, methylation timecourses, and promoter substates.

A) Cartoon schematic of probabilistic binding model. For each read (single-molecule methylation signal), we enumerate all possible underlying molecular configurations of TFs and nucleosomes and assign each a likelihood of being observed. We then select the state with the maximum likelihood. B) The fraction of molecules with a nucleosome overlapping each position in the enhancer with no TetO sites. C) The fraction of molecules that have enhancers either a) completely covered by nucleosomes, b) only bound by TFs, or c) covered with a mix of the two as a function of the number of TetOs. D) Average TF occupancy across amplicons for both rTetR-VP48 and rTetR as a function of the duration of the methylation reaction. Dotted line represents the conditions used in our assay (7.5 min). E) Average methylation across the reporter molecule with six TetO binding sites under three different fixation and methylation conditions: native SMF with 7.5 min methylation time (green), 15 min 1% formaldehyde fixation and 1 h methylation (purple), and 15 min 1% formaldehyde fixation and 6 h methylation (orange). F) Average methylation in the promoter after subclustering the active (nucleosome-free) promoters using k-means. G) Fraction of active promoters that are found in each k-means cluster (above), faceted by the number of TetOs in the enhancer. Error bars represent the 95% CI from four biological replicates.
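The enumerate-and-score idea described in panel A can be sketched as a toy example. This is not the authors' implementation: the methylation probabilities, the footprint layout, the single whole-read nucleosome state and the names `log_likelihood` and `call_state` are all assumptions made for illustration.

```python
import itertools
import math

# Hypothetical toy parameters (not taken from the paper).
P_METH_OPEN = 0.9       # chance an unprotected GpC is methylated
P_METH_PROTECTED = 0.1  # chance a protected GpC is methylated


def log_likelihood(read, protected):
    """Log-likelihood of a 0/1 methylation read given the protected positions."""
    total = 0.0
    for pos, methylated in enumerate(read):
        p = P_METH_PROTECTED if pos in protected else P_METH_OPEN
        total += math.log(p if methylated else 1.0 - p)
    return total


def call_state(read, tf_footprints):
    """Return the maximum-likelihood configuration for one read.

    tf_footprints maps a site name to the set of read positions it protects
    when bound. Candidate states are every subset of bound sites, plus one
    state in which a nucleosome protects every position.
    """
    sites = sorted(tf_footprints)
    candidates = {}
    for bound in itertools.product([False, True], repeat=len(sites)):
        label = tuple(s for s, b in zip(sites, bound) if b)
        protected = set()
        for s in label:
            protected |= tf_footprints[s]
        candidates[("TF", label)] = protected
    candidates[("nucleosome", ())] = set(range(len(read)))
    # Select the configuration that best explains the observed read.
    return max(candidates, key=lambda c: log_likelihood(read, candidates[c]))
```

For instance, with footprints `{"site1": {1, 2}, "site2": {5, 6}}`, a read that is unmethylated only at positions 1 and 2 is assigned to the state with `site1` bound. A fuller model would also let nucleosomes cover partial windows and would weight states by prior probabilities.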

Extended Data Fig. 3 The Nucleosome Destabilization model predicts rTetR-VP48 binding in chromatin.

A) Relationship between average TF occupancy and ATAC-seq signal (RPM) for a selection of single enhancer amplicons from multiple backgrounds. Each dot represents a single enhancer. Error bars represent the 95% CI from four (SMF) or two (ATAC-seq) biological replicates. B) Schematic describing how rTetR-VP48 is installed with PiggyBac transposase and expressed in K562 cells, selected with hygromycin, and assayed for activity with mCherry under a dox-responsive promoter. C) Flow cytometry distributions for mCherry fluorescence with and without dox for two technical replicates. D) Histogram of the predictions of number of TFs bound per molecule for the enhancer with 7 TetO sites from the Simple Competition model (gray) fit to the binding data of rTetR-VP48 and these predictions with Gaussian noise (SNR = 1) added to TF energy (black), effectively varying the concentration of the TF per cell/molecule (see Methods). E) Average rTetR-VP48 occupancy with variable TetO number at 24 and 48 h with dox. F) Probability of each TetO site being occupied on nucleosome-free molecules for the enhancer with six TetO sites (two biological replicates). Dashed line indicates the average occupancy. G) Fraction of molecules with >0 TFs bound for enhancers with increasing TetO number (four biological replicates) fit to either the Simple Competition (r² = 0.44) or Nucleosome Destabilization (r² = 0.99) models. H) Occupancy distributions across all enhancers (left) with matching simulations from the Nucleosome Destabilization model (right). Each row represents a single enhancer and each column represents the number of TFs bound. The pixel intensity at (row i, column j) indicates the fraction of molecules with i TetO sites that have j TFs bound. Rows sum to 1. I) Full molecular state representations for 10,000 measured (left) or simulated (according to the Nucleosome Destabilization model, right) molecules with six TetO sites, where each column represents a TetO site and each row represents a molecule. Sites are colored by their occupancy status. J) Best-fit parameters for binding of rTetR-VP48 using the Nucleosome Destabilization model (four biological replicates). K) Three alternate TF cooperativity models (left) and their fits to average occupancy and occupancy distributions (right) for rTetR-VP48.
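The row-normalized occupancy distribution described in panel H can be built from per-molecule counts along the following lines. This is a minimal sketch: the input layout (a dictionary from TetO number to a list of TFs bound per molecule) and the names `occupancy_matrix` and `mean_occupancy` are assumptions for illustration.

```python
from collections import Counter


def occupancy_matrix(counts_by_enhancer, max_sites):
    """Row-normalized occupancy distributions.

    counts_by_enhancer: {n_sites: [number of TFs bound on each molecule]}
                        (hypothetical input layout; lists must be non-empty).
    Returns {n_sites: row}, where row[j] is the fraction of molecules with
    j TFs bound, so each row sums to 1.
    """
    rows = {}
    for n_sites, bound in counts_by_enhancer.items():
        tally = Counter(bound)
        total = len(bound)
        rows[n_sites] = [tally.get(j, 0) / total for j in range(max_sites + 1)]
    return rows


def mean_occupancy(bound):
    """Average number of TFs bound per molecule."""
    return sum(bound) / len(bound)
```

For example, `occupancy_matrix({2: [0, 0, 1, 2]}, 2)` gives the row `[0.5, 0.25, 0.25]` for the two-site enhancer.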

Extended Data Fig. 4 Chemical inhibitor and activation domain perturbations to rTetR-VP48 binding.

A) Fraction of molecules with >0 TFs bound for rTetR only, rTetR-VP48 treated with BAF inhibitor, and rTetR-VP48 treated with p300 inhibitor. Fits are either to the Simple Competition (rTetR only) or Nucleosome Destabilization (rTetR-VP48 + inhibitors) models (r² = 0.86, 0.96, and 0.96, respectively). Data are from two biological replicates. B) Occupancy distributions across all enhancers (top) with matching simulations from the Nucleosome Destabilization model (bottom) for rTetR only, rTetR-VP48 treated with BAF inhibitor, and rTetR-VP48 treated with p300 inhibitor. C) Full molecular state representations for 10,000 measured (top) or simulated (bottom) molecules for rTetR only, rTetR-VP48 treated with BAF inhibitor, and rTetR-VP48 treated with p300 inhibitor. Each column represents a TetO site and each row represents a molecule. Sites are colored by their occupancy status. D) Flow cytometry for FLAG-tagged TF expression levels for cells expressing rTetR or rTetR-VP48. Dotted line represents the level of background staining. Data are from two biological replicates. E) Average occupancy (top) and occupancy distribution (bottom) of rTetR (two biological replicates) with a fit from the Nucleosome Destabilization model (r² = 0.83 and 0.96, respectively). F) Average TF occupancy across variable TetO number for rTetR-VP48 (left) and rTetR only (right) with and without treatment with BAF inhibitor. Data are from two biological replicates. G) Best-fit parameters from the Nucleosome Destabilization model after treatment with the BAF and p300 inhibitors (two biological replicates). Gray dashed line represents rTetR-VP48 parameter fits for comparison. H) Average occupancy (left) and occupancy distribution (right) of rTetR-VP48 in the presence of the BAF inhibitor BRM014 (two biological replicates) with a fit from the Simple Competition model (r² = 0.78 and 0.51, respectively). I) Fraction of molecules with nucleosomes overlapping the enhancer as a function of the number of TetO sites for rTetR-VP48, rTetR-VP48 treated with BAF inhibitor, and rTetR-VP48 treated with p300 inhibitor. J) Average H3K27ac signal across the synthetic reporter locus with and without p300 inhibition. Data are from two biological replicates, and the signal is normalized to spike-in mouse chromatin.

Extended Data Fig. 5 Active promoter state validation, characterization and alternative models.

A) Relationship between rTetR-VP48 (black) and rTetR (green) occupancy and active promoter fraction. B) Relationship between rTetR-VP48 occupancy and active promoter fraction across transcription inhibition conditions with: no inhibitor (black), 1.5 h flavopiridol (cyan) and 6 h (brown) and 25 h (tan) alpha-amanitin. C) TF occupancy (left) and active promoter fraction (right) across transcription inhibition conditions relative to no inhibitor for constructs with 5–8 TetOs. Alpha-amanitin for 25 h (tan) likely reduces rTetR-VP48 concentration due to global reduction of transcription. D) Relationship and linear fit (black line) between active promoter fraction and RNA expression (RT-qPCR) on constructs with 0–8 TetOs (two technical replicates). E) Schematics of alternative models relating TF binding to promoter activation. Cooperative Activation (left) assumes activation is cooperative with the number of TFs bound on average (kon*avgTFn). Thresholded Activation (right) assumes activation of the promoter occurs when there is at least 1 TF present (kon*frac>=1 TF). F) Model fits for additional models in E on the relationship between average rTetR-VP48 occupancy and active promoter fraction. Fit parameters for cooperative model: kon/koff = 0.13 ± 0.006 TF−1, n = 1.1 ± 0.04. Fit parameter for thresholded model: kon/koff = 0.5 ± 0.1. G) Chi-squared per degree of freedom comparing fits in F. H-I) Instantaneous relationship between the number of rTetR-VP48s bound and fraction promoter active on the same molecule across backgrounds (H) or across TetO number (I). J) Example plots to compare to data in Fig. 3h represent the expected relationship if promoter rates are much faster than TF binding rates (left) or vice versa (right). K-L) EMSA of rTetR (concentration noted above each lane) binding to 60 bp target DNA (1 nM) without a TetO site (left) and with a TetO site (right) in the presence and absence of doxycycline (1:50 rTetR to dox concentration). 
M) EMSA of rTetR (160 nM) binding to 60 bp target DNA (1 nM) with a TetO site across varying dox concentrations (ratio relative to rTetR concentration noted under each lane) (left). K-L) In vitro rTetR:TetO binding from EMSAs varying rTetR concentration in the presence (black) and absence (gray) of dox (K) or varying dox concentration (L) with binding isotherm fits (y = x/(KD+x), where KD is the affinity of rTetR to DNA and the affinity of dox to rTetR is assumed to be much smaller). M) Apparent TF concentration from relative in vivo rTetR-VP48 concentration (from binding energy in Nucleosome Destabilization model) across dox concentrations with binding isotherm fit. N) Flow cytometry distributions for a construct with 7xTetO without inhibitor (gray), with p300 inhibition (cyan) or with BAF inhibition (purple).
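
As an illustration of the binding isotherm y = x/(KD+x) named in this legend, the following minimal Python sketch estimates KD from concentration and fraction-bound pairs by least squares. It is not the authors' fitting code: the function names, the golden-section search, the search bounds and the example concentrations are all assumptions made for illustration, and the data are synthetic values generated from a hypothetical KD of 40 nM.

```python
import math


def binding_isotherm(x, kd):
    """Fraction bound y = x / (KD + x) at concentration x."""
    return x / (kd + x)


def fit_kd(concs, fractions, lo=1e-3, hi=1e6, iters=200):
    """Least-squares KD estimate via golden-section search on log10(KD).

    The bounds lo and hi (same units as concs) are illustrative assumptions.
    """
    def sse(log_kd):
        kd = 10.0 ** log_kd
        return sum((binding_isotherm(x, kd) - y) ** 2
                   for x, y in zip(concs, fractions))

    a, b = math.log10(lo), math.log10(hi)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - phi * (b - a)
    d = a + phi * (b - a)
    for _ in range(iters):
        # Keep the sub-interval that still brackets the minimum.
        if sse(c) < sse(d):
            b = d
        else:
            a = c
        c = b - phi * (b - a)
        d = a + phi * (b - a)
    return 10.0 ** ((a + b) / 2.0)


# Synthetic, noise-free example (hypothetical values, not measured data).
true_kd = 40.0  # nM, assumed for illustration only
concs = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0, 320.0]  # nM
fractions = [binding_isotherm(x, true_kd) for x in concs]
kd_hat = fit_kd(concs, fractions)
print(f"estimated KD = {kd_hat:.3f} nM")
```

Searching on log10(KD) keeps the estimate positive and lets one set of bounds cover several orders of magnitude. With real EMSA quantification, noise and replicate weighting would also need to be handled.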

Extended Data Fig. 6 ISRE reporter footprinting controls, characterization, and cofactor inhibitions.

A) Distributions of the number of ISREs within a 500 base pair window around TSSs of ISGs identified from bulk RNA-seq (pink) and of non-ISGs (gray). B) Relationship between methylation time and measured average wide (left) and narrow (right) footprints across amplicons after 24 h of IFN-β stimulation. Dashed line is the methylation time chosen for all experiments. C) Bulk methylation data for the construct with 4 ISRE sites, with (black) and without (gray) IFN-β. Error bars are standard deviation between two biological replicates. D) Nucleosome occupancy across ISRE number before and after six hours of stimulation (two biological replicates). E) Likelihood of occupancy across six ISRE binding sites in nucleosome-free molecules pre- (black) and post-stimulation (pink) (two biological replicates). The line indicates average occupancy. F) Relationship between ISRE number and active promoter fraction with six hours of IFN-β (two biological replicates). G) Enrichment of active promoter clusters (classified using rTetR-VP48 data) at six hours relative to zero hours of IFN-β (two biological replicates). H) Relationship between narrow footprints and active promoter fraction with (black) and without (gray) 6 h of IFN-β (two biological replicates). I) Relationship between active promoter fraction and RNA expression (RT-qPCR, two technical replicates). Black line is a linear fit. J) Relationship between ISRE copy number and gene expression (RT-qPCR, two technical replicates). Black line is the coupled Additive Activation model and linear fit from fraction promoters active to gene expression with input of measured average wide footprints. K) Relationship between ISRE number and the fold change in average wide footprints with six hours IFN-β and BAF inhibition (purple) or p300 inhibition (cyan) relative to without inhibitors.
L) Relationship between ISRE number and gene expression (as measured by flow cytometry at 24 h of IFN-β) with BAF inhibition (purple), p300 inhibition (cyan), and without drug (black) (two technical replicates).

Extended Data Fig. 7 Nucleosome destabilization model fits pre-stimulation ISRE binding best and validation with cofactor inhibitions.

A) Relationship between ISRE number and average narrow footprints for the average of two biological replicates (left) and the distribution of narrow footprints bound for 4x ISRE sites (right) prior to stimulation fit by the Simple Competition model (black) and Nucleosome Displacement model (green). For both models, the nucleosome energy is fixed to the fit value in rTetR-VP48 experiments. B-C) Goodness of fit of number bound distributions in A by r² across ISRE copy numbers (B) and chi-squared per degree of freedom (C). D) Fit parameters (averaged across two biological replicates) from Nucleosome Displacement model (gray bars); black dots are rTetR-VP48 parameters. E) Relationship between ISRE number and the average narrow footprints pre-stimulation under conditions of BAF inhibition (purple) or p300 inhibition (cyan). Dotted line is data without inhibitors. F) Relationship in E shown as a fold change compared to no inhibitor across constructs with three to six ISREs.
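
The two goodness-of-fit measures named in panels B-C can be sketched as below. This is a generic illustration, not the authors' analysis code: the function names, the per-point uncertainties `sigma`, the parameter count and the example numbers are assumptions, and the legend does not state how uncertainties or degrees of freedom were defined.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def reduced_chi_squared(observed, predicted, sigma, n_params):
    """Chi-squared divided by degrees of freedom (points minus fit parameters)."""
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more data points than fit parameters")
    chi2 = sum(((o - p) / s) ** 2
               for o, p, s in zip(observed, predicted, sigma))
    return chi2 / dof


# Hypothetical number-bound distribution and model prediction (not real data).
observed = [0.10, 0.30, 0.40, 0.20]
predicted = [0.12, 0.28, 0.38, 0.22]
sigma = [0.02, 0.02, 0.02, 0.02]  # assumed per-bin uncertainty

r2 = r_squared(observed, predicted)
chi2_dof = reduced_chi_squared(observed, predicted, sigma, n_params=2)
print(f"r^2 = {r2:.3f}, chi-squared per degree of freedom = {chi2_dof:.2f}")
```

Dividing chi-squared by the degrees of freedom penalizes models with more free parameters, which is why it suits comparisons between models of different complexity better than r² alone.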

Extended Data Fig. 8 ATAC-seq and RNA-seq of IFN-β response across time and cofactor inhibitions at endogenous ISGs.

A) Differential ATAC-seq accessibility of TF motif types after IFN-β stimulation (two biological replicates). B) Differential RNA-seq expression after IFN-β (two biological replicates). C) Average ATAC-seq accessibility relative to ISREs genome-wide, Tn5-bias corrected and RPM normalized for samples with (red) and without (pink) IFN-β (two biological replicates). D) Average ATAC-seq accessibility with (red) and without (pink) IFN-β relative to the TSSs of genes upregulated with stimulation (identified by RNA-seq) that contain at least three ISREs within the 500 bp window. Gray line is average ATAC-seq accessibility for non-ISG promoters that are matched to the pre-stimulation expression level (TPM) of plotted ISG promoters. Shaded error regions are 95% confidence intervals from bootstrapping (two biological replicates). E) Average ATAC-seq accessibility as in D with no inhibitor (gray), BAF inhibition for 12 h (purple), and p300 inhibition for 24 h (cyan). F) Average ATAC-seq accessibility as in D for conditions in E after IFN-β addition. G) Average ATAC-seq accessibility relative to ISREs genome-wide as in C for conditions in F. H) ATAC-seq tracks of normalized chromatin accessibility without (top) and with (bottom) IFN-β with BAF inhibition for 12 h (purple), p300 inhibition for 24 h (cyan), and no inhibitor (black) for IFI6, ISG15, and USP18. I) RNA expression of endogenous ISGs before (pink) and after (red) IFN-β (two biological replicates). J) ATAC-seq tracks of normalized chromatin accessibility over an IFN-β timecourse for IFI6, ISG15, and USP18. Black lines denote the region that was installed at the reporter.

Extended Data Fig. 9 Footprinting and expression measurements of ISG regulatory elements installed at the reporter.

A) Reporter with ISG15 and USP18 promoter containing intact and scrambled ISREs replacing minCMV (inset). Average methylation pre- (light) and post- (dark) IFN-β for intact (red) and scrambled (gray) ISREs (two biological replicates and two scrambles). Dashed line set by construct without ISREs. B) Gene expression (MFI) of reporters as in A. C) Aggregated methylation data obtained for reporter constructs with endogenous ISG promoters and proximal enhancers present (IFI6, ISG15, USP18) with intact (top) and scrambled (bottom) ISREs with (pink) and without (black) IFN-β. D) Gene expression (flow cytometry) of ISG promoter reporters after 0, 8, and 24 h of IFN-β with BAF inhibition (purple), p300 inhibition (cyan), and no inhibitor present (gray). Black dashed line is the MFI of WT K562s.

Extended Data Fig. 10 Activation kinetics and model fits across rTetR-VP48 and ISGF3 reporters.

A) rTetR-VP48 occupancy without (black) and with (gray) dox during sample processing across TetO numbers (two biological replicates). B) Representative temporal delay between rTetR-VP48 occupancy (blue) and promoter activation (black) for constructs with six and eight TetOs (two biological replicates). C) Potency (kon/koff) fit from Additive Activation Model across dox timecourse. Error bars are standard deviations. D) Measured data (top) and fit kinetic models (bottom) for TF occupancy (r² = 0.99), promoter activation (r² = 0.90), RNA (r² = 0.95), and protein levels (r² = 0.92) over time and across TetO number. E-F) Data and fits (lines) for average TF occupancy (E) and active promoter fraction (F) over time for the construct with seven TetOs without inhibitor (black), with p300 inhibition (cyan), and with BAF inhibition (purple) (two biological replicates). G) Potency (kon/koff) fit from Additive Activation Model for conditions in E-F. Error bars are standard deviations. H) Representative lack of temporal delay between wide footprints (pink) and promoter activation (black) for constructs with four and six ISREs (two biological replicates). I) RNA expression as measured by RT-qPCR over time (two technical replicates) for variable ISRE numbers. J) The relationship between average wide footprints and active promoter fraction across IFN-β activation (two biological replicates).

Supplementary information

Supplementary Information

Supplementary Notes 1 and 2 and Figs. 1–7.

Reporting Summary

Peer Review File

Supplementary Tables 1–6

Supplementary tables and legends.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Doughty, B.R., Hinks, M.M., Schaepe, J.M. et al. Single-molecule states link transcription factor binding to gene expression. Nature (2024). https://doi.org/10.1038/s41586-024-08219-w

  • DOI: https://doi.org/10.1038/s41586-024-08219-w
