What molecules does DNA interact with?
DNA interacts with other nucleic acids. DNA-DNA interactions play a vital role in a number of processes and are hypothesised to be key to large-scale chromosome organisation1. Turning to RNA, single-stranded RNA (ssRNA) can anneal to its double-stranded DNA template, either co-transcriptionally or post-transcriptionally, creating an R-loop composed of an RNA-DNA hybrid (RDH) duplex and a displaced single-stranded DNA (ssDNA). Roughly 60 % of human genes contain RDH-forming sequences, and these sequences make up around 5 % of the mammalian genome, with functions in a variety of processes including transcription, replication, chromosome segregation, telomere regulation, DNA repair, and DNA methylation2.
Additionally, a substantial proportion of the human genome encodes non-coding RNAs (ncRNAs) instead of proteins. ncRNAs include microRNAs (miRNAs), small interfering RNAs (siRNAs), PIWI-interacting RNAs, and long non-coding RNAs (lncRNAs), all of which play important roles in multiple biological processes, particularly epigenetic regulation3. Nuclear lncRNAs participate in chromatin organisation and transcriptional regulation and act as structural scaffolds that promote further interactions between proteins and nucleic acids4. There is also evidence for sequence-specific interactions of lncRNAs with DNA via triple-helix (triplex) formation, in which double-stranded DNA binds a third, single-stranded nucleic acid in its major groove. lncRNA-DNA triplexes have been shown to recruit protein complexes, such as coactivators or corepressors, to specific genomic regions and thereby regulate transcription.
As for many other downstream methods, accurate DNA quantification is an important prerequisite for interaction studies.
As an example of triplex-mediated regulation, a nucleolar lncRNA can form a triplex with a ribosomal DNA (rDNA) promoter that is then recognised by the DNA methyltransferase DNMT3B, which methylates rDNA promoters and represses rDNA transcription. Further studies have demonstrated that specific proteins can interact with triplexes in vivo; for example, helicases such as RecQ can unwind triplex structures, suggesting a potential regulatory role for triplexes5.
Protein-DNA interactions are also fundamental to almost all biological processes in eukaryotes, from controlling the organisation of DNA and chromatin to transcription, DNA repair, and replication. Understanding how proteins interact with DNA, which proteins are involved, and which nucleic acid sequences they bind is key to understanding how these complexes affect biological pathways. DNA-binding proteins include structural molecules, transcription factors, polymerases, nucleases, and proteins that help repair breaks in the DNA double helix. DNA-histone interactions are especially important, as they condense and structure DNA and so shape chromatin structure and gene regulation. The human genome contains approximately three billion base pairs, and because most cells carry two copies of it, each cell contains around two metres of DNA. Without proteins and other molecules to package this DNA, it could not fit inside the cell nucleus. The positively charged histones bind strongly to negatively charged DNA to form nucleosome complexes that fold into chromatin fibres. These are further compressed and folded, before being tightly coiled into a pair of chromatids that form a chromosome. DNA-histone interactions can also be modified by acetylation or methylation: acetylation loosens the winding of DNA, increasing access to allow transcription, whereas methylation can prevent transcription factors from binding, leading to gene silencing6.
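The "around two metres" figure can be checked with a quick back-of-the-envelope calculation. The sketch below is illustrative only and rests on two assumptions that are not stated in the text: an axial rise of about 0.34 nm per base pair (the textbook value for B-form DNA) and two genome copies per diploid cell. The function and constant names are made up for this example.

```python
# Rough estimate of the total DNA length in one human cell.
# Assumptions (not from the article): ~0.34 nm rise per base pair
# for B-form DNA, and two genome copies per diploid cell.

BASE_PAIRS_PER_GENOME = 3_000_000_000  # approximate figure quoted in the text
RISE_PER_BASE_PAIR_NM = 0.34           # assumed B-DNA axial rise
COPIES_PER_DIPLOID_CELL = 2            # assumed: one maternal, one paternal copy


def dna_length_metres(base_pairs, rise_nm=RISE_PER_BASE_PAIR_NM, copies=1):
    """Return the end-to-end length of fully extended DNA in metres."""
    nanometres = base_pairs * rise_nm * copies
    return nanometres * 1e-9  # convert nm to m


per_genome = dna_length_metres(BASE_PAIRS_PER_GENOME)
per_cell = dna_length_metres(BASE_PAIRS_PER_GENOME, copies=COPIES_PER_DIPLOID_CELL)

print(f"One genome copy: about {per_genome:.2f} m")
print(f"One diploid cell: about {per_cell:.2f} m")
```

Under these assumptions a single genome copy comes to roughly one metre and a diploid cell to roughly two, which is consistent with the figure above and shows why tight packaging by histones is needed.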
How do DNA-protein interactions occur?
DNA-protein interactions are mediated either by direct contact between the base pairs of DNA and specific amino acids in the protein, or by indirect contact facilitated predominantly by water molecules and conformational changes in the DNA structure. Proteins bind to DNA through electrostatic interactions (salt bridges), dipolar interactions (hydrogen bonding), entropic effects (hydrophobic interactions) and dispersion forces (base stacking). The balance of these forces determines whether a protein binds in a tight, sequence-specific manner or through a loose, non-specific interaction7. The affinity and specificity of a particular protein-nucleic acid interaction can also be increased through multi-protein complex formation or oligomerisation. During binding, the conformation of both the protein and the DNA can be altered, which can enhance the binding of other proteins. Such changes include shifts in protein side-chain position and local refolding, as well as bending of the DNA backbone or local untwisting of the helix.
Specific DNA interactions
Double-stranded DNA has a highly negatively charged sugar-phosphate backbone, with a core of stacked base pairs whose edges are exposed in the major and minor grooves. Every DNA sequence has a unique chemical signature defined by the functional groups on each base. Proteins recognise this chemical pattern, together with sequence-dependent variations in DNA structure and flexibility, when they bind. Most sequence-specific DNA-binding proteins recognise and bind their target DNA sequence with high affinity, using structural domains to make sequence-specific contacts with the DNA bases in the major groove. While there is remarkable structural diversity in DNA-binding folds, a number of common binding motifs can be observed8.
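The idea of a "chemical signature" read in the major groove can be made concrete with a small sketch. It uses the simplified textbook coding of the groups exposed at each base-pair edge (A = hydrogen-bond acceptor, D = hydrogen-bond donor, H = non-polar hydrogen, M = methyl group). This coding is an assumption brought in for illustration rather than something stated above, and the dictionary and function names are hypothetical.

```python
# Simplified textbook coding of the groups exposed at each base-pair edge.
# A = H-bond acceptor, D = H-bond donor, H = non-polar hydrogen, M = methyl.
# This table is an illustrative assumption, not taken from the article.
MAJOR_GROOVE = {"A-T": "ADAM", "T-A": "MADA", "G-C": "AADH", "C-G": "HDAA"}
MINOR_GROOVE = {"A-T": "AHA", "T-A": "AHA", "G-C": "ADA", "C-G": "ADA"}

# Map a base on the reading strand to its base pair.
BASE_TO_PAIR = {"A": "A-T", "T": "T-A", "G": "G-C", "C": "C-G"}


def all_pairs_distinguishable(patterns):
    """True if every base pair presents a different pattern in this groove."""
    return len(set(patterns.values())) == len(patterns)


def groove_signature(sequence, patterns):
    """Return the pattern a protein would 'see' along a sequence."""
    return " ".join(patterns[BASE_TO_PAIR[base]] for base in sequence.upper())


print("Major groove distinguishes all four pairs:",
      all_pairs_distinguishable(MAJOR_GROOVE))
print("Minor groove distinguishes all four pairs:",
      all_pairs_distinguishable(MINOR_GROOVE))
print("Major-groove signature of GAT:", groove_signature("GAT", MAJOR_GROOVE))
```

In this simplified picture all four base pairs look different from the major groove, whereas A-T and T-A (and likewise G-C and C-G) look the same from the minor groove, which is one common explanation for why sequence-specific contacts are made mainly in the major groove.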
The most common DNA-binding domains include:
- Zinc finger
- Winged helix
- Leucine zipper