DNase-Seq Analysis Tutorial
DNase hypersensitivity profiling is an assay that takes
advantage of the fact that DNase will cleave DNA at sites of
"open/accessible" chromatin. Most DNA is compacted
into chromatin consisting of DNA tightly wound around
nucleosomes, and is inaccessible to DNase treatment.
As a result, DNase fragments DNA where nucleosomes have been
removed, which is typically near active regulatory regions
(promoters, enhancers, and insulators). DNase is also
capable of 'nicking' DNA, including DNA wrapped around
nucleosomes, such that only one of the strands is actually
cut. Regardless, DNase hypersensitivity profiling has
proven itself to be an invaluable assay for identifying
open chromatin regions without using ChIP-Seq, which
requires an antibody that may bias your peaks toward
regions of a certain type.
DNase-Seq is the application of next-gen sequencing to
this assay.
Two Different Types of DNase-Seq (Important!)
So far, there have been two heavily used
protocols for DNase-Seq. The original method from
the Crawford lab first ligates an adapter to the end of
DNase-cleaved DNA fragments and then sequences into the
fragment, often creating a "tag" in the process. The
2nd method involves extracting DNase-treated DNA and then
size selecting for fragments of ~50-100 bp. In
the 2nd case, the idea is that open chromatin regions are
likely to generate DNase cleavage sites less than 100 bp
apart (creating a sub-100 bp fragment of DNA, say on
either side of a transcription factor binding site), while
nucleosomal DNA will likely produce fragments >150 bp
(the size of a nucleosome). The first method is more
faithful to the recovery of DNase cleavage sites, but the
2nd method shows very robust enrichment at regulatory
elements and looks "cleaner". The catch is that the
'open' regulatory element must be of a certain size and
capable of generating cleavage sites spaced within the
selected size range. I'll refer to these
as the "Crawford method" and "size-selection method" to
keep them straight.
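The size argument above can be sketched in a few lines of Python. This is only an illustration and not part of HOMER: the function names are made up, and calling the 100-150 bp gap "intermediate" is an assumption, since the text only gives the <100 bp and >150 bp cutoffs and the ~50-100 bp selection window.

```python
def classify_fragment(length_bp, open_max=100, nucleosome_min=150):
    """Label a DNase fragment by length using the cutoffs quoted in the text."""
    if length_bp < open_max:
        return "sub-nucleosomal"   # cleavage sites less than 100 bp apart
    if length_bp > nucleosome_min:
        return "nucleosomal"       # larger than the ~150 bp nucleosome size
    return "intermediate"          # assumption: the text does not label this range


def size_select(lengths, low=50, high=100):
    """Keep fragment lengths inside an approximate 50-100 bp selection window."""
    return [n for n in lengths if low <= n <= high]


lengths = [35, 60, 95, 120, 180]   # toy fragment lengths in bp
print([classify_fragment(n) for n in lengths])
print(size_select(lengths))
```

Note that the 35 bp fragment counts as sub-nucleosomal but would still be lost by the size selection, which is one way to picture the catch described above.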
Why is this important? The original Crawford method
measures DNase cleavage sites (and the strand information
is less important), while the size-selection method is a
lot like ChIP-Seq where the regulatory element with
transcription factor binding sites is likely to be on the
fragment of DNA extracted in the size selection
process. In fact, DNase-Seq generated with the
size-selection method should be treated exactly the same
way in HOMER as ChIP-Seq data.
Preprocessing and Mapping
If using Crawford-style DNase-Seq data, you
may need to trim adapter sequences from the 5' and/or 3'
ends of the reads. If you aren't sure, BLAT a couple of
reads against the genome and check how they align.
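As a rough picture of what trimming a 3' adapter means, here is a minimal Python sketch. It is not a HOMER routine: the function name is hypothetical, the adapter sequence is just a placeholder (use the one from your own protocol), and it only handles exact matches, whereas dedicated trimming tools also tolerate sequencing errors.

```python
def trim_3prime_adapter(read, adapter, min_overlap=3):
    """Remove an exact 3' adapter match (full or partial) from a read."""
    # Case 1: the whole adapter occurs somewhere inside the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: the read ends with only the first k bases of the adapter.
    longest = min(len(adapter), len(read)) - 1
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter found: return the read unchanged.
    return read


ADAPTER = "AGATCGGAAG"  # placeholder sequence, not taken from the tutorial
print(trim_3prime_adapter("ACGTACGTAGATCGGAAG", ADAPTER))  # full adapter at the end
print(trim_3prime_adapter("ACGTACGTAGATC", ADAPTER))       # partial adapter at the end
print(trim_3prime_adapter("ACGTACGT", ADAPTER))            # no adapter
```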
Mapping of DNase-Seq reads is a lot like ChIP-Seq. You
could use bowtie or another DNA-based mapping
algorithm. (General Info on
Creating Tag Directories and Quality Control
Creation of a DNase-Seq tag directory works the
same way as with ChIP-Seq or RNA-Seq.
Finding DNase-hypersensitive regions of the genome from
The basic idea behind identifying hypersensitive
regions is to look for locations with high density of
DNase-Seq reads, much like ChIP-Seq peak finding.
If using Crawford-style DNase-Seq, we need to remember
that the strand of the read may or may not help indicate
where the key information (i.e. TF binding sites) is
located. As such, the idea is to shift the read
fragments back so that each is centered on the 5' end of
the read. You can do this with "-style dnase", which
basically sets the options (FILL IN):
findPeaks EScell-DNaseSeq/ -o auto -style
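To make the "centered on the 5' end" idea concrete, here is a small Python sketch comparing it with ChIP-Seq style extension. This is only an illustration of the geometry, not HOMER's implementation; the function names, the 0-based half-open coordinates, and the 100 bp fragment length are all assumptions.

```python
def chip_style_interval(five_prime, strand, fragment_len):
    """Extend a read downstream from its 5' end (0-based, half-open)."""
    if strand == "+":
        return five_prime, five_prime + fragment_len
    # On the minus strand, downstream means toward smaller coordinates.
    return five_prime - fragment_len + 1, five_prime + 1


def centered_interval(five_prime, fragment_len):
    """Center the fragment on the 5' end, so the strand no longer matters."""
    start = five_prime - fragment_len // 2
    return start, start + fragment_len


print(chip_style_interval(1000, "+", 100))  # covers 1000..1099
print(chip_style_interval(1000, "-", 100))  # covers 901..1000
print(centered_interval(1000, 100))         # covers 950..1049 for either strand
```

With centering, reads from both strands at the same cleavage site pile up on the same interval instead of pointing in opposite directions.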
If using Size-Selection style DNase-Seq, use the ChIP-Seq
findPeaks EScell-DNaseSeq/ -o auto -style
Output: Peaks will be centered on regions of highest
DNase-Seq read density.
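The "highest read density" idea can be pictured with a toy example in Python. This is a simplified sketch, not the algorithm findPeaks actually uses: the function name and the 150 bp window are assumptions, and there is no background model or statistical filtering here.

```python
from bisect import bisect_left, bisect_right


def densest_window(positions, window=150):
    """Return (center, count) for the tag position with the most tags nearby."""
    pos = sorted(positions)
    best_center, best_count = None, 0
    half = window // 2
    for center in pos:
        # Count tags within +/- half a window of this candidate center.
        lo = bisect_left(pos, center - half)
        hi = bisect_right(pos, center + half)
        count = hi - lo
        if count > best_count:  # ties keep the leftmost center
            best_center, best_count = center, count
    return best_center, best_count


tags = [100, 5000, 5010, 5020, 5030, 9000]  # toy tag positions on one chromosome
print(densest_window(tags))
```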
Creating UCSC Visualization Files
To visualize DNase-Seq
experiments in the UCSC Genome Browser, we'll run the
makeUCSCfile program.
Example for Crawford style DNase-Seq data:
makeUCSCfile EScell-DNase/ -o auto -style
Example for Size-Selection style DNase-Seq data:
makeUCSCfile EScell-DNase/ -o auto
You can also make 'cleavage-site' tracks at nucleotide
resolution by only visualizing the 5' ends of the
reads. Use the "-style tss" option for that.
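For a sense of what a nucleotide-resolution cleavage track contains, the Python sketch below counts read 5' ends per position. It is an illustration only and says nothing about how makeUCSCfile builds its tracks; the read tuples (chrom, start, end, strand) with 0-based half-open coordinates and the function names are assumptions.

```python
from collections import Counter


def five_prime_end(start, end, strand):
    """Position of the read's 5' base for a 0-based, half-open interval."""
    return start if strand == "+" else end - 1


def cleavage_counts(reads):
    """Count how many reads have their 5' end at each (chrom, position)."""
    counts = Counter()
    for chrom, start, end, strand in reads:
        counts[(chrom, five_prime_end(start, end, strand))] += 1
    return counts


reads = [
    ("chr1", 100, 136, "+"),  # 5' end at 100
    ("chr1", 100, 150, "+"),  # 5' end at 100
    ("chr1", 65, 101, "-"),   # 5' end at 100
    ("chr1", 200, 236, "-"),  # 5' end at 235
]
for (chrom, pos), n in sorted(cleavage_counts(reads).items()):
    print(chrom, pos, n)
```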
You can also use makeBigWig.pl and makeMultiWigHub.pl if
you have a webserver at your disposal to host the
resulting bigWig files. Each has an option
called '-dnase' for Crawford-style data (no special
option is needed for size-selection style data).
Analysis of DNase-Seq Data
Almost all of the routines
in HOMER dedicated to ChIP-Seq work well with DNase-Seq
data as well.