Software for motif discovery and next-gen sequencing analysis

DNase-Seq Analysis Tutorial

DNase hypersensitivity profiling is an assay that takes advantage of the fact that DNase will cleave DNA at sites of "open/accessible" chromatin.  Most DNA is compacted into chromatin, consisting of DNA tightly wound around nucleosomes, and is inaccessible to DNase treatment.  As a result, DNase fragments DNA where nucleosomes have been removed, which is typically near active regulatory regions (promoters, enhancers, and insulators).  DNase is also capable of 'nicking' DNA, including DNA wrapped around nucleosomes, such that only one of the strands is actually cut.  Regardless, DNase profiling has proven itself an invaluable assay for identifying open chromatin regions without using ChIP-Seq, which requires an antibody that may bias your peaks toward regions of a certain type.  DNase-Seq is the application of next-gen sequencing to DNase-treated DNA.

Two Different Types of DNase-Seq (Important!)

So far, there have been two heavily used protocols for DNase-Seq.  The original method from the Crawford lab first ligates an adapter to the end of DNase-cleaved DNA fragments and then sequences into the fragment, often creating a "tag" in the process.  The second method involves extracting DNase-treated DNA and then size selecting for fragments of ~50-100 bp.  In the second case, the idea is that open chromatin regions are likely to generate DNase cleavage sites less than 100 bp apart (creating a sub-100 bp fragment of DNA, say with a cut on either side of a transcription factor binding site), while nucleosomal DNA will likely produce fragments >150 bp (the size of a nucleosome).  The first method is more faithful to the recovery of DNase cleavage sites, but the second method shows very robust enrichment at regulatory elements and looks "cleaner".  That comes with the catch that the 'open' regulatory element must be capable of generating cleavage sites spaced within the selected size range.  I'll refer to these as the "Crawford method" and the "size-selection method" to keep them straight.

Why is this important?  The original Crawford method measures DNase cleavage sites (and the strand information is less important), while the size-selection method is a lot like ChIP-Seq where the regulatory element with transcription factor binding sites is likely to be on the fragment of DNA extracted in the size selection process.  In fact, DNase-Seq generated with the size-selection method should be treated exactly the same way in HOMER as ChIP-Seq data.

Preprocessing and Mapping

If using Crawford-style DNase-Seq data, you may need to trim adapter sequences from the 5' and/or 3' ends of the reads.  If you don't know whether your reads contain adapter sequence, BLAT a couple of them to the genome and check which part of each read aligns.
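To make the trimming step concrete, here is a minimal sketch that removes an exact-match 3' adapter (and anything after it) from FASTQ records using awk.  The adapter sequence and the read are made up for illustration, and a dedicated trimming tool will handle mismatches and partial adapters far better than this does.

```shell
# Hypothetical 3' adapter; substitute the sequence from your own library prep.
ADAPTER="TCGTATGCCG"

# Trim an exact-match 3' adapter (and everything after it) from each FASTQ record.
# Reads FASTQ on stdin, writes trimmed FASTQ on stdout.
trim_3prime_adapter() {
  awk -v adapter="$1" '
    NR % 4 == 2 {                               # sequence line
      pos  = index($0, adapter)                 # 0 if the adapter is absent
      keep = (pos > 0) ? pos - 1 : length($0)   # bases to keep
      print substr($0, 1, keep); next
    }
    NR % 4 == 0 { print substr($0, 1, keep); next }   # quality line, same length
    { print }                                   # header and "+" lines unchanged
  '
}

# Made-up example record: 8 genomic bases, the adapter, then 2 extra bases.
printf '@read1\nACGTACGTTCGTATGCCGAA\n+\nIIIIIIIIIIIIIIIIIIII\n' |
  trim_3prime_adapter "$ADAPTER"
```

The quality string is cut to the same length as the sequence so the record stays valid FASTQ.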

Mapping of DNase-Seq reads is a lot like ChIP-Seq. You could use bowtie or another DNA-based mapping algorithm.  (General Info on Mapping)

Creating Tag Directories and Quality Control

Creation of a DNase-Seq tag directory works the same way as with ChIP-Seq or RNA-Seq.
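As a rough sketch, the mapping and tag directory steps might look like the following.  It assumes single-end reads mapped with bowtie2 (one choice among many aligners); the index name and file names are hypothetical.  The script only prints the commands so you can review them before running anything.

```shell
# Hypothetical inputs: adjust the index name and file names for your data.
INDEX="mm9"                      # bowtie2 index prefix (assumed to already exist)
READS="EScell-DNaseSeq.fastq"    # single-end reads (assumed file name)
SAM="EScell-DNaseSeq.sam"        # alignment output
TAGDIR="EScell-DNaseSeq/"        # HOMER tag directory to create

# Step 1: map reads to the genome (-p threads, -x index, -U reads, -S SAM output).
map_cmd="bowtie2 -p 4 -x $INDEX -U $READS -S $SAM"

# Step 2: build the tag directory from the alignment file.
tag_cmd="makeTagDirectory $TAGDIR $SAM"

# Print the steps for review; run the printed lines yourself once they look right.
echo "$map_cmd"
echo "$tag_cmd"
```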

Finding DNase-hypersensitive regions of the genome from DNase-Seq Data

The basic idea behind identifying hypersensitive regions is to look for locations with high density of DNase-Seq reads, much like ChIP-Seq peak finding.

If using Crawford-style DNase-Seq, we need to remember that the strand of the read may or may not help indicate where the key information (i.e. TF binding sites) is located.  As such, the idea is to back up the read fragments so that each one is centered right on the 5' end of its read.  You can do this with "-style dnase", which basically sets the options (FILL IN):
findPeaks EScell-DNaseSeq/ -o auto -style dnase
If using Size-Selection style DNase-Seq, use the ChIP-Seq options:
findPeaks EScell-DNaseSeq/ -o auto -style factor
Output: Peaks will be centered on regions of highest DNase-Seq read density.
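If you process both kinds of data, a small helper can keep the two cases straight.  This is only a sketch: the protocol labels "crawford" and "size-selection" are names made up for this example, while the -style values are the ones used in the commands above.  It prints the findPeaks command rather than running it.

```shell
# Map a DNase-Seq protocol label to the findPeaks style used in this tutorial.
peak_style() {
  case "$1" in
    crawford)       echo "dnase"  ;;   # reads mark cleavage sites
    size-selection) echo "factor" ;;   # treat like ChIP-Seq
    *) echo "unknown protocol: $1" >&2; return 1 ;;
  esac
}

PROTOCOL="crawford"          # hypothetical setting for this run
TAGDIR="EScell-DNaseSeq/"

# Print the command only if the protocol label was recognized.
style=$(peak_style "$PROTOCOL") && echo "findPeaks $TAGDIR -o auto -style $style"
```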

Creating UCSC Visualization Files

To visualize DNase-Seq experiments in the UCSC Genome Browser, we'll run the makeUCSCfile command (more info here).

Example for Crawford style DNase-Seq data:

makeUCSCfile EScell-DNase/ -o auto -style dnase

Example for Size-Selection style DNase-Seq data:
makeUCSCfile EScell-DNase/ -o auto

You can also make 'cleavage-site' tracks at nucleotide resolution by only visualizing the 5' ends of the reads.   Use the "-style tss" option for that.
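The three track types above can be summarized in one small sketch.  The track-type labels (crawford, size-selection, cleavage) are made up for this example; the options are the ones given in the text.  The function only prints each makeUCSCfile command.

```shell
# Build a makeUCSCfile command for one of the track types discussed above.
# Usage: ucsc_cmd <tag directory> <track type>
ucsc_cmd() {
  tagdir="$1"
  case "$2" in
    crawford)       echo "makeUCSCfile $tagdir -o auto -style dnase" ;;
    size-selection) echo "makeUCSCfile $tagdir -o auto" ;;            # default ChIP-Seq style
    cleavage)       echo "makeUCSCfile $tagdir -o auto -style tss" ;; # 5' ends only
    *) echo "unknown track type: $2" >&2; return 1 ;;
  esac
}

# Print the command for each track type.
for kind in crawford size-selection cleavage; do
  ucsc_cmd "EScell-DNase/" "$kind"
done
```

If you want to keep more than one track from the same tag directory, it is probably safest to replace "-o auto" with an explicit output file name for each, so that one track does not overwrite another.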

You can also use makeBigWig.pl and makeMultiWigHub.pl if you have a webserver at your disposal to post the resulting bigWig files (covered in more depth here).  Each has an option called '-dnase' for use with Crawford-style data (no special option is needed for size-selection style data).

Analysis of DNase-Seq Data

Almost all of the routines in HOMER dedicated to ChIP-Seq also work well with DNase-Seq data.
