HOMER

Software for motif discovery and next-gen sequencing analysis



Next-Generation Sequencing Analysis

HOMER offers tools and methods for interpreting next-gen *-Seq experiments.  In addition to UCSC Genome Browser visualization support, peak finding, and (of course) motif finding, HOMER can help assemble data across multiple experiments and examine position-specific relationships between sequencing tags, motifs, and other features.  You do NOT need to use the peak finding methods or other NGS routines from HOMER in order to use its motif analysis tools.

Basic NGS Tutorial: Introduction to next-gen sequencing, FASTQ files, mapping, samtools, and more.
A generalized analysis can be separated into the following steps for each experiment type:
  1. Mapping to the genome (NOT performed by HOMER, but important to understand)
  2. Creation of tag directories, quality control, and normalization (makeTagDirectory)
  3. UCSC visualization (makeUCSCfile, makeBigWig.pl)
  4. Peak finding / Transcript detection / Feature identification (findPeaks, getDifferentialPeaksReplicates.pl)
  5. Motif analysis (findMotifsGenome.pl)
  6. Annotation of Peaks (annotatePeaks.pl)
  7. Quantification of Data at Peaks/Regions in the Genome/Histograms and Heatmaps (annotatePeaks.pl)
  8. Quantification of Transcripts and Repeats (analyzeRNA.pl, analyzeRepeats.pl)
  9. Peak finding / Differential Peak calling with Replicates (getDifferentialPeaksReplicates.pl)
  10. Quantifying Differential Features/Enrichment/Expression (getDiffExpression.pl)
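
As a rough illustration of how steps 2 through 6 chain together for a single ChIP-Seq experiment, the Python sketch below only assembles the command lines; it does not run HOMER. The sample name, file names, the hg38 genome, and the helper homer_commands are hypothetical, and the options shown (-style factor, -o auto, -i, -size 200) are common choices rather than required settings.

```python
import shlex


def homer_commands(sample, alignment, genome="hg38", control_tagdir=None):
    """Build HOMER command lines for steps 2-6 as argument lists.

    Nothing is executed here. All names passed in are placeholders
    chosen by the caller (hypothetical in this sketch).
    """
    tagdir = f"{sample}-tagdir"
    # With "-style factor -o auto", findPeaks writes peaks.txt
    # inside the tag directory.
    peaks = f"{tagdir}/peaks.txt"

    find_peaks = ["findPeaks", tagdir, "-style", "factor", "-o", "auto"]
    if control_tagdir:
        # Optional control/input tag directory for background correction.
        find_peaks += ["-i", control_tagdir]

    return [
        ["makeTagDirectory", tagdir, alignment],      # step 2
        ["makeUCSCfile", tagdir, "-o", "auto"],       # step 3
        find_peaks,                                   # step 4
        ["findMotifsGenome.pl", peaks, genome,
         f"{sample}-motifs", "-size", "200"],         # step 5
        # Step 6: annotatePeaks.pl prints to stdout, so redirect
        # its output to a file when running it for real.
        ["annotatePeaks.pl", peaks, genome],
    ]


if __name__ == "__main__":
    for cmd in homer_commands("chip1", "chip1.sam",
                              control_tagdir="input-tagdir"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before running anything; each list could then be passed to subprocess.run once HOMER and the chosen genome package are installed.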
Additional analysis strategies:

Tutorials for Individual Techniques:

csRNA-seq: Technique that isolates 5' capped, short RNAs (20-60 nt) from total RNA to map initiating transcripts genome-wide. Maps transcription activity at regulatory elements and efficiently captures initiation from both promoter and enhancer regions. (The same analysis also works for Start-seq, 5'GRO-seq, GRO-cap, PRO-cap, TSS-seq, 5'RNA-seq, etc.)

ChIP-Seq: (Tutorials 1-10 above are geared toward ChIP-Seq and RNA-Seq) Isolation and sequencing of genomic DNA "bound" by a specific transcription factor, covalently modified histone, or other nuclear protein.  This methodology provides genome-wide maps of factor binding.  Most of HOMER's routines cater to the analysis of ChIP-Seq data.

RNA-Seq: (Currently only a quick, recipe-driven list of commands, but tutorials 1-3 and 8 above are geared toward RNA-Seq) Extraction, fragmentation, and sequencing of RNA. There are many variants of RNA-Seq too, such as Ribo-Seq (isolation of ribosomes translating RNA), small RNA-Seq (to identify miRNAs), etc.

GRO-Seq: RNA-Seq of nascent RNA.  Transcription is halted, nuclei are isolated, labeled nucleotides are added back, and transcription is briefly restarted, resulting in labeled RNA molecules.  These newly created, nascent RNAs are isolated and sequenced to reveal "rates of transcription" as opposed to the total number of stable transcripts measured by normal RNA-Seq.

Hi-C: Genomic interaction assay for understanding genome 3D structure.  This assay is much more specialized.  For more information about how to use HOMER to analyze Hi-C data, check out the Hi-C analysis section.

DNase-Seq: Treatment of nuclei with an endonuclease such as DNase I results in cleavage of DNA at accessible regions.  Isolation of these regions and their detection by sequencing allows the creation of DNase hypersensitivity maps, providing information about which regulatory elements are accessible in the genome.


Tutorials for Different Strategies of Analysis

Unannotated Organisms: Using HOMER with unsupported species or poorly annotated organisms.

Analyzing Data in Genomic Repeats: (For now please refer to tutorial #8 above) Quantifying sequencing data in genomic repeat regions.





Can't figure something out? Questions, comments, concerns, or other feedback:
cbenner@ucsd.edu