Advanced Annotation
For some people, the default annotation scheme HOMER uses
just isn't enough! This page shows how to get
under the hood and muck around with HOMER-style
annotations.
Using Custom Annotations
To use custom annotations with HOMER, you
need to create a HOMER-style peak file that
contains all of the features you'd like to use for
annotation. The key is that the file must be sorted
so that the highest priority annotations are at the top of
the file and the lowest priority annotations are at the
bottom. The priority matters because each
location in the genome can be annotated in more than one way.
For example, a promoter region can also technically be
considered intergenic space, or in some cases a CpG island,
so you need to decide which annotation is most important to you.
HOMER builds such a file on the fly when you provide a
custom GTF file with transcript definitions to annotatePeaks.pl:
# (behind the scenes when running annotatePeaks.pl with the -gtf <gtf file> option)
parseGTF.pl transcripts.gtf ann > annotations.txt
You'll notice that this output file places all
of the promoter regions at the top, followed by TTS
(transcription termination sites), followed by exons/UTRs,
introns, etc. You could reshuffle this file if you
want to change the priorities of the annotations so that
exons, not promoters, are at the top. That way, if a
given region is annotated as both a promoter and an exon,
its final annotation assignment will be an exon.
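The reshuffling described above can be sketched with standard shell tools. This is a minimal illustration, not part of HOMER: the sample file is made up, and it assumes a tab-delimited peak file whose sixth column holds an annotation type label (P, TTS, E, I here). Check the actual layout of your annotations.txt and adjust the column number and labels before relying on it.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)

# Hypothetical sample in HOMER peak-file layout (ID, chr, start, end, strand)
# plus an ASSUMED sixth column holding the annotation type.
printf 'prom1\tchr1\t100\t1100\t+\tP\n'     >  "$workdir/annotations.txt"
printf 'tts1\tchr1\t5000\t6000\t+\tTTS\n'   >> "$workdir/annotations.txt"
printf 'exon1\tchr1\t900\t1200\t+\tE\n'     >> "$workdir/annotations.txt"
printf 'intron1\tchr1\t1200\t5000\t+\tI\n'  >> "$workdir/annotations.txt"

# Desired priority, highest first (exons now outrank promoters).
order="E P TTS I"

# Prefix each line with the rank of its type, sort stably on that rank
# (so the original order within a type is preserved), then drop the prefix.
# Types not listed in $order sink to the bottom.
awk -F'\t' -v order="$order" '
  BEGIN { n = split(order, types, " "); for (i = 1; i <= n; i++) rank[types[i]] = i }
  { r = ($6 in rank) ? rank[$6] : n + 1; print r "\t" $0 }
' "$workdir/annotations.txt" | sort -s -n -k1,1 | cut -f2- > "$workdir/annotations.reordered.txt"

# Sanity check: list the types in order of first appearance (the priority order).
awk -F'\t' '!seen[$6]++ { print $6 }' "$workdir/annotations.reordered.txt"
```

The stable sort matters here: it changes only the relative order of the annotation types and leaves the order of entries within each type untouched.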
You can of course use whatever you want; there is no need to start
with a GTF file. You could get the regions from any
source you like, such as ChIP-Seq peaks or the annotation
folder in HOMER (e.g.
homer/data/genomes/hg19/annotation/).
Before you can use the annotations in the file you
created, you need to preprocess it with the program assignGenomeAnnotation
to make a final annotation table:
assignGenomeAnnotation <ann peak file> <ann peak file> -prioritize <ann table file> > stats.txt
assignGenomeAnnotation annotations.txt annotations.txt -prioritize annotations.final.txt > stats.txt
(you need to specify the annotations.txt file twice)
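Because the same file has to be given twice, a small wrapper can reduce the chance of a typo. The function below is a hypothetical helper, not something HOMER ships: the name prioritize_annotations and the optional third argument are illustrative. It runs exactly the command shown above and assumes assignGenomeAnnotation is on your PATH.

```shell
# Hypothetical helper (not part of HOMER): build a prioritized annotation
# table from a single annotation peak file.
# Usage: prioritize_annotations <ann peak file> <ann table file> [stats file]
prioritize_annotations() {
  ann_peaks=$1
  ann_table=$2
  stats=${3:-stats.txt}   # defaults to stats.txt, as in the example above

  # Fail early if the input is missing or empty.
  if [ ! -s "$ann_peaks" ]; then
    echo "error: '$ann_peaks' is missing or empty" >&2
    return 1
  fi

  # The same peak file is passed twice, as the command above requires.
  assignGenomeAnnotation "$ann_peaks" "$ann_peaks" -prioritize "$ann_table" > "$stats"
}

# Example call (commented out; requires HOMER to be installed):
# prioritize_annotations annotations.txt annotations.final.txt
```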
The output (in this case "annotations.final.txt") can then
be used as an annotation file with HOMER. You can
use it with annotatePeaks.pl, or use it directly with
assignGenomeAnnotation (this is what annotatePeaks.pl does
under the hood):
annotatePeaks.pl <peak/BED file> <genome> -ann <annotation table file> > output.txt
(e.g. -ann annotations.final.txt)
or
assignGenomeAnnotation <peak/BED file> <annotation table file> -ann <annotated output> > stats.txt