HOMER

Software for motif discovery and ChIP-Seq analysis

Change Log:

Major Bugs/Errors are shown in red
Major Additions are shown in blue
Minor stuff or upgrades that won't likely have a big impact are shown in black

HOMER v5.1 (7/16/24)

- Additional updates to documentation, some input error checking, etc.

HOMER v5.0 (4/24/24)

- Incorporation of HOMER2, which adds extensive position-dependent analysis tools and documentation (HOMER2).
- New background selection tools added to homer2 executable (homer2 background).
- New motif positional enrichment analysis screening (createHomer2EnrichmentTable.pl) and genetic variant analysis (pacifierHomer2.pl) tools added.
- Note: findMotifs.pl/findMotifsGenome.pl still use the old background selection by default (much less resource intensive), but "-useNewBg" can be specified to invoke HOMER2 functionality.
- When selecting background regions, findMotifsGenome.pl now excludes regions that have any overlap with target regions.
- various smaller bug fixes, updates to annotation packages

HOMER v4.11 (10/24/19)

- added csRNA-seq analysis routines (findcsRNATSS.pl) and documentation page describing the analysis steps.
- annotatePeaks.pl and analyzeRepeats.pl will attempt to parse additional annotation information from GTF files (including gene names, descriptions, etc.)
- added "-all" option to getDifferentialPeaksReplicates.pl to report all possible peaks and their differential statistics.

HOMER v4.10 (5/16/18)

- Updates for annotation/genome information
- Fixed problem where the known motif matrices no longer show up in the knownResults output
- More color controls with makeMultiWigHub.pl
- Major updates to Hi-C analysis (DLR/ICF calculations, new normalization options, loop/TAD finding and analysis)
- Changed many of the defaults for analyzeHiC - now defaults to ihskb (which normalizes for resolution and sequencing depth)
- Hi-C PCA analysis can now be instructed to run on sub-chromosomal regions (runHiCpca.pl, -customRegions <peak/BED>)

HOMER v4.9 (2/20/17)

- Motif Logos are now generated as inline SVG (SVG files are also provided in the output to load directly into Illustrator or Inkscape). This also removes the requirement for installing ghostscript and weblogo. You can generate the logos the old way by specifying "-seqlogo" with findMotifsGenome.pl, findMotifs.pl, etc.
- Added getDifferentialPeaksReplicates.pl to help handle replicate peak calling and differential peak calling - uses R/DESeq2 to handle differential significance calculations. Works with ChIP-seq peak and TSS identification calling.
- Added options to annotatePeaks.pl, analyzeRepeats.pl, and getDiffExpression.pl to output variance stabilized gene expression/peak read density matrices (great for all sorts of things, particularly clustering and PCA analysis)
- Updated super enhancer peak finding to include options to exclude regions like promoters (can help with H3K27ac based SE detection)
- Added "-se <peak file>" to help with annotating super enhancers to target genes.
- mergePeaks with "-cobound <#>" option now reports the peak identities that overlap the reference peak list.

HOMER v4.8 (1/13/16)

- Updates to all genome annotation, gene annotation, known motif files.
- Lots of updates to getDiffExpression.pl - expanded options for automating edgeR/DESeq differential expression analysis. Works much better with ChIP-Seq data too.
- Added tool for meta-gene histogram creation (makeMetaGeneProfile.pl)
- makeUCSCfile now makes full resolution bedGraphs by default. You need to specify "-fsize <#>" to down sample to create smaller bedGraphs.
- loadGenome.pl now properly configured during installation (would have trouble recognizing location of libraries in previous versions)
- Hi-C analysis bug in chromosome ordering (applies only to non canonical chromosome names) causing Hi-C analyses to crash fixed.
- Bug in SAM/BAM file parsing fixed - previously, the 'unmapped' flag in the SAM flag field was ignored, causing problems if the aligner output invalid chromosome names or positions (e.g. BWA may do this)
- Fixed minimum peak threshold selection problems when using normalized tag threshold (-ntagThresh) with findPeaks
- Fixes to Hi-C paired-read parsing. Previously, the makeTagDirectory program may have crashed if certain chromosomes were present in one set of files but not in another.
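
As an illustration of the SAM 'unmapped' flag fix noted above, here is a minimal Python sketch of skipping unmapped records. It is not HOMER's parsing code; the function names are hypothetical. The only fact relied on is that the SAM format marks an unmapped read with bit 0x4 of the FLAG field (column 2).

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def is_unmapped(sam_line: str) -> bool:
    """Return True if the FLAG field (column 2) has the unmapped bit set."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    return bool(flag & UNMAPPED_FLAG)


def mapped_records(lines):
    """Yield alignment lines that are mapped, skipping '@' header lines.

    Hypothetical helper: unmapped reads may carry placeholder or invalid
    chromosome names/positions, so they are dropped before further parsing.
    """
    for line in lines:
        if line.startswith("@"):
            continue
        if not is_unmapped(line):
            yield line


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.6",
        "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    ]
    for record in mapped_records(example):
        print(record.split("\t")[0])
```

Checking the flag before reading the chromosome and position columns means a read is never placed using coordinates the aligner did not intend to be meaningful.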

HOMER v4.7 (8/25/14)

- Many many small changes and updates
- In some cases strand specific read counting seemed to randomly switch to unstranded read counting when using annotatePeaks.pl - fixed.
- Default behavior for annotatePeaks.pl was to count reads based on the average peak size - now the default is "-size given" (recommended to always use -size parameter)
- Modification to background annotation and update scripts that links gene symbols/IDs to more relevant RefSeq/Ensembl IDs. Before they would often link to XM_###### which are less useful.
- Incorporated several tools for mCpG analysis from methylC-Seq or BS-Seq data - documentation to come soon.
- Fixed issues with GTF parsing. By default it will output each transcript. However, when running most programs or parseGTF.pl you can specify "-gid" to have the program output representative transcripts using the gene_id instead of the transcript_id.
- Changed default on -strand option for analyzeRepeats.pl to 'both' because it is a safer default.
- Changed the way the accession number tables are created in HOMER to prioritize the assignment of NM RefSeq numbers as representative IDs for each gene instead of XM and NR RefSeq numbers. This had caused some issues when trying to match up data to promoters in the past - this modification should dramatically improve results when attempting to do similar analysis in the future. This change applies to all of the 'organism' packages
- Updated all organism and UCSC genome packages

HOMER v4.6 (3/29/14)

- perl scripts will now run perl from the PATH instead of /usr/bin/perl (modified shebangs)
- findPeaks now finds super enhancers with the "-style super" option - works better now with large data sets
- getGWASOverlap.pl script has been fixed (upgrades to mergePeaks broke it from before)
- findHiCDomains.pl has been improved and streamlined
- getPeakTags/findPeaks have been updated to work faster and use less memory particularly for large data sets.

HOMER v4.5 (1/27/14)

- Updated peak finding and read counting code to be much more memory efficient (findPeaks, getPeakTags)
- findPeaks now finds super enhancers with the "-style super" option (peak finding documentation is updated too)
- Using GTF files will now (by default) report the transcript_id in the output file, not the gene_id (parseGTF.pl)
- Fixed bug so that annotatePeaks.pl will now center peaks on motifs with the "-size given" option.
- Fixed edge effects with tag coverage and bedGraph/wig in annotatePeaks.pl histograms

HOMER v4.5 (1/27/14)

- Fixed error in updateGeneIdentifiers.pl that caused none of the key annotation files to be downloaded.
- Fixed bug in assignGenomeAnnotation introduced in last version that provided incorrect annotation priority assignment (i.e. TSS given priority over intron annotations when overlapping, etc.)
- Modified updateUCSCGenomeAnnotation.pl slightly to be smarter and more automated.
- Updated annotations in all genome packages
- Updated configureHomer.pl script to correctly remove packages
- Fixed website to provide access to old versions of the software
- Fixed warnings during c++ compilation

HOMER v4.4 (1/14/14)

- Updated annotations and system for data organization. Organism accessions and GO are now managed separately from the main code and promoters/genomes.
- Code for updating gene accessions, promoter locations, genome annotations, etc. are now included in HOMER and available in the homer/update/ directory.
- loadPromoters.pl and loadGenomes.pl scripts now make it much easier to incorporate any organism into HOMER.
- makeUCSCfile and annotatePeaks.pl now normalize experiments to a fragment length of 100. Experiments with larger/smaller lengths are normalized in bedGraphs and the 'Coverage' column of annotatePeaks.pl histograms.
- Changed defaults for mergePeaks to use the given size of the peaks when merging ("-d given" is now default, not 100 bp)
- Fixed rare bug with makeTagDirectory that would cause some chromosomes to change the strand of all reads on the chromosome (Not many reports from users, but could have happened in the last version...)
- Added option to findMotifsGenome.pl and preparseGenome.pl such that the user can choose the directory to store preparsed files. This is useful when a system has many users and a single, shared installation of HOMER. Also, by default, the command will set the permissions on the preparsed directories to be group writable.
- Most programs now include the command line options used in the output for better record keeping.
- Fixed error in mergePeaks when merging peaks from a single peak file - previously there was potential for problems when a peak file was completely within another one (only affected variable length peaks)
- Added option with annotatePeaks.pl to store annotation enrichment results ("-annStats <outputfile>").
- Made compareMotifs.pl parallel so that checking known motifs for matches is much faster if running findMotifs.pl/findMotifsGenome.pl with multiple CPUs.
- findGO.pl (which is run by findMotifs.pl) will now check ontologies in parallel with multiple CPUs.
- Change in line-up for GO enrichment (incorporates NCBI's biosystems database which includes KEGG, reactome, etc.)
- Added support for finding super enhancers (findPeaks "-style super")
- parseGTF.pl now removes any accession number versioning (i.e. NM_012345.2 -> NM_012345) from identifiers.
- Fixed annotatePeaks.pl so that if a custom genome is used with an unknown organism, it will still try to add gene information from the "-gene <file>" option.
- Conservation options (for phastCons plots) are being phased out. New instructions on how to analyze conservation will appear soon.
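
The accession version stripping mentioned above for parseGTF.pl (NM_012345.2 -> NM_012345) can be sketched in Python as follows. This only illustrates the transformation and is not the parseGTF.pl code; the function name is hypothetical, and it assumes a version is a trailing "." followed by digits.

```python
import re

# Assumption: a version suffix is a dot followed by digits at the end of the ID.
_VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_accession_version(accession: str) -> str:
    """Remove a trailing '.<digits>' version from an accession, if present."""
    return _VERSION_SUFFIX.sub("", accession)


if __name__ == "__main__":
    for acc in ("NM_012345.2", "NM_012345", "ENST00000123456.10"):
        print(acc, "->", strip_accession_version(acc))
```

Identifiers without a version suffix pass through unchanged, so the function is safe to apply to every ID in a file.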

HOMER v4.3 (8/26/13)

- Added automation scripts (batchParallel.pl, batchMakeTagDirectories.pl, batchFindMotifsGenome.pl, etc.) and documentation.
- Fixed issue with BED file processing with non-unique IDs. (added -unique option to bed2pos.pl)
- analyzeRNA.pl now defaults to "-count genes" instead of "-count exons".
- findMotifsGenome.pl now requires that you specify the -size parameter when running it.
- Removed duplicates in the known motif library
- Fixed error with pthread initialization that would cause de novo motif finding to crash in rare circumstances (homer2, findMotifsGenome.pl, findMotifs.pl)
- By default, hubs will not have a line drawn at zero (looks a little more professional, makeMultiWigHub.pl)
- Added scrambleFasta.pl script, in case you don't have a background file for motif finding. findMotifs.pl no longer requires a background file, although it's still highly recommended.
- Fixed bug in the reporting of FDR for de novo motif finding (using -fdr <#> option). Previously, the HTML output page would occasionally report the FDR of motifs that were similar to the primary motif, not the FDR calculation for the motif itself. The "motif files" and "more information" page reported the proper FDR before - now all report the correct value.

HOMER v4.2 (4/11/13)

- Fixed error in annotation that would lead some peaks found right on the boundary of two different annotations to be assigned the default (intergenic). Fixed this and added more output statistics for annotation regions, including the total amount of sequence assigned to each annotation so that the expected annotation can be calculated (annotatePeaks.pl, assignGenomeAnnotation)
- HOMER can now extract sequence information from a near unlimited number of peak regions (previously it would slow down if regions overlapped continuously across the chromosome, homerTools).
- Sequence extraction, QC, and peak finding routines have had bugs fixed that arise when analyzing genomes with thousands of scaffolds (makeTagDirectory, findPeaks, annotatePeaks.pl)
- findPeaks fixed - fold threshold calculations relative to input and local read density have been modified. Previously, the minimum coverage in the background region was set to the average genomic coverage. This has been replaced with a pseudo count (0.5 reads) - this helps with small genomes where the average genomic background may be quite high. This change increases sensitivity (findPeaks).
- fixed bug in motif comparison output after motif finding (compareMotifs.pl)
- Many upgrades to analyzeRepeats.pl
- Gene Ontology result files for all ontologies now report gene symbols instead of Entrez Gene IDs (easier for users to interpret)
- Fixed problem with double-counting of isoforms for GO analysis (findMotifs.pl, findGO.pl, annotatePeaks.pl)
- Arabidopsis annotation changes: Chromosomes now named "1" instead of "Chr1" to be consistent with Ensembl annotations.
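
To illustrate why the findPeaks pseudo count change above increases sensitivity, here is a small Python sketch. It is not the findPeaks implementation. It assumes the fold enrichment is the target read count divided by the background count, with the background floored at a minimum value: the old floor being the average genomic coverage and the new floor being the 0.5 read pseudo count. The function names and the example numbers are hypothetical.

```python
def fold_old(target: float, background: float, avg_genomic_coverage: float) -> float:
    """Fold enrichment with the background floored at the average genomic coverage."""
    return target / max(background, avg_genomic_coverage)


def fold_new(target: float, background: float, pseudo: float = 0.5) -> float:
    """Fold enrichment with the background floored at a small pseudo count."""
    return target / max(background, pseudo)


if __name__ == "__main__":
    # Small genome: average coverage is high (4 reads per region in this example),
    # and the background region around this peak is empty.
    target, background, avg_cov = 10.0, 0.0, 4.0
    print("old fold:", fold_old(target, background, avg_cov))
    print("new fold:", fold_new(target, background))
```

With a high average coverage the old floor caps the achievable fold enrichment, so regions with clean background could still miss the threshold; the smaller floor removes that cap.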

HOMER v4.1 (11/2/12)

- Made efficiency improvements to SIMA.pl and runHiCpca.pl (correlation calculations) for Hi-C. Added "-rawAndExpected <file>" output option for analyzeHiC to allow simultaneous reporting of raw and expected interactions.
- Enabled 2D histograms to work correctly for Hi-C interaction data (analyzeHiC with the "-hist <#>" option).
- Improved GC normalization options in makeTagDirectory. New option "-iterNorm <#>" allows for more precise normalization control
- fixed mergePeaks to allow merging of single peak files that are strand specific.
- fixed "-gsize <#>" input parsing for findPeaks (accepts scientific notation i.e. 2e9 now)
- fixed problem with findPeaks that would cause peak coordinates to be negative or larger than the chromosome when centering peaks.
- fixed file naming bug for QC files when parsing Hi-C alignment files with makeTagDirectory ("LocalDistribution.txt" file was incorrectly named).
- fixed scaling bug for histograms in annotatePeaks.pl - "Coverage" column may not be compatible with histograms made with different bin sizes (fixed now)

HOMER v4.0 (10/15/12)

- Incorporated Hi-C routines into HOMER (several programs including analyzeHiC, runHiCpca.pl, etc.)
- release of new documentation for Hi-C analysis, including updates to other parts of the annotation

HOMER v3.18 (10/2/12)

- Added support to annotatePeaks.pl to quantify WIG file coverage at peaks ("-wig <WIG file>" option)
- Fixed SAM parsing for paired-end files

HOMER v3.17 (9/15/12)

- pre-release of Hi-C routines
- additional option for SAM/BAM parsing in makeTagDirectory ("-unique", "-keepOne", "-keepAll").

HOMER v3.16 (8/15/12)

- Updates to analyzeRepeats.pl to enable additional control over classes of repeats reported.
- Added support for mCpG files from ENCODE (makeTagDirectory option "-format mCpGbed")

HOMER v3.15 (8/2/12)

- Added program analyzeRepeats.pl to quantify reads in repeat regions (will likely replace analyzeRNA.pl in the near future).
- configureHomer.pl now has options -bigWigDir, -bigWigUrl, -hubsDir, -hubsUrl to set values used in makeBigWig.pl and makeMultiWigHub.pl and stores them in the config.txt file so they will be constant with future updates.

HOMER v3.14 (7/20/12)

- Updated and added species specific motif libraries (vertebrates, insects, worms, yeast, plants, all). findMotifs.pl and findMotifsGenome.pl will try to auto detect the organism based on promoter set/genome. Can be overridden with "-mset setName".
- Modernized the analyzeChIP-Seq.pl script to handle the major options for findPeaks/findMotifsGenome.pl/annotatePeaks.pl
- makeUCSCfile can now accept input experiments ("-i inputDirectory") to normalize the bedGraph files. To avoid low coverage artifacts, pseudo counts are added when performing the ratio calculation ("-pseudo #", default: 5). Can also report the log2 ratio ("-log").
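
The input-normalized ratio described for makeUCSCfile can be sketched in Python as below. This is not the makeUCSCfile code. It assumes the pseudo count is added to both the experiment and input coverage before dividing, which is a common convention but is not spelled out here. The function name is hypothetical; the default of 5 matches the "-pseudo #" default stated above.

```python
import math


def normalized_ratio(sample: float, control: float,
                     pseudo: float = 5.0, log2: bool = False) -> float:
    """Ratio of sample to input coverage with a pseudo count added to both.

    Assumed form: (sample + pseudo) / (control + pseudo), optionally log2.
    """
    ratio = (sample + pseudo) / (control + pseudo)
    return math.log2(ratio) if log2 else ratio


if __name__ == "__main__":
    print(normalized_ratio(15, 5))             # well covered position
    print(normalized_ratio(0, 0))              # no coverage in either experiment
    print(normalized_ratio(15, 5, log2=True))  # same position on a log2 scale
```

Without the pseudo count, positions with little or no input coverage would give undefined or wildly inflated ratios, which is the low coverage artifact the option is meant to avoid.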

HOMER v3.13.1 (7/18/12)

- Added routines for mC analysis, makeTagDirectory can process ENCODE style methylation files ("-format mCpGbed") into mC tag directories. These can be used with annotatePeaks.pl to calculate methylation profiles and avg. methylation content at peaks (when running annotatePeaks.pl use the "-ratio" flag)
- Fixed bug in analyzeRNA.pl that would leave off counting the first nucleotide of the gene
- Updated motif library, other small things

HOMER v3.13 (6/22/12)

- Update to annotation system, better annotation for ncRNA, more accurate UTR boundaries (some were off by a bp or two)
- No more separate "masked" genomes - homerTools extract now has option "-mask" that will replace soft masked sequences (e.g. lowercase letters) with N. Programs like findMotifsGenome.pl and annotatePeaks.pl now have option "-mask" or will interpret hg18r as shorthand for "hg18 ... -mask"
- Fixed errors with CIGAR parsing of SAM files in makeTagDirectory - improved RNA-Seq bedGraph visualization at splice junctions (use "-fragLength given" with makeUCSCfile or makeBigWig.pl, etc.)
- Fixed a bunch of other stuff I can't remember...
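
The "-mask" behavior described above (replacing soft masked, i.e. lowercase, sequence with N) can be illustrated with a short Python sketch. It is not the homerTools code and the function name is hypothetical.

```python
def mask_soft_masked(sequence: str) -> str:
    """Replace every lowercase (soft masked) base with 'N'; keep the rest as-is."""
    return "".join("N" if base.islower() else base for base in sequence)


if __name__ == "__main__":
    print(mask_soft_masked("ACGTacgtTTGA"))
```

Because soft masking is carried in the letter case of a single FASTA file, one genome can serve both masked and unmasked analyses, which is why separate "masked" genomes are no longer needed.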

HOMER v3.12 (6/8/12)

- Bugs fixed and small options added.
- Fixed problem with mergePeaks crashing

HOMER v3.11 (5/21/12)

- Lots of bugs fixed and small options added.
- Fixed inconsistencies with treating BED files as zero-indexed
- annotatePeaks.pl maintains the peak order when making heatmaps
- Fixed IP efficiency calculation for "-style histone" or "-region" in findPeaks
- Fixed bug with multi-processor support from some linux distros (shouldn't crash anymore) with motif finding
- Fixed bug in getDifferentialPeaks with -size parameter

HOMER v3.10 (3/22/12)

- Lots of bugs fixed and small options added.
- annotatePeaks.pl now works with bedGraph files in a manner similar to tag directories (option "-bedGraph <file>").
- Added "-precision <1|2|3>" option to makeTagDirectory to print values in format 1.0 or 1.00 or 1.000 (useful if normalizing or using fractional tag counts)
- Fixed annotatePeaks.pl "-center <motif>" option when using unbalanced peak size (i.e. "-size -200,50").
- Added program removeOutOfBoundsReads.pl to remove reads that are out of bounds, causing problems for UCSC (some alignment programs have a tendency to do this)
- several new options added to compareMotifs.pl (scale heights of logos to information content "-bits", skip similar matching/visualization "-basic", etc.)

HOMER v3.9 (2/1/12)

- Fixed bigWig/hub creation to work with updates at UCSC (makeBigWig.pl/makeMultiWigHub.pl). makeBigWig.pl now requires that you enter the genome as an argument, and when making bigWigs with makeUCSCfile, you need to specify the chrom.sizes file (makeBigWig.pl and makeMultiWigHub.pl take care of this automatically)
- annotatePeaks.pl can now process bedGraph files just like a tag directory by using the "-bedGraph" option (i.e. make histograms, calculate read density, etc.)

HOMER v3.8.2 (1/6/12)

- Fixed issue with findPeaks using too much memory. Added option "-minTagThreshold <#>" that controls the smallest peaks to consider. By default this is set at the uniform density (i.e. expected tags per peak region given a uniform tag coverage)
- Fixed bug with findMotifsGenome.pl/findMotifs.pl/homer2 "-cache <#>" that caused a crash if too large of a cache was specified.
- Fixed bug with 5' adapter trimming and added the option to trim adapter sequence while allowing mismatches with homerTools.
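
The default for "-minTagThreshold" is described above as the uniform density, i.e. the expected tags per peak region given uniform tag coverage. A minimal Python sketch of that quantity follows. It is not the findPeaks code; the function name and the example numbers are hypothetical.

```python
def uniform_expected_tags(total_tags: float, peak_size: float,
                          genome_size: float) -> float:
    """Expected tags in one peak-sized region if tags were spread uniformly."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_tags * peak_size / genome_size


if __name__ == "__main__":
    # Example numbers only: 10 million tags, 200 bp peaks, 2e9 mappable bases.
    print(uniform_expected_tags(10_000_000, 200, 2e9))
```

Regions that do not exceed this expectation are no better than background, so skipping them early saves memory without discarding plausible peaks.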

HOMER v3.8.1 (11/30/11)

- (3.8.1) Fixed sequence parsing issues for short sequences for de novo motif finding, added support for % based histograms over variable length regions (i.e. gene bodies)
- Added support for UCSC Hub Creation (makeMultiWigHub.pl)
- Modified general routines to work better with large numbers of chromosomes (i.e. genomes composed of scaffolds like X. tropicalis)
- Added support for Arabidopsis (tair10) and X. tropicalis (xenTro2)
- Updated annotations
- Fixed bug with CpG/GC% calculations in annotatePeaks.pl when dealing with variable length peaks/regions
- "-forceBED" option is now standard for parsing BED files of sequence read alignments (makeTagDirectory), new option "-force5th" to use 5th column of BED file as read count
- Fixed issue with auto-detecting BAM files

HOMER v3.7 (11/02/11)

- Added q-value/FDR calculations to de novo motif finding. Unfortunately, due to the complexity behind the de novo algorithm, the only way to do this is to calculate it empirically by randomizing the data and recalculating motifs. As a result, it takes a long time to calculate FDR values (option -fdr <#>, # is the number of randomizations, findMotifsGenome.pl)
- Fixed bug in findMotifsGenome.pl causing the option "-size given" to crash.
- Fixed calculation of peak overlap significance in mergePeaks, in cases where "-d #" is used.

HOMER v3.6 (10/12/11)

- Fixed bugs in mergePeaks, v3.5 would crash in some extreme cases. Fixed significance calculations for peak overlaps (again) to deal with nearby peaks from the same file.
- Changed the way "-matrix <filename>" works with mergePeaks/annotatePeaks.pl.
- Added q-values (Benjamini multiple hypothesis testing corrections) to known motif finding (findMotifs.pl/findMotifsGenome.pl)
- fixed bug in findMotifs.pl that causes problems with custom promoter sets and failure to output de novo motifs
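
For readers unfamiliar with the q-values mentioned above, here is a minimal Python sketch of the standard Benjamini-Hochberg procedure. It assumes that "Benjamini" refers to Benjamini-Hochberg; the exact variant used by HOMER is not stated here, and this is not HOMER's code. The function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the same order as the input p-values."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotonic.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        qvalues[idx] = running_min
    return qvalues


if __name__ == "__main__":
    pvals = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(pvals, benjamini_hochberg(pvals)):
        print(f"p={p:.3f}  q={q:.3f}")
```

Each q-value is the smallest false discovery rate at which that motif would still be called significant, which makes it easier to judge results when hundreds of known motifs are tested at once.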

HOMER v3.5 (10/06/11)

- Changed how findPeaks interprets genomes size: When using "-gsize <#>", use the number of mappable bases, not 2x the number as previously suggested.
- Added "-nfr" flag to findPeaks to help find nucleosome free regions in histone modification data (works best with MNase datasets)
- Fixed bug in mergePeaks - when merging peaks when several peaks in the same file are within range (i.e. merging transcription factor peaks within 100000 bp), latest version would sometimes crash.
- mergePeaks now outputs the total number of peaks that contribute to each peak in the 8th column.
- fixed bug with findPeaks - normally findPeaks uses the average coverage of an experiment as the minimum when considering the enrichment over input signal - this value was divided by 2 in the previous versions. Weak ChIP-seq experiments will likely see fewer peaks in the output file now when using input to filter the experiment (but the peaks you do get back will be better)

HOMER v3.4 (09/30/11)

- Fixed problem with motif statistic reporting for de novo motifs (findMotifsGenome.pl, findMotifs.pl, homer2). Previous version calculated motif percentages using sequence that had more significant motifs masked, causing some instances to be missed. HOMER now reports the % of sequences containing the motif using the original sequences. The motifs found by the algorithm themselves are unaffected, just the reported statistics have changed. This change only affects statistics found in the homerResults.html file. Other tasks, such as searching for motifs, are unaffected.
- Fixed GTF format parsing when the file is not sorted properly (caused some annotation to be dropped from consideration if the file wasn't presorted).
- Chuck facts must now be installed separately using the configureHomer.pl program ("-getFacts")

HOMER v3.3 (09/28/11)

- makeTagDirectory will now take gzipped (*.gz), zipped (*.zip), bzip2(*.bz2), and bam (*.bam) files directly - no need to decompress them. samtools needs to be installed and available on the executable path for homer to work with *.bam files
- Fixed total mapped tag normalization when running annotatePeaks.pl or analyzeRNA.pl with options like "-pc <#>" that limit the number of reads considered at a specific position. These programs now reference the "tagCountDistribution.txt" file in the tag directory to properly scale the total number of tags to be compatible with the limiting function.
- fixed problem with mergePeaks - peaks from the same file within the distance ("-d #") are also merged.
- mergePeaks now outputs how many peaks were merged in each output peak (useful when merging over large distances)
- mergePeaks can now work with a single peak file to merge peaks found within a given distance of one another. Can also be used to filter a peak file for peaks found within a given region.
- Slight modifications made to the calculations for peak overlap significance in mergePeaks (When using "-matrix ..." option). Total coverage of the peaks is now calculated to adjust numbers when peaks are overlapping (not done before, might have been a problem for a peak file with many overlapping peaks)
- Option for excluding Chuck Facts added ("-nofacts") to findMotifs.pl, findMotifsGenome.pl, and compareMotifs.pl. To permanently remove them, remove the file in homer/data/misc/
- compareMotifs.pl, which is used by the de novo motif finding programs to determine the similarity between de novo and known motifs, has been modified such that Pearson's correlation is the default for comparing motif matrices. When comparing matrices of different length they are elongated with 0.25 frequencies. Overall an improvement based on what a human would "expect".
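
The matrix comparison described above for compareMotifs.pl (Pearson's correlation, with the shorter matrix elongated using 0.25 frequencies) can be sketched in Python as below. This is a simplified illustration, not the compareMotifs.pl code: it pads only on the right and scores a single alignment, whereas a real comparison would likely also try different offsets and the reverse complement. All function names are hypothetical.

```python
from math import sqrt


def pad_matrix(matrix, length):
    """Extend a motif matrix (rows of A,C,G,T frequencies) with 0.25 rows."""
    return matrix + [[0.25] * 4 for _ in range(length - len(matrix))]


def pearson(x, y):
    """Pearson's correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat matrix carries no information to correlate
    return cov / sqrt(var_x * var_y)


def motif_similarity(m1, m2):
    """Correlate two motif matrices after padding both to the same length."""
    length = max(len(m1), len(m2))
    flat1 = [p for row in pad_matrix(m1, length) for p in row]
    flat2 = [p for row in pad_matrix(m2, length) for p in row]
    return pearson(flat1, flat2)


if __name__ == "__main__":
    short = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1]]
    longer = short + [[0.1, 0.1, 0.7, 0.1]]
    print(round(motif_similarity(short, short), 3))
    print(round(motif_similarity(short, longer), 3))
```

Padding with 0.25 treats the missing positions as uninformative, so a short motif is penalized for not matching an informative position in the longer motif rather than being scored only on the overlap.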

HOMER v3.2 (08/11/11)

- Fixed problem with peak annotation (if peaks are overlapping some would not be annotated)
- Lots of small things, such as file format detection, etc.

HOMER v3.1 (05/25/11)

- Added "easy" Custom Genome support. Instead of specifying a "genome" such as "hg18" for programs such as findMotifsGenome.pl or makeTagDirectory, you can specify the path to the genomic FASTA files (either a single file or a directory with FASTA files named for each chromosome).
- No longer need a "reference file" for preparsing the genome, now it will just randomly determine regions if one is not provided or cannot be found (i.e. if you have a custom genome).
- "-chopify" option will automatically split up large background regions to the size of the target regions. So if you're too lazy to explicitly select regions as background, you can provide a large FASTA file (with findMotifs.pl in FASTA mode) or a large region (i.e. a whole chromosome in findMotifsGenome.pl) and the "-chopify" option will tell HOMER to chop up the region into smaller, target-sized, chunks.
- "-rna" option for motif finding to output RNA style motifs and automatically searches only the + strand.
- Fixed SAM format auto detection. Works better now (If you have BAM, use samtools to convert BAM formatted files to SAM)
- Fixed problem with automatic genome size detection with smaller genomes in findPeaks.
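
The idea behind "-chopify" described above, splitting a large background region into target-sized chunks, can be sketched in Python as follows. This is not HOMER's code. The function name is hypothetical, coordinates are assumed to be half-open (start inclusive, end exclusive), and keeping a shorter final chunk is a choice made for this sketch; how HOMER handles the remainder is not stated here.

```python
def chop_region(start: int, end: int, chunk_size: int):
    """Split [start, end) into consecutive chunks of at most chunk_size bases."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [(s, min(s + chunk_size, end)) for s in range(start, end, chunk_size)]


if __name__ == "__main__":
    # A 500 bp background region chopped to match 200 bp target regions.
    for chunk in chop_region(0, 500, 200):
        print(chunk)
```

Matching the background chunk size to the target size keeps the two sequence sets comparable, since motif enrichment statistics depend on how much sequence each region contributes.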

HOMER v3.0 (05/09/11)

- New motif finding program homer2

New masking strategy increases sensitivity for finding co-regulated motifs
Autonormalization helps reduce problems caused by sequence composition bias

- Added zebrafish (danRer7) and yeast (sacCer2) support.
- Added GRO-Seq analysis routines (findPeaks, analyzeRNA.pl)
- Added read normalization and bundled QC into makeTagDirectory
- Bunch of other stuff...

HOMER v2.7 (12/14/10)

- added support for parsing alignments in SAM format. (If you have BAM, use samtools to convert BAM formatted files to SAM)

HOMER v2.6 (10/21/10)

- batchAnnotatePeaks.pl program for making histograms across multiple peak files.
- findPeaks now has "-region" mode for identifying variable-length regions of signal enrichment.
- tagDir2bed.pl script available to easily export tag data into bed file format
- findMotifsGenome.pl and annotatePeaks.pl are now BED file compliant! Since everyone uses those, I guess Chuck can too. But he still prefers position/peak files! Not all subprograms work with BED files, so you may still need to use bed2pos.pl to switch formats to do certain tasks.

HOMER v2.5 (10/11/10)

- Fixed errors in mergePeaks script (gave negative chromosome coordinates before if compiled on certain systems)
- No longer keep around temporary/sequence files from the motif finding
- Added wikipathways to GO analysis
- Fixed bug in clonal filtering with peak finding (affects highly clonal experiments/lower organism analysis)
- getDistalPeaks.pl program now finds intra/inter-genic peaks
- updated promoter sets to use refseq
- New error checking for FASTA file input (previously HOMER required only ACGTN characters).

HOMER v2.4 (08/30/10)

- New annotation system (2 tier - promoter/exon/intron/intergenic and detailed i.e. repeats, etc.). Also, fixed a slight bug in annotation priority script.
- Genomic Gene annotations standardized on refGene.txt file from UCSC. (includes miRNA and some other non-coding RNAs)
- New GenomeOntology.pl program - significance association calculations with genomic annotations like repeats, other peak files, etc.

- Annotations for exons/introns/promoters etc., repeats, gene deserts/gene rich regions, GO terms, peaks from published data.
- Works with both peaks and with tag directories (for peak independent analysis)
- Added as an

- New mergePeaks program - rewritten in C++ with added functionality
- Added strand specific tag counting to annotatePeaks.pl ("-strand" option)
- Added analyzeRNA.pl program to compute gene expression levels from RNA-Seq data. Will also work for repeats, but needs a lot of memory.
- Added tts mode (to go along with tss mode) in annotatePeaks.pl.
- Added "motifFindingParameters.txt" file that remembers which parameters were used during motif finding
- Added "bias motifs" to known libraries to help users identify motifs that likely come from sequence bias or are just garbage.
- Added strand support to makeUCSCfile for UCSC genome browser tag pile-ups - fixed problem with tag extensions exceeding the size of the chromosomes.
- Added MSigDB links and annotations to GO analysis.
- Updated all annotations, accessions as of 8/30/2010.

HOMER v2.3

- Fixed error in histogram creation (stupid round-off error shifted the location of some peaks)
- assignGenomeAnnotation program rewritten in C++ for a dramatic speed up.
- Fixed problem of findMotifsGenome.pl crashing if non-unique peak ids are used
- Fixed redundant ID creation when generating background sequences (minor issue)
- Added peak file checking (i.e. for non-redundant IDs)
- Added more known motifs
- Bunch of other minor stuff I can't remember...

Can't figure something out? Questions, comments, concerns, or other feedback:
cbenner@ucsd.edu