HOMER

Software for motif discovery and next-gen sequencing analysis

Converting File Formats for Hi-C Analysis with HOMER

Quick Reference

# Optional: if juicer_tools is installed, create a *.hic file for visualization with Juicebox (with "-juicer auto", the output file is placed inside the tag directory):
tagDir2hicFile.pl HicExp1TagDir/ -juicer auto -genome hg38 -p 10

Creating *.hic files to visualize with Juicebox

Surfing through your Hi-C data with Juicebox is a liberating experience. Juicebox requires a *.hic file, a binary format designed specifically for the Juicer/Juicebox tools. HOMER provides a script, tagDir2hicFile.pl, that automates generating one from a HOMER Hi-C tag directory. The script requires that "juicer_tools" be installed and available on the executable PATH (juicer_tools is an alias provided to run the juicer.jar Java program). Alternatively, you can skip HOMER for this step and run your original FASTQ files through the Juicer pipeline. In general, the script is run like this:

tagDir2hicFile.pl <HiC Tag Directory> -juicer <outputFilename.hic> -genome <genome version> -p <# CPUs>

For example:

tagDir2hicFile.pl HicExp1TagDir/ -juicer auto -genome hg38 -p 10

This command produces a *.hic file. If "-juicer auto" is specified, the file is created inside the tag directory. The file can then be used with the Juicer/Juicebox family of tools.
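
Because the conversion depends on juicer_tools being reachable, it can help to check for it before starting a long run. The following is a minimal sketch, not part of HOMER; the helper name have_tool is hypothetical, and the check only confirms that a command named juicer_tools exists on the PATH, not that it works.

```shell
# Hypothetical helper (not part of HOMER): succeeds if the named
# command can be found on the executable PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool juicer_tools; then
    echo "juicer_tools found: $(command -v juicer_tools)"
else
    # -juicerExe is the tagDir2hicFile.pl option for naming a different command.
    echo "juicer_tools not on PATH; install it or use -juicerExe" >&2
fi
```

If the check fails, the -juicerExe option listed in the command line options below lets you give the command used to run juicer_tools explicitly.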

If the "-juicer <filename>" option is omitted, the command instead generates a file in "HiC summary" format and sends it to stdout.
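
Since the summary goes to stdout, it would normally be redirected to a file. As a rough sketch, the helper below counts lines that do not have the seven tab-separated fields shown in the usage text (id, chr1, pos1, strand1, chr2, pos2, strand2). The function name count_bad_summary_lines is hypothetical, and the check assumes the output contains no extra header or comment lines, which is not confirmed here.

```shell
# Hypothetical helper (not part of HOMER): print the number of lines in a
# HiC summary file that do not have exactly 7 tab-separated fields.
# A result of 0 means every line matched the expected layout.
count_bad_summary_lines() {
    awk -F '\t' 'NF != 7 { bad++ } END { print bad + 0 }' "$1"
}
```

One way to use it: run "tagDir2hicFile.pl HicExp1TagDir/ > HicExp1.summary.txt" (the output file name is only an example), then "count_bad_summary_lines HicExp1.summary.txt".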

Command line options for tagDir2hicFile.pl

        Usage: tagDir2hicFile.pl <tag directory> [options]

        By default, this program will output a file in "HiC summary" format to stdout:
                id<tab>chr1<tab>pos1<tab>strand1<tab>chr2<tab>pos2<tab>strand2

        Options below can be set to help output a *.hic file for use with juicebox/juicer

        Options (most are for use with juicer):
                -juicer <filename.hic> (create *.hic file with juicer, "-juicer auto" places file in tagdir)
                -genome <genome> (genome is passed on to juicer_tools - if using a normal genome, i.e. hg38,
                        mm10, etc. it's probably best to specify the genome code - if juicer_tools can recognize it.
                        Otherwise specify the path to a chrom.sizes file instead of the genome code)
                -juicerExe <"command to run juicer_tools"> (executable for running juicer_tools,
                        by default assumes "juicer_tools" is in the executable PATH)
                -juicerOpt <"juicer options"> (command line options to pass to juicer, use quotes "...")
                -p <#> (number of CPUs to use during sort command for juicer file creation, default: 1)
                -short <filename> (output read pairs in "short format" for processing with juicer,
                        but don't run juicer_tools. This file will not be sorted the way juicer wants it)
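
The -short option writes read pairs without the ordering juicer_tools expects, so the file has to be sorted before it is passed to juicer_tools by hand. The sketch below assumes the ordering described in the Juicer documentation for its "pre" command (by chromosome of the first read, then by chromosome of the second) and assumes the chromosome names sit in whitespace-separated columns 2 and 6. Neither assumption comes from HOMER itself, so check both against your juicer_tools version. The function name sort_short_pairs is hypothetical.

```shell
# Hypothetical helper (not part of HOMER): sort a "short format" pairs file
# by the chromosome columns.
#   $1 = unsorted file written by -short, $2 = sorted output file
# Assumption: chromosome names are in columns 2 and 6.
# LC_ALL=C keeps the ordering identical across machines.
sort_short_pairs() {
    LC_ALL=C sort -k2,2d -k6,6d "$1" > "$2"
}
```

This only reorders lines; it does not swap the two ends within a pair, so if juicer_tools also requires the first chromosome of each pair to sort before the second, that has to be handled separately.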

Can't figure something out? Questions, comments, concerns, or other feedback:
cbenner@ucsd.edu