Converting File Formats for Hi-C Analysis with HOMER
Quick Reference
# Optional: if you have juicer_tools installed, create a *.hic file
# to visualize with Juicebox (the output file will be placed inside
# the tag directory):
tagDir2hicFile.pl HicExp1TagDir/ -juicer auto -genome hg38 -p 10
Creating *.hic files to visualize with Juicebox
Surfing through your Hi-C data with Juicebox is a liberating
experience. It requires a *.hic file, a binary format designed
specifically for the Juicer/Juicebox tools. HOMER provides a
script, tagDir2hicFile.pl, that automates generating this file
from a HOMER Hi-C tag directory. The script requires that
"juicer_tools" be installed and available on the executable PATH
(juicer_tools is an alias provided to run the juicer.jar Java
program). Alternatively, you can run your original FASTQ files
through the Juicer pipeline. In general, the program works like
this:
tagDir2hicFile.pl <HiC Tag Directory> -juicer <outputFilename.hic> -genome <genome version> -p <# CPUs>
tagDir2hicFile.pl HicExp1TagDir/ -juicer auto -genome hg38 -p 10
This command produces a *.hic file. If "-juicer auto" is
specified, the file is created inside the tag directory. The
file can then be used with the Juicer/Juicebox family of tools.
If the "-juicer <filename>" option is omitted, the command
instead writes a file in Hi-C summary format to stdout.
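As a quick sanity check on the stdout output, the sketch below counts read pairs whose two ends fall on the same chromosome versus different chromosomes. It assumes the tab-separated column layout listed in the usage text (id, chr1, pos1, strand1, chr2, pos2, strand2). The function name summarize_hic_summary and the file name HicExp1.summary.txt are hypothetical and not part of HOMER.

```shell
# Hypothetical helper (not part of HOMER): count intra- vs inter-chromosomal
# read pairs in a "HiC summary" file. Assumes 7 tab-separated columns:
#   id  chr1  pos1  strand1  chr2  pos2  strand2
# Reads the named file(s), or stdin when no file is given.
summarize_hic_summary() {
    awk -F'\t' '
        NF >= 7 { if ($2 == $5) cis++; else trans++ }   # compare chr1 with chr2
        END     { printf "cis\t%d\ntrans\t%d\n", cis, trans }
    ' "$@"
}

# Typical use (assumes HOMER is installed and HicExp1TagDir/ exists):
#   tagDir2hicFile.pl HicExp1TagDir/ > HicExp1.summary.txt
#   summarize_hic_summary HicExp1.summary.txt
```

Lines with fewer than seven fields are skipped, so stray blank lines do not distort the counts.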
Command line options for tagDir2hicFile.pl
Usage:
tagDir2hicFile.pl <tag directory> [options]
    By default, this program will output a file in "HiC summary" format to stdout:
        id<tab>chr1<tab>pos1<tab>strand1<tab>chr2<tab>pos2<tab>strand2
    Options below can be set to help output a *.hic file for use with juicebox/juicer
    Options (most are for use with juicer):
        -juicer <filename.hic> (create *.hic file with juicer, "-juicer auto" places
            file in tagdir)
        -genome <genome> (genome is passed on to juicer_tools - for a standard genome,
            e.g. hg38, mm10, etc., it's probably best to specify the genome code if
            juicer_tools can recognize it; otherwise specify the path to a chrom.sizes
            file instead of the genome code)
        -juicerExe <"command to run juicer_tools"> (executable for running juicer_tools,
            by default assumes "juicer_tools" is in the executable PATH)
        -juicerOpt <"juicer options"> (command line options to pass to juicer,
            use quotes "...")
        -p <#> (number of CPUs to use during sort command for juicer file creation,
            default: 1)
        -short <filename> (output read pairs in "short format" for processing with
            juicer, but don't run juicer_tools. This file will not be sorted the way
            juicer wants it)
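When juicer_tools is not on the PATH, the -juicerExe option can point HOMER at the jar directly. The sketch below picks a value for that option. The function name juicer_exe, the JUICER_JAR variable, and the default jar file name are hypothetical placeholders, so adjust them to match your installation.

```shell
# Hypothetical helper (not part of HOMER): choose a value for -juicerExe.
# Prints "juicer_tools" if that command is on the PATH; otherwise falls back
# to running the jar with java. JUICER_JAR and its default are placeholders.
juicer_exe() {
    if command -v juicer_tools >/dev/null 2>&1; then
        printf '%s\n' "juicer_tools"
    else
        printf 'java -jar %s\n' "${JUICER_JAR:-juicer_tools.jar}"
    fi
}

# Typical use (assumes HOMER is installed and HicExp1TagDir/ exists):
#   tagDir2hicFile.pl HicExp1TagDir/ -juicer auto -genome hg38 -p 10 \
#       -juicerExe "$(juicer_exe)"
```

Quoting "$(juicer_exe)" keeps the whole command as a single argument, which matches the quoted form shown for -juicerExe in the usage text.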