Parameters
The command-line interface is installed as excavate-ht.
Run the following in a terminal to print the parameter descriptions, which are also listed below.
.. code-block:: bash

   excavate-ht --help
   excavate-ht generate --help
   excavate-ht pair --help

EXCAVATE-HT Modes
EXCAVATE-HT has two modes: generate and pair.
excavate-ht generate
The generate mode creates a complete gRNA library from the supplied variant data.
Required arguments

``--vcf``
    Path to the VCF file. Comma-separated if more than one (e.g., ``cell_line.vcf.gz,1000genomes.vcf.gz``). If using a phased cell-line VCF, put that first. Required.

``--var_type``
    Variant type: ``cell-line`` or ``population``. Can specify multiple, comma-separated (e.g., ``cell-line,population1,population2``). Required.

``--chrom_fa``
    Path to the chromosome FASTA file for your locus of interest (e.g., ``chr1sequence.fasta``). Required.

``--locus``
    Genomic locus in format ``chr#:start-end``. Required.

``-o``, ``--output_dir``
    Output directory for results. Folder name if you are already in the excavate folder. Required.

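
As a rough illustration of how the required arguments fit together, the sketch below assembles a generate call. The file names, locus coordinates, and output folder are placeholders rather than values from this documentation, and ``--cas SpCas9`` is included only because a Cas protein or PAM configuration must also be given (see the next section). The command runs only if excavate-ht is found on the PATH; otherwise the assembled call is printed.

.. code-block:: bash

   # Placeholder inputs (hypothetical; substitute your own files and locus).
   vcf="cell_line.vcf.gz"
   chrom_fa="chr1sequence.fasta"
   locus="chr1:11989000-12003000"
   outdir="my_library"

   # Assemble the call as positional parameters so each argument stays one word.
   set -- excavate-ht generate \
       --vcf "$vcf" \
       --var_type cell-line \
       --chrom_fa "$chrom_fa" \
       --locus "$locus" \
       --output_dir "$outdir" \
       --cas SpCas9

   if command -v excavate-ht >/dev/null 2>&1; then
       "$@"
   else
       echo "excavate-ht not found; would run: $*"
   fi
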
Cas protein and PAM configuration
You must specify either ``--cas`` or both ``--pam-list`` and ``--orient``.

``--cas``
    Specify Cas protein. One of: ``SpCas9``, ``SpCas9_NG``, ``enAsCas12a``, or ``SaCas9``.

``--pam-list``
    PAM sequences (if not using a supported Cas protein). Comma-separated. Use IUPAC codes. Required if ``--cas`` is not specified.

``--orient``
    PAM orientation. Choices: ``3prime`` or ``5prime``. Required if ``--pam-list`` is specified.

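
The either/or rule above can be checked in a wrapper script before calling the tool. The function below is a minimal sketch, not part of EXCAVATE-HT: the name ``check_cas_or_pam`` is hypothetical, and it assumes that a supported ``--cas`` value alone is sufficient. How the tool behaves when both ``--cas`` and ``--pam-list`` are given is not documented here.

.. code-block:: bash

   # Return 0 if the Cas/PAM options satisfy the documented rule, 1 otherwise.
   # Arguments: value of --cas, value of --pam-list, value of --orient
   # (pass an empty string for any option that is not supplied).
   check_cas_or_pam() {
       cas=$1
       pam_list=$2
       orient=$3
       if [ -n "$cas" ]; then
           # Only the four documented Cas proteins are accepted.
           case $cas in
               SpCas9|SpCas9_NG|enAsCas12a|SaCas9) return 0 ;;
           esac
           return 1
       fi
       if [ -n "$pam_list" ] && [ -n "$orient" ]; then
           # --orient must be one of the two documented choices.
           case $orient in
               3prime|5prime) return 0 ;;
           esac
       fi
       return 1
   }

   check_cas_or_pam "SpCas9" "" "" && echo "ok: supported Cas protein"
   check_cas_or_pam "" "NGG" "3prime" && echo "ok: custom PAM with orientation"
   check_cas_or_pam "" "NGG" "" || echo "error: --orient is required with --pam-list"
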
Guide design parameters

``--af-threshold``
    Allele frequency threshold between 0 and 1. A buffer of 0.01 is subtracted from the AF threshold to account for rounded values in VCF files. Default: ``0.1``

``-g``, ``--guide-length``
    Guide length in base pairs. Default: ``20``

``-m``, ``--max_snppos_in_protospacer``
    Maximum distance (bp) of SNP from PAM sequence. Default: ``10``

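
To make the 0.01 buffer concrete, the snippet below computes the cutoff that results from a given ``--af-threshold``. The helper name ``effective_af`` is hypothetical. The snippet only reproduces the subtraction described above and makes no claim about whether the tool compares with "greater than" or "greater than or equal to". With the default of 0.1, the effective cutoff is 0.09.

.. code-block:: bash

   # Hypothetical helper: subtract the documented 0.01 buffer from a threshold.
   effective_af() {
       awk -v t="$1" 'BEGIN { printf "%g\n", t - 0.01 }'
   }

   effective_af 0.1     # default threshold
   effective_af 0.05
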
Off-target analysis

``--off-targets``
    Enable off-target analysis: counts exact matches genome-wide and 1-bp mismatches in the chromosome of interest. Uses Bowtie1. Flag (no value needed).

``--download-hg38-index``
    Download prebuilt hg38 Bowtie indexes (GRCh38_noalt_as) into ``<outdir>/bowtie_index`` and use them automatically. Flag (no value needed).

``--genome-index-prefix``
    Bowtie1 index prefix for the genome FASTA (e.g., ``/path/to/index/hg38_bt1``). If missing or index files are not found, EXCAVATE-HT will build it next to this prefix.

``--genome_fa``
    Path to the whole genome FASTA file for your organism. Required if building Bowtie indexes from scratch (when not using ``--download-hg38-index`` or ``--genome-index-prefix``).

``--chrom-index-prefix``
    Bowtie1 index prefix for the chromosome FASTA (e.g., ``/path/to/index/chr1_bt1``). If missing or index files are not found, EXCAVATE-HT will build it next to this prefix.

``--bowtie-threads``
    Number of threads for Bowtie1. Defaults to ``NSLOTS`` if set (HPC environments), otherwise 4.

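
The ``--bowtie-threads`` default can be mirrored in a job script when you want to pass the value explicitly. This is a sketch of equivalent shell logic, not the tool's own code, and the function name ``default_threads`` is hypothetical. Note that ``${NSLOTS:-4}`` also falls back to 4 when ``NSLOTS`` is set but empty, which may differ from how EXCAVATE-HT treats that case.

.. code-block:: bash

   # Use NSLOTS when the scheduler provides it, otherwise fall back to 4.
   default_threads() {
       echo "${NSLOTS:-4}"
   }

   threads=$(default_threads)
   echo "Bowtie1 threads: $threads"
   # The value could then be passed as: --bowtie-threads "$threads"
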
Output options

``--split-phased``
    Enable splitting of gRNA libraries by cell-line phasing. Flag (no value needed).

``--summary``
    Output a summary table for each gRNA library. Flag (no value needed).

``--per-vcf``
    Save single-gRNA libraries for each VCF file, split by allele. Flag (no value needed).

Pairing options

``--pairing-method``
    Enable pairing of gRNA to output a dual-guide library. Choices:

    - ``r``: pair all guides that target different SNPs together (default).
    - ``fp``: pair guides about a fixed point.
    - ``t``: tiled pairing (pair guides that target adjacent variants).

    Default: ``r``

``-f``, ``--fixed-points-list``
    One or more fixed points, comma-separated (genomic coordinate without chr#, e.g., ``11989251,12002042,...``) in your locus to pair guides around. Required when using ``--pairing-method fp``.

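
Because fixed points are expected to fall in your locus, a quick pre-flight check can catch coordinates that lie outside it. The function below is a sketch under stated assumptions: the name ``points_in_locus`` is hypothetical, the locus bounds are treated as inclusive, and the ``chr1`` locus in the usage line is a placeholder. Whether EXCAVATE-HT performs a similar check itself is not documented here.

.. code-block:: bash

   # Succeed only if every fixed point lies inside the locus.
   # Arguments: locus (chr#:start-end), comma-separated fixed points.
   points_in_locus() {
       range=${1#*:}        # strip the "chr#:" prefix, leaving "start-end"
       start=${range%-*}
       end=${range#*-}
       old_ifs=$IFS
       IFS=,                # split the second argument on commas
       for point in $2; do
           if [ "$point" -lt "$start" ] || [ "$point" -gt "$end" ]; then
               IFS=$old_ifs
               echo "fixed point $point is outside $1" >&2
               return 1
           fi
       done
       IFS=$old_ifs
       return 0
   }

   points_in_locus "chr1:11989000-12003000" "11989251,12002042" && echo "fixed points OK"
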
excavate-ht pair
The pair mode pairs an existing single-gRNA library to create a dual-guide library.
Required arguments

``-i``, ``--input-library``
    Path to input single-gRNA library file. Required.

``-o``, ``--output_dir``
    Output directory for results. Folder name if you are already in the excavate folder. Required.

Pairing options

``--pairing-method``
    Enable pairing of gRNA to output a dual-guide library. Choices:

    - ``r``: pair all guides that target different SNPs together (default).
    - ``fp``: pair guides about a fixed point.
    - ``t``: tiled pairing (pair guides that target adjacent variants).

    Default: ``r``

``-f``, ``--fixed-points-list``
    One or more fixed points, comma-separated (genomic coordinate without chr#, e.g., ``11989251,12002042,...``) in your locus to pair guides around. Required when using ``--pairing-method fp``.

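
A pair call using fixed-point pairing might look like the sketch below. The input library path and output folder are placeholders (the library file name and extension are assumptions, not documented values), and the two fixed points are taken from the example above. The command runs only if excavate-ht is found on the PATH; otherwise the assembled call is printed.

.. code-block:: bash

   # Placeholder paths (hypothetical; substitute your own).
   library="my_library/single_guide_library.txt"
   outdir="my_paired_library"

   # Assemble the call as positional parameters so each argument stays one word.
   set -- excavate-ht pair \
       --input-library "$library" \
       --output_dir "$outdir" \
       --pairing-method fp \
       --fixed-points-list 11989251,12002042

   if command -v excavate-ht >/dev/null 2>&1; then
       "$@"
   else
       echo "excavate-ht not found; would run: $*"
   fi
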