assembleTemplate is a required command that initiates the assembly of the loaded sequences using the specified template as a reference.
Example:
XNG script used in the “clustering” step of the de novo transcriptome RNA-seq workflow:
merSize: 25
minNewClusterSize: 5
minSingleMergeClusterSize: 7
minMultiMergeClusterSize: 7
minMultiMergeIgnoreFactor: (currently not used by default)
minClusterSizeToOutput: 100
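As a rough illustration of how the minMultiMergeClusterSize value in the script above is applied (see its description in the table below), the following Python sketch classifies a single k-mer that is overlapped by several clusters. The function name `classify_kmer`, the returned labels, and the inclusive `>=` comparison are assumptions made for this example; they are not part of XNG. The depth-similarity check governed by minMultiMergeIgnoreFactor is deliberately left out.

```python
def classify_kmer(depth_by_cluster, min_multi_merge_cluster_size=7):
    """Classify one k-mer that is overlapped by two or more clusters.

    depth_by_cluster maps a (hypothetical) cluster name to the number of
    reads, i.e. the depth, that the cluster contributes at this k-mer.
    Returns a (label, significant_cluster_names) tuple.
    """
    # Assumption: a cluster is "significant" when its depth is at least the
    # threshold. The documentation does not say whether this is inclusive.
    significant = sorted(
        name
        for name, depth in depth_by_cluster.items()
        if depth >= min_multi_merge_cluster_size
    )

    if len(significant) >= 3:
        # Three or more significant clusters: the k-mer is "noisy" and is
        # reported as a multi-cluster link that was not merged.
        return "noisy_not_merged", significant
    if len(significant) == 2:
        # Two significant clusters: they may be linked and scaffolded,
        # subject to the depth-similarity check (not modeled here).
        return "link_candidate", significant
    if len(significant) == 1:
        # One significant cluster: unassigned reads at this k-mer are
        # merged directly into it.
        return "merge_unassigned_into", significant
    return "no_significant_cluster", significant


if __name__ == "__main__":
    for depths in (
        {"c1": 12, "c2": 9, "c3": 8},
        {"c1": 12, "c2": 9, "c3": 2},
        {"c1": 12, "c2": 3},
    ):
        print(depths, "->", classify_kmer(depths))
```

The default of 7 mirrors the minMultiMergeClusterSize value in the example script; pass a different threshold to see how the classification changes.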
Parameter | Description | Allowed values (defaults underlined) |
---|---|---|
alignmentCutoff | Used in the “clustering” step of the de novo transcriptome RNA-seq workflow. | [number] Default = 200 |
assemble | Specifies whether to use the part of the query that matches the contaminant sequence(s), the part that doesn’t match, or both. | [ matchContam / noMatchContam / all ] |
assemblyInfo | Contains information about the assembly. | [text string] |
assemblyInfoAlt | Contains pairs of keys and values which will be written to the -0.assemblyInfo file. | |
autoTrim | Specifies whether mismatching ends of reads should automatically be trimmed. | [ true / false] |
boneyardAssembly | Specifies whether sequences not used in the original or incremental XNG assemblies should be added to the assembly project by the SNG assembler. This command pertains only to reference-guided assemblies with gap closure. By default, during this type of assembly, the XNG assembler first finds structural variations (SVs) then splits the contig after each SV. Elements of this process can be modified using this command. (Note: “Boneyard” is a term for sequences that were not assigned to any contig). | [ true / false] |
combineDuplicateSeqs | Specifies whether the duplicate reads will be clustered. | [ true / false] |
contaminant | Partitions the query data by running an additional mer-match (layout) against the specified contaminant sequence(s). A full assembly is then run using the part of the query that either matches or does not match the contaminant sequence(s). This parameter can be used to remove reads originating from one or more organisms that may also have been present in the query data set (e.g., reads from human DNA present in a metagenomic sample from the human gut). Properties for contaminant: file: [directory/filename enclosed in quotes] The file with contaminant sequences. assembleContam: [matchContam/noMatchContam/all] merLayoutMin: [number] unassembled: [directory/filename enclosed in quotes] The file containing no contaminant reads. | [directory/filename enclosed in quotes] |
dbSNPTable | (Intended for internal use only). | [directory/filename enclosed in quotes] |
delayAlignInserts | Turns the delaying of reads that cause inserts on or off. ‘True’ means that gap-causing reads will be delayed: reads causing the fewest inserts are added before those causing more inserts (the length of the inserts is not considered). | [true / false] Defaults: true for named read technologies; false for ‘Other’ read technologies |
deleteIntermediates | Specifies whether intermediate files are saved or deleted. These files can be large with large-scale projects. | [true / false / none / all / notTemplateMer] |
directoryMer | Specifies the path and directory where both the template and query data mer files will be stored. Alternatively, separate directories for the template and query mer files can be specified using the parameters below. If no directory is specified, the mer file will be created in the directory containing the sequence data. | [directory/filename enclosed in quotes] |
directoryQueryMer | (required) Specifies the path and directory where the query mer file will be stored. | [directory/filename enclosed in quotes] |
directoryTemplateMer | (required) Specifies the path and directory where the template mer file will be stored. | [directory/filename enclosed in quotes] |
filterDeepLayout | (optional) Specifies that XNG remove superfluous sequences in areas of deep coverage. Wizard equivalent: Using ‘true’ is equivalent to selecting the Limit all deep coverage regions radio button from the Alignment tab. This tab is accessed from the Assembly Options screen by pressing the Advanced Options button. | [ true / false ] Set to ‘false’ by default, except for projects involving miRNA or microbial genomes, where it is set to ‘true.’ |
filterDeepLayoutOrganelle | (optional) Specifies that XNG remove superfluous sequences in areas of deep coverage. Wizard equivalent: Using ‘true’ is equivalent to selecting Advanced Assembly Options > Alignment tab > Only limit deep coverage regions for Mitochondria and Chloroplasts radio button. | [ true / false ] Set to ‘false’ by default, except for projects involving a mitochondrial or chloroplast template (i.e., those with a short name of ‘MT’, ‘M’, ‘CHL’, or ‘chloro’), where it is set to ‘true.’ |
forceFullForwardAlign | Start the alignment at the 5’ end of the sequence. | [ true / false ] |
forceMake | Specifies whether new intermediate mer files will be created. A value of ‘false’ means that existing valid intermediate files will be used. | [ true / false / query / hit / layout] |
format | Specifies the format of the alignment output file. If ‘none’ is entered, the assembly is run to include the alignment phase, but no alignment output is generated. This parameter can be used to remove reads from a contaminant source. | [ BAM / SQD / NONE / NONE_align/Aux_align] |
gap5Prime | Put the gap on the 5’ side of the sequence. | [ true / false ] |
gapPenalty | The penalty for opening or extending a gap during an alignment. This penalty is deducted from the pairwise score used to calculate match percentage. A high gap penalty suppresses gapping, while a low value promotes gapping. | [number] Default = 30 for most workflows, 50 for the de novo transcriptome RNA-seq workflow. |
gapExtensionPenalty | Used in the “clustering” step of the de novo transcriptome RNA-seq workflow. | [number] Default = 5 |
geneticCode | This parameter specifies the genetic code to use with a reference sequence. | [filepath/standard Lasergene genetic code file name] |
hits | (required) Specifies the path and name of the hit file. Incomplete paths will be appended to the default directory. | [directory/filename enclosed in quotes] |
increaseRunGapPen | This parameter is a flag to increase the gap open penalty in HP runs. | [ true / false ] |
layout | (required) Specifies the path and name of the layout file. Incomplete paths will be appended to the default directory. | [directory/filename enclosed in quotes] |
layoutAlign | Specifies that a pairwise alignment should be performed at the layout phase in order to pick the best position for a given read. | [ true / false ] |
layoutMaxTemplateGap | The maximal number of gaps introduced into the alignment used during layout. | [number] |
layoutRSRange | The maximal Register Shift difference used while building the layout. | [number] |
layoutType | Specifies how reads are to be laid out. | [ unique / once / multiple / multipleAll ] |
matchScore | The score for a base match during an alignment. This score contributes to the pairwise score used to calculate match percentage. Increasing the matchScore value allows for longer or more frequent gaps, thus forcing bases that match to be assembled together. | [number] Default = 10 |
MaxGap | The theoretical maximum length of a gap that could be inserted. In practice, the maximum gap size will usually be about half of this value. | [number from 0-99] Default = 6 for most workflows, 30 for the de novo transcriptome RNA-seq workflow |
maxMergeSize | When linking clusters into a scaffold, only link them together if the overall number of reads in the scaffold would not exceed this threshold. Used in the “clustering” step of the de novo transcriptome RNA-seq workflow. | |
maxNCnt | (optional) This parameter removes sequential reads of the IUPAC ambiguity code ‘N’ that are greater than or equal to the number specified. Use of this parameter may help in assemblies whose reads contain large clusters of spurious N’s. | [integer] |
maxSecondaryTrimLength | During alignment, a read can be trimmed from both ends. This parameter defines the longest allowable length for the smaller of the two trimmed ends. | [number] |
maxSeqs | Specifies the maximum number of query sequences to add to an assembly. Use of this command can speed up assembly. | [number] |
merCntThresh | Minimum number of mers needed in order to be recorded in the mer file. | [number] |
merLayoutMin | Specifies the minimum length (in bases) of at least one stretch of matching mers used to identify matches between the reference and query data. The minimum value is equal to the mer size. The maximum value is the read length, which would require the entire read to be an exact match. For example, with a merSize of 19 and a merLayoutMin of 21, at least one stretch of three consecutive mers in a read would have to match for the read to be included in the layout. | [number from 11-1000] Default = 25 |
merMinimizer | (Intended for internal use only) | [number] |
merSize, merLength or matchSize | (required) Specifies the length (in bases) of mers used to identify matches between the reference and query data. | [number] |
merSkip | (Intended for internal use only) Specifies the number of positions to ignore or “skip” when creating the template mer file. Normally, mers are only skipped in the query (see merSkipQuery, below). The first and last mer of every read are always included. Increasing the value reduces the size of the intermediate files as well as the overall assembly time. However, larger values can also reduce the number of reads included in the assembly, especially with short read data. 0 = do not skip; 2 = skip every second base; 3 = skip every third base; etc. | [number] Default = 0 |
merSkipQuery | Specifies the number of positions to ignore or “skip” when creating the query mer file. The first and last mer of every read are always included. Increasing the value reduces the size of the intermediate files as well as the overall assembly time. However, larger values can also reduce the number of reads included in the assembly, especially with short read data. 0 = do not skip; 2 = skip every second base; 3 = skip every third base; etc. | [number] Default = 0 |
method | Defines how to handle splits in the assembly. normal: normal assembly method; splitOnly: only reads which have been split will be included in the assembly; noSplit: no reads will be split. | [normal/splitOnly/noSplit] |
minAlignedLength | Specifies the minimum number of bases that must align after trimming for a read to be included in the assembly. | [number from 11-999] Default = 25 for most workflows, 50 for the de novo transcriptome RNA-seq workflow. |
minClusterSizeToOutput | Threshold for the number of reads that a cluster must contain in order for the cluster to be passed along to SNG for assembly in the next step of the program. Used in the “clustering” step of the de novo transcriptome RNA-seq workflow. Note that this command is present only for the clusterParam block of the rnaAssemble command. | [number] |
minMatchPercent | The minimum percentage of matches in an overlap required to join two sequences in the same contig. | [number] Default = 93 for most workflows, 60 for the de novo transcriptome RNA-seq workflow. |
minMultiMergeClusterSize | When two or more clusters overlap the same k-mer, the minimum number of reads (depth) required at that k-mer for a cluster to be considered significant. If three or more clusters exceed this threshold, the k-mer is considered “noisy” and a potential false join, and will not be merged. This is reported as a “multi-cluster link that was not merged”. If two significant clusters overlap and have similar enough depth, the clusters are considered linked and are scaffolded together. Otherwise, if only one cluster is significant, all reads at that k-mer which have no assigned cluster are merged directly into it as described for the minSingleMergeClusterSize option. This parameter is used in the “clustering” step of the de novo transcriptome RNA-seq workflow. Note that this command is present only for the clusterParam block of the rnaAssemble command. | [number] |
minMultiMergeIgnoreFactor | When two or more clusters overlap the same k-mer and may be linked, their depths must be within this ratio of one another. Used in the “clustering” step of the de novo transcriptome RNA-seq workflow. Note that this command is present only for the clusterParam block of the rnaAssemble command. | [number] |
minSeqsPerTemplate | Minimum number of sequences sufficient to build the layout or alignment. | [number] |
minSingleMergeClusterSize | The minimum number of reads (depth) matching an existing cluster at a single k-mer required to extend that cluster by immediately adding all new reads for that k-mer to the cluster. Used in the “clustering” step of the de novo transcriptome RNA-seq workflow. Note that this command is present only for the clusterParam block of the rnaAssemble command. | [number] |
minNewClusterSize | Minimum number of matching reads at a single k-mer (i.e., “depth”) required to create a new cluster. Used in the “clustering” step of the de novo transcriptome RNA-seq workflow. Note that this command is present only for the clusterParam block of the rnaAssemble command. | [number] |
mismatchPenalty | The penalty for a base mismatch during an alignment. This penalty is deducted from the pairwise score used to calculate match percentage. | [number] Default = 20 |
noSexChromosomes | Disables special handling of sex chromosomes. | [ true / false ] |
noSVPairSort | Specifies whether to turn off the calculation of pairs for structural variations. This may potentially reduce XNG assembly time. | [ true / false ] |
onePackage | Specifies whether an assembly containing multiple reference sequences should be bundled into a single .assembly package. If ‘false’ is entered, one .assembly package is created per contig. | [ true / false ] |
openInSeqman | (optional) Specifies whether the completed assembly should immediately be launched in SeqMan. | [ true / false] |
output | (required) Specifies the path and directory of the output files. Incomplete paths are appended to the default directory. | [directory/filename enclosed in quotes] |
pairDist | (Intended for internal use only) | [true/ false ] |
pickTemplate | Defines the number of templates from which to choose, and finds the template that is the best match for the input sequence. | [number] |
placeHit | (Intended for internal use only) | [ true / false ] |
probe | (Intended for internal use only) | [number] |
query | (required) Specifies the directory and file name(s) of the query data to be assembled. A folder with one or more data files can also be used in place of individual file names. Properties for query: file: [directory/filename enclosed in quotes] Specifies the directory and file/folder. isPair: [true/false] Specifies whether the query files contain paired end data. minDist: [number] (required if isPair is ‘true’) Specifies the minimum expected distance in bases between paired end reads. Default is 0. maxDist: [number] (required if isPair is ‘true’) Specifies the maximum expected distance in bases between paired end reads. Defaults are 750 for Illumina, 4500 for 454 and Sanger, 7500 for Other, and user-defined for Ion Torrent. seqTech: [unknown|IonTorrent||IlluminaLongReads|454|PacBio|normalScore|Other] Specifies the offset to be used when converting compressed quality scores into numerical values. These are the offsets used for the technology specified: Note 1: For 454, quality scores for homopolymeric runs of ≥ 2 are oriented from 5’ to 3’ on the top strand. Note 2: If possible, the data type of unknown data is determined automatically based on the first data file. pairTech: [unknown|LucigenRsaI|LucigenBfaI|Rsa1|Bfa1|Custom] pairLinker: [string] groupName: [string] The name of a group this file belongs to. Used for running multiple samples in one file. sex: [unknown|female|male] trim: [ true / false ] Specifies whether vector trimming needs to be applied to the reads. sngTrim: Contains parameters for fast vector trimming (see the SNG command trimVector). scan: [ true / false ] Specifies whether reads need to be scanned for contaminants. contaminantScan: Contains the assembleTemplate command with the contaminant file used as a template and the parameters directoryTemplateMer, hits, layout, output, unassembled, results, format, merSize, ignorePolyMers and deleteIntermediates. The format parameter has the value none_ALIGN. Example: query: {{file: “/data/home/proj/Illumina_s_5_1.txt”} {file: “/data/home/proj/Illumina_s_5_2.txt”} isPair: true minDist: 400 maxDist: 700 seqTech: Illumina} | [directory/filename enclosed in quotes] |
recordSplitsOnly | Functional only when used in the same program as splitTemplateContigs or recordStructVariations (both described below). Specifies whether or not to turn off contig splitting while still recording SVs for later inclusion in the Structural Variation Report. | [ true / false ] |
recordStructVariations | Specifies under which circumstances structural variations (SVs) should be calculated and recorded. 0|false = Don’t calculate SVs; 1|true = Calculate SVs at zero coverage; 2 = Calculate SVs at insertions and deletions; 3 = Calculate SVs at zero coverage and at insertions | [ integer between 0-3 / true / false ] Default = 2 |
removeDuplicateSeqs | Completely removes clonal reads after the alignment phase of assembly. Clonal reads, where the endpoints of both reads in a pair match those in another pair, are usually the result of PCR artifacts. If ‘true,’ the reads will not be scored, and will not be included in SNP calculations. Setting this parameter to ‘true’ may substantially increase the time needed for assembly. | [ true / false ] |
removeUniqueInserts | Removes reads that cause an insert which no other read would create. This parameter is only enabled when delayAlignInserts (described under the assembleTemplate command) is true. | [ true / false ] Defaults: true for Illumina and Ion Torrent read technologies; false for all other types. |
repeatPenaltyScale | Indicates the quality penalty (using the Phred scale) to use for a read which places in two locations identically. Higher repeat counts are further penalized relative to this on a log2 scale such that repeats placing in four locations have a double penalty, in eight locations have a triple penalty, and so on. This penalty is applied to a ceiling of Phred score 30 if the other methods are disabled or have a higher score. | [number] Default = 8 |
repeatThreshMax | Specifies the maximum number of occurrences of a mer in the reference sequence(s) for it to be considered repeated. Mers exceeding this number will not be used for identifying matches. | [number from 1-10000] Default = 100 |
repeatThreshMin | Specifies the minimum number of occurrences of a mer in the reference sequence(s) for it to be considered repeated. Mers occurring fewer times than this number will not be used for identifying matches. | [number] |
reportFiles | Defines the kind of report file to be generated. perProject: [ true / false ] Generate a per project report. perTemplate: [ true / false ] Generate a per template report. removeInteral: [ true / false ] Remove intermediate reports. | |
repeatmermax | Threshold number of occurrences in a data set for a mer to be considered “repeated.” Used in the “clustering” step of the de novo transcriptome RNA-seq workflow. | |
results | Specifies the path and name of the result summary file. This file contains a compilation of assembly statistics and uses the extension fileSize.txt. Incomplete paths will be appended to the default directory. | [directory/filename enclosed in quotes] |
saveUnSplitAssembly | Specifies whether XNG should save both the normal assembly output, [filename].assembly, and the unsplit intermediate assembly, [filename]-noSplit.assembly. The latter file contains SVs but no SNPs, and can be used to validate splits in the final assembly. | [true / false ] |
sex | Specifies the sex of the subject, used for read placement and SNP calling. See How sex chromosomes are handled for details. | [ male / female / unknown ] |
showCDSVariant | Specifies whether or not XNG should show all variants of a CDS feature contacted by a SNP. The version number for the CDS variant will then appear in brackets when viewed in the SNP report in SeqMan Pro. | [ true / false ] |
sngConvertOptions | (Intended for internal use only) | [text string] |
snp | Specifies whether or not a SNP detection pass of the gapped alignment should be made during the assembly. | [ true / false ] |
snp_checkStrandedness | Specifies whether or not the strand that each read comes from is considered in the SNP calculation. This is ignored by the simple SNP calling method (used when genome ploidy is “Heterogeneous”). | [ true / false ] |
snp_combineSubs | This parameter is used to coalesce adjacent substitutions. | [ true / false ] |
snp_excludeBases3p | (internal use only) This parameter causes the specified number of bases from the 3’ end of each read to not be considered during variant calling. | [integer] |
snp_excludeBases5p | (internal use only) This parameter causes the specified number of bases from the 5’ end of each read to not be considered during variant calling. | [integer] |
snp_excludeBasesEdge | This parameter causes the specified number of bases from both the 5’ and 3’ ends of each read to not be considered during variant calling. | [integer] For the simple SNP calling method (used when genome ploidy is “Heterogeneous”), the default is 5. For the Bayesian SNP calling methods (used when genome ploidy is Diploid or Haploid), the default is 0. |
snp_limitEndPos | Specifies the 3’ most coordinate of the specified template from which to stop calculating SNPs. | [number between 1 and the length of the template] |
snp_limitStartPos | Specifies the 5’ most coordinate of the specified template from which to begin calculating SNPs. A value between 1 and the length of the template must be entered. | [number] Default = 1 |
snp_limitTemplateID | Specifies a single template ID for which to calculate SNPs. | [number] Default = 0 |
snp_logEndPos | Specifies the 3’ most coordinate of the specified template from which to stop storing a detailed log of SNP information. A value between 1 and the length of the template must be entered. | [number] Default = 1 |
snp_logLevel | Specifies the level of detailed logging to store in the “shared” project directory as “SNP.log.” Level 0 specifies that no log will be stored. Level 1 stores detailed information on the SNPs that were called, level 2 also logs columns where the preliminary filter passed but the final filtering failed, and level 3 logs all columns. This is ignored by the simple SNP calling method (used when genome ploidy is “Heterogeneous”). | [whole number from 0-3] Default = 0 |
snp_logStartPos | Specifies the 5’ most coordinate of the specified template from which to begin storing a detailed log of SNP information. A value between 1 and the length of the template must be entered. | [number] Default = 1 |
snp_logTemplateID | Specifies a single template from which to store a detailed log of SNP information. | [number] Default = 0 |
snp_maxRun | Specifies the maximum length of a homopolymeric run for an indel to be considered during variant calling. For example, a snp_maxRun of ‘5’ will allow a portion of sequence up to 5 bases in length to be called as a SNP. | [integer] Defaults are 3 for 454 and Ion Torrent read technologies; 5 for all others. |
snp_maxStrandBias | Strand Bias (SB) for a SNP is the bias for the SNP appearing on one strand versus the other. It is measured relative to the strand bias in the assembly at the location of the SNP. For example, in a column with 60 forward reads and 40 reverse reads, 6 SNP bases on the forward strand and 4 on the reverse strand would be unbiased. SB is given by the formula: SB = |SNP% f - SNP% r| / Total SNP%, where SNP% f and SNP% r are the percentages of reads containing the variant on the forward (top) and reverse (bottom) strands, respectively, and Total SNP% is the total percentage of reads containing the variant. Because SB is calculated from an absolute value, it is never negative. The effect of different SB thresholds is as follows. -1: A negative number cannot be generated by the equation above. However, you may use ‘-1’ in the script to turn off the snp_maxStrandBias parameter. In the wizard, SeqMan NGen indicates the parameter is turned off by making Maximum strand bias (see Variants tab) either blank or absent. 0: Perfectly balanced (unbiased) strands. Reads with variants are present on both strands, and variants appear equally on both strands. Between 0 and 1, not inclusive: As the value approaches 1, variants are called at positions where the variant-containing reads are increasingly unbalanced between the strands. 1: All variant-containing reads are on a single strand. Note: In cases where all the reads covering a base are on one strand only, the SNP% of the other strand cannot be calculated (due to a “division by zero” error). These positions will not be removed by the snp_maxStrandBias filter. To remove these variants, instead set snp_minStrandCov to ≥ 1. Example: In a homozygous case (SNP% = 100) with a depth of 100, where 75 variant-containing reads are on the top strand (75%) and 25 variant-containing reads are on the bottom strand (25%), the strand bias would equal (75 - 25)/100 = 0.5. | [number] Defaults for the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”) are 0.8 for 454 and Ion Torrent read technologies; not shown (blank) for all others. Defaults for the simple SNP calling method (used when genome ploidy is “Heterogeneous”) are 0.25 for all read technologies. |
snp_minHomopolDelDepth | Specifies the minimum read depth required to call a deletion in a homopolymeric run. | [integer] Default = 0 |
snp_minHomopolDelFrac | Specifies the minimum fraction of reads required to call a deletion in a homopolymeric run. | [integer] Default = 0 |
snp_minHomopolInsDepth | Specifies the minimum read depth required to call an insertion in a homopolymeric run. | [integer] Default = 0 |
snp_minHomopolInsFrac | Specifies the minimum fraction of reads required to call an insertion in a homopolymeric run. | [integer] Default = 0 |
snp_minPctToScore | Specifies the minimum percentage of reads in a column which must differ from the reference in order to score the column. For the simple SNP calling method (used when genome ploidy is “Heterogeneous”), this is the only criterion used to call a SNP. For the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”), this is a filter applied before the other parameters. | [number from 0-1] Default = 0.05 |
snp_minProbNonrefToCall | Specifies the minimum probability of a SNP column which is required to call a SNP, expressed as a number from 0 to 1. The probabilities of all genotypes other than Homozygous Reference are totaled and checked against this number. This is the final filter applied during the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”) and is ignored by the simple SNP calling method (used when genome ploidy is “Heterogeneous”). | [number from 0-1] Default = 0.1, requiring a minimum 10% change. |
snp_minStrandCov | Specifies the minimum number of reads from each strand required to call a variant at a given position. | [integer] In the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”), the default is 0. In the simple SNP calling method (used when genome ploidy is “Heterogeneous”), the default is 5. |
snp_minVariantDepthToScore | (required if “snp” is true) Specifies the minimum depth required for a specific base (or deletion) in a column before it is considered usable for SNP calling. This is the second filter applied during the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”) and is ignored by the simple SNP calling method (used when genome ploidy is “Heterogeneous”). | [number from 0-100] Default = 2 |
snp_minWeight | Called “Minimum base quality score” in the SeqMan NGen wizard, this parameter specifies the minimum quality score for a base to be considered in the SNP calculation. | [number] In the simple SNP calling method (used when genome ploidy is “Heterogeneous”), the default is 20. In the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”), the default is 5. |
snp_reportUserMissing | Specifies what kind of positions to put in the missingUser file, including one or more of the following: dbSNP = dbSNP Pos; user = in user VCF SNP file; zeroCoverage = include zero coverage regions; cosmic = in COSMIC database; allcaptured = include all positions in capture regions; captured = include only positions in capture regions. Example: snp_reportUserMissing: [user allcaptured captured] [kParamTypeStrFixedVocab] | |
snp_runVar | Uses a Bayesian probabilistic model to exclude heterozygous insertions and deletions in homopolymeric runs. Intended for use with Ion Torrent data. | [ true / false ] Defaults: true for 454 and Ion Torrent read technologies; false for all others. |
snp_showAllFeatures | Specifies whether XNG should count SNPs multiple times if the SNP contacts different versions (variants) of a CDS feature. | [ true / false ] |
snp_writeExtended | Specifies whether the additional values produced by the Haploid or Diploid SNP calculation methods are included in the SNP table. Wizard equivalent: Advanced Options > Alignment tab > Trim to targeted regions | [ true / false ] |
snpMethod | Specifies the SNP detection method to use. Simple produces a count of each type of base in the column and calculates the percent of non-reference bases. Haploid uses a Bayesian statistical model to calculate a probability score that the position contains a polymorphism and give a quality score for the base called at that position. Diploid uses a Bayesian statistical model to calculate a probability score that the position contains a polymorphism and give a quality score for the base(s) called at that position. Based on the scores, it also calls the genotype at each position. | [ simple / haploid / diploid ] |
splitTemplateContigs | Specifies under which circumstances contigs should be cut after a templated assembly. Any split contigs will be grouped into scaffolds with a defined position to allow for easy sorting when the project is viewed in SeqMan Pro. This command pertains only to reference-guided assemblies with gap closure. By default, during this type of assembly, the XNG assembler first finds structural variations (SVs) then splits the contig after each SV. Elements of this process can be modified using this command. 0|false = Don’t split; 1|true = Split at locations with zero coverage; 2 = Split at insertions and deletions; 3 = Split at zero coverage and at insertions | [ integer between 0-3 / true / false ] Default = 2 |
template | (required) Specifies the directory and file name of the reference sequence file. A folder with one or more reference sequence files can also be used in place of individual file names. Each entry must be enclosed by brackets. If more than one template entry is used, the list must also be enclosed by an additional set of brackets. Properties for template: file: [directory/filename enclosed in quotes] Specifies the directory and file/folder. feature: [directory/filename enclosed in quotes] (optional) Specifies the directory and file name for annotated features when the reference sequence and feature annotations are in separate files. transcriptKind: [both|identified|novel] If the .Transcriptome package is used as a template, defines which transcripts will be used as a template. userSNP: [directory/filename enclosed in quotes] exomeCapture: file: [directory/filename enclosed in quotes] The BED file name. track: [string] (optional) The region of interest. merMask: [ true / false ] Specifies whether mers from outside of the capture region should be excluded from assembly. Examples for template: Sequence and annotation in one file: AssembleTemplate template: {{file: “/data/home/proj/MG1655.gbk”} {file: “/data/home/proj/W3110.gbk”}} Sequence and annotation in separate files: AssembleTemplate template: {file: “/Library/ABC_proj/references/MG1655.fas” feature: “/Library/ABC_proj/references/MG1655.gff”} | [directory/filename enclosed in quotes] |
templateHitCntThresh | (Intended for internal use only) | [number] |
trimToTargetRegions | Controls whether reads are trimmed, by default, to the boundaries of the targeted regions, as defined by the .bed or manifest file. The default of true indicates that the reads are trimmed to the stated boundaries. If conditions are not met, the SeqMan NGen wizard does not change this parameter to ‘false,’ but instead omits it from the script. The parameter status is only shown in the script for control workflows. Wizard equivalent: Trim to targeted regions in the Alignment tab. This tab is accessed from the Assembly Options screen by pressing the Advanced Options button. | [ true / false ] |
unassembled | | [directory/filename enclosed in quotes] |
verify | | [ true / false ] |
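
The strand-bias formula given under snp_maxStrandBias can be sketched in a few lines of Python. This is only an illustration of the arithmetic stated in the table, not the XNG implementation: the function names `strand_bias` and `passes_strand_bias_filter` are hypothetical, the three percentages are taken as inputs rather than derived from read counts, and the inclusive `<=` comparison against the threshold is an assumption.

```python
def strand_bias(pct_fwd, pct_rev, pct_total):
    """Return SB = |SNP% f - SNP% r| / Total SNP%.

    Returns None when SB is undefined: a strand percentage is missing
    (passed as None, e.g. no reads on that strand) or Total SNP% is zero.
    """
    if pct_fwd is None or pct_rev is None or not pct_total:
        return None
    return abs(pct_fwd - pct_rev) / pct_total


def passes_strand_bias_filter(sb, max_strand_bias):
    """Decide whether a position survives the snp_maxStrandBias filter."""
    # A negative threshold such as -1 turns the filter off.
    if max_strand_bias < 0:
        return True
    # Positions where SB cannot be calculated are not removed by this
    # filter; the table suggests snp_minStrandCov for those instead.
    if sb is None:
        return True
    # Assumption: the threshold is inclusive.
    return sb <= max_strand_bias


if __name__ == "__main__":
    # Worked example from the table: (75 - 25) / 100
    sb = strand_bias(75, 25, 100)
    print("SB =", sb)
    print("kept at 0.25:", passes_strand_bias_filter(sb, 0.25))
    print("kept at 0.8:", passes_strand_bias_filter(sb, 0.8))
    print("kept at -1:", passes_strand_bias_filter(sb, -1))
```

The thresholds 0.25 and 0.8 used in the demonstration are the documented defaults for the simple and Bayesian SNP calling methods, respectively.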