Bioinformatics pipeline block to map sequence reads to reference genome
The BwaMEM block enables you to map sequencing reads to a reference genome.
The block requires the BWA Support Package for Bioinformatics Toolbox™. If the support package is not installed, then a download link is provided. For details, see Bioinformatics Toolbox Software Support Packages.
b = bioinfo.pipeline.block.BwaMEM
b = bioinfo.pipeline.block.BwaMEM(…) also specifies additional options.
b = bioinfo.pipeline.block.BwaMEM(OutFilename=fileName) also specifies the output file name.
fileName — Output file name
string | character vector
Output file name, specified as a string or character vector. The block saves the mapping results to this file.
BWAMEMOptions | string | character vector
BwaMEM options, specified as a BWAMEMOptions object, string, or character vector.
If you specify a string or character vector, it must be in the bwa native syntax (prefixed by a dash).
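The two ways of supplying options described above can be sketched as follows. This is a minimal illustration, not a definitive usage: it assumes that the options value is accepted as the first positional argument of the constructor (the syntax line on this page is truncated), and it uses the bwa mem flags -C (append comments) and -h (XA tag threshold) only as examples of the native syntax.

```matlab
% Way 1: a BWAMEMOptions object with properties set by name.
opts = BWAMEMOptions;
opts.AppendReadCommentsToSAM = true;
opts.AlternativeHitsThreshold = [5 200];

% getCommand shows the object settings translated into bwa native syntax.
nativeSyntax = getCommand(opts);
disp(nativeSyntax)

% Assumption: the options value is passed as the first positional argument.
b1 = bioinfo.pipeline.block.BwaMEM(opts);

% Way 2: a string in bwa native syntax, each option prefixed by a dash.
b2 = bioinfo.pipeline.block.BwaMEM("-C -h 5,200");
```

The object form is easier to validate because each property is checked when you set it, while the string form is convenient when you already have a working bwa command line.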
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
The following list of arguments is a partial list. For the complete list, refer to the properties of the BWAMEMOptions object.
AlternativeHitsThreshold — Threshold for determining which hits receive XA tag in output SAM file
[5 200] (default) | nonnegative integer | two-element numeric vector
Threshold for determining which hits receive an XA tag in the output SAM file, specified as a nonnegative integer n or a two-element numeric vector [n m], where n and m must be nonnegative integers.
If a read has fewer than n hits with a score greater than 80% of the best score for that read, all hits receive an XA tag in the output SAM file.
When you also specify m, the software returns up to m hits if the hit list contains a hit to an ALT contig.
AppendReadCommentsToSAM — Flag to append FASTA or FASTQ comments to output SAM file
false (default) | true
Flag to append FASTA or FASTQ comments to the output SAM file, specified as true or false. The comments appear as text after a space in the file header.
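As a brief illustration of the name-value arguments described so far, the following sketch creates a block with a custom XA tag threshold and with read comments appended. The specific values are arbitrary and chosen only for demonstration.

```matlab
% Create a BwaMEM block with two of the name-value arguments listed above.
% The values [10 100] and true are illustrative, not recommended settings.
b = bioinfo.pipeline.block.BwaMEM( ...
    AlternativeHitsThreshold=[10 100], ...
    AppendReadCommentsToSAM=true);

% Display the block to review its properties.
disp(b)
```

Because the order of name-value pairs does not matter, swapping the two arguments produces an equivalent block.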
ErrorHandler — Function to handle errors from run method of block
Function to handle errors from the run method of the block, specified as a function handle. The handle specifies the function to call if the run method encounters an error within a pipeline. For the pipeline to continue after a block fails, ErrorHandler must return a structure that is compatible with the output ports of the block. The error handling function is called with the following two input arguments:
1. Structure with the following fields:
   Field        Description
   identifier   Identifier of the error that occurred
   message      Text of the error message
   index        Linear index indicating which block process failed in the parallel run. By default, the index is always 1 because there is only one run per block. For details on how block inputs can be split across different dimensions for multiple run calls, see Bioinformatics Pipeline SplitDimension.
2. Input structure passed to the run method when it failed.
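The following sketch shows one possible shape for an error handler that takes the two input arguments described above. The function name logAndContinue is hypothetical, and the placeholder value returned in the SAMFile field is an assumption: this page states only that the returned structure must be compatible with the block output ports, so choose a value that suits your downstream blocks.

```matlab
b = bioinfo.pipeline.block.BwaMEM;
b.ErrorHandler = @logAndContinue;

function out = logAndContinue(errInfo, inputs) %#ok<INUSD>
    % errInfo has the fields identifier, message, and index.
    % inputs is the input structure that was passed to the run method.
    fprintf("BwaMEM run %d failed: %s (%s)\n", ...
        errInfo.index, errInfo.message, errInfo.identifier);

    % Return a structure whose field names match the block output ports.
    % Assumption: an empty string is an acceptable placeholder for SAMFile.
    out = struct("SAMFile", "");
end
```

Returning a structure, rather than rethrowing the error, is what allows the pipeline to continue after the block fails.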
Inputs — Input ports
This property is read-only.
Input ports of the block, specified as a structure. The field names of the structure are the names of the block input ports, and the field values are bioinfo.pipeline.Input objects. These objects describe the input port behaviors. The input port names are the expected field names of the input structure that you pass to the block run method.
The Inputs structure has the following fields:
IndexBaseName: Base name of the reference index files. The index files are in the AMB, ANN, BWT, PAC, and SA file formats. For example, the base name of the index files can be "Dmel_chr4". This input is required.
Reads1File: Name of the FASTQ file for the first mate reads or single-end reads. For paired-end data, sequences in Reads1File must correspond read-for-read to sequences in Reads2File. This input is required.
Reads2File: Name of the FASTQ file for the second mate reads for paired-end data. This input is optional.
The default value for each of these inputs is a bioinfo.pipeline.datatypes.Unset object, which means that the input value is not set yet.
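To see the input ports on an actual block, you can list the field names of the Inputs property. This short sketch only inspects the block and does not run it.

```matlab
b = bioinfo.pipeline.block.BwaMEM;

% The field names of Inputs are the input port names.
portNames = fieldnames(b.Inputs);
disp(portNames)

% Each field value is an object that describes the port behavior.
disp(class(b.Inputs.Reads1File))
```

These port names are the ones you use when connecting blocks in a pipeline or when building the input structure for the run method.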
Outputs — Output ports
This property is read-only.
Output ports of the block, specified as a structure. The field names of the structure are the names of the block output ports, and the field values are bioinfo.pipeline.Output objects. These objects describe the output port behaviors.
The output structure returned by the block run method has field names that are the same as the output port names.
The Outputs structure has the following field:
SAMFile: Output SAM file that contains the mapping results.
BWAMEMOptions object (default)
BwaMEM options, specified as a BWAMEMOptions object. The default value is a default BWAMEMOptions object.
OutFilename — Output file name
"Aligned.sam" (default) | string
Output file name, specified as a string. By default, the block saves the mapping results to a file named Aligned.sam.
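For instance, to write the mapping results to a file other than Aligned.sam, you can set OutFilename when you create the block. The file name below is a hypothetical example.

```matlab
% Create a block that saves the mapping results to a custom file name.
% "Dmel_chr4_aligned.sam" is an illustrative name, not a required one.
b = bioinfo.pipeline.block.BwaMEM(OutFilename="Dmel_chr4_aligned.sam");

% Confirm the value stored in the property.
disp(b.OutFilename)
```

A distinct file name per block can help keep results apart when a pipeline contains more than one BwaMEM block.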
Map Reads to Reference Using BwaMEM
Map reads to the Drosophila chromosome 4 sequence using the BwaMEM block.
% Import the pipeline and block classes.
import bioinfo.pipeline.block.*
import bioinfo.pipeline.Pipeline
% Select the reference FASTA file and the reads file.
FC1 = FileChooser(which("Dmel_chr4.fa"));
FC2 = FileChooser(which("SRR6008575_10k_1.fq"));
% Create the indexing and mapping blocks.
BI = BwaIndex;
BM = BwaMEM;
% Build the pipeline and connect the block ports.
P = Pipeline;
addBlock(P,[FC1,FC2,BI,BM]);
connect(P,FC1,BI,["Files","ReferenceFASTAFile"]);
connect(P,BI,BM,["IndexBaseName","IndexBaseName"]);
connect(P,FC2,BM,["Files","Reads1File"]);
% Run the pipeline and get the results of the BwaMEM block.
run(P);
results(P,BM)
ans = struct with fields:
    SAMFile: [1×1 bioinfo.pipeline.datatypes.File]
Inspect SAMFile to see the location of the output file.
ans = "C:\PipelineResults\BwaMEM_1\1\Aligned.sam"
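The example above maps single-end reads. For paired-end data, the Reads2File input port also needs a connection. The following sketch extends the same pipeline under two assumptions: that a second mate file named SRR6008575_10k_2.fq is available on the MATLAB path (this file name does not appear on this page), and that the BWA support package is installed.

```matlab
import bioinfo.pipeline.block.*
import bioinfo.pipeline.Pipeline

% Assumed file names; replace them with your own reference and mate files.
refFile   = which("Dmel_chr4.fa");
mate1File = which("SRR6008575_10k_1.fq");
mate2File = which("SRR6008575_10k_2.fq");  % hypothetical second mate file

FCref = FileChooser(refFile);
FC1 = FileChooser(mate1File);
FC2 = FileChooser(mate2File);
BI = BwaIndex;
BM = BwaMEM;

P = Pipeline;
addBlock(P,[FCref,FC1,FC2,BI,BM]);
connect(P,FCref,BI,["Files","ReferenceFASTAFile"]);
connect(P,BI,BM,["IndexBaseName","IndexBaseName"]);
% Connect each mate file to its own input port.
connect(P,FC1,BM,["Files","Reads1File"]);
connect(P,FC2,BM,["Files","Reads2File"]);

run(P);
R = results(P,BM);
```

The sequences in the two mate files must correspond read-for-read, as noted in the description of the Reads1File input.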
Introduced in R2023a