Bioinformatics pipeline block to select files or URLs
A FileChooser block enables you to select files or download files from URLs.
fileNames — Names of files or URLs
string | character vector | ...
Names of files or URLs, specified as a string, character vector, string vector, or cell array of character vectors. You can include a full file path. The block does not use the MATLAB® path.
ErrorHandler — Function to handle errors from run method
Function to handle errors from the run method of the block, specified as a function handle. It specifies the function to call if the run method encounters an error within a pipeline. For the pipeline to continue after a block fails, ErrorHandler must return a structure compatible with the output ports of the block. The error handling function is called with the following two input arguments:
- Structure with the following fields:
  Field        Description
  identifier   Identifier of the error that occurred
  message      Text of the error message
  index        Linear index indicating which block process failed in the parallel run. By default, the index is always 1 because there is only one run per block. For details on how block inputs can be split across different dimensions for multiple run calls, see Bioinformatics Pipeline SplitDimension.
- Input structure passed to the run method when it failed.
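As an illustration of the two-argument contract described above, here is a minimal sketch of an error handler. The function name reportAndContinue is hypothetical, and returning an empty string array in the Files field is an assumption about what counts as compatible with the output port; adjust it to whatever downstream blocks expect.

```matlab
import bioinfo.pipeline.block.FileChooser

% Sample file shipped with the toolbox (same file used in the example below).
FC = FileChooser(which("SRR005164_1_50.fastq"));

% Assign the handler as a function handle taking the two documented inputs.
FC.ErrorHandler = @reportAndContinue;

function out = reportAndContinue(errInfo, inputs) %#ok<INUSD>
    % errInfo carries the documented fields: identifier, message, index.
    warning("FileChooser:runFailed", "Run %d failed (%s): %s", ...
        errInfo.index, errInfo.identifier, errInfo.message);
    % Return a structure whose field names match the block output ports.
    % ASSUMPTION: an empty string array is an acceptable Files value.
    out = struct;
    out.Files = string.empty;
end
```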
Files — File names or URLs
empty string array (default) | string | character vector | ...
Names of files or URLs, specified as a string, character vector, string vector, or cell array of character vectors.
Files is always appended after PathRoot to determine the file or URL destinations. Files can include https as a scheme if PathRoot is empty.
Inputs — Input ports
This property is read-only.
Input ports of the block, specified as a structure. The field names of the structure are the names of the block input ports and the field values are bioinfo.pipeline.Input objects. These objects describe the input port behaviors. The input port names are the expected field names of the input structure that you pass in for the block run method.
Outputs — Output ports
This property is read-only.
Output ports of the block, specified as a structure. The field names of the structure are the names of the block output ports and the field values are bioinfo.pipeline.Output objects. These objects describe the output port behaviors. The output structure returned by the block run method has field names that are the same as the output port names.
The Outputs structure has the following field: Files.
Options — Parameters for obtaining data from web server
weboptions object (default)
Parameters for obtaining data from a web server, specified as a weboptions object. This property is used only when the scheme is https, or when Files contains a URL. The default value is a weboptions object with default property values.
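For instance, a slow server might need a longer timeout than the weboptions default of 5 seconds. The following is a sketch only: the URL is hypothetical, and it assumes the Options property can be assigned directly after the block is constructed.

```matlab
import bioinfo.pipeline.block.FileChooser

% Hypothetical URL, for illustration only.
FC = FileChooser("https://www.example.com/data/reads.fastq");

% Allow up to 60 seconds for the server to respond.
FC.Options = weboptions(Timeout=60);
```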
PathRoot — Root path for Files
empty string array (default) | string | character vector
Root path for all files in the Files property, specified as a string or character vector. PathRoot is always prefixed to Files to determine the file or URL destinations. PathRoot can include https as a scheme. If PathRoot is empty, Files can include the scheme instead.
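To make the relationship between PathRoot and Files concrete, the sketch below splits the location of a toolbox sample file into a folder and a file name. It assumes that FileChooser can be constructed with no arguments (consistent with the empty default for Files) and that both properties can be set by dot notation.

```matlab
import bioinfo.pipeline.block.FileChooser

% Split the full path of a toolbox sample file into folder, name, and extension.
[dataFolder, name, ext] = fileparts(which("SRR005164_1_50.fastq"));

FC = FileChooser;              % ASSUMPTION: no-argument construction is allowed
FC.PathRoot = dataFolder;      % root folder shared by all entries in Files
FC.Files = [name ext];         % file name appended after PathRoot at run time
```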
Create a Simple Pipeline to Plot Sequence Quality Data
Import the Pipeline and block objects needed for the example.
import bioinfo.pipeline.Pipeline
import bioinfo.pipeline.block.*
Create a pipeline.
qcpipeline = Pipeline;
Select an input FASTQ file using a FileChooser block.
fastqfile = FileChooser(which("SRR005164_1_50.fastq"));
Create a SeqFilter block.
sequencefilter = SeqFilter;
Define the filtering threshold value. Specifically, filter out sequences with a total of more than 10 low-quality bases, where a base is considered a low-quality base if its quality score is less than 20.
sequencefilter.Options.Threshold = [10 20];
Add the blocks to the pipeline.
Connect the output of the first block to the input of the second block. To do so, you need to first check the input and output port names of the corresponding blocks.
View the Outputs (port of the first block) and Inputs (port of the second block).
fastqfile.Outputs
ans = struct with fields:
    Files: [1×1 bioinfo.pipeline.Output]
sequencefilter.Inputs
ans = struct with fields:
    FASTQFiles: [1×1 bioinfo.pipeline.Input]
Connect the Files output port of the fastqfile block to the FASTQFiles port of the sequencefilter block.
Next, create a UserFunction block that calls the seqqcplot function to plot the quality data of the filtered sequence data. In this case, inputFile is the required argument for the seqqcplot function. The required argument name can be anything as long as it is a valid variable name.
qcplot = UserFunction("seqqcplot",RequiredArguments="inputFile",OutputArguments="figureHandle");
Alternatively, you can use dot notation to set up your UserFunction block.
qcplot = UserFunction;
qcplot.RequiredArguments = "inputFile";
qcplot.Function = "seqqcplot";
qcplot.OutputArguments = "figureHandle";
Add the block.
Check the port names of the sequencefilter block and the qcplot block.
sequencefilter.Outputs
ans = struct with fields:
    FilteredFASTQFiles: [1×1 bioinfo.pipeline.Output]
         NumFilteredIn: [1×1 bioinfo.pipeline.Output]
        NumFilteredOut: [1×1 bioinfo.pipeline.Output]
qcplot.Inputs
ans = struct with fields:
    inputFile: [1×1 bioinfo.pipeline.Input]
Connect the FilteredFASTQFiles port of the sequencefilter block to the inputFile port of the qcplot block.
Run the pipeline to plot the sequence quality data.
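The commands for the add, connect, and run steps are not shown in this walkthrough. The following sketch assembles the whole example in one script. It assumes the addBlock, connect, and run functions of the Pipeline object; in particular, the connect call form (source block, target block, then an output and input port name pair) is an assumption that this page does not confirm.

```matlab
import bioinfo.pipeline.Pipeline
import bioinfo.pipeline.block.*

% Create the pipeline and the three blocks described above.
qcpipeline = Pipeline;
fastqfile = FileChooser(which("SRR005164_1_50.fastq"));
sequencefilter = SeqFilter;
sequencefilter.Options.Threshold = [10 20];
qcplot = UserFunction("seqqcplot",RequiredArguments="inputFile",OutputArguments="figureHandle");

% Add each block to the pipeline, one call per block.
addBlock(qcpipeline, fastqfile);
addBlock(qcpipeline, sequencefilter);
addBlock(qcpipeline, qcplot);

% Connect ports: each pair is [output port of source, input port of target].
connect(qcpipeline, fastqfile, sequencefilter, ["Files","FASTQFiles"]);
connect(qcpipeline, sequencefilter, qcplot, ["FilteredFASTQFiles","inputFile"]);

% Run all blocks in dependency order; seqqcplot opens the quality plots.
run(qcpipeline);
```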
Select Input File Using FileChooser
Use a FileChooser block to select an input file provided with the toolbox.
import bioinfo.pipeline.block.FileChooser
import bioinfo.pipeline.Pipeline
FC = FileChooser(which("SRR6008575_10k_1.fq"));
P = Pipeline;
addBlock(P, FC);
run(P);
R = results(P, FC)
R = struct with fields:
    Files: [1×1 bioinfo.pipeline.datatypes.File]
Inspect R.Files to see the location of the file.
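The command that produces the output shown next is missing from this page. A minimal sketch follows, assuming the unwrap function can be applied to the File object stored in R.Files; unwrap is not named on this page, so treat it as an assumption. The setup lines are repeated so the snippet runs on its own.

```matlab
import bioinfo.pipeline.block.FileChooser
import bioinfo.pipeline.Pipeline

% Same setup as above: one FileChooser block in an otherwise empty pipeline.
FC = FileChooser(which("SRR6008575_10k_1.fq"));
P = Pipeline;
addBlock(P, FC);
run(P);
R = results(P, FC);

% ASSUMPTION: unwrap extracts the underlying path from the File object.
fileLocation = unwrap(R.Files)
```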
ans = "C:\Program Files\MATLAB\R2023a\toolbox\bioinfo\bioinfodata\SRR6008575_10k_1.fq"
Introduced in R2023a