SeqTrimOptions

Contain options to trim sequences based on specified criterion

Since R2023a

Description

A SeqTrimOptions object contains options to trim sequences based on a specified criterion. Use this object as the value of the Options property of the bioinfo.pipeline.block.SeqTrim block.

Creation

Description

optionsObj = bioinfo.pipeline.options.SeqTrimOptions creates a SeqTrimOptions object with default property values.

optionsObj = bioinfo.pipeline.options.SeqTrimOptions(Name=Value) sets properties using one or more name-value arguments. Name is the property name and Value is the property value. For example, ssOpt = bioinfo.pipeline.options.SeqTrimOptions(Threshold=[3 20]) specifies to trim each sequence when the number of bases with quality below 20 is greater than 3.
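
The two creation syntaxes, and the use of the object as the Options value of a SeqTrim block, might look like the following minimal sketch. It assumes Bioinformatics Toolbox R2023a or later is installed; the variable names defaultOpts, customOpts, and trimBlock are illustrative only.

```matlab
% Create an options object with default property values.
defaultOpts = bioinfo.pipeline.options.SeqTrimOptions;

% Create an options object with a custom threshold, as in the example above.
customOpts = bioinfo.pipeline.options.SeqTrimOptions(Threshold=[3 20]);

% Use the options object as the value of the Options property of a SeqTrim block.
trimBlock = bioinfo.pipeline.block.SeqTrim;
trimBlock.Options = customOpts;
```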

Properties

Base quality encoding format, specified as a character vector or string.

Criterion to trim sequences, specified as one of the following options. Specify only one trimming criterion per function call.

  • 'MaxNumberLowQualityBases' – applies a maximum threshold on the number of low-quality bases allowed before trimming a sequence starting at the 5' end.

  • 'MaxPercentLowQualityBases' – applies a maximum threshold on the percentage of low-quality bases allowed before trimming a sequence starting at the 5' end.

  • 'MeanQuality' – applies a minimum threshold on the running average base quality allowed before trimming a sequence starting at the 5' end.

  • 'BasePositions' – trims each sequence according to the base positions (first base and last base) starting at the 5' end.

  • 'Termini' – trims each sequence from either the 5' or 3' end or from both ends.

Use this name-value pair argument together with 'Threshold' to specify the appropriate threshold value. Depending on the trimming criterion, the corresponding value for 'Threshold' varies. See the 'Threshold' option for the default values.

Note

Sequences that become empty after trimming are saved in the output files as empty sequences. To remove empty sequences from files, use the seqfilter function with the 'MinLength' option set to 1.
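
As a hedged illustration of pairing a trimming criterion with its threshold, the sketch below sets 'Method' and 'Threshold' together. It assumes Bioinformatics Toolbox R2023a or later; the threshold values 20 and [5 10] are arbitrary examples, not recommendations.

```matlab
% Trim where the running average base quality drops below 20 (example value).
meanOpts = bioinfo.pipeline.options.SeqTrimOptions( ...
    Method='MeanQuality', Threshold=20);

% Trim 5 bases from the 5' end and 10 bases from the 3' end (example values).
terminiOpts = bioinfo.pipeline.options.SeqTrimOptions( ...
    Method='Termini', Threshold=[5 10]);
```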

Suffix to use in the output file name, specified as a character vector or string. It is inserted after the input file name and before the file extension. The default is '_trimmed'.
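
The naming rule can be pictured with plain MATLAB string operations. This is only an illustration of where the suffix lands in the name; the file name reads.fastq is hypothetical, and the sketch does not show how the block chooses the output folder.

```matlab
inputFile = "reads.fastq";      % hypothetical input file name
suffix    = "_trimmed";         % default suffix

% Split the name from the extension, then insert the suffix between them.
[~, name, ext] = fileparts(inputFile);
outputName = name + suffix + ext;
disp(outputName)
```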

Threshold value for the trimming criterion, specified as a scalar or vector. Use this name-value pair to define the threshold value for the trimming criterion specified by 'Method'.

Depending on the trimming criterion, the corresponding value for 'Threshold' can be a scalar or two-element vector. If you do not specify 'Threshold', then the function uses the default threshold value of the corresponding method. For each trimming criterion, the function uses the encoding format of the base quality specified by the 'Encoding' name-value pair argument.

Method: 'MaxNumberLowQualityBases'
Threshold: Two-element vector [V1 V2]. V1 is a nonnegative integer that specifies the maximum number of low-quality bases allowed before trimming. V2 specifies the minimum base quality. Any base with quality less than V2 is considered a low-quality base.
Default Threshold value: [0 10]

Method: 'MaxPercentLowQualityBases'
Threshold: Two-element vector [V1 V2]. V1 is a scalar between 0 and 100 that specifies the maximum percentage of low-quality bases allowed before trimming. V2 specifies the minimum base quality. Any base with quality less than V2 is considered a low-quality base.
Default Threshold value: [0 10]

Method: 'MeanQuality'
Threshold: Positive scalar that specifies the minimum threshold on the running average base quality allowed before trimming a sequence starting at the 5' end.
Default Threshold value: 0

Method: 'BasePositions'
Threshold: Two-element vector [V1 V2], where V1 and V2 are positive integers specifying the base positions to start trimming at the 5' end and 3' end, respectively. To trim only the 5' end of each sequence before position V1, use [V1 Inf]. To trim only the 3' end of each sequence after position V2, use [1 V2].
Default Threshold value: [1 Inf], that is, each sequence is left untrimmed.

Method: 'Termini'
Threshold: Two-element vector [V1 V2], where V1 and V2 are nonnegative integers specifying the number of bases to trim at the 5' end and the 3' end, respectively. To trim V1 bases at the 5' end only, use [V1 0]. To trim V2 bases at the 3' end only, use [0 V2].
Default Threshold value: [0 0], that is, each sequence is left untrimmed.
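
To make the difference between the 'BasePositions' and 'Termini' thresholds concrete, the following sketch mimics both with ordinary character-vector indexing. It is one reading of the descriptions above, not the toolbox implementation; the sequence and threshold values are made up.

```matlab
seq = 'ACGTACGTAC';                 % hypothetical 10-base read

% 'BasePositions' with [V1 V2] = [3 8]: keep bases at positions 3 through 8.
posThreshold = [3 8];
lastBase = min(posThreshold(2), numel(seq));    % also handles V2 = Inf
keptByPosition = seq(posThreshold(1):lastBase);

% 'Termini' with [V1 V2] = [2 3]: drop 2 bases at the 5' end and 3 at the 3' end.
endThreshold = [2 3];
keptByTermini = seq(endThreshold(1) + 1 : end - endThreshold(2));

disp(keptByPosition)
disp(keptByTermini)
```

In this reading, 'BasePositions' names the positions that survive, while 'Termini' names how many bases are removed from each end.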

Size of the sliding window to apply the trimming criterion to a sequence, specified as a positive integer. The size of the window corresponds to the number of bases that the function uses at one time to apply the criterion. Any given sequence is trimmed before the first base of the window that violates the given criterion.

The sliding window can be applied to the following methods:

  • 'MaxNumberLowQualityBases',

  • 'MaxPercentLowQualityBases', and

  • 'MeanQuality'.

Note

Sequences shorter than the size of the window are saved in the output file as empty sequences. To remove empty sequences from files, use the seqfilter function with the 'MinLength' option set to the value of 1.
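
The sliding-window rule can be sketched in plain MATLAB for the 'MeanQuality' criterion. This is an illustrative interpretation of the description above (trim before the first base of the first window that violates the criterion), not the toolbox's actual code; the quality scores, window size, and threshold are hypothetical.

```matlab
seq        = 'ACGTACGTAC';                      % hypothetical read
quality    = [30 32 31 28 12 10 9 30 31 29];    % hypothetical per-base scores
windowSize = 3;                                 % bases examined at one time
minMeanQuality = 20;                            % example 'MeanQuality' threshold

trimmed = seq;                                  % untrimmed if no window fails
if numel(seq) < windowSize
    % Reads shorter than the window come out empty, as the note describes.
    trimmed = '';
else
    for first = 1:(numel(seq) - windowSize + 1)
        window = quality(first:first + windowSize - 1);
        if mean(window) < minMeanQuality
            % Trim before the first base of the violating window.
            trimmed = seq(1:first - 1);
            break
        end
    end
end
disp(trimmed)
```

With these values, the first window whose mean falls below 20 starts at base 4, so only the first three bases are kept.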

Version History

Introduced in R2023a