int2nt

Convert nucleotide sequence from integer to letter representation

Syntax

SeqChar = int2nt(SeqInt)

SeqChar = int2nt(SeqInt, ...'Alphabet', AlphabetValue, ...)
SeqChar = int2nt(SeqInt, ...'Unknown', UnknownValue, ...)
SeqChar = int2nt(SeqInt, ...'Case', CaseValue, ...)

Input Arguments

SeqInt: Row vector of integers specifying a nucleotide sequence. For valid integers, see the table Mapping Nucleotide Integers to Letter Codes. Integers are arbitrarily assigned to IUB/IUPAC letters.
AlphabetValue: String specifying a nucleotide alphabet. Choices are:
  • 'DNA' (default) — Uses the symbols A, C, G, and T.

  • 'RNA' — Uses the symbols A, C, G, and U.

UnknownValue: Character to represent unknown nucleotides, that is, 0 or any integer ≥ 17. Choices are any character other than the nucleotide characters A, C, G, T, and U and the ambiguous nucleotide characters N, R, Y, K, M, S, W, B, D, H, and V. Default is *.
CaseValue: String specifying the case of the returned character string. Choices are 'upper' (default) or 'lower'.

Output Arguments

SeqChar: Nucleotide sequence specified by a character string of codes.

Description

SeqChar = int2nt(SeqInt) converts SeqInt, a row vector of integers specifying a nucleotide sequence, to SeqChar, a string of codes specifying the same nucleotide sequence. For valid codes, see the table Mapping Nucleotide Integers to Letter Codes.

Mapping Nucleotide Integers to Letter Codes

Nucleotide                               Integer    Code
Adenosine                                1          A
Cytidine                                 2          C
Guanine                                  3          G
Thymidine                                4          T
Uridine (if 'Alphabet' set to 'RNA')     4          U
Purine (A or G)                          5          R
Pyrimidine (T or C)                      6          Y
Keto (G or T)                            7          K
Amino (A or C)                           8          M
Strong interaction (3 H bonds) (G or C)  9          S
Weak interaction (2 H bonds) (A or T)    10         W
Not A (C or G or T)                      11         B
Not C (A or G or T)                      12         D
Not G (A or C or T)                      13         H
Not T or U (A or C or G)                 14         V
Any nucleotide (A or C or G or T or U)   15         N
Gap of indeterminate length              16         -
Unknown (any integer not in table)       0 or ≥ 17  * (default)
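
The table amounts to a lookup from integer to character. The following is a minimal sketch in plain MATLAB that reproduces the default mapping without calling the toolbox. It is an illustration only and makes no claim about how int2nt is implemented internally; the variable names codes, known, and seqChar are assumptions chosen for this example.

```matlab
% Illustrative lookup only; not the toolbox implementation of int2nt.
codes  = 'ACGTRYKMSWBDHVN-';           % position k holds the letter for integer k (1..16)
seqInt = [1 2 4 20 2 4 0 3 16];        % 20 and 0 are not in the table

known   = seqInt >= 1 & seqInt <= 16;  % logical mask of integers that appear in the table
seqChar = repmat('*', size(seqInt));   % start from the default unknown symbol
seqChar(known) = codes(seqInt(known)); % fill in the letters for known integers

disp(seqChar)                          % displays ACT*CT*G-
```

Indexing a character vector by the integer sequence keeps the sketch short; the mask handles 0 and values of 17 or more, which would otherwise cause an indexing error.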

SeqChar = int2nt(SeqInt, ...'PropertyName', PropertyValue, ...) calls int2nt with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:


SeqChar = int2nt(SeqInt, ...'Alphabet', AlphabetValue, ...) specifies a nucleotide alphabet. AlphabetValue can be 'DNA', which uses the symbols A, C, G, and T, or 'RNA', which uses the symbols A, C, G, and U. Default is 'DNA'.
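
As a short illustration of the 'Alphabet' property, the sketch below converts the same integer sequence with both alphabets. It assumes Bioinformatics Toolbox is installed; the variable names seqInt, dna, and rna are chosen for this example, and the expected results in the comments follow from the mapping table above.

```matlab
% Requires Bioinformatics Toolbox (provides int2nt).
seqInt = [1 2 4 3 2 4 1 3 2];

dna = int2nt(seqInt);                     % default alphabet: 4 maps to T, giving ACTGCTAGC
rna = int2nt(seqInt, 'Alphabet', 'RNA');  % RNA alphabet: 4 maps to U, giving ACUGCUAGC

disp(dna)
disp(rna)
```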

SeqChar = int2nt(SeqInt, ...'Unknown', UnknownValue, ...) specifies the character to represent unknown nucleotides, that is, 0 or any integer ≥ 17. UnknownValue can be any character other than the nucleotide characters A, C, G, T, and U and the ambiguous nucleotide characters N, R, Y, K, M, S, W, B, D, H, and V. Default is *.

SeqChar = int2nt(SeqInt, ...'Case', CaseValue, ...) specifies the case of the returned character string. CaseValue can be 'upper' (default) or 'lower'.
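
Properties can be combined in a single call. The following sketch, which assumes Bioinformatics Toolbox is installed and uses an illustrative variable name s, requests lowercase output and a custom symbol for unknown integers. Property names are given in mixed order to show that order does not matter.

```matlab
% Requires Bioinformatics Toolbox (provides int2nt).
% 20 and 0 are not in the mapping table, so they are shown as '?'.
s = int2nt([1 2 4 20 2 4 0 3 2], 'Case', 'lower', 'Unknown', '?');

disp(s)   % per the mapping table: act?ct?gc
```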

Examples

  • Convert a nucleotide sequence from integer to letter representation.

    s = int2nt([1 2 4 3 2 4 1 3 2])
    
    s =
    ACTGCTAGC
    
  • Convert a nucleotide sequence from integer to letter representation and define # as the symbol for unknown integers (here 20 and 40; 0 and any integer 17 or greater are treated as unknown).

    si = [1 2 4 20 2 4 40 3 2];
    s = int2nt(si, 'unknown', '#')
    
    s =
    ACT#CT#GC
    