genbankread
Read data from GenBank file
Description
reads a GenBank-formatted file, GenBankData = genbankread(File)File, and creates
GenBankData, a structure or array of structures, with the fields
corresponding to the GenBank® keywords.
sets the connection timeout (in seconds)
to read data from a remote file or URL.GenBankData = genbankread(File,
TimeOut=TimeOutValue)
Examples
Retrieve sequence information for the HEXA gene, store the data in a file, and then read into the MATLAB® software.
getgenbank("nm_000520",ToFile="TaySachs_Gene.txt") s = genbankread("TaySachs_Gene.txt")
s = struct with fields:
LocusName: 'NM_000520'
LocusSequenceLength: '4785'
LocusNumberofStrands: ''
LocusTopology: 'linear'
LocusMoleculeType: 'mRNA'
LocusGenBankDivision: 'PRI'
LocusModificationDate: '05-OCT-2024'
Definition: 'Homo sapiens hexosaminidase subunit alpha (HEXA), transcript variant 2, mRNA.'
Accession: 'NM_000520'
Version: 'NM_000520.6'
GI: ''
Project: []
DBLink: []
Keywords: 'RefSeq; MANE Select.'
Segment: []
Source: 'Homo sapiens (human)'
SourceOrganism: [4×65 char]
Reference: {[1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct]}
Comment: [46×66 char]
Features: [160×74 char]
CDS: [1×1 struct]
Sequence: 'ctcacgtggccagccccctccgagaggggagaccagcgggccatgacaagctccaggctttggttttcgctgctgctggcggcagcgttcgcaggacgggcgacggccctctggccctggcctcagaacttccaaacctccgaccagcgctacgtcctttacccgaacaactttcaattccagtacgatgtcagctcggccgcgcagcccggctgctcagtcctcgacgaggccttccagcgctatcgtgacctgcttttcggttccgggtcttggccccgtccttacctcacagggaaacggcatacactggagaagaatgtgttggttgtctctgtagtcacacctggatgtaaccagcttcctactttggagtcagtggagaattataccctgaccataaatgatgaccagtgtttactcctctctgagactgtctggggagctctccgaggtctggagacttttagccagcttgtttggaaatctgctgagggcacattctttatcaacaagactgagattgaggactttccccgctttcctcaccggggcttgctgttggatacatctcgccattacctgccactctctagcatcctggacactctggatgtcatggcgtacaataaattgaacgtgttccactggcatctggtagatgatccttccttcccatatgagagcttcacttttccagagctcatgagaaaggggtcctacaaccctgtcacccacatctacacagcacaggatgtgaaggaggtcattgaatacgcacggctccggggtatccgtgtgcttgcagagtttgacactcctggccacactttgtcctggggaccaggtatccctggattactgactccttgctactctgggtctgagccctctggcacctttggaccagtgaatcccagtctcaataatacctatgagttcatgagcacattcttcttagaagtcagctctgtcttcccagatttttatcttcatcttggaggagatgaggttgatttcacctgctggaagtccaacccagagatccaggactttatgaggaagaaaggcttcggtgaggacttcaagcagctggagtccttctacatccagacgctgctggacatcgtctcttcttatggcaagggctatgtggtgtggcaggaggtgtttgataataaagtaaagattcagccagacacaatcatacaggtgtggcgagaggatattccagtgaactatatgaaggagctggaactggtcaccaaggccggcttccgggcccttctctctgccccctggtacctgaaccgtatatcctatggccctgactggaaggatttctacatagtggaacccctggcatttgaaggtacccctgagcagaaggctctggtgattggtggagaggcttgtatgtggggagaatatgtggacaacacaaacctggtccccaggctctggcccagagcaggggctgttgccgaaaggctgtggagcaacaagttgacatctgacctgacatttgcctatgaacgtttgtcacacttccgctgtgaattgctgaggcgaggtgtccaggcccaacccctcaatgtaggcttctgtgagcaggagtttgaacagacctgagccccaggcaccgaggagggtgctggctgtaggtgaatggtagtggagccaggcttccactgcatcctggccaggggacggagccccttgccttcgtgccccttgcctgcgtgcccctgtgcttggagagaaaggggccggtgctggcgctcgcattcaataaagagtaatgtggcatttttctataataaacatggattacctgtgtttaaaaaaaaaagtgtgaatggcgttagggtaagggcacagccaggctggagtcagtgtctgcccctgaggtcttttaagttgagggctgggaatgaaacctatagcctttgtgctgttctgccttgcctgtgagctatgtcactcccctcccactcctgaccatattccagacacctgccctaatcctcagcctgctcacttcacttctgcattatatctccaaggcgttggtatatggaaaaagatgtaggggcttggaggtgttctggacagtggggagggctccagacccaacctggtcacagaagagcctctcccccatgcatactcatccacctccctcccctagagctattctcctttgggtttcttgctgcttcaattttatacaaccattatttaaatattattaaacacatattgttctctaggcactgtggtagtgggtttttttgttgtttttgtttttgagactgtctcaaaaactctgtcgcccaggctgacagtgcagtggcacaatcttggctcactgcagcctctgcctcctgggttcaagcgattctcgtgcatcagcctcctgagtaactggaattaataggcacgtgccaccatgtccatctaattcatatatatatattttttttttctgagacggagtctcactgtcacacaggctggagtgcagtggcacgatctcgactcactgcaagctccacctcctgggttcacgccattctcctgcctcagcctccccagtagctgggactacaggcgcccgccaccacgcccggctaattttttgtatttttagtagagatggggtttctccgtgttagccaggatggtctcgatctcctgacctcgtgatccgcccgccttggcctcccaaagtgctgggattacaggcgtgagccaccgcgcccggccgaattcatctatttttagtagagatggggtttcactatattggccaggctggtcttgaactcatgacctcagatgttcacttgtcttggcctcccaaagtgctgggattagaggcgtgagccaccgcacgcgggcctgtggtaaattgttgaatttgaaggactcagaggccctggtcaattccaaaataacgtaggcgacttccatccccctcctcccaaccattttcagcccaaagcatcttcgcagggaatggatggctgcgcggaggtgggcggtggctctggagagggtctttgcaggtgtgattttctctagaaggaaatgtctcgtcgtggacccagactgccccctcctggtttcagatgcagaagtgatactgtaagccagaggcgggggcagtaatgcatcgcagccattttaggtgaggatttccttggcggttatttgttaagttctttggctgggccctgggctggggtaacaatggacaggttccaggcatttttttcagaaagcttccagtgtagtggatacagaaacttcaggaaggcagggctgagaaggatctgagtaaaactcggtccttcaacaccatccttcagcccctgggtcatgttccttcgaggtcctggtgggaggtagacaagcctagccttgtgctgttcctgtaaggacagggtgggcattttctaccaacagaattcttggaattttcacacagcccagcctagccaagtccagggctatagcccagatacacaagttaaggtcccagcactggcacccaccacaggagcccccttacctctattacccagaagcttgtaggaggggtggtccgcagacaaggaccctgcacaggtgcgaccctgcttccctcctggtcataactttcatgttactattgcttgggataatgttaagtaaaaatagcagacactgagttttaagtctcaagtggatgaaggcagagatcgtgatacacttgagttaaagcagtagggttctgtcattttctattcctgttgtaaacattttctttaatgttattatttttaccactaaactaacgtggcctggtcacgactttcattggtaaagtgtgctgttcctcaccctccaccgttgctcctttggtccactgatcataagagcatttacctgaaggtcgtcagacctcgaatgccaacaggtcaactgcagtggcctgcagttaccacccagtctgttccaatgaacagaatcgctgttgccccaactcatctcccttcacctaggctgtaaattgaaagtcccacccctgagcggaacacaggccatcttgtgtgctgtgcaccaccagggggtggggaagtttccagactgacttcctggctccagtcatcctaggaaaagagttctccagtcgctccccacccccaccccttcccattccaggagtctattaaggaggcaaagcaggcctaacgggtatcaaagcaaaggagtgaatggagactgggagagtcttcaacctctcctctccttggtaggagctgaggctgcatgccaggtaccttcccttcgaggaatctaataaagctaggtcactggtgttttcaggtgcttctcaaaggattgccgtaggggtaggatatcaggatgtgggagcacaggtgccaccacagcactagtgatggagagtcattgcccctagacttctgggacagtgagactgtgaggaaagctgaaatgatactgggaaagggtgaaagaaaggatgtaggtggaatttatttagtattaatgtaggtacacataccttatggcaacattcctagcactctaaattctagatttgtatagtttctgtcaatatcttttgtaagcttaatcaatacagggcatgacaagtatgtgtcacatacttttttttccacgaagaaaaaaaataagtaggaattgggtgctttgtttatcaaaatttgtatttcctttataaataaactttgaaataaaggttgaaaattagta'
Display the source organism for this sequence.
s.SourceOrganism
ans = 4×65 char array
'Homo sapiens '
'Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;'
'Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; '
'Catarrhini; Hominidae; Homo. '
Input Arguments
GenBank-formatted file, specified as one of these values:
Character vector or string specifying a file name, a path and file name, or a URL pointing to a file. The referenced file is a GenBank-formatted file (ASCII text file). If you specify only a file name, that file must be on the MATLAB® search path or in the MATLAB Current Folder.
MATLAB character array or string vector that contains the text of a GenBank-formatted file.
You can use the getgenbank function with the
ToFile argument to retrieve sequence information from the
GenBank database and create an GenBank-formatted file.
Data Types: char | string
Connection timeout in seconds, specified as a positive number. The default value is 5. For details, see here.
Data Types: double
Output Arguments
Resulting data, returned as a MATLAB structure or array of structures containing fields corresponding to GenBank keywords.
When File contains multiple entries, each entry is stored as a
separate element in GenBankData. For a list of the GenBank keywords, see https://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html.
Version History
Introduced before R2006a
MATLAB Command
You clicked a link that corresponds to this MATLAB command:
Run the command by entering it in the MATLAB Command Window. Web browsers do not support MATLAB commands.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)