readbytes

Read binary data from a file

MuPAD® notebooks will be removed in a future release. Use MATLAB® live scripts instead.

MATLAB live scripts support most MuPAD functionality, though there are some differences. For more information, see Convert MuPAD Notebooks to MATLAB Live Scripts.

Syntax

readbytes(filename | n, <m>, <format>, <BigEndian | LittleEndian>, <ReturnType = DOM_HFARRAY | DOM_LIST | [DOM_HFARRAY] | [DOM_HFARRAY, dim1, dim2, …]>)

Description

readbytes lets you read arbitrary files and interpret their contents as a sequence of numbers.

The results of readbytes depend on the interpretation of the binary data set by the format option. When reading a file, you can interpret it as a stream of Byte, SignedByte, Short, SignedShort, Word, SignedWord, Float or Double. These are standard formats that are used by many program packages to read data. See Example 1.

This function is particularly useful when you work on data provided by or intended for external programs. For example, you can use it to implement encryption or compression algorithms in MuPAD®. See Example 2.

You can specify the file directly by its name. If a file name is specified, readbytes opens and closes the file automatically. If READPATH has no value, readbytes interprets the file name as a pathname relative to the “working directory.” Absolute path names are processed by readbytes, too.

Note

The meaning of “working directory” depends on the operating system. On Microsoft® Windows® systems and on Mac OS X systems, the “working directory” is the folder where MuPAD is installed. On UNIX® systems, it is the current working directory in which MuPAD was started.

If a file name is specified, each call to readbytes opens the file at the beginning. If the file was opened via fopen, subsequent calls of readbytes with the corresponding file descriptor start at the point in the file that was reached by the last readbytes command. Hence, if you want to read a file by portions, you must open it with fopen and use the returned file descriptor instead of the filename. See Example 3.

Note

If you open the file by using fopen, be sure to pass the flag Raw to fopen. Otherwise, readbytes throws an error.

Note

If the number of bytes available to a readbytes call is not a multiple of the unit size of the specified format, the data are read up to the last complete number. The remaining bytes are ignored. See Example 4.

Be sure to read the data in the appropriate way. You need to know the format used by the program which created the file.

If readbytes is used with the option ReturnType = [DOM_HFARRAY, dim1, dim2, …], the return value is a DOM_HFARRAY of the appropriate size. Here dim1, dim2, … are positive integers that specify the sizes of the dimensions of the array. If the file contains fewer values than the array holds, or if the number of values to be read is limited, the elements that are not read are initialized to 0.0. Otherwise, exactly as many values are read as the array has elements. See Example 6.

If an array of type DOM_HFARRAY with complex numbers is written to a file, the real parts of all elements are written first, followed by the imaginary parts. Because readbytes can read only real values, reconstruct the complex array by reading the real parts first and the imaginary parts second. See Example 7.

Environment Interactions

The function readbytes is sensitive to the environment variable READPATH. First, readbytes searches for the file in the “working directory.” If the file cannot be found there, readbytes searches all paths in READPATH.

Examples

Example 1

Write a sequence of numbers to the file test.tst with the default settings. Then, load them back in:

writebytes("test.tst", [42, 17, 1, 3, 5, 7, 127, 250]):
readbytes("test.tst")

Read the same data with a different option. SignedByte interprets all values from 0 to 127 exactly as Byte does. Higher values x, however, are interpreted as x - 256. For example, 250 - 256 = -6:

readbytes("test.tst", SignedByte)

Short interprets two bytes as one number. Therefore, the eight written bytes are interpreted as four numbers. For example, the first 2 bytes yield 42 · 2^8 + 17 = 10769:

readbytes("test.tst", Short)

With the flag LittleEndian, the byte order is reversed. For example, the first 2 bytes now yield 17 · 2^8 + 42 = 4394:

readbytes("test.tst", Short, LittleEndian)

Word interprets four bytes as one number. Therefore, the eight written bytes give two numbers. The first 4 bytes yield 10769 · 2^16 + 259 = 705757443:

readbytes("test.tst", Word)

Double interprets eight bytes to represent one floating-point number. The interpretation is machine dependent and may be different for you:

readbytes("test.tst", Double)
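The arithmetic behind these interpretations can be checked outside MuPAD. The following is a minimal Python sketch (not MuPAD code) that decodes the same eight bytes with the standard struct module. It assumes that the integer formats correspond to plain 8-, 16-, and 32-bit binary integers as described on this page; the variable names are illustrative only.

```python
import struct

# The eight bytes written to test.tst in this example.
data = bytes([42, 17, 1, 3, 5, 7, 127, 250])

as_byte = list(data)                               # unsigned 8-bit values
as_signed_byte = list(struct.unpack("8b", data))   # signed 8-bit (two's complement)
as_short_be = list(struct.unpack(">4H", data))     # unsigned 16-bit, big-endian
as_short_le = list(struct.unpack("<4H", data))     # unsigned 16-bit, little-endian
as_word_be = list(struct.unpack(">2I", data))      # unsigned 32-bit, big-endian

print(as_signed_byte[-1])  # 250 - 256
print(as_short_be[0])      # 42 * 2**8 + 17
print(as_short_le[0])      # 17 * 2**8 + 42
print(as_word_be[0])       # 10769 * 2**16 + 259
```

The Double case is left out on purpose, because the floating-point interpretation is described above as machine dependent.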

Example 2

Use readbytes and writebytes to encrypt the file created in the previous example with a simple “Caesar type encoding”: Any integer x (a byte) is replaced by (x + 13) mod 256:

L := readbytes("test.tst"): 
L := map(L, x -> (x + 13 mod 256)):
writebytes("test.tst", L):

Knowing the encryption and its key, you can successfully decrypt the file:

L := readbytes("test.tst")

map(L, x -> (x - 13 mod 256))

delete L:
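As a cross-check of the modular arithmetic, here is a minimal Python sketch (not MuPAD code) of the same shift applied to the bytes from Example 1. The helper name shift and the constant KEY are hypothetical and chosen only for this illustration.

```python
KEY = 13  # the shift used in this example

def shift(values, key):
    """Shift every byte value by key, wrapping around at 256."""
    return [(x + key) % 256 for x in values]

original = [42, 17, 1, 3, 5, 7, 127, 250]
encrypted = shift(original, KEY)    # 250 + 13 wraps around to 7
decrypted = shift(encrypted, -KEY)  # shifting back restores the input

print(encrypted)
print(decrypted == original)
```

Because the result is always reduced modulo 256, every encrypted value stays within the range that the Byte format can store.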

Example 3

Use fopen to write and read a file in portions:

n := fopen("test.tst", Write, Raw):               
for i from 1 to 10 do writebytes(n, [i]) end_for: 
fclose(n):

Equivalently, you can write all data in one go:

n := fopen("test.tst", Write, Raw):               
writebytes(n, [i $ i = 1..10]):
fclose(n):

Read the data byte by byte:

n := fopen("test.tst", Read, Raw): 
readbytes(n, 1), readbytes(n, 1), readbytes(n, 1);
fclose(n):

The next command reads in portions of 5 bytes each:

n := fopen("test.tst", Read, Raw): 
readbytes(n, 5), readbytes(n, 5);
fclose(n):

delete n, i:

Example 4

Here is what happens if the number of bytes in the file is not a multiple of the unit size of the specified format. SignedShort uses 2 bytes per value and Float uses 4, so in both cases the trailing fifth byte (the value 11) is ignored:

writebytes("test.tst", [42, 17, 7, 9, 11], Byte):
readbytes("test.tst", SignedShort), 
readbytes("test.tst", Float)
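The truncation rule can be modeled in a few lines. This Python sketch (not MuPAD code) decodes only the complete units and drops the leftover bytes. The helper read_units is hypothetical, and big-endian byte order is assumed because BigEndian is the documented default.

```python
import struct

def read_units(data, code, size):
    """Decode as many complete units as fit into data; ignore leftover bytes."""
    count = len(data) // size
    return list(struct.unpack(">%d%s" % (count, code), data[:count * size]))

data = bytes([42, 17, 7, 9, 11])

shorts = read_units(data, "h", 2)  # two signed 16-bit values, 1 byte left over
floats = read_units(data, "f", 4)  # one 32-bit float, 1 byte left over

print(shorts)
print(len(floats))
```

The float value itself is not shown, since this page describes the floating-point format as platform dependent.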

Example 5

Specify byte ordering by using BigEndian and LittleEndian:

writebytes("test.tst", [129, 255, 145, 171, 191, 253], Byte):
L1 := readbytes("test.tst", Short, BigEndian)

L2 := readbytes("test.tst", Short, LittleEndian)

Look at the data in a binary representation (see numlib::g_adic for details). The effect of using LittleEndian instead of BigEndian is to exchange the first 8 bits and the last 8 bits of each number:

map(L1, numlib::g_adic, 2)

map(L2, numlib::g_adic, 2)

delete L1, L2:
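The byte exchange can also be verified numerically. The following Python sketch (not MuPAD code) decodes the six bytes of this example in both byte orders and confirms that swapping the two bytes of each big-endian value gives the little-endian value. The helper swap_bytes is hypothetical.

```python
import struct

data = bytes([129, 255, 145, 171, 191, 253])

big = struct.unpack(">3H", data)     # high byte first
little = struct.unpack("<3H", data)  # low byte first

def swap_bytes(v):
    """Exchange the upper and lower 8 bits of a 16-bit value."""
    return ((v & 0xFF) << 8) | (v >> 8)

for b, l in zip(big, little):
    # Print both values as 16-bit binary strings, most significant bit first.
    print(format(b, "016b"), format(l, "016b"))

print([swap_bytes(v) for v in big] == list(little))
```

Note that this sketch prints the most significant bit first; the digit order in the numlib::g_adic output may differ.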

Example 6

Read data from a file and create a DOM_HFARRAY with the data using the option ReturnType:

writebytes("test.tst", 
  [    0.2703,   12.8317, -33.1531, 9999.9948, 0.2662,  -14.3421, 
    1000.1801,    0.4521, -34.6787,  -67.3549, 0.6818,   13], Double):
readbytes("test.tst", ReturnType=[DOM_HFARRAY,2,6]);
readbytes("test.tst", ReturnType=[DOM_HFARRAY,2,3,2]);

hfarray(1..2, 1..3, 1..2, [0.2703, 12.8317, -33.1531, 9999.9948, 0.2662, -\
14.3421, 1000.1801, 0.4521, -34.6787, -67.3549, 0.6818, 13.0])

If the file contains, or you request, more values than the array can hold, only as many values as the array has elements are read:

readbytes("test.tst", ReturnType=[DOM_HFARRAY,2,4]);
readbytes("test.tst", 12, ReturnType=[DOM_HFARRAY,2,3]);

If fewer values are read than the array holds, the remaining elements are initialized with 0.0:

readbytes("test.tst", ReturnType=[DOM_HFARRAY,2,7]);
readbytes("test.tst", 4, ReturnType=[DOM_HFARRAY,2,6]);

If you read all the data from the file using the option ReturnType without specifying dimensions for the DOM_HFARRAY, a one-dimensional array of the right size is created:

 
readbytes("test.tst", ReturnType=DOM_HFARRAY)
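The sizing rules of this example (truncate surplus values, pad missing ones with 0.0) can be summarized in a small Python sketch (not MuPAD code). The helper fit_to_size is hypothetical and models only the number of elements; it does not reproduce how MuPAD arranges the values inside a multidimensional DOM_HFARRAY.

```python
def fit_to_size(values, dims):
    """Return a flat list with exactly prod(dims) entries.

    Surplus values are dropped; missing values are filled with 0.0.
    """
    size = 1
    for d in dims:
        size *= d
    kept = [float(v) for v in values[:size]]
    return kept + [0.0] * (size - len(kept))

# The twelve values written to test.tst in this example.
values = [0.2703, 12.8317, -33.1531, 9999.9948, 0.2662, -14.3421,
          1000.1801, 0.4521, -34.6787, -67.3549, 0.6818, 13]

print(len(fit_to_size(values, (2, 4))))      # more data than the array holds
print(fit_to_size(values, (2, 7))[-2:])      # array larger than the data
print(fit_to_size(values[:4], (2, 6))[3:6])  # only four values requested
```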

Example 7

Write a DOM_HFARRAY with complex numbers to a file and try to reconstruct it by reading the data.

A := hfarray(1..2, 1..3,
             [[2342.133 + 56*I, -342.56, PI + I],
              [           -3*E, I^2 + I,     13]]);
writebytes("test.tst", A);
fd := fopen("test.tst", Read, Raw):   
B := readbytes(fd, ReturnType = [DOM_HFARRAY, 2, 3]);
C := readbytes(fd, ReturnType = [DOM_HFARRAY, 2, 3]);
bool(A = B + C*I);
fclose(fd):

delete A, B, C, fd:
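The "real parts first, imaginary parts second" layout can be illustrated with a Python sketch (not MuPAD code). It assumes 8-byte big-endian doubles purely for illustration, since this page describes the actual floating-point format as platform dependent, and it uses made-up sample values rather than the array A above.

```python
import struct

# Made-up complex sample values (not the entries of A).
values = [2342.133 + 56j, -342.56 + 0j, 3.0 + 1j,
          -8.0 + 0j, -1.0 + 1j, 13.0 + 0j]
n = len(values)
fmt = ">%dd" % n  # n big-endian doubles (an assumption for this sketch)

# Write all real parts, then all imaginary parts.
blob = (struct.pack(fmt, *[z.real for z in values]) +
        struct.pack(fmt, *[z.imag for z in values]))

# Read the two halves back one after the other and recombine them.
real = struct.unpack_from(fmt, blob, 0)
imag = struct.unpack_from(fmt, blob, 8 * n)
restored = [complex(r, i) for r, i in zip(real, imag)]

print(restored == values)
```

The two unpack_from calls play the role of the two consecutive readbytes calls on the same file descriptor: the second one starts where the first one stopped.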

Parameters

filename

The name of a file: a character string

n

A file descriptor provided by fopen: a positive integer. The file must have been opened using the fopen flag Raw.

m

The number of values to be read: a positive integer.

format

The format of binary data, specified as Byte, SignedByte, Short, SignedShort, Word, SignedWord, Float, or Double.

Options

Byte, SignedByte, Short, SignedShort, SignedWord, Word, Double, Float

The format of the binary data. The default format is Byte.

A byte is an 8-bit binary number. Therefore, a byte can have 2^8 different values. For Byte, these are the integers from 0 to 255. For SignedByte, they are the integers from -128 to 127.

With Byte, the data are read/written in 8-bit blocks, interpreted as unsigned bytes. When writing, the numbers are checked for being in the range from 0 to 255.

With SignedByte, the data are read or written using the two's complement representation.

Byte is the default format.

A “short” is a 16-bit binary number (2 bytes). Therefore, a “short” can have 2^16 different values. For Short, these are the integers from 0 to 65535. For SignedShort, they are the integers from -32768 to 32767.

The semantics of Short or SignedShort is analogous to that of Byte or SignedByte, respectively.

A “word” is a 32-bit binary number (4 bytes). Therefore, a “word” can have 2^32 different values. For Word, these are the integers from 0 to 4294967295. For SignedWord, they are the integers from -2147483648 to 2147483647.

The semantics of Word or SignedWord is analogous to that of Byte or SignedByte, respectively.

A “float” is a 32-bit representation of a real number (4 bytes). A “double” is a 64-bit representation of a real number (8 bytes).

Note

Floating-point and double-precision values are read/written in the format of the machine/operating system MuPAD is currently running on. Therefore, the results may differ between different platforms.

Binary files containing floating-point numbers are, in general, not portable to other platforms.

See the flags BigEndian and LittleEndian for details on the byte ordering.

See Example 1 for an overview of the different format options.

BigEndian, LittleEndian

The byte ordering: either BigEndian or LittleEndian. The default ordering is BigEndian.

BigEndian and LittleEndian specify the order used to arrange the bytes for Short, SignedShort, Word, SignedWord, Float, and Double.

For all formats, the data are written in 8-bit blocks (bytes). This also includes the formats where a unit is longer than one byte (all formats but Byte and SignedByte). With BigEndian, the bytes with the most significant bits (“high bits”) are written first. With LittleEndian, the bytes with the least significant bits are written first.

If, for example, Short is selected, there are 16 bits to be written. If you pass BigEndian, first the byte with the bits for 2^15 to 2^8 and then the byte with the bits for 2^7 to 2^0 are written. If you specify LittleEndian, the order of the bytes is reversed.

BigEndian and LittleEndian have no effect if the formats Byte or SignedByte are specified.

BigEndian is the default byte order.

See Example 5 for the effects of BigEndian and LittleEndian.

ReturnType

Option, specified as ReturnType = DOM_HFARRAY | DOM_LIST | [DOM_HFARRAY] | [DOM_HFARRAY, dim1, dim2, …], that sets the type of the return value.

If set to DOM_LIST, the return value is a list which contains the read data.

If set to DOM_HFARRAY, the return value is a one-dimensional array which contains the read data.

If set to [DOM_HFARRAY, dim1, dim2, …], the return value is a (multidimensional) array, and dim1, dim2, … are positive integers that specify the sizes of the dimensions of the array.

Return Values

A list of MuPAD numbers (either integers or floating-point numbers) or an array of hardware floating-point values of type DOM_HFARRAY. Its type depends on the setting of the option ReturnType.