
Use Python Pandas DataFrames in MATLAB

Since R2024a

Pandas is a widely used Python library for working with tabular data. The library defines a class called DataFrame.

This example shows how to use a pandas DataFrame (py.pandas.DataFrame) in MATLAB.

  • You can convert a pandas DataFrame to a MATLAB table using the table function. You can also convert a pandas DataFrame to a MATLAB timetable using the timetable function.

  • You can convert a MATLAB table or timetable to a pandas DataFrame using the py.pandas.DataFrame function. You can also pass MATLAB tables or timetables directly to Python functions without converting them to pandas DataFrames first.

Convert Pandas DataFrame to MATLAB Table

Convert a pandas DataFrame to a MATLAB table using the table function.

For example, create a pandas DataFrame and convert it to a MATLAB table. In this case, MATLAB converts the pandas Series to MATLAB strings and converts the Python integer values to MATLAB integer values.

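% Create the columns: last names as a pandas Series, plus random ages and heights.
% The ages and heights are random, so your values differ from the output shown here.
last_names = py.pandas.Series({"Sanchez","Johnson","Zhang","Diaz","Brown"},dtype="object");
age = py.numpy.random.randint(18,high=85,size=py.len(last_names));
height = py.numpy.random.randint(60,high=78,size=py.len(last_names));

% Combine the columns into a pandas DataFrame, then convert it to a MATLAB table.
pd = py.pandas.DataFrame(struct("LastName",last_names,"Age",age,"Height",height));
md = table(pd)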
md=5×3 table
    LastName     Age    Height
    _________    ___    ______

    "Sanchez"    61       65  
    "Johnson"    46       62  
    "Zhang"      68       61  
    "Diaz"       34       72  
    "Brown"      80       65  

Convert Pandas DataFrame to MATLAB Timetable

If you have a pandas DataFrame that contains timestamped values, you can convert it to a MATLAB timetable using the timetable function:

  • If the index of the pandas DataFrame has time values (of type datetime64, timedelta64, Timestamp, or Timedelta), then those values are used to set the row times of the MATLAB timetable.

  • If the index does not have time values, then the row times come from the first column of the pandas DataFrame that has time values (of type datetime64, timedelta64, Timestamp, or Timedelta).

For example, create a pandas DataFrame and convert it to a MATLAB timetable. In this case, MATLAB converts the Python datetime values to MATLAB datetime values.

% Create three timestamps, one second apart, starting at the current time.
date_today = py.datetime.datetime.now();
mtimes = py.pandas.date_range(date_today,periods=int32(3),freq="s");
temp = py.list({double(37.3),double(39.1),double(42.3)});
% int32 rounds the pressure values to 30, 31, and 29.
pressure = py.list({int32(30.10),int32(30.56),int32(28.90)});
wspeed = py.list({single(13.4),single(6.5),single(7.3)});

ptt = py.pandas.DataFrame(struct("MeasurementTime",mtimes, ...
    "Temp",temp,"Pressure",pressure,"WindSpeed",wspeed));

mtt = timetable(ptt)
mtt=3×3 timetable
      MeasurementTime       Temp    Pressure    WindSpeed
    ____________________    ____    ________    _________

    01-Feb-2024 13:51:50    37.3       30         13.4   
    01-Feb-2024 13:51:51    39.1       31          6.5   
    01-Feb-2024 13:51:52    42.3       29          7.3   
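The example above takes its row times from the MeasurementTime column. As a minimal sketch of the other case, where the index of the DataFrame holds the time values, the following code passes a DatetimeIndex through the index keyword of py.pandas.DataFrame. The variable names idx, df, and tt are illustrative, and the sketch assumes that MATLAB is configured with a Python environment that has pandas installed.

```matlab
% Illustrative sketch: row times taken from the DataFrame index.
% Assumes a Python environment with pandas is configured for MATLAB.
idx = py.pandas.date_range("2024-02-01",periods=int32(3),freq="D");  % three daily timestamps
temp = py.list({37.3,39.1,42.3});

% Pass the DatetimeIndex as the DataFrame index instead of as a column.
df = py.pandas.DataFrame(struct("Temp",temp),index=idx);

% The time values in the index become the row times of the timetable.
tt = timetable(df)
```

Here Temp is the only column, so the resulting timetable has a single variable.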

Convert MATLAB Table to Pandas DataFrame

MATLAB implicitly converts any MATLAB table that you pass to a Python function into a pandas DataFrame. You can also convert a MATLAB table explicitly using py.pandas.DataFrame.

For example, create a MATLAB table and convert it to a pandas DataFrame.

dish_name = ["omelette";"soup";"salad";"fries";"cookie"];
price = [8.50;5.00;7.25;1.50;2.00];
sold_out = [true;false;true;false;true];

md = table(dish_name,price,sold_out);
pd = py.pandas.DataFrame(md);
py.print(pd)
  dish_name  price  sold_out
0  omelette   8.50      True
1      soup   5.00     False
2     salad   7.25      True
3     fries   1.50     False
4    cookie   2.00      True

You can also pass MATLAB tables directly to Python functions without converting them to pandas DataFrames first. For example, find the number of rows in md using py.len.

l = py.len(md)
l = 
  Python int with properties:

    denominator: [1×1 py.int]
           imag: [1×1 py.int]
      numerator: [1×1 py.int]
           real: [1×1 py.int]

    5
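Python functions return Python types, so l above is a Python int rather than a MATLAB number. The following minimal sketch converts such a result to a MATLAB double for use in MATLAB arithmetic. The smaller table menu and the variable n are illustrative names, not part of the original example.

```matlab
% Illustrative sketch: convert the Python int returned by py.len to a MATLAB double.
dish_name = ["omelette";"soup";"salad"];
price = [8.50;5.00;7.25];
menu = table(dish_name,price);

% MATLAB converts the table to a pandas DataFrame before calling len.
n = double(py.len(menu))
```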

Convert MATLAB Timetable to Pandas DataFrame

You can convert MATLAB timetables to pandas DataFrames using py.pandas.DataFrame.

For example, create a MATLAB timetable and convert it to a pandas DataFrame.

mtt = timetable(datetime(["2023-12-18";"2023-12-19";"2023-12-20"]), ...
               [37.3;39.1;42.3],[30.1;30.03;29.9],[13.4;6.5;7.3]);
mtt.Properties.VariableNames = ["Temp","Pressure","WindSpeed"];
ptt = py.pandas.DataFrame(mtt);
py.print(ptt)
            Temp  Pressure  WindSpeed
Time                                 
2023-12-18  37.3     30.10       13.4
2023-12-19  39.1     30.03        6.5
2023-12-20  42.3     29.90        7.3
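The printed DataFrame shows that the row times end up in the DataFrame index, so the result can be converted back with the timetable function. The following is a sketch of that round trip under the same setup assumptions. The names tt, df, and tt2 are illustrative.

```matlab
% Illustrative sketch: MATLAB timetable -> pandas DataFrame -> MATLAB timetable.
tt = timetable(datetime(["2023-12-18";"2023-12-19";"2023-12-20"]), ...
               [37.3;39.1;42.3]);
tt.Properties.VariableNames = "Temp";

df = py.pandas.DataFrame(tt);   % row times become the DataFrame index
tt2 = timetable(df)             % the time index becomes the row times again
```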

Data Type Conversion from Pandas DataFrames to MATLAB Tables or Timetables

When you convert a pandas DataFrame to a MATLAB table or timetable, MATLAB automatically converts these pandas data types to MATLAB types. In this table, py. refers to built-in Python data types, np. refers to NumPy, and pd. refers to pandas.

Pandas Data Type                Converted Data Type in MATLAB
____________________________    _____________________________

np.uint8, pd.UInt8              uint8
np.uint16, pd.UInt16            uint16
np.uint32, pd.UInt32            uint32
np.uint64, pd.UInt64            uint64
np.int8, pd.Int8                int8
np.int16, pd.Int16              int16
np.int32, pd.Int32              int32
np.int64, pd.Int64              int64
np.float32, pd.Float32          single
np.float64, pd.Float64          double
py.bool                         logical
pd.Categorical                  categorical
np.datetime64, pd.Timestamp     datetime
np.timedelta64, pd.Timedelta    duration
np.complex64                    complex (single)
np.complex128                   complex (double)
py.str, pd.StringDtype          string
py.tuple                        cell
py.dict                         struct
Other type                      Python object (py.type)
py.NaN                          NaN
py.NaT                          NaT
pd.NaN                          <missing>
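One way to check these conversions is to build a DataFrame with known NumPy dtypes and inspect the class of each variable in the converted table. The following is a minimal sketch. The column names X and Y are illustrative, and according to the table above the two columns should convert to single and uint8.

```matlab
% Illustrative sketch: inspect the MATLAB types produced by the conversion.
% Assumes a Python environment with NumPy and pandas is configured for MATLAB.
x = py.numpy.arange(int32(3),dtype="float32");   % NumPy float32 values 0, 1, 2
y = py.numpy.arange(int32(3),dtype="uint8");     % NumPy uint8 values 0, 1, 2
df = py.pandas.DataFrame(struct("X",x,"Y",y));
t = table(df);

% Return the class name of each table variable as a cell array.
varTypes = varfun(@class,t,OutputFormat="cell")
```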

Data Type Conversion from MATLAB Tables or Timetables to Pandas DataFrame

When you call a Python function with a MATLAB table or timetable, or explicitly convert one to a pandas DataFrame, MATLAB automatically converts the table data to the pandas types that best represent it. In this table, py. refers to built-in Python data types, np. refers to NumPy, and pd. refers to pandas.

MATLAB Data Type             Converted Data Type in Pandas
_________________________    _____________________________

uint8                        np.uint8
uint16                       np.uint16
uint32                       np.uint32
uint64                       np.uint64
int8                         np.int8
int16                        np.int16
int32                        np.int32
int64                        np.int64
single                       np.float32
double                       np.float64
logical                      py.bool
categorical                  pd.Categorical
datetime                     np.datetime64
duration                     np.timedelta64
complex (single)             np.complex64
complex (double)             np.complex128
string, cellstr              np.object
cell                         py.tuple
dictionary, struct           py.dict
NaN                          py.NaN
NaT                          py.NaT
<missing>                    pd.NaN
<undefined> (categorical)    pd.NaN
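To see which pandas types a particular table produces, you can print the dtypes attribute of the converted DataFrame. The following is a minimal sketch. The table t and its variables are illustrative, and the printed dtype names depend on your pandas version, so no output is shown.

```matlab
% Illustrative sketch: list the pandas dtype chosen for each MATLAB table variable.
id = uint8([1;2;3]);
price = [8.50;5.00;7.25];
sold_out = [true;false;true];
t = table(id,price,sold_out);

df = py.pandas.DataFrame(t);

% dtypes is a pandas Series that holds the dtype of each column.
py.print(df.dtypes)
```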

Copyright 2023 The MathWorks, Inc.
