Correlation using specific values of the table

Hello, I'm trying to calculate the correlations of 3 different stocks and an index. I got the code up to here:
clc
clear all
formatSpec = '%s %f %f %f %f';
temp_dat = readtable('Prices4.csv','Format',formatSpec,'ReadVariableNames',true);
% Extract prices from table (column 1 holds the dates)
price = table2array(temp_dat(:,2:end));
% Log returns
rets = log(price(2:end,:)./price(1:end-1,:));
I want to find the correlation matrix between the stocks on days when index returns are positive, and the correlation matrix on days when index returns are below the 10th percentile.
I don't know how to tell MATLAB to take only those specific values into account.

3 Comments

What is the first column in the .csv file -- a date string, probably? If so, I'd suggest a timetable could be the thing here, and it's unlikely you would need the input format at all.
Then, use the table itself once you've got it instead of duplicating the same data into another array -- that'll have the secondary benefit of keeping the stock IDs with the data that you carefully read in to begin with instead of throwing them away.
Then use logical addressing to select the data of interest (and, if you use a timetable, retime) or rowfun to apply the calculation. You can also create the grouping variable from the index returns and use groupsummary.
Per usual, it would make illustrating with real code simpler if you would attach a representative sample of the input file so folks have something to work with without having to also make up data...
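The timetable and logical-addressing suggestion above can be sketched roughly as follows. Since no sample file was attached, the data here are synthetic, and the variable names StockA, StockB, StockC and Index are assumptions, not the columns of Prices4.csv. With the real file, readtimetable('Prices4.csv') would likely replace the first few lines.

```matlab
% Hypothetical stand-in for Prices4.csv: dates plus three stocks and an index.
rng(0);                                            % reproducible synthetic data
dates  = datetime(2022,1,1) + caldays(0:99)';      % 100 consecutive days
prices = 100 * exp(cumsum(0.01 * randn(100,4)));   % random-walk prices, always positive
P = array2timetable(prices, 'RowTimes', dates, ...
    'VariableNames', {'StockA','StockB','StockC','Index'});

% Log returns that keep the dates and the variable names (one row fewer than P)
R = array2timetable(diff(log(P.Variables)), ...
    'RowTimes', P.Properties.RowTimes(2:end), ...
    'VariableNames', P.Properties.VariableNames);

% Logical addressing: keep only the days on which the index return is positive
up   = R.Index > 0;
R_up = R(up, :);
```

Here diff(log(...)) gives the same values as log(price(2:end,:)./price(1:end-1,:)) in the question, and R_up still carries the dates and column names.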
Hey, thanks for the answer. Yes, the first column in the table is just dates. Sorry to bother you with further questions; I just started using MATLAB about three weeks ago. Is there a timetable function? This is my temp_dat; is this the table I can use directly, or do I need another timetable? Will groupsummary take the days for all the other 3 stocks?
Thank you very much for your help.
I have another table with only the returns from each stock and the index.


Answers (1)

Here is my idea.
I recommend using functions such as price2ret, corrplot (or corr), and prctile, together with the functionality of timetable, as follows:
Sample data generation
t = datetime(2022,8,1:31,12,0,0)';
price = timetable(t,rand(31,1),rand(31,1))
price = 31×2 timetable
             t                Var1        Var2
    ____________________    ________    ________
    01-Aug-2022 12:00:00     0.63836     0.63655
    02-Aug-2022 12:00:00     0.21399     0.82862
    03-Aug-2022 12:00:00     0.79756    0.015867
    04-Aug-2022 12:00:00     0.33462     0.30921
    05-Aug-2022 12:00:00     0.99808    0.028121
    06-Aug-2022 12:00:00     0.93243     0.90636
    07-Aug-2022 12:00:00     0.92369     0.54634
    08-Aug-2022 12:00:00     0.77849     0.57087
    09-Aug-2022 12:00:00     0.61343     0.77218
    10-Aug-2022 12:00:00     0.98854     0.96529
    11-Aug-2022 12:00:00     0.82305     0.43109
    12-Aug-2022 12:00:00     0.24589     0.57595
    13-Aug-2022 12:00:00     0.61277     0.49913
    14-Aug-2022 12:00:00     0.25548     0.37036
    15-Aug-2022 12:00:00    0.040045     0.59132
    16-Aug-2022 12:00:00     0.62331     0.93248
Return calculation
tmp = price2ret(price);
price_to_return = tmp(:,["Var1","Var2"])
price_to_return = 30×2 timetable
            Time               Var1         Var2
    ____________________    __________    ________
    02-Aug-2022 12:00:00        -1.093     0.26371
    03-Aug-2022 12:00:00        1.3156     -3.9555
    04-Aug-2022 12:00:00      -0.86855      2.9697
    05-Aug-2022 12:00:00        1.0928     -2.3975
    06-Aug-2022 12:00:00     -0.068044      3.4729
    07-Aug-2022 12:00:00    -0.0094157    -0.50619
    08-Aug-2022 12:00:00      -0.17102    0.043917
    09-Aug-2022 12:00:00       -0.2383     0.30205
    10-Aug-2022 12:00:00       0.47717     0.22321
    11-Aug-2022 12:00:00      -0.18321     -0.8061
    12-Aug-2022 12:00:00       -1.2081      0.2897
    13-Aug-2022 12:00:00       0.91309    -0.14316
    14-Aug-2022 12:00:00      -0.87483    -0.29839
    15-Aug-2022 12:00:00       -1.8531     0.46788
    16-Aug-2022 12:00:00         2.745      0.4555
    17-Aug-2022 12:00:00       0.21941    -0.21122
[1] Correlation between positive returns
Find the row indices where both returns are positive, then compute the correlation on those rows only.
pos_idx = (price_to_return.Var1 > 0) & (price_to_return.Var2 > 0);
[R,pValue] = corrplot(price_to_return(pos_idx,:))
R = 2×2 table
              Var1        Var2
            ________    ________
    Var1           1    -0.40455
    Var2    -0.40455           1
pValue = 2×2 table
             Var1       Var2
            _______    _______
    Var1          1    0.42628
    Var2    0.42628          1
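Note that pos_idx above requires both Var1 and Var2 to be positive, whereas the question conditions on the index return alone. A minimal sketch of that variant follows, assuming a table with three stock columns and one index column. The names StockA, StockB, StockC and Index and the synthetic returns are illustrative assumptions; corrcoef is used because it is part of base MATLAB, and corr or corrplot could be substituted if the relevant toolbox is available.

```matlab
% Synthetic daily returns: three stocks and an index (hypothetical names)
rng(1);
rets = 0.01 * randn(250,4);
T = array2table(rets, 'VariableNames', {'StockA','StockB','StockC','Index'});

% One logical mask, built from the index column only
idx_pos = T.Index > 0;

% Apply the same mask to every stock so that rows stay aligned by day
stocks = T{idx_pos, {'StockA','StockB','StockC'}};
C_pos  = corrcoef(stocks);        % 3x3 correlation matrix of the stocks
```

Because a single mask is applied to all columns, each row of stocks still holds the three returns observed on the same day.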
[2] Correlation between returns under the 10th percentile
First, calculate the 10th percentile values for Var1 and Var2 respectively:
prct10 = prctile(price_to_return.Variables,10)
prct10 = 1×2
-1.7133 -2.0793
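Find the row indices that meet the condition (below the 10th percentile) for each variable:
Var1_idx = price_to_return.Var1 < prct10(1);
Var2_idx = price_to_return.Var2 < prct10(2);
Calculate the correlation. Note that Var1_idx and Var2_idx generally select different days, so this call pairs values that were not observed together, and it only runs when both selections contain the same number of rows: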
[R2,pValue2] = corrplot([price_to_return.Var1(Var1_idx),price_to_return.Var2(Var2_idx)])
R2 = 2×2
    1.0000   -0.4191
   -0.4191    1.0000
pValue2 = 2×2
    1.0000    0.7247
    0.7247    1.0000
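For the case in the question, where the condition is on the index return only, one mask can be built from the index and applied to all stock columns so that the selected rows come from the same days. This is a sketch under the same assumptions as the synthetic examples elsewhere in this thread: the column names are hypothetical, and prctile is the same function used above.

```matlab
% Synthetic daily returns: three stocks and an index (hypothetical names)
rng(2);
rets = 0.01 * randn(250,4);
T = array2table(rets, 'VariableNames', {'StockA','StockB','StockC','Index'});

p10     = prctile(T.Index, 10);      % 10th percentile of the index returns
idx_low = T.Index < p10;             % days on which the index is below that level

% Same mask for every stock column, so each row is a single day
C_low = corrcoef(T{idx_low, {'StockA','StockB','StockC'}});
```

With only about 10% of the days retained, the resulting correlations rest on few observations, so a longer sample makes them more reliable.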

Release

R2022a

Asked:

on 9 Sep 2022

Answered:

on 10 Sep 2022
