Pairs Trading Code: Further Guidance Please
Good afternoon.
I am trying to write pairs trading code in MATLAB R2011b. The dataset is daily S&P 500 price data from 1982 to 2012.
The procedure is:
- Compute a distance measure over each 12-month formation period, sliding forward by the 6-month trading period
- Normalize the price series within each formation period
- Compute the pairwise distance measure (basically a 500-by-500 matrix)
- Find the pairs that minimize the distance measure (the sum of squared deviations between the two normalized price series)
- Filter and identify the top-ranking pairs from each formation period
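To make the normalization and pairwise distance steps concrete, here is a minimal sketch on a toy matrix. The prices, the variable names P, N and D, and the absence of missing values are all assumptions made for illustration; bsxfun is used because R2011b has no implicit expansion.

```matlab
% Toy formation-period prices: rows are days, columns are stocks
% (made-up numbers, no missing values assumed)
P = [10 20 5; 11 22 5; 12 24 4; 13 26 5];

% Normalize each column by its first price so every series starts at 1
N = bsxfun(@rdivide, P, P(1,:));

% Pairwise sum of squared deviations (SSD) between normalized series
n = size(N,2);
D = zeros(n);
for i = 1:n
    d = bsxfun(@minus, N, N(:,i));   % deviation of every stock from stock i
    D(i,:) = sum(d.^2, 1);           % SSD of stock i against all stocks
end
```

In this toy case stocks 1 and 2 move in proportion, so D(1,2) is zero. With real data containing NaN values, any pair with a missing observation would produce a NaN distance here, so missing values would need separate handling.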
I would like to know:
- What are the next coding steps needed to compute the pairwise distance measure?
- What are the next coding steps to minimize the distance measure?
- How do I filter and identify the top-ranking pairs while sliding the time period?
- How would I restrict matches to stocks in the same industry sector, according to the relevant SIC codes?
The code completed so far is below, with comment headers (%) marking each step.
Any help, guidance or advice would be very much appreciated.
Kind regards
Tomasz Mlynowski
% Import data
[data,txt] = xlsread('G:\me\Desktop\Pairs Trading\sp500 price data.xlsm',1);
% Dates in numeric format
dates = datenum(txt(4:end,1),'dd/mm/yyyy');
% Keep names (for reference)
names = txt(1,2:end);
% Free memory from txt (more than 80 MB)
clear txt
% Retrieve year and month
[y,m] = datevec(dates);
% Unique pairs of year-month; midx maps each daily row to its row in ym
[ym,~,midx] = unique([y,m],'rows');
% Distance measure on each 12 month block sliding by 6 (except last block is 5)
% For reference, appendix: http://www.tinbergen.nl/discussionpapers/11150.pdf
% Step through months, not daily rows, so each block covers 12 calendar months
% (assumes data has one row per entry in dates)
for r = 12:6:size(ym,1)
    % Select the daily rows in the 12-month formation period ending at month r
    tmp = data(midx >= r-11 & midx <= r,:);
    % 1. Normalization
    % Find columns with at least one non-NaN value
    idxnan = isnan(tmp);
    cols = find(~(all(idxnan)));
    % LOOP through columns
    for c = cols
        % First non-NaN value
        first = find(~idxnan(:,c),1,'first');
        tmp(:,c) = tmp(:,c) / tmp(first,c);
    end
    % Now, don't loop by single column because you need to normalize taking
    % into consideration pairs.
    % LOOP through columns
    % for c = cols
    % Now select a specific column and loop against all others
    % for cpair = setdiff(cols,c)
    % first1 = find(~idxnan(:,c ),1,'first');
    % first2 = find(~idxnan(:,cpair),1,'first');
    %
    % tmp(:,c) = tmp(:,c) / tmp(first,c);
    % end
    %
    %
    % First non NaN value
    % first = find(~idxnan(:,c),1,'first');
    % tmp(:,c) = tmp(:,c) / tmp(first,c);
    %
    % end
end
% Pairwise distance measure
% Minimum distance criterion identification
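
One possible shape for the ranking and sector-restriction steps, as a sketch only. The distance matrix D, the sector codes in sic, and the cutoff k are hypothetical values invented for illustration; in practice D would come from one formation window and sic from a separate SIC lookup that is not part of the code above.

```matlab
% Hypothetical inputs: D is a symmetric distance matrix for one formation
% window, sic holds one sector code per stock (all values made up)
D   = [0 0.5 0.1 0.9; 0.5 0 0.7 0.2; 0.1 0.7 0 0.4; 0.9 0.2 0.4 0];
sic = [10 20 10 20];
k   = 2;                                   % number of top pairs to keep

n = size(D,1);
% Keep each pair once: upper triangle only, diagonal excluded
keep = triu(true(n),1);
% Restrict to pairs whose sector codes match
keep = keep & bsxfun(@eq, sic(:), sic(:)');
% Drop pairs whose distance is undefined
keep = keep & ~isnan(D);

Dm = D;
Dm(~keep) = Inf;                           % excluded pairs sort to the end
[sortedD, order] = sort(Dm(:));            % ascending: smallest distance first
k = min(k, nnz(keep));                     % never return an excluded pair
[i, j] = ind2sub([n n], order(1:k));
topPairs = [i, j, sortedD(1:k)];           % columns: stock i, stock j, distance
```

Run once per formation window inside the outer loop, the resulting topPairs could be stored in a cell array indexed by window, which would tie the ranking to the sliding time period.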