sorting unique date by time.

2 views (last 30 days)
HAOLUN TIAN
HAOLUN TIAN on 25 Jun 2019
Edited: Campion Loong on 27 Jun 2019
hello,
I have a n*3 data.(entire US stock data) each cloume date are: stock id, date, price.
the date in in the second cloume is repeating for each stock id.
A T1 P1a
A T2 P2a
A T3 P3a
A T4 P4a
B T1 P1a
B T2 P2b
B T3 P3b
B T4 P4b
.... .... ....
i try to arrange data, let first cloume be the date, and firt row be the stock id.
ID =unique (Data(:,1))';
Date = unique(Data(:,2))
and then i have no ideal how to coding .
the data need to be
A B C ...............
T1 P1a P1b P1c ...............
T2 P2a P2b P2c
T3 P3a P3b P3c
T4 P4a P4b P4c
i have no coding experiences before, and to try the learn the matlab.
thanks for your time
  2 Comments
Jan
Jan on 26 Jun 2019
You forgot to mention, in which format the input data are stored. Is this a cell array or a table object? The wanted output format is not clear also.
HAOLUN TIAN
HAOLUN TIAN on 26 Jun 2019
is csv data, i download on wrds.

Sign in to comment.

Accepted Answer

Shameer Parmar
Shameer Parmar on 26 Jun 2019
Edited: Shameer Parmar on 27 Jun 2019
Hello Haolun, here is the logic..
% Assumming input Data be like this..
% first column is 'ID', second Column is 'date', third column is 'price'
inputData =[
11 21062019 1000
11 22062019 1500
11 23062019 2000
11 24062019 2500
22 21062019 3000
22 22062019 3500
22 23062019 4000
22 24062019 4500
33 21062019 5000
33 22062019 5500
33 23062019 6000
33 24062019 6500];
IDs =unique(inputData(:,1));
Dates = unique(inputData(:,2));
totalIDs = length(IDs);
totalDates = length(Dates);
% logic to create output data
% First item of output will be empty
outputData{1,1} = [];
% This FOR loop is for first row (row of IDs)
for count1 = 1:totalIDs
outputData{1,count1+1} = IDs(count1);
end
% This FOR loop is for first column (column of Dates)
for count2 = 1:totalDates
outputData{count2+1,1} = Dates(count2);
end
% This For loop is for intermediate values
for count1 = 1:totalIDs
for count2 = 1:totalDates
ID_val = IDs(count1);
Date_val = Dates(count2);
ID_Column_Indices = find(ismember(inputData(:,1),ID_val));
Date_Column_Indices = find(ismember(inputData(:,2),Date_val));
require_price_val = [];
for i=1:length(ID_Column_Indices)
for j=1:length(Date_Column_Indices)
if ID_Column_Indices(i)==Date_Column_Indices(j)
index = ID_Column_Indices(i);
require_price_val = inputData(index,3);
end
end
end
outputData{count2+1,count1+1} = require_price_val;
end
end
inputData % to display input data
outputData % to display output data
% Output data will be display as..
outputData =
[] [ 11] [ 22] [ 33]
[21062019] [1000] [3000] [5000]
[22062019] [1500] [3500] [5500]
[23062019] [2000] [4000] [6000]
[24062019] [2500] [4500] [6500]
Please see attached script and run it and see the result.. Thanksss... :)
  4 Comments
HAOLUN TIAN
HAOLUN TIAN on 27 Jun 2019
hi Shameer,
i run the code on matlab 2017a
HAOLUN TIAN
HAOLUN TIAN on 27 Jun 2019
Hello ,Shameer,
i find out is my mistakes.your code is right, i try really small part of my data it can run. (only first 5000 rows).
my total datas are about 400mb to 2GB. several different data. i think i need to wait more hours to get the result.

Sign in to comment.

More Answers (2)

Steven Lord
Steven Lord on 26 Jun 2019
If you were to store your data in a table or timetable array, the operation you're describing is unstack.
The first example on that documentation page looks somewhat similar to your application and includes the step of creating a table from the data, so you may be able to adapt it to suit your needs.
Once you've converted your data into a table or timetable there are lots of other functions that take advantage of the tabular nature of the data. You could name the variables in your table as A, B, or C for example, and index into the table using those variable names rather than using ID code like 11, 22, or 33 as Shameer Parmar did

Campion Loong
Campion Loong on 27 Jun 2019
Edited: Campion Loong on 27 Jun 2019
As Steve suggested above, you should import the CSV into a timetable and unstack to get what you want. I'd also mention the detectImportOptions function -- it gives you many knobs to tweak how exactly you'd like to import (without spending time to actually read the whole file):
Assuming Shameer's data format, your CSV would look like this:
data.csv:
STOCK1 21062019 1000
STOCK1 22062019 1500
STOCK1 23062019 2000
STOCK1 24062019 2500
STOCK1 21062019 3000
STOCK2 22062019 3500
STOCK2 23062019 4000
STOCK2 24062019 4500
STOCK3 21062019 5000
STOCK3 22062019 5500
STOCK3 23062019 6000
STOCK3 24062019 6500
Here are the steps to get your data into MATLAB, in your desired layout:
>> opt = detectImportOptions('data.csv');
% Not necessary (automatically detected) if your timestamps are in a more common time format
% e.g. '21-06-2019' instead of '21062019'
>> opt.VariableTypes{2} = 'datetime';
>> opt.VariableOptions(2).InputFormat = 'ddMMyyyy';
% Give your timestamps more friendly formats and variables meaningful names
>> opt.VariableOptions(2).DatetimeFormat = 'dd-MMM-yyyy';
>> opt.VariableNames = ["Ticker" "Time" "data"];
% Now read-in your data
>> tt = readtimetable('data.csv',opt)
tt =
12×2 timetable
Time Ticker data
___________ ________ ____
21-Jun-2019 'STOCK1' 1000
22-Jun-2019 'STOCK1' 1500
23-Jun-2019 'STOCK1' 2000
24-Jun-2019 'STOCK1' 2500
21-Jun-2019 'STOCK2' 3000
22-Jun-2019 'STOCK2' 3500
23-Jun-2019 'STOCK2' 4000
24-Jun-2019 'STOCK2' 4500
21-Jun-2019 'STOCK3' 5000
22-Jun-2019 'STOCK3' 5500
23-Jun-2019 'STOCK3' 6000
24-Jun-2019 'STOCK3' 6500
% Unstack to your desired layout
>> tt = unstack(tt,'data','Ticker')
tt =
4×3 timetable
Time STOCK1 STOCK2 STOCK3
___________ ______ ______ ______
21-Jun-2019 1000 3000 5000
22-Jun-2019 1500 3500 5500
23-Jun-2019 2000 4000 6000
24-Jun-2019 2500 4500 6500

Categories

Find more on Tables in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!