Trouble with date conversion

4 views (last 30 days)
Ellen
Ellen on 7 Nov 2023
Answered: Peter Perkins on 10 Nov 2023
I have a csv file in which the date is in the format YYYYMMDD for example 19910101 the nextr row has the hour. I cant manage to convert this tot three columns containing Year, month , day. Now I have imported the column als tekst How can I convert it to three Columns showing year, month day?
I dont have much experience with datasets yet.
the file is from the KNMI uurgeg_310_1991-2000.txt
%% Set up the Import Options and import the data
opts = delimitedTextImportOptions("NumVariables", 25);
% Specify range and delimiter
opts.DataLines = [34, Inf];
opts.Delimiter = ",";
% Specify column names and types
opts.VariableNames = ["Var1", "YYYYMMDD", "HH", "DD", "Var5", "FF", "Var7", "Var8", "Var9", "Var10", "Var11", "Var12", "Var13", "Var14", "P", "Var16", "Var17", "Var18", "Var19", "Var20", "Var21", "Var22", "Var23", "Var24", "Var25"];
opts.SelectedVariableNames = ["YYYYMMDD", "HH", "DD", "FF", "P"];
opts.VariableTypes = ["char", "char", "double", "double", "char", "double", "char", "char", "char", "char", "char", "char", "char", "char", "double", "char", "char", "char", "char", "char", "char", "char", "char", "char", "char"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Specify variable properties
opts = setvaropts(opts, ["Var1", "YYYYMMDD", "Var5", "Var7", "Var8", "Var9", "Var10", "Var11", "Var12", "Var13", "Var14", "Var16", "Var17", "Var18", "Var19", "Var20", "Var21", "Var22", "Var23", "Var24", "Var25"], "WhitespaceRule", "preserve");
opts = setvaropts(opts, ["Var1", "YYYYMMDD", "Var5", "Var7", "Var8", "Var9", "Var10", "Var11", "Var12", "Var13", "Var14", "Var16", "Var17", "Var18", "Var19", "Var20", "Var21", "Var22", "Var23", "Var24", "Var25"], "EmptyFieldRule", "auto");
opts = setvaropts(opts, ["HH", "DD", "FF", "P"], "ThousandsSeparator", ",");
% Import the data
tbl = readtable("D:\01 werk in uitvoering\uurgeg_310_1991-2000.txt", opts);
%% Convert to output type
YYYYMMDD = tbl.YYYYMMDD;
HH = tbl.HH;
DD = tbl.DD;
FF = tbl.FF;
P = tbl.P;
%% Clear temporary variables
clear opts tbl
  5 Comments
Ellen
Ellen on 7 Nov 2023
When I Use datetime its gives NaT, There seems to be no time in the column only a number that shows year, mont and day as a large number.
Ellen
Peter Perkins
Peter Perkins on 10 Nov 2023
"I cant manage to convert this tot three columns containing Year, month , day"
Hard to say without context, but you may nt want to do that. Once you have a datetime (as Les shows), likely you can do what you need to without separate date components.

Sign in to comment.

Accepted Answer

Les Beckham
Les Beckham on 7 Nov 2023
As @Stephen23 said, it is better to keep the date and time as a datetime array.
See below.
opts = delimitedTextImportOptions;
opts.DataLines = 34;
opts.VariableNames = {'STN', 'YYYYMMDD', 'HH', 'DD', 'FH', 'FF', 'FX', 'T', 'T10N', 'TD', 'SQ', 'Q', 'DR', 'RH', 'P', 'VV', 'N', 'U', 'WW', 'IX', 'M', 'R', 'S', 'O', 'Y'};
opts.VariableTypes = {'double', 'char', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double'};
% numel({'double', 'datetime', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double', 'double'})
T = readtable('uurgeg_310_2021-2030.txt', opts)
T = 24264×25 table
STN YYYYMMDD HH DD FH FF FX T T10N TD SQ Q DR RH P VV N U WW IX M R S O Y ___ ____________ __ ___ __ __ ___ __ ____ __ __ __ __ __ _____ __ _ __ ___ __ _ _ _ _ _ 310 {'20210101'} 1 300 30 40 60 35 NaN 18 0 0 0 0 10067 70 7 88 NaN 5 0 0 0 0 0 310 {'20210101'} 2 180 30 40 60 36 NaN 20 0 0 6 9 10072 70 8 89 23 7 0 1 0 0 0 310 {'20210101'} 3 220 30 30 50 30 NaN 18 0 0 3 6 10075 65 7 91 23 7 0 1 0 0 0 310 {'20210101'} 4 220 40 40 50 31 NaN 19 0 0 0 0 10074 65 7 91 NaN 5 0 0 0 0 0 310 {'20210101'} 5 300 20 10 50 37 NaN 26 0 0 4 1 10075 60 8 92 81 7 0 1 0 0 0 310 {'20210101'} 6 190 20 30 40 40 11 24 0 0 0 -1 10077 68 8 89 23 7 0 1 0 0 0 310 {'20210101'} 7 310 20 30 40 46 NaN 22 0 0 0 -1 10081 69 8 84 51 7 0 1 0 0 0 310 {'20210101'} 8 300 30 30 60 42 NaN 12 0 0 0 -1 10085 75 8 80 22 7 0 1 0 0 0 310 {'20210101'} 9 280 40 50 70 51 NaN 16 3 11 0 0 10089 75 8 77 2 7 0 0 0 0 0 310 {'20210101'} 10 280 50 50 70 54 NaN 18 3 31 0 0 10093 75 8 77 NaN 5 0 0 0 0 0 310 {'20210101'} 11 270 50 50 70 59 NaN 24 3 50 0 0 10095 67 8 78 NaN 5 0 0 0 0 0 310 {'20210101'} 12 270 50 60 80 62 27 22 7 81 0 0 10092 69 8 75 2 7 0 0 0 0 0 310 {'20210101'} 13 260 60 70 80 58 NaN 20 5 62 0 0 10092 72 8 76 NaN 5 0 0 0 0 0 310 {'20210101'} 14 260 70 70 90 56 NaN 19 0 21 0 0 10092 75 8 77 NaN 5 0 0 0 0 0 310 {'20210101'} 15 260 80 90 100 55 NaN 23 7 20 0 0 10094 69 8 79 NaN 5 0 0 0 0 0 310 {'20210101'} 16 250 80 90 100 54 NaN 23 2 6 0 0 10099 72 8 80 NaN 5 0 0 0 0 0
DateTime = table(datetime(T.YYYYMMDD, 'InputFormat', 'yyyyMMdd') + hours(T.HH));
T = [DateTime T];
T = renamevars(T, 'Var1', 'DateTime')
T = 24264×26 table
DateTime STN YYYYMMDD HH DD FH FF FX T T10N TD SQ Q DR RH P VV N U WW IX M R S O Y ____________________ ___ ____________ __ ___ __ __ ___ __ ____ __ __ __ __ __ _____ __ _ __ ___ __ _ _ _ _ _ 01-Jan-2021 01:00:00 310 {'20210101'} 1 300 30 40 60 35 NaN 18 0 0 0 0 10067 70 7 88 NaN 5 0 0 0 0 0 01-Jan-2021 02:00:00 310 {'20210101'} 2 180 30 40 60 36 NaN 20 0 0 6 9 10072 70 8 89 23 7 0 1 0 0 0 01-Jan-2021 03:00:00 310 {'20210101'} 3 220 30 30 50 30 NaN 18 0 0 3 6 10075 65 7 91 23 7 0 1 0 0 0 01-Jan-2021 04:00:00 310 {'20210101'} 4 220 40 40 50 31 NaN 19 0 0 0 0 10074 65 7 91 NaN 5 0 0 0 0 0 01-Jan-2021 05:00:00 310 {'20210101'} 5 300 20 10 50 37 NaN 26 0 0 4 1 10075 60 8 92 81 7 0 1 0 0 0 01-Jan-2021 06:00:00 310 {'20210101'} 6 190 20 30 40 40 11 24 0 0 0 -1 10077 68 8 89 23 7 0 1 0 0 0 01-Jan-2021 07:00:00 310 {'20210101'} 7 310 20 30 40 46 NaN 22 0 0 0 -1 10081 69 8 84 51 7 0 1 0 0 0 01-Jan-2021 08:00:00 310 {'20210101'} 8 300 30 30 60 42 NaN 12 0 0 0 -1 10085 75 8 80 22 7 0 1 0 0 0 01-Jan-2021 09:00:00 310 {'20210101'} 9 280 40 50 70 51 NaN 16 3 11 0 0 10089 75 8 77 2 7 0 0 0 0 0 01-Jan-2021 10:00:00 310 {'20210101'} 10 280 50 50 70 54 NaN 18 3 31 0 0 10093 75 8 77 NaN 5 0 0 0 0 0 01-Jan-2021 11:00:00 310 {'20210101'} 11 270 50 50 70 59 NaN 24 3 50 0 0 10095 67 8 78 NaN 5 0 0 0 0 0 01-Jan-2021 12:00:00 310 {'20210101'} 12 270 50 60 80 62 27 22 7 81 0 0 10092 69 8 75 2 7 0 0 0 0 0 01-Jan-2021 13:00:00 310 {'20210101'} 13 260 60 70 80 58 NaN 20 5 62 0 0 10092 72 8 76 NaN 5 0 0 0 0 0 01-Jan-2021 14:00:00 310 {'20210101'} 14 260 70 70 90 56 NaN 19 0 21 0 0 10092 75 8 77 NaN 5 0 0 0 0 0 01-Jan-2021 15:00:00 310 {'20210101'} 15 260 80 90 100 55 NaN 23 7 20 0 0 10094 69 8 79 NaN 5 0 0 0 0 0 01-Jan-2021 16:00:00 310 {'20210101'} 16 250 80 90 100 54 NaN 23 2 6 0 0 10099 72 8 80 NaN 5 0 0 0 0 0
  1 Comment
Peter Perkins
Peter Perkins on 10 Nov 2023
Probably best to then create TT as
TT = timetable(T,RowTimes=DateTime)
(Also best to not name variable quite so similarly to classes.)

Sign in to comment.

More Answers (2)

Stephen23
Stephen23 on 7 Nov 2023
Rather than fiddling around with text or numerics, just import the 2nd column as DATETIME right from the start:
fnm = 'uurgeg_310_2021-2030.txt';
opt = detectImportOptions(fnm, 'Delimiter',',', 'Range',32, 'VariableNamingRule','preserve');
opt = setvartype(opt, 'YYYYMMDD','datetime');
opt = setvaropts(opt, 'YYYYMMDD','InputFormat','uuuuMMdd');
tbl = readtable(fnm,opt)
tbl = 24264×25 table
# STN YYYYMMDD HH DD FH FF FX T T10N TD SQ Q DR RH P VV N U WW IX M R S O Y _____ ___________ __ ___ __ __ ___ __ ____ __ __ __ __ __ _____ __ _ __ ___ __ _ _ _ _ _ 310 01-Jan-2021 1 300 30 40 60 35 NaN 18 0 0 0 0 10067 70 7 88 NaN 5 0 0 0 0 0 310 01-Jan-2021 2 180 30 40 60 36 NaN 20 0 0 6 9 10072 70 8 89 23 7 0 1 0 0 0 310 01-Jan-2021 3 220 30 30 50 30 NaN 18 0 0 3 6 10075 65 7 91 23 7 0 1 0 0 0 310 01-Jan-2021 4 220 40 40 50 31 NaN 19 0 0 0 0 10074 65 7 91 NaN 5 0 0 0 0 0 310 01-Jan-2021 5 300 20 10 50 37 NaN 26 0 0 4 1 10075 60 8 92 81 7 0 1 0 0 0 310 01-Jan-2021 6 190 20 30 40 40 11 24 0 0 0 -1 10077 68 8 89 23 7 0 1 0 0 0 310 01-Jan-2021 7 310 20 30 40 46 NaN 22 0 0 0 -1 10081 69 8 84 51 7 0 1 0 0 0 310 01-Jan-2021 8 300 30 30 60 42 NaN 12 0 0 0 -1 10085 75 8 80 22 7 0 1 0 0 0 310 01-Jan-2021 9 280 40 50 70 51 NaN 16 3 11 0 0 10089 75 8 77 2 7 0 0 0 0 0 310 01-Jan-2021 10 280 50 50 70 54 NaN 18 3 31 0 0 10093 75 8 77 NaN 5 0 0 0 0 0 310 01-Jan-2021 11 270 50 50 70 59 NaN 24 3 50 0 0 10095 67 8 78 NaN 5 0 0 0 0 0 310 01-Jan-2021 12 270 50 60 80 62 27 22 7 81 0 0 10092 69 8 75 2 7 0 0 0 0 0 310 01-Jan-2021 13 260 60 70 80 58 NaN 20 5 62 0 0 10092 72 8 76 NaN 5 0 0 0 0 0 310 01-Jan-2021 14 260 70 70 90 56 NaN 19 0 21 0 0 10092 75 8 77 NaN 5 0 0 0 0 0 310 01-Jan-2021 15 260 80 90 100 55 NaN 23 7 20 0 0 10094 69 8 79 NaN 5 0 0 0 0 0 310 01-Jan-2021 16 250 80 90 100 54 NaN 23 2 6 0 0 10099 72 8 80 NaN 5 0 0 0 0 0
"I cant manage to convert this tot three columns containing Year, month , day"
[tbl.Year,tbl.Month,tbl.Day] = ymd(tbl.YYYYMMDD);
tbl % scroll to the right
tbl = 24264×28 table
# STN YYYYMMDD HH DD FH FF FX T T10N TD SQ Q DR RH P VV N U WW IX M R S O Y Year Month Day _____ ___________ __ ___ __ __ ___ __ ____ __ __ __ __ __ _____ __ _ __ ___ __ _ _ _ _ _ ____ _____ ___ 310 01-Jan-2021 1 300 30 40 60 35 NaN 18 0 0 0 0 10067 70 7 88 NaN 5 0 0 0 0 0 2021 1 1 310 01-Jan-2021 2 180 30 40 60 36 NaN 20 0 0 6 9 10072 70 8 89 23 7 0 1 0 0 0 2021 1 1 310 01-Jan-2021 3 220 30 30 50 30 NaN 18 0 0 3 6 10075 65 7 91 23 7 0 1 0 0 0 2021 1 1 310 01-Jan-2021 4 220 40 40 50 31 NaN 19 0 0 0 0 10074 65 7 91 NaN 5 0 0 0 0 0 2021 1 1 310 01-Jan-2021 5 300 20 10 50 37 NaN 26 0 0 4 1 10075 60 8 92 81 7 0 1 0 0 0 2021 1 1 310 01-Jan-2021 6 190 20 30 40 40 11 24 0 0 0 -1 10077 68 8 89 23 7 0 1 0 0 0 2021 1 1 310 01-Jan-2021 7 310 20 30 40 46 NaN 22 0 0 0 -1 10081 69 8 84 51 7 0 1 0 0 0 2021 1 1 310 01-Jan-2021 8 300 30 30 60 42 NaN 12 0 0 0 -1 10085 75 8 80 22 7 0 1 0 0 0 2021 1 1 310 01-Jan-2021 9 280 40 50 70 51 NaN 16 3 11 0 0 10089 75 8 77 2 7 0 0 0 0 0 2021 1 1 310 01-Jan-2021 10 280 50 50 70 54 NaN 18 3 31 0 0 10093 75 8 77 NaN 5 0 0 0 0 0 2021 1 1 310 01-Jan-2021 11 270 50 50 70 59 NaN 24 3 50 0 0 10095 67 8 78 NaN 5 0 0 0 0 0 2021 1 1 310 01-Jan-2021 12 270 50 60 80 62 27 22 7 81 0 0 10092 69 8 75 2 7 0 0 0 0 0 2021 1 1 310 01-Jan-2021 13 260 60 70 80 58 NaN 20 5 62 0 0 10092 72 8 76 NaN 5 0 0 0 0 0 2021 1 1 310 01-Jan-2021 14 260 70 70 90 56 NaN 19 0 21 0 0 10092 75 8 77 NaN 5 0 0 0 0 0 2021 1 1 310 01-Jan-2021 15 260 80 90 100 55 NaN 23 7 20 0 0 10094 69 8 79 NaN 5 0 0 0 0 0 2021 1 1 310 01-Jan-2021 16 250 80 90 100 54 NaN 23 2 6 0 0 10099 72 8 80 NaN 5 0 0 0 0 0 2021 1 1
  1 Comment
Ellen
Ellen on 9 Nov 2023
thx for learning me something new!
Ellen

Sign in to comment.


Peter Perkins
Peter Perkins on 10 Nov 2023
Worth saying that if the file was a spreadsheet and contained things like 20230101 as numeric values, there's a conversion for that too:
datetime(20230102,ConvertFrom="yyyymmdd")
ans = datetime
02-Jan-2023

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!