Combine non-integer time steps into daily values

71 views (last 30 days)
Cristóbal on 14 Oct 2025 at 15:21
Commented: Torsten on 15 Oct 2025 at 21:34
Dear Experts,
I have the following data, containing time and volume values (.mat file attached).
time (Days) Volume (L)
0 0
30.6741806 0
1.168E-05 0.006798073
2.2995E-05 0.047292948
4.4165E-05 0.223091253
7.7015E-05 0.833100076
9.636E-05 2.066383935
0.00012045 4.343111658
0.000150745 8.336749383
0.000187975 14.55932459
0.000289445 22.84538025
0.00036135 32.30381193
Unfortunately, the time steps are not uniform, and I would like to group them into one-day values. Because the volume is tied to those time steps, I cannot simply add up the time steps, as they do not sum to 1.
With some effort in Excel, I managed to group the values with their respective proportional volumes:
time (Days) Volume (L)
0.97022767 70827.51996
0.02977233 1854.371465
0.07209406 4490.383105
0.7850486 48689.812711
0.1428574 9532.3878414
If I sum the first two rows, and separately the last three rows, each group adds up to exactly 1 day, and I get the volume values for one day (integer):
time (Days) Volume (L)
1 72681.89142
1 62712.58366
As there are so many values (more than 900), is there a proper way to do this in MATLAB?
Thank you!
EDITED: updated the underlined values.
  6 Comments
Star Strider about 2 hours ago
Edited: Star Strider 20 minutes ago
I cannot make any sense of that.
How did you decide on those particular values and ranges?
What criteria define the 'time' ranges?
(I experimented by summing the consecutive time values using cumsum and then summing the volume values that corresponded to the first time sum that was less than or equal to 1. My results were not at all similar to yours.)
EDIT --
I am giving up on even trying to understand this.
I have deleted my Answer.
Cristóbal about 12 hours ago
The ranges are formed by summing the time steps until I reach 1, starting from 1.168E-5, because the previous value (30.6741806) has no volume associated with it.
For the 49 values that follow 30.6741806, I sum them all (rows 4 to 52 in the Excel file): 1.168E-5 + 2.2995E-5 + 4.4165E-5 + ... + 0.084265725 + 0.092457785 = 0.97022767
The next value (row 53 in the Excel file) is 0.10186639, but I can't add it to the result 0.97022767 because that would give a number greater than 1 (1.07209406). So, instead, I just take the part of 0.10186639 I need to reach 1, that is 0.02977233.
So 0.97022767 + 0.02977233 = 1 day.
The volume associated with each row is added (or taken proportionally for the split row), so I have the volume for that one day.
The remainder of the value 0.10186639, that is 0.07209406, is used in the following group. So I have that, plus the sum of rows 54 to 59 (0.10856925 + 0.1156904 + ... + 0.146811395 + 0.1513363 = 0.7850486).
Those two together again give less than 1: 0.07209406 + 0.7850486 = 0.8571426
So again I take only part of the following row, because using the complete value would give a number greater than 1. Row 60 is 0.152159375, so I take only 0.1428574, because
0.07209406 + 0.7850486 + 0.1428574 = 1 day
The remainder of the value 0.152159375, that is 0.0093020, is used in the following group. And again, the volume associated with each row is added or taken proportionally, so I have the volume for that one day.
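A rough MATLAB sketch of this same bucketing might look like the code below. It assumes the .mat file holds the per-step time increments in a variable named time and the per-step volumes in volume (the names used in the answer), and it does not skip the initial 30.67-day step with zero volume, which would simply produce leading zero-volume days.
% Sketch of the day-bucketing described above (assumptions noted in the text).
S  = load('timesteps.mat');
dt = S.time(:);            % time increment of each step (days)
dv = S.volume(:);          % volume delivered in each step (L)
dayVol = [];               % volume accumulated per full day
tAcc = 0;                  % time already accumulated in the current day
vAcc = 0;                  % volume already accumulated in the current day
for k = 1:numel(dt)
    tk = dt(k);  vk = dv(k);
    while tAcc + tk >= 1                      % step crosses (or reaches) a day boundary
        tNeed = 1 - tAcc;                     % time needed to complete the day
        frac  = tNeed/tk;                     % fraction of this step used for it
        dayVol(end+1,1) = vAcc + frac*vk;     %#ok<AGROW> volume of the finished day
        tk = tk - tNeed;  vk = vk*(1 - frac); % carry the remainder of the step over
        tAcc = 0;  vAcc = 0;                  % start the next day
    end
    tAcc = tAcc + tk;                         % rest of the step stays in the current day
    vAcc = vAcc + vk;
end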
I hope this clarifies things. Thank you for taking the time to analyse this =)


Answers (1)

Torsten on 15 Oct 2025 at 0:14
Edited: Torsten on 15 Oct 2025 at 0:29
Looking at the volume values, I'm almost sure that these values are already cumulative. Why should the increase in volume at later times take such enormous values within relatively small timespans? But I compute both variants below.
Thus, use "cumsum" on the first column of your data to determine the actual time, and either leave the second column as is or also apply cumsum to it. Then use interp1 to interpolate the volume to daily values.
LD = load('timesteps.mat');
Tcum = cumsum(LD.time);      % actual (cumulative) time in days
V    = LD.volume;            % variant 1: volumes are already cumulative
Vcum = cumsum(LD.volume);    % variant 2: volumes are per-step increments
T_dayly = Tcum(1):floor(Tcum(end));        % daily time grid
V_dayly = interp1(Tcum,V,T_dayly);         % interpolate variant 1 to daily values
Vcum_dayly = interp1(Tcum,Vcum,T_dayly);   % interpolate variant 2 to daily values
figure(1)
plot(T_dayly,V_dayly)
grid on
figure(2)
plot(T_dayly,Vcum_dayly)
grid on
  7 Comments
Umar about 8 hours ago
% Robust approach to handle duplicate or near-duplicate time values
% Load your data
load('timesteps.mat');
% Calculate cumulative time
Tcum = cumsum(time);
Vcum = volume;
%% Method 1: Remove exact duplicates using uniquetol
% This handles values that are "close enough" (within tolerance)
tolerance = 1e-10; % Adjust based on your precision needs
[Tcum_unique, unique_idx] = uniquetol(Tcum, tolerance, 'DataScale', 1);
Vcum_unique = Vcum(unique_idx);
fprintf('Original data points: %d\n', length(Tcum));
fprintf('After removing duplicates: %d\n', length(Tcum_unique));
fprintf('Removed %d duplicate/near-duplicate points\n\n', length(Tcum) - length(Tcum_unique));
%% Method 2: Average values at duplicate time points (alternative approach)
% This preserves information if you have legitimate duplicates with different volumes
[Tcum_unique2, ~, ic] = uniquetol(Tcum, tolerance, 'DataScale', 1);
Vcum_unique2 = accumarray(ic, Vcum, [], @mean);
%% Proceed with interpolation using cleaned data
T_daily = 0:1:floor(Tcum_unique(end));
Vcum_daily = interp1(Tcum_unique, Vcum_unique, T_daily, 'linear', 'extrap');
% Calculate daily increments
V_daily = diff(Vcum_daily);
T_daily_increments = T_daily(2:end);
%% Verification
fprintf('Final cumulative volume: %.2f L\n', Vcum_daily(end));
fprintf('Sum of daily increments: %.2f L\n', sum(V_daily));
fprintf('Original final volume: %.2f L\n', Vcum(end));
fprintf('Difference: %.2f L (%.4f%%)\n\n', ...
  Vcum(end) - Vcum_daily(end), ...
  100*abs(Vcum(end) - Vcum_daily(end))/Vcum(end));
%% Additional quality check: identify problematic duplicates
% Find time differences between consecutive points
time_diffs = diff(Tcum);
small_diffs = time_diffs < 1e-6; % Flag very small time steps
if any(small_diffs)
    fprintf('Warning: Found %d time intervals smaller than 1e-6 days\n', sum(small_diffs));
    fprintf('First few occurrences at indices: %s\n', ...
        mat2str(find(small_diffs, 5)'));
    % Show examples
    idx_examples = find(small_diffs, 3);
    if ~isempty(idx_examples)
        fprintf('\nExample near-duplicates:\n');
        for i = 1:length(idx_examples)
            idx = idx_examples(i);
            fprintf('  Point %d: Time=%.12f, Volume=%.2f\n', idx, Tcum(idx), Vcum(idx));
            fprintf('  Point %d: Time=%.12f, Volume=%.2f\n', idx+1, Tcum(idx+1), Vcum(idx+1));
            fprintf('  Difference: %.2e days\n\n', time_diffs(idx));
        end
    end
end
%% Create results table
results_table = table(T_daily_increments', V_daily', ...
  'VariableNames', {'Time_Days', 'Daily_Volume_L'});
% Display first 50 rows
fprintf('First 50 daily volumes:\n');
disp(results_table(1:min(50, height(results_table)), :));
%% Visualization
figure('Position', [100 100 1200 500]);
subplot(1,2,1)
plot(T_daily, Vcum_daily, 'b-', 'LineWidth', 1.5)
hold on
plot(Tcum_unique, Vcum_unique, 'r.', 'MarkerSize', 4)
xlabel('Time (Days)')
ylabel('Cumulative Volume (L)')
title('Cumulative Volume Over Time')
legend('Interpolated Daily', 'Original Data', 'Location', 'northwest')
grid on
subplot(1,2,2)
plot(T_daily_increments, V_daily, 'b-', 'LineWidth', 1)
xlabel('Time (Days)')
ylabel('Daily Volume Increment (L)')
title('Daily Volume Changes')
grid on
ylim([0 max(V_daily)*1.1]) % Better visualization
%% Function to export results
% Uncomment to save results
% writetable(results_table, 'daily_volumes_cleaned.xlsx');
% fprintf('Results exported to daily_volumes_cleaned.xlsx\n');

Note: please see attached results.

Torsten about 7 hours ago
What improvement could be made when the data has repeated values in x (i.e., the x vector is not unique), so that interp1 can still be used?
Before applying interp1, sort out almost equal data points using "uniquetol".
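For example, a small sketch reusing Tcum and Vcum from my answer above (the tolerance value is only illustrative):
tol = 1e-10;                                     % tolerance for "almost equal" times
[Tu, iu] = uniquetol(Tcum, tol, 'DataScale', 1); % keep one point per near-duplicate time
Vu = Vcum(iu);
Vcum_dayly = interp1(Tu, Vu, Tu(1):floor(Tu(end)));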

