Trying to reshape/reformat data set into multiple sets

Hi,
I've been trying to find a way to break a large data set into separate sets without using a for loop or retaking the measurements manually. The data set contains cyclic voltammograms made up of 100 segments (50 complete CVs). My goal is to split it into 50 spectra so that I can run further analysis on each one, since they change over the course of the 100 segments. The data has a total of 140,000 lines, and a new spectrum starts every 2801 lines. The original data set is a 140000x2 matrix: column 1 is potential and column 2 is current.
Thanks!

Answers (3)

Star Strider on 15 Oct 2024
A matrix of 140000 rows won't divide evenly into blocks of 2801 rows, but a matrix of 140050 rows (50 x 2801) will.
Use the mat2cell function to break it up into manageable sub-matrices.
Try this:
t = linspace(0, 1, 2801).';
te = repmat(t, 1, 50);
Potential = 1 + cos(2*pi*t);
Current = 1 + sin(2*pi*t);
VI = repmat([Potential Current], 50, 1);
figure
plot(VI(:,1), 'DisplayName','Potential')
hold on
plot(VI(:,2), 'DisplayName','Current')
hold off
grid
xlabel('Index')
ylabel('Value')
legend('Location','best')
VIcell = mat2cell(VI, ones(1,50)*2801, 2)
VIcell = 50x1 cell array
{2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double}
figure
plot(VIcell{1}(:,1), 'DisplayName','Potential')
hold on
plot(VIcell{1}(:,2), 'DisplayName','Current')
grid
hold off
xlabel('Index')
ylabel('Value')
title('Cell #1')
legend('Location','best')
figure
plot(VIcell{50}(:,1), 'DisplayName','Potential')
hold on
plot(VIcell{50}(:,2), 'DisplayName','Current')
grid
hold off
xlabel('Index')
ylabel('Value')
title('Cell #50')
legend('Location','best')
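
If the real data really has 140000 rows, the row counts passed to mat2cell have to add up to that number, so the last block will be shorter than the others. A minimal sketch of that case, assuming random stand-in data and hypothetical variable names (rowsPerBlock, rowDist, blocks) that are not from the original post:

```matlab
% Hypothetical stand-in for the real 140000x2 data set
data = rand(140000, 2);
rowsPerBlock = 2801;                                % rows per spectrum, from the question

nFull    = floor(size(data, 1) / rowsPerBlock);     % number of complete blocks
leftover = rem(size(data, 1), rowsPerBlock);        % rows after the last complete block

% The row distribution given to mat2cell must sum to size(data, 1)
rowDist = repmat(rowsPerBlock, 1, nFull);
if leftover > 0
    rowDist(end+1) = leftover;                      % shorter final block
end

% Split by rows only; the second distribution (2) keeps both columns together
blocks = mat2cell(data, rowDist, 2);
```

With these numbers, blocks has 49 cells of 2801x2 plus one final cell of 2751x2. Whether that short final block is a partial spectrum or a sign that the true block length differs from 2801 is something to check against the instrument settings.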

Matt J on 15 Oct 2024
Using mat2tiles, downloadable from,
data=rand(140000,2);
spectra=mat2tiles(data,[2801,nan])
spectra = 50x1 cell array
{2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double} {2801x2 double}
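
When the row count is an exact multiple of the block length, a built-in alternative that needs no download is to reshape the matrix into a 3-D array. This is only a sketch under that assumption, using random stand-in data with 140050 rows; the names cube, potential and current are hypothetical:

```matlab
% Hypothetical stand-in data whose row count is an exact multiple of 2801
data = rand(140050, 2);
n = 2801;                          % rows per spectrum
k = size(data, 1) / n;             % number of spectra (50 here)

if k ~= floor(k)
    error('Row count must be an exact multiple of %d for reshape.', n);
end

% MATLAB stores arrays column by column, so column 1 of data fills page 1
% of the cube and column 2 fills page 2, each in blocks of n rows.
cube = reshape(data, n, k, 2);

potential = cube(:, :, 1);         % n-by-k, column j is the potential of spectrum j
current   = cube(:, :, 2);         % n-by-k, column j is the current of spectrum j
```

Having each quantity as an n-by-k matrix makes column-wise operations such as max(current) or plot(potential, current) work on all spectra at once.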

Umar on 15 Oct 2024

Hi @Kayla,

To split a large dataset into separate spectra, you can use MATLAB's matrix indexing. The code below does this with a short for loop that copies one block of rows per spectrum into a preallocated cell array. The loop runs only once per spectrum, so its cost is negligible. I will go through the code step by step.

Step 1: Generate Synthetic Data

The first part of the code is generating synthetic data for demonstration purposes. This is crucial for testing the segmentation process without needing real experimental data.

numLines = 140000; % Total number of data points
potential = linspace(-1, 1, numLines)'; % Example potential values
current = rand(numLines, 1); % Example current values (random data)
data = [potential, current]; % Combine into a 140000x2 matrix
  • numLines: This variable defines the total number of data points in the dataset.
  • potential: A linearly spaced vector representing potential values ranging from -1 to 1.
  • current: A vector of random values simulating current measurements.
  • data: A 140,000x2 matrix combining potential and current values.

Step 2: Define Parameters for Segmentation

Next, define the parameters necessary for segmenting the dataset into spectra.

linesPerSpectrum = 2801; % Number of lines per spectrum
numSpectra = floor(numLines / linesPerSpectrum); % Total number of spectra
  • linesPerSpectrum: This variable indicates how many lines constitute a single spectrum.
  • numSpectra: The number of complete spectra in the dataset, found by dividing the total number of lines by the lines per spectrum and rounding down. For 140,000 lines this gives 49, and the last 2751 lines do not fill a whole spectrum, so they are left out.

Step 3: Preallocate a Cell Array

To store the extracted spectra, we preallocate a cell array, which is efficient for handling variable-sized data.

spectra = cell(numSpectra, 1); % Preallocate cell array

Step 4: Extract Each Spectrum

The core of the segmentation is extracting each spectrum with matrix indexing. A for loop computes the row range of each spectrum and copies that block of rows into the cell array.

for i = 1:numSpectra
  startIndex = (i-1) * linesPerSpectrum + 1; % Calculate start index
  endIndex = startIndex + linesPerSpectrum - 1; % Calculate end index
  spectra{i} = data(startIndex:endIndex, :); % Extract spectrum
end
  • startIndex: This calculates the starting index for each spectrum based on the current iteration.
  • endIndex: This determines the ending index for the current spectrum.
  • spectra{i}: Each spectrum is extracted from the data matrix using the calculated indices and stored in the preallocated cell array.
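
If a loop-free version is preferred, the same start indices can be computed for all spectra at once and used as an index matrix. The following is a sketch only, assuming the synthetic data from Step 1 and MATLAB R2016b or later (for implicit expansion); idx, potentialMat and currentMat are hypothetical names:

```matlab
% Hypothetical stand-in for the real data, built as in Step 1
numLines = 140000;
data = [linspace(-1, 1, numLines)', rand(numLines, 1)];

linesPerSpectrum = 2801;
numSpectra = floor(numLines / linesPerSpectrum);        % complete spectra only

% First row of every spectrum, then every row index as one column per spectrum.
% Adding a row vector to a column vector relies on implicit expansion (R2016b+).
startIndex = (0:numSpectra-1) * linesPerSpectrum + 1;   % 1-by-numSpectra
idx = startIndex + (0:linesPerSpectrum-1)';             % linesPerSpectrum-by-numSpectra

% Indexing a vector with a matrix returns a result shaped like the index matrix
potentialAll = data(:, 1);
currentAll   = data(:, 2);
potentialMat = potentialAll(idx);   % column k holds the potential of spectrum k
currentMat   = currentAll(idx);     % column k holds the current of spectrum k
```

Column k of each matrix holds the same values as spectra{k} in the loop version, so either form can feed the later analysis.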

Step 5: Plotting the Spectra

Finally, we visualize the extracted spectra using MATLAB's plotting functions.

figure;
hold on; % Hold on to plot multiple spectra
colors = lines(numSpectra); % One color per spectrum (the lines colormap repeats after 7 entries)
for i = 1:numSpectra
  plot(spectra{i}(:, 1), spectra{i}(:, 2), 'Color', colors(i, :), ...
    'DisplayName', sprintf('Spectrum %d', i));
end
xlabel('Potential (V)');
ylabel('Current (A)');
title('Cyclic Voltammetry Spectra');
legend show; % Show legend
grid on; % Add grid for better visualization
hold off; % Release the hold
  • figure: Creates a new figure window for plotting.
  • hold on: Allows multiple spectra to be plotted on the same graph.
  • colors: Assigns a color to each spectrum. The lines colormap cycles through 7 colors, so with more than 7 spectra the colors repeat.
  • plot: Plots each spectrum with appropriate labels and colors.
  • xlabel, ylabel, title: These functions label the axes and title the plot for clarity.
  • legend show: Displays a legend to identify each spectrum.
  • grid on: Adds a grid to the plot for better visualization.
  • hold off: Releases the hold on the current figure.

Please see attached.

By using matrix indexing and a preallocated cell array, the code stays fast and easy to follow. Each spectrum is then available as its own 2801x2 matrix for further analysis.
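
As one possible example of that further analysis, cellfun can compute a summary value for every spectrum in a single call. This is a sketch with random stand-in data, not real CV measurements; peakCurrent and meanCurrent are hypothetical names, and the choice of statistics is only an illustration:

```matlab
% Hypothetical stand-in data: 49 complete blocks of 2801 rows each
linesPerSpectrum = 2801;
numSpectra = 49;
data = rand(linesPerSpectrum * numSpectra, 2);

% Same layout as the spectra cell array: one 2801x2 matrix per cell
spectra = mat2cell(data, repmat(linesPerSpectrum, 1, numSpectra), 2);

% One scalar per spectrum, returned as numSpectra-by-1 column vectors
peakCurrent = cellfun(@(s) max(s(:, 2)), spectra);    % largest current in each spectrum
meanCurrent = cellfun(@(s) mean(s(:, 2)), spectra);   % average current in each spectrum
```

Plotting peakCurrent against the spectrum number would then show how the response changes from cycle to cycle.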

Hope this helps.
