How can I group data points together
I need to "bin" data points together into one number. This file has 70,000 lines of data points, in unevenly spaced, often repeating increments. For example, I need to average all the different numbers (2.419, 2.417, 2.405, etc...) with decimals into 2.000.
4 Comments
Image Analyst
on 2 Dec 2020
Why did you delete your question?
Rik
on 2 Dec 2020
I need to "bin" data points together into one number. This file has 70,000 lines of data points, in unevenly spaced, often repeating increments. For example, I need to average all the different numbers (2.419, 2.417, 2.405, etc...) with decimals into 2.000.
Matthew Suddith
on 2 Dec 2020
Thank you! I meant to edit it, but I completely deleted it and couldn't figure out how to get it back.
Rena Berman
on 6 May 2021
(Answers Dev) Restored edit
Answers (1)
Cris LaPierre
on 30 Nov 2020
It doesn't sound like you want to average them. For the example you've given, why not just round the numbers down to 2? The functions round, ceil, floor and fix might be of interest to you.
vals = [2.419, 2.417, 2.405];
round(vals)
ans = 1×3
2 2 2
ceil(vals)
ans = 1×3
3 3 3
floor(vals)
ans = 1×3
2 2 2
fix(vals)
ans = 1×3
2 2 2
35 Comments
Matthew Suddith
on 30 Nov 2020
The numbers are the depths at which the measurements were taken, and each has associated calculations in 30 columns. I need to average each different depth, and its associated calculations, into a single depth so that I have one value at each depth. So 2.4100 m, 2.400 m, etc. become 2.000 m.
Cris LaPierre
on 30 Nov 2020
Edited: Cris LaPierre
on 30 Nov 2020
For those of us not familiar with your data, how would we know what values to use?
Matthew Suddith
on 30 Nov 2020
What values do you need? I need a function to average each decimal depth into a single 2.000 depth.
Cris LaPierre
on 30 Nov 2020
I would think a better understanding of what your data looks like would greatly facilitate proposing a solution. You can attach a sample of your data using the paperclip icon.
Absent that, I would encourage you to look at the documentation for groupsummary. You can average your data by groups you specify.
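To illustrate what groupsummary does, here is a minimal sketch on a made-up table. The variable names Depth and Temp and all values are assumptions for illustration only, not taken from the poster's file, and rounding is just one possible way to define the groups.

```matlab
% Hypothetical toy table: a depth column plus one measured variable
T = table([2.419; 2.417; 2.405; 3.102; 3.250], [10; 12; 14; 20; 22], ...
    'VariableNames', {'Depth', 'Temp'});

% Grouping variable: here, the depth rounded to the nearest integer
T.grpDepth = round(T.Depth);

% Mean of every non-grouping variable within each group
S = groupsummary(T, "grpDepth", "mean");
disp(S)
```

The result has one row per group, a GroupCount column, and one mean_ column for each of the remaining variables.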
Matthew Suddith
on 30 Nov 2020
Here is a small snapshot of the data file. The depths range to the 700s, then return all the way back to the starting depth
Matthew Suddith
on 30 Nov 2020
the depth is the first value on the left
Cris LaPierre
on 30 Nov 2020
Edited: Cris LaPierre
on 30 Nov 2020
Ok, so for the solution I'm thinking about to work, I would need to create an "actual depth" column that would take your values and bin them to the corresponding actual value. Could you tell us what the actual depths should be? How much do the values in the file vary from the actual depths?
Any chance you can upload an actual text file of your data? I'm not feeling motivated enough to transcribe the values in the png. I believe the upload has to be less than 5 MB, so you can delete some rows if necessary. It would be nice to see at least a couple of the different depths in the file.
Matthew Suddith
on 30 Nov 2020
Here's the problem, the actual text file is 70,000 lines, so attached is a small sample of depths 1.988 to 6.811. So then, the final product for those depths would need to look like the finalproduct txt file
Matthew Suddith
on 30 Nov 2020
The txt file has that many points for each depth, all the way up to 750, then it repeats in descending order back to the first depth. It's a CTD file: it collects data at each depth as it is lowered into the ocean, then again on the way up.
Matthew Suddith
on 30 Nov 2020
Open this file instead for the sample, that previously linked file looks strange
Cris LaPierre
on 30 Nov 2020
Ok, so now it's just about developing an algorithm for turning recorded depth into standard depths. Since no guidance has been given on how to do that, I defer to my original answer. Here's some sample code that uses the round function.
data = readtable("sample.txt");
data.Properties.VariableNames(1) = "Depth";
% Create groups by rounding the depths to integer values
data.grpDepth = round(data.Depth);
newData = groupsummary(data,"grpDepth","mean")
newData = 6x31 table
grpDepth GroupCount mean_Depth mean_Var2 mean_Var3 mean_Var4 mean_Var5 mean_Var6 mean_Var7 mean_Var8 mean_Var9 mean_Var10 mean_Var11 mean_Var12 mean_Var13 mean_Var14 mean_Var15 mean_Var16 mean_Var17 mean_Var18 mean_Var19 mean_Var20 mean_Var21 mean_Var22 mean_Var23 mean_Var24 mean_Var25 mean_Var26 mean_Var27 mean_Var28 mean_Var29
________ __________ __________ _________ _________ _________ _________ _________ _________ _________ _________ __________ __________ __________ __________ __________ __________ ___________ ___________ __________ __________ __________ __________ __________ __________ __________ __________ __________ __________ __________ ___________
2 38 2.0818 32.252 14.397 0.13886 1.014 5.9752 260.6 101.96 5.8557 255.39 213.7 0.00020461 2.115 32.252 14.397 6.5789e-05 -0.00016055 246.76 39337 37267 -0.10753 1024 23.982 14.397 4.4036 0.22415 0.19865 3.0578 -9.99e-29
3 31 3.0398 32.252 14.397 0.13741 1.0094 5.9737 260.54 101.94 5.8557 255.39 152.62 0.0002013 3.0614 32.252 14.397 -0.00012905 -0.00024194 246.76 39338 38069 -0.35504 1024 23.982 14.397 4.4052 0.2218 0.19815 2.9156 -3.2226e-30
4 98 3.8058 32.252 14.398 0.13692 1.052 5.9826 260.93 102.09 5.8556 255.39 131.18 0.00019675 3.8405 32.252 14.398 7.3469e-05 -0.00035509 246.76 39339 32514 -0.31612 1024 23.982 14.397 4.4057 0.21853 0.20287 2.8509 -6.3202e-29
5 32 5.0267 32.252 14.397 0.13738 1.0955 5.99 261.25 102.22 5.8556 255.39 110.7 0.00019822 5.0663 32.252 14.398 0.0004 -0.0004125 246.76 39339 35651 -0.09156 1024 23.982 14.397 4.4052 0.21957 0.20769 2.7771 0
6 76 6.0728 32.252 14.397 0.13706 1.1068 5.9849 261.03 102.13 5.8556 255.39 91.28 0.00019676 6.1242 32.252 14.397 0.00010393 -0.00041053 246.76 30540 36233 -0.35731 1024 23.982 14.396 4.4056 0.21853 0.20898 2.6938 -3.1547e-29
7 13 6.6319 32.252 14.367 0.13656 1.0951 5.9974 261.57 102.34 5.8556 255.39 89.053 0.00019123 6.6817 32.252 14.397 0.00018333 -0.0004166 246.76 39340 39340 -0.28558 1024 23.982 14.396 4.4061 0.21462 0.20768 2.6834 0
Matthew Suddith
on 30 Nov 2020
Thank you so much, that looks like it could work for what I need to do! But I get this error message: "Error using round
First argument must be a numeric, logical, or char array."
Matthew Suddith
on 30 Nov 2020
When I substitute the "sample.txt" with the actual file
Matthew Suddith
on 30 Nov 2020
Also, when this creates a table, would I be able to turn that table back into a txt file?
Cris LaPierre
on 30 Nov 2020
This likely means at least one of your depth values contains a value that cannot be rounded. Does one of the rows contain a non-numeric value? Does it work with a subset of the actual data?
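One way to check this could look like the sketch below. The cell array is a made-up stand-in for a depth column that was imported as text; its contents are assumptions, not values from the actual file.

```matlab
% Hypothetical stand-in for a Depth column that was imported as text
depthRaw = {'depth'; '2.419'; '2.417'; 'n/a'; '3.102'};

% round() errors on a cell array, so check the column type first
isNumericColumn = isnumeric(depthRaw);   % false for a cell array

% str2double returns NaN for entries that are not valid numbers
depthNum = str2double(depthRaw);
badRows = find(isnan(depthNum));         % rows that could not be converted
disp(badRows')
```

If the column in the imported table is not numeric, the rows listed in badRows are the ones to look at in the file.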
Matthew Suddith
on 30 Nov 2020
Ok, yes, the problem is that the actual data file has a header on top that isn't numbers. So I need to make it read the file beginning at a certain line: it needs to start at line 365.
Cris LaPierre
on 1 Dec 2020
Do any of those header lines contain variable names identifying what is in each column?
Matthew Suddith
on 1 Dec 2020
Yes they do, they identify what is in each column
Cris LaPierre
on 1 Dec 2020
Which line has that info? Line 364?
Matthew Suddith
on 1 Dec 2020
this is the header, copied from the actual file
Cris LaPierre
on 1 Dec 2020
There are 10 rows unaccounted for. This header has 354 rows, but you mentioned the numbers start at row 365. Are there blank rows between the header and the first row of numbers? Could you attach a subset of your data file containing the first 1000 rows including the header?
Matthew Suddith
on 1 Dec 2020
Sorry, it actually starts at 355. Here is the first 1000
Cris LaPierre
on 1 Dec 2020
Edited: Cris LaPierre
on 1 Dec 2020
There are numerous ways to do this, but probably the easiest to understand is this.
% Skip the 354 header lines; treat runs of delimiters as one and ignore leading ones
data = readtable("first1000.txt","NumHeaderLines",354,...
    "MultipleDelimsAsOne",true,"LeadingDelimitersRule","ignore");
data.Properties.VariableNames(1) = "Depth";
% Create groups by rounding the depths to integer values
data.grpDepth = round(data.Depth);
newData = groupsummary(data,"grpDepth","mean")
Matthew Suddith
on 1 Dec 2020
That worked with the complete file, I hate to keep asking you questions, but how would I then change the name of each mean_variable# column header to the variable it is supposed to be?
Cris LaPierre
on 1 Dec 2020
This has to be done manually. Using the names I see in the header, something like this at the end should do it.
newData.Properties.VariableNames(3:end) = ["depSM","sal00","t090C","CStarAt0",...
"flECO-AFL","sbeox0ML/L","sbox0Mm/Kg","sbeox0PS","oxsatML/L","oxsatMm/Kg",...
"par","turbWETbb0","prDM","sal11","t190C","T2-T190C","secS-priS","timeJ",...
"c0uS/cm","c1uS/cm","C2-C1uS/cm","density00","sigma-theta00","potemp090C",...
"v4","v3","v2","v5","flag"]
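As an aside, if the input table already carried meaningful variable names (an assumption; in this thread readtable produced Var2, Var3, ...), the mean_ prefix that groupsummary adds could be stripped instead of retyping every name. A rough sketch on a made-up table, with names borrowed from the list above purely for illustration:

```matlab
% Hypothetical toy table whose columns already have meaningful names
T = table([2; 2; 3], [2.1; 2.3; 3.2], [14.4; 14.6; 14.1], ...
    'VariableNames', {'grpDepth', 'depSM', 't090C'});
S = groupsummary(T, "grpDepth", "mean");

% Remove only a leading "mean_" from each output variable name
names = string(S.Properties.VariableNames);
S.Properties.VariableNames = regexprep(names, '^mean_', '');
disp(S.Properties.VariableNames)
```

The grpDepth and GroupCount columns do not start with mean_, so they keep their names.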
Cris LaPierre
on 1 Dec 2020
WRT the questions asked here
I really don't have enough information to answer that question. What file are you comparing it to? How were the data grouped and averaged in that file?
You could try using a method other than round.
I'm not sure how using a for/while loop helps you here.
Matthew Suddith
on 1 Dec 2020
The file that I'm comparing it to is the "binned" version of the file I showed you, the one with the 70,000 lines. But I don't know how it was grouped and averaged. I'm happy with the rounding method you showed me, I really appreciate your help. I may continue posting a few more tiny questions, but I don't expect you to answer all night.
Cris LaPierre
on 1 Dec 2020
Edited: Cris LaPierre
on 1 Dec 2020
Try using fix (assigns 2-2.9 a value of 2) or ceil (assigns 2.01-3 a value of 3) instead of round. It's a simple change to make to the code, and together with round they are the 3 most likely methods used.
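To see how the choice changes the grouping, here is a small sketch with made-up depths (the values are assumptions chosen to sit near the bin edges):

```matlab
% Hypothetical depths near the bin boundaries
depths = [2.01 2.49 2.50 2.99 3.00];

binRound = round(depths);   % nearest integer, halves round away from zero
binFix   = fix(depths);     % truncate toward zero
binCeil  = ceil(depths);    % next integer up

% One row per method, one column per depth
disp([depths; binRound; binFix; binCeil])
```

For positive depths, fix and floor give the same groups; they only differ for negative values.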
Matthew Suddith
on 1 Dec 2020
When using writetable, how can I output the table to a text document?
Matthew Suddith
on 1 Dec 2020
so replacing round(data.Depth) with one of those gives me an error on the = both ways
Cris LaPierre
on 1 Dec 2020
Round, ceil, fix and floor all worked for me. I think you have a syntax error. I suggest reading the documentation I linked to previously to see how to use them. You should just have to replace "round" in the current code with "ceil", for example.
Yes, it is possible to use writetable to save a table to a text file. Again, read the documentation I linked to previously to see how to do it.
Matthew Suddith
on 1 Dec 2020
Is there a way to format the output of writetable? The txt file it outputs is a big jumbled mess.
Cris LaPierre
on 1 Dec 2020
It looks like by default it writes a csv file. You could look at the name-value pairs for what options are available.
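For example, the Delimiter name-value pair switches from commas to tabs, which may be easier to read in a text editor. A minimal sketch with a made-up table and a hypothetical output file name in the temp folder:

```matlab
% Hypothetical small table standing in for the binned result
T = table([2; 3], [32.252; 32.252], 'VariableNames', {'grpDepth', 'mean_sal'});

% Hypothetical output location; any writable path would do
outFile = fullfile(tempdir, 'binned_example.txt');

% Write tab-separated text instead of the default comma-separated
writetable(T, outFile, 'Delimiter', 'tab');

% Read it back to confirm the round trip
back = readtable(outFile, 'Delimiter', 'tab');
disp(back)
```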
Matthew Suddith
on 2 Dec 2020
That worked. How could I plot my rounded data and the original data in one plot for a direct comparison, is that doable?
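One possible approach is to draw both on the same axes with hold on. The sketch below uses synthetic data, not the poster's file, and the names Depth and Temp are assumptions for illustration.

```matlab
% Synthetic stand-in data: depth and one measured variable
depth = [2.1; 2.4; 2.6; 3.1; 3.4; 3.9; 4.2];
temp  = [14.40; 14.39; 14.41; 14.35; 14.36; 14.30; 14.28];
data  = table(depth, temp, 'VariableNames', {'Depth', 'Temp'});

% Bin by rounded depth and average Temp within each bin
data.grpDepth = round(data.Depth);
binned = groupsummary(data, "grpDepth", "mean", "Temp");

figure
plot(data.Temp, data.Depth, '.')                 % original points
hold on
plot(binned.mean_Temp, binned.grpDepth, '-o')    % binned means
hold off
set(gca, 'YDir', 'reverse')                      % depth increases downward
xlabel('Temp')
ylabel('Depth (m)')
legend('original', 'binned', 'Location', 'best')
```

Plotting depth on a reversed y-axis is only a convention for profile data; swap the axes if a different layout reads better.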