How to plot 2D location vs corresponding data in MATLAB
Hi,
I have the locations of users as latitude and longitude, and each user has a particular demand value. I have to create clusters using the users' positions and demand. Can anyone help me design this, given that there are three variables?
Another part of the problem is how to plot it so that latitude is on the x axis, longitude is on the y axis, and the user demand is shown at the corresponding point.
It is like this
User Lattitude Longitude Demand
1 38.8643 9.2866 13
and so on
Please help me out.
9 Comments
Dyuman Joshi
on 14 May 2022
What are the criteria for making a cluster? What is the data type? Numeric/String/Char/Cell/Table? Just copy-pasting doesn't help. Mention it specifically so that it is easy for us to help you.
Shwet Kashyap
on 16 May 2022
Thanks for the reply. I have the latitude, longitude, and demand of 6000 users in Europe. The data is already divided into 71 different areas, but some areas have zero demand and some have high demand. Based on these three variables I want to make 71 clusters in which the user demand becomes uniform; the cluster area may vary.
Shwet Kashyap
on 16 May 2022
At least, could you help with plotting this data, with latitude on the x axis, longitude on the y axis, and the corresponding demand shown on the graph?
Shwet Kashyap
on 16 May 2022
Hi,
Please find the attached Excel file. I would be grateful for your help, as I have been working on this for the last four days but have not been able to plot it.
Regards,
Dyuman Joshi
on 16 May 2022
I plotted a 3D plot: x - latitude, y - longitude, z - demand.
This is what I obtained. I still don't know what you want to plot. And what did you try?
Shwet Kashyap
on 16 May 2022
Thanks a lot Dyuman, but it should be 2D: x - latitude, y - longitude, and demand should be the value corresponding to that particular latitude and longitude. It may be a geo density plot that I am trying for; I have plotted it using a geobubble plot.
Using these three variables (latitude, longitude, and demand) I want to perform k-means clustering. With two variables it would be easy to perform clustering, but I am stuck because of the three variables.
The attached xls file contains the data, which is divided into 71 beams, or clusters; for beam 1 there is zero demand. The attached pdf is the plot of the latitude, longitude, and demand for the problem. Hope it helps.
regards
Shwet Kashyap
on 18 May 2022
Hey Dyuman,
Did you get the data? Any progress or suggestions, please, if possible?
Walter Roberson
on 18 May 2022
(Context appears to be fixed-beam signals for geostationary communication satellites.)
Answers (3)
KSSV
on 18 May 2022
Edited: KSSV
on 18 May 2022
Read about kmeans.
T = readtable('https://in.mathworks.com/matlabcentral/answers/uploaded_files/999455/MATLAB.xlsx')
T = 6689×3 table
Lattitude Longitude Demand
_________ _________ ______
38.864 -9.2866 13
39.073 -9.1616 13
38.606 -9.0866 13
38.614 -8.9283 13
39.089 -8.8949 13
38.848 -8.8116 13
37.706 -8.7699 13
37.622 -8.7116 13
38.614 -8.6949 13
39.006 -8.6449 13
37.347 -8.6282 13
38.731 -8.5782 13
38.864 -8.5449 13
39.048 -8.5282 13
38.214 -8.4866 13
38.439 -8.4282 13
x = T.(1) ;
y = T.(2) ;
d = T.(3) ;
idx = kmeans([x y d],71) ;
Warning: Ignoring rows of X with missing data.
scatter3(x,y,d,[],idx,'filled')
colorbar
cmap = turbo(71) ;
colormap(cmap)
view(2)
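For the 2D view asked about in the question (latitude on x, longitude on y, demand shown at each point), a plain scatter with demand mapped to colour may be enough. This is a minimal sketch, not the answerer's code: the coordinates are four rows from the table printed above, and the demand values 255 and 555 are attached to rows arbitrarily for illustration.

```matlab
% Four coordinate pairs from the table above; the demand pairing is hypothetical.
lat    = [38.864; 39.073; 38.606; 38.614];
lon    = [-9.2866; -9.1616; -9.0866; -8.9283];
demand = [13; 13; 255; 555];

figure
h = scatter(lat, lon, 36, demand, 'filled');   % marker colour encodes demand
xlabel('Latitude (deg)')
ylabel('Longitude (deg)')
cb = colorbar;
cb.Label.String = 'Demand';
```

With the real data, `lat`, `lon` and `demand` would be the three table columns (`T.(1)`, `T.(2)`, `T.(3)`).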
5 Comments
Shwet Kashyap
on 19 May 2022
Thank you so much for answering. I have a few more questions. How can I access the data allotted to each of the 71 beams in Excel format, and how can I apply k-means++ and DBSCAN to the same data?
Regards,
Thanks once again
KSSV
on 19 May 2022
Shwet Kashyap
on 19 May 2022
Thanks a lot once again. I wanted the demand cluster-wise, for example the demand of clusters 1, 2, 3, ..., 71.
I tried the following:
idx = T.cluster == 1 ;
T(idx,:)
Unrecognized table variable name 'cluster'.
idx = T.color == 1 ;
T(idx,:)
Unrecognized table variable name 'color'.
Shwet Kashyap
on 20 May 2022
Hey, I tried the dot command and was able to plot the data as dots, but I am still not sure how to count the number of dots or get the corresponding values. Please help.
Attaching the plot.
Shwet Kashyap
on 22 May 2022
idx = T.Demand == 13 ;
T(idx,:)
This command provides data from the original table T; I require data from the clustered table.
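The "Unrecognized table variable name" errors above arise because the cluster labels returned by kmeans were never stored in the table. A minimal sketch of one way to get per-cluster totals, assuming a small stand-in table and hand-written labels (both hypothetical; in practice `idx` would come from `kmeans`, and rows that kmeans skipped for missing data carry `NaN` labels that must be excluded first):

```matlab
% Hypothetical stand-in for the real table and for the kmeans output.
T = table([38.864; 39.073; 38.606; 37.706], ...
          [-9.2866; -9.1616; -9.0866; -8.7699], ...
          [13; 13; 255; 555], ...
          'VariableNames', {'Lattitude', 'Longitude', 'Demand'});
idx = [1; 1; 2; 2];                 % would come from idx = kmeans(...)

T.cluster = idx;                    % store the labels so that T.cluster exists
valid = ~isnan(T.cluster);          % drop rows kmeans ignored (NaN label)
k = max(T.cluster(valid));
totalDemand = accumarray(T.cluster(valid), T.Demand(valid), [k 1]);  % demand per cluster
userCount   = accumarray(T.cluster(valid), 1, [k 1]);                % users per cluster
summary = table((1:k)', userCount, totalDemand, ...
                'VariableNames', {'cluster', 'users', 'demand'});
disp(summary)

cluster1 = T(T.cluster == 1, :);    % all rows assigned to cluster 1
% writetable(T, 'clustered.xlsx')   % optional: export the labelled rows to Excel
```

The file name `clustered.xlsx` is only a placeholder.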
Walter Roberson
on 18 May 2022
You can create a triangulation object and then use trimesh() to plot the data.
Or you can create a scatteredInterpolant() object and interpolate over a grid of coordinates and then imagesc() or pcolor() to create a map.
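The second suggestion could look roughly like the sketch below. The five sample points and their demand values are hypothetical, and the 50-by-50 grid size is an arbitrary choice; extrapolation is switched off so that grid cells outside the convex hull of the data come back as NaN.

```matlab
% Hypothetical sample points; replace with the real table columns.
lat    = [38.0; 38.0; 39.0; 39.0; 38.5];
lon    = [-9.5; -8.5; -9.5; -8.5; -9.0];
demand = [13; 13; 255; 13; 555];

F = scatteredInterpolant(lat, lon, demand, 'linear', 'none');  % no extrapolation
[LAT, LON] = meshgrid(linspace(min(lat), max(lat), 50), ...
                      linspace(min(lon), max(lon), 50));
D = F(LAT, LON);                   % interpolated demand on the regular grid

figure
pcolor(LAT, LON, D)                % 2D map: colour encodes demand
shading interp
xlabel('Latitude (deg)')
ylabel('Longitude (deg)')
colorbar
```

Note that interpolation fills the space between users with made-up intermediate values, which may or may not be meaningful for demand data.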
7 Comments
Walter Roberson
on 18 May 2022
"i want to make 71 clusters in which the user demand becomes uniform"
Suppose that you have one of the 555 demand sites, and then a gap, and then (say) 500 km away you had one of the 13 demands. Suppose that you had 3 satellites available for this subset. (555+13)/3 is about 189 load per satellite. So you park two of them directly over the 555 demand, covering 378 of the 555 demand with high efficiency near-vertical direction. The remaining 177 of the 555 and the 13 of the other have to be covered by the remaining satellite.
Do you park that third satellite directly over the 555, giving full signal to all 555 units but inferior service, from 200 km away and at a steeper angle, to the 13 site? Do you declare that each site is of equal importance regardless of its demand, and so position it halfway between, with the 177 units and the 13 units each getting mediocre service from 100 km away, so that the first 378 units of the 555 would get excellent service but the remainder would get mediocre service?
Do you position a distance proportional to the demand, so 13/190*200km away from the 555 (more generally, at the center of mass of the demand being served)?
... or do you position two satellites over the 555 and have them serve the 555 between them, and position the third over the 13, giving efficient service to all of them, assuming that the 555 load can be satisfied between the two satellites?
What is the consequence of exceeding the ideal load? Failure of all communication? Smooth (linear) degradation of service? Exponential degradation of service (finding a communication slot gets less and less probable due to competition, and signals from independent stations clash, forcing both to back off)? Attempts beyond the limit just do not get served?
Is it truly the case that the allocation is to be purely by total demand over the area, and no consideration is to be given to cost of beaming the signal longer distances, or the fact that steeper angles through the atmosphere can result in more skipping or more path loss, or that for reliability, longer distances might call for a lower payload rate (increased length of error correcting code)?
It would be my expectation that allocation strictly by total demand on the satellite, as if all demand is the same cost, would not be appropriate.
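The "centre of mass of the demand" option mentioned above can be checked with the numbers from that example: 177 residual units at the 555 site and 13 units at the other site, using the 200 km separation from that paragraph. This is only the arithmetic, under the assumption that position is measured along the straight line between the two sites.

```matlab
pos    = [0; 200];      % km along the line joining the sites (0 = the 555 site)
demand = [177; 13];     % load the third satellite has to serve at each site
centroid = sum(pos .* demand) / sum(demand);    % demand-weighted mean position
fprintf('%.1f km from the 555 site\n', centroid)
```

This reproduces the `13/190*200km` figure, roughly 13.7 km from the 555 site.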
Walter Roberson
on 18 May 2022
Suppose that you have a 555 demand and a 13 demand, with the two being substantial distance apart. And suppose that you have two satellites available. You put one satellite over the 555, and the other goes where? Based on dividing the demand equally, 284 of the 555 gets served by one satellite, and the other has to serve the remainder and the 13. But if they are far enough apart then any possible position of the second satellite is over the horizon from at least one of the two sites.
If you position the satellites at equal distances from each other and from the sites, you get site 555, then 1/3 of the distance between sites to the first satellite, then another third of the distance to the second satellite, then the remaining third to site 13. Seems fair, right? But now the satellites might be over the horizon from both sites, and over the horizon from each other! No communication at all!
Maybe give up on equal demand per satellite, and position one over each of the two sites so that each gets service?
... though how are the satellites communicating with each other? Don't you need to reserve some satellites to be relay stations, possibly not in geostationary orbits in order to be able to reach multiple satellites? There could potentially be a hierarchy of distances away rather than relying on each satellite to store-and-forward in series to the next geostationary satellite within view...
Shwet Kashyap
on 19 May 2022
Hey Walter,
Thanks a lot for your reply and concern. The situation here is that we have only one geostationary satellite, with an array-fed reflector antenna that produces multiple spot beams covering the allotted areas. The problem is that in the area of some spot beams there is zero demand, or no users. In some beam areas the demand is higher, but the power supply/data rate from the satellite is the same for all beams. Hence in some beams the capacity is underutilised, and in some beams it is overutilised. I am trying to create 71 clusters using k-means or k-means++ to make the demand uniform across clusters. One cluster corresponds to one beam. The beam projections can be altered through on-board digital signal processing on the satellite.
Walter Roberson
on 22 May 2022
Given the conditions you have set out, the algorithm is this:
- adjust all beams to cover the exact same area, all locations
- each endpoint should generate a random beam number to associate with. If the throughput drops below an acceptable value, the endpoint should re-associate with a random beam.
This algorithm works because the throughput of a beam is not affected by the area the beam is covering, so you might as well have all beams cover all areas.
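A small simulation of the first step of that scheme (uniform random beam choice) shows how evenly the load tends to spread. This is a sketch under stated assumptions: 6000 endpoints and 71 beams as in the thread, every endpoint given a hypothetical demand of 13, and the re-association step on low throughput left out.

```matlab
rng(1)                                    % fixed seed so the run is repeatable
nUsers = 6000;
nBeams = 71;
demand = 13 * ones(nUsers, 1);            % hypothetical: every endpoint demands 13
beam = randi(nBeams, nUsers, 1);          % each endpoint picks a beam uniformly at random
loadPerBeam = accumarray(beam, demand, [nBeams 1]);   % total demand on each beam
fprintf('mean %.0f, min %d, max %d\n', ...
        mean(loadPerBeam), min(loadPerBeam), max(loadPerBeam))
```

With the real demand column in place of the constant vector, the same `accumarray` call gives the per-beam load.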
Walter Roberson
on 23 May 2022
%kmeans
target_clusters = 71;
[idx_kmeans, centers_kmeans] = kmeans([x y d], target_clusters, 'EmptyAction', 'singleton') ;
T.cluster_kmeans = idx_kmeans;
%dbscan
min_per_cluster = floor(numel(x)/target_clusters * 0.95);
[idx_dbscan, core_dbscan] = dbscan([x y d], 0.5, min_per_cluster);
idx_dbscan(idx_dbscan < 0) = nan;
T.cluster_dbscan = idx_dbscan;
%plot kmeans
subplot(2,1,1)
scatter3(x, y, d, [], idx_kmeans, 'filled');
colorbar();
title('cluster by kmeans');
%plot dbscan
subplot(2,1,2)
scatter3(x, y, d, [], idx_dbscan, 'filled')
colorbar();
title('cluster by dbscan');
%illustrating taking a subset by cluster number
idx = T.cluster_kmeans == 1;
subset1 = T(idx,:);
This is the code you asked for.
This code will not do what you want.
When you use kmeans, the only control over cluster size is EmptyAction -- you can prevent a cluster from becoming completely empty.
kmeans makes no attempt to equalize the cluster size. None. Equalizing cluster size is completely absent from the algorithm.
For example if you had
13 255
and two clusters, then it would put one of the clusters at the 13, and the other at the 255, and absolutely would not consider putting a cluster in the middle to split the 255 to give more equal cluster sizes.
dbscan, on the other hand, has no way to request a specific number of clusters. You might notice that I have configured a minimum cluster size of 95% of what would give you an equal distribution for the target number of clusters, but since that is only a minimum, dbscan could decide to just put everything into (say) 5 clusters.
Furthermore, in both cases, the "demand" (d) information is being used as a coordinate, not as a replicate. A location with a demand of 255 will not be split between two beams for either kmeans or dbscan. Instead, the demand will be taken pretty much as a Z coordinate, and treated as a distance. And since "13" is a fair difference from "255", the effect would tend to be to group all of the 13 together providing that they are geographically not too far apart (250-ish degrees... which is basically further away than it is possible to get on the world.)
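To see how strongly the demand column dominates when it is treated as a third coordinate, compare the Euclidean distance between two nearby locations with and without it. The coordinates are two rows from the table above; the demand of 255 on the second point is a hypothetical value for illustration.

```matlab
p = [38.864 -9.2866  13];      % [lat lon demand]
q = [38.614 -8.9283 255];      % nearby location, hypothetical demand of 255
dGeo = norm(p(1:2) - q(1:2));  % distance using the coordinates only (degrees)
dAll = norm(p - q);            % distance when demand is a third coordinate
fprintf('coordinates only: %.3f, with demand: %.3f\n', dGeo, dAll)
```

The two sites are less than half a degree apart geographically, yet more than 240 units apart once demand is included.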
In order to have any possibility of splitting a location between different beams, instead of using [x, y, d] coordinates, what you would need to do is similar to
repmat([x, y], d, 1)
and then add "jitter" to the coordinates, converting a single point with demand 255 into 255 nearby points with demand 1. That at least could result in the location being split. But in practice, unless you add an amount of jitter roughly the same as half the distance between locations, then you will just get a cluster placed at the centroid that will "absorb" the 255 individual points. See again what I said about kmeans never even attempting to use clusters the same size.
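A minimal sketch of that replicate-and-jitter idea, assuming integer, non-negative demands. `repelem` is used here instead of `repmat` because it accepts a per-row repeat count; the two locations, the small demands, and the jitter scale of 0.05 degrees are hypothetical values chosen for illustration.

```matlab
rng(0)                                     % repeatable jitter
xy = [38.864 -9.2866; 38.614 -8.9283];     % one row per location: [lat lon]
d  = [3; 5];                               % hypothetical integer demand per location
pts = repelem(xy, d, 1);                   % one row per unit of demand
jitterScale = 0.05;                        % degrees; assumed, tune to the site spacing
pts = pts + jitterScale * randn(size(pts));   % spread the copies around each site
```

`pts` can then be passed to a clustering routine in place of `[x y d]`, with the caveat from the text that small jitter will still leave each site inside a single cluster.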
Do algorithms exist at all? Probably yes. For example the task is much the same as dividing land up into political districts of equal-ish population. See for example http://autoredistrict.org/
Do you have a hope using kmeans or dbscan? NO.
I would like to take this opportunity to remind you that I already posted a solution that is fully compliant with all rules that you have established: namely to have all of the beams cover the entire area, and then join beams at random. If that is not an acceptable solution, then the implication is that you have other criteria that you have not discussed.
Shwet Kashyap
on 24 May 2022
Hey,
Thanks for your attempt; maybe I was not able to explain the problem to you.
Here I don't want to divide the demand data equally among clusters; rather, I want to minimize the uneven distribution by using clustering techniques such as k-means clustering, dbscan, the Gaussian Mixture Model algorithm, the BIRCH algorithm, hierarchical clustering, fuzzy clustering, etc. The original data shows an uneven distribution, which I want to minimize. Some people have already done work on the same data applying k-means++ clustering; I was trying to validate it and then apply other methods.
Walter Roberson
on 24 May 2022
kmeans and dbscan are completely incompetent at making the distribution fair. I believe some of the other algorithms you mentioned are as well.
Shwet Kashyap
on 25 May 2022
Hey, it may be that they are not able to make a fair distribution, which is a subject of research, but one should be able to make clusters using the data given. If we ignore Demand, there must be a way to create clusters using the latitude and longitude. Meanwhile, kmeans is providing some improvement over the original data: in the original data some clusters had zero demand, but after k-means clustering all clusters have some demand. The dbscan is a matter of research. Can this data be processed using the support vector machine method?
regards,
11 Comments
Walter Roberson
on 25 May 2022
https://www.mathworks.com/matlabcentral/answers/1719025-how-to-plot-2d-location-vs-corresponding-data-in-matlab#comment_2173025
take that code and remove d from the [x y d] of the kmeans and dbscan calls. You will get clusters that ignore demand.
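Concretely, dropping `d` means passing only the two coordinate columns. A minimal sketch on synthetic data (two hypothetical, well-separated groups of points and `k = 2`, rather than the real table and 71 clusters); it needs the Statistics and Machine Learning Toolbox for `kmeans`.

```matlab
rng(0)                                         % repeatable synthetic data and seeding
x = [38 + rand(30,1); 48 + rand(30,1)];        % hypothetical latitudes, two groups
y = [-9 + rand(30,1);  2 + rand(30,1)];        % hypothetical longitudes
[idx, C] = kmeans([x y], 2, 'EmptyAction', 'singleton');   % demand is not used

figure
scatter(x, y, 36, idx, 'filled')               % colour by cluster label
xlabel('Latitude (deg)')
ylabel('Longitude (deg)')
```

The same change applies to the dbscan call: pass `[x y]` instead of `[x y d]`.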
Walter Roberson
on 25 May 2022
"Can this data be processed using support vector machine method"
Yes, of course it can be. The question you should be asking is whether it can be usefully processed with SVM.
SVM aims to find a hyperplane that best separates two classes. So to use it, each item being processed must be associated with a class label. For example you could label each item with its demand, and then ask to find a parabola that best separates the 13s on one side and the 255s on the other side.
Would that help? No.
You could assign 71 different class labels, one for each beam, and ask SVM to find 70 dividing lines.
Would that help? No. It would not give you any guidance as to which label to assign to which point to start with.
But, hey, you can totally use SVM if you want to. All that will happen is that you will waste your time, other than you will be able to point to the failure of the method in your report. Make-work reports always appreciate if you prove experimentally that a technique is doomed to failure, rather than just proving from theory that it is doomed to failure, since the working hypothesis of such useless reports is that you probably don't understand the theory.
Shwet Kashyap
on 27 May 2022
Hey,
There is inconsistency in the outcome of k-means with the same data. For example, when I ran the program yesterday there were 50 users in total in cluster 20, but when I ran it today there were 105 users in cluster 20.
One more thing I need to know is how to set the number of iterations in the program.
What would the k-means++ command be for the same data?
Regards,
Walter Roberson
on 27 May 2022
By default kmeans initializes cluster centers randomly.
To get repeatable behaviour, you can tell kmeans to initialize with particular values that you supply, or you can use rng() to set the random number generator to a consistent state before each call to kmeans that you need to repeat.
To set the number of iterations for kmeans, pass 'MaxIter', and the maximum number you want.
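MathWorks does not supply a separate kmeans++ function, but the 'Start' option of kmeans accepts 'plus' (k-means++ seeding), which is the documented default in recent releases. If you use https://www.mathworks.com/matlabcentral/fileexchange/28804-k-means instead, then there is no way to set the number of iterations.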
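The advice above about repeatability and the iteration limit can be sketched as follows. The data is synthetic and the values (seed 42, 4 clusters, 200 iterations, 5 replicates) are arbitrary choices for illustration; `'Replicates'` is an additional documented kmeans option that reruns the clustering from several starts and keeps the best result, which may reduce run-to-run variation.

```matlab
rng(1)
X = [rand(40,2); rand(40,2) + 5];     % hypothetical 2-column data

rng(42)                               % same generator state before each call...
idxA = kmeans(X, 4, 'MaxIter', 200, 'Replicates', 5);
rng(42)
idxB = kmeans(X, 4, 'MaxIter', 200, 'Replicates', 5);

disp(isequal(idxA, idxB))             % ...so the two runs give identical labels
```

Note that `'MaxIter'` only caps the number of iterations; it does not make the result stable on its own.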
Shwet Kashyap
on 29 May 2022
target_clusters = 71;
[idx_kmeans, centers_kmeans] = kmeans([x y d], target_clusters, 'EmptyAction', 'singleton', "MaxIter', 200) ;
T.cluster_kmeans = idx_kmeans;
The MaxIter option is not working. Am I using or writing it wrong?
Shwet Kashyap
on 30 May 2022
[idx_kmeans, centers_kmeans] = kmeans([x y d], target_clusters, 'EmptyAction', 'singleton',"MaxIter',1000) ;
↑
Error: String is not terminated properly.
Walter Roberson
on 30 May 2022
Edited: Walter Roberson
on 30 May 2022
[idx_kmeans, centers_kmeans] = kmeans([x y d], target_clusters, 'EmptyAction', 'singleton','MaxIter',1000) ;
Shwet Kashyap
on 31 May 2022
The MaxIter option is not giving a stable solution. I require a stable solution with less fluctuation; in one, two, or three clusters the demand may be higher, but not very high.
I was going through k-means++, and to run the program it requires code. How do I write that code for my data table of longitude, latitude, and demand?
Walter Roberson
on 31 May 2022
I already explained multiple times that the kmeans algorithm makes absolutely no attempt to balance the number of items in a cluster. Balance is completely outside of the kmeans algorithm.
Walter Roberson
on 31 May 2022
The entire point of kmeans is to group points that are close together, and separate the groups (clusters). If you have a demand of 555 at one location and a distance to a demand of 13 and you ask for two clusters, then it is going to put one at the 13 and the other at the 555, and will not try to split the load.