Run Matlab file on each Node of Hadoop Cluster
Truong Nang Toan
on 16 Oct 2019
Commented: Truong Nang Toan
on 17 Oct 2019
Hi everybody,
I want to build a cluster using Hadoop, but I don't know whether I can run a MATLAB file on each DataNode. Do I need to install MATLAB on every node, or is there another way to solve this problem?
Please help me. Thanks a lot.
Accepted Answer
Steven Lord
on 16 Oct 2019
This documentation page describes how to configure a Hadoop cluster so that client MATLAB sessions can submit to it. The first step in that workflow, "Cluster Configuration", links to a page that describes how to install MATLAB on the worker nodes.
I don't think this process has changed in recent releases, but if you're using a release older than the current one (the online documentation is for the current release, which right now is R2019b), you probably want to find the equivalent of that page in the installed documentation of one of the client machines that will submit to the Hadoop cluster. Alternatively, find the correct release's documentation in the documentation archive.
Jason Ross
on 16 Oct 2019
Hadoop handles the splitting and distribution of your data when you put it into HDFS using something like "put" or "copyFromLocal". You specify your datasets with the datastore command. The mapreduce step runs as a job on the YARN scheduler, which starts the MATLAB installation you specify on the cluster object. There are a variety of ways to get your data back; for example, you can write it to HDFS.
We have a number of examples and documentation on how to use mapreduce here. There is a worked example at the bottom of the page that shows the mapreduce workflow.
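As a rough sketch of that workflow (the HDFS paths, the Hadoop install folder, and the mapper/reducer below are assumptions for illustration, not from this thread), the client-side MATLAB code looks something like:

```matlab
% Sketch only: assumes data.csv with an ArrDelay column was already
% copied into HDFS, e.g. with "hdfs dfs -put data.csv /user/toan/".
ds = datastore('hdfs:///user/toan/data.csv', ...
    'SelectedVariableNames', 'ArrDelay');

% The cluster object records where Hadoop (and MATLAB) live on the
% worker nodes; mapreducer tells mapreduce to run there via YARN.
cluster = parallel.cluster.Hadoop('HadoopInstallFolder', '/opt/hadoop');
mr = mapreducer(cluster);

% Run the job and write the result back into HDFS.
outds = mapreduce(ds, @delayMapper, @delayReducer, mr, ...
    'OutputFolder', 'hdfs:///user/toan/results');

% Hypothetical mapper/reducer: accumulate sum and count per chunk,
% then combine them into an overall mean arrival delay.
function delayMapper(data, ~, intermKVStore)
    add(intermKVStore, 'delay', ...
        [sum(data.ArrDelay, 'omitnan'), sum(~isnan(data.ArrDelay))]);
end

function delayReducer(key, intermValIter, outKVStore)
    total = [0 0];
    while hasnext(intermValIter)
        total = total + getnext(intermValIter);
    end
    add(outKVStore, key, total(1) / total(2));
end
```

MATLAB only needs to be installed once per worker node (per the cluster configuration step above); the scheduler starts it on whichever nodes run the map and reduce tasks.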