Reading HDFS from Matlab - what toolboxes do I need?

We're planning to implement Hadoop at my work, and I need a way to retrieve data from the Hadoop clusters in the data lake and get it into MATLAB. Which toolboxes do I need for this? Note that I only need to read data from HDFS files.
Additionally, would I need other toolboxes to be able to read data?

Answers (2)

Hadoop Sequence Files can be read directly in base MATLAB.
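As a rough illustration of reading such files, here is a minimal sketch using the datastore function. The Hadoop install folder, host name, port, and HDFS folder are all hypothetical placeholders, and it assumes the sequence files were written by MATLAB's own mapreduce, which is the kind a 'keyvalue' datastore is documented to read.

```matlab
% Sketch only: install folder, host, port, and HDFS folder are hypothetical.
% Tell MATLAB where the local Hadoop client installation lives.
setenv('HADOOP_HOME', '/usr/local/hadoop');

% A 'keyvalue' datastore reads MAT-files or sequence files produced by mapreduce.
ds = datastore('hdfs://namenode.example.com:8020/user/me/results', ...
    'Type', 'keyvalue');

% Read in chunks so the full data set never has to fit in memory at once.
while hasdata(ds)
    kv = read(ds);   % table with the variables Key and Value
    fprintf('Read %d key-value pairs\n', height(kv));
end
```

If the data is small enough, readall(ds) returns everything in one table instead of looping.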
If you want to run "mapreduce" on a Hadoop cluster, then you need licenses for Parallel Computing Toolbox and MATLAB Distributed Computing Server. Documentation on how to configure a Hadoop cluster and run "mapreduce" on it is linked below.
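For orientation, running mapreduce against a cluster might look roughly like the sketch below. It assumes the two toolboxes above are licensed and installed; the install folder, HDFS paths, and the function names countRowsMapper and countRowsReducer are hypothetical, and the two functions would be saved as their own files on the MATLAB path.

```matlab
% Sketch only: folders, HDFS paths, and function names are hypothetical.
setenv('HADOOP_HOME', '/usr/local/hadoop');
cluster = parallel.cluster.Hadoop('HadoopInstallFolder', '/usr/local/hadoop');
mr = mapreducer(cluster);   % make the Hadoop cluster the execution environment

% Input: delimited text files stored in HDFS.
ds = datastore('hdfs:///user/me/input/*.csv', 'Type', 'tabulartext');

% Run the job on the cluster; results are written back to HDFS.
outds = mapreduce(ds, @countRowsMapper, @countRowsReducer, mr, ...
    'OutputFolder', 'hdfs:///user/me/output');
result = readall(outds);    % table of keys and values

% --- countRowsMapper.m ---
function countRowsMapper(data, ~, intermKVStore)
    % data is one chunk of the input, delivered as a table.
    add(intermKVStore, 'rowCount', height(data));
end

% --- countRowsReducer.m ---
function countRowsReducer(key, valueIter, outKVStore)
    % Sum the per-chunk counts emitted by the mapper.
    total = 0;
    while hasnext(valueIter)
        total = total + getnext(valueIter);
    end
    add(outKVStore, key, total);
end
```

This example only counts rows; the point is the overall shape (cluster object, mapreducer, datastore, mapreduce call), not the particular computation.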
The h5read function, which reads HDF5 files, has been part of base MATLAB since release R2011a and requires no additional toolboxes.
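A small self-contained sketch of h5read follows. It writes a temporary HDF5 file first so there is something to read; the dataset name /measurements is made up for the example. Keep in mind that HDF5 is a file format and is not the same thing as HDFS, the Hadoop file system.

```matlab
% Create a small HDF5 file in the temp folder so the example is runnable.
fileName = [tempname '.h5'];
h5create(fileName, '/measurements', [3 4]);              % 3-by-4 double dataset
h5write(fileName, '/measurements', reshape(1:12, 3, 4));

% Read the whole dataset.
data = h5read(fileName, '/measurements');

% Read only part of it: start at row 1, column 2 and take 3 rows, 2 columns.
block = h5read(fileName, '/measurements', [1 2], [3 2]);

delete(fileName);   % clean up the temporary file
```

The start and count arguments are useful for large files, since only the requested part of the dataset is loaded.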

Asked:

on 8 Sep 2017

Answered:

on 11 Sep 2017
