Components: Hadoop's HDFS

Description: One of the most important parts of the Hadoop technology stack: the distributed filesystem. It serves as the main storage for processed data, as the data source for various ML tasks, and as the main storage for the Hive warehouse.

Provision script (Puppet manifest): hadoop.pp

Additional info: Hadoop site


Table of contents: