How to calculate Hadoop cluster growth plan based on storage


This calculation is for a small 3-node Hadoop cluster and assumes an average daily ingest rate of 10 GB.

Average daily ingest rate  10 GB
Replication factor  3  (copies of each block)
Daily raw consumption  30 GB  (ingest × replication)
Node raw storage  600 GB  (2 × 300 GB SATA II HDD)
MapReduce temp space reserve  25%  (for intermediate MapReduce data)
Node-usable raw storage  450 GB  (node raw storage – MapReduce reserve)
1 year (flat growth)  25 nodes  (ingest × replication × 365 / node-usable raw storage = 10 GB × 3 × 365 / 450 GB ≈ 24.3, rounded up to whole nodes)
1 year (5% growth per month)  
1 year (10% growth per month)  
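
The sizing arithmetic can be sketched in a few lines of Python. This is only an illustration: the input values come from the table, but the function names are made up for this example, and the growth model is an assumption (ingest compounds once per month, each month counted as 365/12 days, node counts rounded up to whole nodes). The post does not say how the growth rows were meant to be computed.

```python
import math

# Inputs taken from the table in this post.
DAILY_INGEST_GB = 10      # average daily ingest
REPLICATION = 3           # copies of each block
NODE_RAW_GB = 600         # 2 x 300 GB disks per node
MR_RESERVE = 0.25         # share of raw disk kept for MapReduce temp data
DAYS_PER_MONTH = 365 / 12 # ASSUMPTION: equal-length months


def usable_gb_per_node():
    """Raw storage per node left for HDFS after the MapReduce reserve."""
    return NODE_RAW_GB * (1 - MR_RESERVE)


def nodes_needed(monthly_growth=0.0, months=12):
    """Nodes required to hold `months` of replicated ingest.

    ASSUMPTION: the daily ingest rate grows by `monthly_growth`
    (0.05 means 5%) at the start of each new month, compounding.
    """
    raw_gb = sum(
        DAILY_INGEST_GB * REPLICATION * DAYS_PER_MONTH * (1 + monthly_growth) ** m
        for m in range(months)
    )
    # A fraction of a node cannot be bought, so round up.
    return math.ceil(raw_gb / usable_gb_per_node())


for label, rate in [("flat", 0.0), ("5% per month", 0.05), ("10% per month", 0.10)]:
    print(f"1 year ({label}): {nodes_needed(rate)} nodes")
```

With these assumptions the script reports 25 nodes for flat growth, 33 nodes for 5% monthly growth and 44 nodes for 10% monthly growth. A different compounding model (for example, daily growth) would give somewhat different figures.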

 
