Big Data Ecosystem and Snapshots


When you're selecting a Big Data ecosystem for your mission-critical application, make sure you choose a product that lets you take real-time snapshots. If your Big Data ecosystem does not allow real-time snapshots, you need to budget for a higher backup cost.

The following video and link may help you understand how snapshots work in different Big Data ecosystems.

 

Big Data Ecosystems Snapshot and how it works: MapR vs HDFS Snapshots vs HBase Snapshots
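
For orientation, here is a minimal sketch of how a snapshot is typically requested in each of the three systems. The helper function names are hypothetical and exist only for this illustration; the commands they build (hdfs dfsadmin -allowSnapshot, hdfs dfs -createSnapshot, maprcli volume snapshot create, and the HBase shell snapshot statement) are the standard ones as far as I know, but check the flags against the documentation for your version before relying on them. The code only builds the commands; it does not run them.

```python
def hdfs_snapshot_commands(path, name):
    """Build the two HDFS commands: allow snapshots on a directory, then take one."""
    return [
        ["hdfs", "dfsadmin", "-allowSnapshot", path],
        ["hdfs", "dfs", "-createSnapshot", path, name],
    ]


def mapr_snapshot_command(volume, name):
    """Build the maprcli command that snapshots a MapR volume."""
    return ["maprcli", "volume", "snapshot", "create",
            "-volume", volume, "-snapshotname", name]


def hbase_snapshot_statement(table, name):
    """Build the statement to type into the HBase shell to snapshot a table."""
    return f"snapshot '{table}', '{name}'"


if __name__ == "__main__":
    # Example names below are made up for illustration.
    for cmd in hdfs_snapshot_commands("/data/events", "before-upgrade"):
        print(" ".join(cmd))
    print(" ".join(mapr_snapshot_command("events_vol", "before-upgrade")))
    print(hbase_snapshot_statement("events", "before_upgrade"))
```

Note the difference in granularity: HDFS snapshots a directory, MapR snapshots a volume, and HBase snapshots a table.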


 

What makes MapR superior to other Hadoop distributions


Why I prefer MapR

  • File system metadata is distributed (think of it as many mini NameNodes). No central NameNode is needed, which eliminates NameNode bottlenecks.
  • MapR-FS is written in C, so there are no JVM garbage collection pauses choking the file system.
  • NFS mount. You can mount the MapR-FS locally and read directly from it or write directly to it.
  • MapR-FS implements POSIX. There is no need to learn any new commands. Your Linux administrator can apply existing knowledge to navigate the file system. You can view the content on MapR-FS using standard Unix commands, e.g. to view the contents of a file on MapR-FS you can just use tail <file_name>.
  • While MapR-FS is proprietary, it is compatible with the Hadoop API. You don’t have to rewrite your applications if you want to migrate to MapR. hadoop fs -ls /user works against MapR-FS, and over an NFS mount a plain ls on the mounted path shows the same directory.
  • You can load data directly into the file system. There is no need to land the data on the local file system first. With an NFS mount, applications see MapR-FS as just another local path, so in a way MapR-FS is the local filesystem. No additional tools such as Flume are needed to ingest data.
  • True and consistent snapshots. Run point-in-time queries against your snapshots.
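
To make the NFS and POSIX points concrete, here is a small sketch that reads the last lines of a file with nothing but ordinary file I/O, much like tail does. It assumes the cluster is already NFS-mounted on the machine; the function name tail_lines and the path in the usage line are made up for this example.

```python
from collections import deque


def tail_lines(path, count=10):
    """Return the last `count` lines of a text file, similar to `tail -n count`."""
    # A bounded deque keeps only the most recent `count` lines while streaming the file.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

On a node with the mount in place, a call such as tail_lines("/mapr/my.cluster.com/user/alice/events.log", 5) would read from MapR-FS exactly as it would from a local disk. The path is hypothetical; substitute your own mount point and file.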

MapR Installation



What happens if the MapR administrator account is not present on the OS when installing MapR?

The MapR administrator account is created during the installation process if it does not already exist. As with all cluster users, its UID and GID must be the same on all nodes in the cluster.

What information do you need to supply to the mapr-setup.sh script?

You need to provide the credentials for the MapR administrator account.
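
The UID and GID requirement is easy to check before you start the installer. The sketch below assumes you have already collected the administrator's numeric IDs from each node (for example by running id -u and id -g for that user on every machine) into a dictionary. The function name, the node names, and the ID values are all hypothetical.

```python
from collections import Counter


def find_id_mismatches(ids_by_node):
    """Return the nodes whose (uid, gid) pair differs from the most common pair.

    ids_by_node maps a node name to a (uid, gid) tuple. An empty result
    means every node agrees.
    """
    if not ids_by_node:
        return {}
    # Treat the pair seen on the most nodes as the expected one.
    expected, _ = Counter(ids_by_node.values()).most_common(1)[0]
    return {node: ids for node, ids in ids_by_node.items() if ids != expected}


if __name__ == "__main__":
    collected = {
        "node1": (5000, 5000),
        "node2": (5000, 5000),
        "node3": (5001, 5000),  # UID differs on this node
    }
    print(find_id_mismatches(collected))
```

Any node reported here should have the account fixed before it joins the cluster.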

When do you install the MapR license?

You can install a trial license through the web installer, or you can add a license after the installation completes.

[Figure: MapR license option]

During node verification, what does a white node icon signify?

[Figure: MapR configure nodes]

The node icons are white while they are being verified, and change color once verification is complete. Green means the node is ready for installation. Yellow means installation can proceed, but there are warnings on that node. Red means that there is a problem with the node that prevents it from being part of the cluster.

What is the most likely cause of failed services immediately after installation?

It is not unusual to see a failed service the first time you log into the MapR Control System (MCS). This is because MapR attempts to start the services before the license has been applied. Once the license is applied, restart any failed services.

Service Layout for a Large Hadoop Cluster using MapR



[Figure: service layout for a large Hadoop cluster using MapR]

Service Layout for a Small Hadoop Cluster using MapR


[Figure: service layout for a small Hadoop cluster using MapR]