Understanding the DNS of Data Science


Data Science is the competitive advantage of the future for organizations interested in turning their data into a product through analytics. Industries from health, to national security, to finance, to energy can be improved by creating better data analytics through Data Science.

The Field Guide to Data Science

 Magic Quadrant for Business Intelligence and Analytics Platforms

Aryan Nava:

Deploy MongoDB to Azure: It’s Never Been Easier

WebMatrix + MongoLab + Windows Azure

This post continues the story of my MongoDB self-learning back in January. The theme for my March self-learning is Windows Azure, so this is a good opportunity to combine the two. Let’s continue the story.

Basically, after one month of learning MongoDB in January, I had built a simple web application that lets users add pinpoints on Google Maps and stores that information in MongoDB. However, all of that runs on my local machine. So how do we deploy it to, for example, Azure, so that the public can access it?
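To make "store those pinpoints in MongoDB" a bit more concrete, here is a minimal sketch of what one pin might look like as a document. The post does not show the application's schema, so the function name and the "label" and "location" fields are hypothetical. The one firm detail is that GeoJSON points list longitude before latitude.

```python
import json


def make_pin(label, latitude, longitude):
    """Build a document for one map pin.

    The field names ("label", "location") are hypothetical; the original
    application's schema is not shown in the post.
    """
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    return {
        "label": label,
        # GeoJSON points list longitude first, then latitude.
        "location": {"type": "Point", "coordinates": [longitude, latitude]},
    }


# Made-up coordinates, for illustration only.
pin = make_pin("Sample pin", 1.3, 103.8)
print(json.dumps(pin, indent=2))
```

A document shaped like this can be handed to a MongoDB driver as-is, which is part of why a document database fits this kind of application.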

Fortunately, with the help of Microsoft WebMatrix, the whole process is rather simple and straightforward.

Deploy The Website in 3 Simple Steps

Firstly, there is a Publish feature available in WebMatrix. After adding your Windows account to WebMatrix, a simple Publish interface allows you to publish your current…

Step-by-step instructions to install MongoDB on Ubuntu, based on: http://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/

Install MongoDB

Configure Package Management System (APT)

The Ubuntu package management tools (i.e. dpkg and apt) ensure package consistency and authenticity by requiring that distributors sign packages with GPG keys. Issue the following command to import the MongoDB public GPG key:

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10

Create a /etc/apt/sources.list.d/mongodb.list file using the following command.

echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list

Now issue the following command to reload your repository:

sudo apt-get update

Install Packages

Issue the following command to install the latest stable version of MongoDB:

sudo apt-get install mongodb-10gen

When this command completes, you have successfully installed MongoDB! Continue for configuration and start-up suggestions.
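Once the server has been started, one quick way to confirm it is accepting connections is to try opening a TCP connection to it. This is only a sketch: it assumes MongoDB is listening on its default port, 27017, on the local machine, and the helper name is my own.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 27017 is MongoDB's default port; adjust if your configuration differs.
    print("MongoDB reachable:", is_port_open("127.0.0.1", 27017))
```

A successful connection only shows that something is listening on that port; it does not prove the listener is MongoDB.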




Based on the article from: http://searchdatamanagement.techtarget.com/feature/NoSQL-databases-dent-relational-softwares-data-processing-dominance

NoSQL Family

NoSQL databases are geared toward managing large sets of varied and frequently updated data, often in distributed systems or the cloud. They avoid the rigid schemas associated with relational databases. But the architectures themselves vary and are separated into four primary categories.

NoSQL Databases Family

Document Databases: Store data elements in document-like structures that encode information in formats such as JSON. Common uses include content management and monitoring web and mobile applications. – Couchbase Server, CouchDB, MarkLogic, MongoDB

Graph Databases: Emphasize connections between data elements, storing related “nodes” in graphs to accelerate querying. Common uses include recommendation engines and geospatial applications. – InfiniteGraph, Neo4j

Key-Value Databases: Use a simple data model that pairs a unique key with its associated value when storing data elements. Common uses include storing clickstream data and application logs. – Aerospike, DynamoDB, Redis, Riak

Wide Column Stores: Also called table-style databases, these store data across tables that can have very large numbers of columns. Common uses include internet search and other large-scale web applications. – Accumulo, Cassandra, HBase, Hypertable, SimpleDB
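The difference between the four families is easier to see side by side. The sketch below models a couple of made-up users with plain Python structures. It illustrates the shape of each data model only; it is not the storage format or API of any product listed above, and all names and values are invented.

```python
# A made-up user and a "follows" relationship, modeled four ways.

# Document: one nested, self-describing record per entity.
document = {
    "_id": "u1",
    "name": "Ada",
    "tags": ["admin", "editor"],
    "address": {"city": "London"},
}

# Graph: nodes plus explicit edges; queries walk the edges.
nodes = {"u1": {"name": "Ada"}, "u2": {"name": "Grace"}}
edges = [("u1", "FOLLOWS", "u2")]


def neighbors(node_id, relation):
    """Return ids reachable from node_id over edges of the given relation."""
    return [dst for src, rel, dst in edges if src == node_id and rel == relation]


# Key-value: an opaque value looked up by a unique key.
key_value = {"session:u1": b"opaque-session-bytes"}

# Wide column: each row key maps to its own, possibly sparse, set of columns.
wide_column = {
    "u1": {"profile:name": "Ada", "profile:city": "London"},
    "u2": {"profile:name": "Grace"},  # this row has no city column
}

print(document["address"]["city"])
print(neighbors("u1", "FOLLOWS"))
print(key_value["session:u1"])
print(sorted(wide_column["u1"]))
```

Note how the key-value store knows nothing about what is inside the value, while the document and wide column models expose fields that can be queried individually.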

What if:

… you could identify which employees are likely to leave in the next 6 months?

… you could understand which transformers are likely to fail during our next major storm?

… your call center agents could delight customers with the best next-step recommendations?

Big Data is the catchphrase of the day. Everyone has heard of it, most have a vague idea of what it means, some have a clear grasp of what it can really do, and few can execute it effectively. I’ve been doing a lot of reading on Big Data, and there is certainly no lack of resources on the topic. As I read through many reports, white papers, press releases, magazine articles and company presentations (all available via a quick Google search), it occurred to me that visually communicating the value of Big Data is challenging because of the need to convey different concepts simultaneously. The most popular category by far is the chart plotted on X-Y axes. These charts plot analytical complexity against some sort of business value measurement in a positive correlation that looks entertainingly similar to the human evolution charts we’ve all seen, with man becoming more upright and intelligent with time.

Less popular, but also useful, are bull’s-eyes, Venn diagrams and stacked area triangles. Regardless of graphic representation, they all follow the progression from What Happened (descriptive analytics), to Why Did It Happen (correlation analytics), to What Will Happen Next (predictive analytics), to What Should I Do About It (prescriptive analytics). Which one do you prefer?
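The four-stage progression can also be walked through on a toy data set. The sketch below uses invented numbers (weekly ad spend against units sold) and deliberately simple techniques: a mean, a Pearson correlation, a least-squares line, and a threshold rule. Real projects would use far richer models; the point is only to show how each stage builds on the previous one.

```python
from statistics import mean

# Made-up numbers for illustration: weekly ad spend and units sold.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [12.0, 14.0, 16.0, 18.0, 20.0]

# Descriptive: what happened?
average_sales = mean(sales)

# Correlation: how strongly do spend and sales move together? (Pearson r)
mx, my = mean(spend), mean(sales)
sxy = sum((x - mx) * (y - my) for x, y in zip(spend, sales))
sxx = sum((x - mx) ** 2 for x in spend)
syy = sum((y - my) ** 2 for y in sales)
r = sxy / (sxx * syy) ** 0.5

# Predictive: what will happen next? (least-squares line through the data)
slope = sxy / sxx
intercept = my - slope * mx


def forecast(x):
    """Predict sales for a given spend using the fitted line."""
    return intercept + slope * x


# Prescriptive: what should I do about it?
def recommend(target_sales, options):
    """Pick the smallest spend whose forecast meets the target, else None."""
    for x in sorted(options):
        if forecast(x) >= target_sales:
            return x
    return None


print("average sales:", average_sales)
print("correlation:", r)
print("forecast at spend 6:", forecast(6.0))
print("recommended spend for 23 units:", recommend(23.0, [5.0, 6.0, 7.0]))
```

A strong correlation on its own does not establish cause, which is one reason the later stages need more care than this sketch suggests.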


Why We Like It: This chart is unique in that it goes all the way back to the beginning, when data is first created and gathered in raw form. So much of the effort needed to develop prescriptive analytics goes into the very early stages of the process, and it’s nice that this graphic gives that a mention. The overwhelming majority of data available for analysis does not make it to the final predictive/prescriptive model. If each circle represented the amount of actual data at that stage, the raw data circle (and cleaned data circle) would dwarf all the others, so thank you SAP for giving data its due.


Data Science Venn Diagram v2.0

You can see a static image of the final viz below and check out the full story and interactive version in The Evolution of Languages on Twitter.

Languages of tweets

When installing Microsoft System Center Service Manager (SCSM) 2010, you may get “An error occurred while executing a custom action: _CreateMOMRegKey”

Microsoft System Center Service Manager (SCSM) 2010 Setup cannot complete
One or more of the following components did not install correctly. Setup has rolled back this installation. Correct any problems before running setup again.

An error occurred while executing a custom action:_CreateMOMRegKey


Resolution: do not use special characters in your password.

When installing Microsoft System Center Service Manager, you may get the error “Invalid command line argument. Consult the Windows Installer SDK for detailed command line help.”

SCSM installation error: Invalid command line argument. Consult the Windows Installer SDK for detailed command line help.

To resolve the problem and install Microsoft System Center Service Manager, perform the following tasks.


Part 1

1. Open a command prompt, type “msiexec /unregister”, and press Enter.
2. Type “msiexec /regserver” and press Enter.
3. Exit the command prompt.
4. Click Start | Administrative Tools | Services.
5. Find and right-click “Windows Installer”, then click “Properties”.
6. In “Startup type”, select “Automatic”, click “Apply” and “OK”, and restart your computer.



Part 2.

Make sure you don’t have any special characters in your password (in particular, do not use a double quote).
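If you want to check a password before running setup, a small helper like the one below can flag suspicious characters. The post does not list exactly which characters the installer rejects, so this sketch assumes that "special" means anything other than an ASCII letter or digit. The function name is my own.

```python
def find_special_characters(password):
    """Return the special characters in password, in order, without repeats.

    Assumption: "special" means anything that is not an ASCII letter or
    digit. The exact set the installer rejects is not documented here.
    """
    seen = []
    for ch in password:
        is_plain = ch.isascii() and ch.isalnum()
        if not is_plain and ch not in seen:
            seen.append(ch)
    return seen


print(find_special_characters("Passw0rd"))
print(find_special_characters('Pa"ss w0rd!'))
```

An empty list means the password contains only letters and digits under that assumption.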
The top 10 now includes Apple, Facebook, Google and Amazon.com. Fourteen years ago, Apple hadn’t yet launched the iPod, much less the iPhone or iPad, and Google wasn’t a publicly traded company. Facebook CEO Mark Zuckerberg was still in high school, and Amazon had been public for less than two years.

Who’s stayed in the top 10 after all this time? Microsoft, Cisco, Qualcomm and Intel, though their market capitalizations are all greatly diminished.

How to install Vagrant on Windows and run using VirtualBox?

Based on the instructions from http://www.seascapewebdesign.com/blog/part-1-getting-started-vagrant-windows-7-and-8 , I am in the process of installing Vagrant on Windows 7. In this tutorial, we will install Vagrant and use it to build a bare-bones server with Ubuntu installed. Vagrant is a tool that creates and manages virtual machines; here the machine runs under VirtualBox, so you will need to have VirtualBox installed. You will also need to have PuTTY installed in order to access your new Vagrant server via SSH. These instructions also apply to Windows 8.

Requirements:
A wired connection to the Internet
PuTTY needs to be installed. http://www.putty.org/
VirtualBox needs to be installed.

1. Download and install the most recent VirtualBox for Windows from https://www.virtualbox.org/wiki/Downloads

2. Set up a new virtual machine in VirtualBox

3. Download and install the latest version of Vagrant from http://downloads.vagrantup.com.
The steps are very simple, and once the installation completes this is what you will see in C:\


4. Set up Vagrant in Windows 7/8

Change directory to C:\HashiCorp\vagrant\bin

Then type the following commands:

C:\HashiCorp\vagrant\bin> vagrant box add lucid32 http://files.vagrantup.com/lucid32.box


Successfully added box 'lucid32' for 'virtualbox'

C:\HashiCorp\vagrant\bin> vagrant init lucid32

A `Vagrantfile` has been placed in this directory. You are now ready to `vagrant up` your first virtual environment! Please read
the comments in the Vagrantfile as well as documentation on `vagrantup.com` for more information on using Vagrant.


C:\HashiCorp\Vagrant\bin> vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'lucid32'...
==> default: Matching MAC address for NAT networking...
==> default: Setting the name of the VM: bin_default_1396737444370_66404
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 => 2222 (adapter 1)
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address:
    default: SSH username: vagrant
    default: SSH auth method: private key
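The line that matters for the next step is the port forwarding entry, because it tells you which port on the Windows host leads to SSH inside the virtual machine. As a rough sketch, the pairs can be pulled out of the output with a regular expression. The pattern below matches the format shown in this post's output only; other Vagrant versions may word the line differently, and the function name is my own.

```python
import re

# Matches lines such as "default: 22 => 2222 (adapter 1)".
FORWARD_RE = re.compile(r"(\d+)\s*=>\s*(\d+)\s*\(adapter\s+(\d+)\)")


def forwarded_ports(output):
    """Return (guest_port, host_port) pairs found in `vagrant up` output."""
    return [(int(guest), int(host)) for guest, host, _ in FORWARD_RE.findall(output)]


sample = """\
==> default: Forwarding ports...
    default: 22 => 2222 (adapter 1)
==> default: Booting VM...
"""
print(forwarded_ports(sample))
```

Here the guest's SSH port 22 is reachable through port 2222 on the host.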

5. Open VirtualBox. Vagrant has set up the Ubuntu virtual machine with just three commands.


Setting up a virtual machine with Vagrant is very powerful and easy.


6. Now let’s connect to the Vagrant-built virtual machine using PuTTY

Open PuTTY and enter the following information. When I installed, SSH was forwarded to port 2222 (see “22 => 2222” in the output above).




You may get a PuTTY Security Alert; click “Yes”.


The server’s host key is not cached in the registry. You have no guarantee that the server is the computer you think it is.
If you trust this host, hit Yes to add the key to PuTTY’s cache and carry on connecting.
If you want to carry on connecting just once, without adding the key to the cache, hit No. If you do not trust this host, hit Cancel to abandon the connection.

7. Enter username: vagrant and Password: vagrant

Welcome to your Vagrant-built virtual machine.
Last login: Fri Sep 14 07:26:29 2012 from





Simon Momber, VMware Solution Architect, provides a technical tour of vCloud Hybrid Service, what it is, how you can procure it, use cases, and where to get more information.

How to list or view all the TCP and UDP endpoints on a Windows system?


TCPView is a Windows program that will show you detailed listings of all TCP and UDP endpoints on your system, including the local and remote addresses and state of TCP connections. On Windows Server 2008, Vista, and XP, TCPView also reports the name of the process that owns the endpoint. TCPView provides a more informative and conveniently presented subset of the Netstat program that ships with Windows. The TCPView download includes Tcpvcon, a command-line version with the same functionality.
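Since TCPView is described as a friendlier view of what Netstat reports, it can help to see what working with raw Netstat output involves. The sketch below parses text in the style of `netstat -an` on Windows into a list of endpoints. It assumes the usual column order (protocol, local address, foreign address, and a state column for TCP only); the sample addresses are made up, and the function name is my own.

```python
def parse_netstat(text):
    """Parse `netstat -an` style lines into a list of endpoint dicts.

    Assumes the Windows column order: protocol, local address,
    foreign address, then a state column that is present for TCP only.
    """
    endpoints = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0] not in ("TCP", "UDP"):
            continue  # skip headers and blank lines
        has_state = parts[0] == "TCP" and len(parts) > 3
        endpoints.append({
            "protocol": parts[0],
            "local": parts[1],
            "remote": parts[2],
            "state": parts[3] if has_state else None,
        })
    return endpoints


# Made-up sample in the style of Windows netstat output.
sample = """
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  TCP    192.168.1.10:49712     93.184.216.34:443      ESTABLISHED
  UDP    0.0.0.0:500            *:*
"""
for endpoint in parse_netstat(sample):
    print(endpoint)
```

UDP has no connection state, which is why the last column is empty for those rows.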


Using TCPView

When you start TCPView it will enumerate all active TCP and UDP endpoints, resolving all IP addresses to their domain name versions. You can use a toolbar button or menu item to toggle the display of resolved names. On Windows XP systems, TCPView shows the name of the process that owns each endpoint.

By default, TCPView updates every second, but you can use the Options|Refresh Rate menu item to change the rate. Endpoints that change state from one update to the next are highlighted in yellow; those that are deleted are shown in red, and new endpoints are shown in green.
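The colour coding boils down to comparing two consecutive snapshots of the endpoint list. The sketch below shows that idea, not TCPView's actual implementation. It assumes each snapshot is a dictionary that maps a hypothetical (protocol, local, remote) key to a state string.

```python
def diff_snapshots(previous, current):
    """Classify endpoints between two refreshes.

    Each snapshot maps an endpoint key, here a hypothetical
    (protocol, local, remote) tuple, to its state string.
    """
    new = sorted(k for k in current if k not in previous)
    deleted = sorted(k for k in previous if k not in current)
    changed = sorted(
        k for k in current if k in previous and current[k] != previous[k]
    )
    return {"green": new, "red": deleted, "yellow": changed}


# Made-up snapshots for illustration.
before = {
    ("TCP", "10.0.0.5:50000", "10.0.0.9:443"): "SYN_SENT",
    ("TCP", "10.0.0.5:50001", "10.0.0.9:80"): "ESTABLISHED",
}
after = {
    ("TCP", "10.0.0.5:50000", "10.0.0.9:443"): "ESTABLISHED",
    ("TCP", "10.0.0.5:50002", "10.0.0.9:22"): "ESTABLISHED",
}
print(diff_snapshots(before, after))
```

The first connection changed state (yellow), the second disappeared (red), and the third is new (green).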

You can close established TCP/IP connections (those labeled with a state of ESTABLISHED) by selecting File|Close Connections, or by right-clicking on a connection and choosing Close Connections from the resulting context menu.

You can save TCPView’s output window to a file using the Save menu item.

