Machine Learning and the Microsoft Data Platform

If you’re into data and databases and you have not heard the term “machine learning,” may I suggest that you’re not reading enough? This technology is hot and hyped, largely because it is the secret ingredient in many successful Big Data projects.

Machine learning (ML) seeks to harness computational power to run predictive models that learn from existing data, enabling them to forecast the future, such as the outcomes of a particular strategic decision, consumer behaviors, or emerging trends. The “computational power” referenced in the previous sentence usually means one or more commonly applied algorithms that can greatly help with predictions, such as Bayesian inference, linear regression, and so on.
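To make "learning from existing data" a little more concrete, here is a minimal sketch of one of those algorithms, linear regression, in plain Python. The sales figures and the names fit_line and predict are made up for illustration; this is a toy under those assumptions, not how any particular ML product implements it.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to a new input value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical history: units sold in months 1 through 5.
months = [1, 2, 3, 4, 5]
units_sold = [12, 14, 16, 18, 20]

model = fit_line(months, units_sold)   # "learn" from existing data
forecast = predict(model, 6)           # forecast month 6
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}, month 6 forecast={forecast:.1f}")
```

The model "learns" two numbers, a slope and an intercept, from past observations and then reuses them to estimate a value it has not seen yet. Real ML systems work with far more data and far more features, but the fit-then-predict shape is the same.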

Once you have an ML implementation up and running, you can use its output to do all sorts of things that improve and extend your IT capabilities. For example, ML is at the heart of many credit card fraud detection systems, as well as online retail websites that recommend other products based on what you’ve purchased in the past. Back in my NASA days, my team used the Markov Chain Monte Carlo method to figure out how big space stations like the ISS gradually acquire their own atmosphere in microgravity. The difference between then, in the late 1980s, and today is the vast amount of processing power and the reams of data to which we can apply these analytics.

Non-vendor Specific Machine Learning

ML is a broad technology with many different vendor implementations and products you can adopt, in the same way that relational databases are a general technology with many RDBMS platforms, both commercial and open-source, which you can use to support your data-driven applications. Thus, you can begin the process of learning ML through a non-vendor, generalized academic curriculum.

If you’re an absolute beginner, you might start with the outstanding video parts 1 and 2 by Nando de Freitas posted on YouTube. For learning material more like an undergraduate course, there are loads of great lectures on the topic available for you to review.

Other outstanding free sessions include Coursera’s course from Stanford’s Andrew Ng and Udacity’s offerings.

Microsoft Machine Learning

No matter what vendor you might currently be using for your RDBMS, I strongly encourage you to consider Microsoft’s offerings. In my opinion, no commercial vendor gets close to the innovative work that Microsoft is doing, with the only competitive technologies out there coming from well-established data science teams that have rolled their own solutions for many years.

Microsoft ML is entirely in the cloud, on Azure, and has an enormous variety of supporting feature sets, algorithms, and solutions available for purchase from the Microsoft Azure Marketplace. (An open-source alternative to the Azure Marketplace is also available.) When you’re ready to build an Azure ML model, you will use the Machine Learning Studio, a collaborative, drag-and-drop, browser-based tool you can use to create, test, and deploy solutions on your data. Machine Learning Studio is also where you can try Azure ML for free.

The best place to start is the Microsoft Azure: What is Machine Learning page. In addition, I just love the great data visualizations and learning resources in the Microsoft Azure: Algorithm Cheat Sheet. The cheat sheet is a handy poster PDF that any aspiring data scientist would find useful, whether they use Azure ML or not.

Microsoft is “All-In” on Cloud

That headline was made famous by Steve Ballmer several years ago, and it has definitely come to fruition today, from the executive leadership of Joseph Sirosh all the way down to the frequently updated blog of the software engineers. It’s time to get started!