Get Hands-On Experience with Machine Learning by Using H2O

Lauren Marck, Technical Tips


Machine learning is a method of teaching computers how to use data to make predictive models and perform important tasks accurately without manually programming them to do so. In other words, machine learning allows a system to learn from data and experiences so that it can improve its performance in future tasks.

For example, have you ever noticed the “Frequently Bought Together” section on Amazon? This is where Amazon recommends add-ons or complementary goods based on the product you are looking at. If Amazon had to manually program recommendations for each of its 200 million products, the coding would take a lifetime to finish, and the results would be impossible to optimize.

Instead, machine learning allows the system to do all the work. Amazon's systems learn from the data they collect about which products are frequently bought together, and with this, they can determine targeted, highly effective recommendations for customers without any explicit hardcoding. Even though the concept of machine learning has been around for several decades, the overwhelming amount of data now accessible to companies has made machine learning more prevalent than ever in everyday processes and products.
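To make the idea concrete, here is a minimal sketch of how "bought together" suggestions could be derived from order data by simple counting. This is not Amazon's actual method, which is not described here. The function names and the toy order list are made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_purchase_counts(orders):
    """Count how often each pair of products appears in the same order."""
    counts = defaultdict(Counter)
    for order in orders:
        # Use a set so a product repeated within one order is counted once.
        for a, b in combinations(sorted(set(order)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def frequently_bought_together(counts, product, top_n=2):
    """Return the top_n products most often bought with `product`."""
    return [item for item, _ in counts[product].most_common(top_n)]


# Toy order history (made-up data for illustration only).
orders = [
    ["camera", "memory card", "tripod"],
    ["camera", "memory card"],
    ["camera", "camera bag", "memory card"],
    ["tripod", "camera bag"],
    ["camera", "tripod"],
]

counts = build_co_purchase_counts(orders)
print(frequently_bought_together(counts, "camera"))
```

Nothing about cameras is hardcoded: the suggestions come entirely from the order history, so they change as new orders arrive.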

Using data, the system applies a learning algorithm to build a predictive model. It then uses the model to perform the task, such as making a recommendation, and uses feedback data to tune the model to be more accurate.

Machine Learning Process:

[Figure: flow diagram of the machine learning process]
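The predict, get feedback, tune loop can be sketched in a few lines of plain Python. This is only an illustrative toy, assuming a one-feature linear model updated by small gradient steps. The class name, learning rate, and data are assumptions chosen for the example, not part of any particular tool.

```python
class OnlineLinearModel:
    """One-feature linear model, y = w * x + b, tuned by gradient steps."""

    def __init__(self, learning_rate=0.1):
        self.w = 0.0
        self.b = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.w * x + self.b

    def update(self, x, y_true):
        """Feedback step: nudge the parameters to shrink the prediction error."""
        error = self.predict(x) - y_true
        self.w -= self.learning_rate * error * x
        self.b -= self.learning_rate * error


def mean_abs_error(model, data):
    """Average absolute difference between predictions and true values."""
    return sum(abs(model.predict(x) - y) for x, y in data) / len(data)


# Toy data following y = 2x + 1 (made up for illustration).
data = [(k / 10, 2 * (k / 10) + 1) for k in range(11)]

model = OnlineLinearModel()
error_before = mean_abs_error(model, data)

# Each pass: predict, compare with the real outcome, and tune the model.
for _ in range(500):
    for x, y in data:
        model.update(x, y)

error_after = mean_abs_error(model, data)
print(f"error before: {error_before:.3f}, after: {error_after:.3f}")
```

The model starts out knowing nothing, and its error shrinks only because each piece of feedback adjusts the parameters a little.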

So now that we have covered high-level machine learning concepts and workflow, how does a company get started integrating machine learning into its operations? Well, all it needs is data, an idea of what it wants to predict and model, and an open source tool such as scikit-learn, mlpack, Spark MLlib, or H2O. In this post, we are going to take a closer look at H2O from H2O.ai and show you how companies can incorporate it into their everyday operations.

Founded in 2011, H2O.ai has established itself as a data modeling and machine learning company serving customers such as Cisco, Capital One, and PayPal. It is quite easy to get started with H2O. Interfaces are available for R, Python, and Scala. Processing can run locally (for example, from an R session) or be distributed across a Hadoop cluster. Even Spark is supported via Sparkling Water (think Spark + H2O and you get Sparkling Water!).

The best way to get experience with machine learning techniques is to get started with real-world use cases. You can get started with H2O by going to this link and selecting the platform you wish to work with. Then simply follow the download and installation instructions. There are many Quick Start videos available to help you get started using H2O.
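As a rough idea of what the Python interface looks like once H2O is installed, here is a minimal sketch that loads a CSV file, trains a gradient boosting model, and scores it on held-out rows. It assumes the `h2o` Python package and a Java runtime are available, and that the target column has two classes. The function name `train_and_score`, its parameters, and the choice of gradient boosting are illustrative assumptions, not a recommended setup.

```python
def train_and_score(csv_path, target, train_ratio=0.8, seed=42):
    """Train an H2O gradient boosting classifier and return its test AUC.

    Assumes a binary target column; requires the `h2o` package and Java.
    """
    # Imported inside the function so this file loads even without H2O installed.
    import h2o
    from h2o.estimators import H2OGradientBoostingEstimator

    h2o.init()  # start (or connect to) a local H2O cluster

    frame = h2o.import_file(csv_path)
    frame[target] = frame[target].asfactor()  # treat the target as categorical

    # Hold out part of the data to check the model on rows it has not seen.
    train, test = frame.split_frame(ratios=[train_ratio], seed=seed)
    features = [col for col in frame.columns if col != target]

    model = H2OGradientBoostingEstimator(ntrees=50, seed=seed)
    model.train(x=features, y=target, training_frame=train)

    predictions = model.predict(test)  # H2OFrame of predicted labels and probabilities
    print(predictions.head())

    return model.model_performance(test).auc()
```

With a hypothetical file `customers.csv` containing a `churned` column, calling `train_and_score("customers.csv", "churned")` would start a local cluster, train the model, and return the AUC on the held-out rows.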

Once you have it downloaded and installed, you can use it through your chosen platform or, for a friendlier interface, through H2O Flow. Flow runs in your web browser and has a point-and-click interface, which makes it easy to use for novice coders.

[Figure: the H2O Flow interface]

Flow’s “Assist Me” tool makes it simple to import data files, build models and make predictions without having to write one line of code.

[Figure: Flow's “Assist Me” tool]

Some of the common uses of H2O include predictive models for advertising ROI, customer analytics, customer churn and fraud detection. In one case, PayPal claimed to have saved $1 million per month by using H2O’s predictive capabilities to detect fraud.

Just about every company that collects data can benefit from machine learning in one way or another. H2O.ai has lowered the barriers to getting started with machine learning. Give it a try and let us know about your experience.
