Making your own smart ‘machine learning’ thermostat using Arduino, AWS, HBase, Spark, Raspberry PI and XBee

This blog post describes making your own smart thermostat using machine learning (K-means clustering) and a bunch of hardware: an Arduino, a Raspberry PI, two XBees and an Amazon cloud server (see: idea in brief). I have to start with a disclaimer: I am not a good programmer and certainly not an electrical engineer. However, this research project yielded a working thermostat that is able to learn over time how to improve energy efficiency.

If you want to go directly to the smart ‘learning’ part, you can skip to Part 7: Learning and adapting temperature scenarios in the Amazon cloud (SPARK). Otherwise, the structure of this post, after a short introduction and overview, follows the path of data, control and necessary feedback loops as shown in figure 1. Every part contains specific code examples. The full source code can be found on GitHub.

  1. Introduction & overview
  2. Reading data from a ‘dumb’ thermostat and various temperature sensors (Arduino)
  3. Sending data, at 1,000 values per second, to a Raspberry PI (Python)
  4. Storing data in the Amazon Cloud (HBase)
  5. Turning the boiler on and off at the right time (Arduino)
  6. Using outside temperature and scenarios to control an Arduino from a Raspberry PI
  7. Learning and adapting temperature scenarios in the Amazon cloud (SPARK)
Idea in brief
This blog post describes building and programming your own smart thermostat. The smart part is based on machine learning in the form of K-means clustering to optimize when, how often and how long the boiler/furnace turns on. The thermostat is built on the concept of feedback loops (figure 1).

  1. The first feedback loop is an Arduino directly controlling the boiler (furnace).
  2. The second feedback loop is a Raspberry PI that uses XBee to wirelessly receive temperature data and boiler status information from the Arduino and send instructions back to the Arduino.
  3. The third and last feedback loop runs on a server in the Amazon cloud. This server uses the Spark Machine Learning Library (MLlib) and HBase to optimize the boiler control model that is running on the Raspberry PI.

Figure 1: Overview
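
To give a feel for what K-means clustering does before Part 7, here is a minimal sketch in plain Python. It is not the Spark MLlib code that runs on the Amazon server, and the features used here (hour of day and room temperature) and the sample readings are assumptions chosen purely for illustration; the `kmeans` function and its deterministic starting centroids are hypothetical helpers, not part of the project's source.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    # Deterministic start (an illustrative choice): take k points
    # spread evenly through the input instead of random seeds.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Hypothetical readings: (hour of day, room temperature in degrees C).
readings = [
    (6.5, 17.0), (7.0, 18.5), (7.5, 19.0),     # morning warm-up
    (18.0, 20.5), (19.0, 21.0), (20.0, 21.0),  # evening comfort
]
centroids, labels = kmeans(readings, k=2)
print(labels)     # cluster index for each reading
print(centroids)  # one "typical" (hour, temperature) pair per cluster
```

Each centroid can be read as a typical heating scenario: a time of day together with the temperature usually wanted around that time. That is the kind of summary a learning thermostat could use to decide when the boiler should turn on.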
