Making your own smart ‘machine learning’ thermostat using Arduino, AWS, HBase, Spark, Raspberry PI and XBee

This blog post describes making your own smart thermostat using machine learning (K-means clustering) and a bunch of hardware: an Arduino, a Raspberry PI, two XBees and an Amazon cloud server (see: idea in brief). I have to start with a disclaimer: I am not a good programmer and certainly not an electrical engineer. However, this research project yielded a working thermostat that is able to learn over time how to improve energy efficiency.

For those who want to go directly to the smart ‘learning’ part, you can skip to Part 7: Learning and adapting temperature scenarios in the Amazon cloud (SPARK). Otherwise, the structure of this post, after a short introduction and overview, follows the path of data, control and the necessary feedback loops as shown in figure 1. Every part contains specific code examples. The full source code can be found on GitHub.

  1. Introduction & overview
  2. Reading data from a ‘dumb’ thermostat and various temperature sensors (Arduino)
  3. Sending data, at 1,000 values per second, to a Raspberry PI (Python)
  4. Storing data in the Amazon Cloud (HBase)
  5. Turning the boiler on and off at the right time (Arduino)
  6. Using outside temperature and scenarios to control an Arduino from a Raspberry PI
  7. Learning and adapting temperature scenarios in the Amazon cloud (SPARK)

Idea in brief

This blog post describes building and programming your own smart thermostat. The smart part is based on machine learning in the form of K-means clustering to optimize when, how often and how long the boiler/furnace turns on. The thermostat is built on the concept of feedback loops (figure 1).

  1. The first feedback loop is an Arduino directly controlling the boiler (furnace).
  2. The second feedback loop is a Raspberry PI that uses XBee to wirelessly receive temperature data and boiler status information from the Arduino and send instructions back to the Arduino.
  3. The third and last feedback loop runs on a server in the Amazon cloud. This server uses the Spark Machine Learning Library (MLlib) and HBase to optimize the boiler control model that is running on the Raspberry PI.

Figure 1: Overview
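
To give a feel for the clustering idea behind the third feedback loop, here is a minimal sketch of K-means in plain Python. It is not the code that runs on the Amazon server (that uses Spark MLlib and HBase). The function name `kmeans_1d`, the sample switch-on hours and the starting centroids are all assumptions made up for illustration.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Plain K-means (Lloyd's algorithm) on 1-D data.

    Returns the final centroids and the values grouped per centroid.
    """
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids, clusters


# Hypothetical sample: hours of the day at which the boiler switched on.
switch_on_hours = [6.0, 6.5, 7.0, 17.0, 17.5, 18.0, 18.5]
centers, groups = kmeans_1d(switch_on_hours, centroids=[0.0, 12.0])
print(centers)  # [6.5, 17.75]
print(groups)   # [[6.0, 6.5, 7.0], [17.0, 17.5, 18.0, 18.5]]
```

With this made-up data the two centroids settle on a morning and an evening heating moment. That is roughly the kind of pattern a temperature scenario could be built from, although the real model works on much richer data.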



Exploring the future of information part 2

Why ancient Egyptian building concepts should not be applied to using information.

This is the second post in a series in which I explore the future of information. In these posts I combine a variety of concepts ranging from the impact of the philosophy of Aristotle on using information to the Internet of Things.

In this second post I explore my own career in the field of Business Intelligence, explain why the pyramid should not be used to represent an organization and how information can be applied in its context instead of a hierarchy. Previous post: Introduction, The rise and deception of post-modernism. Upcoming posts are:

3. Aristotle on the value of information.
4. Consumerization of Information
5. Consumer Data Governance
6. The right app-platform for the Internet of Things
7. The stars in the future field of information technology.

Five years ago I made a career change. I moved from the digital media industry to the field of Business Intelligence (BI). Part of this move was going from an organization of 8 people to a company with 80,000 employees. Starting in this new field I was struck by the abundant use of a particular geometric form: the pyramid. It was used almost as if the shape itself held the truth about using information in organizations, which oversimplified what the field of Business Intelligence is about. The shape was used for describing the hierarchy of information itself: data, information, knowledge and wisdom. The shape was used to describe the transformation from raw data to usable information dashboards. Most importantly, the shape was used to represent an organization, often divided into the three levels of operational, tactical and strategic. Concepts often used in the BI paradigm like ‘one version of the truth’, ‘the enterprise data warehouse’ (with an emphasis on enterprise), ‘management information’ and ‘operational BI’ comply with the idea of the organization as a pyramid. When I moved towards the field of Performance Management (PM), I found that here the pyramid had an even greater importance. According to the PM paradigm, information flows and narrows to the top by aggregating Key Performance Indicators, while targets and actions cascade from the top to the ‘operational’ level.

From the start I felt uneasy with using a pyramid to represent an organization, because it did not fit with the theory from my MBA education. Modern organizations do not function according to a strict hierarchy. Also, the idea of ‘centralized control’ has been replaced by empowerment and internal markets. At first I attributed this feeling to moving from a tiny company to one of the largest IT firms in the world. After five years of working as a BI Consultant, however, I have never seen an organization that could be represented by a pyramid. Even when an attempt is made to make it look like one, it always disappoints. One of the best examples comes from when I was working on an assignment for the Academy of a big government institution. This academy, seen as an education-related HR department, operated completely independently of the primary activities of the rest of the organization, which dealt mostly with handling large amounts of money. Yet the academy was managed in the same way as any other business unit. Not only did it have to use the same IT systems and other facilities that were designed for these primary activities, but it also had to use the same metrics. This led to such great inefficiencies that any business unit preferred using an external company for training or HR services if it could. This was not only because of cost but also because the external company was better equipped for the task it set out to do.

So how should we look at an organization? First a bit of theory. In 1979 the renowned professor Henry Mintzberg published his most famous book ‘The Structuring of Organizations’. In Mintzberg’s model (figure 1) the basic organization consists of five parts: operating core, middle line, strategic apex, techno structure and support staff. To be honest, even here the outline of a pyramid can still be seen. But at least two parts are added: the techno structure and the support staff. Looking at Mintzberg’s five variations of organizational structure (figure 2), the pyramid almost completely disappears. This is most apparent in the diversified, or divisionalised, organization. In the Harvard Business Review article following the book (1981) Mintzberg states that most Fortune 500 firms have adopted this divisionalised form. And according to him, “The Divisionalised Form differs from the other four structural configurations in one important respect. It is not a complete structure from the strategic apex to the operating core, but rather a structure superimposed on others”. He further explains that each division has its own structure.

In my article ‘Beyond agility, evolutionary IT-systems and business processes’ I argue that by using divisions organizations can adapt to their environment. Using this form is essential for survival.

A glance at the current organizational structures of the firms on the 2012 Fortune 500 list reveals that the divisionalised structure is still the most popular amongst them. For example, ExxonMobil has an organizational structure based on 12 separate global businesses. Philips, the Dutch electronics giant, has three business units based on market sectors and one unit called ‘Innovation & Emerging Businesses’, which contains all support staff. Philips needs a total of six separate organizational charts to explain its organization. Starbucks changed in 2011 to a structure containing five separate organizations: three based on region (China and Asia Pacific, Americas and EMEA) and two on the promising brands Seattle’s Best Coffee and Tazo tea. The divisionalised form is also well known in the nonbusiness sector. Take for example the Dutch Justice Department. Its organizational “chart” is so complex, containing at least three IT-related business units, that it only exists in words and complete sentences. Designing an actual chart would be too complex.

Now what does this mean for the future of information? First of all, when using information in organizations, the actual complexity of the organization has to be taken into account. This means that the ideal of an ‘Enterprise Data Warehouse’ is no longer attainable. The buying up and selling off of smaller companies by larger corporations makes it impossible, and completely integrating business units is not desirable either. Furthermore, ‘one version of the truth’ is a goal that should not be pursued. Every business unit is its own organization and operates in its own environment. Every part of an organization, therefore, has its own ‘version of the truth’.

From my own experience I have found it is much better to focus on the actual application of information, regardless of the structure. Information can flow from and to all parts of the organization, as displayed in figure 3. This can be up, down, lateral across business units, diagonal across business units or even from or to outside the organization. A good example of this is the use of information by a well-known large Dutch retailer. They embedded an algorithm in their supply chain to automatically send last-minute updates to the warehouses. This algorithm uses data from all corners of the business. By using this diverse information their replenishment process is one of the most efficient in the world. The application of information in its context instead of a hierarchy is something I will further explore in my next posts: ‘Aristotle on the value of information’ and ‘the consumerization of information’.

The series continues with:
Part 3 Aristotle on the value of information.

Modified on April 2, 2013: added link to part 3 in reference to future posts.