I have been involved in software development, embedded systems and cloud-connected devices for a few years now, and I was wondering today: if I had the luxury of designing an IoT pipeline from scratch, how far could I push the design to make it functional and performant, yet low cost?

And so the idea for the IoT-1M challenge was born.

OK, I owe you a bit of context here.

IoT stands for Internet of Things and describes the trend of connecting real-world products to the Internet in order to make them smarter. In more concrete terms, think about common everyday products like appliances, cars, traffic lights and airplane engines, and not-so-common ones like power plants and waste management systems. Connecting them lets us monitor their performance, detect failures early, update their software, collect valuable data to improve next-generation designs, and so on. In brief, that is IoT.

The challenge

So, back to the challenge: I want to push the envelope and see if it’s possible to develop a system that:

  • receives data samples from 1 million active devices connected to the Internet, with each device generating 10 data points every second
  • stores said data for a limited time (1 month)
  • allows querying back data with the following access patterns:
    • single stream of data points for a specific device
    • historical data points for a small group of devices
    • stream of aggregate data points over the entire fleet
  • and central to this challenge, costs less than $1000 / month to operate
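
To get a feel for the scale these requirements imply, here is a rough back-of-the-envelope sketch in Python. The device count, sample rate, retention and budget come straight from the list above. Treating "1 month" as 30 days and assuming 16 bytes per stored data point are my own guesses for illustration, not part of the challenge, and the function and field names are made up for this example.

```python
# Back-of-the-envelope sizing for the IoT-1M challenge.
DEVICES = 1_000_000            # from the challenge
POINTS_PER_DEVICE_PER_SEC = 10  # from the challenge
BUDGET_USD_PER_MONTH = 1000    # from the challenge
RETENTION_DAYS = 30            # assumption: "1 month" taken as 30 days
BYTES_PER_POINT = 16           # assumption: e.g. 8-byte timestamp + 8-byte value

SECONDS_PER_DAY = 86_400


def challenge_numbers(
    devices=DEVICES,
    rate=POINTS_PER_DEVICE_PER_SEC,
    retention_days=RETENTION_DAYS,
    bytes_per_point=BYTES_PER_POINT,
    budget_usd=BUDGET_USD_PER_MONTH,
):
    """Return the headline ingest, storage and budget figures as a dict."""
    points_per_sec = devices * rate
    points_retained = points_per_sec * SECONDS_PER_DAY * retention_days
    return {
        "points_per_sec": points_per_sec,
        "ingest_bytes_per_sec": points_per_sec * bytes_per_point,
        "points_retained": points_retained,
        "raw_bytes_retained": points_retained * bytes_per_point,
        # How much of the monthly budget each billion stored points may use.
        "usd_per_billion_points": budget_usd / (points_retained / 1e9),
    }


if __name__ == "__main__":
    for name, value in challenge_numbers().items():
        print(f"{name}: {value:,.4f}" if isinstance(value, float) else f"{name}: {value:,}")
```

Under those assumptions the fleet produces 10 million data points per second, about 160 MB/s of raw payload, and roughly 26 trillion points (around 415 TB uncompressed) over the retention window. That leaves a little under 4 cents of budget per billion data points, which is why cost is the central constraint here.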

Contributing

Now, I would love to compare my design to others developed in the open, so I’ve created an IoT-1M-Challenge organization on GitHub. If you’d like your project hosted or mirrored there, please open a ticket or send me a message at [email protected].

My next blog post, "IoT-1M, covering the bases", describes the first step in this challenge.