I have been involved in software development, embedded systems and cloud-connected devices for a few years now, and I was wondering today: if I had the luxury of designing an IoT pipeline from scratch, how far could I push the design to make it functional and performant, yet low cost?
And so the idea for the IoT-1M challenge was born.
Ok, I owe you a bit of context here.
IoT stands for Internet of Things and describes the trend of connecting real-world products to the Internet in order to make them smarter. In more concrete terms, think about common everyday products like appliances, cars, traffic lights and airplane engines, and not-so-common ones like power plants and waste management systems. By connecting them, we can monitor their performance, detect failures early, update their software, collect valuable data to improve next-generation designs and so on. That, in brief, is IoT.
The challenge
So, back to the challenge: I want to push the envelope and see if it’s possible to develop a system that:
- receives data samples from 1 million active devices connected to the Internet, with each device generating 10 data points every second
- stores said data for a limited time (1 month)
- allows querying back data with the following access patterns:
  - single stream of data points for a specific device
  - historical data points for a small group of devices
  - stream of aggregate data points over the entire fleet
- and central to this challenge, costs less than $1000 / month to operate
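To get a feel for the scale these requirements imply, here is a quick back-of-envelope sketch in Python. The device count, sample rate and budget come straight from the list above. Treating "1 month" as 30 days and assuming 16 bytes per data point (an 8-byte timestamp plus an 8-byte value, uncompressed) are my own assumptions for illustration, not part of the challenge.

```python
# Figures taken from the challenge statement.
DEVICES = 1_000_000
POINTS_PER_DEVICE_PER_SECOND = 10
MONTHLY_BUDGET_USD = 1000

# Assumptions for illustration only (not part of the challenge statement).
RETENTION_DAYS = 30    # "1 month" taken as 30 days
BYTES_PER_POINT = 16   # hypothetical: 8-byte timestamp + 8-byte value, uncompressed

SECONDS_PER_DAY = 24 * 60 * 60

# Ingest rate across the whole fleet.
points_per_second = DEVICES * POINTS_PER_DEVICE_PER_SECOND

# Number of data points held at any time under the retention window.
points_per_month = points_per_second * SECONDS_PER_DAY * RETENTION_DAYS

# How much of the budget each billion stored points may consume.
budget_per_billion_points = MONTHLY_BUDGET_USD / (points_per_month / 1e9)

# Raw (uncompressed) volume under the assumed point size.
raw_bytes_per_second = points_per_second * BYTES_PER_POINT
raw_bytes_retained = points_per_month * BYTES_PER_POINT

print(f"Ingest rate:        {points_per_second:,} points/s")
print(f"Points retained:    {points_per_month:,}")
print(f"Budget:             ${budget_per_billion_points:.4f} per billion points")
print(f"Raw ingest volume:  {raw_bytes_per_second / 1e6:,.0f} MB/s")
print(f"Raw retained data:  {raw_bytes_retained / 1e12:,.2f} TB")
```

Under these assumptions the fleet produces 10 million points per second and roughly 26 trillion points over the retention window, which leaves a budget of about 4 cents per billion points. The byte figures depend entirely on the assumed point size, so treat them as an order-of-magnitude guide rather than a target.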
Contributing
Now, I would love to compare my design to others developed in the open, so I’ve created an IoT-1M-Challenge organization on GitHub. If you’d like your project hosted or mirrored there, please open a ticket or send me a message at [email protected].
My next blog post describes the first step in this challenge, IoT-1M, covering the bases.