The Internet of Things (IoT) space is a growing area for utilising Machine Learning (ML) solutions. When it comes to ML, working in the IoT space presents some unique challenges. In this two-part series we will first define these challenges, then in part two we will explore the data and ML engineering solutions that can be implemented to overcome them.
The Challenges
Typically in machine learning:
- Models are trained using data that is created on an online platform and consequently stored in the cloud, where it is readily accessible for pre-processing and model training.
- Data labelling is done by humans, taken from existing labelled datasets or a combination of both.
- The trained models are then deployed to an online endpoint, which is accessed through an API for predictions.
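To make the last point concrete, a prediction request to a typical online endpoint is just a small JSON payload sent over HTTP. This is a minimal sketch; the payload schema, endpoint URL, and field names shown are hypothetical, as the exact format varies by provider.

```python
import json

def build_prediction_request(features):
    """Package raw feature values into the JSON body a typical
    managed-ML endpoint expects (exact schema varies by provider)."""
    return json.dumps({"instances": [features]})

# In production this body would be POSTed to the endpoint, e.g.:
#   requests.post("https://api.example.com/v1/models/demo:predict",
#                 data=body, headers={"Authorization": "Bearer <token>"})
body = build_prediction_request({"temp_c": 21.5, "humidity": 0.43})
print(body)
```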
However, in the IoT space:
- The data is created locally by sensors and, if it is stored at all, it is stored locally, often only for a short period of time.
- Data is often unique to the device, either because of the physical environment specific to the use case in which it is collected, or because of the particular combination of sensors used to collect it. This unique, unlabelled data is collected in massive quantities.
- Models frequently have to be deployed on the IoT device itself, whether for real-time, low-latency prediction or simply because the device is not guaranteed a connection to the internet or to any other device.
The challenges this environment presents are:
1. Getting data to the cloud — for effective model training to take place the data must be accessible, and it is best accessed in the cloud, where training processes can take advantage of the effectively unlimited amount and variety of resources available. Therefore, any relevant data captured by the IoT devices must be communicated to the cloud. This means:
- The IoT device must have access to an internet connection in some way. This could be directly, or through a connection to another device that has one.
- The IoT device must be able to store the relevant data for long enough to cope with periods when it has no internet connection.
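The second point is essentially a store-and-forward pattern: buffer readings on-device and flush them to the cloud whenever a connection is available. A minimal sketch, with a hypothetical `StoreAndForwardBuffer` class and a bounded buffer that drops the oldest readings once full:

```python
import collections

class StoreAndForwardBuffer:
    """Sketch of on-device buffering: hold sensor readings locally and
    forward them to the cloud when a connection is available. With a
    bounded buffer, the oldest readings are dropped first once full."""

    def __init__(self, capacity=1000):
        self.buffer = collections.deque(maxlen=capacity)

    def record(self, reading):
        # Called for every sensor sample, connected or not.
        self.buffer.append(reading)

    def flush(self, upload):
        # `upload` sends one reading to the cloud; it should raise on
        # failure so unsent readings stay in the buffer for next time.
        while self.buffer:
            upload(self.buffer[0])
            self.buffer.popleft()

sent = []
buf = StoreAndForwardBuffer(capacity=3)
for i in range(5):                    # capacity 3, so readings 0 and 1 drop
    buf.record({"t": i, "temp_c": 20 + i})
buf.flush(sent.append)                # connection restored: 3 readings sent
print(len(sent))
```

The capacity trade-off here mirrors the real design decision: local storage is finite, so the device must choose what to keep when offline periods outlast the buffer.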
2. Data labelling — the IoT space is characterised in part by the sheer amount of data that needs to be captured and processed. In itself this is challenging, but in the context of ML this presents an extra challenge:
- The data is likely unique to the sensors with which it was collected and the environment in which it was collected. Therefore it is unlikely that similar enough ready-labelled datasets exist to train a model.
- This vast quantity of data must be labelled to provide a source of ground truth for training ML models. The sensor signals may be interpretable only to a highly trained domain expert, so human labelling at this scale is not feasible.
3. Model training & deployment to embedded-tech — in embedded-tech, many of the hardware requirements are driven by the need for the device to be small and light. This in turn drives the software requirements, and hence the ML model requirements. The model will likely have to be lightweight, run on a specific OS, and perhaps operate within a limit on the bit size of the numbers it uses. In addition, each device must have its own copy of the model. Therefore:
- Deployed models must run on the device exactly as they did on the machine that trained them. To achieve this, the training environment should be as similar as possible to the device environment. For example, the IoT device may be limited to an 8-bit OS, so training should mimic this: 32-bit or 64-bit numbers in the model parameters could change the model's behaviour entirely when deployed.
- Every time a model is updated it must be deployed to multiple devices. Some sort of automated version control and automated deployment process must be implemented for both deploying to new devices and deploying updates to existing devices.
- Multiple device versions (hardware or software) may be in use at the same time to perform the same task. This builds upon the two points above: deployed models should run the same on all devices performing the same task, and if this requires different model files, the automated deployment process should account for that.
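The bit-size concern above can be illustrated with symmetric 8-bit quantisation, one common way of squeezing 32-bit training-time weights into the range an 8-bit target can hold. This is a hedged sketch in plain Python (the weight values are made up for illustration); it shows that every weight picks up a rounding error bounded by half the quantisation step, which is exactly the kind of numerical drift that can change model behaviour on-device:

```python
# Hypothetical float weights as they might exist after training.
weights = [0.021, -0.987, 0.503, 0.0049]

# Symmetric quantisation: map the largest magnitude to the int8 limit 127.
scale = max(abs(w) for w in weights) / 127
quantised = [round(w / scale) for w in weights]      # integers in [-127, 127]
dequantised = [q * scale for q in quantised]         # what the device computes with

# Each weight's rounding error is at most scale / 2 — small per weight,
# but across millions of parameters it can shift a model's predictions.
errors = [abs(w - d) for w, d in zip(weights, dequantised)]
print(max(errors))
```

In practice this conversion is handled by an embedded-ML toolchain rather than written by hand, but training (or at least evaluating) against the quantised weights is what "mimicking the device environment" means here.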