We want to make IoT cloud-enabled, but the question is: are we there yet? The major problem the IoT industry faces today is how to create a “secure” and “robust” ecosystem where sensor/actuator-based applications can run without interruption and data is handled and processed securely.
With the introduction of Kubernetes (K8s), we were able to solve the problem of microservice/container reliability and robustness at the cloud level. Applications no longer needed knowledge of their host node or of the communication-security strategy with peer services, as both were taken care of at the orchestrator level.
The approach suggested here tries to exploit this by bringing K8s into the fog layer. The idea is to use fog-layer components like routers and switches to run lightweight applications and the orchestration engine itself. Raspberry Pis (or any other smart ecosystem component, for that matter) can act as K8s worker nodes and run “services” which in turn talk to sensors. All such services are bound by the firewall: they cannot access the internet, nor can the internet reach them.
Since K8s needs a stable master node, a switch might serve the purpose just fine; for a multi-master setup, both routers and switches could be used. All other routers and all Raspberry Pis would act as worker nodes. When adding Pis as workers, they need to be labelled to indicate their role in the actual scenario. For instance, in a manufacturing-factory use case, Pis A and B collect lathe-machine sensor data and Pi C monitors the water tank. Then A and B get the label “Lathe” and C gets “Tank” (A and B can be distinguished by a “location” label).
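The label-based targeting described above is exactly what K8s node selectors do. A minimal sketch in plain Python, with hypothetical node names and labels for the factory scenario (K8s performs this matching internally):

```python
# Sketch of Kubernetes-style label selection.
# Node names and labels are hypothetical examples for the factory use case.

def select_nodes(nodes, selector):
    """Return names of nodes whose labels contain every key/value pair in selector."""
    return [
        name for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in selector.items())
    ]

nodes = {
    "pi-a": {"role": "Lathe", "location": "bay-1"},
    "pi-b": {"role": "Lathe", "location": "bay-2"},
    "pi-c": {"role": "Tank"},
}

print(select_nodes(nodes, {"role": "Lathe"}))                       # ['pi-a', 'pi-b']
print(select_nodes(nodes, {"role": "Lathe", "location": "bay-2"}))  # ['pi-b']
```

A pod whose spec carries the selector `{"role": "Lathe"}` would only ever be scheduled onto A or B; adding the “location” label narrows it to a single Pi.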
- “Controller Service”: This includes services like client-side dashboards, analysis of sensitive data, and other logic definitions.
- “Sensor Service”: These are the services that get data from sensors and perform logic operations, if any. They have to be configured to run on particular Raspberry Pis based on their label (“Lathe”, “Tank”…). Each Pi unit can in fact consist of two Raspberry Pis, the second acting as a backup node. Actuator-based services will be very similar, except that they trigger an action rather than pull data out of a device.
- MQTT message queue: A lightweight MQTT broker for internal message/sensor-data handling. All the sensor services can publish data directly or talk to other services through it; the same broker can be used to ask an actuator-based service to trigger an action. This is an open queue, accessible throughout the internal network.
- MQTT–Kafka Bridge: This is where data gets “ready” to cross the firewall. The bridge subscribes to the necessary “topics” on the internal queue, filters the messages if required, and publishes them to the external Kafka queue. It acts as both a producer and a consumer for Kafka, needs internet access, and must be allowed through the firewall.
- Apache Kafka: Since it is well suited to both online and batch processing tasks, it is a good, if not obvious, choice. Data leaves the firewall for the first time here. It is a single message queue for all clients and has to be highly scalable and robust. Client authentication is done using TLS certificates. The data is consumed by various cloud services as shown, and can be used to send messages back and forth between client and cloud.
- K8s API server and other infra services: The kube-apiserver runs on the master node and should be accessible from outside. Other services like logging and monitoring could use the same Apache Kafka queue to ship messages and be taken care of at the cloud-platform level (the local setup stays light this way, and the client does not need to manage these services).
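The bridge’s “take only the necessary topics” step boils down to MQTT topic-filter matching. A stdlib-only sketch of that filtering logic (the topic names below are hypothetical; a real bridge would sit on an MQTT client library and a Kafka client on top of this):

```python
# Sketch of the MQTT-Kafka bridge's topic whitelist.
# '+' matches exactly one topic level, '#' matches all remaining levels.

def topic_matches(filter_, topic):
    """Check an MQTT topic name against a subscription filter."""
    f_parts, t_parts = filter_.split("/"), topic.split("/")
    for i, f in enumerate(f_parts):
        if f == "#":              # multi-level wildcard: matches everything from here
            return True
        if i >= len(t_parts):     # topic is shorter than the filter
            return False
        if f != "+" and f != t_parts[i]:
            return False
    return len(f_parts) == len(t_parts)

# Only these topics are forwarded from the internal broker to Kafka
# (hypothetical topic layout for the factory scenario).
BRIDGED_FILTERS = ["factory/lathe/+/temperature", "factory/tank/#"]

def should_bridge(topic):
    return any(topic_matches(f, topic) for f in BRIDGED_FILTERS)

print(should_bridge("factory/lathe/a/temperature"))  # True  -> forwarded to Kafka
print(should_bridge("factory/lathe/a/debug"))        # False -> stays inside the firewall
print(should_bridge("factory/tank/level"))           # True
```

Anything not on the whitelist never leaves the internal network, which keeps the single point of exposure narrow.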
Security: We try to make the system more secure by, first, keeping services and devices constrained behind the firewall, with only one point of exposure (the MQTT–Kafka bridge), where data is filtered and transferred to Kafka over TLS, with certificates used to authenticate identity.
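The TLS client authentication at the bridge might look like the following Kafka client configuration sketch (the broker address and file paths are placeholders; the broker side would additionally set `ssl.client.auth=required` to enforce mutual TLS):

```properties
# Hypothetical Kafka client config for the MQTT-Kafka bridge (mutual TLS)
bootstrap.servers=kafka.example.com:9093
security.protocol=SSL
# Truststore holding the CA that signed the broker's certificate
ssl.truststore.location=/etc/bridge/kafka.truststore.jks
ssl.truststore.password=changeit
# Keystore holding this client's certificate, presented to the broker
ssl.keystore.location=/etc/bridge/bridge.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
```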
Robustness: We use Kubernetes, the container orchestrator, to make the services reliable in the client ecosystem. If a sensor service goes down or stops responding, K8s can automatically spin up another instance of it on a “similar” node (one with the same labels).
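This label-pinned rescheduling falls out of a standard Deployment. A minimal sketch, assuming a hypothetical image name and that the lathe Pis carry a `role=Lathe` label:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lathe-sensor-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: lathe-sensor-service
  template:
    metadata:
      labels:
        app: lathe-sensor-service
    spec:
      nodeSelector:
        role: Lathe          # only schedule on Pis labelled for the lathe
      containers:
      - name: sensor
        image: registry.local/lathe-sensor:latest   # hypothetical image
```

If the pod dies, or the node running it fails, the Deployment controller recreates the pod on another node matching the selector — in our case, the backup Pi of the same unit.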
Software updates can now happen online as well, since the kube-apiserver is exposed and managed by the cloud itself. Adding new Raspberry Pis to the system could be fairly easy: just plug in the Pi, get its IP, and use kubeadm to configure it as a new worker node with the appropriate labels. How to get true “plug and play” for Raspberry Pis, where K8s somehow discovers a new worker on its own, is still not clear.
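With kubeadm, the manual steps above might look roughly like this (the master IP, token, hash, and node name are placeholders; the join token comes from running `kubeadm token create --print-join-command` on the master):

```shell
# On the new Pi: join the cluster as a worker
sudo kubeadm join 10.0.0.1:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>

# On the master: label the new worker so services can be scheduled onto it
kubectl label node pi-d role=Tank location=roof
```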
Please share your views on how we can improve this architecture and whether it could fit your scenario.