Federated Learning in IIoT

Gustavo Chinchayan Bernal
7 min read · Mar 9, 2023

How the emergence of this technique has changed the machine-learning process for edge devices within Industry 4.0

Emergence

Google coined the term federated learning (FL) in 2016, and the technique gained wider attention around the Cambridge Analytica scandal of 2018, when the sharing of personal information online became a global privacy concern. Without adequate privacy protections, sensitive data is highly exposed to cyber-attacks and information theft. In response, the European Union took a major step forward by enforcing the General Data Protection Regulation (GDPR). GDPR established rules and limits on the sharing and storage of data for companies that operate as data controllers and processors. This regulation has sent developers on a quest to provide privacy-protecting technologies and methodologies. FL is different from the traditional centralized machine-learning approach that is widely used across applications and industries.

Data Privacy

Centralized Model

In a traditional centralized model, all data used to train the model is collected and stored in a central location, such as a data center or a cloud provider. Model training, and the predictions or classifications that follow, happen at this level. During training, data collected from edge devices (clients) is divided into batches, and the model learns from these batches through Stochastic Gradient Descent (SGD).

SGD updates the model parameters one training example at a time, iterating through every example in the dataset (as per Analytics Vidhya)

SGD adjusts the parameters of the model in order to minimize the difference between the model's predictions and the actual outcomes on the training data. Once the model is trained, it can be deployed to make predictions or classifications on new data. The ML model is then used through an API: the user sends a request to access a specific feature. This type of communication is a REST API, where the server exposes an API that users can call to send data to the server for training or to receive predictions from the trained model. Communication in a centralized model is based on a client-server architecture (as shown below), where the server acts as the central authority for the ML model and the clients are responsible for sending data to the server or receiving predictions from it.

A centralized ML model, where edge devices send data to and receive data from a central location where the ML model is stored and trained
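To make the SGD update described above concrete, here is a minimal sketch of a single parameter update for a simple linear model with squared-error loss; the learning rate, toy dataset, and model form are illustrative assumptions, not part of any particular framework.

import numpy as np

def sgd_step(weights, x, y, learning_rate=0.01):
    # Gradient of 0.5 * (w.x - y)^2 with respect to the weights
    error = np.dot(weights, x) - y
    gradient = error * x
    return weights - learning_rate * gradient

# Toy usage: update the parameters one training example at a time
weights = np.zeros(3)
dataset = [(np.array([1.0, 2.0, 3.0]), 1.0), (np.array([0.5, 1.5, 2.5]), 0.0)]
for epoch in range(10):
    for x, y in dataset:
        weights = sgd_step(weights, x, y)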

Limitations

Industries that mainly use centralized machine learning include finance, e-commerce, and healthcare, where models are used for tasks such as fraud detection, recommendations, and medical diagnosis. A centralized architecture works well in industries where large amounts of data are collected and stored in central locations; however, it also carries significant privacy risks. Some common examples of these limitations:

  • Data Breaches: A breach of data stored in a central location can expose the personal information of hundreds, if not thousands, of individuals.
  • Data Diversity: When data is stored in one central location, the resulting model can be biased because the data may not be representative of the population it is used to make predictions on.
  • Limited Accountability and Control: Users have limited control when their data is stored in a central location, and when a data breach occurs it is difficult to hold organizations accountable.

Federated Learning

The FL architecture, on the other hand, is different because machine learning is done across multiple edge devices (clients) that collaborate in training the ML model. Raw data is not transferred to a central location: the model is trained locally on each device, and only the model updates are sent to a central server. The aggregated updates are used to create an improved central ML model.

In FL, each device trains a model and sends its parameters to the server for aggregation. Data is kept on the devices, and knowledge is shared with peers through the aggregated model.

In the FL process, each IoT device stores its raw data and trains a local model on that data, updating the local model with the SGD methodology. The IoT devices then send their model updates to the central server, where model aggregation combines the received updates to improve the central model (one common approach is the federated averaging algorithm):

new_global_model = sum(w_i * local_model_i) / sum(w_i)

where:
'new_global_model' = the updated global model
'local_model_i' = the local model trained on the i-th IoT device
'w_i' = the weight assigned to the i-th IoT device, which represents
the number of samples used to train its local model

The updated global model is sent back to the IoT devices, and the process repeats over multiple rounds to improve its accuracy.
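Here is a minimal sketch of one federated averaging round, assuming model weights are NumPy arrays and using random noise to stand in for the on-device SGD training step:

import numpy as np

def federated_average(local_models, sample_counts):
    # Weighted average of client models: sum(w_i * local_model_i) / sum(w_i)
    total = sum(sample_counts)
    return sum(w * m for w, m in zip(sample_counts, local_models)) / total

rng = np.random.default_rng(0)
global_model = np.zeros(4)
sample_counts = [120, 80, 200]          # samples held by each simulated device

local_models = []
for count in sample_counts:
    # In a real deployment this would be local SGD on the device's private data;
    # the raw data never leaves the device, only the updated weights do.
    local_update = global_model + rng.normal(0.0, 0.1, size=global_model.shape)
    local_models.append(local_update)

global_model = federated_average(local_models, sample_counts)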

Benefits and Drawbacks

The goal of federated machine learning for IoT devices is to address the privacy and scalability issues associated with centralized machine learning. It specifically addresses the concern of dealing with sensitive data, because raw data is kept only on the IoT devices. This enables an organization to train machine learning models on multiple devices without transferring that data to a central location.

Presently, FL has technical challenges such as dealing with heterogeneous data sources (each device holding data that is not independent and identically distributed, i.e. non-IID) and ensuring the privacy and security of sensitive data and model updates. IoT devices can also have limited computational resources, which can make it challenging to train a complex model. Additionally, the privacy of the data on the devices must be protected so that confidential information is not exposed. FL is still considered an emerging topic, and ongoing research is developing new techniques to make it more effective and reliable.
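To illustrate what non-IID data can look like, here is a small simulation that partitions a toy labeled dataset across devices using a Dirichlet distribution over class proportions; the number of devices, classes, and concentration parameter are arbitrary example values.

import numpy as np

rng = np.random.default_rng(0)

num_devices, num_classes = 5, 3
labels = rng.integers(0, num_classes, size=300)   # toy labels for 300 samples
device_indices = [[] for _ in range(num_devices)]

# A smaller alpha produces a more skewed (more heterogeneous) split per device
alpha = 0.5
for c in range(num_classes):
    class_idx = np.where(labels == c)[0]
    rng.shuffle(class_idx)
    proportions = rng.dirichlet(alpha * np.ones(num_devices))
    split_points = (np.cumsum(proportions)[:-1] * len(class_idx)).astype(int)
    for device_id, idx in enumerate(np.split(class_idx, split_points)):
        device_indices[device_id].extend(idx.tolist())

for device_id, idx in enumerate(device_indices):
    print(f"device {device_id}: label counts {np.bincount(labels[idx], minlength=num_classes)}")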

IIoT Applications

FL models are designed to work with data that is distributed across multiple devices or servers. The data can take any form suitable for machine learning, such as numerical, image, audio, or text data. Within the Industrial Internet of Things (IIoT), FL can be very effective in industrial automation sectors that use predictive maintenance. Relevant data includes:

  • Process control: temperature, pressure, flow rate, energy consumption
  • Quality assurance: defect rate, yield, service level, customer satisfaction
  • Equipment monitoring: downtime, temperature/vibration, mean time to repair

This data can be a privacy concern for customers using automation within their organizations. Training machine learning models with the FL approach means the models are trained locally on edge devices, so the privacy and security of the data can be preserved.

Figure: Conceptual architecture of IIoT devices within Industrial automation. Reference from Semantic Scholar: Edge Powered Industrial Control

FL can also be useful when combining data from multiple sources to train machine learning models that improve manufacturing processes, reduce downtime, and optimize equipment performance. In terms of scalability, an FL approach can train ML models on data from multiple factories to identify common patterns and trends, and the aggregated ML model can then make predictions or recommendations for optimizing operations across all of the factories.

The FL approach can be applied in other areas of IIoT, such as:

  • Energy management: To optimize energy consumption and reduce costs
  • Quality control: To identify defects and quality issues in the manufacturing process
  • Supply chain: To optimize supply chain operations, from logistics to transportation systems
  • Environmental monitoring: To monitor air and water quality with environmental sensors in order to detect pollution and predict environmental risks

Latest Advancements in FL

As mentioned previously, FL is an active area of research, and there have been several advancements in this field. Secure aggregation and differential privacy aim to provide end-to-end privacy and encryption for sensitive data. ML models are being compressed to reduce their size so they fit the computational resources available on IoT devices; this improves efficiency while maintaining model accuracy. Federated transfer learning is a newer technique that allows knowledge from ML models to be transferred between FL applications that deal with different data distributions.
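As a rough illustration of the differential-privacy idea (not a production-grade mechanism), a device can clip its model update and add Gaussian noise before sending it to the server; the clipping norm and noise scale below are arbitrary example values.

import numpy as np

def privatize_update(update, clip_norm=1.0, noise_std=0.1, rng=None):
    # Bound the update's L2 norm, then add noise to mask the individual contribution
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    return clipped + rng.normal(0.0, noise_std, size=update.shape)

# Example: a device privatizes its local update before uploading it
local_update = np.array([0.8, -1.5, 0.3, 2.0])
print(privatize_update(local_update))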

Final Thoughts

Overall, the choice between a federated learning model and a centralized model will largely depend on the nature of the data and the objectives of the ML model. The main differences between the two methodologies are how data and computation are distributed. An organization must first identify whether the data it controls or processes is subject to privacy regulations. Alternatively, an organization can adopt a hybrid approach that uses both centralized and FL models.

