The rapid proliferation of Internet of Things (IoT) devices and portable technology has led to exponential growth in the amount of data generated. According to a Cisco report, around 2.32 zettabytes of data were produced daily at the network edge by 2023. This enormous volume of data has driven the adoption of edge computing, which enables efficient local data processing by bringing computation closer to the data source. Edge computing has in turn enabled the emergence of federated learning (FL), a distributed machine learning (ML) approach that trains models collaboratively across edge nodes. As the scale of IoT devices and data continues to grow, edge computing and federated learning have become pivotal in managing and extracting value from massive decentralized data sources.
1. Federated Learning and Privacy in Heterogeneous Networks
To analyze the massive amounts of edge data, machine learning approaches such as Federated Learning (FL) can enable intelligent services. Different regions in heterogeneous networks produce large amounts of sensitive data, making privacy preservation crucial. Traditionally, centralized model training collected vast amounts of raw data from devices onto a central edge server. Although centralized methods achieve high accuracy, they risk privacy violations and strain data-transfer capacity. Moreover, in heterogeneous networks devices generate imbalanced, non-independent and identically distributed (non-IID) data. Synchronous and asynchronous FL methods often lead to long training latencies and demand massive communication resources.
A semi-asynchronous FL mechanism combines the advantages of both methods by enforcing local model synchronization while allowing asynchronous aggregation. This balance enhances computing efficiency (accuracy and latency) and communication efficiency (rate and latency). However, some of the selected devices still wait for the slowest device to complete its training, leading to wasted computing resources.
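As a rough sketch of the idea (the buffer-based aggregation rule and names here are illustrative assumptions, not the exact mechanism described above), a semi-asynchronous server can aggregate as soon as a fixed number of local updates arrive, so fast devices are not held back by stragglers:

```python
def semi_async_aggregate(updates, buffer_size):
    """Aggregate once `buffer_size` local updates arrive, instead of
    waiting for every client (synchronous) or merging one at a time
    (asynchronous). `updates` is a list of (model, n_samples) pairs in
    arrival order; stragglers beyond the buffer join a later round."""
    arrived = updates[:buffer_size]
    total = sum(n for _, n in arrived)
    dim = len(arrived[0][0])
    # FedAvg-style weighted average over the buffered updates only.
    return [sum(m[i] * n for m, n in arrived) / total for i in range(dim)]
```

With a buffer of two, the third (slowest) device's update is simply deferred to the next round rather than stalling aggregation.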
2. Workload Distribution and Task Scheduling
Research has investigated the complexities of workload distribution and task scheduling in edge computing environments, focusing on dynamic resource allocation strategies. Management techniques for optimized performance and efficient resource utilization are crucial. RoofSplit is an edge computing framework that splits convolutional neural network (CNN) models for collaborative inference across heterogeneous nodes, optimizing the split layer based on node capabilities and network conditions; test results show it reduces inference latency by up to 63%. Separately, employing a heterogeneous graph neural network for mobile app recommendations at the edge enables personalized recommendations while preserving user privacy, with 11% higher accuracy and 21 times lower latency than centralized methods.
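A minimal sketch of the split-point idea (the latency model and names are assumptions for illustration, not RoofSplit's actual formulation): run the first s layers on the device, ship the intermediate activation, and run the remaining layers on the edge server, choosing s to minimize total latency:

```python
def choose_split(flops, out_bytes, device_speed, server_speed, bandwidth):
    """Pick the CNN split point minimizing end-to-end inference latency.
    flops[i]: compute cost of layer i; out_bytes[s]: bytes transferred
    when splitting after s layers (out_bytes[0] is the raw input size).
    Returns (split_index, latency)."""
    n_layers = len(flops)
    best = (0, float("inf"))
    for s in range(n_layers + 1):
        latency = (sum(flops[:s]) / device_speed      # device-side compute
                   + out_bytes[s] / bandwidth         # activation transfer
                   + sum(flops[s:]) / server_speed)   # server-side compute
        if latency < best[1]:
            best = (s, latency)
    return best
```

In this toy setting the optimum ships the small activation after layer 1 rather than the large raw input or the whole workload.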
3. Integration of Edge Computing and IoT
The integration of edge computing and IoT technologies revolutionizes data processing, analytics, and services at the network’s edge, significantly reducing latency and enhancing real-time analytics. Decentralized processing for IoT applications improves resource management, energy efficiency, and optimization strategies. The challenges of device heterogeneity and dynamic environments necessitate efficient resource allocation. Federated deep learning algorithms, equipped with gating mechanisms and optimized aggregation weights, improve accuracy and speed, enabling collaborative deep learning on diverse edge computing hardware under non-IID data distributions. Techniques like model compression, quantization, and optimization for edge-specific hardware ensure efficient model deployment and execution on edge devices.
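For instance, symmetric post-training quantization, a common compression technique for edge deployment, maps float weights to int8 using a single per-tensor scale (a generic sketch, not tied to any specific framework):

```python
def quantize_int8(weights):
    """Map float weights to int8 range [-127, 127] with one per-tensor
    scale, shrinking model size roughly 4x versus float32."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid zero scale
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights for inference."""
    return [q * scale for q in quantized]
```

Edge runtimes typically keep the int8 tensor and scale together, dequantizing (or computing directly in int8) at inference time.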
4. Multi-Edge Clustering and Edge AI Architecture
We propose a novel Multi-Edge Clustering and Edge AI architecture for Federated Learning (MEC-AI HetFL) in diverse networks. It combines asynchronous training on edge devices with synchronous edge AI and FL aggregation mechanisms, allowing collaborative learning across heterogeneous networks and IoT devices. This architecture ensures device diversity, quality-score-driven selection, and accuracy improvements. MEC-AI HetFL operates across three layers:
- Edge Clusters-Devices Layer: Each edge cluster selects multiple IoT/edge devices depending on their arrival sequence in each iteration. A node selection strategy ensures that devices with higher quality scores are chosen, handling system heterogeneity between edge clusters and end devices.
- MEC-in-Edge AI Layer: Edge AI nodes select local models from edge clusters based on arrival order for each round. An edge AI update approach synchronizes within each edge cluster, allowing iterative retraining of local models to enhance accuracy.
- MEC-AI-HetFL Layer: This layer performs synchronous heterogeneous federated learning, aggregating client models from all edge AI nodes to share information. Edge AI maintains model individuality, enabling chosen clients to retrain global models to further improve accuracy.
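The three-layer flow above can be sketched as nested weighted averaging (the function names and sample-count weighting are illustrative assumptions; the architecture's actual aggregation may differ in detail):

```python
def weighted_avg(models):
    """FedAvg-style average of (model, n_samples) pairs; returns the
    averaged model and the total sample count it represents."""
    total = sum(n for _, n in models)
    dim = len(models[0][0])
    return [sum(m[i] * n for m, n in models) / total for i in range(dim)], total

def hetfl_round(clusters):
    """Layer 1: each edge cluster aggregates its selected devices' local
    models. Layers 2-3: edge AI nodes pass cluster models up for a
    synchronous global aggregation, weighted by samples per cluster."""
    cluster_models = [weighted_avg(devices) for devices in clusters]
    global_model, _ = weighted_avg(cluster_models)
    return global_model
```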
5. Optimization Framework for Heterogeneous Federated Learning
We propose an optimization framework for heterogeneous federated learning, focusing on joint device node selection and resource allocation. The framework maximizes efficiency by combining an Edge Node Selection algorithm with a low-complexity edge AI/FL approach for collaborative learning.
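One way to sketch the joint selection step (a greedy heuristic under an assumed resource budget; the actual Edge Node Selection algorithm may differ): rank candidate nodes by quality score per unit resource cost and select until the budget is exhausted:

```python
def select_nodes(candidates, budget):
    """Greedy edge-node selection. `candidates` is a list of
    (node_id, quality_score, resource_cost) tuples; nodes with the best
    score-per-cost ratio are chosen until `budget` is spent."""
    chosen, used = [], 0
    for node, score, cost in sorted(candidates,
                                    key=lambda c: c[1] / c[2],
                                    reverse=True):
        if used + cost <= budget:
            chosen.append(node)
            used += cost
    return chosen
```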
6. Real-Time Asynchronous FL Approach
We introduce a real-time asynchronous FL approach for the edge AI-IoT device layers within the MEC-AI HetFL architecture. Edge clusters can locally select and train devices asynchronously as new or updated data arrives. Training can therefore continue even if some devices are unavailable, prioritizing devices that contribute high-quality data.
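A minimal sketch of an asynchronous update step in this spirit (the staleness discount 1/(1+τ) and mixing rate α are illustrative choices, not the proposed method's exact rule): each arriving update is applied immediately, with stale updates discounted so fresh, high-quality contributions dominate:

```python
def async_apply(global_model, update, staleness, alpha=0.5):
    """Apply one device update as soon as it arrives (no barrier).
    `staleness` counts global rounds since the device last pulled the
    model; older updates get a smaller mixing weight."""
    weight = alpha / (1 + staleness)
    return [(1 - weight) * g + weight * u
            for g, u in zip(global_model, update)]
```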
7. EDGE AI Algorithm for Heterogeneous Devices
An EDGE AI algorithm is proposed to solve the Edge cluster selection problem with heterogeneous devices. A joint allocation strategy optimizes resource utilization over the heterogeneous network, considering constraints like communication bandwidth and device computational power.
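As a hedged sketch of the joint allocation idea (proportional sharing is an assumption for illustration; the proposed strategy may be more elaborate): split the available bandwidth and compute among selected devices in proportion to their local data size, since larger local datasets mean longer training and uploads:

```python
def allocate(selected, total_bandwidth, total_cpu):
    """Proportionally split bandwidth and CPU among (node, n_samples)
    pairs so devices holding more local data receive more of each
    constrained resource."""
    total_data = sum(n for _, n in selected)
    return {
        node: (total_bandwidth * n / total_data, total_cpu * n / total_data)
        for node, n in selected
    }
```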
8. Advantages of MEC-AI HetFL Framework
The novelty of the proposed MEC-AI HetFL framework lies in efficiently addressing heterogeneous edge computing environments and non-IID data distributions. Unlike conventional frameworks such as EdgeFed, FedSA, FedMP, and H-DDPG, which face high communication costs or inefficiencies in node selection, MEC-AI HetFL introduces a multi-edge clustering mechanism combined with an asynchronous edge AI update strategy. This allows dynamic selection of edge nodes, optimizing resource allocation in real time and significantly improving training speed and model accuracy. The Edge Node Selection algorithm enhances scalability and system performance with precise and efficient node selection, reducing computational overhead.
9. Related Work
With the increasing popularity of IoT, vast amounts of data are generated from the physical world every second. Traditionally, this data is forwarded to remote clouds for processing and training, which can cause delays and privacy leaks. Edge computing shifts computation to the network edge, promoting efficient local data processing and federated learning. Researchers have presented three main FL paradigms: synchronous, asynchronous, and semi-asynchronous.
- Synchronous FL: Combines local models into a global model, enhancing performance with efficient resource allocation, but faster devices must idle until the slowest finishes each round.
- Asynchronous FL: Each device sends its model to the edge server as soon as training finishes; the server merges it and returns the latest global model, reducing round duration but incurring high communication costs.
- Semi-Asynchronous FL: Combines synchronous and asynchronous mechanisms to enhance round efficiency and convergence. It optimizes participant selection based on edge characteristics, improving accuracy and resource utilization.
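The round-time trade-off between the paradigms above can be illustrated with a toy calculation (a simplification that ignores communication and aggregation time):

```python
def sync_round_time(device_times):
    """Synchronous FL: the round ends when the slowest device finishes."""
    return max(device_times)

def semi_async_round_time(device_times, k):
    """Semi-asynchronous FL: aggregate once the fastest k devices finish,
    so stragglers no longer set the pace of every round."""
    return sorted(device_times)[k - 1]
```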
Conclusion
The surge in Internet of Things (IoT) devices and portable technology continues to drive immense growth in data generated at the network edge, making edge computing and federated learning crucial for managing and deriving insights from huge decentralized data sources. By processing data locally and training models collaboratively across many devices, these technologies help address the challenges of data privacy, latency, and bandwidth constraints, ultimately enhancing the efficiency and effectiveness of data-driven decision-making.