What is Federated Learning?

Last Updated : 01 May, 2025

Traditional machine learning relies on large datasets stored in centralized locations such as data centers, with the goal of producing accurate predictions and useful insights. This approach, however, comes with challenges such as data storage costs, privacy concerns, and heavy processing requirements. Recently, federated learning has emerged as a key development that provides some groundbreaking solutions to these problems.

Federated Learning is a technique for training machine learning models on decentralized data, where the data is distributed across multiple devices or nodes such as smartphones, IoT devices, and edge devices. Instead of centralizing the data and training the model in a single location, the model is trained locally on each device, and only the resulting updates are sent to a central server, where they are aggregated into an improved global model.
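To make the aggregation step concrete, here is a minimal sketch in NumPy (the function name and the numbers are our own, purely illustrative) of how a server might combine locally trained weights, weighting each client by the amount of data it trained on, as in the widely used FedAvg scheme.

```python
import numpy as np

def aggregate_weights(client_weights, client_sizes):
    """Weighted average of client model weights (FedAvg-style aggregation).

    client_weights: list of 1-D NumPy arrays, one per client.
    client_sizes:   number of local training samples on each client.
    """
    total = sum(client_sizes)
    stacked = np.stack(client_weights)                    # shape: (clients, params)
    coeffs = np.array(client_sizes, dtype=float) / total  # per-client weighting
    return coeffs @ stacked                               # weighted sum of weight vectors

# Example: three clients with different amounts of local data
updates = [np.array([0.9, 1.1]), np.array([1.0, 1.0]), np.array([1.2, 0.8])]
sizes = [100, 300, 50]
print(aggregate_weights(updates, sizes))
```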

Types of Federated Learning

There are various strategies that are used for Federated Learning. Let's take a brief look at them.

1. Centralized Federated Learning

Here, a central server coordinates the different steps of the algorithm. It selects the participating nodes at the beginning of the training process and is responsible for aggregating the model updates received from those nodes/devices. Because every selected node sends its update to this single server, the server can become a bottleneck of the system.
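The sketch below (a simplified simulation with hypothetical names; on-device training is faked by a small random perturbation) illustrates this coordination pattern: the server selects a subset of clients each round, gathers their updates, and performs the aggregation by itself, which is exactly where the bottleneck arises.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_train(global_weights, client_id):
    # Stand-in for on-device training: perturb the global weights slightly.
    return global_weights + 0.01 * rng.standard_normal(global_weights.shape)

def centralized_round(global_weights, all_clients, fraction=0.3):
    # The central server selects a fraction of the available nodes...
    k = max(1, int(fraction * len(all_clients)))
    selected = rng.choice(all_clients, size=k, replace=False)
    # ...every selected node sends its update back to the same server...
    updates = [local_train(global_weights, c) for c in selected]
    # ...and only the server aggregates -- the single point of congestion.
    return np.mean(updates, axis=0)

weights = np.zeros(4)
for _ in range(5):
    weights = centralized_round(weights, all_clients=list(range(20)))
print(weights)
```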

2. Decentralized Federated Learning

In Decentralized Federated Learning, the nodes coordinate among themselves to produce the updated model. This avoids the single-server problems of centralized federated learning, since model updates are exchanged directly between the interconnected nodes without the need for a central system. Here, the model's performance depends heavily on the network topology that is chosen.
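One simple way nodes could coordinate without a server is gossip-style averaging, roughly sketched below (a toy example with hypothetical names; real systems use more sophisticated protocols): each node repeatedly averages its weights with its neighbours in a fixed ring topology.

```python
import numpy as np

def gossip_round(node_weights, topology):
    """One decentralized round: each node averages with its neighbours."""
    new_weights = []
    for i, w in enumerate(node_weights):
        neighbours = [node_weights[j] for j in topology[i]]
        new_weights.append(np.mean([w] + neighbours, axis=0))
    return new_weights

# Four nodes connected in a ring; the topology determines how quickly models agree.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
weights = [np.array([float(i), float(i)]) for i in range(4)]
for _ in range(10):
    weights = gossip_round(weights, ring)
print(weights)  # all nodes drift toward the same averaged model
```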

3. Heterogeneous Federated Learning

This setting involves a large number of heterogeneous clients, e.g., mobile devices and IoT devices, which can differ in software or hardware configurations. Recently, a federated learning framework called HeteroFL has emerged, specifically designed to tackle the challenges posed by heterogeneous clients with varying computation and communication capabilities.
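The rough idea, sketched below in a heavily simplified form (the helper names are ours, and this is not HeteroFL's exact algorithm), is to let weaker clients train only a width-scaled slice of the global weight matrix and to average each parameter over the clients that actually updated it.

```python
import numpy as np

def extract_submodel(global_W, capacity):
    """Give a client a width-scaled slice of the global weight matrix."""
    rows = int(global_W.shape[0] * capacity)
    cols = int(global_W.shape[1] * capacity)
    return global_W[:rows, :cols].copy()

def aggregate_hetero(global_W, client_subs):
    """Average each entry over the clients whose submodel covered it."""
    acc = np.zeros_like(global_W)
    count = np.zeros_like(global_W)
    for sub in client_subs:
        r, c = sub.shape
        acc[:r, :c] += sub
        count[:r, :c] += 1
    # Entries that no client touched keep their previous global value.
    return np.where(count > 0, acc / np.maximum(count, 1), global_W)

W = np.ones((4, 4))
# Simulated local updates: one full-capacity client and two half-capacity clients
subs = [extract_submodel(W, cap) + 0.1 for cap in (1.0, 0.5, 0.5)]
print(aggregate_hetero(W, subs))
```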

How Does Federated Learning Work?

Let us understand federated learning in more detail by walking through its steps. The base model is stored at the central server, and a copy of this model is kept on every participating device. Whenever the user generates new data, the following steps take place:

  • Step 1: The device downloads the current global model.
  • Step 2: The model is improved locally using the new data generated on the device.
  • Step 3: The model changes are summarized as an update and communicated to the cloud. This communication is encrypted.
  • Step 4: On the cloud, updates arrive from many users. All these updates are aggregated, and the final model is built (a simplified end-to-end sketch follows this list).
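
Putting the four steps together, here is a rough end-to-end simulation (all names are hypothetical, a tiny linear model stands in for the real one, and encryption is only indicated by a comment) of repeated federated rounds across several simulated devices.

```python
import numpy as np

rng = np.random.default_rng(42)

def local_update(global_w, X, y, lr=0.1, epochs=5):
    """Steps 1-3: the device downloads global_w, improves it on local data,
    and returns only the summarized change (delta), never the raw data."""
    w = global_w.copy()                      # Step 1: download the current model
    for _ in range(epochs):                  # Step 2: improve it with local data
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w - global_w                      # Step 3: the update (sent encrypted in practice)

# Simulated devices, each holding its own private data for y = 2 * x
devices = []
for _ in range(10):
    X = rng.standard_normal((20, 1))
    devices.append((X, 2.0 * X[:, 0]))

global_w = np.zeros(1)
for _ in range(15):                          # Step 4: the cloud aggregates all updates
    deltas = [local_update(global_w, X, y) for X, y in devices]
    global_w += np.mean(deltas, axis=0)
print(global_w)                              # approaches the true weight [2.0]
```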

So, no huge amount of raw data is uploaded to the cloud, and yet the model is still trained on diverse data. Throughout this process, the training data stays within your own smartphone or mobile device.

Real-Life Application of Federated Learning: Google Keyboard

  • Data Collection: Google Keyboard (Gboard) collects data, like the names of restaurants you search for, but keeps it on your phone. Your personal data never leaves your device.
  • Federated Learning in Action: The model is trained directly on your phone using the data (like your search history). No need to send your data to a central server, keeping it private.
  • Model Updates: After training locally, the updates to the model are sent to a central server in an encrypted form. The server combines updates from different devices to improve the model.
  • Better Suggestions: As more updates are gathered, the model gets better at suggesting things, like more accurate restaurant names or typing predictions based on your habits.
  • Privacy and Efficiency: Your data stays private, and only model updates are shared. This helps save bandwidth and makes everything run more efficiently.

Advantages of Federated Learning

  • Reduced Power Usage: Reduced data size means reduced computation time and hence less power usage.
  • Guarantees Privacy: Data stays on the device, maintaining privacy without sacrificing training.
  • No Device Performance Impact: Training runs only when the device is idle or charging, so there is no impact on everyday performance.
  • Scalability: Scales with big, distributed datasets across multiple devices.
  • Better Model Performance: Uses heterogeneous data from various devices to improve model accuracy.
  • Real-time Updates: Allows real-time model updates on each device.

Disadvantages of Federated Learning

  • Network Latency: The communication between the devices and the central server can be a bottleneck and may add latency to the training process.
  • Heterogeneous devices: The devices can be heterogeneous in terms of hardware and software, which can make it difficult to ensure the compatibility and consistency of the models.
  • Data Quality: The quality of data can vary across the devices, which can lead to poor model performance.
