FFSML - Thesis Presentation
Presented by,
Lahari Voleti
Contents • Introduction
• Related Work
• Our Proposed Methodology
• Datasets
• Experimental Designs
• Empirical Results
• Conclusions & Future Work
• References
Introduction
• In recent years, mobile devices, as a convenient way of connecting users to
the internet, have become a platform for deploying ML models.
Federated Learning
Wang, H. (2021). GitHub - wangshusen/DeepLearning.
GitHub. https://github.com/wangshusen/DeepLearning
Federated Learning
Li, Q. (2019, July 23). A Survey on Federated Learning Systems: Vision, Hype and Reality. https://arxiv.org/abs/1907.09693
Few-shot classification using Meta Learning
• Few-shot learning means building predictive models that can
efficiently solve prediction tasks using only a limited amount of
data for each object class. Tasks are organized as N-way K-shot
Q-query episodes (e.g., 3-way 2-shot 1-query).
• Meta-learning is learning-to-learn: a technique that makes a
model more experienced on a distribution of tasks and uses this
experience to improve future learning performance.
• The goal is to make machine learning models more human-like,
reduce data collection, and improve predictability.
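The N-way K-shot Q-query episode construction described above can be sketched as follows. This is a minimal illustration of our own, not code from the thesis; `data_by_class` is a hypothetical dict mapping each class label to its list of examples:

```python
import random

def sample_episode(data_by_class, n_way, k_shot, q_query, seed=None):
    """Sample one N-way K-shot Q-query few-shot task.

    data_by_class: dict mapping class label -> list of examples.
    Returns (support, query) lists of (example, episode_label) pairs,
    where episode_label is the class index within this episode (0..n_way-1).
    """
    rng = random.Random(seed)
    # Pick N classes for this episode.
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Draw K + Q distinct examples per class; first K form the support set.
        examples = rng.sample(data_by_class[cls], k_shot + q_query)
        support += [(x, episode_label) for x in examples[:k_shot]]
        query += [(x, episode_label) for x in examples[k_shot:]]
    return support, query
```

For a 3-way 2-shot 1-query episode this yields 6 support examples and 3 query examples, matching the example configuration on the slide.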
Meta-Training
Yasd, J. (2018). GitHub - johnnyasd12/awesome-few-
shot-meta-learning: awesome few shot / meta learning
papers. GitHub.
Problem Statement and Solution Overview
• During the client local updates, for every round of training we minimize the loss function using a proximal
term, as in FedProx [8]:

  min_w h_k(w; w^t) = F_k(w) + (μ/2) ‖w − w^t‖²

• Here F_k(w) is the original local loss, (μ/2)‖w − w^t‖² is the proximal term (which varies according to the
data), w is the client's local model weight parameters, and w^t is the global model parameters at round t.
• Advantage: this specifically addresses the varying resource constraints of clients during federated learning
and the heterogeneity of local data at the clients.
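As a minimal sketch of the proximal objective above (our own illustration, treating a model's weights as a flat list of floats rather than framework tensors):

```python
def fedprox_loss(base_loss, w, w_global, mu):
    """FedProx local objective: h_k(w; w^t) = F_k(w) + (mu/2) * ||w - w^t||^2.

    base_loss: the client's original loss F_k(w), already computed.
    w:         client local weight parameters (flat list of floats).
    w_global:  global model parameters w^t at the current round.
    mu:        proximal coefficient controlling how far local updates may drift.
    """
    prox = 0.5 * mu * sum((wi - gi) ** 2 for wi, gi in zip(w, w_global))
    return base_loss + prox
```

Setting mu = 0 recovers the plain local loss, which is one way to see why varying mu over a modest range can leave prediction performance largely unchanged.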
Few-shot learning using meta-learning
• Relation Networks:
• They use one module, called the embedding module, for generating
feature maps, and a relation module for calculating a
relation score between support and query images.
❖ Drawbacks: the architecture is more cumbersome, and
adaptation to new few-shot tasks yields only minimal accuracy.
Tan, F. (2022). Learning to Compare: Relation Network for Few-shot Learning. https://medium.com/mlearning-
ai/learning-to-compare-relation-network-for-few-shot-learning-fa9c40c22701
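To illustrate the relation-module interface described above, here is a toy sketch of our own. The actual paper uses a small CNN as the relation module; this linear stand-in only shows how concatenated support and query features map to a score in (0, 1):

```python
import math

def relation_score(support_feat, query_feat, weights, bias):
    """Toy relation module: concatenate the support and query feature
    vectors, then apply a single linear layer + sigmoid to produce a
    relation score in (0, 1). Higher score = stronger match.

    weights must have length len(support_feat) + len(query_feat).
    """
    combined = support_feat + query_feat  # list concatenation = feature concat
    z = sum(w * x for w, x in zip(weights, combined)) + bias
    return 1.0 / (1.0 + math.exp(-z))    # sigmoid squashes to (0, 1)
```

A query image is classified by computing its relation score against each class's support features and picking the class with the highest score.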
Few-shot learning using meta-learning
Step 3: Individual clients fine-tune model M with their distinct support sets S1, S2, S3 and perform
prediction on their query sets Q1, Q2, Q3 using their respective fine-tuned models M'1, M'2, M'3.
(1) compare the performance of three federated learning approaches on few-shot tasks:
Federated Averaging (FedAvg), FedProx, and FedPer;
(2) explore the effect of data heterogeneity (using different datasets on different edge devices) on few-shot
learning performance.
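For reference, the server aggregation step of FedAvg, the first approach compared above, can be sketched as a data-size-weighted average of client parameters. This is our own simplified flat-vector view, not the thesis implementation:

```python
def fedavg(client_weights, client_sizes):
    """FedAvg server step: weighted average of client model parameters.

    client_weights: list of flat parameter lists, one per client.
    client_sizes:   number of local training examples per client,
                    used to weight each client's contribution.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n / total for w, n in zip(client_weights, client_sizes))
        for i in range(n_params)
    ]
```

With equal client data sizes this reduces to a plain mean; clients with more data pull the global model further toward their local solution.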
Datasets
Fashion-MNIST Dataset
• It is a dataset of Zalando's article images consisting of 70,000 images from 10 different classes.
The images show different types of clothing such as trousers and shirts. Every image is a 28 x 28 grayscale
image. [4]
Omniglot dataset
• It contains 1,623 handwritten characters with 20 samples each (1623 × 20 = 32,460 images). [5]
CIFAR-100 dataset
• We also limit the number of shots in the few-shot learning tasks on the clients. We consider three few-shot learning task
configurations, namely:
❑ 3-way 5-shot 10-query
❑ 5-way 5-shot 10-query
❑ 5-way 5-shot 5-query
• The feature extractor we use in our Prototypical Networks is ResNet18, and optimization of loss is done using SGD.
• Software used: Google Colaboratory Pro+ with GPU (NVIDIA P100), 52 GB RAM
• Performance measure: Accuracy
• Number of rounds: 20
• Number of clients: either 2 or 3
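As a minimal sketch of the Prototypical Networks classification rule used above (our own illustration, assuming embeddings have already been produced by a feature extractor such as ResNet18): class prototypes are the mean of the support embeddings, and each query is assigned to the nearest prototype:

```python
def prototypes(support_embeddings):
    """Compute one prototype per class as the mean support embedding.

    support_embeddings: dict mapping label -> list of embedding vectors.
    """
    protos = {}
    for label, vecs in support_embeddings.items():
        dim = len(vecs[0])
        protos[label] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return protos

def classify(query_vec, protos):
    """Predict the label whose prototype is nearest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(protos, key=lambda label: sq_dist(query_vec, protos[label]))
```

In training, the negative distances act as logits and SGD updates the feature extractor so that same-class embeddings cluster around their prototype.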
Experimental Designs
Experimental Design of FedPer Algorithm
Of these two configurations, we use the first (78 base and 42 personalized layers) for the rest of
the experiments; it places less emphasis on the importance of local data.
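A minimal sketch of the FedPer idea behind this split, with hypothetical layer names of our own: only the base layers are averaged on the server, while the personalized layers never leave the clients:

```python
def split_layers(layer_names, n_base):
    """FedPer-style split: the first n_base layers are shared (base),
    the remainder are personalized and stay local to each client."""
    return layer_names[:n_base], layer_names[n_base:]

def fedper_aggregate(client_base_weights):
    """FedPer server step: average only the shared base-layer weights
    (flat lists of floats, one per client). Personalized-layer weights
    are deliberately excluded from aggregation."""
    n = len(client_base_weights)
    dim = len(client_base_weights[0])
    return [sum(w[i] for w in client_base_weights) / n for i in range(dim)]
```

Choosing 78 base layers out of 120 means roughly two thirds of the network is shared, which is the "considerable amount of base layers" configuration carried through the remaining experiments.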
Experimental Design of FedProx Algorithm
Question: What is the effect of the proximal term in FedProx?
Effect of FedProx proximal term on Prediction Performance using Fashion-MNIST
Fashion-MNIST(2 clients)
Empirical Results
Part A: Single-Dataset Scenario (Homogeneous data)
Omniglot (only 2-client scenarios):
• In the figures below, we can see sample individual round performances of
the clients (green) and the aggregated server model (red).
[Figures: 5-5-10 and 5-5-5 configurations, 2 clients]
Fashion-MNIST (Both 2 and 3-client scenarios):
CIFAR-100 (Both 2-client and 3-client scenarios):
Empirical Results
Part B: Multiple-Dataset Scenario (Heterogeneous Data)
Important Investigations
1) How well do the edge devices and the server-aggregated model perform on a completely new task
they haven't seen before?
2) How do these edge devices and the server-aggregated model perform under heterogeneous data?
Experimental Results-Multiple Dataset Scenario
Server pre-training accuracy, client fine-tune accuracy, and server aggregated M' testing accuracy:

Clients & algo | N-way | K-shot | Q-query | Server base M train acc (CIFAR, Omniglot) | C1 acc S1-Q1 (CIFAR) | C2 acc S2-Q2 (Omniglot) | C3 acc S3-Q3 (Fashion-MNIST) | M' on S-Q (CIFAR) | M' on S-Q (Omniglot) | M' on S-Q (Fashion-MNIST)
2, FedAvg | 3 | 5 | 10 | 63.933 | 41.33 | 90.00 | - | 41.16 | 79.33 | 66.50
2, FedAvg | 5 | 5 | 5 | 53.02 | 26.285 | 85.904 | - | 33.40 | 76.20 | 60.60
[Figures: Omniglot, 5-5-5 configuration]
Conclusions
• The main observations and conclusions from the empirical results on our proposed framework are as follows:
1) Varying the ratio of base and personalization layers has shown that a considerable number of base layers is
good for the FedPer algorithm.
2) Varying the FedProx proximal term between 0.01 and 1.5 does not have a significant effect on the prediction
performance for our proposed approach using FedProx for federated learning.
3) For few-shot classification tasks with reasonable difficulty (> 50% accuracy), the proposed approach is able
to improve the edge devices’ individual prediction performance and improve significantly on the global
model (on the server) using any of the federated learning approaches when the few-shot tasks are from the
same datasets.
4) Unsurprisingly, the aggregated (global) models from FedPer perform the best most frequently, followed by
aggregated models from FedProx.
5) The data heterogeneity problem affects the prediction performance of our proposed solution no matter which
federated learning approach is used.
Future Work
❖ In our thesis, we perform experiments under the assumption that the data used by the server and clients to
generate few-shot tasks come from all classes (even though the data for the server and clients are non-
overlapping). Future work includes testing our proposed approach in an experimental setting where the server
and the clients have data from different (non-overlapping) classes.
❖ Improve our proposed framework to handle the heterogeneous data problem using transfer learning approaches.
References
[1] Wang, Y., Yao, Q., Kwok, J. T., & Ni, L. M. (2021). Generalizing from a Few Examples. ACM Computing Surveys, 53(3),
1–34. https://doi.org/10.1145/3386252
[2] Li, Q. (2019). A Survey on Federated Learning Systems: Vision, Hype and Reality. https://arxiv.org/abs/1907.09693
[3] Benchmarks CIFAR-100. (n.d.). [Database]. CIFAR-100; benchmarks.ai. https://benchmarks.ai/cifar-100
[4] Zalando Research. (2017). GitHub - zalandoresearch/fashion-mnist: A MNIST-like fashion product database [Dataset].
Retrieved from https://github.com/zalandoresearch/fashion-mnist
[5] Lake, B. M., Salakhutdinov, R., & Tenenbaum, J. B. (2015). Human-level concept learning through probabilistic program
induction (Vol. 350, Issue 6266). https://doi.org/10.1126/science.aab3050
[6] Snell, J. (2017). Prototypical Networks for Few-shot Learning. Advances in Neural Information Processing Systems 30
(NIPS 2017). Retrieved from https://papers.nips.cc/paper/2017/hash/cb8da6767461f2812ae4290eac7cbc42-Abstract.html
[7] McMahan, B. H., Moore, E., Ramage, D., Hampson, S., & Arcas, B. A. (2016). Communication-Efficient Learning of
Deep Networks from Decentralized Data. arXiv:1602.05629 [Cs.LG]. Retrieved from https://arxiv.org/abs/1602.05629
[8] Li, T., Sahu, A. K., Zaheer, M., Sanjabi, M., Talwalkar, A., & Smith, V. (2020). Federated Optimization In Heterogeneous
Networks. arXiv:1812.06127 [Cs.LG]. Retrieved from https://arxiv.org/pdf/1812.06127
[9] Arivazhagan, M. G., Agarwal, V., Singh, A. K., & Choudhary, S.(2019). Federated Learning with Personalization Layers.
arXiv:1912.00818 [Cs.LG]. Retrieved from https://arxiv.org/abs/1912.00818
Thank you
Questions?
Appendix: Additional Results
Federated learning algorithms on Few-shot data of Fashion-MNIST
• In the figures below, we can see sample individual round performances of the clients
(green) and the aggregated server model (red).
[Figures: 3-5-10 and 5-5-5 configurations; 2-client and 3-client scenarios; 5-5-10 and 5-5-5 with 3 clients]
Observations on Fashion-MNIST:
Omniglot:
• In the figures below, we can see sample individual round performances of
the clients (green) and the aggregated server model (red).
[Figures: 3-5-10 and 5-5-5 configurations; 2-client and 3-client scenarios; 5-5-10 and 5-5-5 with 3 clients]
Observations on Single-Dataset Scenario
CIFAR-100:
1. For this dataset there is only a slight improvement (29% to 39%) in aggregated server testing performance for
the 5-5-10 and 5-5-5 configurations.
2. None of the three federated learning methods outperforms the others as their aggregated global models do not
consistently help improve clients’ predictive performance.
3. No consistent improvement (or convergence) in prediction performance for the few-shot learning task as more
rounds are iterated.
4. Time taken:
- For 2-client prediction, each experimental trial takes about 275 seconds (FedPer, 3-5-10) to
452 seconds (FedPer, 5-5-10).
- For 3 clients, an experimental trial takes 380 seconds (FedPer, 3-5-10) to 609 seconds (FedProx, 5-5-5).
Experimental Results-Multiple Dataset Scenario
Clients & algo | N-way | K-shot | Q-query | Server base M train acc (CIFAR, Omniglot) | C1 acc S1-Q1 (CIFAR) | C2 acc S2-Q2 (Omniglot) | M' on S-Q (Fashion-MNIST) | M' on S-Q (CIFAR) | M' on S-Q (Omniglot)
2, FedAvg | 3 | 5 | 10 | 63.933 | 41.33 | 90.00 | 66.50 | 41.16 | 79.33
2, FedAvg | 5 | 5 | 10 | 60.62 | 32.30 | 91.400 | 59.90 | 30.30 | 87.50
2, FedAvg | 5 | 5 | 5 | 53.02 | 26.285 | 85.904 | 60.60 | 33.40 | 76.20
CIFAR-100
Comparing Federated Algorithms in the Single-Data and Multiple-Data Scenarios (5-5-5)
We investigate how the global model reacts to few-shot learning tasks on an unseen dataset when relevant
information is provided only during client model fine-tuning.
SINGLE-DATA Scenario
MULTIPLE-DATA Scenario MULTIPLE-DATA Scenario
With and without F-MNIST
Observations:
- When the client has not been trained on Fashion-MNIST, server performance does not improve.
- When the client has been trained on Fashion-MNIST, the server model trained using FedProx is the best
performer among all three aggregation algorithms.
Comparing the Single-Dataset and Multiple-Dataset Scenarios
1. Fashion-MNIST test accuracies for single-dataset scenarios are in the range of 80% to 86%, whereas for
multiple-dataset scenarios they are only 60% to 70% across all few-shot learning task configurations.
2. CIFAR-100 under multiple-dataset scenarios has accuracies in the range of 27% to 33% for the 5-5-5 and 5-5-
10 configurations and 40% to 48% for the 3-5-10 configuration, whereas in the single-dataset case it is just
30% to 35% across all three configurations.
3. Omniglot in single-dataset scenarios is more accurate, in the range of 88% to 90%, whereas in multiple-dataset
scenarios its accuracy is only between 71% and 87%.