Train 400x faster Static Embedding Models with Sentence Transformers
Tom Aarsen
TL;DR
This blog post introduces a method to train static embedding models that run
100x to 400x faster on CPU than state-of-the-art embedding models, while
retaining most of the quality. This unlocks a lot of exciting use cases, including on-
device and in-browser execution, edge computing, low power and embedded
applications.
We apply this recipe to train two extremely efficient embedding models: sentence-
transformers/static-retrieval-mrl-en-v1 for English Retrieval, and sentence-
transformers/static-similarity-mrl-multilingual-v1 for Multilingual Similarity
tasks. These models are 100x to 400x faster on CPU than common counterparts
like all-mpnet-base-v2 and multilingual-e5-small, while reaching at least 85% of
their performance on various benchmarks.
Alongside this blog post, we also release:
The two models (for English retrieval and for multilingual similarity)
mentioned above.
Two Weights and Biases reports with training and evaluation metrics
collected during training.
The detailed list of datasets we used: 30 for training and 13 for evaluation.
Table of Contents
TL;DR
Modern Embeddings
Static Embeddings
Our Method
    Training Details
    Training Requirements
    Model Inspiration
        English Retrieval
        Multilingual Similarity
    Training Dataset Selection
        English Retrieval
        Multilingual Similarity
    Loss Function Selection
    Training Arguments Selection
    Evaluator Selection
    Hardware Details
    Overall Training Scripts
        English Retrieval
        Multilingual Similarity
Usage
    English Retrieval
    Multilingual Similarity
    LangChain
    LlamaIndex
    Haystack
    txtai
Performance
    English Retrieval
        NanoBEIR
        Matryoshka Evaluation
    Multilingual Similarity
        Matryoshka Evaluation
Conclusion
Next Steps
Acknowledgements
Embeddings are one of the most versatile tools in natural language processing,
enabling practitioners to solve a large variety of tasks. In essence, an embedding is
a numerical representation of a more complex object, like text, images, audio, etc.
The embedding model will always produce embeddings of the same fixed size.
You can then compute the similarity of complex objects by computing the
similarity of the respective embeddings.
This has a large number of use cases, and serves as the backbone for
recommendation systems, retrieval, outlier detection, one-shot or few-shot
learning, similarity search, clustering, paraphrase detection, classification, and
much more.
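For example, here is a small sketch using the Sentence Transformers API with an off-the-shelf model (the texts and model choice are illustrative):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

# Texts are mapped to fixed-size vectors...
embeddings = model.encode([
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
])
print(embeddings.shape)  # (3, 768)

# ...and the similarity of the vectors approximates the similarity of the texts
similarities = model.similarity(embeddings, embeddings)
print(similarities)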
Modern Embeddings
Many of today's embedding models consist of a handful of conversion steps.
Following these steps is called "inference".
The Tokenizer and Pooler are responsible for pre- and post-processing for the
Encoder , respectively. The former chops texts up into tokens (a.k.a. words or
subwords) which can be understood by the Encoder , whereas the latter combines
the embeddings for all tokens into one embedding for the entire text.
Within this pipeline, the Encoder is often a language model with attention layers,
which allows the embedding of each token to be computed in the context of the other tokens.
For example, bank might be a token, but the token embedding for that token will
likely be different if the text refers to a "river bank" or the financial institution.
Large encoder models with a lot of attention layers will be effective at using the
context to produce useful embeddings, but they do so at a high price of slow
inference. Notably, in the pipeline, the Encoder step is generally responsible for
almost all of the computational time.
Static Embeddings
Static Embeddings refers to a group of Encoder models that don't use large and
slow attention-based models, but instead rely on pre-computed token
embeddings. Static embeddings were used years before the transformer
architecture was developed. Common examples include GloVe and word2vec.
Recently, Model2Vec has been used to convert pre-trained embedding models into
Static Embedding models.
For Static Embeddings, the Encoder step is as simple as a dictionary lookup: given
the token, return the pre-computed token embedding. Consequently, inference is
suddenly no longer bottlenecked by the Encoder phase, resulting in speedups of
several orders of magnitude. This blogpost shows that the hit on quality can be
quite small!
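Conceptually, the static Encoder plus a mean Pooler boils down to a table lookup and an average. Here is a simplified sketch with made-up token IDs and a random embedding table, purely for illustration:

import numpy as np

# A pre-computed token embedding table: vocab_size x embedding_dim
vocab_size, embedding_dim = 30_522, 1024
token_embeddings = np.random.rand(vocab_size, embedding_dim).astype(np.float32)

def encode(token_ids: list[int]) -> np.ndarray:
    # "Encoder": a simple lookup of pre-computed token embeddings
    looked_up = token_embeddings[token_ids]
    # "Pooler": mean pooling into one embedding for the entire text
    return looked_up.mean(axis=0)

text_embedding = encode([2023, 2003, 1037, 7099])  # token IDs from a tokenizer
print(text_embedding.shape)  # (1024,)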
Our Method
We set out to revisit Static Embeddings models, using modern techniques to train
them. Most of our gains come from the use of a contrastive learning loss function,
as we'll explain shortly. Optionally, we can get additional speed improvements by
using Matryoshka Representation Learning, which makes it possible to use
truncated versions of the embedding vectors.
We'll be using the Sentence Transformers library for training. For a more general
overview on how this library can be used to train embedding models, consider
reading the Training and Finetuning Embedding Models with Sentence
Transformers v3 blogpost or the Sentence Transformers Training Overview
documentation.
Training Details
We leave various other modern training approaches, e.g. for further improving data
quality, to future research. See Next Steps for concrete ideas.
Training Requirements
Training Sentence Transformer models requires the following components:
1. Dataset
2. Loss Function
3. Training Arguments (Optional)
4. Evaluator (Optional)
5. Trainer
In the following sections, we'll go through our thought processes for each of these.
Model Inspiration
In our experience, embedding models are either used 1) exclusively for retrieval
or 2) for every task under the sun (classification, clustering, semantic textual
similarity, etc.). We set out to train one of each.
For the retrieval model, there is only a limited amount of multilingual retrieval
training data available, and hence we opted for an English-only model. In
contrast, we decided to train a multilingual general similarity model because
multilingual data was much easier to acquire for this task.
For these models, we would like to use the StaticEmbedding module, which
implements an efficient tokenize method that avoids padding, and an efficient
forward method that takes care of computing and pooling embeddings. It's as
simple as the following initialization code:
English Retrieval
from sentence_transformers import SentenceTransformer
from sentence_transformers.models import StaticEmbedding
from tokenizers import Tokenizer

tokenizer = Tokenizer.from_pretrained("google-bert/bert-base-uncased")
static_embedding = StaticEmbedding(tokenizer, embedding_dim=1024)
model = SentenceTransformer(modules=[static_embedding])
The first entry in the modules list must implement tokenize , and the last one
must produce pooled embeddings. Both are the case here, so we're good to start
training this model.
Multilingual Similarity
The initialization code is analogous:
from sentence_transformers import SentenceTransformer
from sentence_transformers.models import StaticEmbedding
from tokenizers import Tokenizer
tokenizer = Tokenizer.from_pretrained("google-bert/bert-base-multilingual-uncased")
static_embedding = StaticEmbedding(tokenizer, embedding_dim=1024)
model = SentenceTransformer(modules=[static_embedding])
Training Dataset Selection
English Retrieval
For the English Retrieval datasets, we are primarily looking for datasets with
question-answer pairs. Among others, we used:
gooaq
squad
paq
trivia_qa
msmarco_10m
Multilingual Similarity
For the Multilingual Similarity datasets, we are primarily looking for datasets with either:
parallel sentences across languages, i.e. the same text in multiple languages,
or
positive pairs, i.e. pairs with high similarity, optionally with negatives (i.e.
texts with low similarity).
Among others, we used:
wikititles
tatoeba
talks
europarl
global_voices
muse
wikimatrix
opensubtitles
wikianswers_duplicates
simple_wiki
altlex
flickr30k_captions
coco_captions
nli_for_simcse
negation
Code
print(gooaq_train_dataset)
"""
Dataset({
features: ['question', 'answer'],
num_rows: 3002496
})
"""
print(gooaq_eval_dataset)
"""
Dataset({
features: ['question', 'answer'],
num_rows: 10000
})
"""
The gooaq dataset doesn't already have a train-eval split, so we can make one with
train_test_split. Otherwise, we can just load a precomputed split with e.g.
split="eval" .
Note that train_test_split does mean that the dataset has to be loaded into
memory, whereas it is otherwise just kept on disk. This increased memory is not
ideal when training, so it's recommended to 1) load the data, 2) split it, and 3)
save it to disk with save_to_disk. Before training, you can then use
load_from_disk to load it again.
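As a small sketch of that recommended pattern (the local paths and split parameters are illustrative):

from datasets import load_dataset, load_from_disk

# 1) Load the full GooAQ dataset and 2) split off an evaluation set
gooaq_dataset = load_dataset("sentence-transformers/gooaq", split="train")
gooaq_dataset_dict = gooaq_dataset.train_test_split(test_size=10_000, seed=12)

# 3) Save both splits to disk once...
gooaq_dataset_dict["train"].save_to_disk("datasets/gooaq_train")
gooaq_dataset_dict["test"].save_to_disk("datasets/gooaq_eval")

# ...so the training script can memory-map them instead of loading them into RAM
gooaq_train_dataset = load_from_disk("datasets/gooaq_train")
gooaq_eval_dataset = load_from_disk("datasets/gooaq_eval")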
Loss Function Selection
Within Sentence Transformers, your loss function must match your training data
format. The Loss Overview is designed as an overview of which losses are
compatible with which formats.
For our (anchor, positive) pairs, the main candidates are MultipleNegativesRankingLoss
(MNRL), its memory-efficient variant CachedMultipleNegativesRankingLoss (CMNRL),
and GISTEmbedLoss:
CMNRL is recommended over MNRL unless you can already fit a large
enough batch size in memory with just MNRL. In that case, you can use
MNRL to save the roughly 20% training speed cost that CMNRL adds.
GISTEmbedLoss uses a guide model to filter out likely false negatives from the
in-batch negatives. False negatives can hurt performance, but hard true negatives
(texts that are close to correct, but not quite) can help performance, so this
filtering is a fine line to walk.
Because these static embedding models are extremely small, it is possible to fit
our desired batch size of 2048 samples on our hardware: a single RTX 3090 with
24GB, so we don't need to use CMNRL.
Additionally, because we're training such fast models, the guide model from
GISTEmbedLoss would make the training much slower. Because of this, we've
opted to use the plain MultipleNegativesRankingLoss.
Code
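A minimal sketch of the loss setup, assuming the model initialized earlier:

from sentence_transformers.losses import MultipleNegativesRankingLoss

# In-batch negatives: for each (question, answer) pair, all other answers
# in the batch act as negatives
loss = MultipleNegativesRankingLoss(model)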
Sentence Transformers also supports loss modifiers that wrap another loss function. A
very interesting one is the MatryoshkaLoss, which turns the trained model into
a Matryoshka Model. This allows users to truncate the output embeddings at a
minimal loss of performance, meaning that retrieval or clustering can be sped up
due to the smaller dimensionalities.
Code
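For example, a sketch of wrapping the loss; the exact dimension schedule below is illustrative:

from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

base_loss = MultipleNegativesRankingLoss(model)
# Also optimize the truncated embeddings, from the full 1024 dimensions down to 32
loss = MatryoshkaLoss(model, base_loss, matryoshka_dims=[1024, 512, 256, 128, 64, 32])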
Training Arguments Selection
We selected the following notable training arguments:
num_train_epochs : 1
We have sufficient data; should we want to train more, we can add more
data instead of training on the same data multiple times.
per_device_train_batch_size / per_device_eval_batch_size : 2048
Batches of 2048 samples fit comfortably on our RTX 3090. Various papers (Xiao
et al., Li et al.) show that even larger batch sizes still improve
performance. For future versions, we will apply
CachedMultipleNegativesRankingLoss with a larger batch size, e.g.
16384.
learning_rate : 2e-1
Note! This is much larger than with normal embedding model training,
which often uses a learning rate around 2e-5.
warmup_ratio : 0.1
bf16 : True
If your GPU supports bf16, it tends to make sense to train with it.
Otherwise, you can use fp16=True if that's supported instead.
batch_sampler : BatchSamplers.NO_DUPLICATES
All losses with in-batch negatives (such as MNRL) benefit from this
batch sampler that avoids duplicates within the batch. Duplicates often
result in false negatives, weakening the trained model.
multi_dataset_batch_sampler :
MultiDatasetBatchSamplers.PROPORTIONAL
When you're training with multiple datasets, it's common that not all
datasets are the same size. When that happens, you can either:
Round Robin: sample the same amount of batches from each
dataset until one is exhausted. You'll have an equal distribution of
data, but not all data will be used.
Proportional: sample each dataset until all are exhausted. You'll use
up all data, but you won't have an equal distribution of data. We
chose this one as we're not too concerned with a data imbalance.
Beyond these core arguments, we also set a few training arguments for tracking
and debugging: eval_strategy , eval_steps , save_strategy , save_steps ,
save_total_limit , logging_steps , logging_first_step , and run_name .
Code
run_name = "static-retrieval-mrl-en-v1"
# or
# run_name = "static-similarity-mrl-multilingual-v1"
args = SentenceTransformerTrainingArguments(
# Required parameter:
output_dir=f"models/{run_name}",
# Optional training parameters:
num_train_epochs=1,
per_device_train_batch_size=2048,
per_device_eval_batch_size=2048,
learning_rate=2e-1,
warmup_ratio=0.1,
fp16=False, # Set to False if you get an error that your GPU can
bf16=True, # Set to True if you have a GPU that supports BF16
batch_sampler=BatchSamplers.NO_DUPLICATES, # MultipleNegativesRa
multi_dataset_batch_sampler=MultiDatasetBatchSamplers.PROPORTIONA
# Optional tracking/debugging parameters:
eval_strategy="steps",
eval_steps=1000,
save_strategy="steps",
save_steps=1000,
save_total_limit=2,
logging_steps=1000,
logging_first_step=True,
run_name=run_name, # Used if `wandb`, `tensorboard`, or `neptune
)
Evaluator Selection
Due to its simplicity, we will be using the NanoBEIREvaluator for the retrieval
model. This evaluator runs Information Retrieval benchmarks on the NanoBEIR
collection of datasets. NanoBEIR is a subset of the much larger (and thus
slower) BEIR benchmark, which is commonly used for the Retrieval tab in the
MTEB Leaderboard.
Code
Because all datasets are already pre-defined, we can load the evaluator without
any arguments:
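A minimal version of that setup:

from sentence_transformers.evaluation import NanoBEIREvaluator

# Evaluates on all 13 NanoBEIR datasets and reports metrics such as NDCG@10
evaluator = NanoBEIREvaluator()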
Hardware Details
GPU: 1x RTX 3090 (24GB)
CPU: i7-13700K
RAM: 32GB
Overall Training Scripts
This section contains the final training scripts for both models with all of the
previously described components (datasets, loss functions, training arguments,
evaluator, trainer) combined.
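The full scripts combine 30 training datasets and the losses described above; the following is a heavily condensed, single-dataset sketch of how the components fit together, not the exact script we ran:

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.evaluation import NanoBEIREvaluator
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss
from sentence_transformers.models import StaticEmbedding
from sentence_transformers.training_args import BatchSamplers
from tokenizers import Tokenizer

# 1. Model: static embeddings on top of the bert-base-uncased tokenizer
static_embedding = StaticEmbedding(
    Tokenizer.from_pretrained("google-bert/bert-base-uncased"), embedding_dim=1024
)
model = SentenceTransformer(modules=[static_embedding])

# 2. Dataset: a single (question, answer) dataset for brevity
dataset = load_dataset("sentence-transformers/gooaq", split="train")
dataset = dataset.train_test_split(test_size=10_000, seed=12)

# 3. Loss: in-batch negatives, wrapped for Matryoshka-style truncation
loss = MatryoshkaLoss(
    model,
    MultipleNegativesRankingLoss(model),
    matryoshka_dims=[1024, 512, 256, 128, 64, 32],
)

# 4. Training arguments and 5. Evaluator
args = SentenceTransformerTrainingArguments(
    output_dir="models/static-retrieval-sketch",
    num_train_epochs=1,
    per_device_train_batch_size=2048,
    per_device_eval_batch_size=2048,
    learning_rate=2e-1,
    warmup_ratio=0.1,
    bf16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
    eval_strategy="steps",
    eval_steps=1000,
)
evaluator = NanoBEIREvaluator()

# 6. Trainer: ties everything together
trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    loss=loss,
    evaluator=evaluator,
)
trainer.train()

# Save the trained model to disk
model.save_pretrained("models/static-retrieval-sketch/final")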
English Retrieval
See our Weights and Biases report for the training and evaluation metrics
collected during training.
Multilingual Similarity
See our Weights and Biases report for the training and evaluation losses collected
during training.
Usage
English Retrieval
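A minimal usage sketch (the queries and documents are illustrative):

from sentence_transformers import SentenceTransformer

# Download the model from the Hugging Face Hub and run it on CPU
model = SentenceTransformer("sentence-transformers/static-retrieval-mrl-en-v1", device="cpu")

queries = ["What is the capital of France?"]
documents = [
    "Paris is the capital and most populous city of France.",
    "The Eiffel Tower is a wrought-iron lattice tower in Paris.",
]
query_embeddings = model.encode(queries)
document_embeddings = model.encode(documents)

# Rank documents by similarity to the query
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)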
The upcoming Performance > English Retrieval section will show that these
results are quite solid, within 15% of commonly used Transformer-based encoder
models like all-mpnet-base-v2.
SentenceTransformer API Reference.
Multilingual Similarity
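A minimal usage sketch (the example sentences are illustrative):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/static-similarity-mrl-multilingual-v1", device="cpu")

# The same sentence in English, German, and Spanish
sentences = [
    "It's nice weather outside today.",
    "Es ist heute schönes Wetter draußen.",
    "Hoy hace buen tiempo afuera.",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities)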
This model only loses about 8% of performance compared to the popular but
much slower multilingual-e5-small, as shown in the upcoming Performance >
Multilingual Similarity section.
To reduce the dimensionality of your calculated embeddings, you can simply pass
the truncate_dim parameter. This works for all Sentence Transformer models.
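For example (256 dimensions chosen arbitrarily here):

from sentence_transformers import SentenceTransformer

# Truncate the 1024-dimensional embeddings down to 256 dimensions
model = SentenceTransformer(
    "sentence-transformers/static-retrieval-mrl-en-v1",
    device="cpu",
    truncate_dim=256,
)
embeddings = model.encode(["Static embeddings are fast."])
print(embeddings.shape)  # (1, 256)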
This model also works out of the box in various third party libraries, for example
LangChain, LlamaIndex, Haystack, and txtai.
LangChain
model_name = "sentence-transformers/static-retrieval-mrl-en-v1"
model_kwargs = {'device': 'cpu'} # you can use 'truncate_dim' here
model = HuggingFaceEmbeddings(
model_name=model_name,
model_kwargs=model_kwargs,
)
HuggingFaceEmbeddings documentation.
LlamaIndex
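A sketch using LlamaIndex's Hugging Face embedding integration (assuming the llama-index-embeddings-huggingface package is installed; truncate_dim support depends on your version):

from llama_index.embeddings.huggingface import HuggingFaceEmbedding

model_name = "sentence-transformers/static-retrieval-mrl-en-v1"
embed_model = HuggingFaceEmbedding(
    model_name=model_name,
    device="cpu",
    # truncate_dim=256,  # if supported by your llama-index version
)
embedding = embed_model.get_text_embedding("Static embeddings are fast.")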
Haystack
model_name = "sentence-transformers/static-retrieval-mrl-en-v1"
device = "cpu"
document_embedder = SentenceTransformersDocumentEmbedder(
model=model_name,
device=device,
# truncate_dim=256, # you can use 'truncate_dim' here
)
text_embedder = SentenceTransformersTextEmbedder(
model=model_name,
device=device,
# truncate_dim=256, # you can use 'truncate_dim' here
)
SentenceTransformersDocumentEmbedder documentation.
SentenceTransformersTextEmbedder documentation.
txtai
model_name = "sentence-transformers/static-retrieval-mrl-en-v1"
embeddings = Embeddings(path=model_name)
Embeddings documentation
Performance
English Retrieval
NanoBEIR
We've evaluated sentence-transformers/static-retrieval-mrl-en-v1 on NanoBEIR
and plotted it against the inference speed computed on our hardware. For the
inference speed tests, we measured how many query embeddings from the GooAQ
dataset could be computed per second, either on CPU or GPU.
NOTE: Many of the attention-based dense embedding models are finetuned on the
training splits of the (Nano)BEIR evaluation datasets. This gives the models an unfair
advantage in this benchmark and can result in lower downstream performance on real
retrieval tasks.
[Figures: NanoBEIR NDCG@10 versus inference speed (texts per second) on GPU and CPU.]
We can draw a notable conclusion from these figures: static-retrieval-mrl-en-v1 reaches
within 15% of the performance of commonly used attention-based models like
all-mpnet-base-v2, while being 100x to 400x faster on CPU.
Matryoshka Evaluation
Evaluating the model with truncated embeddings shows that reducing the dimensionality
by e.g. 2x only causes a 1.47% reduction in performance (0.5031 NDCG@10 vs
0.4957 NDCG@10), while realistically resulting in a 2x speedup in retrieval speed.
Multilingual Similarity
Matryoshka Evaluation
The Matryoshka evaluation of this model shows that you can easily reduce the
dimensionality by 2x or 4x with minor (0.15% or 0.56%) performance hits. If the
speed of your downstream task or your storage costs are a bottleneck, this should
allow you to alleviate some of those concerns.
Conclusion
This blogpost described all of the steps that we undertook from ideation to
finished models, in addition to details regarding usage and evaluation of the two
resulting models: static-retrieval-mrl-en-v1 and static-similarity-mrl-multilingual-
v1.
Should you need an efficient CPU-only dense embedding model for your retrieval
or similarity tasks, then static-retrieval-mrl-en-v1 and static-similarity-mrl-
multilingual-v1 will be extremely performant solutions at minimal costs that get
surprisingly close to the attention-based dense models.
Next Steps
Try it out! If you already use a Sentence Transformer model somewhere, feel free
to swap it out for static-retrieval-mrl-en-v1 or static-similarity-mrl-multilingual-
v1. Or, better yet: train your own models on data that is representative of the task
and language of your interest.
More experiments are required to determine a good cutoff point for input text length.
For now, we leave the maximum sequence length, chunking, etc. to the user.
Additionally, there are quite a few possible extensions that are likely to improve
the performance of this model, which we happily leave to other model authors.
We are also open to collaborations:
1. Hard Negatives Mining: Search for similar, but not quite relevant, texts to
improve training data difficulty.
Acknowledgements
I would like to thank Stéphan Tulkens and Thomas van Dongen of The Minish Lab
for bringing Static Embedding models to my attention via their Model2Vec work.
Additionally, I would like to thank Vaibhav Srivastav and Pedro Cuenca for their
assistance with this blogpost, and Antoine Chaffin for brainstorming the release
checkpoints.
Community
NickyNicky/StaticEmbedding-MatryoshkaLoss-gemma-2-2b-en-es
NickyNicky/StaticEmbedding-MatryoshkaLoss-gemma-2-2b-gooaq-en
I would like to know how to increase or decrease the
'max_length example 371'
Hello!
Nice work on those models! Am I correct in understanding that one of those models
reaches 0.5623 NDCG@10 on NanoBEIR across all datasets? That's a pretty huge jump
from the 0.5032 NDCG@10 for static-retrieval-mrl-en-v1.
That is simply some approximate statistics on the training data; taken from the first
1000 samples. Although it's not always recommended to use texts with (much) larger
sequence lengths than the training data, the actual maximum sequence length is
indeed infinity. It is defined here: https://github.com/UKPLab/sentence-transformers/blob/cccab8303aaf6e18f069b0da578b3d162bf8442a/sentence_transformers/models/StaticEmbedding.py#L106-L108
In short: the model will never truncate sequences, because the approach
has linear complexity (2x more data -> 2x slower), unlike Transformer models (2x
more data -> (much) slower than 2x).
So, Static Models don't have a maximum sequence length. They just require care by
the user to make sure that they're not feeding documents that are too large, as all
documents will eventually embed very similarly if they are long enough.
Tom Aarsen
This is really cool! I'm surprised you do better than model2vec - is the difference really
just the use of a (better) contrastive loss pretraining formula?
Yes! The architecture is identical. In fact, the StaticEmbedding module that is used
for the models described in this blogpost is actually the same that is used when
loading a Model2Vec model in Sentence Transformers:
from sentence_transformers import SentenceTransformer
from sentence_transformers.models import StaticEmbedding

# Pre-distilled embeddings:
static_embedding = StaticEmbedding.from_model2vec("minishlab/M2V_base_output")
model = SentenceTransformer(modules=[static_embedding])
StaticEmbedding docs