ColBERT

Learn about the ColBERT architecture, a powerful approach that combines vector search and cross-encoders for efficient and effective reranking and retrieval tasks. Discover Mixedbread's state-of-the-art ColBERT models and their performance on BEIR benchmarks.

In this documentation, you'll learn all you need to know about the ColBERT architecture, a way to enable great reranking and retrieval performance without the computational needs of traditional cross-encoders.

What are the Traditional Approaches?

The typical search approach uses the same model to encode both documents and queries. We then choose a metric, such as cosine similarity, to measure the distance between the query and the documents. However, this approach has a limitation: the model must position queries and their relevant documents close together in latent space, yet there is no interaction between the query and the document inside the model.
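To make this concrete, here is a minimal bi-encoder sketch using the sentence-transformers library (the model name is just an example; any embedding model works the same way). Note how the query and documents are encoded completely independently:

```python
from sentence_transformers import SentenceTransformer, util

# Example embedding model; any bi-encoder works the same way.
model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

query = "How does late interaction work?"
documents = [
    "ColBERT compares token-level embeddings at query time.",
    "Cosine similarity measures the angle between two vectors.",
]

# Query and documents never see each other inside the model.
query_emb = model.encode(query)
doc_embs = model.encode(documents)

# Rank documents by cosine similarity to the query.
scores = util.cos_sim(query_emb, doc_embs)
print(scores)
```

Because document embeddings can be computed once and stored in a vector index, this approach scales to millions of documents.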

On the other hand, we have models like cross-encoders. With cross-encoders, the query and the document are fed to the model together, which improves search accuracy. Unfortunately, cross-encoders are extremely compute-intensive, since every query-document pair requires a full forward pass through the model at query time, and nothing can be precomputed. These models are therefore not suitable for large-scale search and are mostly used for reranking.
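A minimal cross-encoder sketch, again using sentence-transformers (the checkpoint name is illustrative; any cross-encoder reranker works similarly):

```python
from sentence_transformers import CrossEncoder

# Example cross-encoder reranker checkpoint.
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "How does late interaction work?"
documents = [
    "ColBERT compares token-level embeddings at query time.",
    "Cosine similarity measures the angle between two vectors.",
]

# Each query-document pair costs a full forward pass, which is
# why cross-encoders do not scale to large corpora.
scores = model.predict([(query, doc) for doc in documents])
print(scores)
```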

What is the ColBERT Architecture?

ColBERT stands for Contextualized Late Interaction BERT, and it combines both vector search and cross-encoders. In ColBERT, the queries and the documents are first encoded separately. However, instead of creating a single embedding for the entire document, ColBERT generates contextualized embeddings for each token in the document. To search, the token-level query embeddings are compared with the token-level embeddings of the documents using the lightweight scoring function MaxSim. This allows ColBERT to capture more nuanced matching signals while still being computationally efficient. The resulting scores are then used to rank the documents based on their relevance to the query.
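The MaxSim scoring step is simple enough to sketch in a few lines. Here is a minimal NumPy version, assuming the query and document have already been encoded into L2-normalized per-token embedding matrices:

```python
import numpy as np

def maxsim_score(query_embs: np.ndarray, doc_embs: np.ndarray) -> float:
    """MaxSim: for each query token, take the highest similarity to any
    document token, then sum these maxima over all query tokens.

    query_embs: (num_query_tokens, dim), L2-normalized token embeddings
    doc_embs:   (num_doc_tokens, dim),   L2-normalized token embeddings
    """
    # (num_query_tokens, num_doc_tokens) cosine similarity matrix
    sim = query_embs @ doc_embs.T
    # Best-matching document token per query token, summed up
    return float(sim.max(axis=1).sum())

# Toy example with random stand-ins for token embeddings
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8)); q /= np.linalg.norm(q, axis=1, keepdims=True)
d = rng.normal(size=(12, 8)); d /= np.linalg.norm(d, axis=1, keepdims=True)
print(maxsim_score(q, d))
```

Because the document-side token embeddings can be precomputed and indexed, only this cheap matrix operation runs at query time, which is the "late interaction" that gives ColBERT its efficiency.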

[Figure: Similarity scoring process of query and document in a ColBERT model]

While ColBERT can be used for both reranking and retrieval tasks, we mainly recommend using it for reranking and taking advantage of our embedding models for retrieval-related use cases.

Mixedbread ColBERT Models

With the recent release of our fresh and crunchy ColBERT model mxbai-colbert-large-v1, there's a new model family in our portfolio!

The model family now includes:

| Model | Status | Context Length | Dimension | BEIR Average |
| --- | --- | --- | --- | --- |
| mxbai-colbert-large-v1 | API unavailable | 512 | 1024 | 50.37 (Reranking) |

Why mixedbread-colbert?

mixedbread-colbert is a powerful, computationally efficient ColBERT model family, fully open-source under the Apache 2.0 license. The new model outperforms the other openly available models on the BEIR benchmark, both on most subsets and on average. Its scores even reach levels typical of traditional cross-encoder rerankers, despite its resource-efficiency advantage.

Reranking performance in NDCG@10:

| Dataset | ColBERTv2 | Jina-ColBERT-v1 | mixedbread-colbert |
| --- | --- | --- | --- |
| ArguAna | 29.99 | 33.42 | 33.11 |
| ClimateFEVER | 16.51 | 20.66 | 20.85 |
| DBPedia | 31.80 | 42.16 | 40.61 |
| FEVER | 65.13 | 81.07 | 80.75 |
| FiQA | 23.61 | 35.60 | 35.86 |
| HotPotQA | 63.30 | 68.84 | 67.62 |
| NFCorpus | 33.75 | 36.69 | 36.37 |
| NQ | 30.55 | 51.27 | 51.43 |
| Quora | 78.86 | 85.18 | 86.95 |
| SCIDOCS | 14.90 | 15.39 | 16.98 |
| SciFact | 67.89 | 70.20 | 71.48 |
| TREC-COVID | 59.47 | 75.00 | 81.04 |
| Webis-touché2020 | 44.22 | 32.12 | 31.70 |
| Average | 43.08 | 49.82 | 50.37 |

How Can You Get Started Using mixedbread-colbert Yourself?

Since our ColBERT model is not currently available via API, you'll need to get the model from Hugging Face and host it yourself. We recommend using our model with a ColBERT-compatible framework such as RAGatouille. Please see the corresponding documentation page for more information!
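As a starting point, here is a minimal reranking sketch assuming the RAGatouille library and the mixedbread-ai/mxbai-colbert-large-v1 checkpoint on Hugging Face; check the RAGatouille documentation for the current API:

```python
# Minimal reranking sketch, assuming RAGatouille is installed
# (pip install ragatouille).
from ragatouille import RAGPretrainedModel

# Load the ColBERT checkpoint from Hugging Face.
model = RAGPretrainedModel.from_pretrained("mixedbread-ai/mxbai-colbert-large-v1")

query = "What is late interaction in ColBERT?"
documents = [
    "ColBERT scores query and document token embeddings with MaxSim.",
    "BEIR is a heterogeneous benchmark for zero-shot retrieval.",
    "Cross-encoders jointly encode the query and the document.",
]

# Rerank the candidate documents for the query; the documents are
# returned ordered by their ColBERT relevance score.
results = model.rerank(query=query, documents=documents, k=3)
print(results)
```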