Model description

This is a state-of-the-art ColBERT (Contextualized Late Interaction BERT) model for reranking and retrieval tasks. It is initialized from a pre-trained base model and achieves state-of-the-art performance across 13 publicly available BEIR benchmark datasets.

ColBERT combines the benefits of vector search and cross-encoders. Queries and documents are encoded separately, but instead of creating a single embedding for the entire document, ColBERT generates contextualized embeddings for each token in the document. During search, the token-level query embeddings are compared with the token-level embeddings of the documents using the lightweight scoring function MaxSim. This allows ColBERT to capture nuanced matching signals while being computationally efficient.
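The MaxSim operation described above can be sketched in a few lines. This is a minimal illustration, not the model's actual implementation: it assumes the query and document token embeddings are already computed as row-wise matrices.

```python
import numpy as np

def maxsim_score(Q, D):
    """Late-interaction MaxSim score.

    Q: (num_query_tokens, dim) query token embeddings.
    D: (num_doc_tokens, dim) document token embeddings.

    For each query token, take the maximum cosine similarity over all
    document tokens, then sum these maxima over the query tokens.
    """
    # Normalize rows so that dot products are cosine similarities.
    Q = Q / np.linalg.norm(Q, axis=1, keepdims=True)
    D = D / np.linalg.norm(D, axis=1, keepdims=True)
    sim = Q @ D.T                      # (num_query_tokens, num_doc_tokens)
    return sim.max(axis=1).sum()       # best match per query token, summed
```

Because documents are encoded independently of the query, their token embeddings can be precomputed and indexed; at query time only this cheap max-and-sum comparison runs, which is what makes late interaction efficient relative to a full cross-encoder.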

The model is initialized from a base model that was trained on over 700 million samples from various domains. It was then fine-tuned on around 96 million samples to adapt it to the late interaction mechanism. This extensive training enables the model to be used for a wide range of tasks and domains.

On the BEIR benchmark, the model outperforms other ColBERT models on average and leads directly in most individual tasks. Its exceptionally high reranking score even surpasses typical scores of cross-encoder-based reranker models on the benchmark, despite the resource-efficiency advantages of the ColBERT architecture. The model also demonstrates state-of-the-art retrieval performance compared to other currently available ColBERT models.

| Layers | Embedding Dimension | Recommended Sequence Length | Language |
|--------|---------------------|-----------------------------|----------|

Suitable Scoring Methods

  • MaxSim: The lightweight scoring function used in ColBERT to compare token-level query embeddings with token-level document embeddings.


  • Language: The model is trained on English text and is specifically designed for the English language.
  • Sequence Length: Any text longer than 512 tokens will be truncated.
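The truncation behavior noted above can be illustrated with a short sketch. The 512-token limit comes from the recommended sequence length; `truncate_ids` is a hypothetical stand-in for the truncation a real tokenizer applies when encoding with a maximum length.

```python
# Recommended sequence length of the model; longer inputs are cut off.
MAX_LENGTH = 512

def truncate_ids(token_ids, max_length=MAX_LENGTH):
    """Keep only the first `max_length` token ids; the rest are discarded,
    so any content beyond the limit cannot influence the embeddings."""
    return token_ids[:max_length]
```

In practice, long documents are often split into overlapping passages before encoding so that no content is silently lost to truncation.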


For utilizing our ColBERT model, we recommend the following setup.

The result looks like this: