Embeddings API
TLDR:
The mixedbread Embeddings API enables you to generate embeddings for your input text. It's a superset of the OpenAI
API, allowing you to use their client by simply changing the endpoint. Leverage our powerful models with ease.
Create embeddings
This endpoint provides access to our embedding models. It returns embeddings for the input text you provide, which can be used for various tasks such as text similarity, clustering, and more.
The endpoint is also a superset of the OpenAI embedding API, so you can use the OpenAI API client by pointing it to https://api.mixedbread.ai. However, note that not all mixedbread-specific features may be available through the OpenAI client.
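As a minimal sketch of the OpenAI-compatible request shape, the snippet below builds (but does not send) a POST request using only the Python standard library. The `/v1/embeddings` path follows the OpenAI convention and the model name is illustrative; verify both against the supported models list before use.

```python
import json
import urllib.request

# The endpoint mirrors the OpenAI embeddings API, so the request body
# uses the same top-level fields ("model", "input").
API_URL = "https://api.mixedbread.ai/v1/embeddings"

def build_request(texts, model, api_key):
    """Assemble an OpenAI-compatible embeddings request (not yet sent)."""
    payload = {"model": model, "input": texts}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Illustrative model name; substitute one from the supported models list.
req = build_request(["Hello, world"], "mixedbread-ai/mxbai-embed-large-v1",
                    "YOUR_API_KEY")
# urllib.request.urlopen(req) would send it once you have a valid API key.
```

Equivalently, the official OpenAI client can be used by changing its base URL to https://api.mixedbread.ai.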
Request Body
- Name
input*
- Type
- string|string[]
- Description
A string or a list of strings, where each string represents a sentence or chunk of text to be embedded.
- Between 1 and 256 items.
- Texts longer than the model's maximum sequence length will be truncated
- Name
model*
- Type
- string
- Description
The model to be used for generating embeddings.
- Must be a valid model. Refer to our supported models.
- Name
prompt
- Type
- string
- Description
An optional prompt to provide context to the model. Refer to the model's documentation for more information.
- A string between 1 and 256 characters
- Name
normalized
- Type
- boolean
- Description
Whether to normalize the embeddings. Defaults to true.
- Name
dimensions
- Type
- number
- Description
The desired number of dimensions in the output vectors. Defaults to the model's maximum.
- A number between 1 and the model's maximum output dimensions
- Only applicable for Matryoshka-based models
- Name
encoding_format
- Type
- string|string[]
- Description
The desired format for the embeddings. Defaults to "float". If multiple formats are requested, the response will include an object with each format for each embedding.
- Options: float, float16, binary, ubinary, int8, uint8, base64
- Name
truncation_strategy
- Type
- string
- Description
The strategy for truncating input text that exceeds the model's maximum length. Defaults to "start". Setting it to "none" will result in an error if the text is too long.
- Options: start, end, none
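To make the constraints above concrete, here is a hypothetical client-side helper that assembles a request body using every documented parameter. The field names and allowed values come from the table above; the validation logic itself is only an illustration, not part of the API.

```python
# Allowed values documented above.
ENCODING_FORMATS = {"float", "float16", "binary", "ubinary",
                    "int8", "uint8", "base64"}
TRUNCATION_STRATEGIES = {"start", "end", "none"}

def build_payload(input, model, prompt=None, normalized=True,
                  dimensions=None, encoding_format="float",
                  truncation_strategy="start"):
    """Build an embeddings request body, enforcing documented limits."""
    texts = [input] if isinstance(input, str) else input
    if not 1 <= len(texts) <= 256:
        raise ValueError("input must contain between 1 and 256 items")
    if prompt is not None and not 1 <= len(prompt) <= 256:
        raise ValueError("prompt must be between 1 and 256 characters")
    formats = ([encoding_format] if isinstance(encoding_format, str)
               else encoding_format)
    if not set(formats) <= ENCODING_FORMATS:
        raise ValueError("unsupported encoding_format")
    if truncation_strategy not in TRUNCATION_STRATEGIES:
        raise ValueError("truncation_strategy must be start, end, or none")

    payload = {"model": model, "input": input, "normalized": normalized,
               "encoding_format": encoding_format,
               "truncation_strategy": truncation_strategy}
    if prompt is not None:
        payload["prompt"] = prompt
    if dimensions is not None:  # Matryoshka-based models only
        payload["dimensions"] = dimensions
    return payload

# Illustrative model name; requesting two formats yields an object with
# both encodings per embedding in the response.
payload = build_payload(["first chunk", "second chunk"],
                        model="mixedbread-ai/mxbai-embed-large-v1",
                        dimensions=512,
                        encoding_format=["float", "base64"])
```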
Response Body
- Name
model*
- Type
- string
- Description
The embedding model used, which can be one of our hosted models or a custom fine-tuned model.
- Name
object*
- Type
- string
- Description
The type of the returned object. Always "list".
- Name
data*
- Type
- object[]
- Description
A list of the generated embeddings.
- Name
data[x].embedding*
- Type
- number[]|object
- Description
The vector representing the embedding, or an object with different encodings if multiple formats were requested.
- Name
data[x].index*
- Type
- number
- Description
The index of the input text corresponding to this embedding.
- Name
data[x].object*
- Type
- string
- Description
The type of the returned object. Always "embedding".
- Name
usage*
- Type
- object
- Description
Information about API usage for this request.
- Name
usage.prompt_tokens*
- Type
- number
- Description
The number of prompt tokens used to generate the embeddings.
- Name
usage.total_tokens*
- Type
- number
- Description
The total number of tokens used to generate the embeddings.
- Name
normalized*
- Type
- boolean
- Description
Indicates whether the embeddings are normalized.
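The fields above can be sketched with a hand-written response in the documented shape (the values are illustrative, not real API output). Because each item carries `data[x].index`, you can restore input order even if items arrive unordered:

```python
# Hand-written example response following the documented schema.
sample = {
    "model": "mixedbread-ai/mxbai-embed-large-v1",  # illustrative name
    "object": "list",
    "data": [
        {"embedding": [0.2, 0.5], "index": 1, "object": "embedding"},
        {"embedding": [0.1, 0.9], "index": 0, "object": "embedding"},
    ],
    "usage": {"prompt_tokens": 12, "total_tokens": 12},
    "normalized": True,
}

def vectors_in_input_order(response):
    """Return embeddings sorted by data[x].index, matching input order."""
    items = sorted(response["data"], key=lambda d: d["index"])
    return [d["embedding"] for d in items]

vecs = vectors_in_input_order(sample)
# vecs[0] now corresponds to the first input text.
```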
Rate Limiting
To ensure smooth operation for all users, we have rate limits in place. If you exceed the rate limit, you will receive a 429 Too Many Requests
error. Please wait and try again after a short delay. The table below outlines the rate limits for each tier:
| Tier | Requests per Minute | Tokens per Minute | Requests per Day | Burst |
|---|---|---|---|---|
| 1 - Home Baker (Free) | 100 | 250,000 | 5,000 | 10 |
| 2 - Professional Baker | 300 | 500,000 | 10,000 | 20 |
| 3 - Bakery Shop | 500 | 1,000,000 | 10,000 | 50 |
| 4 - Bakery Chain | 1000 | 10,000,000 | 50,000 | 100 |
| 5 - Bakery Franchise | 2000 | 10,000,000 | 100,000 | 100 |
| Custom | Custom | Custom | Custom | Custom |
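A common way to handle a 429 response is exponential backoff with a retry cap. The sketch below simulates the HTTP call with a stand-in function; only the 429 status semantics come from the docs, while the backoff schedule is an arbitrary illustration you should tune for your workload.

```python
import time

def with_backoff(send, max_retries=4, base_delay=0.01):
    """Retry `send` on 429, doubling the delay after each attempt."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_retries:
            time.sleep(base_delay * (2 ** attempt))
    return status, body

# Simulated endpoint: rate-limited twice, then succeeds.
calls = {"n": 0}
def fake_send():
    calls["n"] += 1
    return (429, None) if calls["n"] <= 2 else (200, {"object": "list"})

status, body = with_backoff(fake_send)
```

In production, `send` would issue the real HTTP request; honoring any `Retry-After` header, if present, is preferable to a fixed schedule.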
Requesting a Rate Limit Increase
If you require a higher rate limit, we're here to help! Contact us and provide the following information:
- Your use case and what you're working on
- The estimated number of requests you anticipate needing
- Any additional details that would help us understand your requirements
We will review your request and collaborate with you to determine an appropriate limit.