
Configuring Marqo

Marqo is configured through environment variables passed to the Marqo container when it is run.


Configuring usage limits

Limits can be set to protect the resources of the machine Marqo is running on.

| Configuration name | Default | Description |
| --- | --- | --- |
| MARQO_MAX_INDEX_FIELDS | n/a | Maximum number of fields allowed per index. |
| MARQO_MAX_DOC_BYTES | 100000 | Maximum size (in bytes) of a document that can be indexed. |
| MARQO_MAX_RETRIEVABLE_DOCS | n/a | Maximum number of documents that can be returned in a single request. |
| MARQO_MAX_NUMBER_OF_REPLICAS | 1 | Maximum number of replicas allowed when creating an index. |
| MARQO_MAX_CUDA_MODEL_MEMORY | 4 | Maximum CUDA memory usage (GB) for models in Marqo. For multi-GPU, this is the maximum memory for each GPU. |
| MARQO_MAX_CPU_MODEL_MEMORY | 4 | Maximum RAM usage (GB) for models in Marqo. |
| MARQO_MAX_VECTORISE_BATCH_SIZE | 16 | Maximum batch size to process in parallel (for example, when adding documents). |
| MARQO_MAX_ADD_DOCS_COUNT | 64 | Maximum number of documents that can be added to an index in a single request. |

Example

docker run --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
    -e "MARQO_MAX_INDEX_FIELDS=400" \
    -e "MARQO_MAX_DOC_BYTES=200000" \
    -e "MARQO_MAX_RETRIEVABLE_DOCS=600" \
    -e "MARQO_MAX_CUDA_MODEL_MEMORY=5" \
    -e "MARQO_MAX_NUMBER_OF_REPLICAS=2" marqoai/marqo:latest
In the example above, a Marqo container is run with the following limits:

  • The maximum number of fields per index is 400.

  • The maximum size of an indexed document is 200,000 bytes (0.2 MB).

  • The maximum number of documents that can be returned in a single request is 600.

  • The maximum number of replicas allowed when creating an index is 2.

  • The maximum CUDA memory usage for models in Marqo is 5 GB.
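
Some of these limits can also be checked on the client side before a request is sent. The following is a minimal sketch, not part of Marqo: the helper names are hypothetical, the constants mirror the defaults in the table above, and the document size is approximated as the UTF-8 length of its JSON encoding, which is an assumption about how size is measured.

```python
import json

MAX_DOC_BYTES = 100_000   # mirrors the MARQO_MAX_DOC_BYTES default
MAX_ADD_DOCS_COUNT = 64   # mirrors the MARQO_MAX_ADD_DOCS_COUNT default


def approx_doc_bytes(doc: dict) -> int:
    """Approximate a document's size as the UTF-8 length of its JSON form (assumption)."""
    return len(json.dumps(doc).encode("utf-8"))


def split_into_batches(docs: list, batch_size: int = MAX_ADD_DOCS_COUNT) -> list:
    """Split docs into consecutive batches of at most batch_size items."""
    return [docs[i:i + batch_size] for i in range(0, len(docs), batch_size)]


# Hypothetical documents, used only for illustration.
docs = [{"_id": f"doc_{i}", "text": "hello"} for i in range(150)]

# Collect the IDs of documents that would likely exceed the size limit.
oversized = [d["_id"] for d in docs if approx_doc_bytes(d) > MAX_DOC_BYTES]

# Each batch can then be sent as a separate add-documents request.
batches = split_into_batches(docs)
print([len(b) for b in batches], oversized)
```

If the limits are changed on the container, the two constants would need to be kept in step with the configured values.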

Configuring preloaded models

  • Variable: MARQO_MODELS_TO_PRELOAD

  • Default value: '["hf/all_datasets_v4_MiniLM-L6", "ViT-L/14"]'

  • Expected value: A JSON-encoded array of strings or objects.

This is a list of models to load and pre-warm as Marqo starts. Preloading avoids a model-loading delay on the first search and index requests.

Models in string form must be names of models in the Marqo model registry.

Models in object form must have model and model_properties keys.

Model Object Example (OPEN CLIP model)

'{
    "model": "my-open-clip-1",
    "model_properties": {
        "name": "ViT-B-32-quickgelu",
        "dimensions": 512,
        "url": "https://github.com/mlfoundations/open_clip/releases/download/v0.2-weights/vit_b_32-quickgelu-laion400m_avg-8a00ab3c.pt",
        "type": "open_clip"
    }
}'

Model Object Example (CLIP model)

'{
    "model": "generic-clip-test-model-2",
    "model_properties": {
        "name": "ViT-B/32",
        "dimensions": 512,
        "type": "clip",
        "url": "https://openaipublic.azureedge.net/clip/models/40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt"
    }
}'

Marqo Run Example (containing both string and object)

export MY_MODEL_LIST='[
    "sentence-transformers/stsb-xlm-r-multilingual",
    "hf/all_datasets_v4_MiniLM-L6",
    {
        "model": "generic-clip-test-model-2",
        "model_properties": {
            "name": "ViT-B/32",
            "dimensions": 512,
            "type": "clip",
            "url": "https://openaipublic.azureedge.net/clip/models/40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt"
        }
    }
]'

docker run --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
    -e MARQO_MODELS_TO_PRELOAD="$MY_MODEL_LIST" \
    marqoai/marqo:latest
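
The JSON value can also be generated programmatically, which avoids quoting mistakes in hand-written shell strings. The sketch below is an illustration only: build_models_to_preload is a hypothetical helper, and it checks just the two rules stated above (string entries, or objects with model and model_properties keys). It does not check that a name exists in the model registry.

```python
import json


def build_models_to_preload(models: list) -> str:
    """Validate entries and return a JSON-encoded array for MARQO_MODELS_TO_PRELOAD."""
    for entry in models:
        if isinstance(entry, str):
            continue  # string form: a model registry name
        if isinstance(entry, dict) and {"model", "model_properties"} <= set(entry):
            continue  # object form: has both required keys
        raise ValueError(f"invalid model entry: {entry!r}")
    return json.dumps(models)


value = build_models_to_preload([
    "hf/all_datasets_v4_MiniLM-L6",
    {
        "model": "generic-clip-test-model-2",
        "model_properties": {
            "name": "ViT-B/32",
            "dimensions": 512,
            "type": "clip",
            "url": "https://openaipublic.azureedge.net/clip/models/40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt",
        },
    },
])
print(value)
```

The printed string can then be exported as MY_MODEL_LIST and passed to docker run as in the example above.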

Configuring log level

  • Variable: MARQO_LOG_LEVEL

  • Default value: 'info'

  • Expected value: one of the strings 'error', 'warning', 'info', 'debug'.

This environment variable sets the log level of the timing logger and the uvicorn logger. A higher log level (e.g., 'error') reduces the volume of log output, while a lower log level (e.g., 'debug') records more detailed information. The default level is 'info'.

Example

docker run --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
    -e MARQO_LOG_LEVEL='warning' \
    marqoai/marqo:latest

Configuring Marqo-OS request retries

Transient network errors or other connection problems can make it necessary to retry HTTP requests from Marqo to Marqo-OS. The following environment variables control the number of retries and the maximum backoff time between retries. All of these environment variables must be integers.

| Configuration name | Default | Description |
| --- | --- | --- |
| MARQO_MAX_BACKEND_ADD_DOCS_RETRY_ATTEMPTS | 0 | Maximum number of request retry attempts from Marqo to the backend if add_documents encounters a backend communication error. |
| MARQO_MAX_BACKEND_ADD_DOCS_RETRY_BACKOFF | 1 | Maximum time (in seconds) for Marqo to wait before retrying the request if add_documents encounters a backend communication error. |
| MARQO_MAX_BACKEND_SEARCH_RETRY_ATTEMPTS | 0 | Maximum number of request retry attempts from Marqo to the backend if search encounters a backend communication error. |
| MARQO_MAX_BACKEND_SEARCH_RETRY_BACKOFF | 1 | Maximum time (in seconds) for Marqo to wait before retrying the request if search encounters a backend communication error. |

Note

  • MARQO_MAX_BACKEND_ADD_DOCS_RETRY_ATTEMPTS and MARQO_MAX_BACKEND_SEARCH_RETRY_ATTEMPTS represent the number of retry attempts, not total attempts. For example, if MARQO_MAX_BACKEND_ADD_DOCS_RETRY_ATTEMPTS is set to 2, then Marqo will attempt to add documents to the backend 3 times in total (the initial attempt plus 2 retries).

  • Setting MARQO_MAX_BACKEND_ADD_DOCS_RETRY_ATTEMPTS to a value greater than 0 can lead to unexpected edge cases in Marqo during a network partition. A sample scenario is as follows:

    1. An add_documents() call successfully adds a document (for example: "_id": "doc_1") to the vector storage layer, but gets no response due to a transient network error, prompting a retry.
    2. A second user performs a delete_documents() call, attempting to delete the document "_id": "doc_1". This returns a 200 OK (rather than a 404 Not Found) for deleting the document, as it was successfully created in the storage layer.
    3. The retry attempt from the add_documents() call then succeeds, writing "doc_1" to the storage layer after the deletion. The user gets a success response from the add_documents() call.

    The user would see a successful document creation and a successful document deletion even though the document still exists in the index. If this behaviour is not acceptable for your application, we recommend setting MARQO_MAX_BACKEND_ADD_DOCS_RETRY_ATTEMPTS to 0 and implementing the retry logic within your application instead.
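
The ordering problem in the scenario above can be reproduced with a toy model. The sketch below uses a plain dictionary as a stand-in for the storage layer. It is not Marqo's backend, and backend_add and backend_delete are hypothetical names. It only shows why a retried write that lands after a delete leaves the document in place.

```python
storage = {}  # stand-in for the storage layer; not Marqo's real backend


def backend_add(doc: dict) -> None:
    """Write (or overwrite) a document by its _id."""
    storage[doc["_id"]] = doc


def backend_delete(doc_id: str) -> int:
    """Return 200 if the document existed and was removed, 404 otherwise."""
    return 200 if storage.pop(doc_id, None) is not None else 404


doc = {"_id": "doc_1", "text": "hello"}

# 1. The first add is written, but its response is lost in transit.
backend_add(doc)

# 2. A second user deletes the document and receives 200.
delete_status = backend_delete("doc_1")

# 3. The automatic retry of the add writes the document again.
backend_add(doc)

# The delete reported success, yet the document is still stored.
print(delete_status, "doc_1" in storage)
```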

Retry backoff time calculation

The retry backoff time grows exponentially with each attempt, according to the formula:

(2 ^ attempt_number) * 10

The result is in milliseconds, and attempt_number starts at 0. Marqo therefore waits 10 ms before the first retry, 20 ms before the second retry, 40 ms before the third retry, and so on. The wait is capped at MARQO_MAX_BACKEND_ADD_DOCS_RETRY_BACKOFF or MARQO_MAX_BACKEND_SEARCH_RETRY_BACKOFF (both given in seconds), depending on the endpoint.
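
As a worked illustration of the backoff formula, the sketch below computes the wait before each retry. The function name backoff_ms is hypothetical, and applying the cap as a simple minimum after converting seconds to milliseconds is an assumption about how the limit is enforced.

```python
def backoff_ms(attempt_number: int, max_backoff_s: int = 1) -> int:
    """Wait before a retry, in milliseconds: (2 ^ attempt_number) * 10, capped."""
    uncapped_ms = (2 ** attempt_number) * 10
    # Assumption: the cap (configured in seconds) is applied as a plain minimum.
    return min(uncapped_ms, max_backoff_s * 1000)


# Waits for attempt_number 0..7 with the default cap of 1 second.
waits = [backoff_ms(n) for n in range(8)]
print(waits)
```

With the default cap of 1 second, the eighth wait (attempt_number 7) would be 1280 ms uncapped and is therefore limited to 1000 ms.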

Advanced Marqo-OS retry configuration

The following environment variables control the default retry behavior for all Marqo endpoints. For more fine-grained control, it is recommended that you set the variables specific to add_documents and search and leave these defaults as-is.

| Configuration name | Default | Description |
| --- | --- | --- |
| DEFAULT_MARQO_MAX_BACKEND_RETRY_ATTEMPTS | 0 | Default maximum number of request retry attempts from Marqo to the backend if any endpoint encounters a backend communication error. |
| DEFAULT_MARQO_MAX_BACKEND_RETRY_BACKOFF | 1 | Default maximum time (in seconds) for Marqo to wait before retrying the request if any endpoint encounters a backend communication error. |

Configuring throttling

| Configuration name | Default | Description |
| --- | --- | --- |
| MARQO_ENABLE_THROTTLING | "TRUE" | Enables throttling if "TRUE". Must be the string "TRUE" or "FALSE". |
| MARQO_MAX_CONCURRENT_INDEX | 8 | Maximum number of concurrent indexing threads allowed. |
| MARQO_MAX_CONCURRENT_SEARCH | 8 | Maximum number of concurrent search threads allowed. |

These environment variables set Marqo's allowed concurrency for indexing and search. Once a limit is reached, Marqo returns HTTP 429 (Too Many Requests) for subsequent requests. Set these values according to the resources available on the machine Marqo runs on.

Example

docker run --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
    -e MARQO_ENABLE_THROTTLING='TRUE' \
    -e MARQO_MAX_CONCURRENT_SEARCH='10' \
    marqoai/marqo:latest
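
When throttling is enabled, a client may want to back off and retry when it receives a 429. The following is a minimal sketch under stated assumptions: call_with_429_retry is a hypothetical helper, send stands for any callable that performs the request and returns the HTTP status code, and the delay values are arbitrary.

```python
import time


def call_with_429_retry(send, max_retries=3, base_delay_s=0.1, sleep=time.sleep):
    """Call send() until it returns a status other than 429 or retries run out."""
    status = send()
    for attempt in range(max_retries):
        if status != 429:
            break
        # Wait longer after each throttled response before trying again.
        sleep(base_delay_s * (2 ** attempt))
        status = send()
    return status


# Simulated responses: throttled twice, then accepted.
responses = iter([429, 429, 200])
delays = []  # records the requested delays instead of actually sleeping
status = call_with_429_retry(lambda: next(responses), sleep=delays.append)
print(status, delays)
```

Passing sleep as a parameter keeps the example quick to run; real code would use the default time.sleep.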

Other configurations

| Configuration name | Default | Description |
| --- | --- | --- |
| MARQO_EF_CONSTRUCTION_MAX_VALUE | 4096 | The maximum ef_construction value of Marqo indexes created by this Marqo instance. |
| MARQO_MAX_SEARCHABLE_TENSOR_ATTRIBUTES | null | The maximum number of tensor fields that can be searched in a single tensor search query. |