Visual Discovery API Documentation

Overview

The Visual Discovery API is an image detection service that identifies objects in images and returns a bounding box, confidence score, and labels for each one. Use it to let users photograph an item or upload an image and then search your product catalogue for it directly.

Guide Prerequisites

  • Marqo Cloud account: Sign up at cloud.marqo.ai. If you don't have an account, please reach out to the Marqo team.
  • Python 3.7+ installed on your machine
  • Visual Discovery API key, provided by your Marqo representative. Please reach out to the team if you don't have one.

Base URL

https://visual-discovery.marqo-ep.ai

Authentication

All API requests require authentication using an API key passed in the request header:

x-api-key: <your-api-key>

Endpoints

Detect Objects

Detects objects in an image and returns bounding boxes with confidence scores and labels.

Endpoint: POST /detect

Headers:

  • Content-Type: application/json
  • x-api-key: <your-api-key>

Request Body:

{
  "image": "<image_url_or_base64>"
}

Parameters:

  • image (string, required): Either a URL to an image or a base64-encoded image string

Supported Image Formats:

  • JPEG
  • WebP
  • PNG
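
For a local file, the image can be base64-encoded before it is placed in the request body. The following is a minimal sketch, assuming the service accepts a plain base64 string with no "data:" URI prefix (this page does not say either way). The helper name build_detect_payload is hypothetical.

```python
import base64
import json


def build_detect_payload(image_bytes: bytes) -> str:
    """Return a JSON request body carrying the image as base64 text.

    Assumption: the service accepts raw base64 with no "data:" prefix.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded})


# Usage (the path is a placeholder for a JPEG, WebP, or PNG file):
#     with open("photo.jpg", "rb") as f:
#         body = build_detect_payload(f.read())
```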

Example Request:

curl -X POST 'https://visual-discovery.marqo-ep.ai/detect' \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: your-api-key-here' \
  --data '{
    "image": "https://example.com/image.jpg"
  }'
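
The same request can be made from Python. This is a sketch using only the standard library, not an official client: the function names build_detect_request and detect and the 10-second default timeout are assumptions chosen for illustration.

```python
import json
import urllib.request

BASE_URL = "https://visual-discovery.marqo-ep.ai"


def build_detect_request(image: str, api_key: str) -> urllib.request.Request:
    """Build a POST /detect request for an image URL or base64 string."""
    body = json.dumps({"image": image}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/detect",
        data=body,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def detect(image: str, api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_detect_request(image, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage:
#     result = detect("https://example.com/image.jpg", "your-api-key-here")
```

Building the request separately from sending it keeps the header and body construction easy to inspect without making a network call.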

Example Response:

{
  "items": [
    {
      "bounding_box": [
        [100, 150],
        [300, 400]
      ],
      "score": 0.95,
      "labels": ["bag", "handbag"]
    },
    {
      "bounding_box": [
        [200, 100],
        [400, 300]
      ],
      "score": 0.87,
      "labels": ["top", "shirt"]
    }
  ],
  "original_image": {
    "width": 800,
    "height": 600
  }
}

Response Fields:

  • items (array): Array of detected objects. Each item contains:
    • bounding_box (array): Bounding box coordinates, measured from the image's top-left corner as the origin (0, 0)
      • First array: [x1, y1], the top-left corner of the box
      • Second array: [x2, y2], the bottom-right corner of the box
    • score (float): Confidence score between 0 and 1
    • labels (array): Array of category labels for the detected object
  • original_image (object): Original image dimensions
    • width (integer): Image width in pixels
    • height (integer): Image height in pixels
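
To make the response layout concrete, the sketch below unpacks each item into a flat record and keeps only detections at or above a score threshold. The Detection type, the parse_detections name, and the min_score parameter are hypothetical and not part of the API.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    x1: int
    y1: int
    x2: int
    y2: int
    score: float
    labels: List[str]


def parse_detections(response: dict, min_score: float = 0.0) -> List[Detection]:
    """Flatten response["items"], dropping entries below min_score.

    Results are returned with the highest score first.
    """
    detections = []
    for item in response.get("items", []):
        # bounding_box is [[x1, y1], [x2, y2]]: top-left, then bottom-right.
        (x1, y1), (x2, y2) = item["bounding_box"]
        if item["score"] >= min_score:
            detections.append(
                Detection(x1, y1, x2, y2, item["score"], item["labels"])
            )
    return sorted(detections, key=lambda d: d.score, reverse=True)
```

For the example response above, a threshold of 0.9 would keep only the "bag" detection, whose box is 200 pixels wide (300 - 100) and 250 pixels tall (400 - 150).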

Performance Characteristics

  • Availability: 99.999% (5 nines)
  • P99 Latency: < 1 second
  • P90 Latency: < 500ms
  • Scalability: Supports thousands of requests per second
  • Caching: Results are cached for 30 days to improve performance

Error Responses

400 Bad Request

{
  "error": "Missing 'image' field in request"
}

401 Unauthorized

{
  "error": "Invalid or missing API key"
}

405 Method Not Allowed

{
  "error": "Invalid request method"
}

500 Internal Server Error

{
  "error": "Error from inference server"
}
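
One way to surface these errors in Python is to read the "error" field from the response body and raise it together with the status code. This is a sketch under assumptions: VisualDiscoveryError, error_message, and send are hypothetical names, and the fallback for non-JSON bodies is a defensive guess rather than documented behaviour.

```python
import json
import urllib.error
import urllib.request


class VisualDiscoveryError(Exception):
    """Hypothetical exception carrying the HTTP status and API message."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def error_message(raw_body: bytes) -> str:
    """Extract the "error" field from a body, tolerating non-JSON content."""
    try:
        payload = json.loads(raw_body.decode("utf-8"))
    except ValueError:  # covers invalid JSON and invalid UTF-8
        return raw_body.decode("utf-8", errors="replace")
    if isinstance(payload, dict) and "error" in payload:
        return str(payload["error"])
    return str(payload)


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send a prepared request, converting HTTP errors to VisualDiscoveryError."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        raise VisualDiscoveryError(exc.code, error_message(exc.read())) from exc
```

Callers can then branch on the status attribute, for example treating 400 and 401 as problems to fix in the request and 500 as a server-side failure.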

Use Cases

  1. Fashion E-commerce: Detect clothing items in product images for enhanced search and recommendations
  2. Visual Search: Enable image-to-image search capabilities from users' photos

After detecting objects in an image, you can run a visual search on any one of them:

  1. Extract the bounding box coordinates from the response
  2. Crop the original image using the bounding box coordinates
  3. Convert the cropped image to base64
  4. Submit the cropped image to Marqo's image-to-image search endpoint for finding similar products

SDKs and Libraries

The API is currently REST-based and can be used with any HTTP client. SDKs for popular programming languages may become available in the future.

Support

For technical support or questions about the Visual Discovery API, please contact the Marqo team.

Changelog

Version 1.0.0 (Current)

  • Initial release with object detection capabilities
  • Support for URL and base64 image inputs
  • Bounding box detection with confidence scores
  • Multi-label classification
  • High-performance caching layer