Implementing Typeahead Search

Typeahead (also known as search-as-you-type or autocomplete) helps users find what they're looking for by providing intelligent query suggestions as they type. This reduces zero-result searches, speeds up the search process, and improves user experience.

What is Typeahead?

Typeahead provides real-time query suggestions based on partial user input. As users type, the system returns a list of suggested complete queries that match their input, helping them:

  • Find relevant queries faster
  • Discover popular search terms
  • Avoid typos and spelling mistakes
  • Navigate to high-intent content quickly

Marqo's typeahead uses a two-stage process:

  1. Retrieval: Match indexed queries using prefix matching and fuzzy search
  2. Ranking: Score suggestions using popularity and relevance (BM25) to surface the best matches
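
To make the retrieval stage concrete, here is a toy sketch of prefix matching over a small query corpus. It is an illustration only, not Marqo's implementation: the function name `prefix_candidates` and the rule that only the last typed token is treated as a prefix are assumptions, and fuzzy matching and ranking are left out.

```python
def prefix_candidates(partial: str, corpus: list[str]) -> list[str]:
    """Return corpus queries that match the partially typed input.

    Assumption for this sketch: every fully typed token must appear in the
    query, and the last (possibly unfinished) token must be a prefix of some
    token in the query.
    """
    typed = partial.lower().split()
    if not typed:
        return []
    *complete, last = typed
    matches = []
    for query in corpus:
        tokens = query.lower().split()
        has_complete = all(token in tokens for token in complete)
        has_prefix = any(token.startswith(last) for token in tokens)
        if has_complete and has_prefix:
            matches.append(query)
    return matches


corpus = [
    "wireless bluetooth headphones",
    "iphone 15 case clear",
    "running shoes nike air",
    "coffee machine espresso",
]
print(prefix_candidates("head", corpus))
print(prefix_candidates("nike sh", corpus))
```

The candidates returned by a stage like this would then be passed to the ranking stage, which orders them by popularity and relevance.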

Setting Up Typeahead

Prerequisites

Typeahead is supported in Marqo 2.24.0 and later. You'll need an existing Marqo index. Typeahead works with any index type and requires no extra setup during index creation.

The first step is building your suggestion corpus by indexing popular search queries. These typically come from:

  • Search logs and analytics
  • Popular product names
  • Common search patterns
  • Trending terms
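
As a rough sketch of how raw search logs might be turned into a suggestion corpus, the snippet below counts logged queries and emits the payload shape used by the indexing requests in this guide. The input format (a plain list of query strings), the function name `build_suggestion_payload`, and the `min_count` and `top_n` parameters are assumptions for illustration. Using the raw hit count as `popularity` mirrors the example payloads, where the two values are equal.

```python
from collections import Counter


def build_suggestion_payload(logged_queries, min_count=2, top_n=1000):
    """Aggregate logged queries into a {"queries": [...]} payload.

    Queries are lowercased and whitespace-normalized before counting, so
    "iPhone  15 case" and "iphone 15 case" are treated as the same query.
    """
    counts = Counter(
        " ".join(query.lower().split())
        for query in logged_queries
        if query.strip()
    )
    return {
        "queries": [
            {
                "query": query,
                "popularity": float(count),
                "metadata": {"hitCount": count},
            }
            for query, count in counts.most_common(top_n)
            if count >= min_count  # drop one-off queries
        ]
    }


logs = ["iPhone 15 case", "iphone  15 case", "coffee machine", "   "]
print(build_suggestion_payload(logs))
```

Dropping rare queries with `min_count` keeps one-off typos out of the corpus. The right threshold depends on your traffic volume.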

Indexing Queries from Search Logs

curl -XPOST 'http://localhost:8882/indexes/ecommerce-products/suggestions/queries' \
  -H 'Content-type:application/json' -d '
{
  "queries": [
    {
      "query": "wireless bluetooth headphones",
      "popularity": 245.0,
      "metadata": {
        "hitCount": 245
      }
    },
    {
      "query": "iphone 15 case clear",
      "popularity": 189.0,
      "metadata": {
        "hitCount": 189
      }
    },
    {
      "query": "running shoes nike air",
      "popularity": 156.0,
      "metadata": {
        "hitCount": 156
      }
    },
    {
      "query": "coffee machine espresso",
      "popularity": 134.0,
      "metadata": {
        "hitCount": 134
      }
    }
  ]
}'

For Marqo Cloud, replace your_endpoint with the endpoint of your index and pass your API key in the x-api-key header:

curl -XPOST 'your_endpoint/indexes/ecommerce-products/suggestions/queries' \
-H 'x-api-key: XXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
  "queries": [
    {
      "query": "wireless bluetooth headphones",
      "popularity": 245.0,
      "metadata": {
        "hitCount": 245
      }
    },
    {
      "query": "iphone 15 case clear",
      "popularity": 189.0,
      "metadata": {
        "hitCount": 189
      }
    },
    {
      "query": "running shoes nike air",
      "popularity": 156.0,
      "metadata": {
        "hitCount": 156
      }
    },
    {
      "query": "coffee machine espresso",
      "popularity": 134.0,
      "metadata": {
        "hitCount": 134
      }
    }
  ]
}'

Batch Processing Tips

When indexing large numbers of queries:

  • Process in batches of up to 128 queries per request (default limit)
  • Monitor the response for any errors in the errors array
  • Consider implementing retry logic for failed requests
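
The tips above can be sketched as a small batching helper. This is a minimal example under stated assumptions: the helper names (`chunked`, `post_batch`, `index_queries`), the retry count, and the exponential backoff are illustrative choices, and the response is assumed to be a JSON object whose `errors` field is a list. It uses only the standard library so it runs without extra dependencies.

```python
import json
import time
import urllib.request
from typing import Callable, Iterator, Optional

BATCH_SIZE = 128  # default per-request limit mentioned above


def chunked(items: list, size: int = BATCH_SIZE) -> Iterator[list]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def post_batch(url: str, batch: list, api_key: Optional[str] = None) -> dict:
    """POST one batch of queries as JSON and return the parsed response."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["x-api-key"] = api_key
    body = json.dumps({"queries": batch}).encode("utf-8")
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def index_queries(
    queries: list,
    send: Callable[[list], dict],
    max_retries: int = 3,
    backoff_seconds: float = 1.0,
) -> list:
    """Send queries in batches, retrying failed requests.

    Returns every entry found in the `errors` array of the responses.
    """
    reported_errors = []
    for batch in chunked(queries):
        for attempt in range(1, max_retries + 1):
            try:
                result = send(batch)
                break
            except OSError:  # urllib's URLError and HTTPError subclass OSError
                if attempt == max_retries:
                    raise
                time.sleep(backoff_seconds * 2 ** (attempt - 1))
        reported_errors.extend(result.get("errors") or [])
    return reported_errors
```

To wire it up, pass a sender bound to your endpoint, for example `index_queries(queries, lambda batch: post_batch(url, batch, api_key))`, where `url` is the `suggestions/queries` endpoint of your index. Passing the sender as an argument keeps the batching and retry logic testable without a network connection.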

Implementing Search-as-you-type

Frontend HTML Page

Create a simple HTML page with typeahead functionality. This creates a search interface where typing displays a dropdown list of suggestions below the input field:

Typeahead Interface

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Typeahead Search</title>
    <style>
        .search-container {
            width: 400px;
            margin: 50px auto;
            position: relative;
        }

        #search-input {
            width: 100%;
            padding: 10px;
            font-size: 16px;
            border: 2px solid #ddd;
            border-radius: 4px;
        }

        #suggestions {
            position: absolute;
            top: 100%;
            left: 0;
            right: 0;
            background: white;
            border: 1px solid #ddd;
            border-top: none;
            max-height: 200px;
            overflow-y: auto;
            z-index: 1000;
        }

        .suggestion-item {
            padding: 10px;
            cursor: pointer;
            border-bottom: 1px solid #eee;
        }

        .suggestion-item:hover {
            background-color: #f0f0f0;
        }
    </style>
</head>
<body>
    <div class="search-container">
        <input type="text" id="search-input" placeholder="Search for products..." />
        <div id="suggestions"></div>
    </div>

    <script>
        let debounceTimeout;

        document.getElementById('search-input').addEventListener('input', function(e) {
            const query = e.target.value.trim();
            const suggestionsDiv = document.getElementById('suggestions');

            clearTimeout(debounceTimeout);

            debounceTimeout = setTimeout(async () => {
                if (query.length >= 2) {
                    try {
                        const response = await fetch('/api/suggestions', {
                            method: 'POST',
                            headers: {
                                'Content-Type': 'application/json'
                            },
                            body: JSON.stringify({
                                q: query,
                                limit: 8,
                                fuzzyEditDistance: 1,
                                minFuzzyMatchLength: 10,
                                popularityWeight: 0.4,
                                bm25Weight: 0.6
                            })
                        });

                        const data = await response.json();
                        displaySuggestions(data.suggestions || [], suggestionsDiv);
                    } catch (error) {
                        console.error('Error fetching suggestions:', error);
                        suggestionsDiv.innerHTML = '';
                    }
                } else {
                    suggestionsDiv.innerHTML = '';
                }
            }, 200);
        });

        function displaySuggestions(suggestions, container) {
            container.innerHTML = '';

            suggestions.forEach(suggestion => {
                const div = document.createElement('div');
                div.className = 'suggestion-item';
                div.textContent = suggestion.suggestion;

                div.addEventListener('click', () => {
                    document.getElementById('search-input').value = suggestion.suggestion;
                    container.innerHTML = '';
                    // Redirect to search results
                    window.location.href = `/search?q=${encodeURIComponent(suggestion.suggestion)}`;
                });

                container.appendChild(div);
            });
        }
    </script>
</body>
</html>

Python Proxy Server

Create a Python Flask server that proxies requests to Marqo Cloud, so the API key stays on the server rather than in the browser:

from flask import Flask, request, jsonify
from markupsafe import escape
import requests
import os

app = Flask(__name__)

# Configuration. The environment variable names are examples, not a Marqo
# convention; the fallback strings are placeholders to replace.
MARQO_ENDPOINT = os.environ.get("MARQO_ENDPOINT", "your_endpoint")  # Your Marqo Cloud endpoint
MARQO_API_KEY = os.environ.get("MARQO_API_KEY", "your_api_key")  # Your API key
INDEX_NAME = "ecommerce-products"


@app.route("/")
def index():
    # Serve the HTML page (you can also serve as a static file)
    with open("index.html", "r", encoding="utf-8") as f:
        return f.read()


@app.route("/api/suggestions", methods=["POST"])
def get_suggestions():
    # Tolerate a missing or malformed JSON body instead of raising
    data = request.get_json(silent=True) or {}
    query = data.get("q", "").strip()

    if len(query) < 2:
        return jsonify({"suggestions": []})

    # Prepare request to Marqo Cloud
    marqo_url = f"{MARQO_ENDPOINT}/indexes/{INDEX_NAME}/suggestions"
    headers = {"x-api-key": MARQO_API_KEY, "Content-Type": "application/json"}

    payload = {
        "q": query,
        "limit": data.get("limit", 8),
        "fuzzyEditDistance": data.get("fuzzyEditDistance", 2),
        "popularityWeight": data.get("popularityWeight", 0.4),
        "bm25Weight": data.get("bm25Weight", 0.6),
    }
    # Forward minFuzzyMatchLength only when the frontend sends it
    if "minFuzzyMatchLength" in data:
        payload["minFuzzyMatchLength"] = data["minFuzzyMatchLength"]

    try:
        # Make request to Marqo Cloud; the timeout keeps a slow upstream
        # from blocking the proxy indefinitely
        response = requests.post(marqo_url, json=payload, headers=headers, timeout=5)
        response.raise_for_status()

        return jsonify(response.json())

    except requests.RequestException as e:
        app.logger.error(f"Error calling Marqo API: {str(e)}")
        return jsonify({"suggestions": []}), 500


@app.route("/search")
def search():
    query = request.args.get("q", "")
    # Implement your search results page here.
    # Escape the query so user input is never rendered as HTML.
    return f"<h1>Search Results for: {escape(query)}</h1><p>Implement your search results here.</p>"


if __name__ == "__main__":
    app.run(debug=True)  # Debug mode is for local development only

Tuning Suggestions

Adjusting Fuzzy Matching

Fuzzy matching helps handle typos and variations. Tune these parameters based on your use case:

Conservative settings (fuzzyEditDistance 1, minFuzzyMatchLength 4), good for technical terms and model numbers:

curl -XPOST 'your_endpoint/indexes/ecommerce-products/suggestions' \
-H 'x-api-key: XXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
  "q": "iphn",
  "limit": 8,
  "fuzzyEditDistance": 1,
  "minFuzzyMatchLength": 4,
  "popularityWeight": 0.4,
  "bm25Weight": 0.6
}'

More permissive settings (fuzzyEditDistance 3, minFuzzyMatchLength 2), good for general search and brand names:

curl -XPOST 'your_endpoint/indexes/ecommerce-products/suggestions' \
-H 'x-api-key: XXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
  "q": "bluetoth",
  "limit": 8,
  "fuzzyEditDistance": 3,
  "minFuzzyMatchLength": 2,
  "popularityWeight": 0.4,
  "bm25Weight": 0.6
}'
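
To get a feel for what an edit distance of 1 to 3 allows, the sketch below computes the standard Levenshtein distance (insertions, deletions, and substitutions each cost 1). Whether Marqo's fuzzy matching uses exactly this metric is an assumption; the function is only meant to help reason about the fuzzyEditDistance values above.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            substitution_cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,  # delete ch_a
                    current[j - 1] + 1,  # insert ch_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("bluetoth", "bluetooth"))  # 1: one missing "o"
print(levenshtein("iphn", "iphone"))  # 2: missing "o" and "e"
```

Higher distances tolerate more typos but also admit more unrelated terms, which is why short inputs usually warrant a smaller distance.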

Balancing Popularity and Relevance

The scoring system combines popularity (how often queries are searched) with BM25 relevance (how well the input matches the query text):

Popularity-heavy (0.8 popularity, 0.2 BM25), suited to trending topics and seasonal items:

curl -XPOST 'your_endpoint/indexes/ecommerce-products/suggestions' \
-H 'x-api-key: XXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
  "q": "headphones",
  "limit": 8,
  "fuzzyEditDistance": 2,
  "popularityWeight": 0.8,
  "bm25Weight": 0.2
}'

Relevance-heavy (0.2 popularity, 0.8 BM25), better for precise matching and technical searches:

curl -XPOST 'your_endpoint/indexes/ecommerce-products/suggestions' \
-H 'x-api-key: XXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
  "q": "headphones",
  "limit": 8,
  "fuzzyEditDistance": 2,
  "popularityWeight": 0.2,
  "bm25Weight": 0.8
}'

Balanced (0.4 popularity, 0.6 BM25), a good general-purpose setting:

curl -XPOST 'your_endpoint/indexes/ecommerce-products/suggestions' \
-H 'x-api-key: XXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
  "q": "headphones",
  "limit": 8,
  "fuzzyEditDistance": 2,
  "popularityWeight": 0.4,
  "bm25Weight": 0.6
}'
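
To see how the two weights can shift the ordering, here is a toy scoring sketch. It assumes the final score is a weighted sum of popularity and BM25, each normalized by the maximum among the candidates. The exact formula Marqo uses is not stated in this guide, so treat the function `blend_scores`, the normalization, and the sample BM25 values as hypothetical.

```python
def blend_scores(candidates, popularity_weight=0.4, bm25_weight=0.6):
    """Rank (query, popularity, bm25) tuples by a weighted, normalized sum."""
    if not candidates:
        return []
    # Normalize each signal to [0, 1] so the weights are comparable.
    max_popularity = max(p for _, p, _ in candidates) or 1.0
    max_bm25 = max(b for _, _, b in candidates) or 1.0
    scored = [
        (
            query,
            popularity_weight * popularity / max_popularity
            + bm25_weight * bm25 / max_bm25,
        )
        for query, popularity, bm25 in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical candidates: a popular query with a weaker text match and
# a rarely searched query with a stronger text match.
candidates = [
    ("wireless bluetooth headphones", 245.0, 1.0),
    ("headphones stand", 20.0, 2.0),
]
print(blend_scores(candidates, popularity_weight=0.8, bm25_weight=0.2)[0][0])
print(blend_scores(candidates, popularity_weight=0.2, bm25_weight=0.8)[0][0])
```

With the popularity-heavy weights the frequently searched query ranks first. With the relevance-heavy weights the closer text match overtakes it.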

Best Practices

Query Management

  • Index popular queries from search logs regularly
  • Update popularity scores based on actual usage patterns
  • Remove outdated or irrelevant queries periodically
  • Add seasonal/trending queries proactively
  • Process query updates in batches of up to 128 queries

Configuration

  • Start with balanced weights (0.4 popularity, 0.6 BM25)
  • Use fuzzy distance of 1-2 for most use cases
  • Set min fuzzy length to 3+ for better precision
  • Limit suggestions to 5-10 for optimal UX

Performance Optimization

  • Implement caching strategies for frequently requested queries
  • Use debouncing on frontend (200-300ms)
  • Make suggestion requests asynchronously so typing is never blocked on the network
  • Monitor response times and adjust accordingly
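
One possible caching strategy is a small in-memory cache with a time-to-live, placed in front of the suggestions call. The sketch below is an assumption-laden example, not part of Marqo: the class name `TTLCache`, the 60-second default, and the entry limit are illustrative, and a shared cache such as Redis would be needed once the proxy runs as multiple processes.

```python
import time
from collections import OrderedDict


class TTLCache:
    """In-memory cache with per-entry expiry and least-recently-used eviction."""

    def __init__(self, ttl_seconds=60.0, max_entries=1000, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._max_entries = max_entries
        self._clock = clock  # injectable so expiry can be tested
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        self._entries.move_to_end(key)  # mark as recently used
        return value

    def set(self, key, value):
        """Store a value and evict the least recently used entries if full."""
        self._entries[key] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)


cache = TTLCache(ttl_seconds=60.0)
cache.set("head", ["wireless bluetooth headphones"])
print(cache.get("head"))
```

In a proxy, the cache key would typically combine the normalized input with the request parameters, so that different weights or limits do not share a cached result. A short TTL keeps suggestions fresh after the query corpus is updated.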

User Experience

  • Show suggestions after 2+ characters
  • Highlight matching text in suggestions
  • Provide keyboard navigation (up/down arrows)
  • Handle empty states gracefully

Monitoring and Analytics

  • Track which suggestions users actually select
  • Monitor suggestion quality, for example the share of searches that start from a selected suggestion
  • Analyze user behavior patterns to improve suggestion relevance
  • Set up alerts for performance degradation

Typeahead is a powerful tool for improving the search experience. Following these practices, monitoring performance, and regularly analyzing user behavior will help you build a suggestion system that gets users to what they're looking for faster.