Evaluation Metrics
Available API-level Evaluation Metrics
| Metric Name | Sub-metrics | Description |
|---|---|---|
| ndcg | NDCG@10, NDCG@100 | Normalized Discounted Cumulative Gain at different cut-off points, measuring the quality of the ranking results. |
| mrr | MRR@1000 | Mean Reciprocal Rank at a specified cut-off, indicating the average position of the first relevant result. |
| mAP | MAP@1000 | Mean Average Precision at a specified cut-off, evaluating precision across all relevant items. |
| precision | P@10 | Precision at a specified cut-off, measuring the proportion of relevant items among the top retrieved results. |
| recall | Recall@10, Recall@50 | Recall at different cut-off points, measuring the proportion of all relevant items that are retrieved. |
Additional evaluation metrics can be obtained by downloading the full evaluation results with the download API:
```python
from marqtune.client import Client

url = "https://marqtune.marqo.ai"
api_key = "{api_key}"

# Create a Marqtune client and download the results of a finished evaluation.
marqtune_client = Client(url=url, api_key=api_key)
marqtune_client.evaluation("evaluation_id").download()
```
```bash
curl --location 'https://marqtune.marqo.ai/evaluations/{evaluation_id}/download/url' \
  --header 'x-api-key: {api_key}'
```
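If you prefer to call the HTTP endpoint directly from Python, the sketch below requests the download URL and then fetches the results file with `requests`. The response field name (`download_url`) and the local output file name are assumptions made for illustration; check the actual response body returned by the endpoint before relying on them.

```python
import requests

MARQTUNE_URL = "https://marqtune.marqo.ai"
API_KEY = "{api_key}"
EVALUATION_ID = "{evaluation_id}"

# Ask the API for a download URL for the evaluation results.
response = requests.get(
    f"{MARQTUNE_URL}/evaluations/{EVALUATION_ID}/download/url",
    headers={"x-api-key": API_KEY},
)
response.raise_for_status()

# ASSUMPTION: the endpoint returns JSON containing a pre-signed URL under a
# key such as "download_url"; verify the field name in the real response.
download_url = response.json()["download_url"]

# Fetch the results file and save it locally.
results = requests.get(download_url)
results.raise_for_status()
with open("evaluation_results", "wb") as f:
    f.write(results.content)
```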
Detailed Description of Metrics
- NDCG (Normalized Discounted Cumulative Gain): This metric evaluates the effectiveness of ranking results by comparing the relevance of documents in the predicted order to an ideal ranking order. Higher values indicate better ranking quality.
- MRR (Mean Reciprocal Rank): MRR is the average, across queries, of the reciprocal rank of the first relevant result. The closer the value is to 1, the earlier the first relevant result tends to appear.
- MAP (Mean Average Precision): MAP is the mean of the average-precision scores across all queries, where average precision aggregates the precision at each rank at which a relevant result appears. It summarizes ranking precision over the full result list.
- Precision: Indicates the fraction of relevant results among the retrieved items at a specific rank threshold.
- Recall: Indicates the fraction of all possible relevant items that are successfully retrieved.
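To make these definitions concrete, here is a minimal, self-contained sketch of how the five metrics can be computed for a single query from a list of binary relevance labels. It illustrates the formulas only, not how Marqtune computes them internally; in particular, the ideal DCG here is taken over the retrieved list rather than the full set of relevant items.

```python
import math

def dcg_at_k(relevances, k):
    # Discounted cumulative gain: relevance discounted by log2 of the rank.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    # Normalise DCG by the DCG of the ideal (descending-relevance) ordering.
    # Simplification: the ideal ordering is built from the retrieved list only.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def reciprocal_rank(relevances, k):
    # 1 / rank of the first relevant result, or 0 if none appears in the top k.
    for i, rel in enumerate(relevances[:k]):
        if rel > 0:
            return 1.0 / (i + 1)
    return 0.0

def average_precision(relevances, k, total_relevant):
    # Sum of precision values at each rank holding a relevant item,
    # normalised by the total number of relevant items for the query.
    hits, score = 0, 0.0
    for i, rel in enumerate(relevances[:k]):
        if rel > 0:
            hits += 1
            score += hits / (i + 1)
    return score / total_relevant if total_relevant > 0 else 0.0

def precision_at_k(relevances, k):
    return sum(1 for rel in relevances[:k] if rel > 0) / k

def recall_at_k(relevances, k, total_relevant):
    hits = sum(1 for rel in relevances[:k] if rel > 0)
    return hits / total_relevant if total_relevant > 0 else 0.0

# Toy example: binary relevance labels for one query's top-5 ranking.
ranked_relevances = [1, 0, 1, 0, 0]  # ranks 1 and 3 are relevant
print(ndcg_at_k(ranked_relevances, 5))        # ranking quality vs. ideal order
print(reciprocal_rank(ranked_relevances, 5))  # 1.0 (first hit at rank 1)
print(average_precision(ranked_relevances, 5, total_relevant=3))
print(precision_at_k(ranked_relevances, 5))   # 2/5
print(recall_at_k(ranked_relevances, 5, total_relevant=3))  # 2/3
```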
Example: Evaluation Metrics Configuration
```python
evaluation_metrics = {
    "NDCG@10": "",
    "NDCG@100": "",
    "MRR@1000": "",
    "MAP@1000": "",
    "P@10": "",
    "Recall@10": "",
    "Recall@50": "",
}
```