Indexes
List indexes
GET /indexes
Example
curl http://localhost:8882/indexes
mq.get_indexes()
Response: 200 OK
{
"results": [
{
"index_name": "Book Collection"
},
{
"index_name": "Animal facts"
}
]
}
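
As a minimal sketch of how a caller might consume this response, the snippet below pulls the index names out of the JSON body shown above. The helper name `index_names` is hypothetical and not part of any client library; it only assumes the `results` / `index_name` shape from the sample response.

```python
import json
from typing import List

# Sample body copied from the response shown above.
response_body = """
{
  "results": [
    {"index_name": "Book Collection"},
    {"index_name": "Animal facts"}
  ]
}
"""


def index_names(body: str) -> List[str]:
    """Return the index names found in a GET /indexes response body."""
    payload = json.loads(body)
    return [entry["index_name"] for entry in payload["results"]]


names = index_names(response_body)
print(names)  # ['Book Collection', 'Animal facts']
```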
Create index
Index settings can be supplied when the index is created. The default values are listed in the tables below.
POST /indexes/{index_name}
Create an index with (optional) settings.
This endpoint accepts the application/json content type.
Path parameters
Name | Type | Description |
---|---|---|
index_name | String | Name of the index |
Body Parameters
The settings for the index. The settings are represented as a nested JSON object.
Name | Type | Default value | Description |
---|---|---|---|
index_defaults | Dictionary | "" | The index defaults object |
number_of_shards | Integer | 5 | The number of shards for the index |
number_of_replicas | Integer | 1 | The number of replicas for the index |
Index Defaults Object
The index_defaults object contains the default settings for the index. The parameters are as follows:
Name | Type | Default value | Description |
---|---|---|---|
treat_urls_and_pointers_as_images | Boolean | "" | Fetch images from pointers |
model | String | hf/all_datasets_v4_MiniLM-L6 | The model to use for the index |
normalize_embeddings | Boolean | true | Normalize the embeddings to have unit length |
text_preprocessing | Dictionary | "" | The text preprocessing object |
image_preprocessing | Dictionary | "" | The image preprocessing object |
Text Preprocessing Object
The text_preprocessing object contains the specifics of how you want the index to preprocess text. The parameters are as follows:
Name | Type | Default value | Description |
---|---|---|---|
split_length | Integer | 2 | The length of the chunks after splitting by split_method |
split_overlap | Integer | 0 | The length of overlap between adjacent chunks |
split_method | String | sentence | The method by which text is chunked ('character', 'word', 'sentence', 'passage') |
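
To illustrate how split_length and split_overlap interact, here is a small sketch that groups already-split units (for example sentences) into chunks. It is not the server's implementation: the function `chunk_units` is hypothetical, and it assumes that each new chunk starts `split_length - split_overlap` units after the previous one.

```python
from typing import List


def chunk_units(units: List[str], split_length: int = 2, split_overlap: int = 0) -> List[List[str]]:
    """Group units into chunks of split_length, sharing split_overlap units
    between adjacent chunks (illustrative only)."""
    if split_length <= 0:
        raise ValueError("split_length must be positive")
    if not 0 <= split_overlap < split_length:
        raise ValueError("split_overlap must be in [0, split_length)")

    step = split_length - split_overlap  # distance between chunk starts
    chunks = []
    for start in range(0, len(units), step):
        chunks.append(units[start:start + split_length])
        if start + split_length >= len(units):
            break  # the last chunk already reaches the end of the text
    return chunks


sentences = ["A.", "B.", "C.", "D.", "E."]
print(chunk_units(sentences))                   # [['A.', 'B.'], ['C.', 'D.'], ['E.']]
print(chunk_units(sentences, split_overlap=1))  # [['A.', 'B.'], ['B.', 'C.'], ['C.', 'D.'], ['D.', 'E.']]
```

With the defaults from the table (split_method of sentence, split_length of 2, split_overlap of 0), each chunk in this sketch holds two consecutive sentences and no sentence appears in more than one chunk.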
Image Preprocessing Object
The image_preprocessing object contains the specifics of how you want the index to preprocess images. The parameters are as follows:
Name | Type | Default value | Description |
---|---|---|---|
patch_method | String | null | The method by which images are chunked ('simple' and 'frcnn') |
Below is a sample index settings JSON object. When using the Python client, pass this dictionary as the settings_dict parameter for the create_index method.
{
"index_defaults": {
"treat_urls_and_pointers_as_images": false,
"model": "hf/all_datasets_v4_MiniLM-L6",
"normalize_embeddings": true,
"text_preprocessing": {
"split_length": 2,
"split_overlap": 0,
"split_method": "sentence"
},
"image_preprocessing": {
"patch_method": null
}
},
"number_of_shards": 5
}
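
If you only want to change a few values, one option is to build the full settings dictionary on the client side by overlaying your changes on the sample above. The sketch below does that with a recursive merge. `DEFAULT_SETTINGS` and `merge_settings` are hypothetical names introduced for illustration, and whether the server itself fills in omitted keys is not something this sketch relies on.

```python
import copy
from typing import Any, Dict

# Copied from the sample settings object above.
DEFAULT_SETTINGS: Dict[str, Any] = {
    "index_defaults": {
        "treat_urls_and_pointers_as_images": False,
        "model": "hf/all_datasets_v4_MiniLM-L6",
        "normalize_embeddings": True,
        "text_preprocessing": {
            "split_length": 2,
            "split_overlap": 0,
            "split_method": "sentence",
        },
        "image_preprocessing": {"patch_method": None},
    },
    "number_of_shards": 5,
}


def merge_settings(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of defaults with overrides applied, merging nested dicts."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


custom_settings = merge_settings(
    DEFAULT_SETTINGS,
    {"index_defaults": {"text_preprocessing": {"split_method": "word"}}, "number_of_shards": 3},
)
print(custom_settings["index_defaults"]["text_preprocessing"])
```

The merged dictionary keeps every key from the sample and changes only split_method and number_of_shards, so it can be passed wherever a complete settings object is expected.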
Example
curl -XPOST 'http://localhost:8882/indexes/my-first-index' -H 'Content-type:application/json' -d '
{
"index_defaults": {
"treat_urls_and_pointers_as_images": false,
"model": "hf/all_datasets_v4_MiniLM-L6",
"normalize_embeddings": true,
"text_preprocessing": {
"split_length": 2,
"split_overlap": 0,
"split_method": "sentence"
},
"image_preprocessing": {
"patch_method": null
}
},
"number_of_shards": 5
}'
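import marqo

# Connect to the same Marqo instance used in the cURL example.
mq = marqo.Client(url="http://localhost:8882")

index_settings = {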
"index_defaults": {
"treat_urls_and_pointers_as_images": False,
"model": "hf/all_datasets_v4_MiniLM-L6",
"normalize_embeddings": True,
"text_preprocessing": {
"split_length": 2,
"split_overlap": 0,
"split_method": "sentence"
},
"image_preprocessing": {
"patch_method": None
}
},
"number_of_shards": 5
}
mq.create_index("my-first-index", settings_dict=index_settings)
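
For callers that use neither cURL nor the Python client, the request can be assembled with the standard library. The sketch below only builds the request object and does not send it. The helper `build_create_index_request` is a hypothetical name, and percent-encoding the index name (relevant for names containing spaces, such as "Book Collection") is an assumption about how such names should appear in the path.

```python
import json
from typing import Any, Dict
from urllib.parse import quote
from urllib.request import Request


def build_create_index_request(base_url: str, index_name: str, settings: Dict[str, Any]) -> Request:
    """Build (but do not send) a POST /indexes/{index_name} request."""
    # Percent-encode the name so characters such as spaces are safe in the path.
    url = f"{base_url.rstrip('/')}/indexes/{quote(index_name, safe='')}"
    body = json.dumps(settings).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_create_index_request(
    "http://localhost:8882", "Book Collection", {"number_of_shards": 5}
)
print(request.get_method(), request.full_url)
# POST http://localhost:8882/indexes/Book%20Collection
```

Passing the returned object to `urllib.request.urlopen` would send it to a running instance.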