Image Retrieval with IBM watsonx.data and Milvus (Vector) Database: A Deep Dive into Similarity Search

Bhargav Mankad
IBM Data Science in Practice
5 min read · Apr 9, 2024


What is Milvus?

Milvus is an open-source vector database specifically designed for efficient similarity search across large datasets. Unlike traditional relational databases, Milvus excels at storing and searching high-dimensional vectors, which are mathematical representations that capture the essence of an image.

The Power of Embeddings

To leverage Milvus for image retrieval, we don’t store the raw images themselves. Instead, we use pre-trained deep learning models like VGG or ResNet to extract feature vectors from the images. These vectors act as condensed representations, encapsulating the image’s visual content.
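For intuition, here is a minimal, stand-alone sketch of what that feature extraction looks like using timm (the same model family the pipeline below relies on). The image path is only an illustrative placeholder, and this snippet assumes timm, torch, and Pillow are installed:

# Minimal sketch: turn one image into a 2048-dimensional ResNet-50 feature vector.
import timm
import torch
from PIL import Image

model = timm.create_model('resnet50', pretrained=True, num_classes=0)  # num_classes=0 -> pooled features
model.eval()

config = timm.data.resolve_data_config({}, model=model)
transform = timm.data.create_transform(**config)            # the model's expected preprocessing

img = Image.open('some_image.JPEG').convert('RGB')           # placeholder path
with torch.no_grad():
    vec = model(transform(img).unsqueeze(0)).squeeze(0)      # shape: (2048,) -> matches DIM below
print(vec.shape)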

Image retrieval search architecture

The architecture follows a typical machine learning workflow for image retrieval. Here’s a simplified explanation of how it works:

  1. The user submits a query image or a text description of the image they want to find.
  2. The system selects an appropriate image recognition model and processing pipeline based on the input.
  3. The input image is pre-processed and converted into an image embedding using the Towhee pipeline.
  4. The image embedding is then queried against the Milvus database containing embeddings of all the images in the dataset.
  5. The system retrieves the most similar images using a nearest neighbour search and presents them to the user (a bare-bones pymilvus sketch of this query step follows the list).
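To make steps 4 and 5 concrete, here is a bare-bones sketch of the underlying nearest-neighbour query using the pymilvus client directly. It assumes an already-connected Milvus instance and a populated reverse_image_search collection like the one built later in this post; the zero vector is only a placeholder for a real embedding. The later sections wrap this step in a Towhee operator.

from pymilvus import Collection

collection = Collection('reverse_image_search')
collection.load()

query_vec = [[0.0] * 2048]                      # placeholder for one 2048-d image embedding
results = collection.search(
    data=query_vec,
    anns_field='embedding',                     # the vector field defined in the collection schema
    param={'metric_type': 'L2', 'params': {'nprobe': 16}},
    limit=10,                                   # top-k most similar images
    output_fields=['path'],
)
for hit in results[0]:
    print(hit.entity.get('path'), hit.distance)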

Building the Image Search Pipeline

1. Data Preparation

Here we use a subset of the ImageNet dataset (100 classes). You can run the commands below to download and unzip the data. The example data is organized as follows:

- train: directory of candidate images, 10 images per class from ImageNet train data

- test: directory of query images, 1 image per class from ImageNet test data

- reverse_image_search.csv: a csv file containing *id, path, and label* for each candidate image

curl -L https://github.com/towhee-io/examples/releases/download/data/reverse_image_search.zip -O
unzip -q -o reverse_image_search.zip
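As a quick sanity check, you can peek at the first few rows of the metadata CSV (its columns follow the id, path, label layout described above):

import csv

# Print the header plus the first three data rows (id, path, label)
with open('reverse_image_search.csv') as f:
    reader = csv.reader(f)
    for i, row in enumerate(reader):
        print(row)
        if i == 3:
            break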

2. Import Necessary Packages and Libraries

In this section, we will learn how to build the image search engine using Towhee. Towhee is a framework that provides ETL for unstructured data using state-of-the-art (SoTA) machine learning models. It lets you create data processing pipelines and ships with built-in operators for different purposes, such as generating image embeddings, inserting data into a Milvus collection, and querying a Milvus collection.

We import packages and set parameters at the beginning. You can change these parameters to suit your needs and environment. Please note that the embedding dimension `DIM` must match the output of the selected model `MODEL`.

! python -m pip install -q towhee opencv-python pillow pymilvus
import csv
from glob import glob
from pathlib import Path
from statistics import mean

from towhee import pipe, ops, DataCollection
from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection, utility


# Towhee parameters
MODEL = 'resnet50'
DEVICE = None # if None, use default device (cuda is enabled if available)

# Milvus parameters
HOST = '<Milvus-host-name received from watsonx.data service>'
PORT = '<port from watsonx.data Milvus service>'
SERVER_NAME = '<server-name>'
USER = 'ibmlhapikey'
PASSWORD = 'your-ibm-api-key'
TOPK = 10
DIM = 2048 # dimension of embedding extracted by MODEL
COLLECTION_NAME = 'reverse_image_search'
INDEX_TYPE = 'IVF_FLAT'
METRIC_TYPE = 'L2'

# path to csv (column_1 indicates image path) OR a pattern of image paths
INSERT_SRC = 'reverse_image_search.csv'
QUERY_SRC = './test/*/*.JPEG'

3. Feature Extraction

  • Select a pre-trained deep learning model for image feature extraction (e.g., VGG16, ResNet-50).
  • Use a deep learning library like PyTorch or TensorFlow to load the model and extract feature vectors for each image.
# Load image paths from a CSV file (second column) or from a glob pattern
def load_image(x):
    if x.endswith('csv'):
        with open(x) as f:
            reader = csv.reader(f)
            next(reader)  # skip the header row
            for item in reader:
                yield item[1]  # 'path' column
    else:
        for item in glob(x):
            yield item

# Embedding pipeline: decode each image and extract a feature vector with the timm model
p_embed = (
    pipe.input('src')
        .flat_map('src', 'img_path', load_image)
        .map('img_path', 'img', ops.image_decode())
        .map('img', 'vec', ops.image_embedding.timm(model_name=MODEL, device=DEVICE))
)
  • Store the extracted feature vectors along with a unique identifier for each image (e.g., filename).

4. Build Search Pipeline

There are three steps to building the image search pipeline:

4.1 Create a Milvus collection

  • Define a schema for your collection in Milvus, specifying data types for image IDs and feature vectors (usually floats).
  • Create a collection in Milvus using the defined schema.
  • Use the Milvus client library (Python, Java, etc.) to insert the image IDs and their corresponding feature vectors into the collection.
  • Build an index on the feature vectors in Milvus. This optimizes search performance by creating a data structure for efficient similarity search.
# Create Milvus collection (drop first if it already exists)
def create_milvus_collection(collection_name, dim):
    if utility.has_collection(collection_name):
        utility.drop_collection(collection_name)

    fields = [
        FieldSchema(name='path', dtype=DataType.VARCHAR, description='path to image', max_length=500,
                    is_primary=True, auto_id=False),
        FieldSchema(name='embedding', dtype=DataType.FLOAT_VECTOR, description='image embedding vectors', dim=dim)
    ]
    schema = CollectionSchema(fields=fields, description='reverse image search')
    collection = Collection(name=collection_name, schema=schema)

    index_params = {
        'metric_type': METRIC_TYPE,
        'index_type': INDEX_TYPE,
        'params': {"nlist": 2048}
    }
    collection.create_index(field_name='embedding', index_params=index_params)
    return collection

Connect to Milvus using `HOST`, `PORT`, and the other connection parameters, then create a collection with `COLLECTION_NAME` and `DIM`:

try:
    connections.connect(host=HOST, port=PORT, secure=True, server_name=SERVER_NAME, user=USER, password=PASSWORD)
    print('Milvus Database connected successfully.')

    # Create collection
    collection = create_milvus_collection(COLLECTION_NAME, DIM)
    print(f'A new collection created: {COLLECTION_NAME}')

except Exception as e:
    print(f'Error connecting to Milvus Database: {e}')

4.2 Data Insert

This step uses an insert pipeline to write image embeddings into the Milvus collection.

# Insert pipeline: write (path, embedding) pairs into the Milvus collection
p_insert = (
    p_embed.map(('img_path', 'vec'), 'mr', ops.ann_insert.milvus_client(
        host=HOST,
        port=PORT,
        user=USER,
        password=PASSWORD,
        collection_name=COLLECTION_NAME
    ))
    .output('mr')
)

Insert all candidate images from `INSERT_SRC`:

# Insert data
p_insert(INSERT_SRC)
print('Insert successful')

4.3 Search pipeline

  • Preprocess the query image following the same steps as data preparation.
  • Use the same deep learning model to extract the feature vector for the query image.
  • Use the Milvus client library to submit the query vector for a nearest neighbour search. Specify the desired number of nearest neighbours to retrieve.
  • Milvus searches the collection based on the chosen distance metric (e.g., L2 distance) and returns the IDs of the nearest neighbour feature vectors.
# Search pipeline: query each embedding against Milvus and collect the predicted image paths
p_search_pre = (
    p_embed.map('vec', 'search_res', ops.ann_search.milvus_client(
        host=HOST, port=PORT, secure=True,
        server_name=SERVER_NAME,
        user=USER,
        password=PASSWORD, limit=TOPK,
        collection_name=COLLECTION_NAME))
    .map('search_res', 'pred', lambda x: [str(Path(y[0]).resolve()) for y in x])
)
p_search = p_search_pre.output('img_path', 'pred')
  • Use the retrieved IDs (image paths in this example) to fetch the corresponding image information (e.g., filenames) from your storage system.
  • Display the retrieved images to the user, ranked based on their similarity to the query image.

Query an example image, `test/goldfish/*.JPEG`:

# Search for example query image(s)
collection.load()
dc = p_search('test/goldfish/*.JPEG')

# Display search results with image paths
DataCollection(dc).show()
import cv2
from towhee.types.image import Image

# Read the result images from disk so they can be displayed alongside the query image
def read_images(img_paths):
    imgs = []
    for p in img_paths:
        imgs.append(Image(cv2.imread(p), 'BGR'))
    return imgs

# Extend the search pipeline to also load the predicted images for display
p_search_img = (
    p_search_pre.map('pred', 'pred_images', read_images)
    .output('img', 'pred_images')
)
DataCollection(p_search_img('test/goldfish/*.JPEG')).show()

The query output would look like the example below, where `img` is the input search image and `pred_images` are the images retrieved from Milvus by nearest neighbour search.

Benefits of Milvus for Image Retrieval

  • Scalability: Milvus handles massive image collections efficiently, making it ideal for large-scale applications.
  • Performance: Vector similarity search is significantly faster compared to traditional methods, enabling real-time image retrieval.
  • Flexibility: Milvus supports various distance metrics and indexing algorithms, allowing you to customize the search to your specific needs (see the sketch after this list).
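As a brief illustration of that flexibility, the snippet below swaps the IVF_FLAT/L2 index used in this post for HNSW with inner-product similarity. It reuses the collection object from earlier; which index types are available can depend on your Milvus deployment, so treat this as a sketch rather than a drop-in change.

# Rebuild the vector index with a different algorithm and metric (sketch)
hnsw_index = {
    'index_type': 'HNSW',
    'metric_type': 'IP',                        # inner product instead of L2 distance
    'params': {'M': 16, 'efConstruction': 200}
}
collection.release()                            # the collection must be released before re-indexing
collection.drop_index()
collection.create_index(field_name='embedding', index_params=hnsw_index)
collection.load()

# Searches must then use the matching metric and HNSW search params
search_params = {'metric_type': 'IP', 'params': {'ef': 64}}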

Conclusion

With its focus on efficient vector similarity search, Milvus empowers you to build robust and scalable image retrieval systems. Whether you're managing a personal photo library or developing a commercial image search application, Milvus offers a powerful foundation for unlocking the hidden potential within your image collections.
