Is there an existing issue for this?
- I have searched the existing issues
Is your feature request related to a problem? Please describe.
🔧 Milvus Highlight Design (v0.2)
✅ Schema Definition
Highlight is enabled per field in the collection schema:
{
  "field_name": "content",
  "data_type": "VarChar",
  "enable_highlight": true,
  "highlight_config": {
    "mode": "bm25"  // one of "bm25", "colbert", "rerank"
  }
}
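A minimal sketch of how a client could check such a field definition before sending it. The helper name `validate_highlight_field` and its error message are hypothetical; only the keys (`enable_highlight`, `highlight_config`, `mode`) and the three mode names come from the proposal above.

```python
from typing import Optional

# Modes named in this proposal.
ALLOWED_MODES = {"bm25", "colbert", "rerank"}


def validate_highlight_field(field: dict) -> Optional[str]:
    """Return the highlight mode of a field, or None if highlighting is off.

    Hypothetical helper for illustration; not part of any existing client.
    """
    if not field.get("enable_highlight", False):
        return None
    mode = field.get("highlight_config", {}).get("mode")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported highlight mode: {mode!r}")
    return mode


field = {
    "field_name": "content",
    "data_type": "VarChar",
    "enable_highlight": True,
    "highlight_config": {"mode": "bm25"},
}
print(validate_highlight_field(field))
```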
🔍 Query API Extension
A highlight request must specify the target field:
{
  "highlight": {
    "field": "content",
    "top_k": 5  // optional, for rerank/colbert
  }
}
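The request fragment above could be assembled on the client side roughly as follows. This is only a sketch: `build_highlight_clause` is a hypothetical helper, and rejecting a non-positive `top_k` is an assumption, not something the proposal specifies.

```python
from typing import Optional


def build_highlight_clause(field: str, top_k: Optional[int] = None) -> dict:
    """Build the proposed "highlight" request fragment (hypothetical helper)."""
    if not field:
        raise ValueError("highlight requires a target field")
    clause = {"field": field}
    if top_k is not None:
        # Assumption: top_k must be a positive integer when given.
        if top_k <= 0:
            raise ValueError("top_k must be positive")
        clause["top_k"] = top_k
    return {"highlight": clause}


print(build_highlight_clause("content", top_k=5))
print(build_highlight_clause("content"))  # top_k omitted, e.g. for bm25
```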
🧾 Response Format
{
  "field": "content",
  "text": "...",
  "highlights": [
    {"start": 15, "end": 22, "score": 0.81}
  ]
}
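To illustrate how a client might consume this response, here is a small sketch that wraps each highlighted span in markers. It assumes `start` and `end` are character offsets into `text` with `end` exclusive, which the proposal does not state; `render_highlights` and the `<em>` tags are illustrative choices.

```python
def render_highlights(text, highlights, open_tag="<em>", close_tag="</em>"):
    """Wrap each highlighted span of `text` in open/close markers.

    Assumption: offsets are character positions and `end` is exclusive.
    Overlapping spans are skipped to keep the sketch simple.
    """
    out = []
    cursor = 0
    for h in sorted(highlights, key=lambda h: h["start"]):
        start, end = h["start"], h["end"]
        if start < cursor:
            continue  # overlaps a span that was already emitted
        out.append(text[cursor:start])
        out.append(open_tag + text[start:end] + close_tag)
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


response = {
    "field": "content",
    "text": "hello world",
    "highlights": [{"start": 6, "end": 11, "score": 0.81}],
}
print(render_highlights(response["text"], response["highlights"]))
```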
Modes
BM25 (keyword)
- Based on N-GRAM index.
- Highlights matched keywords.
- Lightweight and fast.
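As a rough illustration of keyword highlighting, the sketch below finds the character spans where query terms occur in a document. It uses a plain case-insensitive regular-expression scan rather than an N-GRAM index, and it splits the query on whitespace; both are simplifications, and `keyword_spans` is a hypothetical name.

```python
import re


def keyword_spans(text: str, query: str) -> list:
    """Return [{"start", "end"}] spans where any query term occurs in text.

    Stand-in for an index lookup: a case-insensitive substring scan.
    """
    terms = set(query.split())
    if not terms:
        return []
    # Try longer terms first so a short term cannot pre-empt a longer one.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)
    return [{"start": m.start(), "end": m.end()} for m in pattern.finditer(text)]


print(keyword_spans("Milvus is a vector database", "vector milvus"))
```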
ColBERT (multi-vector)
- Requires ColBERT-based index.
- Highlights document tokens with highest dot-product to query tokens.
- Token-level semantic match.
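The token selection described for this mode can be sketched in plain Python: score each document token by its highest dot product against any query token, then keep the top-scoring tokens. The function names and the toy 2-dimensional vectors are illustrative only; real ColBERT embeddings come from a model and are much larger. The sketch assumes at least one query vector.

```python
def max_sim_per_doc_token(query_vecs, doc_vecs):
    """For each document token, the highest dot product with any query token."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return [max(dot(d, q) for q in query_vecs) for d in doc_vecs]


def top_k_tokens(scores, k):
    """Indices of the k highest-scoring tokens, returned in document order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Toy embeddings: two query tokens, three document tokens.
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_vecs = [[0.9, 0.1], [0.2, 0.1], [0.0, 0.8]]

scores = max_sim_per_doc_token(query_vecs, doc_vecs)
print(top_k_tokens(scores, k=2))
```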
ReRank (cross-encoder)
- Works with rerank models like BGE-Reranker.
- Highlights tokens with high attention/logit score.
- High quality, slower.
See https://medium.com/%40hlealpablo/extracting-token-level-importance-from-cross-encoders-for-quick-simple-highlighting-ee4cca36764b for details
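Once a rerank model has produced a score per token, turning those scores into the proposed response entries is straightforward. The sketch below covers only that last step: the per-token scores and character offsets are taken as inputs (how they are extracted from the cross-encoder is out of scope here), and `scores_to_highlights` is a hypothetical helper.

```python
def scores_to_highlights(offsets, scores, top_k=5):
    """Keep the top_k highest-scoring tokens as {"start", "end", "score"} entries.

    offsets: list of (start, end) character offsets, one per token (assumed input).
    scores:  list of per-token importance scores from the model (assumed input).
    """
    if len(offsets) != len(scores):
        raise ValueError("offsets and scores must have the same length")
    ranked = sorted(zip(offsets, scores), key=lambda p: p[1], reverse=True)[:top_k]
    # Return entries in document order so they are easy to render.
    ranked.sort(key=lambda p: p[0][0])
    return [{"start": s, "end": e, "score": sc} for (s, e), sc in ranked]


offsets = [(0, 6), (7, 9), (10, 11), (12, 18)]
scores = [0.7, 0.1, 0.05, 0.81]
print(scores_to_highlights(offsets, scores, top_k=2))
```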
Describe the solution you'd like.
No response
Describe an alternate solution.
No response
Anything else? (Additional Context)
No response