
[Bug]: The query node seems to be returning offset+limit rows instead of trimming the offset #42559

Open
@mcamou

Description

Is there an existing issue for this?

- [x] I have searched the existing issues

Environment

- Milvus version: 2.5.1
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka): pulsar
- SDK version(e.g. pymilvus v2.0.0rc2): go-milvus 2.5.3
- OS(Ubuntu or CentOS): Ubuntu

Current Behavior

We have a use case (schema modifications) where we need to query all the data in a collection and insert it into a new one. The collection currently holds ~1.2M rows. Our approach is to run a Query with an empty filter expression in a loop, using Offset and Limit to fetch 10k-row batches (a sketch of the loop follows the trace below). This works well in standalone mode, but in cluster mode the tenth iteration fails with the following error:

```
/workspace/source/pkg/tracer/stack_trace.go:51 github.com/milvus-io/milvus/pkg/v2/tracer.StackTrace
/workspace/source/internal/util/grpcclient/client.go:575 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call
/workspace/source/internal/util/grpcclient/client.go:589 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall
/workspace/source/internal/distributed/querynode/client/client.go:106 github.com/milvus-io/milvus/internal/distributed/querynode/client.wrapGrpcCall[...]
/workspace/source/internal/distributed/querynode/client/client.go:231 github.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).Query
/workspace/source/internal/proxy/task_query.go:612 github.com/milvus-io/milvus/internal/proxy.(*queryTask).queryShard
/workspace/source/internal/proxy/lb_policy.go:210 github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).ExecuteWithRetry.func1
/workspace/source/pkg/util/retry/retry.go:44 github.com/milvus-io/milvus/pkg/v2/util/retry.Do
/workspace/source/internal/proxy/lb_policy.go:179 github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).ExecuteWithRetry
/workspace/source/internal/proxy/lb_policy.go:246 github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).Execute.func1: rpc error: code = ResourceExhausted desc = grpc: trying to send message larger than max (548853608 vs. 536870912)
```
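
For context, the batching loop looks roughly like this. This is a minimal sketch assuming the v2 Go client (`github.com/milvus-io/milvus/client/v2/milvusclient`); the address, collection name, and output fields are placeholders, and exact option names may vary slightly between client versions:

```go
package main

import (
	"context"
	"log"

	"github.com/milvus-io/milvus/client/v2/milvusclient"
)

const batchSize = 10000

func main() {
	ctx := context.Background()

	// Placeholder address; we connect to the proxy of the cluster deployment.
	cli, err := milvusclient.New(ctx, &milvusclient.ClientConfig{Address: "localhost:19530"})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close(ctx)

	// Page through the whole collection with an empty filter expression,
	// advancing the offset by one batch each iteration.
	for offset := 0; ; offset += batchSize {
		rs, err := cli.Query(ctx, milvusclient.NewQueryOption("source_collection").
			WithFilter("").
			WithOutputFields("*").
			WithOffset(offset).
			WithLimit(batchSize))
		if err != nil {
			// In cluster mode this fails on the tenth iteration (offset 90k)
			// with the ResourceExhausted error shown above.
			log.Fatalf("query at offset %d: %v", offset, err)
		}
		if rs.ResultCount == 0 {
			break // all rows copied
		}
		// ... insert this batch into the new collection here ...
		log.Printf("fetched %d rows at offset %d", rs.ResultCount, offset)
	}
}
```

Note that nothing on the client side grows with the iteration count: the limit stays fixed at 10k, only the offset increases.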

After some code sleuthing I tried raising the gRPC limits and narrowed it down to queryNode.grpc.{clientMaxRecvSize,serverMaxSendSize}. When I doubled them (setting both to 1 GiB = 1073741824 bytes), the number of records I was able to retrieve also doubled, to 200k. This leads me to believe that the queryNode executes the query, materializes offset+limit rows, and sends them all back over gRPC, instead of trimming off the first offset rows before responding.
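
For reference, this is a sketch of the milvus.yaml override we applied, assuming the key paths map one-to-one onto the dotted queryNode.grpc.* settings named above:

```yaml
# Sketch of the milvus.yaml override (values in bytes). The default for both
# keys is 536870912 (512 MiB), which matches the "max" in the error above.
queryNode:
  grpc:
    clientMaxRecvSize: 1073741824  # 1 GiB
    serverMaxSendSize: 1073741824  # 1 GiB
```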

Expected Behavior

I should be able to page through any number of records with Offset and Limit, as long as each batch (the limit) is reasonably sized; the response size should scale with the limit, not with offset+limit.

Steps To Reproduce

Milvus Log

No response

Anything else?

No response

Labels

help wanted, kind/bug
