Description
Hey everyone,
I have a question regarding the Transparent API / Preferable Target and hope someone can help me understand.
My Object Detection program takes a lot longer to process images when using
net.setPreferableTarget(cv2.dnn.DNN_TARGET_OPENCL_FP16)
or
image = cv2.imread(filePath, cv2.COLOR_BGR2RGB)
uMat = cv2.UMat(image)
I've created four benchmark programs that run sequentially, each processing the same 10 .jpg files.
My baseline is a standard OpenCV object detection program that uses neither setPreferableTarget nor the UMat class for images.
The second one sets setPreferableTarget to cv2.dnn.DNN_TARGET_OPENCL_FP16.
The third converts the images into UMat objects.
The fourth sets setPreferableTarget to cv2.dnn.DNN_TARGET_OPENCL_FP16 and converts the images into UMat objects (roughly sketched below).
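For context, the four variants differ only in this setup, roughly sketched below. The model/config paths, filePath and the helper names are placeholders, not my actual files or code:

import cv2

# Simplified sketch of how the four benchmark variants differ.
# "frozen_inference_graph.pb", "graph.pbtxt" and filePath are placeholders.
def load_net(use_fp16_target):
    net = cv2.dnn.readNet("frozen_inference_graph.pb", "graph.pbtxt")
    if use_fp16_target:
        net.setPreferableTarget(cv2.dnn.DNN_TARGET_OPENCL_FP16)
    return net

def load_image(filePath, use_umat):
    image = cv2.imread(filePath)
    return cv2.UMat(image) if use_umat else image

# Benchmark One:   load_net(False), load_image(path, False)
# Benchmark Two:   load_net(True),  load_image(path, False)
# Benchmark Three: load_net(False), load_image(path, True)
# Benchmark Four:  load_net(True),  load_image(path, True)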
For each run I measured the full processing time, starting before the image is read and ending after the labels are drawn (excluding writing the output image and the detection log), as well as the model inference time via
t, _ = net.getPerfProfile()
infTime = t / cv2.getTickFrequency()
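Concretely, each file is measured roughly like this. detect() and drawLabels() are just stand-ins for my actual detection and drawing code, filePath for the image path:

import cv2
import time

start = time.perf_counter()             # full processing time starts here

image = cv2.imread(filePath)            # read the image
outputs = detect(net, image)            # blob creation + net.forward(), placeholder
drawLabels(image, outputs)              # draw boxes and labels, placeholder

fullTime = time.perf_counter() - start  # ends after drawing the labels

t, _ = net.getPerfProfile()             # inference time reported by the DNN module
infTime = t / cv2.getTickFrequency()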
The collected output is as follows:
Benchmark One:   Full Processing Time 2.78063 s,     Model Inference Time 1.030843 s
Benchmark Two:   Full Processing Time 3.2567 s,      Model Inference Time 1.12314 s
Benchmark Three: Full Processing Time 12.76886 s,    Model Inference Time 10.83879 s
Benchmark Four:  Full Processing Time 13.43161047 s, Model Inference Time 11.27375169 s
Is there such a large gap between CPU and GPU execution because of the data transfer between the processing units? Am I missing something crucial?
If the gap can be explained by the data transfer, is there a way to "bundle" my workload to reduce the number of transfers?
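To make the "bundling" idea concrete, here is roughly what I imagine for ordinary image operations: upload once, keep intermediate results on the device, download once at the end. This is just a sketch with a placeholder path, not code I'm running:

import cv2

image = cv2.imread("example.jpg")   # placeholder path
u = cv2.UMat(image)                 # one upload to the OpenCL device

u = cv2.resize(u, (640, 640))       # result stays a UMat on the device
u = cv2.GaussianBlur(u, (3, 3), 0)  # still on the device, no download in between

result = u.get()                    # single download back to a numpy array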
I can provide the full code for these benchmark programs if it would be helpful.
Thanks in advance!