[Question] Object Detection running with UMat and/or OpenCL target noticeably slower #117

Open
@angryGoat500

Hey everyone

I have a question regarding the Transparent API / Preferable Target and hope someone can help me understand.

My object detection program takes much longer to process images when I use
net.setPreferableTarget(cv2.dnn.DNN_TARGET_OPENCL_FP16)
or
image = cv2.imread(filePath, cv2.COLOR_BGR2RGB)
uMat = cv2.UMat(image)

I've created 4 benchmark programs that run sequentially, each processing the same 10 .jpg files.

The first one, my baseline, is a standard OpenCV object detection program that uses neither setPreferableTarget nor the UMat class for images.
The second one sets the preferable target to cv2.dnn.DNN_TARGET_OPENCL_FP16.
The third one converts the images into UMat objects.
The fourth one sets the preferable target to cv2.dnn.DNN_TARGET_OPENCL_FP16 and converts the images into UMat objects.

In each case I measured the full processing time, starting before reading the image and ending after drawing the labels (excluding writing the output image or detection log), as well as the model inference time with
t, _ = net.getPerfProfile()
infTime = t / cv2.getTickFrequency()
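
A minimal sketch of the two measurements described above, using only the standard library so it runs without a model or images. The helper names `timed` and `ticks_to_seconds`, the stand-in `fake_pipeline`, and the file name `image_01.jpg` are hypothetical and not taken from the actual benchmark programs; in the real code the timed function would wrap reading the image, the forward pass, and drawing the labels, and the tick values would come from `net.getPerfProfile()` and `cv2.getTickFrequency()`.

```python
import time

def ticks_to_seconds(ticks, tick_frequency):
    """Convert a tick count to seconds (same arithmetic as t / cv2.getTickFrequency())."""
    return ticks / tick_frequency

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Hypothetical stand-in for the real per-image work
# (read image, run the forward pass, draw labels).
def fake_pipeline(path):
    return f"processed {path}"

result, full_time = timed(fake_pipeline, "image_01.jpg")
print(result, f"{full_time:.6f}s")
```

Timing the whole per-image function and the inference separately is what makes it possible to tell whether extra time is spent inside or outside the forward pass.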

The collected output is as follows:

Benchmark One Full Processing Time: 2.78063s
Benchmark One Model Inference Time: 1.030843s

Benchmark Two Full Processing Time: 3.2567s
Benchmark Two Model Inference Time: 1.12314s

Benchmark Three Full Processing Time: 12.76886s
Benchmark Three Model Inference Time: 10.83879s

Benchmark Four Full Processing Time: 13.43161047s
Benchmark Four Model Inference Time: 11.27375169s
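
To make these numbers easier to compare, here is a small sketch that splits each run into inference and non-inference time and expresses inference time relative to the baseline. The timings are copied from the output above; the function name `breakdown` and the dictionary labels are assumptions made for illustration.

```python
# Timings copied from the benchmark output above, in seconds: (full, inference).
timings = {
    "One (baseline)": (2.78063, 1.030843),
    "Two (OPENCL_FP16)": (3.2567, 1.12314),
    "Three (UMat)": (12.76886, 10.83879),
    "Four (OPENCL_FP16 + UMat)": (13.43161047, 11.27375169),
}

def breakdown(timings, baseline_key):
    """Return non-inference time and inference slowdown vs. the baseline per run."""
    _, base_inf = timings[baseline_key]
    rows = {}
    for name, (full, inf) in timings.items():
        rows[name] = {
            "non_inference_s": full - inf,         # everything outside the forward pass
            "inference_slowdown": inf / base_inf,  # 1.0 means same as baseline
        }
    return rows

for name, row in breakdown(timings, "One (baseline)").items():
    print(f"{name}: non-inference {row['non_inference_s']:.2f}s, "
          f"inference x{row['inference_slowdown']:.1f} vs baseline")
```

With these figures the non-inference share stays between roughly 1.7 s and 2.2 s in all four runs, while the reported inference time in runs Three and Four is roughly ten times the baseline, so nearly all of the extra time shows up inside the measured inference.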

Is the gap between CPU and GPU execution this large because of the data transfer between the processing units? Or am I missing something crucial?

If the gap can be explained by the data transfer, is there a way to "bundle" my workload to reduce the number of transfers?

I can provide the full code for these benchmark programs if that would be helpful.

Thanks in advance!
