PERF: faster Hog computation #4957
@WeiChungChang thanks for this contribution! Apologies that you haven't received a response yet. On first glance this is great! I'd like to request a few things:
After that I think we should be good, maybe some small tweaks to the code but nothing major. Thank you again!
if fast == 1:
    print("use fast")
else:
    print("use original")
jni
Sep 10, 2020
Contributor
Suggested change
if fast == 1:
    print("use fast")
else:
    print("use original")
The current HOG implementation uses three nested loops. For example, with 9 bins and 128 * 128 cells:
outer loop - iterates over the bins.
middle loop - iterates over the rows.
inner loop - iterates over the columns.
However, two loops should be better:
outer loop - iterates over the rows.
inner loop - iterates over the columns.
Then divide each orientation by number_of_orientations_per_180; the integer part of the quotient is the index of the bin that the pixel contributes to.
For example, if number_of_orientations_per_180 is 20 (180 degrees divided into 9 bins, so each bin spans 20 degrees):
a pixel with orientation = 45 falls into bin floor(45/20) = 2 (counting from 0).
a pixel with orientation = 18 falls into bin floor(18/20) = 0.
and so on.
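The difference between the two loop structures can be sketched in plain Python as follows. This is only an illustration, not the code from the patch: the function names, the nested-list inputs, and the single whole-grid histogram (instead of per-cell histograms) are assumptions made to keep the example short, and orientations are assumed to lie in [0, 180).

```python
def hist_three_loops(orientation, magnitude, n_bins=9):
    """Scan the whole grid once per bin (bins -> rows -> columns)."""
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for b in range(n_bins):
        lo, hi = b * bin_width, (b + 1) * bin_width
        for r in range(len(orientation)):
            for c in range(len(orientation[r])):
                # Only pixels whose orientation lies in this bin contribute.
                if lo <= orientation[r][c] < hi:
                    hist[b] += magnitude[r][c]
    return hist


def hist_two_loops(orientation, magnitude, n_bins=9):
    """Visit each pixel once and compute its bin index directly."""
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for r in range(len(orientation)):
        for c in range(len(orientation[r])):
            # Assumes 0 <= orientation < 180, so the index is always valid.
            b = int(orientation[r][c] // bin_width)
            hist[b] += magnitude[r][c]
    return hist


# Hypothetical sample data: orientations in degrees and gradient magnitudes.
orientation = [[45.0, 18.0], [100.0, 179.0]]
magnitude = [[1.0, 2.0], [3.0, 4.0]]
print(hist_two_loops(orientation, magnitude))
```

The two-loop version touches each pixel once, while the three-loop version touches each pixel once per bin, which is where the saving is expected to come from.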
In my experiments, this approach speeds up the whole flow by about 10% to 20%.
Since the execution time is dominated by (1) computing the bin distribution for each cell and (2) normalization,
this implies that step (1) alone is sped up by roughly 20% to 40% (maybe more), if the two steps take comparable time.
I have provided a test patch and attached a script to measure the difference in output between the two flows.
A rough test with the attached script shows that the difference between the original flow and the refined flow is small enough to be ignored: measured as sum(abs(diff)), it is far smaller than 0.01 for a 4K image.
The proposed patch keeps the original flow, so a performance comparison can be run by calling hog(....., fast=True) to select the two-loop method.
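A comparison of this kind could be structured roughly as below. This is a sketch, not the attached script: compare_flows and the two stand-in callables are hypothetical, and in practice the callables would wrap the original and the two-loop computation on the same image.

```python
import time


def compare_flows(reference_fn, candidate_fn, data):
    """Return (sum of absolute differences, reference seconds, candidate seconds)."""
    start = time.perf_counter()
    reference = reference_fn(data)
    reference_seconds = time.perf_counter() - start

    start = time.perf_counter()
    candidate = candidate_fn(data)
    candidate_seconds = time.perf_counter() - start

    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same length")
    # Same metric as in the description: sum(abs(diff)) over all features.
    total_abs_diff = sum(abs(a - b) for a, b in zip(reference, candidate))
    return total_abs_diff, reference_seconds, candidate_seconds


# Stand-in callables for illustration only (hypothetical, not the HOG code).
def reference_flow(values):
    return [v * 0.5 for v in values]


def candidate_flow(values):
    return [v / 2 for v in values]


diff, t_ref, t_new = compare_flows(reference_flow, candidate_flow, [1.0, 2.0, 3.0])
print(f"sum(abs(diff)) = {diff}, reference {t_ref:.6f}s, candidate {t_new:.6f}s")
```

A single timed run is noisy, so for a real measurement it would be safer to repeat each call several times and compare the best or median time.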
Please consider reworking the current logic to provide better performance.