Description
Summary
I'd like to propose an optimization for the disk cache cleanup process in SDWebImage. Currently, when the cache size exceeds the limit, the implementation sorts all cache files before deletion. This proposal suggests using a Min-heap data structure to improve performance when only a portion of files need to be removed.
Current Implementation
In SDDiskCache.m, the removeExpiredData method performs the following:
// Sort the remaining cache files by their last modification time or last access time (oldest first).
NSArray<NSURL *> *sortedFiles = [cacheFiles keysSortedByValueWithOptions:NSSortConcurrent
usingComparator:^NSComparisonResult(id obj1, id obj2) {
return [obj1[cacheContentDateKey] compare:obj2[cacheContentDateKey]];
}];
// Delete files until we fall below our desired cache size.
for (NSURL *fileURL in sortedFiles) {
if ([self.fileManager removeItemAtURL:fileURL error:nil]) {
// ... size tracking ...
if (currentCacheSize < desiredCacheSize) {
break;
}
}
}
While NSSortConcurrent provides excellent performance through parallel sorting, it still sorts the entire collection even when only a fraction of files need to be deleted.
Performance
Time Complexity Comparison
Current approach (Concurrent Sort): O(n log n) / p + O(k)
Where n = total files, k = files to delete, p = processor count
Proposed approach (Min-heap): O(n) + O(k log n)
Build heap: O(n)
Extract k minimum elements: O(k log n)
Additional Benefits Considering Cache Characteristics
Due to the nature of image caching, cache cleanup exhibits the following patterns:
- Frequent cleanup: The more frequently users use the app, the more often cache cleanup occurs
- Incremental growth: Since cache grows incrementally, the number of files to delete at each cleanup is relatively small