Skip to content

Optimize disk cache cleanup using Min-heap for partial sorting #3815

Open
@KYHyeon

Description

@KYHyeon

Summary

I'd like to propose an optimization for the disk cache cleanup process in SDWebImage. Currently, when the cache size exceeds the limit, the implementation sorts all cache files before deletion. This proposal suggests using a Min-heap data structure to improve performance when only a portion of files need to be removed.

Current Implementation

In SDDiskCache.m, the removeExpiredData method performs the following:

// Sort the remaining cache files by their last modification time or last access time (oldest first).
NSArray<NSURL *> *sortedFiles = [cacheFiles keysSortedByValueWithOptions:NSSortConcurrent
                                                         usingComparator:^NSComparisonResult(id obj1, id obj2) {
                                                             return [obj1[cacheContentDateKey] compare:obj2[cacheContentDateKey]];
                                                         }];

// Delete files until we fall below our desired cache size.
for (NSURL *fileURL in sortedFiles) {
    if ([self.fileManager removeItemAtURL:fileURL error:nil]) {
        // ... size tracking ...
        if (currentCacheSize < desiredCacheSize) {
            break;
        }
    }
}

While NSSortConcurrent provides excellent performance through parallel sorting, it still sorts the entire collection even when only a fraction of files need to be deleted.

Performance

Time Complexity Comparison

Current approach (Concurrent Sort): O(n log n) / p + O(k)
Where n = total files, k = files to delete, p = processor count

Proposed approach (Min-heap): O(n) + O(k log n)
Build heap: O(n)
Extract k minimum elements: O(k log n)

Additional Benefits Considering Cache Characteristics

Due to the nature of image caching, cache cleanup exhibits the following patterns:

  1. Frequent cleanup: The more frequently users use the app, the more often cache cleanup occurs
  2. Incremental growth: Since cache grows incrementally, the number of files to delete at each cleanup is relatively small

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions