The Wayback Machine - https://web.archive.org/web/20220126003712/https://github.com/dotnet/aspnetcore/pull/39743
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use IndexOfAny(char, char) with Span #39743

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

@martincostello
Copy link
Contributor

@martincostello martincostello commented Jan 25, 2022

Use ROS.IndexOfAny(char, char)

Use IndexOfAny(char, char) with ReadOnlySpan<char>.

Description

Use AsSpan().IndexOfAny(char, char) rather than pre-allocate a static array for use with IndexOfAny(params char[]) on a string.

I spotted the array in DefaultFileVersionProvider while working on #39738 so thought I'd see if there were any that could be removed in lieu of overloads that just took the characters without an array. This PR updates all the ones I found that were in projects with TFMs that supported it, except for the usages in RoutePatternFactory that use an array of 5 characters in multiple places.

I wrote a quick microbenchmark comparing the two approaches with .NET 6, and using the span version does appear to give a performance improvement.

Benchmarks

Results

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1466 (21H1/May2021Update)
Intel Core i9-9980HK CPU 2.40GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK=7.0.100-alpha.1.22068.17
  [Host]     : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT
  DefaultJob : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT

Method Mean Error StdDev Ratio RatioSD Allocated
IndexOfAny_String 8.037 ns 0.1816 ns 0.2018 ns 1.00 0.00 -
IndexOfAny_Span_Array 6.101 ns 0.1178 ns 0.1044 ns 0.76 0.02 -
IndexOfAny_Span_Two_Chars 5.013 ns 0.0341 ns 0.0302 ns 0.62 0.01 -

Code

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<IndexOfAnyBenchmarks>();

[MemoryDiagnoser]
public class IndexOfAnyBenchmarks
{
    private const string AUrlWithAPathAndQueryString = "http://www.example.com/path/to/file.html?query=string";
    private static readonly char[] QueryStringAndFragmentTokens = new[] { '?', '#' };

    [Benchmark(Baseline = true)]
    public int IndexOfAny_String()
    {
        return AUrlWithAPathAndQueryString.IndexOfAny(QueryStringAndFragmentTokens);
    }

    [Benchmark]
    public int IndexOfAny_Span_Array()
    {
        return AUrlWithAPathAndQueryString.AsSpan().IndexOfAny(QueryStringAndFragmentTokens);
    }

    [Benchmark]
    public int IndexOfAny_Span_Two_Chars()
    {
        return AUrlWithAPathAndQueryString.AsSpan().IndexOfAny('?', '#');
    }
}
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net6.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>disable</Nullable>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="BenchmarkDotNet" Version="0.13.1" />
  </ItemGroup>
</Project>
Use AsSpan().IndexOfAny(char, char) rather than pre-allocate an array for string.IndexOfAny(params char[]).
Copy link
Contributor

@javiercn javiercn left a comment

Looks good to me!

@martincostello
Copy link
Contributor Author

@martincostello martincostello commented Jan 25, 2022

I just realised that based on the benchmark result for IndexOfAny_Span_Array, RoutePatternFactory could still benefit from the addition of AsSpan() even if the static array isn't removed. I can push up an extra commit that does that if wanted.

@SteveSandersonMS
Copy link
Member

@SteveSandersonMS SteveSandersonMS commented Jan 25, 2022

Thanks for contributing this, @martincostello!

I can sort of speculate about why the IndexOfAny(char, char) variant might be faster, if the implementation is special-cased to looking for two chars. It would be like pre-unrolling the loop versus the char[] version, avoiding the bounds checks on each iteration. Or there might be something fancier in the implementation involving bitwise operations over the two input chars to discount most target chars in a single test.

However I don't have any intuition for why IndexOfAny(char[]) would be faster over a ROS<char> than a string. It seems like if that's true, the runtime could always do this for you and provide the same speed gain. Does anyone have a sense of why this would be faster? Is it some historical baggage in the string-based interpretation?

@martincostello
Copy link
Contributor Author

@martincostello martincostello commented Jan 25, 2022

However I don't have any intuition for why IndexOfAny(char[]) would be faster over a ROS<char> than a string. It seems like if that's true, the runtime could always do this for you and provide the same speed gain. Does anyone have a sense of why this would be faster? Is it some historical baggage in the string-based interpretation?

A fair question. Looking at the source, my guess would be that maybe it's due to the aggressive inlining? Otherwise the implementations seem to ultimately go to the same place to me.

https://github.com/dotnet/runtime/blob/8d0c3d1ab1f16c3a1d15b8d9eb25cc920d936694/src/libraries/System.Private.CoreLib/src/System/MemoryExtensions.cs#L92-L99

https://github.com/dotnet/runtime/blob/8d0c3d1ab1f16c3a1d15b8d9eb25cc920d936694/src/libraries/System.Private.CoreLib/src/System/String.Searching.cs#L90-L98

@SteveSandersonMS
Copy link
Member

@SteveSandersonMS SteveSandersonMS commented Jan 25, 2022

Looking at the source, my guess would be that maybe it's due to the aggressive inlining? Otherwise the implementations seem to ultimately go to the same place to me.

Agreed, that looks like the only difference (assuming that GetRawStringData also ends up being inlined). Another slight difference is that with mystring.IndexOf, the runtime has to do a nullcheck on mystring and possibly throw. Whereas with AsSpan(this mystring), the nullcheck is done in .NET code and it definitely never throws.

So if I'm reading this correctly, the optimization in this PR does have a behavioral difference, in that the old version would throw for a null input, and the new one won't. However I don't think there can be a null input so that's OK.

@SteveSandersonMS
Copy link
Member

@SteveSandersonMS SteveSandersonMS commented Jan 25, 2022

The change in Components/..../Router.cs looks fine to me.

The changes in RouteCollection.cs and DefaultFileVersionProvider.cs also look very safe but @pranavkm @halter73 do either of you want to approve or reject those?

@javiercn
Copy link
Contributor

@javiercn javiercn commented Jan 25, 2022

The changes in RouteCollection.cs

This is old routing, so it doesn't really impact anything :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants