Faster ParseHeaders #39216

EgorBo · 2021-12-28T16:12:01Z

Judging by native traces from the Platform-Plaintext TE benchmark (linux-x64), we seem to spend a noticeable amount of time in ParseHeaders.

The current algorithm walks the span to find "\r\n", then extracts the name and value while running validation checks and trimming the value. My implementation walks the span just once: it first uses a kind of IndexOfAny(span, ':', ' ', '\t', '\n', '\r') function to find the position of ':' and to make sure no illegal characters come before it.
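The single-pass idea can be sketched in plain C. This is only an illustration under stated assumptions, not the code in this PR: the names `header_line`, `index_of_delimiter` and `parse_header_line` are hypothetical, the search is scalar where the PR uses a vectorized one, and the validation is reduced to the checks mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type; the slices point into the input buffer. */
typedef struct {
    const char *name;
    size_t name_len;
    const char *value;
    size_t value_len;
    size_t consumed; /* bytes up to and including the trailing "\r\n" */
} header_line;

/* Scalar stand-in for the IndexOfAny-style search: returns the index of
   the first ':', ' ', '\t', '\r' or '\n', or len if none is present. */
static size_t index_of_delimiter(const char *p, size_t len) {
    for (size_t i = 0; i < len; i++) {
        switch (p[i]) {
        case ':': case ' ': case '\t': case '\r': case '\n':
            return i;
        default:
            break;
        }
    }
    return len;
}

static bool is_ows(char c) { return c == ' ' || c == '\t'; }

/* Parses one "Name: value\r\n" line with one forward scan plus a short
   backward trim. Returns false on malformed or incomplete input. */
static bool parse_header_line(const char *buf, size_t len, header_line *out) {
    size_t colon = index_of_delimiter(buf, len);
    /* The first delimiter must be ':' and the name must be non-empty;
       whitespace or a line break before the colon is rejected. */
    if (colon == 0 || colon == len || buf[colon] != ':') return false;

    size_t i = colon + 1;
    while (i < len && is_ows(buf[i])) i++; /* skip leading whitespace */
    size_t value_start = i;

    /* Keep walking forward to the terminator; a bare CR or LF is rejected. */
    while (i < len && buf[i] != '\r' && buf[i] != '\n') i++;
    if (i + 1 >= len || buf[i] != '\r' || buf[i + 1] != '\n') return false;

    size_t value_end = i;
    while (value_end > value_start && is_ows(buf[value_end - 1])) value_end--;

    out->name = buf;
    out->name_len = colon;
    out->value = buf + value_start;
    out->value_len = value_end - value_start;
    out->consumed = i + 2;
    return true;
}
```

A caller would invoke `parse_header_line` repeatedly, advancing the buffer by `consumed` after each successful call. A `false` result covers both malformed and incomplete input in this sketch, so a real parser would need to tell those two cases apart.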

I haven't added Arm64 support yet (waiting for feedback on this approach first) and haven't compared AVX vs SSE, but what I have so far shows fairly stable improvements for Platform-Plaintext (up to roughly +250,000 RPS):

|                        |    baseline |   mychanges |    diff |
| ---------------------- | ----------: | ----------: | ------: |
| CPU Usage (%)          |          92 |          93 |  +1.09% |
| Cores usage (%)        |       2,590 |       2,617 |  +1.04% |
| Working Set (MB)       |          37 |          37 |   0.00% |
| Private Memory (MB)    |         370 |         370 |   0.00% |
| Start Time (ms)        |           0 |           0 |         |
| First Request (ms)     |          62 |          60 |  -3.23% |
| Requests/sec           |  11,625,062 |  11,875,956 |  +2.16% |
| Requests               | 175,435,163 | 179,301,568 |  +2.20% |
| Mean latency (ms)      |        1.21 |        1.16 |  -4.13% |
| Max latency (ms)       |       53.73 |       61.02 | +13.57% |
| Bad responses          |           0 |           0 |         |
| Socket errors          |           0 |           0 |         |
| Read throughput (MB/s) |    1,392.64 |    1,423.36 |  +2.21% |
| Latency 50th (ms)      |        0.71 |        0.69 |  -2.95% |
| Latency 75th (ms)      |        1.07 |        1.04 |  -2.80% |
| Latency 90th (ms)      |        1.79 |        1.78 |  -0.56% |
| Latency 99th (ms)      |       13.91 |       12.92 |  -7.12% |

(The max RPS for my changes was around 11,913,453 req/s.)

Test methodology

I was using crank like this:

crank --profile aspnet-citrine-lin \
 --application.framework net7.0 \
 --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/platform.benchmarks.yml \
 --scenario plaintext \
 --json t1.json \
 --application.options.outputFiles "/path/to/Microsoft.AspNetCore.Server.Kestrel.Core.dll"

where Microsoft.AspNetCore.Server.Kestrel.Core.dll is either the "baseline" build or "baseline + my changes".

Also, I didn't optimize the multi-span case (where a header is split between two reads), but from what I see it happens quite rarely (I can print some statistics from our benchmarks if you need them).

Faster ParseHeaders

260f34c

msftbot bot added the area-runtime label Dec 28, 2021
