Replies: 3 comments 1 reply
-
@Ygg01 Section 3.1.2 (which covers about 2 pages) describes vectorized classification in detail. Have you read all of the text (about a full page)? Do not start from the figure; that is the wrong way to work it out. You must read the text first. You may also enjoy these posts:
-
First off, thanks for that lovely document. I read it and watched the presentation: pure 10/10 stuff.
(Section 3.1.2, pp. 7-8) Emphasis mine. When I compare it to the code from simd-json.rs or simdjson, it's not what I expect to see.
-8, 0, 17, 2, 0, 4, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0 // from document
+8, 0, 18, 4, 0, 1, 0, 1, 0, 0, 0, 3, 2, 1, 0, 0 // from code
Doing some backwards calculation, I can compute that the desired values for the classifier were shifted a bit.
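To make the comparison concrete, here is a minimal sketch that decomposes each differing entry of the two tables quoted above into its set bits. The helper names `set_bits` and `diff_tables` are made up for illustration; only the two tables come from this thread.

```python
# High-nibble tables exactly as quoted in this thread.
DOC_HIGH = [8, 0, 17, 2, 0, 4, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0]   # from document
CODE_HIGH = [8, 0, 18, 4, 0, 1, 0, 1, 0, 0, 0, 3, 2, 1, 0, 0]  # from code


def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [bit for bit in range(8) if value & (1 << bit)]


def diff_tables(doc, code):
    """Return (index, doc_bits, code_bits) for every entry that differs."""
    return [
        (index, set_bits(d), set_bits(c))
        for index, (d, c) in enumerate(zip(doc, code))
        if d != c
    ]


for index, doc_bits, code_bits in diff_tables(DOC_HIGH, CODE_HIGH):
    print(f"high nibble 0x{index:x}: document bits {doc_bits}, code bits {code_bits}")
```

Entries 0x2, 0x3, 0x5 and 0x7 keep the same number of set bits but at different positions, which is consistent with the class bits having been renumbered. Entries 0xb, 0xc and 0xd are nonzero only in the code, and this comparison alone does not say what they represent.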
However, that doesn't explain
To simplify my questions:
-
I think there is a misunderstanding. My problem is with the code, not the paper. The paper's high nibbles and my own high nibbles agree. Alright, let me show my work. I tried to find them independently for my own code. For these classifier values:
I constructed the following low/high nibble table. I
Based on that, I get the following values: low nibbles
Verification step: I will see what simdjson uses (lines 50 to 51 in bf78341).
The low nibbles are correct, but the high nibble values differ. Why? Additionally, the nibbles differ on bytes
Question: What is causing the difference? Am I missing a step? Or is this an error in the high nibbles?
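For anyone following along, the table construction described above can be sketched as a scalar model: OR each character's class bits into the entry for its low nibble and the entry for its high nibble, then classify a byte by ANDing the two lookups. The class bit values below (comma = 1, colon = 2, brackets/braces = 4, tab/LF/CR = 8, space = 16) are an assumption based on my reading of the paper, not something confirmed in this thread, and `build_tables` and `classify` are hypothetical helper names.

```python
# ASSUMPTION: class bit assignment as I understand it from the paper.
COMMA, COLON, BRACKET, WHITESPACE, SPACE = 1, 2, 4, 8, 16

CLASSES = {
    0x2C: COMMA,                                             # ,
    0x3A: COLON,                                             # :
    0x5B: BRACKET, 0x5D: BRACKET,                            # [ ]
    0x7B: BRACKET, 0x7D: BRACKET,                            # { }
    0x09: WHITESPACE, 0x0A: WHITESPACE, 0x0D: WHITESPACE,    # tab, LF, CR
    0x20: SPACE,                                             # space
}


def build_tables(classes):
    """OR each byte's class bits into its low-nibble and high-nibble entries."""
    low = [0] * 16
    high = [0] * 16
    for byte, bits in classes.items():
        low[byte & 0x0F] |= bits
        high[byte >> 4] |= bits
    return low, high


def classify(byte, low, high):
    """Scalar model of the vectorized lookup: AND of the two table entries."""
    return low[byte & 0x0F] & high[byte >> 4]


low, high = build_tables(CLASSES)
print("low nibbles: ", low)
print("high nibbles:", high)

# Bytes whose AND-ed result disagrees with the intended class.
mismatches = [b for b in range(256) if classify(b, low, high) != CLASSES.get(b, 0)]
print("mismatches:", mismatches)
```

Under these assumed class bits, the high table comes out as the paper's `[8, 0, 17, 2, 0, 4, 0, 4, 0, ...]` and no byte is misclassified. With a different character set, the same mismatch check shows whether two characters sharing a nibble produce false positives.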
-
Reading the simdjson paper (p. 8), I note that the high nibble table is
[8, 0, 17, 2, 0, 4, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0]
while the high nibble value is the one in simdjson/src/arm64.cpp, line 51 in bf78341.
I was able to reverse engineer this, somewhat...
My question is: where do 3, 2, 1 come from? I ask because I have a different set of characters to check, and I am not sure how to adapt it. Is this used for UTF-8 tests?