flattened json access for the tape #91
Comments
I'm not super familiar with pikkr and have got to look at the paper it references once I find the time for it. I think the tape parsing That said, again I've got to do a bit more research to say one way or the other whether it'd bump performance, and I sure will :) thanks for bringing this up!
I want to do something like this. Example JSON:
{
"name": "Licenser",
"skills": {
"language": "Rust"
}
}
In order to get the language. The parser takes this flattened key
I'm keeping this open for tracking <3. Thanks for looking into the issue.
With the tape that shouldn't be too hard, I think. It'd just be a traversal of the array while keeping nesting in check. I really like the idea; it'd allow some flexibility on access. I'll mark it as a good first issue and help wanted. If you or anyone is interested in grabbing it, I'll gladly put some time aside to pair on it or help otherwise.
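As a rough illustration of the "traversal with nesting kept in check" idea, here is a minimal sketch. It assumes a hypothetical, simplified tape where containers only record how many children follow them; the `Node` enum and the `skip`, `field` and `index` helpers are made up for this example and are not simd-json's actual types or API.

```rust
/// Hypothetical, simplified tape node (an assumption for this sketch,
/// not the real simd-json tape layout).
#[derive(Debug, Clone, PartialEq)]
pub enum Node<'a> {
    /// An object; `len` key/value pairs follow on the tape.
    Object { len: usize },
    /// An array; `len` elements follow on the tape.
    Array { len: usize },
    String(&'a str),
    Number(f64),
    Bool(bool),
    Null,
}

/// Returns the tape position just past the value starting at `start`.
/// `pending` counts the values still to be consumed, which is how the
/// nesting is tracked without recursion.
pub fn skip(tape: &[Node<'_>], start: usize) -> Option<usize> {
    let mut pending = 1usize;
    let mut i = start;
    while pending > 0 {
        let node = tape.get(i)?;
        pending -= 1;
        match node {
            // Each pair contributes a key node and a value.
            Node::Object { len } => pending += 2 * len,
            Node::Array { len } => pending += len,
            _ => {}
        }
        i += 1;
    }
    Some(i)
}

/// Position of the value stored under `key` in the object at `at`.
pub fn field(tape: &[Node<'_>], at: usize, key: &str) -> Option<usize> {
    let len = match tape.get(at)? {
        Node::Object { len } => *len,
        _ => return None,
    };
    let mut i = at + 1;
    for _ in 0..len {
        let is_match = matches!(tape.get(i)?, Node::String(k) if *k == key);
        let value = i + 1;
        if is_match {
            return Some(value);
        }
        // Not our key: jump over the whole (possibly nested) value.
        i = skip(tape, value)?;
    }
    None
}

/// Position of element `n` in the array at `at`.
pub fn index(tape: &[Node<'_>], at: usize, n: usize) -> Option<usize> {
    let len = match tape.get(at)? {
        Node::Array { len } => *len,
        _ => return None,
    };
    if n >= len {
        return None;
    }
    let mut i = at + 1;
    for _ in 0..n {
        i = skip(tape, i)?;
    }
    Some(i)
}
```

With the example JSON from above laid out on such a tape, `field(&tape, 0, "skills")` followed by `field(&tape, pos, "language")` would land on the `"Rust"` string node. A tape that also stores the total node count per container could replace `skip` with a single addition.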
Renamed (as it moved from question to feature) and assigned to the 0.3 goal.
@Licenser I was just looking at pikkr's benchmark(s), it looks like we might be able to do a quick apples-to-apples comparison pretty easily. https://github.com/pikkr/pikkr/blob/master/benches/parser.rs
@balajijinnah I also wanted to mention that a comparison of the approaches (not necessarily the current implementations) is provided in the "related work" section of @lemire's SIMDJSON paper: https://arxiv.org/pdf/1902.08318.pdf
Thank you for sharing this!
Also, a JSONpath tool is part of SIMDJSON; presumably, this could be ported to simdjson-rs: https://github.com/lemire/simdjson/blob/master/tools/jsonpointer.cpp
There is also this for benchmarks: https://github.com/pikkr/rust-json-parser-benchmark
I put a 'looking for contributors' out: https://users.rust-lang.org/t/twir-call-for-participation/4821/285 - the issue is nicely self-contained and a great chance for someone to get their feet wet and perhaps learn or practice some Rust :)
Hello, I would like to take this feature. Is anyone already working on it?
Hi @miker1423 that's awesome :) and no, not to my knowledge. Sunny and I stayed away from it since it's such a nice one to get started with. If you have any questions or get stuck, feel free to ask any time! When you open a PR just let us know how you prefer the review and what your goal is: if it's about learning we'll gladly go over it line by line and add suggestions, if it's about contributing then we'll have it through with as little hassle as possible :).
Thanks! I'll start as soon as possible.
Hello!
First of all, no worries :) life happens to all of us and it should always have priority, we totally understand! Pest vs. hand-rolled is a tough question. The syntax of jsonpath is quite simple compared to a full language, so building a custom parser isn't prohibitive (and might result in simpler code?). It also improves build time, since we can skip building pest itself. On the other hand, pest can be handy to make the grammar bulletproof, and since it's a well-known entity it might make it easier for people down the road to understand, and it probably has better error messages out of the box. If I were writing this I'd probably write my own parser, because saving compile time outweighs the simpler tooling pest would give me while building it, but I'm also very comfortable with custom parsers so I'm biased. Plus I've had very little interaction with pest and it'd probably take me more time to learn the ins and outs of it than to write the parser. On the other hand, without time constraints I might have just picked pest for the sake of learning it :). Neither would be a bad choice, and since you're implementing it, it would make sense to pick what seems the best fit for you. In my experience a clear understanding of why something was picked is often more important than what was picked, unless there is some very heavy weight factor in favour of one or the other. I suspect the jsonpath expression will be compiled to some kind of data structure before querying, so performance on that path is probably not a concern either. I hope this no-answer is a helpful one :) I don't want to armchair quarterback your implementation.
I also think that writing the parser by hand is the better solution in this case, because compile time could be affected just by pulling in Pest, and with some proper unit testing I would be comfortable with the parser results.
@miker1423 are you still working on this?
Does simdjson have flattened JSON access? (similar to https://github.com/pikkr/pikkr)
Will there be any performance improvement if I use flattened JSON access?
Added by @Licenser as an issue description
The Tape struct should be queryable via a simplified version of JSONpath (section 3.2 in the paper linked below).
To achieve this we need at minimum:
.<field> to query an object field
[<index>] to query array indexes
Additional JSONpath operators are welcome but optional.
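To make the two required operators concrete, here is a minimal hand-rolled sketch that compiles a path such as `.skills.language` or `.items[2].name` into a list of segments before any querying happens. The names `Segment`, `PathError` and `compile` are hypothetical and chosen for this example only; nothing here is an existing simd-json API, and the JSONpath root `$` and all other operators are deliberately left out.

```rust
/// One step of a compiled path (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Segment {
    /// `.<field>`: look up a key in an object.
    Field(String),
    /// `[<index>]`: look up a position in an array.
    Index(usize),
}

/// Errors carry the byte offset where the offending segment starts.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PathError {
    UnexpectedChar { at: usize, found: char },
    EmptyField { at: usize },
    BadIndex { at: usize },
    UnclosedBracket { at: usize },
}

/// Compiles a simplified JSONpath expression into segments.
/// Only `.<field>` and `[<index>]` are understood.
pub fn compile(path: &str) -> Result<Vec<Segment>, PathError> {
    let mut segments = Vec::new();
    let mut chars = path.char_indices().peekable();
    while let Some((at, c)) = chars.next() {
        match c {
            '.' => {
                // A field name runs until the next `.` or `[`.
                let mut name = String::new();
                while let Some(&(_, n)) = chars.peek() {
                    if n == '.' || n == '[' {
                        break;
                    }
                    name.push(n);
                    chars.next();
                }
                if name.is_empty() {
                    return Err(PathError::EmptyField { at });
                }
                segments.push(Segment::Field(name));
            }
            '[' => {
                // Collect everything up to the closing bracket.
                let mut digits = String::new();
                let mut closed = false;
                for (_, n) in chars.by_ref() {
                    if n == ']' {
                        closed = true;
                        break;
                    }
                    digits.push(n);
                }
                if !closed {
                    return Err(PathError::UnclosedBracket { at });
                }
                // Accept plain decimal digits only (no sign, no spaces).
                if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
                    return Err(PathError::BadIndex { at });
                }
                let index = digits
                    .parse::<usize>()
                    .map_err(|_| PathError::BadIndex { at })?;
                segments.push(Segment::Index(index));
            }
            found => return Err(PathError::UnexpectedChar { at, found }),
        }
    }
    Ok(segments)
}
```

Compiling once and then walking the resulting `Vec<Segment>` keeps string handling out of the query loop, which fits the expectation in the thread that parser performance is probably not a concern.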