Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upGitHub is where the world builds software
Millions of developers and companies build, ship, and maintain their software on GitHub — the largest and most advanced development platform in the world.
Parentheses within custom match regexp #95
Comments
You can't define groups within groups at this time as there's no defined way this should act. If you can come up with a proposal for how it behaves, we can look at implementing support for it. |
I don't think there's any reason for groups within groups to have any special behavior outside of their usefulness for doing things like what I described in my original comment or, for another contrived example, like I don't think inner groups should be parsed at all and should just be left alone as part of the custom regexp. Does that make sense? |
Not really. What do we do with the matches inside the parentheses? The example above would create two matches. What about other types of matching groups that don't have output in the RegExp? How are unbalanced brackets handled? You are welcome to submit a PR if you know how it should work. Edit: I'd love to see it work, but the reason it does not exist is because it's not clear how/why. |
I just spent a bit of time attempting to put together a pull request for this but it turns out it's probably going to be really difficult. I found the regex that needs to be modified to accomplish what I've described and the portion of it that will require modification The problem is JavaScript's lack of recursive regex support so detecting inner parentheses will either mean drastically increasing the length and complexity of It sure would be nice if this were supported but I can see why it might make more sense, at least for now, to leave it as it is and maybe include a note in the README about this limitation. By the way I wrote a test for this case while working on my pull request. I'll include it here in case someone wants to move forward with this at some point: [
'/:test(\\d+(\\.\\d+)?)',
null,
[
{
name: 'test',
prefix: '/',
delimiter: '/',
optional: false,
repeat: false,
partial: false,
asterisk: false,
pattern: '\\d+(\\.\\d+)'
}
],
[
['/123', ['/123', '123']],
['/abc', null],
['/123/abc', null],
['/123.123', ['/123.123', '123.123']],
['/123.abc', null]
],
[
[{ test: 'abc' }, null],
[{ test: '123' }, '/123'],
[{ test: '123.123' }, '/123.123'],
[{ test: '123.abc' }, null]
]
] |
Yeah, it's tricky. The problem really comes from internal group matches. There's no clearly defined way of how the internal match would interact with the current support for tokens. It's also not clear how things like the reverse Edit: The above example is a good case. Right now, |
Yeah I was assuming they would be treated as non-matching somehow. I realize I'm probably coming off as really naive with this request but thanks for humoring me! You've clearly put a lot of thought into this yourself. I'll keep an eye on this to see where it goes, but thanks so much for all your hard work! Aside from this incredibly minor nitpick the library is amazing :) |
I wouldn't worry about the implementation, we can always make that work. The harder thing right now is deciding how it would work. Should internal brackets create a matching group? If they do, do we add a token to the list? If we add a token to the list, how do people use the reverse "create a path" from the tokens? I think that's a dead end, so a non-matching group is probably a better idea. In that case, we can replace I just know the issues around it, not the solutions, sorry |
FWIW I'd be perfectly content with the non-matching group solution. The implementation is a little beyond me though since I think there's still the issue of balancing parentheses. |
damn...I needed something like you requested: /(list|edit/\d+) and can't solve it :( |
@blakeembrey are you planning to add support for non-capturing groups? |
I haven't any plans, but can take another look at it. I'd also accept a PR that just ignores non-matching groups. |
@plandem when using React Router recently (which just hands it's path to this library) I discovered that this works for /foo and /bar
And at this point I don't know it's a bug or a feature. Edit: |
Any progress here? |
I want something like:
match:
don't match:
|
@sergeysova Why don't you use an array or a regular expression? You can also do |
@blakeembrey I want have named parameter :image And more subroutes will add under /:username/ |
@sergeysova Try an array? |
I am surprised that the primary hangup here is the semantics. To me the semantics are obvious:
I am under a bit of a deadline crunch but I might be able to submit a PR which parses JS regexp along these lines, if they're acceptable to you? |
Also if I don't get to this ever, I wanted to mention for anyone reading this thread that unless you are repeating something in parentheses indefinitely, you probably do not need Just to break down the cases in this, the first comment starts with the regexp A more complicated case: if you wanted to match a JSON float the regex for that is apparently:
Putting these together with A language with support for list comprehensions can do a lot of this for you, e.g. in Python you can convert
|
You’re welcome to submit a PR, but it’s definitely not just semantics stopping an implementation. Balancing nested matching brackets is a big rewrite of the existing code, if you’re able to do it simply without many changes that’s fantastic and definitely please submit a PR. Currently, I believe the logic needs to be rewritten to support a lexer instead of using a regexp internally. |
Oh yeah definitely. The language of regular expressions is famously not regular. |
What I found out is that if you are working with groups utilize an another set of parenthesis after. This of course will not work with nested grouping. Or use |
I believe that there is a good use case for having more control over the group: base64 validation:
This cannot be done with the current restrictions on custom match groups. |
@garrettmaring yeah that However, I would remind you and anyone else reading that these regexes are not in my view primarily security-related and so "validation" is a little bit of a dicey subject. Like, to my mind the point of the regex is to say "this route looks like _____". The basic problem is that you try to write an API which has both Base64 on the other hand is so permissive that it makes this very tricky because it doesn't rule out so many other routes -- so if So I think you're 100% right that there is a thing you cannot do here, but even if you could do it, you would not buy much over just the regex |
@crdrost At that point, I think you'd just be looking for some application logic to do that for you right?
|
Yeah, I'd say so. What's really at stake here is that URL paths come from a filesystem world but are a generic text format being used to transmit data in a very free-form format. If we were to instead take a more structured view towards API development, where we have valid API "midpoints" as well as the endpoints when the full URL is constructed, we might see this better. So each midpoint instead comes from some sort of philosophy, "the API structures we care about are Records (well-known properties which I can access, Most people seem to get by without explicit lists or tagged unions; lists are in some sense subsumed by dictionaries but tagged unions really have their own genuine power to them. The error that is being described above is an error of an API midpoint which does not know whether it is a Dictionary or a Record and thus is trying to be both, and the proposed solution is to make the midpoint firmly into a Record and add a property |
@crdrost delta! I agree that validation of that sort doesn't make sense in the URL as a regex. Thanks for the reply |
Added support with 1327699, but opted to not transform anything people write. It'll instead throw an error and prompt you to use the non-capturing group directly. So |
I'd like to be able to define groups within my custom match regexp like so:
/:foo(\\d+(\\.\\d+)?)
Which I would expect to match both of these paths:
/123
/123.123
Currently this doesn't seem to work as it appears to be parsing the inner parentheses as a new key. Any ideas on how I could achieve something like this?
My workaround for the moment is to define two separate paths like
/:foo(\\d+)
and/:foo(\\d+\\.\\d+)
, but I'd love to just have one like I defined above.