The Wayback Machine - https://web.archive.org/web/20221223155926/https://github.com/python/cpython/pull/92015

gh-69929: Add more specific definition of \w #92015

Merged: 2 commits, Dec 20, 2022

Conversation

slateny (Contributor) commented Apr 28, 2022

JelleZijlstra (Member) commented Apr 30, 2022

@SnoopJeDi on #82747 suggested referring to the str.isalnum() documentation. I can confirm that that's correct: they both map to the Py_UNICODE_ISALNUM macro. I think it's also clearer, so could you do that instead?
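
As an illustration of the equivalence described in the comment above (this is not part of the PR itself), here is a minimal sketch that checks, for `str` patterns on CPython, that `\w` matches a single character exactly when `str.isalnum()` is true for it or the character is an underscore. The helper names `matches_word` and `agrees_with_isalnum`, and the sampled code point range, are assumptions chosen for the example.

```python
import re

# A str pattern, so Unicode matching is the default (no re.ASCII flag).
WORD = re.compile(r"\w")


def matches_word(ch: str) -> bool:
    """Return True if the single character ch is matched by the \\w class."""
    return WORD.fullmatch(ch) is not None


def agrees_with_isalnum(ch: str) -> bool:
    """Return True if \\w and 'isalnum() or underscore' give the same answer."""
    return matches_word(ch) == (ch.isalnum() or ch == "_")


# Sample the first 0x3000 code points (an arbitrary range for this sketch)
# and collect any character where the two definitions disagree.
mismatches = [hex(cp) for cp in range(0x3000) if not agrees_with_isalnum(chr(cp))]
print(mismatches)
```

On CPython the list is expected to be empty for this range. Widening the range to `sys.maxunicode + 1` would cover every code point at the cost of a longer run.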

slateny (Contributor, Author) commented May 3, 2022

Thanks, that's a good suggestion. Do you know whether the wording can be more exact, so that instead of "this includes most alphanumeric characters" it can say "this includes all alphanumeric characters"?

Edit: or maybe more simply, "Matches Unicode word characters; this is equivalent to alphanumeric characters as well as the underscore".

Review thread on Doc/library/re.rst (outdated, resolved)
Co-authored-by: Jelle Zijlstra
zackw commented Jul 18, 2022

I like the idea of referring to the definition of str.isalnum for what regex \w means, but the definition of str.isalnum could also stand to be improved. Right now it says

A character c is alphanumeric if one of the following returns True: c.isalpha(), c.isdecimal(), c.isdigit(), or c.isnumeric()

The definitions of isalpha, isdecimal, etc. refer to various Unicode properties and it's not obvious what the aggregate adds up to. I'd like to see something like this added:

Together, this includes everything in Unicode general categories L* and N*, plus U+005F (underscore).
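
The proposed summary can be checked empirically rather than taken on faith. The sketch below is an illustration only: it compares `str.isalnum()` with a test on the Unicode general category (anything starting with `L` or `N`) and reports the code points where the two differ. The function names `in_letter_or_number_category` and `find_disagreements` are hypothetical, and the result depends on the Unicode database version bundled with the running Python. The underscore is deliberately left out, since it is not alphanumeric and `\w` adds it separately.

```python
import sys
import unicodedata


def in_letter_or_number_category(ch: str) -> bool:
    """True if the general category of ch starts with L (letter) or N (number)."""
    return unicodedata.category(ch)[0] in "LN"


def find_disagreements(limit: int = sys.maxunicode + 1) -> list[str]:
    """Collect characters below limit where isalnum() and the L*/N* test differ."""
    return [
        chr(cp)
        for cp in range(limit)
        if chr(cp).isalnum() != in_letter_or_number_category(chr(cp))
    ]


disagreements = find_disagreements()
print(len(disagreements))

# Show a few of the differing code points, if any, with category and name.
for ch in disagreements[:10]:
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch, "<unnamed>"))
```

If the count printed is zero, the two definitions coincide for that Unicode version; any listed code points would be exceptions the wording needs to account for.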

slateny (Contributor, Author) commented Sep 21, 2022

I think a change like that would be better in a separate PR, so I've opened #96984 for possible discussion.

JelleZijlstra merged commit 36a0b1d into python:main on Dec 20, 2022
13 checks passed
miss-islington (Contributor) commented Dec 20, 2022

Thanks @slateny for the PR, and @JelleZijlstra for merging it 🌮🎉.. I'm working now to backport this PR to: 3.10, 3.11.
🐍🍒🤖

bedevere-bot commented Dec 20, 2022

GH-100354 is a backport of this pull request to the 3.11 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Dec 20, 2022
…-92015)

(cherry picked from commit 36a0b1d)

Co-authored-by: Stanley
Co-authored-by: Jelle Zijlstra
bedevere-bot commented Dec 20, 2022

GH-100355 is a backport of this pull request to the 3.10 branch.

miss-islington added a commit that referenced this pull request Dec 20, 2022
(cherry picked from commit 36a0b1d)

Co-authored-by: Stanley
Co-authored-by: Jelle Zijlstra
slateny deleted the s/re branch on Dec 20, 2022
jonburdo pushed a commit to jonburdo/cpython that referenced this pull request Dec 20, 2022

8 participants