Describe the Bug
In Python, the following syntax is valid:
[(i, j) for i in range(3) if i > 0 for j in range(5) if j % 2 == 0]
However, in the current python.lark grammar, the relevant rules are defined as:
comprehension{comp_result}: comp_result comp_fors [comp_if]
comp_fors: comp_for+
comp_for: [ASYNC] "for" exprlist "in" or_test
With this grammar definition, the two if conditions cannot be parsed, because if clauses may appear only after the final for clause. The grammar accepts only cases such as:
[(i, j) for i in range(3) for j in range(5) if j % 2 == 0 if j > 0]
In this structure, the if clauses are associated globally with the entire comprehension rather than individually with each for clause. This is incorrect because, in Python, each for clause in a comprehension can have its own set of if filters.
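To illustrate the intended semantics, here is a minimal sketch in plain Python (no Lark involved) that expands the comprehension into equivalent nested loops. The variable names compact and expanded are only illustrative. It shows that the first if filters values of i before the inner for loop runs at all.

```python
# The comprehension from this report, evaluated by CPython itself.
compact = [(i, j) for i in range(3) if i > 0 for j in range(5) if j % 2 == 0]

# Equivalent nested loops: each if belongs to the for clause it follows.
expanded = []
for i in range(3):
    if i > 0:                # filter attached to the first for clause
        for j in range(5):
            if j % 2 == 0:   # filter attached to the second for clause
                expanded.append((i, j))

assert compact == expanded
print(compact)  # [(1, 0), (1, 2), (1, 4), (2, 0), (2, 2), (2, 4)]
```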
To Reproduce
I’m unable to provide a simple standalone reproduction script, but as described above, the provided example:
[(i, j) for i in range(3) if i > 0 for j in range(5) if j % 2 == 0]
fails to parse correctly under the current grammar. The issue is that the if clauses are incorrectly attached to the comprehension rule as a whole, rather than being associated with their respective comp_for. Each for loop in a comprehension should be able to have its own if filters.
Proposed Fix
In my project, the issue was resolved by modifying the grammar as follows:
comprehension{comp_result}: comp_result comp_fors
comp_fors: comp_for+
comp_for: [ASYNC] "for" exprlist "in" or_test comp_if*
This change allows each for clause to optionally have its own if filters, which aligns with Python’s actual comprehension syntax.
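As a rough cross-check against CPython rather than against Lark, the sketch below uses the standard ast module (Python 3.8+) to count the if filters that the reference parser attaches to each for clause. The sample strings and the helper filters_per_for are hypothetical names chosen for this illustration. It only shows which clause shapes CPython accepts; it does not exercise python.lark itself.

```python
import ast

# Comprehension shapes with zero, one, or several if filters per for clause.
samples = [
    "[x for x in range(3)]",
    "[x for x in range(6) if x % 2 == 0 if x > 0]",
    "[(i, j) for i in range(3) if i > 0 for j in range(5) if j % 2 == 0]",
    "[(i, j) for i in range(3) if i > 0 if i < 2 for j in range(5)]",
]

def filters_per_for(source):
    """Return the number of if filters attached to each for clause."""
    comp = ast.parse(source, mode="eval").body  # an ast.ListComp node
    return [len(gen.ifs) for gen in comp.generators]

counts = [filters_per_for(s) for s in samples]
for source, count in zip(samples, counts):
    print(count, source)
```

Each entry in generators carries its own ifs list, which matches the shape of the proposed comp_for rule with a trailing comp_if*.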