comp_if should be moved into comp_for to correctly reflect Python's comprehension grammar #1525

Open
@JIAQIA

Describe the Bug

In Python, the following syntax is valid:

[(i, j) for i in range(3) if i > 0 for j in range(5) if j % 2 == 0]
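For reference, here is a small sketch of what CPython does with this comprehension. The variable names `pairs` and `expected` are only illustrative. It shows that each if filters the for clause directly before it, in the same way as the equivalent nested loops:

```python
# Comprehension from the report, with a filter after each for clause.
pairs = [(i, j) for i in range(3) if i > 0 for j in range(5) if j % 2 == 0]

# Equivalent nested loops: each if guards the loop directly above it.
expected = []
for i in range(3):
    if i > 0:
        for j in range(5):
            if j % 2 == 0:
                expected.append((i, j))

assert pairs == expected
print(pairs)  # [(1, 0), (1, 2), (1, 4), (2, 0), (2, 2), (2, 4)]
```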

However, in the current python.lark grammar, the relevant rules are defined as:

comprehension{comp_result}: comp_result comp_fors [comp_if]
comp_fors: comp_for+
comp_for: [ASYNC] "for" exprlist "in" or_test

With this definition, the comprehension above cannot be parsed, because an if clause is only accepted after the last for clause. The grammar only allows forms such as:

[(i, j) for i in range(3) for j in range(5) if j % 2 == 0 if j > 0]

In this structure, the if clauses belong to the comprehension as a whole rather than to an individual for clause. This is incorrect: in Python, each for clause in a comprehension can be followed by its own if filters.
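One way to check how CPython itself groups the filters is to inspect the standard `ast` module's output. The sketch below assumes Python 3.9 or newer (for `ast.unparse`); the name `clauses` is only illustrative. Each `ast.comprehension` node represents one for clause and carries its own `ifs` list:

```python
import ast

source = "[(i, j) for i in range(3) if i > 0 for j in range(5) if j % 2 == 0]"
list_comp = ast.parse(source, mode="eval").body  # an ast.ListComp node

# Each generator is one for clause; its `ifs` list holds that clause's filters.
clauses = [
    (ast.unparse(gen.target), [ast.unparse(cond) for cond in gen.ifs])
    for gen in list_comp.generators
]
print(clauses)  # [('i', ['i > 0']), ('j', ['j % 2 == 0'])]
```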

To Reproduce

I’m unable to provide a simple standalone reproduction script, but as described above, the provided example:

[(i, j) for i in range(3) if i > 0 for j in range(5) if j % 2 == 0]

fails to parse under the current grammar, because comp_if is attached to the comprehension rule as a whole rather than to each comp_for.
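A possible reproduction script is sketched below. It has not been run against any particular Lark release, and it makes several assumptions: that Lark 1.x is installed, that the bundled grammar can be loaded with `Lark.open_from_package` together with `PythonIndenter`, and that `file_input` is a suitable start rule. The helper names `cpython_accepts` and `lark_accepts` are hypothetical.

```python
def cpython_accepts(source: str) -> bool:
    """Return True if CPython itself can compile the expression."""
    try:
        compile(source, "<comprehension>", "eval")
        return True
    except SyntaxError:
        return False


def lark_accepts(source: str):
    """Return True/False for Lark's bundled python.lark, or None if Lark is missing."""
    try:
        from lark import Lark
        from lark.exceptions import LarkError
        from lark.indenter import PythonIndenter
    except ImportError:
        return None  # Lark is not installed; nothing to compare against.
    # Assumption: this is how the bundled Python grammar is loaded.
    parser = Lark.open_from_package(
        "lark", "python.lark", ["grammars"],
        parser="lalr", postlex=PythonIndenter(), start="file_input",
    )
    try:
        parser.parse(source + "\n")  # file_input expects a trailing newline
        return True
    except LarkError:
        return False


per_for = "[(i, j) for i in range(3) if i > 0 for j in range(5) if j % 2 == 0]"
print("CPython accepts:", cpython_accepts(per_for))
print("python.lark accepts:", lark_accepts(per_for))
```

If the behavior described in this report holds, the first line should print True and the second should print False when Lark is installed.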

Proposed Fix

In my project, the issue was resolved by modifying the grammar as follows:

comprehension{comp_result}: comp_result comp_fors
comp_fors: comp_for+
comp_for: [ASYNC] "for" exprlist "in" or_test comp_if*

This change allows each for clause to be followed by zero or more if filters of its own, which aligns with Python’s actual comprehension syntax.
