Include the original text in the metadata of trees #1520

Open
@jacobm-tech

For the purposes of teaching parsing to my students, as well as for general visualization purposes, I would like to include the original text from which a node was created as part of the description of that node. For example, when parsing the expression "(x+5)*(x-2)" with the usual grammar, I want to create and visualize trees that look like this (fragment shown):

[Image: fragment of the parse tree, with each node labeled by the source text it was built from]

Currently, Lark tree metadata includes the start and end character positions of the text from which the tree was made, but not the text itself.
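For reference, here is a minimal sketch of recovering the text from those positions. It assumes the parser was built with `propagate_positions=True`, so that `meta.start_pos` and `meta.end_pos` are filled in. The helper name `node_text` is hypothetical, and a `SimpleNamespace` with hand-picked offsets stands in for a real `tree.meta` so the snippet runs without Lark installed.

```python
from types import SimpleNamespace


def node_text(text, meta):
    """Return the slice of `text` covered by a node's metadata.

    Assumes `meta` carries Lark-style absolute offsets into the input:
    `start_pos` (inclusive) and `end_pos` (exclusive).
    """
    return text[meta.start_pos:meta.end_pos]


source = "(x+5)*(x-2)"

# Stand-in for the meta of a node spanning the second factor (offsets chosen by hand).
meta = SimpleNamespace(start_pos=6, end_pos=11)
print(node_text(source, meta))
```

Because the offsets are absolute, the same slice should work for multi-line input, which is not true of the per-line `column` values.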

I was able to create the visualization I wanted by modifying the pydot__tree_to_graph function to accept the source text and modifying the nodes created to include it, like this:

node = pydot.Node(i[0], style="filled", fillcolor="#%x;0.5:white" % color, gradientangle="270",
                  label=subtree.data+"\n"+text[subtree.meta.column-1:subtree.meta.end_column-1])

but this is very brittle and incomplete and not suitable for contribution to Lark. I wasn't able to find a good way to get the source text of the subtree from inside the class.
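One possible way to make the labeling less brittle, sketched here as an assumption rather than as anything Lark provides: build the labels outside the class from the absolute `start_pos`/`end_pos` offsets instead of the per-line `column` values, which only line up with the input when it is a single line. The helpers `label_with_text` and `collect_labels`, the stand-in `node` constructor, and the rule names `mul`, `add`, and `sub` are all hypothetical.

```python
from types import SimpleNamespace


def label_with_text(subtree, text):
    """Build a label for one subtree: rule name, newline, source text."""
    meta = subtree.meta
    return "%s\n%s" % (subtree.data, text[meta.start_pos:meta.end_pos])


def collect_labels(tree, text):
    """Depth-first list of labels for every subtree; leaf tokens are skipped."""
    labels = [label_with_text(tree, text)]
    for child in tree.children:
        if hasattr(child, "children"):  # a subtree, not a token
            labels.extend(collect_labels(child, text))
    return labels


def node(data, start, end, children=()):
    """Stand-in for a parsed tree node with position metadata (not a Lark class)."""
    meta = SimpleNamespace(start_pos=start, end_pos=end)
    return SimpleNamespace(data=data, children=list(children), meta=meta)


source = "(x+5)*(x-2)"
tree = node("mul", 0, 11, [node("add", 0, 5), node("sub", 6, 11)])

for label in collect_labels(tree, source):
    print(repr(label))
```

With a real parser, the tree would come from something like `Lark(grammar, propagate_positions=True).parse(source)`, and each label could be passed as the `label=` argument when creating the `pydot.Node`.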
