Misaligned error location when input text contains full-width characters #1530

**Describe the bug**

When the input text contains [full-width characters](https://en.wikipedia.org/wiki/Halfwidth_and_Fullwidth_Forms_(Unicode_block)#Block), the error location indicator (^) points at the wrong character. The padding in front of the indicator is built from half-width spaces (U+0020) only. It should instead match the width of each input character, using a full-width space (U+3000) wherever the corresponding input character is full-width.

**To Reproduce**

Parse any text that contains CJK characters or [full-width forms of Latin letters](https://en.wikipedia.org/wiki/Halfwidth_and_Fullwidth_Forms_(Unicode_block)#Block), and intentionally introduce a grammar error. The indicator (^) will point at the wrong location.

Current and expected behavior shown here:

```
    raise UnexpectedCharacters(stream, i, text_line, text_column, {item.expect.name for item in to_scan},
lark.exceptions.UnexpectedCharacters: No terminal matches '古' in the current parser context, at line 2 col 6
1.菝葀：古代一種象徵祥瑞的草。《廣韻．入聲．末韻》：「菝：菝葀，瑞草。」
     ^  // using U+20 (current)
  　　　^  // using U+20 and U+3000 (expected)
Expected one of: 
	* LPAREN
	* I_LQUOTE
```
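
One way the expected padding could be computed is sketched below. This is not Lark's actual implementation: `caret_padding` and `caret_line` are hypothetical helper names, and the sketch assumes that characters whose East Asian Width is `W` or `F` occupy two terminal cells while everything else occupies one. It relies only on `unicodedata.east_asian_width` from the standard library.

```python
import unicodedata

FULLWIDTH_SPACE = "\u3000"  # IDEOGRAPHIC SPACE, as wide as one CJK character


def caret_padding(prefix: str) -> str:
    """Return padding with the same display width as `prefix` (hypothetical helper)."""
    pad = []
    for ch in prefix:
        if ch == "\t":
            # Keep tabs so the padding expands the same way as the input line.
            pad.append("\t")
        elif unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide / Fullwidth characters get a full-width space.
            pad.append(FULLWIDTH_SPACE)
        else:
            # Assumption: everything else (including Ambiguous width) is one cell.
            pad.append(" ")
    return "".join(pad)


def caret_line(text_line: str, column: int) -> str:
    """Build the indicator line for a 1-based `column` in `text_line`."""
    return caret_padding(text_line[: column - 1]) + "^"


if __name__ == "__main__":
    line = "1.菝葀：古代一種象徵祥瑞的草。"
    print(line)
    print(caret_line(line, 6))  # indicator intended to sit under '古'
```

The sketch ignores zero-width and combining characters, and the result still depends on the terminal font rendering U+3000 at exactly two cells.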
