language-model
Here are 868 public repositories matching this topic...
Chooses 15% of tokens
The paper says:
Instead, the training data generator chooses 15% of tokens at random, e.g., in the sentence my dog is hairy it chooses hairy.
This means that exactly 15% of the tokens are guaranteed to be chosen.
In https://github.com/codertimo/BERT-pytorch/blob/master/bert_pytorch/dataset/dataset.py#L68, however, every single token independently has a 15% chance of going through the follow-up procedure, so the number of chosen tokens varies from sequence to sequence.
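The difference between the two readings can be made concrete with a small sketch. Both function names below are hypothetical and neither is taken from the paper's code or from BERT-pytorch; the first picks a fixed number of positions, the second flips an independent coin per token.

```python
import random

def select_exact_fraction(tokens, fraction=0.15, rng=None):
    """Pick a fixed number of positions: round(fraction * len(tokens)), at least 1."""
    rng = rng or random.Random()
    n_pick = max(1, round(fraction * len(tokens)))
    return sorted(rng.sample(range(len(tokens)), n_pick))

def select_per_token(tokens, prob=0.15, rng=None):
    """Flip an independent coin for each position; the count varies per sequence."""
    rng = rng or random.Random()
    return [i for i in range(len(tokens)) if rng.random() < prob]

tokens = ["tok"] * 20
print(len(select_exact_fraction(tokens)))  # always the same count for a given length
print(len(select_per_token(tokens)))       # can be anything from 0 to 20
```

With per-token sampling the expected share of selected tokens is still 15%, but a short sequence can end up with no selected token at all.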
PositionalEmbedding
Problem
Currently FARMReader will ask users to raise max_seq_length every time some samples are longer than the value set to it. However, this can be confusing if max_seq_length is already set to the maximum value allowed by the model, because raising it further will cause hard-to-read CUDA errors.
See #2177.
Solution
We should find a way to query the model for the maximum va
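One possible shape for such a check is sketched below. This is not FARMReader's API: `resolve_max_seq_length` is a hypothetical helper, and it assumes the model config exposes a `max_position_embeddings` attribute, as BERT-style configs in Hugging Face Transformers do. A `SimpleNamespace` stands in for a real config so the sketch runs without downloading a model.

```python
from types import SimpleNamespace

def resolve_max_seq_length(requested, model_config):
    """Return `requested` if it fits the model, otherwise fail with a clear message.

    `model_config` is any object that may expose `max_position_embeddings`
    (an assumption; not every model config has this attribute).
    """
    model_max = getattr(model_config, "max_position_embeddings", None)
    if model_max is None:
        return requested  # limit unknown, leave the user's value alone
    if requested > model_max:
        raise ValueError(
            f"max_seq_length={requested} exceeds the model limit of {model_max}; "
            "split or truncate long documents instead of raising it further."
        )
    return requested

# Stand-in for a real model config (hypothetical value).
config = SimpleNamespace(max_position_embeddings=512)
print(resolve_max_seq_length(384, config))
```

Failing early with a readable message like this would replace the CUDA error the issue describes, and the same limit could be used to suppress the "raise max_seq_length" hint once the value is already at the maximum.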
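Polyphonic characters are currently handled with pypinyin or g2pM, whose accuracy is limited. We would like to build a polyphone prediction model based on BERT (or ERNIE). In short, suppose a language has 100 polyphonic characters, each with at most 3 pronunciations; we can then attach 100 three-way classifiers (simple fc layers are enough) on top of BERT and, at prediction time, look up the classifier for the character in question and classify with it.
Reference paper:
tencent_polyphone.pdf
For data, the dataset provided by https://github.com/kakaobrain/g2pM can be used.
Advanced: multi-task BERT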
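The one-classifier-per-character routing described in this issue can be sketched in plain Python. Everything here is illustrative: the hidden size is a toy value, the weights are random rather than trained, `hidden_vec` is a stand-in for the BERT output at the character's position, and `make_head` and `predict_reading` are hypothetical names.

```python
import random

HIDDEN = 8        # toy hidden size; a real BERT base model uses 768
MAX_READINGS = 3  # each polyphonic character has at most 3 readings

def make_head(rng):
    """One small fully connected layer: MAX_READINGS rows of HIDDEN weights."""
    return [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(MAX_READINGS)]

def predict_reading(heads, char, hidden_vec, n_readings):
    """Route to the classifier for `char` and return the index of its best reading."""
    weights = heads[char]
    logits = [sum(w * h for w, h in zip(row, hidden_vec)) for row in weights]
    logits = logits[:n_readings]  # ignore output slots this character does not use
    return max(range(len(logits)), key=logits.__getitem__)

rng = random.Random(0)
readings = {"行": ["xing2", "hang2"], "乐": ["le4", "yue4"]}
heads = {ch: make_head(rng) for ch in readings}  # one head per polyphonic character
hidden_vec = [rng.uniform(-1, 1) for _ in range(HIDDEN)]  # stand-in for the encoder output
idx = predict_reading(heads, "行", hidden_vec, len(readings["行"]))
print(readings["行"][idx])
```

In a trained model the encoder would be shared and only the small per-character heads would differ, which keeps the added parameter count low.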
Issue to track tutorial requests:
- Deep Learning with PyTorch: A 60 Minute Blitz - #69
- Sentence Classification - #79
This issue is part of our Doc Test Sprint. If you're interested in helping out, come join us on Discord and talk with other contributors!
Docstring examples are often the first point of contact when trying out a new library! So far we haven't done a very good job at ensuring that all docstring examples work correctly in 🤗 Transformers - but we're now very