The Wayback Machine - https://web.archive.org/web/20211002153454/https://github.com/topics/language-classification

language-classification

Here are 19 public repositories matching this topic...

pemistahl / lingua

👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike

nlp natural-language-processing natural-language kotlin-library language-detection android-library java-library nlp-library nlp-machine-learning language-recognition language-processing language-identification language-classification

Updated Jul 19, 2021
Kotlin

pemistahl / lingua-go

👄 The most accurate natural language detection library in the Go ecosystem, suitable for long and short text alike

nlp go natural-language-processing language-detection golang-library nlp-machine-learning language-recognition language-processing language-identification language-classification

Updated Jul 8, 2021
Go

pemistahl / lingua-rs

👄 The most accurate natural language detection library in the Rust ecosystem, suitable for long and short text alike

nlp rust natural-language-processing language-detection rust-library nlp-machine-learning language-recognition language-processing rust-crate language-identification language-classification

Updated Jul 5, 2021
Rust

oscar-corpus / goclassy

An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline.

nlp corpus-linguistics fasttext common-crawl language-classification

Updated Apr 21, 2021
Go

JasonFengGit / RNN-Language-Classifier

A language classifier powered by a recurrent neural network, implemented in Python without AI libraries. AI from scratch.

numpy artificial-intelligence rnn language-classification recurrent-neural-network word-classifier

Updated Sep 7, 2021
Python

oscar-corpus / ungoliant

Open issue: [BUG] Encoding errors in OSCAR 21.09

stefan-it commented Sep 29, 2021

Hi guys,

after downloading and extracting the Turkish part of the OSCAR 21.09 release, I've found some sentences with encoding errors:

I did a grep -c "�" tr_part_* over the complete corpus, here are some stats:

| Filename | Affected number of lines |
| -------- | ------------------------ |

bug good first issue
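
The per-file count described in the issue above can also be reproduced without grep. The following is a minimal sketch, not code from the ungoliant repository: the function names and the directory argument are hypothetical, and only the `tr_part_*` pattern is taken from the issue. It counts, for each file, the lines that contain the UTF-8 bytes of the Unicode replacement character U+FFFD, which is what `grep -c "�"` reports.

```python
from pathlib import Path

# UTF-8 encoding of U+FFFD, the replacement character shown as "�".
REPLACEMENT_BYTES = "\ufffd".encode("utf-8")


def count_affected_lines(path):
    """Count the lines of one file that contain U+FFFD.

    The file is read as raw bytes so that lines that are themselves
    invalid UTF-8 do not raise a decoding error.
    """
    with open(path, "rb") as handle:
        return sum(1 for line in handle if REPLACEMENT_BYTES in line)


def count_by_file(directory, pattern="tr_part_*"):
    """Map each matching file name to its number of affected lines.

    `directory` is a hypothetical location of the extracted corpus.
    """
    return {
        path.name: count_affected_lines(path)
        for path in sorted(Path(directory).glob(pattern))
    }
```

Calling `count_by_file` on the directory that holds the extracted files returns a dictionary that can be printed as a two-column table of file names and affected line counts.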

Open issue: Support for Tigrinya

gokadin / hyperdimensional-computing

Hyperdimensional computing explained and demonstrated

machine-learning vector artificial-intelligence hyperdimensional language-classification

Updated Aug 22, 2021
Go

hb20007 / greek-dialect-classifier

Classifier that identifies Greek text as Cypriot Greek or Standard Modern Greek

Updated Oct 4, 2019
Jupyter Notebook

sileixinhua / Python_sklearn_svm_language

Basic data analysis methods for a language identification dataset, including the SVM algorithm.

svm sklearn python3 language-classification

Updated Apr 22, 2017
Python

rtst777 / Toxic-Language-Classifier

An ensemble of neural network models for toxic language classification

nlp deep-learning lstm neural-networks attention ensemble-model language-classification

Updated Nov 27, 2019
Python

mc-cat-tty / Language-Classification

Suite of Python modules to recognise the language of a file

python language files flask twitter csv python3 language-recognition frequency-table language-classification language-classifier language-analyzer itis-fermi-modena

Updated Sep 23, 2020
Python

prabormukherjee / Language_classifier

Classifying English, Slovak, and Czech text using Naive Bayes

naive-bayes language-classification subwords vectorizing

Updated Sep 14, 2020
Smalltalk

sebastiandziadzio / hate-tweet

Detecting hate speech in tweets using bag-of-tricks models and bi-LSTM networks.

python natural-language-processing keras lstm fasttext bi-lstm language-classification

Updated Oct 13, 2017
Python

manish7294 / Regional-Language-Detector

Detecting the location and native language of a place from an image

opencv deep-learning tensorflow image-processing neural-networks image-recognition object-detection language-classification crnn

Updated Oct 30, 2017
Python

jaineel-vyas / Language_Classifier

Using Decision Tree and AdaBoost to classify languages (English/Dutch)

adaboost decision-tree decision-tree-classifier language-classification adaboost-algorithm

Updated May 13, 2020
Python

sahil8700 / Language-Identifier

This program identifies the language of the input text.

python nlp machine-learning svm-classifier language-classification

Updated Apr 21, 2020
Jupyter Notebook

vishnukanduri / Language-Classification-using-Naive-Bayes-in-Python

Classifies sentences as Slovak, Czech, or English. Implements the relevant preprocessing steps, addresses the class imbalance in the training set by applying the theory of Naive Bayes models, and implements subword units.

visualization nlp natural-language-processing exploratory-data-analysis vectorization data-cleaning language-classification naive-bayes-implementation subword-units subwords

Updated Jun 4, 2020
Smalltalk
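
To make the approach named in the description above concrete, here is a minimal sketch of multinomial Naive Bayes over subword units. It is not the repository's code: the class name, the choice of character trigrams as the subword unit, and the add-one (Laplace) smoothing are all assumptions made for illustration. The class prior in `predict` is the term through which an imbalanced training set influences the decision.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Split text into overlapping character n-grams (a simple subword unit)."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayesLanguageClassifier:
    """Hypothetical multinomial Naive Bayes classifier over character n-grams."""

    def __init__(self, n=3, alpha=1.0):
        self.n = n                      # n-gram length
        self.alpha = alpha              # smoothing strength (1.0 = add-one)
        self.ngram_counts = defaultdict(Counter)
        self.doc_counts = Counter()
        self.vocabulary = set()

    def fit(self, sentences, labels):
        for sentence, label in zip(sentences, labels):
            grams = char_ngrams(sentence, self.n)
            self.ngram_counts[label].update(grams)
            self.doc_counts[label] += 1
            self.vocabulary.update(grams)
        return self

    def predict(self, sentence):
        grams = char_ngrams(sentence, self.n)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, counts in self.ngram_counts.items():
            # Log prior: the share of training sentences with this label.
            score = math.log(self.doc_counts[label] / total_docs)
            denominator = sum(counts.values()) + self.alpha * vocab_size
            # Smoothed log likelihood of every n-gram in the sentence.
            for gram in grams:
                score += math.log((counts[gram] + self.alpha) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

After `fit` has been called with a list of sentences and a matching list of language labels, `predict` returns the label with the highest posterior score for a new sentence. Working in log space avoids numeric underflow when many small probabilities are multiplied.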

darshanbagul / Textual_Language_Identification

Implementing a Naive Bayes classifier for multiclass classification to identify the language of a given text

scala computational-linguistics language-classification

Updated Aug 27, 2017
Scala

elisiojsj / NLP-language-classification-generation-translation

NLP projects.

python nlp language-translation pytorch rnn language-classification language-generation translation-models

Updated Jan 21, 2021
Jupyter Notebook
