The Wayback Machine - https://web.archive.org/web/20210621050015/https://github.com/topics/data-mining
Skip to content
#

data-mining

Here are 3,396 public repositories matching this topic...

LightGBM
gensim
gojomo
gojomo commented Jun 12, 2021

(triggered by SO question: https://stackoverflow.com/questions/67944732/using-my-own-stopword-list-with-gensim-corpora-textcorpus-textcorpus/67951592#67951592)

Gensim has two remove_stopwords() functions with similar, but slightly-different behavior that risks confusing users.

gensim.parsing.preprocessing.remove_stopwords takes a space-delimited string, and always consults the current

pseudotensor
pseudotensor commented Jan 12, 2021

Problem: the approximate method can still be slow for many trees
catboost version: master
Operating System: ubuntu 18.04
CPU: i9
GPU: RTX2080

Would be good to be able to specify how many trees to use for shapley. The model.predict and prediction_type versions allow this. lgbm/xgb allow this.

sktime

人工智能学习路线图,整理近200个实战案例与项目,免费提供配套教材,零基础入门,就业实战!包括:Python,数学,机器学习,数据分析,深度学习,计算机视觉,自然语言处理,PyTorch tensorflow machine-learning,deep-learning data-analysis data-mining mathematics data-science artificial-intelligence python tensorflow tensorflow2 caffe keras pytorch algorithm numpy pandas matplotlib seaborn nlp cv等热门领域
  • Updated Feb 6, 2020

Improve this page

Add a description, image, and links to the data-mining topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the data-mining topic, visit your repo's landing page and select "manage topics."

Learn more