data-mining

Migrate all Python code from old-style format() calls, % formatting operators, and simple + concatenation to modern f-strings (brief guide). F-strings are generally the fastest of these approaches and also improve code readability.
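As a rough illustration of the migration this issue asks for, the sketch below builds the same string in each of the three older styles and then as an f-string. The variable names and values are made up for the example and do not come from any of the repositories listed here.

```python
# Hypothetical values, used only to illustrate the four formatting styles.
name = "example-package"
version = 4.0

# Older styles the issue proposes replacing.
concatenated = "Package " + name + " v" + str(version)
percent_style = "Package %s v%s" % (name, version)
format_style = "Package {} v{}".format(name, version)

# Equivalent f-string: the expressions sit inline in the literal.
f_string = f"Package {name} v{version}"

# All four spellings produce the same text.
all_equal = concatenated == percent_style == format_style == f_string
```

The f-string keeps each value next to the text it belongs to, which is the readability gain the issue refers to.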

(triggered by SO question: https://stackoverflow.com/questions/67944732/using-my-own-stopword-list-with-gensim-corpora-textcorpus-textcorpus/67951592#67951592)

Gensim has two remove_stopwords() functions with similar but slightly different behavior, which risks confusing users.

gensim.parsing.preprocessing.remove_stopwords takes a space-delimited string, and always consults the current

Problem: the approximate method can still be slow for many trees
catboost version: master
Operating System: ubuntu 18.04
CPU: i9
GPU: RTX2080

It would be good to be able to specify how many trees to use when computing Shapley values. model.predict and its prediction_type variants allow this, as do LightGBM and XGBoost.

The official instructions say to use joblib for pickling PyOD models.

This fails for AutoEncoders, or any other TensorFlow-backed model as far as I can tell. The error is:

>>> dump(model, 'model.joblib')
...
TypeError: can't pickle _thread.RLock objects
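The same class of error can be reproduced with the standard library alone. The sketch below assumes the failure comes from a lock object held somewhere inside the model, which the report does not confirm. HoldsLock is a hypothetical stand-in, not a PyOD or TensorFlow class.

```python
import pickle
import threading


class HoldsLock:
    """Hypothetical stand-in for an object that keeps a lock as an attribute."""

    def __init__(self):
        self.lock = threading.RLock()


# Pickling fails because RLock objects cannot be serialized.
try:
    pickle.dumps(HoldsLock())
    message = ""
except TypeError as exc:
    message = str(exc)
```

The exact wording of the message varies between Python versions, but it names the RLock type in each case.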

Note that it's not sufficient to save the underlying Keras S

LET page = DOCUMENT("YOUR_URL", {
    driver: "cdp",
    ignore: {
        resources: [
            {
                url: "*",
                type: "image"
            }
        ]
    }
})

The task is to auto-generate a table of capabilities like this one:

alan-turing-institute/sktime#996

and to have it display automatically on sktime.org (we can help with this stage).

What's your use case?

Here are 3,396 public repositories matching this topic...

eriklindernoren / ML-From-Scratch

academic / awesome-datascience

microsoft / LightGBM

RaRe-Technologies / gensim

JaidedAI / EasyOCR

rasbt / python-machine-learning-book

EthicalML / awesome-production-machine-learning

catboost / catboost

yzhao062 / pyod

MontFerret / ferret

yzhao062 / anomaly-detection-resources

jivoi / awesome-ml-for-cybersecurity

alan-turing-institute / sktime

tangyudi / Ai-Learn

rasbt / mlxtend

deanmalmgren / textract

biolab / orange3

r0f1 / datascience

jphall663 / awesome-machine-learning-interpretability

WZBSocialScienceCenter / pdftabextract

rob-med / awesome-TS-anomaly-detection

tirthajyoti / Papers-Literature-ML-DL-RL-AI

ankitrohatgi / WebPlotDigitizer

demidovakatya / vvedenie-mashinnoe-obuchenie

PatMartin / Dex

eBay / tsv-utils

CIRCL / AIL-framework

sepandhaghighi / pycm

polakowo / vectorbt

404notf0und / AI-for-Security-Learning
