data-mining
Here are 3,396 public repositories matching this topic...
(Triggered by SO question: https://stackoverflow.com/questions/67944732/using-my-own-stopword-list-with-gensim-corpora-textcorpus-textcorpus/67951592#67951592)
Gensim has two remove_stopwords() functions with similar but slightly different behavior, which risks confusing users.
gensim.parsing.preprocessing.remove_stopwords takes a space-delimited string, and always consults the current
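The kind of mismatch described here can be sketched in plain Python. This is an illustration only, not gensim's code: the names `remove_stopwords_from_string`, `remove_stopwords_from_tokens` and the tiny `STOPWORDS` set are hypothetical, and the token-based variant with an explicit stopword argument is an assumption about what a second function of this kind might look like.

```python
# Hypothetical stand-in for a library-wide stopword set.
STOPWORDS = frozenset({"the", "a", "of", "and"})


def remove_stopwords_from_string(text, stopwords=STOPWORDS):
    """Filter a space-delimited string and return a string."""
    return " ".join(word for word in text.split() if word not in stopwords)


def remove_stopwords_from_tokens(tokens, stopwords=STOPWORDS):
    """Filter an already-tokenized list and return a list."""
    return [token for token in tokens if token not in stopwords]


# Same sentence, two input shapes, and a caller-supplied stopword list.
custom = frozenset({"mining"})
from_string = remove_stopwords_from_string("the art of data mining")
from_tokens = remove_stopwords_from_tokens(
    ["the", "art", "of", "data", "mining"], stopwords=custom
)
print(from_string)
print(from_tokens)
```

Two functions that share a name but differ in input type and in where the stopword list comes from are easy to mix up, which is the confusion the issue points at.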
Problem: the approximate method can still be slow for many trees
catboost version: master
Operating System: ubuntu 18.04
CPU: i9
GPU: RTX2080
It would be good to be able to specify how many trees to use for Shapley values. The model.predict and prediction_type versions allow this, and lgbm/xgb allow it too.
The official instructions say to use joblib for pickling PyOD models.
This fails for AutoEncoders, or any other TensorFlow-backed model as far as I can tell. The error is:
>>> dump(model, 'model.joblib')
...
TypeError: can't pickle _thread.RLock objects
Note that it's not sufficient to save the underlying Keras S
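The error above can be reproduced with the standard library alone. The following is a minimal sketch, assuming the model object holds a thread lock somewhere in its state; the `state` dictionary and the `try_dumps` helper are hypothetical, and this is not a recommended way to save PyOD or Keras models.

```python
import pickle
import threading

# Hypothetical model state: plain data plus a lock, as a threaded backend might hold.
state = {"weights": [0.1, 0.2], "lock": threading.RLock()}


def try_dumps(obj):
    """Return the pickled bytes, or the TypeError raised while pickling."""
    try:
        return pickle.dumps(obj)
    except TypeError as exc:
        return exc


# Pickling the full state fails because RLock objects cannot be pickled.
failure = try_dumps(state)

# Saving only the plain-data part succeeds; the lock is recreated after loading.
saved = try_dumps({key: value for key, value in state.items() if key != "lock"})
restored = pickle.loads(saved)
restored["lock"] = threading.RLock()

print(type(failure).__name__)
print(restored["weights"])
```

joblib relies on pickle for arbitrary Python objects, so any unpicklable member reachable from the model would likely trigger the same kind of failure.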
LET page = DOCUMENT("YOUR_URL", {
    driver: "cdp",
    ignore: {
        resources: [
            {
                url: "*",
                type: "image"
            }
        ]
    }
})
The task is to auto-generate a table of capabilities like this: alan-turing-institute/sktime#996, and to get it to display automatically on sktime.org (we can help with this stage).
- What's your use case?
Migrate all Python code from old-fashioned format() functions, formatting % operators and simple concatenations (+) to modern f-strings (brief guide). They are known to be the fastest approach and also increase code readability.
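As a small illustration of the migration described here, the sketch below builds the same message in each of the older styles and then as an f-string. The variable names and the message text are made up for the example.

```python
name = "data-mining"
count = 3396

# Older styles named in the issue.
old_format = "Topic {} has {} repositories".format(name, count)
old_percent = "Topic %s has %d repositories" % (name, count)
old_concat = "Topic " + name + " has " + str(count) + " repositories"

# Equivalent f-string: values are interpolated inline.
new_fstring = f"Topic {name} has {count} repositories"

# Format specifiers carry over unchanged, e.g. a thousands separator.
with_separator = f"{count:,} repositories"

print(new_fstring)
print(with_separator)
```

All four variants produce the same string, so each replacement can be checked mechanically during the migration.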