scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.
Here are 5,815 public repositories matching this topic...
New Operator
Describe the operator
Why is this operator necessary? What does it accomplish?
This is a frequently used operator in tensorflow/keras
Can this operator be constructed using existing onnx operators?
If so, why not add it as a function?
I don't know.
Is this operator used by any model currently? Which one?
Are you willing to contribute it?
I naively tried to do dd.merge(a, b, on="column_with_ten_values"), where a and b were both large DataFrames with thousands of partitions each.
Eventually the compute failed with:
[File /opt/conda/envs/coiled/lib/python3.9/site-packages/dask/dataframe/multi.py:275, in merge_chunk()
File /opt/conda/envs/coiled/lib/python3.9/site-packages/pandas/core/frame.py:9329, i
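The traceback above is cut off, so the actual error is not known. One plausible reason a merge on a column with only ten distinct values becomes expensive is that an inner join emits, for each key, the product of that key's row counts on both sides. The sketch below estimates that output size in plain Python; the helper name estimated_inner_join_rows and the sample sizes are illustrative assumptions, not part of dask or pandas.

```python
from collections import Counter


def estimated_inner_join_rows(left_keys, right_keys):
    """Return the number of rows an inner join on these keys would produce.

    For every key present on both sides, the join emits
    (rows with that key on the left) * (rows with that key on the right).
    """
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    # Counter returns 0 for keys missing on the right, so they add nothing.
    return sum(n * right_counts[key] for key, n in left_counts.items())


# Hypothetical data: ten distinct key values spread evenly over both sides.
left = [i % 10 for i in range(1000)]
right = [i % 10 for i in range(2000)]

rows = estimated_inner_join_rows(left, right)
print(rows)
```

With only ten keys, the output grows with the product of the two input sizes rather than their sum, which is worth checking on a sample before running the full merge.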
We currently fit and predict upon loading autosklearn.experimental.askl2 for the first time. In environments with a non-persistent filesystem (autosklearn is installed into a new filesystem each time), this can add quite a bit of delay, as experienced in #1362.
It seems more applicable to export the
- As a user, I wish featuretools dfs would take a string as cutoff_time as well as a datetime object
Code Example
fm, features = ft.dfs(entityset=es,
                      target_dataframe_name='customers',
                      cutoff_time="2014-1-1 05:00",
                      instance_ids=[1],
                      cutoff_time_in_index=True)
as well as
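Until the library accepts strings directly, a caller could convert the string before passing it on. The sketch below is one possible workaround; normalize_cutoff_time and its default format string are hypothetical, chosen only to match the "2014-1-1 05:00" example above, and are not part of featuretools.

```python
from datetime import datetime


def normalize_cutoff_time(cutoff_time, fmt="%Y-%m-%d %H:%M"):
    """Return a datetime for either a datetime or a formatted string.

    Hypothetical helper: the format string is an assumption and would need
    to match however the caller writes cutoff times.
    """
    if isinstance(cutoff_time, datetime):
        return cutoff_time
    if isinstance(cutoff_time, str):
        # strptime accepts month and day values without zero padding.
        return datetime.strptime(cutoff_time, fmt)
    raise TypeError(
        f"cutoff_time must be str or datetime, got {type(cutoff_time).__name__}"
    )


cutoff = normalize_cutoff_time("2014-1-1 05:00")
print(cutoff)
```

The resulting datetime can then be passed as cutoff_time in the call shown in the code example.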
The first entry is being eaten by the Differencer in its current standard setting, which may cause user frustration, especially when combined with a pipeline (which is its "typical use"); see, e.g., alan-turing-institute/sktime#2452.
We should add an NA handling parameter setting and make the default to fill in something for the first value, e.g., a difference from an
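To make the behaviour concrete, here is a minimal plain-Python sketch of lagged differencing with a switchable treatment of the leading entries. The function difference, the parameter name na_handling, and its three option names are assumptions made for illustration; they are not taken from the sktime Differencer and the fill value proposed in the issue is cut off above.

```python
import math


def difference(values, lag=1, na_handling="fill_zero"):
    """Lagged differences of a sequence with explicit leading-entry handling.

    na_handling (hypothetical options):
      "drop"      - return only the computable differences (output is shorter)
      "keep_na"   - keep the original length, leading entries become NaN
      "fill_zero" - keep the original length, leading entries become 0.0
    """
    if na_handling not in {"drop", "keep_na", "fill_zero"}:
        raise ValueError(f"unknown na_handling: {na_handling!r}")

    diffs = [values[i] - values[i - lag] for i in range(lag, len(values))]
    if na_handling == "drop":
        return diffs

    filler = math.nan if na_handling == "keep_na" else 0.0
    # The first `lag` positions have no earlier value to subtract.
    return [filler] * min(lag, len(values)) + diffs


series = [3.0, 5.0, 4.0, 8.0]
dropped = difference(series, na_handling="drop")
kept = difference(series, na_handling="keep_na")
filled = difference(series, na_handling="fill_zero")
print(dropped, kept, filled)
```

Only the "drop" variant changes the length of the series, which is the case that tends to surprise later pipeline steps expecting aligned inputs.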
Interpret
Yes
readthedocs analytics says that we have several searches that yield few or no useful results. Let's improve those:
- gpu (only 2 results): make sure that the explanation of the device parameter mentions gpu as well
- gridsearch (0 results): make sure to include the term gridsearch in the meta data of
Related: awslabs/autogluon#1479
Add a scikit-learn compatible API wrapper of TabularPredictor:
- TabularClassifier
- TabularRegressor
Required functionality (may need more than listed):
- init API
- fit API
- predict API
- works in sklearn pipelines
Hello everyone,
First of all, I want to take a moment to thank all contributors and people who supported this project in any way ;) you are awesome!
If you like the project and have any interest in contributing/maintaining it, you can contact me here or send me a msg privately:
- Email: [email protected]
PS: You need to be familiar with Python and machine learning.
Created by David Cournapeau
Released January 05, 2010
Latest release 4 months ago
- Repository: scikit-learn/scikit-learn
- Website: scikit-learn.org
- Wikipedia