The Wayback Machine - https://web.archive.org/web/20210821211514/https://github.com/topics/data-science

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

Here are 20,916 public repositories matching this topic...

glemaitre
glemaitre commented Aug 10, 2021

I just discovered that we have a helper function to validate scalars:
https://scikit-learn.org/stable/modules/generated/sklearn.utils.check_scalar.html

Since this helper could help to get consistent error types and messages, I was wondering if we could make a long-running issue to introduce this helper everywhere possible.

I think this could be a good issue for first contributors and short spr
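
To illustrate why a shared validation helper gives consistent error types and messages, here is a minimal plain-Python sketch. It is a stand-in written for this example, not scikit-learn's implementation of `check_scalar`; the names `check_scalar_sketch` and `ToyEstimator` are hypothetical, and the real helper's signature and messages should be taken from the documentation linked above.

```python
import numbers


def check_scalar_sketch(x, name, target_type, min_val=None, max_val=None):
    """Hypothetical stand-in for a scalar validation helper.

    Raises TypeError for a wrong type and ValueError for an
    out-of-range value, so every caller reports errors the same way.
    """
    if not isinstance(x, target_type):
        raise TypeError(
            f"{name} must be an instance of {target_type}, not {type(x)}."
        )
    if min_val is not None and x < min_val:
        raise ValueError(f"{name} == {x}, must be >= {min_val}.")
    if max_val is not None and x > max_val:
        raise ValueError(f"{name} == {x}, must be <= {max_val}.")
    return x


class ToyEstimator:
    """Hypothetical estimator that validates its parameters in fit()."""

    def __init__(self, n_iter=10, tol=1e-3):
        self.n_iter = n_iter
        self.tol = tol

    def fit(self):
        # Each parameter goes through the same helper instead of ad hoc checks.
        check_scalar_sketch(self.n_iter, "n_iter", numbers.Integral, min_val=1)
        check_scalar_sketch(self.tol, "tol", numbers.Real, min_val=0.0)
        return self
```

With this pattern, an invalid type always surfaces as a `TypeError` and an out-of-range value as a `ValueError`, whichever estimator raised it.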

superset
michellethomas
michellethomas commented Aug 14, 2021

For a sqllab query that returns a large number of columns (I observed this in a query with ~70 columns), the sorting functionality is broken. Columns do not sort at all.

I'm not sure how easy this is to fix. Do we want to consider this a limitation of sqllab or is this fixable?

Expected results

Columns have the ability to sort

Actual results

When you click on the column head

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
  • Updated May 13, 2021
  • Python
pytorch-lightning
dash
MajorMajor807
MajorMajor807 commented Aug 16, 2021

Bug summary

I am using contourf to plot filled contours, but some of the contours are not being filled in even though values exist for those regions. I am including an example. The code that generates R_mesh, Z_mesh, and total_mesh has been omitted for simplicity, but the problem remains the same.

Code for reproduction



R_mesh = [231.86725132, 220
gensim
c4n
c4n commented Jul 30, 2021

Is your feature request related to a problem? Please describe.
I want to evaluate multiple datasets (same format, so they can share the same dataset reader). The "evaluate" command takes much longer to load the model than to run the evaluation.

Describe the solution you'd like
Support passing multiple input files and output files to the "evaluate" command.

**Describe alternatives you've cons
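
The request above amounts to paying the model-loading cost once and reusing the loaded model for every dataset. A minimal sketch of that idea follows. It does not use any real library's API: `evaluate_many`, `load_model`, and `evaluate_one` are hypothetical names, with the loader and the per-file evaluator passed in as plain callables.

```python
from typing import Any, Callable, Dict, Sequence


def evaluate_many(
    load_model: Callable[[str], Any],
    evaluate_one: Callable[[Any, str], Dict[str, float]],
    archive_path: str,
    input_files: Sequence[str],
    output_files: Sequence[str],
) -> Dict[str, Dict[str, float]]:
    """Load the model once, then evaluate each input file in turn.

    Returns a mapping from output file name to that dataset's metrics.
    Writing the metrics to disk is left out of this sketch.
    """
    if len(input_files) != len(output_files):
        raise ValueError("Need exactly one output file per input file.")

    # The expensive step happens a single time, not once per dataset.
    model = load_model(archive_path)

    results: Dict[str, Dict[str, float]] = {}
    for input_file, output_file in zip(input_files, output_files):
        results[output_file] = evaluate_one(model, input_file)
    return results
```

Pairing inputs and outputs by position keeps the call close to the single-file case, and the length check rejects mismatched lists before any model is loaded.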

nni