The Wayback Machine - https://web.archive.org/web/20201115100251/https://github.com/topics/big-data
#big-data

Here are 2,244 public repositories matching this topic...

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
  • Updated Oct 1, 2020
  • Python
ClickHouse
yunchat commented Nov 13, 2018

Currently, inserts and queries share the same resource (the max process count control). When queries run at high TPS, inserts fail with an error (“error: too many process”). I think separating the resources for inserts and queries would make sense, to ensure enough resources for inserts. This would resemble YARN, where inserts and queries use different resource quotas.
Or, as a simpler approach, can we set a ratio for Insert and ...

Open Source Fast Scalable Machine Learning Platform For Smarter Applications: Deep Learning, Gradient Boosting & XGBoost, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
  • Updated Nov 15, 2020
  • Jupyter Notebook
robinroos commented Oct 16, 2020

Targeting 4.1

Proposal

Take the existing:

hazelcast/hazelcast/src/main/java/com/hazelcast/map/impl/proxy/MapProxyImpl.java

Line 916 in 4fed159

 public <R> Iterator<R> iterator(int fetchSize, int partitionId, Projection<Map.Entry<K, V>, R> projection, 

And create a public API in IMap which differs from the above in that it:

  1. does not require partitionId but work
vespa
