The Wayback Machine - https://web.archive.org/web/20201123220139/https://github.com/topics/pydata

pydata

Here are 82 public repositories matching this topic...

lr4d commented Oct 8, 2020

Problem description

Our dask update graphs are not properly optimized.

We usually apply the dask.dataframe optimization with ave_width=repartition_ratio to the graphs produced by kartothek.io.dask.dataframe.update_dataset_from_ddf. update_dataset_from_ddf should return an already optimized graph to make our users' lives simpler.

We already have code that does this; whoever picks this up can ping me.

randyzwitch commented Mar 28, 2019

While trying to write tests for #189, I'm finding it very difficult to add columns to existing tests. In some cases, such as the all_types table, the table is defined in a different file from the tests, and multiple tests try to write to the same table.

Additionally, our test suite doesn't prove that the data that are uploaded are the same as the data downloaded for all types.
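
One way to picture the two problems above (shared tables and no round-trip check) is a test helper that creates its own uniquely named table, writes rows, reads them back, and compares. The sketch below is only an illustration: it uses Python's built-in sqlite3 as a stand-in for the real database, and the roundtrip helper, the table naming scheme, and the column set are all hypothetical rather than taken from the project's test suite.

```python
import sqlite3
import uuid


def roundtrip(conn, rows):
    """Write rows to a uniquely named table, read them back, then drop the table."""
    # Hypothetical naming scheme: a fresh table per call, so tests never share state.
    table = f"roundtrip_{uuid.uuid4().hex}"
    conn.execute(f"CREATE TABLE {table} (i INTEGER, f REAL, s TEXT, b BLOB)")
    try:
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?)", rows)
        return conn.execute(
            f"SELECT i, f, s, b FROM {table} ORDER BY rowid"
        ).fetchall()
    finally:
        # Clean up even if the insert or select fails.
        conn.execute(f"DROP TABLE {table}")


# sqlite3 is only a stand-in here; the real suite would use its own connection.
conn = sqlite3.connect(":memory:")
rows = [(1, 1.5, "a", b"\x00\x01"), (None, -2.25, "", b"")]
assert roundtrip(conn, rows) == rows
```

Because each call owns its table, adding a column for a new type only touches the one test that needs it.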

We should consider m
