The Wayback Machine - https://web.archive.org/web/20210720162157/https://github.com/topics/data-profiling
Here are 39 public repositories matching this topic...
Create HTML profiling reports from pandas DataFrame objects
Updated
Jul 19, 2021
Jupyter Notebook
Visualize and compare datasets, target values and associations, with one line of code.
Updated
Jul 8, 2021
Python
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF and PySpark
Updated
Jul 20, 2021
Jupyter Notebook
Data validation and organization of metadata for data frames and database tables
Data profiling, testing, and monitoring for SQL accessible data.
Updated
Jul 15, 2021
Python
Monitor the stability of a pandas or spark dataframe ⚙︎
Updated
Jul 19, 2021
Python
🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF and PySpark)
A set of scripts to pull metadata and data profiling metrics from relational database systems
Updated
Apr 12, 2021
Python
Dataset search engine, discovering data from a variety of sources, profiling it, and allowing advanced queries on the index
Updated
Jul 8, 2021
Python
Updated
Jul 14, 2021
HTML
A Node.js tool to examine the correctness of Open Data Metadata and build custom dataset profiles
Updated
Jun 20, 2018
JavaScript
Updated
Apr 20, 2021
JavaScript
Open Data Profiling, Quality and Analysis on NYC OpenData dataset with semantic profiling using fuzzy ratio, Levenshtein distance and regex
Updated
Nov 10, 2020
Jupyter Notebook
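As a rough illustration of the semantic profiling idea named in the entry above (Levenshtein distance, a fuzzy ratio, and regex), here is a minimal standard-library sketch. It is not taken from that repository: the regex patterns, the `KNOWN_BOROUGHS` list, the `guess_semantic_type` helper, and the 0.8 threshold are all hypothetical, and the normalized similarity score stands in for whatever fuzzy ratio the project actually uses.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; a stand-in for a fuzzy ratio."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


# Hypothetical patterns and vocabulary, for illustration only.
PATTERNS = {
    "zip_code": re.compile(r"\d{5}(-\d{4})?"),
    "phone": re.compile(r"\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}"),
}
KNOWN_BOROUGHS = ["manhattan", "brooklyn", "queens", "bronx", "staten island"]


def guess_semantic_type(value: str, threshold: float = 0.8) -> str:
    """Try regex patterns first, then fuzzy-match against a known vocabulary."""
    text = value.strip().lower()
    for name, pattern in PATTERNS.items():
        if pattern.fullmatch(text):
            return name
    best = max(KNOWN_BOROUGHS, key=lambda known: similarity(text, known))
    if similarity(text, best) >= threshold:
        return "borough"
    return "unknown"
```

Checking regex patterns before fuzzy matching keeps strictly formatted values (ZIP codes, phone numbers) from being compared against a text vocabulary; misspellings such as "Brooklin" are then caught by the similarity threshold.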
Simplify usage of the RDS API for TypeScript/JavaScript developers
Updated
Aug 9, 2020
TypeScript
R package to simplify the usage of the RDS REST API and provide convenience in accessing data and metadata.
HPCC Systems ECL bundle that provides some basic data profiling and research tools to an ECL programmer
TypeScript/JavaScript example code using the RDS API
Updated
Dec 30, 2020
TypeScript
Automated exploration of files in a folder structure to extract metadata and potential usage of information.
Updated
Mar 28, 2020
Python
Profile. Generate data profiles in the browser (work in progress)
Updated
Mar 9, 2021
JavaScript
Updated
Jun 7, 2021
Python
The program compares two files at a time and does the following:
1. Gathering metadata on the individual tables (column count, record count, list of columns with datatype, etc.).
2. Identifying matching columns between tables based on names as well as data. Using machine learning, we are handling syntactic as well as semantic variations of column names for accurate matching.
3. Finding duplicate columns in a single table, with the option to deduplicate if required.
4. Finding columns with missing data/null values.
Updated
Feb 17, 2018
Python
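The four steps in the file-comparison entry above can be sketched in a few lines of plain Python. This is only an assumption-laden illustration, not that project's code: tables are modeled as dictionaries mapping column names to lists of values, all function names are hypothetical, and column matching here uses simple name normalization rather than the machine-learning matching the entry describes.

```python
def table_metadata(table):
    """Step 1: column count, record count, and observed value types per column."""
    columns = list(table)
    record_count = len(table[columns[0]]) if columns else 0
    return {
        "column_count": len(columns),
        "record_count": record_count,
        "columns": {
            name: sorted({type(v).__name__ for v in values if v is not None})
            for name, values in table.items()
        },
    }


def normalize(name):
    """Lowercase a column name and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def matching_columns(left, right):
    """Step 2 (names only): pair columns whose normalized names are equal."""
    right_by_key = {normalize(name): name for name in right}
    return [
        (name, right_by_key[normalize(name)])
        for name in left
        if normalize(name) in right_by_key
    ]


def duplicate_columns(table):
    """Step 3: report columns whose values exactly repeat an earlier column."""
    seen = {}
    duplicates = []
    for name, values in table.items():
        key = tuple(values)  # assumes hashable scalar values
        if key in seen:
            duplicates.append((seen[key], name))
        else:
            seen[key] = name
    return duplicates


def null_counts(table):
    """Step 4: count missing (None) values in each column."""
    return {name: sum(v is None for v in values) for name, values in table.items()}
```

For example, with `customers = {"Customer ID": [1, 2, 3], "cust_copy": [1, 2, 3]}` and `orders = {"customer_id": [1, 1, 3]}`, `matching_columns(customers, orders)` pairs `"Customer ID"` with `"customer_id"`, and `duplicate_columns(customers)` flags `cust_copy` as a copy of `Customer ID`.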
Distributable UCC Discovery Algorithm based on Akka
Project to develop R scripts to capture, validate and analyze loan information from the Peruvian credit union sector.
Updated
Jul 11, 2021
HTML
a nix DataProfiler for deep analysis of data quality on tabular files
Data profiler is an attempt to model the behavior of a given operator for a set of datasets.
An R Notebook to perform basic data profiling and exploratory data analysis on the FIFA19 players dataset and create a dream-team of the top 11 players considering various player attributes.
Updated
May 26, 2021
HTML
Python function to generate a mask analysis
Updated
Jul 22, 2017
Jupyter Notebook
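For readers unfamiliar with the term in the entry above, a mask analysis typically replaces each character with a symbol for its class and then counts how often each resulting pattern occurs. The sketch below is a generic illustration and not the code from that repository; the `mask` and `mask_analysis` names and the choice of `A`, `a`, and `9` as class symbols are assumptions following one common convention.

```python
from collections import Counter


def mask(value: str) -> str:
    """Map digits to '9', uppercase letters to 'A', lowercase letters to 'a'.

    Any other character (punctuation, whitespace) is kept unchanged.
    """
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A" if ch.isupper() else "a")
        else:
            out.append(ch)
    return "".join(out)


def mask_analysis(values):
    """Return (mask, count) pairs, most frequent mask first."""
    return Counter(mask(v) for v in values).most_common()
```

Rare masks in the result usually point at malformed or unexpected values, which is why this kind of frequency table is a common first step when profiling a text column.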
Map naturally-occurring inter-subreddit content sharing patterns on Reddit by analyzing how posts are "cross-posted" between subreddits based on 2.5 million posts across the top 2,500 subreddits. Uses ECL and HPCC Systems.
MetricDoc is an interactive visual exploration environment for assessing data quality
Updated
Mar 30, 2020
JavaScript
Describe the bug
data docs columns shrink to 1 character width with long query
To Reproduce
Steps to reproduce the behavior:
<img width="1525" alt="Data_documentation_compiled_by_Great_Expectations" src="https://user-images.githubusercontent.com/928247/103230647-30eca500-4