The Wayback Machine - https://web.archive.org/web/20210821205855/https://github.com/topics/data-profiling
Here are 42 public repositories matching this topic...
Create HTML profiling reports from pandas DataFrame objects
Updated
Aug 14, 2021
Jupyter Notebook
Visualize and compare datasets, target values and associations, with one line of code.
Updated
Jul 8, 2021
Python
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF and PySpark
Updated
Aug 21, 2021
Python
Data validation and organization of metadata for data frames and database tables
Data profiling, testing, and monitoring for SQL accessible data.
Updated
Aug 19, 2021
Python
Monitor the stability of a pandas or spark dataframe ⚙︎
Updated
Aug 9, 2021
Python
🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF and PySpark)
A set of scripts to pull metadata and data profiling metrics from relational database systems
Updated
Apr 12, 2021
Python
Dataset search engine, discovering data from a variety of sources, profiling it, and allowing advanced queries on the index
Updated
Aug 13, 2021
Python
Updated
Aug 19, 2021
Python
A Node.js tool to examine the correctness of Open Data Metadata and build custom dataset profiles
Updated
Jun 20, 2018
JavaScript
Updated
Apr 20, 2021
JavaScript
Open Data Profiling, Quality and Analysis on NYC OpenData dataset with semantic profiling using fuzzy ratio, Levenshtein distance and regex
Updated
Nov 10, 2020
Jupyter Notebook
Simplify usage of the RDS API for TypeScript/JavaScript developers
Updated
Aug 9, 2020
TypeScript
R package to simplify the usage of the RDS REST API and provide convenience in accessing data and metadata.
🔍 Your Data Quality Detector / Gain insight into your data and get it ready for use before you start working with it 💡 📊 🛠 💎
Updated
Aug 2, 2021
Python
Automated exploration of files in a folder structure to extract metadata and potential usage of information.
Updated
Mar 28, 2020
Python
TypeScript/JavaScript example code using the RDS API
Updated
Aug 9, 2021
TypeScript
Profile. Generate data profiles in the browser (work in progress)
Updated
Aug 11, 2021
JavaScript
HPCC Systems ECL bundle that provides some basic data profiling and research tools to an ECL programmer
The program compares two files at a time and does the following: 1. Gathers metadata on the individual tables (column count, record count, list of columns with data types, etc.). 2. Identifies matching columns between tables based on names as well as data, using machine learning to handle syntactic and semantic variations of column names for accurate matching. 3. Finds duplicate columns in a single table, with the option to deduplicate if required. 4. Finds columns with missing data or null values.
Updated
Feb 17, 2018
Python
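The four steps in the file-comparison entry above could be sketched roughly as follows. This is a minimal illustration and not that repository's code: the tables are hypothetical in-memory dictionaries, every function name is invented for the example, and the standard library's `difflib` string similarity stands in for the machine-learning column matcher the description mentions.

```python
from difflib import SequenceMatcher

# Assumption: a table is a dict mapping column name -> list of values,
# with None marking a missing value. All names here are hypothetical.


def table_metadata(table):
    """Step 1: column count, record count, and value types seen per column."""
    record_count = max((len(values) for values in table.values()), default=0)
    dtypes = {
        name: sorted({type(v).__name__ for v in values if v is not None})
        for name, values in table.items()
    }
    return {
        "column_count": len(table),
        "record_count": record_count,
        "dtypes": dtypes,
    }


def _normalize(name):
    """Lower-case a column name and drop separators such as '_' or spaces."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def match_columns(left, right, threshold=0.8):
    """Step 2: pair columns with similar names or identical data.

    String similarity is a stand-in for the ML-based matcher in the
    description; it only catches syntactic variations of a name.
    """
    matches = []
    for lname, lvalues in left.items():
        for rname, rvalues in right.items():
            score = SequenceMatcher(
                None, _normalize(lname), _normalize(rname)
            ).ratio()
            if score >= threshold or lvalues == rvalues:
                matches.append((lname, rname))
    return matches


def duplicate_columns(table):
    """Step 3: pairs of columns in one table that hold identical data."""
    names = list(table)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if table[a] == table[b]
    ]


def columns_with_missing(table):
    """Step 4: columns that contain at least one missing value."""
    return [
        name for name, values in table.items() if any(v is None for v in values)
    ]


# Hypothetical sample data for illustration only.
customers = {
    "Customer_ID": [1, 2, 3],
    "email": ["a@x.org", None, "c@x.org"],
    "id_copy": [1, 2, 3],
}
orders = {"customerid": [3, 1, 2], "amount": [9.5, 20.0, 3.25]}

print(table_metadata(customers))
print(match_columns(customers, orders))
print(duplicate_columns(customers))
print(columns_with_missing(customers))
```

Comparing normalized names keeps the sketch dependency-free, but it will miss semantic matches such as `zip` versus `postal_code`, which is presumably where the learned matcher in the original project comes in.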
Data profiler is an attempt to model the behavior of a given operator for a set of datasets.
Updated
Aug 6, 2021
Python
A demo of data engineering using the Great Expectations API
Updated
Jul 26, 2021
Jupyter Notebook
Distributable UCC Discovery Algorithm based on Akka
Project to develop R scripts to capture, validate, and analyze loan information from the Peruvian credit union sector.
Updated
Jul 23, 2021
HTML
a nix DataProfiler for deep analysis of data quality on tabular files
An R Notebook to perform basic data profiling and exploratory data analysis on the FIFA19 players dataset and create a dream team of the top 11 players, considering various player attributes.
Updated
May 26, 2021
HTML
Describe the bug
Data docs columns shrink to a width of one character when the query is long
To Reproduce
Steps to reproduce the behavior:
<img width="1525" alt="Data_documentation_compiled_by_Great_Expectations" src="https://user-images.githubusercontent.com/928247/103230647-30eca500-4