aws-glue

It is not surprising that deep and shallow scans show different results. A shallow scan looks only at column names, while a deep scan looks at a sample of the data. I've even noticed that two different runs of deep scan can show different results, because the sampled rows differ. This is the challenge with not scanning all of the data: it's a trade-off between performance/cost and accuracy, and there is no right answer.
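To make the difference concrete, here is a minimal sketch of the two approaches. It is not piicatcher's actual implementation: the function names `shallow_scan` and `deep_scan`, the regular expressions, and the sample data are all hypothetical, chosen only to illustrate why a name-based scan and a sample-based scan can disagree, and why two sampled runs can disagree with each other.

```python
import random
import re

# Hypothetical column-name patterns (illustrative only, not piicatcher's rules).
NAME_PATTERNS = {
    "phone": re.compile(r"phone|mobile", re.IGNORECASE),
    "credit_card": re.compile(r"credit.?card|card.?(number|num|no)", re.IGNORECASE),
    "person": re.compile(r"(first|last|full).?name|surname", re.IGNORECASE),
    "location": re.compile(r"address|city|state|country|zip|postal", re.IGNORECASE),
}

# Hypothetical value pattern used by the deep scan (phone numbers only).
VALUE_PATTERNS = {
    "phone": re.compile(r"^\+?\d[\d\s().-]{7,}\d$"),
}


def shallow_scan(columns):
    """Classify columns by looking at their names only."""
    found = {}
    for col in columns:
        for pii_type, pattern in NAME_PATTERNS.items():
            if pattern.search(col):
                found.setdefault(col, set()).add(pii_type)
    return found


def deep_scan(rows, columns, sample_size, seed=None):
    """Classify columns by matching values in a random sample of rows."""
    rng = random.Random(seed)
    sample = rng.sample(rows, min(sample_size, len(rows)))
    found = {}
    for row in sample:
        for col, value in zip(columns, row):
            if not isinstance(value, str):
                continue  # only string values are pattern-matched here
            for pii_type, pattern in VALUE_PATTERNS.items():
                if pattern.match(value):
                    found.setdefault(col, set()).add(pii_type)
    return found


columns = ["id", "contact", "phone_number"]
# 99 rows without PII, plus one row with a phone number in "contact".
rows = [(i, "n/a", "n/a") for i in range(1, 100)]
rows.append((100, "+1 415-555-0100", "n/a"))

by_name = shallow_scan(columns)                      # flags "phone_number" only
by_data = deep_scan(rows, columns, sample_size=100)  # full scan flags "contact"
sampled = deep_scan(rows, columns, sample_size=5)    # may or may not see the row
```

In this made-up table the shallow scan flags `phone_number` because of its name even though the column holds no phone numbers, while scanning every row flags `contact` instead. With `sample_size=5`, the result depends on whether the single matching row happens to be drawn, which is the run-to-run variation described above.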

Here are 68 public repositories matching this topic...

awslabs / aws-data-wrangler

dgomesbr / awesome-aws-workshops

awslabs / athena-glue-service-logs

aws-samples / data-lake-as-code

tokern / piicatcher

Shallow scan should recognize phone, credit card, person and location from column names

aws-samples / cloud-experiments

aws-samples / amazon-deequ-glue

aws-samples / bring-your-own-data-labs

aws-samples / analyzing-reddit-sentiment-with-aws

awslabs / amazon-athena-cross-account-catalog

vincentclaes / serverless_data_pipeline_example

tokern / lakecli

GorillaStack / athena-cloudtrail-partitioner

webysther / aws-glue-docker

1oglop1 / aws-glue-monorepo-style

aws-samples / streamlit-application-deployment-on-aws

mikaelahonen-solita / aws-glue-tutorial

jonrau1 / AWS-ComplianceMachineDontStop

TrainingByPackt / Serverless-Architectures-with-AWS

chgasparoto / terraform-aws-glue

jhole89 / aws-glue-sbt-quickstart

canyousayyes / aws-real-time-data-collection

spe-uob / 2020-HealthcareLakeETL

bdoepf / aws-etl-example

geeknam / aws-neptune-aml

svajiraya / aws-glue-libs

mincloud1501 / DevOps

akhilpatlolla / Generic_ETL_Utility_AWS_GLUE

jhole89 / serverless-data-pipelines-demo

miztiik / stream-etl-with-glue
