The Wayback Machine - https://web.archive.org/web/20201125063447/https://github.com/topics/spark-sql
spark-sql

Here are 357 public repositories matching this topic...

thrixton commented Jul 13, 2020

This is more a question than a feature request.

When parsing JSON files, I need to sanitize the field names so that, for example, "field with spaces" becomes "field_with_spaces".
I also want to preserve the original name, as metadata about the column if you like :)

There is a metadata field on StructField, but it is internal.
Why is it internal? Would it be possible or desirable to expose it?
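
To make the request concrete, here is a minimal sketch in plain Python of the renaming step described above, keeping the original name next to each sanitized one. The helper names (sanitize, sanitize_fields) and the "original_name" key are hypothetical and chosen only for illustration; this does not use the library's StructField and says nothing about how it stores metadata internally.

```python
import re


def sanitize(name: str) -> str:
    """Replace each run of characters other than letters, digits,
    or underscores with a single underscore."""
    return re.sub(r"[^0-9A-Za-z_]+", "_", name.strip())


def sanitize_fields(names):
    """Return (sanitized_name, metadata) pairs, one per input name.

    The metadata dict keeps the original name under the hypothetical
    key "original_name" so it can be recovered later.
    """
    seen = {}
    result = []
    for original in names:
        clean = sanitize(original)
        # Naive collision handling: append a counter to repeated names.
        # It does not guard against the suffixed name colliding with
        # another field that already has that exact name.
        count = seen.get(clean, 0)
        seen[clean] = count + 1
        unique = clean if count == 0 else f"{clean}_{count}"
        result.append((unique, {"original_name": original}))
    return result


if __name__ == "__main__":
    for name, meta in sanitize_fields(["field with spaces", "id"]):
        print(name, meta)
```

If the metadata field on StructField were exposed, a dict like the one above is the kind of value that could be attached to each renamed column.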

A complete example of a big data application using: Kubernetes (kops/AWS), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, Node.js, Angular, GraphQL
  • Updated Feb 1, 2019
  • TypeScript

Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that let data workers efficiently execute streaming, machine learning, or SQL workloads that require fast iterative access to datasets. This project will have sample programs for Spark in Scala.
  • Updated Oct 13, 2020
  • Scala
