Creating Databricks Community Edition

Databricks is a cloud-based data engineering platform that enables data engineers, data scientists, and data analysts to collaborate on various data-related tasks. Founded by the original creators of Apache Spark, Databricks provides a unified analytics platform that combines data engineering, data science, and data analytics. The platform allows users to process large amounts of data using Spark, build and train machine learning models, and deploy them.


CHENCHU’S

C .R. Anil Kumar Reddy


Associate Developer for Apache Spark 3.0

Part-01: Creating Databricks Community Edition & Run Your First Code in Databricks

www.linkedin.com/in/chenchuanil

1. What is Databricks Community Edition?

Databricks Community Edition offers a free and accessible platform for learning Apache Spark, practicing big data analytics, and developing machine learning models. Features include:

- Micro-Cluster: A small cluster with 6 GB memory and 2 CPU cores.
- Notebook Environment: Interactive workspace for writing and running code.
- Sample Datasets: Access to datasets for practice.
- Integration with Spark APIs: Utilize Spark's powerful data processing capabilities.

It's ideal for students, educators, and professionals seeking hands-on experience with big data processing.


2. Prerequisites

Before you begin, ensure you have:

- Email Account: A valid email address for registration.
- Web Browser: A modern browser like Chrome, Firefox, or Edge.


3. Creating a Databricks Community Edition Account

Step 1: Visit the Sign-Up Page

Go to the Databricks Community Edition sign-up page: Databricks Community Edition Sign-Up

Step 2: Fill Out the Registration Form

Provide the following information:

- First Name
- Last Name
- Email Address
- Company: If not affiliated with a company, you can enter "Self-Employed" or "Student".
- Title: Your professional title (e.g., Student, Data Analyst).
- Country

Check the box to agree to the Terms of Service and Privacy Policy, then click "Get Started for Free".


Step 3: Verify Your Email

Check your email inbox for a message from Databricks. Click the verification link in the email to confirm your account.

Step 4: Set Your Password

After verification, you'll be prompted to create a password. Enter a secure password and click "Submit".

Step 5: Log In to Your Account

You will be redirected to the login page. Enter your email and the password you just created, then click "Sign In" to access the Databricks workspace.


4. Navigating the Databricks Workspace

Upon logging in, you'll see the main workspace, which includes:

- Sidebar: Access sections like Workspace, Compute, Jobs, and Data.
- Workspace: Organize notebooks and other files.
- Compute: Manage your computing clusters.
- Data: Upload and manage datasets.
- Jobs: Schedule automated tasks (limited in Community Edition).
- Account Settings: Update personal information and preferences.


5. Creating and Running Your First Notebook

Step 1: Create a Cluster

Clusters are the computational resources for running your notebooks.

1. Click on "Compute" in the sidebar.
2. Click "Create Compute".
3. Configure your cluster:
   - Cluster Name: e.g., "My First Cluster".
   - Cluster Mode: Keep as "Single Node".
   - Databricks Runtime Version: Use the default or select the latest stable version.
4. Click "Create Compute" at the bottom.
5. Wait for the cluster status to change to "Running".


Step 2: Create a Notebook

1. Click on "Workspace" in the sidebar.
2. Click the arrow next to your username to expand your user folder.
3. Click "Create" and select "Notebook".
4. Fill in the notebook details:
   - Name: "First Notebook".
   - Default Language: Choose Python.


Step 3: Write and Run Code

In your new notebook:

Cell 1: Print a greeting.

Run the cell by clicking the "Run" button or pressing Shift + Enter. Notebooks are composed of cells, which are individual blocks that contain code, text, or visualizations.
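For example, the first two cells might look like this (plain Python, so it runs the same way in any notebook):

```python
# Cell 1: print a greeting.
greeting = "Hello, Databricks!"
print(greeting)

# Cell 2: a small computation to confirm the cluster is executing your code.
numbers = [1, 2, 3, 4, 5]
total = sum(numbers)
print(f"Sum of {numbers} is {total}")  # prints: Sum of [1, 2, 3, 4, 5] is 15
```

Each cell runs independently, but variables defined in one cell (like `numbers`) remain available to the cells you run afterward.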

Congratulations, you have run your first code in Databricks!

Torture the data, and it will confess to anything

DATA ANALYTICS

Happy Learning

SHARE IF YOU LIKE THE POST


Let's connect to discuss more on Data

www.linkedin.com/in/chenchuanil
