Focused crawls are collections of frequently-updated webcrawl data from narrow (as opposed to broad or wide) web crawls, often focused on a single domain or subdomain.
Pandas is a high-level data manipulation tool developed by Wes McKinney. It is built on the NumPy package, and its key data structure is the DataFrame. DataFrames allow you to store and manipulate tabular data in rows of observations and columns of variables.
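As a minimal sketch of the rows-of-observations, columns-of-variables idea, the snippet below builds a small DataFrame and performs two common manipulations. The city names, temperatures, and column names are hypothetical values made up for illustration.

```python
import pandas as pd

# Hypothetical data: each row is one observation, each column one variable.
df = pd.DataFrame(
    {
        "city": ["Lima", "Oslo", "Cairo"],
        "temp_c": [19.5, 4.0, 28.0],
        "rainy": [False, True, False],
    }
)

# Filter observations (rows) with a boolean condition on one variable.
warm = df[df["temp_c"] > 10]

# Derive a new variable (column) from an existing one.
df["temp_f"] = df["temp_c"] * 9 / 5 + 32

# Summarize a variable across all observations.
mean_temp = df["temp_c"].mean()
```

Here `warm` keeps only the rows that satisfy the condition, while `df` gains a fourth column computed element-wise from `temp_c`.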
This project examines factors affecting the change in car sales between 2019 and 2020 using real-world data, Python, pandas, Matplotlib, and Jupyter Notebook.
This repository contains a small demo of danfo.js. danfo.js is an open-source JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data.
Employee exit survey results from two institutes in Queensland, Australia, the Department of Education, Training and Employment (DETE) and the Technical and Further Education (TAFE) institute, were analyzed to discover whether there were notable relationships between employee dissatisfaction rates and factors such as age and length of service at the respective institute.
Analysis of the government's Paycheck Protection Program, created in response to the COVID-19 crisis in 2020. This repository focuses on merging, cleaning, normalizing, and exploring the data using Python, pandas, and PySpark.
The purpose is to create a binary classifier capable of predicting whether an applicant will be successful. The data is preprocessed, and a neural network model (NNM) is compiled, trained, and evaluated.
Uses Python and SQLAlchemy for basic climate analysis and data exploration of our climate database. After the initial analysis, a Flask API is designed based on the queries developed.
Requested by Maria. The primary focus is to analyze school district data to gather new insights and visually represent clear results on individual school performance.
This project is part of the Advanced Data Analysis Nanodegree with Udacity. It involves performing an exploratory data analysis using Python, then creating a presentation with explanatory plots.
Loaded the iris dataset in Python using a pandas DataFrame. Performed a PCA using scikit-learn's decomposition module. Plotted the principal components to recreate the scatterplot for each flower type.
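The steps described above might look roughly like the following sketch. It is not the repository's actual code: it assumes scikit-learn's bundled copy of the iris dataset, two principal components, and no feature scaling, and the column names `PC1`, `PC2`, and `species` are chosen here for illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Load iris as a pandas DataFrame (four measurement columns plus "target").
iris = load_iris(as_frame=True)
df = iris.frame
features = df[iris.feature_names]

# Project the four measurements onto the first two principal components.
# Assumption: no standardization is applied before the PCA.
pca = PCA(n_components=2)
components = pca.fit_transform(features)

# Collect the projected points with a readable species label for plotting.
pcs = pd.DataFrame(components, columns=["PC1", "PC2"])
pcs["species"] = df["target"].map(dict(enumerate(iris.target_names)))
```

A scatterplot of `PC1` against `PC2`, colored by `species`, could then be drawn with any plotting library, for example by looping over `pcs.groupby("species")` and calling Matplotlib's `scatter` for each group.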