Sales Analytics Project
An End-to-End Data Pipeline from Raw CSV to Power BI Intelligence
Project Overview
This project demonstrates a full data analytics pipeline:
- CSV Data → SQL Import
- Data Cleaning using SQL
- Data Analysis Queries
- Dashboard in Power BI
Step 1: CSV Input Data
The raw source data used for this project:
order_id,order_date,product,category,sales,quantity,customer
1001,2023-01-05,Phone,Electronics,500,1,Amit
1002,2023-01-06,Laptop,Electronics,1200,1,Rahul
Step 2: SQL Table Creation
Initial staging table to hold raw string data:
CREATE TABLE sales_raw (
order_id VARCHAR(20),
order_date VARCHAR(20),
product VARCHAR(100),
category VARCHAR(100),
sales VARCHAR(20),
quantity VARCHAR(20),
customer VARCHAR(100)
);
Ready to See the Results?
Explore the interactive Power BI dashboard and full dataset on GitHub.
View Live DashboardStep 3: Data Cleaning SQL
Transforming data types and cleaning strings:
CREATE TABLE sales_clean AS
SELECT
CAST(order_id AS INT) AS order_id,
STR_TO_DATE(order_date, '%Y-%m-%d') AS order_date,
TRIM(product) AS product,
TRIM(category) AS category,
CAST(sales AS DECIMAL(10,2)) AS sales,
CAST(quantity AS INT) AS quantity,
TRIM(customer) AS customer
FROM sales_raw;
Step 4: Data Cleaning Operations
Removing anomalies and handling duplicates:
-- Remove invalid data
DELETE FROM sales_clean
WHERE sales <= 0 OR quantity <= 0;
-- Remove duplicates
DELETE t1 FROM sales_clean t1
JOIN sales_clean t2
ON t1.order_id = t2.order_id
AND t1.order_id > t2.order_id;
Step 5: Analysis Queries
Extracting business insights through SQL:
-- Total Revenue
SELECT SUM(sales) FROM sales_clean;
-- Top Products
SELECT product, SUM(sales)
FROM sales_clean
GROUP BY product
ORDER BY SUM(sales) DESC;
-- Monthly Sales
SELECT DATE_FORMAT(order_date,'%Y-%m') AS month,
SUM(sales)
FROM sales_clean
GROUP BY month;
Step 6: Power BI Dashboard Setup
Open Microsoft Power BI Desktop and import sales_clean
Add Visuals:
- 📈 Line Chart → Monthly Sales Trend
- 📊 Bar Chart → Top Products
- Pie Chart → Category Distribution
- KPI Cards → Revenue, Orders, Quantity
- Slicers → Category + Date Filter
Power BI DAX Measures
Total Revenue = SUM(sales_clean[sales])
Total Orders = COUNT(sales_clean[order_id])
Total Quantity = SUM(sales_clean[quantity])
Summary
You now have a complete analytics system:
CSV Data SQL Cleaning Data Analysis Power BI DashboardDownload Microsoft Power BI from the official Microsoft Webpage or it is available from Microsoft Store. And for data cleaning part you can use Pandas instead.