This project analyzes sales data for three products over time. The goal is to clean the dataset, handle missing values, and extract key business insights using R.
The dataset includes:
- Date
- Product
- Quantity
- Unit_Price

Some records contain missing values (NA).

Steps performed:
- Handling missing values (NA)
- Removing inconsistencies
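
The cleaning steps above could look roughly like the sketch below. It is an illustration only: the sample rows are invented, and the specific rules (dropping rows with NA, dropping non-positive values, normalizing product labels, removing exact duplicates) are assumptions rather than the exact rules used in the notebook.

```r
library(dplyr)

# Hypothetical sample standing in for the real dataset (values invented)
sales_raw <- tibble(
  Date       = c("2025-01-05", "2025-01-05", "2025-02-10", NA, "2025-03-15", "2025-03-20"),
  Product    = c("A", "A", " b", "C", "C", "B"),
  Quantity   = c(10, 10, 3, 5, -2, NA),
  Unit_Price = c(2.5, 2.5, 4.0, 3.0, 3.0, 4.0)
)

sales_clean <- sales_raw %>%
  mutate(
    Date    = as.Date(Date),              # parse ISO-formatted dates
    Product = toupper(trimws(Product))    # normalize inconsistent product labels
  ) %>%
  # Assumption: rows with any missing value are dropped rather than imputed
  filter(!is.na(Date), !is.na(Quantity), !is.na(Unit_Price)) %>%
  # Assumption: non-positive quantities or prices are data-entry errors
  filter(Quantity > 0, Unit_Price > 0) %>%
  distinct()                              # remove exact duplicate rows
```

Dropping incomplete rows is the simplest option; imputing values (for example, filling a missing `Unit_Price` from the same product's other rows) may be preferable when the dataset is small.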

Metrics computed:
- Total revenue
- Total sales per product
- Quantity sold over time
- Revenue trends
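
As a rough sketch of how these metrics can be computed with dplyr, assuming revenue is defined as `Quantity * Unit_Price` (the `Revenue` column and the sample values are introduced here for illustration and are not part of the original dataset):

```r
library(dplyr)

# Hypothetical cleaned data (values invented for illustration)
sales <- tibble(
  Date       = as.Date(c("2025-01-05", "2025-01-20", "2025-02-10", "2025-02-10")),
  Product    = c("A", "B", "A", "C"),
  Quantity   = c(10, 3, 4, 5),
  Unit_Price = c(2.5, 4.0, 2.5, 3.0)
) %>%
  mutate(Revenue = Quantity * Unit_Price)   # assumed definition of revenue

# Total revenue across all rows
total_revenue <- sum(sales$Revenue)

# Units sold and revenue per product, highest revenue first
sales_by_product <- sales %>%
  group_by(Product) %>%
  summarise(Units = sum(Quantity), Revenue = sum(Revenue), .groups = "drop") %>%
  arrange(desc(Revenue))

# Quantity sold per date, in chronological order
quantity_by_date <- sales %>%
  group_by(Date) %>%
  summarise(Units = sum(Quantity), .groups = "drop") %>%
  arrange(Date)
```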

Visualizations:
- Sales over time
- Revenue by product
- Quantity distribution
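
The three plots above could be built with ggplot2 along the following lines. This is a sketch on invented data; the chart types (line chart, bar chart, histogram), labels, and bin count are assumptions and may differ from the notebook's actual figures.

```r
library(dplyr)
library(ggplot2)

# Hypothetical cleaned data (values invented for illustration)
sales <- tibble(
  Date       = as.Date(c("2025-01-05", "2025-02-10", "2025-03-15",
                         "2025-01-12", "2025-02-18", "2025-03-22")),
  Product    = c("A", "A", "A", "B", "B", "B"),
  Quantity   = c(10, 4, 7, 3, 6, 5),
  Unit_Price = c(2.5, 2.5, 2.5, 4.0, 4.0, 4.0)
) %>%
  mutate(Revenue = Quantity * Unit_Price)

# Sales over time: quantity per date, one line per product
p_time <- ggplot(sales, aes(x = Date, y = Quantity, colour = Product)) +
  geom_line() +
  geom_point() +
  labs(title = "Quantity sold over time", x = "Date", y = "Quantity")

# Revenue by product: bars ordered from highest to lowest revenue
p_product <- sales %>%
  group_by(Product) %>%
  summarise(Revenue = sum(Revenue), .groups = "drop") %>%
  ggplot(aes(x = reorder(Product, -Revenue), y = Revenue)) +
  geom_col() +
  labs(title = "Revenue by product", x = "Product", y = "Revenue")

# Quantity distribution: histogram of order quantities
p_qty <- ggplot(sales, aes(x = Quantity)) +
  geom_histogram(bins = 5) +
  labs(title = "Quantity distribution", x = "Quantity", y = "Count")
```

Printing any of the objects (for example `print(p_time)`) renders the plot in RStudio or a Kaggle notebook.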

Key insights:
- Monthly revenue for 2025 fluctuates significantly, with high volatility and no stable trend across months.
- Revenue drops sharply in May, then recovers strongly in June, the highest-revenue month of the year.
- A similar dip and recovery appears between September and October, suggesting recurring short-term recovery cycles.
- The year starts at a moderate level in January but ends weakly in December, the second-lowest revenue month. This may indicate seasonality, reduced demand, or an ineffective end-of-year strategy.

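
A monthly revenue table of the kind discussed above can be derived with lubridate and dplyr, roughly as sketched below. The sample rows are invented and do not reproduce the 2025 figures; the `Month` and `Change` columns are names chosen here for illustration.

```r
library(dplyr)
library(lubridate)

# Hypothetical cleaned data (values invented for illustration)
sales <- tibble(
  Date       = as.Date(c("2025-01-05", "2025-01-20", "2025-02-10",
                         "2025-03-02", "2025-03-28")),
  Quantity   = c(10, 3, 4, 5, 8),
  Unit_Price = c(2.5, 4.0, 2.5, 3.0, 2.0)
)

monthly_revenue <- sales %>%
  mutate(Month = floor_date(Date, unit = "month")) %>%    # first day of each month
  group_by(Month) %>%
  summarise(Revenue = sum(Quantity * Unit_Price), .groups = "drop") %>%
  arrange(Month) %>%
  mutate(Change = Revenue - dplyr::lag(Revenue))          # month-over-month change

# Months with the highest and lowest revenue
best_month  <- monthly_revenue$Month[which.max(monthly_revenue$Revenue)]
worst_month <- monthly_revenue$Month[which.min(monthly_revenue$Revenue)]
```

Sharp negative values in `Change` followed by large positive ones correspond to the dip-and-recovery pattern described in the insights.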
These patterns may indicate the use of reactive business strategies, such as:
- promotional campaigns
- discounting strategies
- increased marketing efforts after low-performing periods
This behavior suggests that the business may be reacting to declines with short-term actions rather than following a consistent long-term sales strategy.

👉 A more stable approach could include:
- better demand forecasting
- consistent marketing planning
- proactive rather than reactive pricing strategy

Tools used:
- R
- dplyr
- ggplot2
- lubridate
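
Before running the notebook, it may help to check that the packages listed above are available. The snippet below is a small convenience sketch, not part of the original project; the install line is left commented out so nothing is installed without intent.

```r
# Packages listed in this README
required <- c("dplyr", "ggplot2", "lubridate")

# TRUE for each package that can be loaded in the current R library
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
}
```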

How to run:
- Clone the repository
- Open the notebook in RStudio or Kaggle
- Run all cells