• Objective: Train a model to predict test bench time from 378 categorical features.
• Feature engineering and feature selection: Handled zero-variance features, handled multicollinearity, and treated categorical features with a large number of categories.
• Model training: Trained Linear regression, Ridge regression, Gradient boosting, and XGBoost models, and tuned hyperparameters to optimize performance metrics. The resulting model was used to select feature values that minimize testing time.
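The zero-variance step above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the column names (`X0`, `X1`, `X10`, `X11`) and the sample rows are hypothetical, and a real pipeline would more likely use pandas or scikit-learn for the same check.

```python
def drop_constant_columns(rows):
    """Remove columns that take a single value across all rows (zero variance).

    `rows` is a list of dicts sharing the same keys. Returns the kept column
    names (in original order) and the filtered rows.
    """
    names = list(rows[0].keys())
    kept = [name for name in names if len({row[name] for row in rows}) > 1]
    filtered = [{name: row[name] for name in kept} for row in rows]
    return kept, filtered


# Hypothetical sample: X1 and X11 never change, so they carry no signal.
rows = [
    {"X0": "a", "X1": "k", "X10": 0, "X11": 0},
    {"X0": "b", "X1": "k", "X10": 1, "X11": 0},
    {"X0": "a", "X1": "k", "X10": 1, "X11": 0},
]

kept, filtered = drop_constant_columns(rows)
print(kept)
```

A constant column cannot help any of the regressors listed above distinguish one test configuration from another, so dropping it shrinks the feature space without losing information.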
• Objective: Perform customer segmentation using RFM analysis (Recency, Frequency, and Monetary value) to identify the store's prominent customers.
• Exploratory data analysis (EDA): Performed cohort analysis and built RFM segments.
• Clustering on RFM data: Detected outliers, selected a feature scaling method, applied the K-means clustering algorithm to the scaled data, and used the elbow method and silhouette score to choose the optimal number of clusters.
• Data visualization: Created a Tableau dashboard showing average sales by country, top-selling products, hourly sales, and a heatmap of RFM values.
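As an illustration of how the RFM values in this project can be derived and scaled before clustering, here is a minimal standard-library sketch. The transactions, the snapshot date, and the choice of z-score scaling are all assumptions made for the example; the source does not say which scaling method was selected, and the clustering step itself is not shown.

```python
from collections import defaultdict
from datetime import date
from statistics import fmean, pstdev

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("C1", date(2023, 1, 5), 50.0),
    ("C1", date(2023, 3, 1), 30.0),
    ("C2", date(2023, 2, 10), 200.0),
    ("C3", date(2023, 1, 20), 20.0),
    ("C3", date(2023, 2, 25), 25.0),
    ("C3", date(2023, 3, 9), 15.0),
]
snapshot = date(2023, 3, 10)  # assumed reference date for measuring recency


def build_rfm(transactions, snapshot):
    """Return {customer: (recency_days, frequency, monetary)}."""
    by_customer = defaultdict(list)
    for customer, day, amount in transactions:
        by_customer[customer].append((day, amount))

    rfm = {}
    for customer, purchases in by_customer.items():
        last_purchase = max(day for day, _ in purchases)
        recency = (snapshot - last_purchase).days   # days since last purchase
        frequency = len(purchases)                  # number of purchases
        monetary = sum(amount for _, amount in purchases)  # total spend
        rfm[customer] = (recency, frequency, monetary)
    return rfm


def standardize(rfm):
    """Z-score each of the three RFM columns (assumed scaling method)."""
    customers = list(rfm)
    columns = list(zip(*(rfm[c] for c in customers)))
    scaled_columns = []
    for column in columns:
        mean, std = fmean(column), pstdev(column)
        scaled_columns.append([(v - mean) / std if std else 0.0 for v in column])
    return {
        c: tuple(col[i] for col in scaled_columns)
        for i, c in enumerate(customers)
    }


rfm = build_rfm(transactions, snapshot)
scaled = standardize(rfm)
for customer, values in rfm.items():
    print(customer, values)
```

Scaling matters here because K-means relies on Euclidean distance: without it, the monetary column, which has the largest raw range, would dominate the cluster assignments.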
• Objective: Compare sales data between two regions using a Tableau dashboard and suggest necessary improvements to management.
• Created parameters for regions, showed total sales by product, showed how sales vary over time, and used maps to show the states in each region.
• Created a dashboard to compare sales characteristics of two different regions at a time.
• Objective: The data describes customer complaints received from different regions at different times of the year. Analyze it by complaint type, number of complaints, and regional distribution to help the telecom service provider take the necessary actions to reduce the number of complaints.
• Tasks performed: Used EDA techniques from the pandas library to analyze the registered complaints.
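The kind of breakdown described in this project (by complaint type, by region, and over time) can be sketched as below. The project itself used pandas; this example swaps in `collections.Counter` from the standard library so that it runs without dependencies, and the field names (`type`, `region`, `month`) and records are hypothetical.

```python
from collections import Counter

# Hypothetical complaint records; field names are illustrative only.
complaints = [
    {"type": "Internet", "region": "North", "month": "2023-01"},
    {"type": "Billing", "region": "South", "month": "2023-01"},
    {"type": "Internet", "region": "North", "month": "2023-02"},
    {"type": "Internet", "region": "East", "month": "2023-02"},
    {"type": "Billing", "region": "North", "month": "2023-03"},
]

# Count complaints along each dimension of interest.
by_type = Counter(c["type"] for c in complaints)
by_region = Counter(c["region"] for c in complaints)
by_month = Counter(c["month"] for c in complaints)

# Most frequent complaint type and the regional distribution.
top_type = by_type.most_common(1)
print(top_type)
print(dict(by_region))
print(sorted(by_month.items()))
```

In pandas, the same counts would typically come from `value_counts()` on a column or from a `groupby` followed by `size()`.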
• Objective: Establish the correlation between marketing promotion spend and sales.
• Feature engineering and feature selection: Log transformation, outlier detection, multicollinearity check, and feature scaling (standardization).
• Model training: Used statsmodels, Linear regression, and XGBoost, and tuned hyperparameters to optimize metrics. Generated response curves.
• Data visualization: Showed the relationship of Product, Price, Promotion, and Place (the 4Ps) to sales using Tableau dashboards. Optimized spend on promotions.
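To illustrate how a log transformation and a linear fit combine into a response curve, here is a minimal standard-library sketch. It is not the project's model: the single-channel form `sales = intercept + slope * log(1 + spend)`, the synthetic data, and the function names are assumptions chosen to keep the example short.

```python
import math


def fit_log_response(spend, sales):
    """Least-squares fit of sales = intercept + slope * log(1 + spend)."""
    x = [math.log1p(s) for s in spend]  # log transform of spend
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(sales) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, sales))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic data generated from a known curve, so the fit can be checked.
spend = [0.0, 10.0, 50.0, 100.0, 500.0]
sales = [10.0 + 5.0 * math.log1p(s) for s in spend]

intercept, slope = fit_log_response(spend, sales)


def response(s):
    """Predicted sales at spend level s under the fitted model."""
    return intercept + slope * math.log1p(s)


# Response curve: predicted sales at evenly spaced spend levels.
curve = [(s, response(s)) for s in range(0, 501, 100)]
for s, predicted in curve:
    print(s, round(predicted, 2))
```

Under this assumed form, each additional unit of spend adds less predicted sales than the previous one, which is the diminishing-returns shape that makes a response curve useful for deciding where further promotion spend stops paying off.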