Insurance Charges Prediction
(EDA + Regression + MLflow)
► Developed an insurance charge predictor using supervised machine learning algorithms to help insurance agents accurately forecast the potential medical costs of prospective policyholders.
► Conducted a detailed analysis of the dataset, providing insights that enable insurance companies to refine and update their product portfolios in line with current market dynamics.
► Utilized MLflow to streamline model training.
Bank Transaction Classification
(Multimodal Classification)
► Built a multimodal classification model that categorizes bank transactions in order to analyze clients' spending habits and support financial advice and product recommendations.
► Performed a comprehensive analysis of the data to identify transactional patterns and filter clients for targeted product recommendations based on specific metrics.
► Achieved 96% accuracy with the multimodal classification model, which combines structured and text data.
► Utilized MLflow to streamline model training.
Vehicle Specification Clustering
(EDA + PCA + Clustering)
► Performed clustering analysis on a dataset of vehicles and their features and specifications to uncover patterns and relationships among them.
► Categorized vehicles into 4 meaningful clusters, each representing a group of vehicles within a car segment.
Telco Churn Prediction
(Feature Engineering + Classification)
► Predicted customer churn in the telecommunications industry.
► Analyzed the data to identify key subscription features and the actions needed to retain customers.
► Built a model that detects 53% of customers likely to churn.
House Price Prediction
(Data Wrangling + Regression)
► Performed detailed data wrangling to build a model that predicts a property's sale price from its size, features, condition and location.
► The resulting model predicts property sale prices with 92% accuracy.
Booking Cancellation Prediction
(EDA + Classification)
► Analyzed a booking cancellation dataset from the hospitality and tourism industry to predict room cancellations and minimize opportunity losses for hotels.
► Derived strategic recommendations from the data analysis for reducing opportunity losses and boosting profit through suitable marketing strategies.
► The resulting model correctly identifies 81% of room cancellations.
Income Evaluation Prediction
(EDA + Classification)
► Classified individuals by annual income range with 88% accuracy, supporting targeted product marketing, better resource allocation and reduced inefficiency.
► Conducted a comprehensive analysis of the labeled dataset to identify the key characteristics that distinguish high-income from low-income individuals.
Country & Region Similarity
(Clustering)
► Explored unsupervised machine learning by analyzing the World Factbook, a repository of diverse country-specific information.
► Categorized countries into meaningful clusters to facilitate informed decision-making by global stakeholders.
Olympic History Stats
(SQL + EDA)
► Conducted data querying and exploratory data analysis on a dataset containing athlete information and Olympic Games medal results from Athens 1896 to Rio 2016.
► Revealed patterns and trends, and identified the factors that affect an athlete's probability of winning.