I'm a freelance data analyst with experience using statistics, machine learning algorithms, and interactive dashboards to deliver accessible and actionable data solutions.
Get in touch
Email
LinkedIn
GitHub
27 September to 6 October 2023
Policing resources are scarce, and deploying assets without intelligence can put law enforcement officers and the public at risk. Awareness of policing needs is crucial for public safety and effective law enforcement. These interactive dashboards provide critical location- and time-based information, enabling decision-makers to allocate resources with precision.
See my Python code for data preparation here and my interactive dashboard on Tableau Public here.
2-13 September 2023
Salifort Motors suffers from high employee turnover, inflating costs associated with interviewing, hiring, and onboarding new employees. To develop an effective retention strategy, the HR department commissioned me to analyse their data set on employee churn to gain insight into factors affecting employee turnover.
Goals: Support decision-making for developing an employee retention strategy by (i) identifying key factors affecting employees leaving the company and (ii) constructing a model that accurately predicts which individual employees are likely to leave the company.
Result: Both goals were achieved: (i) the key factors affecting employees leaving the company are satisfaction level, time at the company, and workload, and (ii) a model was developed that correctly identifies about 93% of employees at risk of leaving the company (recall). Additional insights useful for developing a retention strategy were also extracted.
Exploratory data analysis (EDA) was conducted to clean and prepare the data set for predictive modelling. Features were extracted and selected iteratively in parallel with testing various binary classification ML models (logistic regression, naïve Bayes, decision tree, and a tree-based gradient boosting machine). The two tree-based models were highly successful at predicting employee turnover, with similar performance metrics (a precision of 97% and a recall of 92-93%). Both models identified the same factors as affecting employee turnover, namely satisfaction, time at the company, and workload.
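To make the two reported metrics concrete, here is a minimal sketch of how precision and recall are computed for a binary "left the company" label. The function name `precision_recall` and the toy labels are invented for illustration; they are not the project's code or data.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for the positive class (1 = employee left)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # leavers correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # stayers wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # leavers the model missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels, invented for illustration only (not project data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Precision answers "of the employees flagged as likely to leave, how many actually left?", while recall answers "of the employees who left, how many were flagged?". For a retention strategy, recall is usually the metric to watch, since a missed leaver tends to cost more than an unnecessary check-in.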
The identified factors (satisfaction, time at the company, and workload) did not have simple linear relationships to turnover. For example, employees within certain high and low satisfaction intervals were both more likely to leave. Thus, further investigation is required to determine the nature of the impact these factors have on employees leaving. This would be informative for developing a nuanced and effective employee retention strategy.
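One simple way to expose this kind of non-linear pattern is to bin the factor and compare the share of leavers per bin. The sketch below assumes satisfaction is scored on a 0 to 1 scale and uses toy values invented for illustration; the helper `leave_rate_by_bin` is hypothetical and not taken from the project code.

```python
from collections import defaultdict


def leave_rate_by_bin(satisfaction, left, n_bins=5):
    """Share of leavers in each equal-width satisfaction bin over [0, 1]."""
    totals = defaultdict(int)
    leavers = defaultdict(int)
    for score, has_left in zip(satisfaction, left):
        # Clamp a score of exactly 1.0 into the last bin.
        bin_index = min(int(score * n_bins), n_bins - 1)
        totals[bin_index] += 1
        leavers[bin_index] += has_left
    return {b: leavers[b] / totals[b] for b in sorted(totals)}


# Toy values, invented for illustration only (not project data).
satisfaction = [0.05, 0.10, 0.30, 0.35, 0.50, 0.55, 0.70, 0.75, 0.90, 0.95]
left = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

rates = leave_rate_by_bin(satisfaction, left)
for bin_index, rate in rates.items():
    print(f"bin {bin_index}: leave rate {rate:.2f}")
```

In this toy example the leave rate is elevated in both the lowest and the highest bin, which a single linear coefficient would not capture. This is why tree-based models, which split on intervals, can suit such data better than a plain logistic regression.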
Additional findings that support the goal but were not part of the initial project plan were also identified, mostly with regard to employee management. For instance, there doesn’t appear to be a clear process for promoting high-performing employees or for developing capacity in struggling employees.
See my Python code for constructing logistic regression, naïve Bayes, decision tree, and GBM models here.
9 June to 1 September 2023
Waze is a community-driven navigation app that helps millions of users get to where they’re going through real-time road alerts and an up-to-the-moment map. High user retention rates indicate satisfied users who repeatedly use the Waze app over time. This project aimed to develop a churn prediction model to help improve user retention, prevent churn, and grow Waze’s business.
This was a five-stage project, in which I was involved from the second stage.
The analysis established that the available data is insufficient for reliably predicting user churn and that more granular data on app usage and geography is needed. Within the available data, the strongest predictors of whether a user will churn or be retained were whether the user is a professional driver and how much the user uses the app in a month.
See my Python code for stages 2-5 here.
24 April to 3 May 2023
Employee absenteeism can have a significant impact on a company’s productivity, operational efficiency, and overall performance. Through a comprehensive analysis and a predictive model, this project supports Human Resources decision-making. It provides insights into employee absenteeism patterns and delivers actionable recommendations for businesses to optimise workforce management, enhance productivity, and improve employee satisfaction.
See my Python code here.