Data Scientist | Infosys Limited
Feb 2023 – May 2025
As a Data Scientist, I delivered end-to-end advanced analytics and machine learning solutions for a UK insurance client, transforming raw data into predictive insights and business value. My role spanned data analysis, modeling, visualization, cloud deployment, and stakeholder engagement, ensuring scalable and impactful solutions.
Key Responsibilities:
- Extracted, transformed, and analyzed 10M+ rows of insurance policy, claims, provider, and member data using Oracle SQL and Azure Databricks for advanced EDA and reporting.
- Designed and deployed Power BI dashboards on claims, provider networks, fraud detection, and customer retention, reducing manual reporting by 40%.
- Improved dashboard responsiveness by 30% through DAX tuning and optimized data modeling, driving higher adoption across business units.
- Built and deployed fraud detection models that flagged high-risk claims, saving an estimated £2M annually.
- Developed predictive models for hospitalization and readmission risk, reducing readmissions by 10% in pilot programs.
- Applied NLP techniques to customer service transcripts to identify pain points, improving call-centre resolution time by 15%.
- Orchestrated the end-to-end ML lifecycle using Azure Data Factory, Azure Data Lake, Azure Databricks, and Azure DevOps, embedding automation into data pipelines.
- Collaborated with cross-functional teams (data engineers, analysts, product owners) and drove delivery using Agile/Scrum methodology.
Tech Stack: Python, PySpark, Power BI, SQL, Azure Databricks, Azure Data Factory, Azure Data Lake, Azure DevOps, NLP, Machine Learning
Junior Data Scientist | Infosys Limited
Feb 2022 – Jan 2023
As a Junior Data Scientist, I contributed to data analysis, modeling, and automation efforts on large-scale insurance datasets. I focused on data preparation, predictive modeling, and dashboarding, laying the foundation for advanced use cases in fraud detection and customer retention.
Key Responsibilities:
- Wrote complex SQL queries and Python/PySpark scripts for data cleansing, manipulation, and preprocessing of policy, claims, provider, and member data.
- Built and tested predictive models (churn prediction, policy renewal, and claim risk scoring) and deployed them into production with Azure ML and Databricks.
- Developed and containerized churn and policy renewal models with Docker, enabling underwriting teams to target high-risk customers with personalized offers.
- Designed claims-focused dashboards in Power BI to track fraud patterns, claim distributions, and risk exposure, giving leadership clear visibility into KPIs.
- Supported model deployment and scalability using Azure ML pipelines, ensuring seamless integration with claims and customer service systems.
- Collaborated with business analysts and project leads across geographies to complete Agile sprint deliverables on time.
Tech Stack: Python, PySpark, SQL, Power BI, Docker, Azure Databricks, Azure ML, Data Visualization, Predictive Modeling