Nestle USA

Senior Analyst (Dec '24 - Present)

Business Intelligence & Dashboarding: 

Designed and deployed 15+ interactive dashboards using Power BI, Tableau, Looker, and Excel, improving performance tracking and reporting efficiency by 25%. Developed reusable semantic models to support ad hoc analysis and self-service capabilities, saving over 200 hours annually through the Commercial Intelligence Hub initiative.

Cross-Functional Collaboration & Agile Delivery: 

Partnered with Sales, Finance, and Customer Operations teams to translate business needs into scalable reporting solutions. Ensured alignment with UX best practices, centralized design systems, and Agile/Scrum methodologies to deliver high-impact data products.

ETL & Data Engineering: 

Built and optimized data pipelines using SQL (T-SQL, PostgreSQL), Azure Data Factory, Python (Pandas), Power Query, and Alteryx, reducing manual processing by 40% and accelerating refresh times by 35%. Automated recurring workflows to support data readiness for analytics.
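A minimal sketch of the kind of Pandas cleaning step such a pipeline contains (the column names and data here are illustrative, not taken from the actual Nestle pipelines):

```python
import io
import pandas as pd

# Illustrative raw extract: inconsistent headers, a duplicate row,
# and string-typed amounts -- typical issues a transform step handles.
raw_csv = io.StringIO(
    "Order ID,Region , Amount\n"
    "1001,East,250.00\n"
    "1001,East,250.00\n"
    "1002,West,125.50\n"
)

def clean_orders(src) -> pd.DataFrame:
    df = pd.read_csv(src)
    # Normalize column names so downstream steps are stable.
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    df = df.drop_duplicates(subset="order_id")
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    return df

orders = clean_orders(raw_csv)
```

In a production pipeline this function would sit behind an orchestrator (Azure Data Factory or Power Query in the roles above) rather than run ad hoc.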

Data Quality & Monitoring: 

Implemented data validation and anomaly detection routines using Snowflake, Python, and Alteryx, reducing inconsistencies and report rework by 30%. Established proactive data quality checks to ensure trust in reporting and insights.
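One common anomaly-detection routine of this kind is a z-score check on daily volumes; the sketch below uses only the standard library and illustrative numbers (the real routines ran against Snowflake data):

```python
import statistics

def flag_anomalies(values, threshold=2.0):
    """Return indices of points more than `threshold` standard
    deviations from the mean -- a simple z-score anomaly check."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]

daily_volumes = [100, 102, 98, 101, 99, 100, 500]  # last point is a spike
suspect = flag_anomalies(daily_volumes)
```

Flagged rows would then be quarantined or surfaced in a data-quality report before they reach downstream dashboards.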

Documentation & Delivery Support: 

Participated in UAT coordination and post-deployment hypercare; documented data architecture, business logic, source-to-target mappings, and data dictionaries using Confluence, JIRA, and Git, ensuring long-term maintainability and smooth knowledge transfer.




Montclair State University

Research Assistant - Machine Learning (Jan '24 - Dec '24)


As a Research Assistant specializing in Machine Learning, I worked on a project leveraging machine learning to detect malware in npm packages. The role covered the full pipeline, from data collection through model building to real-time deployment.


Malware Detection & Risk Scoring: 

Developed and deployed ML-based malware detection pipelines using decision trees and random forests in scikit-learn, achieving over 85% precision and an 88% F1-score. Designed real-time threat monitoring systems with Apache Kafka and TensorFlow, flagging 1,000+ suspicious npm packages monthly and automating risk scoring workflows.
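The core classification step can be sketched in a few lines of scikit-learn; the features below (eval usage, install scripts, string entropy) are hypothetical stand-ins for the real feature set, and the toy data is illustrative only:

```python
from sklearn.ensemble import RandomForestClassifier

# Toy feature matrix: [uses_eval, num_install_scripts, string_entropy]
# (hypothetical features; the production pipeline used a richer set).
X = [
    [0, 0, 2.1], [0, 0, 1.9], [0, 1, 2.4], [0, 0, 2.0],  # benign packages
    [1, 2, 6.8], [1, 1, 7.2], [1, 2, 6.5], [0, 2, 7.0],  # malicious packages
]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = benign, 1 = malicious

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
```

The reported 85%+ precision and 88% F1 came from evaluation on held-out real packages; the fitted model here merely shows the scoring interface used by the risk-scoring workflow.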

Large-Scale Data Collection & Processing: 

Collected and processed 100K+ npm package metadata entries, including 10M lines of JavaScript code and 50K+ dependency graphs. Utilized Python, BeautifulSoup, and custom scripts to parse and clean data for malware signature extraction.
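Extracting dependency-graph edges and install-hook signals from package metadata needs only the standard library; the package below is hypothetical:

```python
import json

# Illustrative package.json-style metadata (hypothetical package).
metadata = json.loads("""
{
  "name": "left-pad-utils",
  "version": "1.0.2",
  "scripts": {"postinstall": "node setup.js"},
  "dependencies": {"lodash": "^4.17.21", "request": "^2.88.0"}
}
""")

def extract_edges(pkg):
    """Return (package, dependency) edges for a dependency graph."""
    return [(pkg["name"], dep) for dep in pkg.get("dependencies", {})]

def has_install_hook(pkg):
    """Install-time scripts are a common malware signal in npm packages."""
    hooks = {"preinstall", "install", "postinstall"}
    return bool(hooks & set(pkg.get("scripts", {})))

edges = extract_edges(metadata)
```

Run over 100K+ packages, functions like these produce the dependency graphs and per-package signals fed to the classifiers above.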

Natural Language Processing & Static Analysis: 

Applied NLP techniques with spaCy on 500K+ README files, descriptions, and code comments to extract behavioral patterns and semantic features. Integrated insights into static code analysis pipelines for enhanced contextual threat detection.
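The actual pipeline used spaCy; as a dependency-free illustration of the idea, turning README text into numeric features can be sketched like this (the term list is hypothetical, not the real feature vocabulary):

```python
import re
from collections import Counter

# Illustrative cue terms (hypothetical list, not the real feature set).
SUSPICIOUS_TERMS = {"keylogger", "exfiltrate", "obfuscated", "credential"}

def readme_features(text: str) -> dict:
    """Tokenize a README and count behavioral cue terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    return {
        "num_tokens": len(tokens),
        "suspicious_hits": sum(counts[t] for t in SUSPICIOUS_TERMS),
    }

features = readme_features(
    "This package silently collects credential data and can exfiltrate logs."
)
```

spaCy replaces the regex tokenizer with proper linguistic processing (lemmas, entities, dependency parses), but the output is the same shape: numeric features merged into the static-analysis pipeline.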




BYJU'S - The Learning App

Data Engineer (Apr '20 - Jun '22) 

At BYJU'S, I worked on various data-driven projects aimed at improving student performance, engagement, and overall business efficiency. Key contributions include:

ETL Development & Pipeline Optimization:

Designed, developed, and maintained scalable ETL pipelines and data ingestion workflows using PySpark, SQL, and Apache Airflow, ensuring 100% SLA compliance for business-critical reporting and analytics. Improved data freshness and delivery accuracy across cross-functional analyst and product teams.

Big Data Transformation & Performance Engineering: 

Built and optimized complex data transformation processes to parse and cleanse millions of records daily using Hive and Amazon Redshift, increasing pipeline efficiency by 35% and meeting both functional and non-functional data quality requirements.

Data Visualization & Stakeholder Reporting: 

Collaborated with product managers and analysts to translate business needs into technical requirements, delivering 20+ executive dashboards and analytics reports in Power BI and Tableau. Enabled data-driven decisions for personalized learning modules and subscription optimization, improving forecasting accuracy and engagement metrics.

Monitoring, Automation & DataOps: 

Monitored pipeline health using custom logging, alerts, and SLA tracking, achieving 99.9% SLA adherence. Automated manual ETL tasks and implemented data partitioning and incremental load strategies, reducing data latency by 40% and enhancing overall query performance.
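The incremental-load strategy mentioned above follows a standard watermark pattern; a minimal sketch with illustrative data:

```python
# Watermark-based incremental load: each run processes only rows whose
# event time is newer than the last successfully loaded timestamp.
def incremental_load(rows, last_watermark):
    """rows: iterable of (event_time, payload).
    Returns (new_rows, new_watermark)."""
    new_rows = [(ts, p) for ts, p in rows if ts > last_watermark]
    new_watermark = max((ts for ts, _ in new_rows), default=last_watermark)
    return new_rows, new_watermark

source = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
batch1, wm = incremental_load(source, last_watermark=0)   # initial full load
source.append((5, "e"))
batch2, wm = incremental_load(source, last_watermark=wm)  # only the new row
```

Persisting the watermark between runs (in a control table or orchestrator variable) is what lets each execution skip already-loaded data, which is where the latency reduction comes from.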

Streaming & Real-Time Analytics: 

Developed and maintained real-time data pipelines integrating third-party APIs and internal microservices using Hadoop, Kafka, and Apache Spark. Enabled distributed data processing and real-time engagement tracking, improving insights into user behavior by 30% and accelerating feedback loops.




Enercast GmbH

Data Scientist (Oct '17 - Mar '20)

At Enercast GmbH, I worked on predictive analytics and renewable energy forecasting, delivering key improvements for various clients:

Renewable Energy Forecasting & Predictive Modeling: 

Wind Energy Forecasting (AP TRANSCO): Enhanced forecast accuracy from 94% to 97.6% across 1.3 GW of assets using Linear Regression, Random Forests, Gradient Boosting (GBM), and custom deep learning models (LSTM, CNN). Performed extensive EDA and feature engineering using Pandas and NumPy; leveraged RapidMiner for automated preprocessing, anomaly detection, and data quality assurance.

Solar Generation Analytics & Time Series Forecasting: 

Solar Forecasting (TS TRANSCO): Increased model accuracy by 20% by shifting to 15-minute interval time series and applying XGBoost, Support Vector Regression (SVR), and ensemble methods. Built real-time data ingestion and transformation pipelines using Apache Spark and designed actionable Power BI and Tableau dashboards to support decision-making for grid management.
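The shift to 15-minute intervals amounts to resampling the generation series; a minimal Pandas sketch with illustrative readings:

```python
import pandas as pd

# Hourly solar generation readings (illustrative values, MW).
hourly = pd.Series(
    [0.0, 12.0, 40.0],
    index=pd.date_range("2019-06-01 06:00", periods=3, freq="h"),
)

# Upsample to 15-minute intervals with linear interpolation, giving the
# forecasting models four times as many points per day to learn from.
quarter_hourly = hourly.resample("15min").interpolate()
```

In the real pipeline the finer-grained series was produced by Spark ingestion rather than in-memory Pandas, but the resampling logic is the same.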

Asset Performance Monitoring & Interactive Reporting: 

Designed and deployed an Asset Management Portal for the Greenko Group, integrating Enercast analytics APIs and historical SCADA data, which improved maintenance scheduling by 25% for over 3 GW of wind and solar assets. Created interactive dashboards and visualizations using Plotly, Matplotlib, and D3.js, enabling cross-functional stakeholders to monitor turbine health and performance in near real time.

Risk Assessment & Model Evaluation: 

Applied classification techniques (Decision Trees, Logistic Regression) via Scikit-Learn for predictive maintenance and component failure detection. Improved model robustness by 15% through A/B testing and evaluation using Precision, Recall, F1-score, and ROC-AUC metrics; automated model retraining pipelines for continuous learning.
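The evaluation metrics named above reduce to counts of true/false positives and negatives; a self-contained sketch with illustrative labels:

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall and F1 for a binary failure-detection task
    (1 = component failure, 0 = healthy)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

p, r, f1 = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Tracking these per model variant during A/B testing is what makes robustness improvements measurable rather than anecdotal.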

Cloud Deployment & Big Data Processing: 

Containerized ML workflows with Docker and deployed at scale using Kubernetes on AWS and Google Cloud Platform (GCP), reducing model inference time by 30%. Utilized Hadoop, Spark, and Airflow for batch and streaming data workflows, handling terabyte-scale energy telemetry datasets for real-time forecasting and risk monitoring.