I am Prasanth, a Data Scientist at CVS Health Retail, where I specialize in leveraging data science and machine learning to drive business impact. My current role involves demand forecasting, inventory optimization, and developing data-driven solutions to enhance operational efficiency and support strategic decision-making in the retail sector.
I hold a Master’s in Computer Science from Syracuse University, graduating with a GPA of 3.93/4.00, and a Bachelor’s in Computer Science and Engineering from NITK Surathkal. My academic achievements are complemented by a strong foundation in data science and engineering principles.
Before joining CVS Health, I worked as a Software Engineer and as a Data Scientist at McAfee and Trellix, where I gained extensive experience in product data analysis, automation, quality assurance, and data engineering. Notably, I developed an automation framework for McAfee's Network Security Platform (NSP) that streamlined testing and uncovered critical bugs, earning accolades from senior leadership. I also automated performance testing by integrating APIs to scale network traffic simulations efficiently.
My expertise spans time-series analysis, anomaly detection, and building scalable machine learning models. At McAfee, I focused on designing data pipelines and detecting anomalies in network traffic. Currently, I am expanding my skills in advanced machine learning topics, including Large Language Models (LLMs) and recommendation systems, to stay at the forefront of innovation in the data field.
I am passionate about transforming data into actionable insights and delivering measurable business value. I look forward to contributing to impactful projects that harness the power of data science and analytics.
Syracuse University, New York
Master of Science | Computer Science and Engineering | Specialization: Data Science
Aug '22 - May '24
GPA : 3.93 / 4.00
Relevant Coursework:
•
[CIS735] - Machine Learning for Security
•
[CIS675] - Design and Analysis of Algorithms
•
[CIS662] - Machine Learning
•
[CIS657] - Principles of Operating Systems
•
[CIS655] - Computer Architecture
•
[CIS628] - Cryptography
•
[CIS623] - Structured Programming and Formal Methods
•
[CIS600] - Social Media and Data Mining
•
[CIS600] - Applied Natural Language Processing (NLP)
•
[CIS563] - Data Science
National Institute of Technology Karnataka, Surathkal
Bachelor of Technology | Computer Science and Engineering
July '16 - July '20
Data Scientist, CVS Health Remote USA
August '24 - Present
•
Developed and optimized demand forecasting models for 11+ product categories in CVS Retail front store supply chain, enhancing inventory management and product availability.
•
Backtested and analyzed seasonal trends to refine Prophet-based forecasting models, reducing MAPE by 12%.
•
Conducted A/B testing for 10% of CVS stores to validate forecasting models, ensuring accuracy and effectiveness.
•
Contributed to a Digital Twin initiative by reverse engineering Blue Yonder’s (BY) forecasting model, enhancing scalability, improving demand prediction accuracy, and streamlining inventory management.
•
Designed advanced forecasting strategies for new product launches using analogous product data, leading to a projected $250M reduction in inventory costs. Led the initiative from PoC to MVP and am now driving it to production.
•
Utilized Azure Databricks and Snowflake to optimize data workflows, employing efficient clusters and high-performance queries to support large-scale data analysis and reporting.
•
Conducted extensive Exploratory Data Analysis on over 2 billion transaction records from 7,000 CVS front stores and 60,000 SKUs, uncovering key insights to drive data-driven decisions and improve inventory and sales strategies.
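The MAPE figure quoted in the backtesting bullet above is a mean absolute percentage error. As a minimal sketch of how two backtested forecasts might be compared on that metric, the snippet below uses made-up weekly sales numbers; the `mape` helper, the sample data, and the choice to skip zero-demand periods are all assumptions for illustration, not the production pipeline.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Periods with zero actual demand are skipped, since the percentage
    error is undefined there (an assumption of this sketch).
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast) if a != 0]
    if not errors:
        raise ValueError("no periods with non-zero actual demand")
    return 100.0 * sum(errors) / len(errors)


# Hypothetical weekly unit sales and two competing backtest forecasts.
actual = [100.0, 120.0, 80.0, 100.0]
baseline = [90.0, 132.0, 88.0, 110.0]
refined = [95.0, 126.0, 84.0, 105.0]

baseline_mape = mape(actual, baseline)
refined_mape = mape(actual, refined)
```

Comparing `baseline_mape` with `refined_mape` over the same holdout window is one simple way to express an improvement between model versions.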
ML intern & Graduate Teaching Assistant, Syracuse University Syracuse, USA
August '23 - July '24
•
Leveraged SQL to query datasets containing 10 million+ time series records, extracting key metrics that supported strategic decision-making, resulting in a 15% increase in revenue.
•
Applied supervised learning algorithms, including Linear Regression, Logistic Regression, Decision Trees, Random Forest, Naive Bayes, and Deep Learning models, to enhance predictive analytics capabilities.
•
Implemented deep learning algorithms, including RNN and LSTM, to forecast time series data, improving forecasting accuracy and model performance.
•
Utilized large language models (LLMs), including OpenAI’s GPT-3 and GPT-4, for applications such as content generation, natural language understanding, and conversational AI, driving improvements in natural language tasks.
•
Implemented CI/CD pipelines using CML and GitHub Actions to automate the end-to-end machine learning lifecycle, ensuring streamlined model deployment and updates.
•
In coordination with Professor J.J. Waclawski, lectured a class of over 60 students, focusing on software implementation.
•
Taught unit testing in Java, Python, and C++ using the JUnit, pytest, and Google Test frameworks. [code]
•
Managed 14 group projects, applying project planning strategies and organizing work into sprints.
•
Designed efficient priority and affinity scripts for optimizing process scheduling in Operating Systems. [code]
Data Scientist, McAfee Bangalore, India
August '20 - July '22
Data Analysis and Automation
•
Developed scalable Python microservices and data solutions using Agile and Test-Driven Development (TDD) to ensure robust backend services for McAfee’s product releases.
•
Conducted data analysis to identify datasets, source data, metadata, definitions, and formats.
•
Led the collection and processing of data from over 40,000 McAfee enterprise customers, applying statistical methods and machine learning techniques to ensure data quality for predictive and optimization models.
•
Worked with software engineering teams to deploy predictive models into production, ensuring seamless integration and validation using large historical datasets.
•
Worked on an automation framework using Python and REST APIs to automate NSP release testing.
•
Collaborated with product managers, data scientists, and engineers to optimize product features through data-driven methodologies, influencing the strategic direction of McAfee’s software offerings.
Machine Learning Research Intern, NITK Surathkal Mangalore, India
AI-Based URL Detection
Data mining, Python, URLs, Feature Engineering, NLP, ML [code]
The project aims to detect URLs using AI, trained on a Kaggle dataset.
•
Various classification algorithms were trained on over 600,000 URLs.
•
More than 22 features were engineered and added to the training model to improve accuracy and detection.
•
The project was successfully integrated into the Django framework and hosted in the cloud.
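As a rough illustration of the feature engineering step in the URL detection project, the sketch below derives a handful of lexical features from a raw URL using only the standard library. The feature names and the `url_features` helper are hypothetical; the 22+ features actually used in the project are not listed here.

```python
import re
from urllib.parse import urlparse

# Illustrative feature set only; the project's actual 22+ features are not
# listed in this document.
_IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def url_features(url: str) -> dict:
    """Turn a raw URL string into a flat dictionary of lexical features."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens": url.count("-"),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        "host_is_ipv4": bool(_IPV4.match(host)),
        "path_depth": len([p for p in parsed.path.split("/") if p]),
        "num_query_params": len([q for q in parsed.query.split("&") if q]),
    }


# Made-up URL used purely to exercise the function.
example = url_features("http://192.168.0.1/login/verify?user=a&id=7")
```

A dictionary per URL like this can be stacked into a feature table and passed to any standard classifier.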
Reddit-Based Sentiment Analysis of Israel-Palestine Conflict
Data mining, Python, Sentiment Analysis, VADER, Time series illustrations, plots [code]
The project aims to analyze sentiment around the Israel-Palestine conflict using Reddit data. The key features and milestones of the project are the following:
•
Used the Reddit API (PRAW) to extract data from Reddit; the dataset consists of 50,000+ comments along with descriptions, owner info, and other metadata.
•
Performed data pre-processing and employed the VADER sentiment analyzer to categorize Reddit posts into 3 categories: positive, negative, and neutral sentiments, enhancing understanding of sentiment trends.
•
Analyzed Reddit sentiment trends, correlating them with real-time events, user spikes, and disruptions. Identified over 10 observations, including density and word counts, to provide insights into public sentiment on the Israel-Palestine conflict.
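The three-way categorization described above can be sketched as a mapping from VADER's compound score to a label. The ±0.05 cut-offs below are the thresholds conventionally suggested in the VADER documentation; whether the project used exactly these values is an assumption, and the scores are invented. With the `vaderSentiment` package installed, the compound score for a comment would come from `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]`.

```python
from collections import Counter


def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER compound score in [-1, 1] to one of three labels."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical compound scores, as a sentiment analyzer might return for
# five comments; these are not results from the project.
scores = [0.62, -0.48, 0.01, -0.05, 0.05]
labels = [label_from_compound(s) for s in scores]
counts = Counter(labels)
```

Counting labels per day or per week in this way gives the kind of series that can be plotted against real-world events.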
Priority and affinity in Operating System
C Programming, Operating Systems, Process Scheduling, Priority and affinity [code]
•
Wrote CPU affinity code to demonstrate setting the priority and affinity of a process.
•
The program takes an input file containing the process burst times, priorities, and the core numbers on which these processes are to execute.
•
This project enables assigning specific cores to the few processes that need zero waiting time and real-time execution, bypassing the operating system's usual scheduling.
•
Utilized Snowflake and Apache Airflow for efficient ETL processes, extracting feature information and monitoring firewall attacks from 40,000+ McAfee customers’ servers.
•
Deployed Kubernetes to host 20+ versions of McAfee software in containers. Integrated Apache Airflow with Apache File Sensor for extracting debug files from customer servers. Leveraged Kubernetes for seamless deployment and data extraction, driving product enhancement initiatives.
•
The project served as a customer software information retrieval system and supported NSP decision-making during software update releases.
Wearable Device-Based Security: ML Authentication
Python, ML Algorithms, Double KNN [code]
•
Conducted analysis on head and torso movement data collected from 34 individuals as time series data.
•
Evaluated multiple classification algorithms (KNN, SVM, decision trees, random forests) to identify the most accurate algorithm for classifying head and torso movements.
•
Implemented Double KNN algorithm and a hybrid combination of classification algorithms to leverage their strengths for improved accuracy and performance in classification tasks.
•
Evaluated various ML models, identifying a top-performing model with a 29% correlation, indicating that authentication from the two wearable devices is independent, enhancing security.
Machine Learning in Agricultural Crop Management and Disease Detection
R Programming, SQL, RapidMiner, Machine Learning Algorithms
•
The project aims to showcase the practical applications of machine learning within agricultural production systems.
•
The project's focus encompasses various aspects of crop cultivation, such as yield prediction, disease detection, weed detection, crop quality assessment, and species recognition.
•
Led analysis of 20 years of crop data from the Indian government, encompassing 500k+ records from various states. Collaborated with a team of 3 to leverage RapidMiner and SQL for feature extraction and processing.
•
The machine learning model (SVR Anova) deployed in the project achieved an accuracy of approximately 84% in predicting crop yields, surpassing the other models, which reached only 75%.
•
Multiple machine learning models were employed in the project, including Support Vector Machines (SVM), Artificial Neural Networks (ANN), Bayesian Models (BM), and clustering techniques, to address specific agricultural production challenges and improve overall performance.
Automation of Performance testing for McAfee NSP using IXIA BPS
Python, SQL, Git, Selenium, Ixia BreakingPoint, REST API
•
Led an 11-month project automating network security performance testing using Ixia BreakingPoint APIs and Python, which significantly reduced testing time from 6 to 2 hours per device and 2 weeks per release cycle.
•
Led a team of 3, ensuring successful project integration with the existing automation framework.
•
Developed comprehensive results reporting through a user-friendly UI and private server.
•
Recognized as a top-3 initiative in the network security team.
Customer Database Backup Automation and Bug Detection
Python, SQL, Git, Selenium, REST API
•
Developed Python automation code to test customer DB backups during software releases at McAfee.
•
Integrated code into testing framework for comprehensive cross-software deployment and functional testing.
•
Improved the code segment that verifies software functionality after version changes in backups, detecting a critical bug, averting a potentially catastrophic failure, and safeguarding $50 million in device value.
Online Examination System
PHP, MySQL, Apache HTTP, HTML/CSS, AWS [code]
•
Designed a web-based platform using PHP, making it dynamic and interactive for hosting online examinations, with flexible support for various user roles.
•
The system is hosted on the XAMPP platform, utilizing an Apache server for web hosting and database management, ensuring a seamless online examination experience.
•
The system emphasizes data security and employs Role-Based Access Control (RBAC) to ensure authorized access for admin, teachers, and students.
•
It provides user-friendly panels for managing users and subjects, conducting exams, and includes a feedback system for user input, enhancing usability.
•
It provides a redundancy-based improvement to the present lossless data compression approach. LZW, a lossless data compression method, is combined with the BCH encoding process in the suggested approach. The new merging method reduced the size of an image file by 22 percent.
•
Improved the Lempel–Ziv–Welch algorithm by combining it with the well-known Bose–Chaudhuri–Hocquenghem algorithm, which corrects multiple bit errors.
•
Reference: https://www.scirp.org/journal/paperinformation.aspx?paperid=23911
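For context on the LZW stage mentioned in the compression bullets above, here is a minimal sketch of textbook LZW over bytes with an unbounded dictionary. It does not include the BCH stage or the combined method from the referenced paper; the function names and sample input are illustrative assumptions.

```python
from typing import List


def lzw_compress(data: bytes) -> List[int]:
    """Encode bytes as a list of LZW dictionary codes."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes: List[int] = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            # Emit the longest known prefix and learn the new sequence.
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: List[int]) -> bytes:
    """Rebuild the original bytes from LZW codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code refers to the entry being built in this very step.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        output.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(output)


sample = b"TOBEORNOTTOBEORTOBEORNOT"
encoded = lzw_compress(sample)
restored = lzw_decompress(encoded)
```

Because the decoder rebuilds the same dictionary as the encoder, no table needs to be transmitted alongside the codes.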
Programming
Python, C/C++, Haskell, SQL, Java, Bash, HTML/CSS, R
Developer Tools
Git, Docker, Kubernetes, AWS, Jenkins, Maven, VS Code, VMware, PyCharm, IntelliJ, Eclipse
Data Analysis and ML
Spark, Kafka, NLTK, Pandas, NumPy, PySpark, Matplotlib, Scikit-learn, TensorFlow, Keras, PyTorch, RapidMiner, AWS S3, AWS Glue, AWS Athena, Databricks, Tableau, K8s
Testing
Regression, Performance, Stability, Integration, Unit, Functional, Sanity, Penetration, Acceptance, Upgrade, End to End
Database Technologies and Web Development
MySQL, RDS, PostgreSQL, MongoDB, Django.
•
I have been awarded a graduate position at Syracuse University in recognition of my consistently high CGPA, showcasing my dedication to academic excellence.
•
Received a prestigious gold medal🥇 at the International Mathematical Olympiad, a globally recognized competition that showcases exceptional mathematical talent and problem-solving skills.
•
Secured the 4th position in the state-wide Little Champs Education Academy Exam.
•
Achieved the 2nd🥈 position in the Ramanujan Talent Search Test in Mathematics at the county level.
•
Recipient of a $2,000 scholarship from the Government of India for achieving a Top 2000 rank in the IIT-JEE.
•
Achieved the 49th State Rank in the Common Entrance Exam for Polytechnic.
•
Received recognition and accolades for achieving an outstanding All India Rank of 1082 out of 2 million applicants in the highly competitive JEE 2016, conducted by the Government of India.
Recognition
•
Earned 👏 high praise and commendation from Martin Stecher, distinguished McAfee Fellow and Head of Engineering, for delivering substantial and impactful contributions that propelled framework enhancements forward, resulting in a 25% increase in efficiency and a 30% reduction in system errors.
•
Recognized as a distinguished individual, having achieved the rare distinction of being featured twice in McAfee's monthly spotlights for outstanding contributions to their products.