Pranav Anand Joshi
Data Scientist II, Amazon
MS-Data Science || Michigan Tech
8.5 years of professional experience as a Data Scientist
Phone:
+1-906-370-8942
Email:
Address:
Bellevue, WA, United States
Favorite Quote:
"If you torture the data long enough, it will confess."
- Ronald Coase
Hello! I'm Pranav
I hold a Master's degree in Data Science from Michigan Technological University. At present, I serve as a Data Scientist II at Amazon Transportation Services (ATS). With over 8.5 years of professional experience, I've contributed to diverse organizations including Amazon, ORR Safety, J & J Vision, and Ramco Systems.
I am passionate about applying my analytical skills and constantly learning new technologies and methodologies to decode an organization's biggest asset, its data, and to add tremendous value both as a team player and as a leader. I'm also passionate about exploring questions within datasets and effectively conveying the answers to product managers, engineers, and business stakeholders. I have a keen desire to solve business problems, and I live to find patterns and insights within structured and unstructured data. I'm most motivated when striving for the 'near impossible' and would be thrilled with the opportunity to help the company reach its impressive growth target.
I'm an incredibly self-motivated and very curious person. I commit myself to my projects, and I'm always trying to make myself a little better.
This keeps me enrolled in AWS bootcamps and various online courses on Coursera, DataCamp, and Udacity to expand my knowledge and stay up to date with the latest trends and techniques used in the field.
My other interests include psychology, business literature, and cinema, ranging from thrilling Hollywood action films to heartfelt Bollywood romances. Music holds a special place in my heart, while dancing at night parties and playing weekend tennis and ping pong matches add relish to my life.
Apart from these passions, astrology runs in my blood: both my parents earned PhDs in the field of Astrology and Psychology.
EXPERIENCE
May 2021-Present
Data Scientist II
Amazon
- Led the development and deployment of "PYRUSS", a machine-learning-based model for 72-hour yard utilization prediction. PYRUSS supports 500+ yards across 30+ site types in NA (US/CA) and has reduced trailer pool adjustment (TPA) move costs by $146.5K/week since its launch, a $7.6MM annualized entitlement. The model aims to curtail site losses, optimize truck arrival scheduling, and ensure timely customer deliveries.
- Improved labor planning precision by implementing a DeepAR+ forecasting model across 400 warehouse sites, yielding a WAPE of 3.2% at the network level. This translated into projected savings of approximately $15 million.
- Design, develop, and maintain automated CI/CD pipelines that orchestrate the machine learning model training process. Enable A/B testing and integrate version control systems with CI/CD tools to ensure seamless code and model version management. Used AWS services such as AWS Lambda, SageMaker, EC2, AWS Glue, and EventBridge for data science architecture design and automation.
- Developed an ML model to identify the factors causing "Unsafe Driving" community issues and applied targeted solutions by generating delivery-station-level insights for both NA and EU, which reduced yearly losses from $50M to $20M.
- Created an automated scheduler for the CSI Community Survey initiative that assesses data quality and integrity, identifies deceptive surveys, and flags suboptimal responses using criteria such as speed, straightlining, red herrings, and open-ended responses.
June 2019-May 2021
Data Scientist
- Developed a Deep Q-learning-based RL model for the "Pricing Optimization Engine" project for distinct safety products, to maximize the organization's operating profit and to see how customers respond to different prices and services through different channels.
- Developed an Intelligent Automation platform to help the organization transform information-intensive business processes, reducing manual work and errors, minimizing costs, and improving customer engagement.
- Created a tool that enables business analysts to conduct rapid analytic studies on any classification or regression dataset. This analytic engine automatically explores data, mines patterns, and gives recommendations using distinct machine learning models.
- Contributed to holistic architecture design, drafting and executing test case scenarios (UAT), and quality assurance (QA) of systems and enhancements.
May 2018-May 2019
Machine Analytics Co-op
- Worked on the 'Hydration Bypass' project in the Machine Analytics (Data Science) team of the Quality Assurance department at Johnson & Johnson Vision Care (Vistakon). Here, I applied big data, machine learning, and deep learning techniques and improved Acuvue contact lens production efficiency by over 14%, with an estimated saving of $5.4MM/year. The project had two main goals:
  - Identifying the reasons for daily defects (average loss of 10,000 lenses daily) across 40 manufacturing lines;
  - Finding ways to reduce these defects and save millions of dollars, to achieve maximum profit and high efficiency.
- Developed a “Johnson Speech Recognition” framework in Python using natural language processing to pull large amounts of manufacturing data into a Tableau dashboard from a small amount of recorded target speech.
- Designed and developed statistical tests to provide conclusions to non-technical audiences, guiding them toward forward-thinking strategies.
May 2015-May 2017
Associate Data Scientist
- Developed ‘RamcoGEEK’, a facial-recognition employee attendance system, as a simple Python application based on OpenCV and dlib.
- Identified customer purchasing trends by building a ‘Customer segmentation’ in Python, using k-means clustering on data covering customers' annual spending across diverse product categories, which improved sales by 6%.
- Database design and development, data modelling, data wrangling, and data warehousing for Ramco's home-grown HRMS system within the SDLC, using SQL, Teradata SQL Assistant, and ETL tools (at the technical end).
- Interacted with prospective clients, gathered and documented requirements, and performed gap and fitment analysis (at the functional end) to address clients’ concerns effectively.
- Evaluated HRMS RFP releases and fitment responses, followed by preparation of cost estimates and proposals.
May 2014-Aug 2014
Internship
Bharat Heavy Electricals Limited (B.H.E.L)
I completed my summer internship on the project "Sales Analysis of stator winding bar" at Bharat Heavy Electricals Limited, Haridwar.
- Identified and examined industry and geographic trends through visualization to increase regional sales by over 10%.
- Gathered external data to analyze marketing strategies using exploratory data analysis and regression methods.
EDUCATION
Aug 2017-April 2019
MS Data Science
Michigan Technological University
GPA: 4.0/4.0 (Teaching Assistant)
- Developed machine learning models using linear and logistic regression, random forests, neural networks, support vector machines, principal component analysis, and cluster analysis.
- Machine learning project on “Movie Recommendation Engine” to recommend movies to end users on the MovieLens dataset using collaborative and content-based filtering algorithms. [RMSE: 1.06]
- Statistical methods project applying an MLR model to crime datasets (for particular cities) to predict criminal events for a specific time and place in the future. [RMSE: 10.42, Accuracy: 82.01%]
- Predictive modelling project on “Heart Disease prediction” to predict blood vessel narrowing due to heart disease using a Random Forest classifier. [Accuracy: 83%]
- Data visualization project on Conway’s Game of Life to see whether machine learning (or optimization) can predict the Game of Life in reverse using a random forest. [Accuracy: 87.65%]
- Time series project on “Beery Production Volume Forecasting” to predict the next 10 weeks of production based on 235 weeks of production data. [R²: 82.78%]
- Created extensive exploratory data visualizations using ggplot2, Bokeh, Seaborn, and Matplotlib to gain insights into the data sets.
- Tools and packages used: Python: Pandas, NumPy, Scrapy, Matplotlib, SciPy, scikit-learn; R: Shiny, dplyr, ggplot2, R Markdown, Google Analytics.
Aug 2011-May 2015
Bachelor's Degree
National Institute of Technology (NIT)
GPA: 7.33/10
I completed my undergraduate degree in Electrical & Electronics Engineering at NIT Uttarakhand, one of India's top-ranked institutes, second to the Indian Institutes of Technology (IIT).
- Completed a final-year project on a remote control for home appliances, designed for older people, with the objective of controlling any home appliance remotely.
- Completed summer internships at three companies, B.H.E.L Haridwar, I.T.C Haridwar, and T.H.D.C Hydro Power Dam, to study, analyse, and check the efficiency of distinct projects.
April 2010-May 2011
High School Diploma
Bal Mandir Senior Secondary School
During high school, my major field of study was PCM (Physics, Chemistry, and Mathematics). Physical education was also part of my curriculum.
SKILLS
Tools Used: AWS SageMaker, AWS Lambda, AWS EC2, AWS Step Functions, Cloud Desktop, Docker, Kubernetes, Alteryx, Power Automate, Azure ML Studio, Tableau, Power BI
Languages: Python, R, SQL, Spark
Machine Learning, Data Mining, Data Visualization, Artificial Intelligence
Statistical Programming & Analysis, Predictive Modelling, Regression Analysis, Time Series Forecasting
Information System Management & Data Analytics, Data Warehousing & Business Intelligence
Advanced: Microsoft Excel, MS Word, PowerPoint
COLLEGE PROJECTS
- Machine learning project on “Movie Recommendation Engine” to recommend movies to end users on the MovieLens dataset using collaborative and content-based filtering algorithms. [RMSE: 1.06]
- Predictive modelling project on “Heart Disease prediction” to predict blood vessel narrowing due to heart disease using a Random Forest classifier. [Accuracy: 83%]
- Data visualization project on Conway’s Game of Life to see whether machine learning (or optimization) can predict the Game of Life in reverse using a random forest. [Accuracy: 87.65%]
- Statistical methods project applying an MLR model to crime datasets (for particular cities) to predict criminal events for a specific time and place in the future. [RMSE: 10.42, Accuracy: 82.01%]
- Time series project on “Beery Production Volume Forecasting” to predict the next 10 weeks of production based on 235 weeks of production data. [R²: 82.78%]
- Big data project on "MapReduce" to find the most common words across a large set of customer review files.
- Web scraping project on the "Spotify API" to examine the popularity of songs using scatter plots and to check for missing data in the list of songs.
- Data visualization (Python) project on "Hurricanes Data" and "Climate Data Set" using the Pandas, NumPy, Matplotlib, ggplot, Seaborn, and Plotly libraries.
- Information System Management & data analytics project on "Product Distribution Company".
CERTIFICATIONS
- AWS Certified Machine Learning - Specialty
- Microsoft Azure Data Scientist (DP-100) - Microsoft (ongoing)
- A/B Testing Course - Udacity
- Microsoft Azure Fundamentals (AZ-900) - Microsoft
- Machine Learning Crash Course with TensorFlow - Google
- Intro to Python for Data Science - DataCamp
- Introduction to R programming - Udacity
- Data Visualization with Tableau - Coursera