BHARGAV MISHRA

90 Callahan Ct, Newark, NJ 07103 · (646) 215-0173 · bpm23@njit.edu

About Me

A polymath with a conspicuous personality, a capricious heart, and a whimsical sense of humour.

Data Science professional with about 3 years of experience in Data Analysis and Database Management Systems. I am keen to continue developing my career in the field of Data Science. I have always been a numbers person, with exceptional mathematics and computer skills. I am proficient in several data management systems and software, including Alteryx, MS SQL Server, and Big Data technologies like Hadoop. Moreover, statistical significance, A/B testing, and data-driven optimization are the rhythm of the drumbeat I march to.

On a personal level, I am detail-oriented, organized, and precise in my work; the only thing cleaner than my room is my spreadsheets. I have strong communication skills with a flair for clear and illuminating presentation. I’m comfortable on my own facing the numbers, but I really enjoy being part of a diverse and enthusiastic team and learning from it.

Eager to work in an organization that will utilize my ability and technical skills and provide an opportunity to handle challenges and learn new technologies. A proactive learner with a knack for adopting emerging technologies. Effective communicator with strong analytical, problem-solving, and organizational abilities.


Education

New Jersey Institute of Technology

Master of Science
Data Science

GPA: 3.25/4

Aug 2018 - May 2020

Centre for Development of Advanced Computing

Post Graduate Diploma
Big Data Analytics

Grade: B

Feb 2018 - Aug 2018

Gujarat Technological University

Bachelor of Engineering
Electronics & Communication

CGPA: 7.1/10

Aug 2011 - May 2015

Skills

  • Programming Languages:- Python, R, Java, SQL, PL/SQL, T-SQL
  • RDBMS Databases:- MS SQL Server, MySQL, Oracle 10g
  • NoSQL Databases:- Cassandra, MongoDB
  • Libraries:- NumPy, Pandas, SciPy, Scikit-learn, spaCy, NLTK, OpenCV, BeautifulSoup4, PySpark, Matplotlib, Seaborn, Plotly, Folium, Streamlit, ggplot2, dplyr, caret, TensorFlow, Keras
  • Machine Learning:- Linear and Non-Linear Regression, Logistic Regression, KNN, SVM, Decision Tree, Random Forest, AdaBoost, Gradient Boost, XGBoost, K-Means, CART, Neural Network, Naïve Bayes
  • Statistics:- Hypothesis Test, A/B Testing, T-Test, Z-Test, PCA, Monte Carlo Simulation, ARIMA, ANOVA, Chi-Square
  • Big Data:- Hadoop, Elastic MapReduce, Kafka, Apache Spark, Apache Airflow, PySpark, Pig, Hive
  • BI Tools:- Tableau, Power BI, Qlik Sense, Google Data Studio
  • Cloud Platforms:- Amazon Web Services (EC2, S3, SageMaker), Azure (Cosmos DB)
  • Other Tools:- Alteryx, Matillion, SSIS, SSRS, R, RapidMiner, Minitab, SPSS, Excel, SDLC, UML Modeling, Visual Studio, Microsoft Office, JIRA, SAS, Jupyter Notebook, Git

Experience

Data Associate

New Jersey Institute of Technology
  • Designed the framework and programming interface to generate billing reports for multiple departments
  • Assisted with automation of the finance department’s manual processes by writing VBA code and using macros and formulas to speed up processes and improve accuracy
  • Created a VBA program to automatically update the ResLife department's Excel sheet, reducing complexity by 90%
  • Performed Root Cause Analysis (RCA) for multiple issues and documented each with reliable solutions, reducing future response time by up to 75%
Jul 2019 - Mar 2020

Data Engineer/ Analyst

Puzzles Agro
  • Worked in the Analytics department of Puzzles Agro, one of Gujarat’s fastest-growing FMCG companies in the MSME sector, on market research
  • Identified market gaps between rural and urban areas
  • Targeted audiences through social media marketing using Facebook Ads
  • Automated ETL processes across millions of rows of data, reducing manual workload by 50%
  • Ingested data from disparate sources such as email, the Google AdWords API, and the Facebook API into Amazon Redshift
  • Migrated and validated data from an Oracle database to Snowflake using the Matillion Database Query component
  • Fetched XML and JSON data from REST APIs into Snowflake for further data analysis
  • Designed and developed ETL workflows to translate business logic from one system to another for use by BI reporting tools
  • Designed and implemented end-to-end AWS pipelines to orchestrate machine learning models
  • Spearheaded in-depth analysis of marketing email campaigns and built a Random Forest model to optimize future campaigns
Jan 2017 - Dec 2017

Associate Database Developer

Jack Solutions
  • Involved in designing and creating an RDBMS based on the client's requirements using Oracle Database
  • Developed stored procedures and functions to extract data from flat text files into the operational database
  • Built and deployed ETL on development, testing, and production servers using Oracle Data Integrator
  • Implemented a data warehouse model and tuned SQL queries, triggers, procedures, and views using Oracle Warehouse Builder, improving query performance by 20%
  • Involved in multi-source data ingestion for dimensional modeling using star and snowflake schemas
  • Scheduled periodic sales and marketing reports for business needs using Oracle Reports
  • Responsible for unit testing; assisted the UAT team with issue clarification and resolution
Jul 2015 - Dec 2016

Database Intern

Jack Solutions
  • Participated in tool evaluation, integration, and support
  • Worked in a team environment to create logical and physical database design and assist application
  • Provided technical support for server database environments, including testing and installation of DBMS upgrades and backup and recovery of existing databases
May 2015 - Jul 2015

Intern

Oil & Natural Gas Corporation Limited
  • Studied in detail how seismic surveys collect real-time data from different sensors in an IoT ecosystem
  • Worked with the data team in the Regional Computer Lab on Hadoop, HDFS, and MapReduce for processing huge volumes of data
May 2014 - Jul 2014

Projects

Conversion Rate   

Goal

The goal of this project is to build a model that predicts conversion rate and, based on that model, to come up with ideas to improve revenue.

Description

We have data about users who hit our site: whether they converted or not as well as some of their characteristics such as their country, the marketing channel, their age, whether they are repeat users and the number of pages visited during that session (as a proxy for site activity/time spent on site).

The goals of this project are to:
  1. Predict conversion rate
  2. Come up with recommendations for the product team and the marketing team to improve conversion rate

Funnel Analysis   

Goal

The goal is to perform funnel analysis for an e-commerce website. Typically, websites have a clear path to conversion: for instance, you land on the home page, then you search, select a product, and buy it. At each of these steps, some users will drop off and leave the site. The sequence of pages that leads to conversion is called a 'funnel'. Funnel analysis makes it possible to understand where and when users abandon the website. It gives crucial insights into user behavior and ways to improve the user experience. It also often helps uncover bugs.

Description

You are looking at data from an e-commerce website. The site is very simple and has just 4 pages:

  • The first page is the home page. When you come to the site for the first time, you can only land on the home page as a first page.
  • From the home page, the user can perform a search and land on the search page.
  • From the search page, if the user clicks on a product, they get to the payment page, where they are asked to provide payment information in order to buy that product.
  • If the user decides to buy, they end up on the confirmation page.

The goal of this project is to provide:
  1. A full picture of funnel conversion rate for both desktop and mobile.
  2. Some insights on what the product team should focus on in order to improve conversion rate, as well as anything else discovered that could help improve conversion rate.
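The step-by-step funnel described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the page names, the `(device, deepest_page)` input shape, and the toy data are all assumptions made for the example.

```python
from collections import Counter

# Funnel pages in order, as described above (names are assumed labels).
FUNNEL = ["home", "search", "payment", "confirmation"]

def funnel_rates(visits):
    """visits: iterable of (device, deepest_page) pairs, one per user.
    Returns {device: [(from_page, to_page, step_conversion_rate), ...]}."""
    reached = {}
    for device, deepest in visits:
        depth = FUNNEL.index(deepest)
        counts = reached.setdefault(device, Counter())
        # A user who reached page k also passed through every earlier page.
        for page in FUNNEL[: depth + 1]:
            counts[page] += 1
    rates = {}
    for device, counts in reached.items():
        steps = []
        for prev, nxt in zip(FUNNEL, FUNNEL[1:]):
            rate = counts[nxt] / counts[prev] if counts[prev] else 0.0
            steps.append((prev, nxt, rate))
        rates[device] = steps
    return rates

# Hypothetical toy data, for illustration only.
sample = [
    ("desktop", "home"), ("desktop", "search"),
    ("desktop", "payment"), ("desktop", "confirmation"),
    ("mobile", "home"), ("mobile", "search"),
]
rates = funnel_rates(sample)
```

Comparing the per-step rates for desktop and mobile side by side points to the page where each device loses the most users.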

Price Optimization   

Goal

The goal of this project is to evaluate whether a pricing test running on the site has been successful. As always, you should focus on user segmentation and provide insights about segments who behave differently as well as any other insights you might find.

Description

The company sells a software product for $39 and is considering raising the price to $59, so it ran a test to see whether the higher price would increase revenue. In the experiment, 66% of users saw the old price ($39), while a random sample of 33% of users saw the higher price ($59). The test has been running for some time, and the VP of Product is interested in understanding how it went and whether it would make sense to increase the price for all users.

This project answers the following questions:
  1. Should the company sell its software for $39 or $59?
  2. The VP of Product is interested in having a holistic view into user behavior, especially focusing on actionable insights that might increase conversion rate. What are the main findings from the data?
  3. The VP of Product feels that the test has been running for too long and that statistically significant results should have been available in a shorter time. Is this intuition correct? After how many days should the test have been stopped, and why?
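Because the higher price is expected to lower conversion, the natural comparison is revenue per user rather than conversion rate alone. The sketch below shows one way to test that with a one-sided two-sample z-test. The function name and the visitor and conversion counts are hypothetical and used only to exercise the code; they are not results from the project.

```python
from math import sqrt
from statistics import NormalDist

def revenue_per_user_test(n_ctrl, conv_ctrl, n_test, conv_test,
                          price_ctrl=39.0, price_test=59.0):
    """One-sided z-test: is revenue per user higher at the test price?
    Returns (mean_ctrl, mean_test, z, p_value)."""
    p_c = conv_ctrl / n_ctrl
    p_t = conv_test / n_test
    mean_c = price_ctrl * p_c
    mean_t = price_test * p_t
    # Revenue per user is price * Bernoulli(p), so its variance
    # is price^2 * p * (1 - p).
    var_c = price_ctrl ** 2 * p_c * (1 - p_c)
    var_t = price_test ** 2 * p_t * (1 - p_t)
    se = sqrt(var_c / n_ctrl + var_t / n_test)
    z = (mean_t - mean_c) / se
    p_value = 1 - NormalDist().cdf(z)
    return mean_c, mean_t, z, p_value

# Hypothetical counts, for illustration only (roughly a 2:1 split).
mean_c, mean_t, z, p_value = revenue_per_user_test(
    n_ctrl=200_000, conv_ctrl=4_000, n_test=100_000, conv_test=1_500
)
```

Running the same test on the data available after each day of the experiment gives a rough view of when significance was first reached, although repeated peeking inflates the false-positive rate and would need a correction in a real analysis.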

Ads Analysis   

Goal

The goal of this project is to look at a few ad campaigns and analyze their current performance as well as predict their future performance.

Description

The company is running 40 different ad campaigns and wants to understand their performance.

This project addresses the following tasks:
  1. If you had to identify the 5 best ad groups, which ones would they be? Which metric did you choose to identify the best ad groups? Why? Explain the pros of your metric as well as the possible cons.
  2. For each group, predict how many ads will be shown on Dec 15 (assume each ad group keeps following its trend).
  3. Cluster ads into 3 groups: the ones whose avg_cost_per_click is going up, the ones whose avg_cost_per_click is flat and the ones whose avg_cost_per_click is going down.
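One simple way to form the three cost-per-click groups is to fit a least-squares line to each ad group's daily avg_cost_per_click and label it by the sign of the slope. The sketch below assumes one value per day in chronological order; the `tol` threshold and the sample series are arbitrary, hypothetical choices, and a significance test on the slope would be a more rigorous alternative.

```python
def slope(values):
    """Least-squares slope of values against day index 0, 1, 2, ...
    Requires at least two values."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den

def cpc_trend(values, tol=0.001):
    """Label a daily avg_cost_per_click series as 'up', 'flat', or 'down'.
    tol is an assumed cutoff on the per-day change, not a tuned value."""
    s = slope(values)
    if s > tol:
        return "up"
    if s < -tol:
        return "down"
    return "flat"

# Hypothetical series, for illustration only.
rising = cpc_trend([1.0, 1.1, 1.2, 1.3])
steady = cpc_trend([2.0, 2.0, 2.0])
falling = cpc_trend([1.5, 1.4, 1.2, 1.1])
```

The same fitted line, extended to the target date, is one straightforward way to approach the "ads shown on Dec 15" forecast under the stated assumption that each group keeps following its trend.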

Subscription Rate   

Goal

The goal of this challenge is to model subscription retention rate. Subscriptions are a great business model. There are many advantages for businesses in having subscribers compared to single-purchase users: revenue per customer is much higher, it is possible to cross-sell to subscribers, future revenue is easily predictable, there is a significant cost (time, effort, etc.) for the customer in canceling the subscription, and so on.

Description

Pull data from all the users who subscribed in January and see, for each month, how many of them unsubscribed.

The goal of this project is to provide:
  1. A model that predicts monthly retention rate for the different subscription price points. Based on the model, for each price point, what percentage of users is still subscribed after at least 12 months?
  2. How do user country and source affect subscription retention rate? How would you use these findings to improve the company revenue?
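Before fitting a predictive model, the observed retention curve per price point can be computed directly from the January cohort. The sketch below is descriptive only: it assumes each record is a `(price, months_subscribed)` pair and that every user was observed for the whole horizon. The price points and records shown are hypothetical.

```python
from collections import defaultdict

def retention_curve(months_per_user, horizon):
    """Share of the cohort still subscribed after 1..horizon months.
    A user with months_subscribed == m has cancelled by the end of month m."""
    n = len(months_per_user)
    return [
        sum(1 for months in months_per_user if months > m) / n
        for m in range(1, horizon + 1)
    ]

def retention_by_price(users, horizon):
    """users: iterable of (price, months_subscribed) pairs."""
    by_price = defaultdict(list)
    for price, months in users:
        by_price[price].append(months)
    return {
        price: retention_curve(months, horizon)
        for price, months in by_price.items()
    }

# Hypothetical cohort, for illustration only.
cohort = [(29, 1), (29, 3), (29, 5), (29, 5), (49, 1), (49, 2)]
curves = retention_by_price(cohort, horizon=3)
```

A curve fitted to these observed points (for example, a decay model per price point) could then be extrapolated to estimate the share of users still subscribed at 12 months.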