Operation Analytics
Course: Data Analytics - Trainity
University: Visvesvaraya Technological University
Project 3: Operation Analytics and Investigating Metric Spike
By Shireen Fathima
PROJECT DESCRIPTION: This project involves understanding Operation Analytics and Investigating Metric Spikes, i.e. explaining sudden changes in key metrics such as a dip in daily user engagement or a drop in sales. We analyse the provided case studies using SQL queries.

APPROACH :
▪ Create the database and tables required for the Excel file.
▪ Import the CSV files into the tables.
▪ Write SQL queries to analyse the data.
▪ Create visuals of the data, if needed.

TECH STACK USED :
▪ MySQL Workbench 8 CE to run the queries.
▪ Microsoft Excel to visualise the outputs.
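The approach above (create a table, import the CSV, query it) can be sketched end to end. The project used MySQL Workbench; the sketch below instead uses Python's built-in sqlite3 so it runs anywhere, with an assumed job_data schema — only the columns referenced by the queries in this report (ds, event, language, time_spent) are certain, the rest and the sample rows are illustrative:

```python
import csv, io, sqlite3

# In-memory stand-in for the MySQL database used in the project.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE job_data (
        ds         TEXT,     -- event date, stored as text (as in the project)
        job_id     INTEGER,
        actor_id   INTEGER,
        event      TEXT,
        language   TEXT,
        time_spent INTEGER
    )
""")

# A tiny CSV sample standing in for the real export.
sample_csv = """ds,job_id,actor_id,event,language,time_spent
2020-11-25,21,1001,skip,English,15
2020-11-26,22,1002,transfer,Arabic,25
"""
rows = list(csv.DictReader(io.StringIO(sample_csv)))
conn.executemany(
    "INSERT INTO job_data VALUES (:ds, :job_id, :actor_id, :event, :language, :time_spent)",
    rows,
)
conn.commit()
print(conn.execute("SELECT COUNT(*) FROM job_data").fetchone()[0])  # 2
```

In MySQL Workbench the same import is done through the Table Data Import Wizard or LOAD DATA, but the table-then-insert flow is identical.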
Throughput Analysis:
Objective: Calculate the 7-day rolling average of throughput (number of events per second).
Your Task: Write an SQL query to calculate the 7-day rolling average of throughput. Additionally, explain whether you prefer using the daily metric or the 7-day rolling average for throughput, and why.

QUERY :

select ds as dates,
round(count(event)/sum(time_spent), 2) as daily_throughput,
(select round(count(event)/sum(time_spent), 2) from job_data) as 7_day_rolling_avg
from job_data
group by ds
order by ds;

Output:

Dates        Daily_throughput   7_day_rolling_avg
11/25/2020   0                  0.
11/26/2020   0                  0.
11/27/2020   0                  0.
11/28/2020   0                  0.
11/29/2020   0                  0.
11/30/2020   0                  0.

Insights :
▪ Throughput measures the total work done in a certain amount of time.
▪ Daily throughput shows the day-to-day fluctuations, whereas the 7-day rolling average smooths them out and shows the longer-term trend, which is why it is preferred.
▪ We calculate the daily throughput by counting the total events, dividing by the total time spent, rounding to two decimal places, and finally grouping and ordering the data by date.
▪ We use a derived column to calculate the weekly throughput.
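The subquery above returns one overall average repeated on every row. MySQL 8 also supports window functions, which compute a true rolling average over the current day and the six days before it. A minimal sketch of that approach, using sqlite3 (whose OVER syntax matches MySQL 8) and hypothetical sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_data (ds TEXT, event TEXT, time_spent INTEGER)")
conn.executemany("INSERT INTO job_data VALUES (?,?,?)", [
    ("2020-11-25", "skip", 20), ("2020-11-26", "transfer", 10),
    ("2020-11-27", "decision", 50), ("2020-11-28", "skip", 25),
])

# Daily throughput per ds, then a rolling average over the current day
# and the 6 preceding days (a full 7-day window once enough days exist).
rolling = conn.execute("""
    SELECT ds,
           daily_throughput,
           ROUND(AVG(daily_throughput) OVER (
               ORDER BY ds ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
           ), 2) AS rolling_avg_7d
    FROM (
        SELECT ds,
               ROUND(CAST(COUNT(event) AS REAL) / SUM(time_spent), 2) AS daily_throughput
        FROM job_data
        GROUP BY ds
    )
    ORDER BY ds
""").fetchall()
for row in rolling:
    print(row)
```

The window frame `ROWS BETWEEN 6 PRECEDING AND CURRENT ROW` is what makes the average "rolling"; on the first few days the window simply contains fewer rows.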
Language Share Analysis:
Objective: Calculate the percentage share of each language in the last 30 days.
Your Task: Write an SQL query to calculate the percentage share of each language over the last 30 days.

QUERY :

select language as languages,
sum(time_spent) / (select sum(time_spent) from job_data) * 100 as percentages
from job_data
group by language
order by 2 desc;

Output:

Languages   Percentages
French      34.
Persian     32.
Italian     15.
Arabic      8.
English     5.
Hindi       3.

[Bar chart: percentage share by language, 0-40 scale]

Insights :
▪ As there are fewer than 30 days of entries, this query groups all languages and calculates the percentage of total time spent on each language.
▪ We order the data by percentage in descending order.
▪ French has the highest percentage share, followed by Persian.
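Because the dataset spans fewer than 30 days, the query above skips the date filter. With a longer history, both the numerator and the denominator would need the same 30-day restriction. A sketch with hypothetical rows, one of which falls outside the window:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_data (ds TEXT, language TEXT, time_spent INTEGER)")
conn.executemany("INSERT INTO job_data VALUES (?,?,?)", [
    ("2020-11-25", "French", 30), ("2020-11-26", "Persian", 30),
    ("2020-11-27", "French", 40), ("2020-01-01", "Hindi", 100),  # outside the window
])

# Percentage share per language within the 30 days ending 2020-11-30.
# (MySQL equivalent of the filter: WHERE ds >= DATE_SUB('2020-11-30', INTERVAL 30 DAY).)
shares = conn.execute("""
    SELECT language,
           ROUND(100.0 * SUM(time_spent) /
                 (SELECT SUM(time_spent) FROM job_data
                  WHERE ds >= DATE('2020-11-30', '-30 days')), 2) AS pct
    FROM job_data
    WHERE ds >= DATE('2020-11-30', '-30 days')
    GROUP BY language
    ORDER BY pct DESC
""").fetchall()
print(shares)  # [('French', 70.0), ('Persian', 30.0)]
```

Note that the Hindi row is excluded from both the per-language sums and the total, so the shares still add up to 100%.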
Investigating Metric Spike
CASE STUDY 2 - Tasks

Weekly User Engagement:
Objective: Measure the activeness of users on a weekly basis.
Your Task: Write an SQL query to calculate the weekly user engagement.

QUERY :

select week(str_to_date(occurred_at,'%d-%m-%Y %H:%i')) as week_numbers,
count(distinct user_id) as no_of_users
from events
where event_type = 'engagement'
group by week_numbers
order by 1;

Output:

Week_Numbers  No_of_Users
17            663
18            1068
19            1113
20            1154
21            1121
22            1186
23            1232
24            1275
25            1264
26            1302
27            1372
28            1365
29            1376
30            1467
31            1299
32            1225
33            1225
34            1204
35            104

Insights :
▪ We identify the distinct active users whose event type is "engagement", then group and order them by week number.
▪ We use the str_to_date function to convert occurred_at from text to a date and then extract the week from it.
[Chart: Weekly User Engagement, No_of_Users by Week_Numbers]
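The count-distinct-users-per-week pattern above can be exercised on a toy events table. The sketch below uses sqlite3 with hypothetical rows; SQLite has no WEEK()/STR_TO_DATE(), so strftime('%W', ...) on ISO-formatted timestamps plays the same role:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, occurred_at TEXT, event_type TEXT)")
conn.executemany("INSERT INTO events VALUES (?,?,?)", [
    (1, "2014-04-28 10:00", "engagement"),
    (1, "2014-04-28 11:00", "engagement"),   # same user, same week: counted once
    (2, "2014-04-29 09:00", "engagement"),
    (3, "2014-05-05 09:00", "engagement"),   # next week
    (4, "2014-05-05 10:00", "signup_flow"),  # not an engagement event
])

# Distinct engaged users per week number (Monday-based weeks).
weekly = conn.execute("""
    SELECT CAST(strftime('%W', occurred_at) AS INTEGER) AS week_number,
           COUNT(DISTINCT user_id) AS no_of_users
    FROM events
    WHERE event_type = 'engagement'
    GROUP BY week_number
    ORDER BY week_number
""").fetchall()
print(weekly)
```

COUNT(DISTINCT user_id) is what makes this an "active users" metric rather than an event count: user 1's two events in the same week contribute a single user.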
Weekly Retention Analysis:
Objective: Analyze the retention of users on a weekly basis after signing up for a product.
Your Task: Write an SQL query to calculate the weekly retention of users based on their sign-up cohort.

QUERY :

SELECT first AS "Week Numbers",
SUM(CASE WHEN week_number = 0 THEN 1 ELSE 0 END) AS "Week 0",
SUM(CASE WHEN week_number = 1 THEN 1 ELSE 0 END) AS "Week 1",
SUM(CASE WHEN week_number = 2 THEN 1 ELSE 0 END) AS "Week 2",
SUM(CASE WHEN week_number = 3 THEN 1 ELSE 0 END) AS "Week 3",
SUM(CASE WHEN week_number = 4 THEN 1 ELSE 0 END) AS "Week 4",
SUM(CASE WHEN week_number = 5 THEN 1 ELSE 0 END) AS "Week 5",
SUM(CASE WHEN week_number = 6 THEN 1 ELSE 0 END) AS "Week 6",
SUM(CASE WHEN week_number = 7 THEN 1 ELSE 0 END) AS "Week 7",
SUM(CASE WHEN week_number = 8 THEN 1 ELSE 0 END) AS "Week 8",
SUM(CASE WHEN week_number = 9 THEN 1 ELSE 0 END) AS "Week 9",
SUM(CASE WHEN week_number = 10 THEN 1 ELSE 0 END) AS "Week 10",
SUM(CASE WHEN week_number = 11 THEN 1 ELSE 0 END) AS "Week 11",
SUM(CASE WHEN week_number = 12 THEN 1 ELSE 0 END) AS "Week 12",
SUM(CASE WHEN week_number = 13 THEN 1 ELSE 0 END) AS "Week 13",
SUM(CASE WHEN week_number = 14 THEN 1 ELSE 0 END) AS "Week 14",
SUM(CASE WHEN week_number = 15 THEN 1 ELSE 0 END) AS "Week 15",
SUM(CASE WHEN week_number = 16 THEN 1 ELSE 0 END) AS "Week 16",
SUM(CASE WHEN week_number = 17 THEN 1 ELSE 0 END) AS "Week 17",
SUM(CASE WHEN week_number = 18 THEN 1 ELSE 0 END) AS "Week 18"
FROM (
SELECT m.user_id, m.login_week, n.first, m.login_week - n.first AS week_number
FROM (SELECT user_id, WEEK(STR_TO_DATE(occurred_at,'%d-%m-%Y %H:%i')) AS login_week
      FROM events GROUP BY 1, 2) m,
     (SELECT user_id, MIN(WEEK(STR_TO_DATE(occurred_at,'%d-%m-%Y %H:%i'))) AS first
      FROM events GROUP BY 1) n
WHERE m.user_id = n.user_id
) sub
GROUP BY first
ORDER BY first;

Insights :
▪ Cohort retention analysis helps to find the weekly retention of users after signing up.
▪ This query calculates weekly retention from sign-up: it counts the users who were active in each week after their sign-up, groups them by sign-up week, and displays the count of active users for each week.
Output:

[Retention matrix: sign-up week cohorts ("Week Numbers") against active-user counts for Week 0 through Week 18]
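The core of the retention query — subtract each user's first active week from every week they appear in, then pivot the offsets with CASE — can be shown on a toy events table. A sketch in sqlite3 with hypothetical rows, pivoting only offsets 0 and 1:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, occurred_at TEXT)")
conn.executemany("INSERT INTO events VALUES (?,?)", [
    (1, "2014-04-28 10:00"),  # user 1 signs up in week 17...
    (1, "2014-05-05 10:00"),  # ...and returns in week 18
    (2, "2014-04-29 09:00"),  # user 2 signs up in week 17 and churns
    (3, "2014-05-06 09:00"),  # user 3 signs up in week 18
])

# For each user-week, subtract the user's first active week to get the
# retention offset, then pivot offsets 0 and 1 per sign-up cohort.
retention = conn.execute("""
    SELECT first_week,
           SUM(CASE WHEN week_offset = 0 THEN 1 ELSE 0 END) AS week_0,
           SUM(CASE WHEN week_offset = 1 THEN 1 ELSE 0 END) AS week_1
    FROM (
        SELECT m.user_id,
               m.login_week - n.first_week AS week_offset,
               n.first_week
        FROM (SELECT user_id,
                     CAST(strftime('%W', occurred_at) AS INTEGER) AS login_week
              FROM events GROUP BY 1, 2) m
        JOIN (SELECT user_id,
                     MIN(CAST(strftime('%W', occurred_at) AS INTEGER)) AS first_week
              FROM events GROUP BY 1) n
          ON m.user_id = n.user_id
    )
    GROUP BY first_week
    ORDER BY first_week
""").fetchall()
print(retention)  # [(17, 2, 1), (18, 1, 0)]
```

Reading the output: of the two users who first appeared in week 17, one was still active one week later; the week-18 cohort has no follow-up week in the sample yet.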
Email Engagement Analysis:
Objective: Analyze how users are engaging with the email service.
Your Task: Write an SQL query to calculate the email engagement metrics.

QUERY :

SELECT user_type, action,
COUNT(*) as Total_count,
COUNT(DISTINCT user_id) as Distinct_users,
COUNT(*)/COUNT(DISTINCT user_id) as Avg_events
FROM email_events
GROUP BY 1, 2;

Output:

user_type  action                   Total_count  Distinct_users  Avg_events
1          email_clickthrough       2758         1529            1.
1          email_open               6511         1717            3.
1          sent_reengagement_email  892          892             1
1          sent_weekly_digest       18412        1217            15.
2          email_clickthrough       2521         1529            1.
2          email_open               5562         1701            3.
2          sent_reengagement_email  1071         1071            1
2          sent_weekly_digest       15232        1098            13.
3          email_clickthrough       3731         2219            1.
3          email_open               8386         2509            3.
3          sent_reengagement_email  1690         1690            1
3          sent_weekly_digest       23623        1796            13.

Insights :
▪ This query analyses how the various user types interact with the different email actions.
▪ For each combination of user type and action we calculate the total number of events, the number of distinct users, and the average number of events per user.
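A common follow-up to these raw counts (not part of the original task, but in the spirit of the objective) is to derive rate metrics: open rate as opens per digest sent, and click-through rate as clicks per open. A sketch in sqlite3 with hypothetical rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email_events (user_id INTEGER, user_type INTEGER, action TEXT)")
conn.executemany("INSERT INTO email_events VALUES (?,?,?)", [
    (1, 1, "sent_weekly_digest"), (1, 1, "email_open"), (1, 1, "email_clickthrough"),
    (2, 1, "sent_weekly_digest"),
    (3, 2, "sent_weekly_digest"), (3, 2, "email_open"),
])

# Open rate and click-through rate per user_type, derived from action counts.
# SUM(action = '...') counts rows where the comparison is true (0/1 in SQLite;
# the same trick works in MySQL).
rates = conn.execute("""
    SELECT user_type,
           ROUND(100.0 * SUM(action = 'email_open') /
                 SUM(action = 'sent_weekly_digest'), 1) AS open_rate_pct,
           ROUND(100.0 * SUM(action = 'email_clickthrough') /
                 SUM(action = 'email_open'), 1) AS clickthrough_rate_pct
    FROM email_events
    GROUP BY user_type
    ORDER BY user_type
""").fetchall()
print(rates)  # [(1, 50.0, 100.0), (2, 100.0, 0.0)]
```

Rates like these make the user types directly comparable even when their raw event volumes differ.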
OUTCOMES :
▪ This project helped me write advanced SQL queries.
▪ I learnt how to upload large CSV files onto MySQL in different ways, after struggling with it for two days.
▪ It helped me learn and understand the various terminologies used in this project.
▪ I learnt to write queries with derived tables.
▪ Note: I created several columns such as occurred_at and created_at with VARCHAR as the datatype, because the CSV files would not load when I used DATETIME; I then used the str_to_date function to extract the week, date, etc.
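The workaround described in the note — store timestamps as text, then parse them with STR_TO_DATE('%d-%m-%Y %H:%i') at query time — has a direct Python analogue, which also makes the format string easy to sanity-check (the sample timestamp below is illustrative):

```python
from datetime import datetime

# MySQL's STR_TO_DATE(occurred_at, '%d-%m-%Y %H:%i') parses day-month-year text;
# the equivalent Python strptime format is '%d-%m-%Y %H:%M' (%M, not %i, for minutes).
raw = "01-05-2014 08:23"
parsed = datetime.strptime(raw, "%d-%m-%Y %H:%M")
print(parsed.isoformat(sep=" "))  # 2014-05-01 08:23:00
print(parsed.isocalendar()[1])    # ISO week number: 18
```

Note the minute specifier differs between the two systems: MySQL uses %i where C/Python strptime uses %M, a frequent source of silently wrong parses.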