Data Analytics Manual BEMechanical SPPU 2019 Course
Data Analytics - Trainity
Visvesvaraya Technological University
Data Analytics Laboratory (402046) LAB MANUAL, B.E. Mechanical Engineering (2019 COURSE), Savitribai Phule Pune University. Technical Report, September 2022. Author: Ajinkya Salve, R. H. Sapat College of Engineering, Management Studies and Research. Uploaded by Ajinkya Salve on 17 September 2022.
Data Analytics Laboratory- BE Mechanical
Mechanical Engg-GES's R H SAPAT COE MS&R, Nashik
Gokhale Education Society's
R. H. SAPAT COLLEGE OF ENGINEERING, MANAGEMENT STUDIES AND RESEARCH
Prin. T. Kulkarni Vidyanagar, Nashik-422005
DEPARTMENT OF MECHANICAL ENGINEERING
Data Analytics Laboratory (402046)
LAB MANUAL
B.E. (2019 COURSE) (1st Semester)
Prepared By: Mr. KISHOR SALVE, Assistant Professor, Mechanical Engineering Department
Program Outcomes (POs)
- PO1 Engineering Knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.
- PO2 Problem Analysis: Identify, formulate, research literature, and analyze complex engineering problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.
- PO3 Design/Development of Solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for the public health and safety, and the cultural, societal, and environmental considerations.
- PO4 Conduct Investigations of Complex Problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.
- PO5 Modern Tool Usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modelling to complex engineering activities with an understanding of the limitations.
- PO6 The Engineer and Society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.
- PO7 Environment and Sustainability: Understand the impact of the professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for, sustainable development.
- PO8 Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.
- PO9 Individual and Team Work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.
- PO10 Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.
- PO11 Project Management and Finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.
- PO12 Life-long Learning: Recognize the need for, and have the preparation and ability to engage in, independent and life-long learning in the broadest context of technological change.
Course Objectives:
- To explore the fundamental concepts of data analytics.
- To understand the various search methods and visualization techniques.
- To apply various machine learning techniques for data analysis.
Course Outcomes:
- CO1: UNDERSTAND the basics of data analytics using concepts of statistics and probability.
- CO2: APPLY various inferential statistical analysis techniques to describe data sets and draw useful conclusions from the acquired data set.
- CO3: EXPLORE the data analytics techniques using various tools.
- CO4: APPLY data science concepts and methods to solve problems in a real-world context.
- CO5: SELECT advanced techniques to conduct thorough and insightful analysis and interpret the results.
Practical 1
To perform Exploratory Data Analysis on Automobile data.
Aim: To perform Exploratory Data Analysis on Automobile data.
Prerequisites: Automobile data, Jupyter Notebook / Google Colab
Theory:
What is Exploratory Data Analysis?
Exploratory Data Analysis (EDA) is an approach to analyzing data using visual techniques. It is used to discover trends and patterns, or to check assumptions, with the help of statistical summaries and graphical representations. An EDA is a thorough examination meant to uncover the underlying structure of a data set. It is important for a company because it exposes trends, patterns, and relationships that are not readily apparent.
What are the types of exploratory data analysis?
The four types of EDA are:
1. univariate non-graphical
2. multivariate non-graphical
3. univariate graphical
4. multivariate graphical
Techniques and Tools:
There are a number of tools that are useful for EDA, but EDA is characterized more by the attitude taken than by particular techniques.
Typical graphical techniques used in EDA are:
- Box plot
- Histogram
- Multi-vari chart
- Run chart
- Pareto chart
- Scatter plot (2D/3D)
- Stem-and-leaf plot
- Parallel coordinates
- Odds ratio
- Heat map
- Bar chart
- Horizon graph
Dimensionality reduction:
- Multidimensional scaling
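The four types of EDA named above can be illustrated on a very small table. The following is a minimal sketch, assuming a hand-made DataFrame called `toy` whose column names (`engine_size`, `price`) and values are invented for illustration and are not taken from the automobile data.

```python
import pandas as pd

# Hypothetical toy data, made up for illustration only.
toy = pd.DataFrame({
    "engine_size": [90, 110, 130, 150, 170],
    "price": [6000, 8000, 11000, 15000, 18000],
})

# 1. Univariate non-graphical: summary statistics of a single variable.
price_mean = toy["price"].mean()
price_median = toy["price"].median()

# 2. Multivariate non-graphical: a statistic relating two variables.
corr = toy["engine_size"].corr(toy["price"])

print(price_mean, price_median, round(corr, 3))

# 3. Univariate graphical and 4. multivariate graphical need a plotting
# backend; with matplotlib installed they could be drawn as, for example:
#   toy["price"].plot(kind="hist")
#   toy.plot(kind="scatter", x="engine_size", y="price")
```

The non-graphical steps give numbers to quote in a report, while the graphical steps show the shape of the data that those numbers summarize.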
8/3/22, 11:00 AM Copy of exploratory_data_analysis_on_automobile_dataset - Colaboratory colab.research.google/drive/1gjuj8XBQn--Gk5Az_bVjC32LwTQH_H2b#printMode=true 1/ exploratory-data-analysis-on-automobile-dataset
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
Data Loading
path='s3-api.us-geo.objectstorage.softlayer/cf-courses-data/CognitiveClass/DA
df_automobile = pd.read_csv(path)
df_automobile.head()
   symboling  normalized-losses  make         aspiration  num-of-doors  body-style   drive-wheels  engine-location
0  3          122                alfa-romero  std         two           convertible  rwd           front
1  3          122                alfa-romero  std         two           convertible  rwd           front
2  1          122                alfa-romero  std         two           hatchback    rwd           front
3  2          164                audi         std         four          sedan        fwd           front
4  2          164                audi         std         four          sedan        4wd           front
5 rows × 29 columns
df_automobile.shape
(201, 29)
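The loading step can be tried without access to the original file. The sketch below assumes a three-row, five-column extract held in a string (the variable names `csv_text` and `df_small` and the price values are illustrative), and reads it with `pd.read_csv`, which accepts a file-like object as well as a path or URL.

```python
import io
import pandas as pd

# Hypothetical three-row extract; the real file has 201 rows and 29 columns.
csv_text = """symboling,normalized-losses,make,body-style,price
3,122,alfa-romero,convertible,13495.0
1,122,alfa-romero,hatchback,16500.0
2,164,audi,sedan,13950.0
"""

# pd.read_csv accepts a path, a URL, or any file-like object.
df_small = pd.read_csv(io.StringIO(csv_text))

print(df_small.head())   # first rows (up to five)
print(df_small.shape)    # (number of rows, number of columns)
```

Checking `shape` right after loading is a quick way to confirm that the whole file was read and that the separator was detected correctly.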
df_automobile.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 201 entries, 0 to 200
Data columns (total 29 columns):
 #   Column             Non-Null Count  Dtype
---  ------             --------------  -----
 0   symboling          201 non-null    int64
 1   normalized-losses  201 non-null    int64
 2   make               201 non-null    object
 3   aspiration         201 non-null    object
 4   num-of-doors       201 non-null    object
 5   body-style         201 non-null    object
 6   drive-wheels       201 non-null    object
 7   engine-location    201 non-null    object
 8   wheel-base         201 non-null    float64
 9   length             201 non-null    float64
 10  width              201 non-null    float64
 11  height             201 non-null    float64
 12  curb-weight        201 non-null    int64
 13  engine-type        201 non-null    object
 14  num-of-cylinders   201 non-null    object
 15  engine-size        201 non-null    int64
 16  fuel-system        201 non-null    object
 17  bore               201 non-null    float64
 18  stroke             197 non-null    float64
 19  compression-ratio  201 non-null    float64
 20  horsepower         201 non-null    float64
 21  peak-rpm           201 non-null    float64
 22  city-mpg           201 non-null    int64
 23  highway-mpg        201 non-null    int64
 24  price              201 non-null    float64
 25  city-L/100km       201 non-null    float64
 26  horsepower-binned  200 non-null    object
 27  diesel             201 non-null    int64
 28  gas                201 non-null    int64
dtypes: float64(11), int64(8), object(10)
memory usage: 45+ KB
Data Cleaning
The data contains "?" entries; replace them with NaN.
df_data = df_automobile.replace('?', np.nan)
[Output: table of df_data rows with the columns symboling, normalized-losses, make, aspiration, num-of-doors, body-style, drive-wheels, engine-location, wheel-base, length, width, height, curb-weight, engine-type, num-of-cylinders, engine-size, fuel-system, bore, stroke, compression-ratio]
df_data.isnull().sum()
[Output: number of missing values in each column, listed from symboling through price]
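The cleaning step, replacing the "?" marker with NaN and then counting missing values per column, can be sketched on a small table. The DataFrame `raw` below is hypothetical and its values are invented; only `DataFrame.replace`, `isnull` and `astype` are standard pandas calls.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data in which "?" marks a missing entry.
raw = pd.DataFrame({
    "make": ["audi", "bmw", "dodge", "honda"],
    "stroke": ["3.40", "?", "3.23", "?"],
    "price": ["13950", "16430", "?", "7295"],
})

# Swap the "?" marker for NaN so pandas treats it as missing.
clean = raw.replace("?", np.nan)

# Count missing values per column.
missing = clean.isnull().sum()
print(missing)

# Columns that held "?" were read as text; convert them to numbers.
clean[["stroke", "price"]] = clean[["stroke", "price"]].astype(float)
print(clean.dtypes)
```

Converting the affected columns to a numeric type after the replacement matters because a column containing "?" is otherwise stored as text and is skipped by numeric summaries.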
   symboling  normalized-losses  make         aspiration  num-of-doors  body-style   drive-wheels  engine-location
0  3          122                alfa-romero  std         two           convertible  rwd           front
1  3          122                alfa-romero  std         two           convertible  rwd           front
2  1          122                alfa-romero  std         two           hatchback    rwd           front
3  2          164                audi         std         four          sedan        fwd           front
4  2          164                audi         std         four          sedan        4wd           front
5  2          122                audi         std         two           sedan        fwd           front
6  1          158                audi         std         four          sedan        fwd           front
7  1          122                audi         std         four          wagon        fwd           front
8  1          158                audi         turbo       four          sedan        fwd           front
9  2          192                bmw          std         two           sedan        rwd           front
10 rows × 29 columns
Summary statistics of variables
df_automobile.describe()
       symboling  normalized-losses  wheel-base  length  width  height  curb-weight
count  201        201                201         201     201    201     201
mean   0          122                98          0       0      53      2555
std    1          31                 6           0       0      2       517
min    -2         65                 86          0       0      47      1488
25%    0          101                94          0       0      52      2169
50%    1          122                97          0       0      54      2414
75%    2          137                102         0       0      55      2926
max    3          256                120         1       1      59      4066
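How `describe()` builds such a summary can be seen on a handful of values. The sketch below assumes a hypothetical five-row table `toy` with underscore-style column names chosen for illustration. It shows that only numeric columns are summarized by default and that the 25%, 50% and 75% rows are quartiles.

```python
import pandas as pd

# Hypothetical five-row table; the text column is ignored by describe().
toy = pd.DataFrame({
    "curb_weight": [1488, 2169, 2414, 2926, 4066],
    "make": ["a", "b", "c", "d", "e"],
})

# count, mean, std, min, quartiles and max for each numeric column.
stats = toy.describe()
print(stats)

# The 50% row is the median of the column.
median_weight = toy["curb_weight"].median()
print(median_weight)
```

Comparing the mean with the 50% row is a quick check for skew: a mean well above the median suggests a long right tail.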
Univariate Analysis
plt.figure(figsize=(10, 8))
df_automobile[['engine-size','peak-rpm','curb-weight','horsepower','price']].hist(figsize=
plt.figure(figsize=(10, 8))
plt.tight_layout()
plt.show()
<Figure size 720x576 with 0 Axes>
<Figure size 720x576 with 0 Axes>
Findings
- Most cars have a curb weight in the range 1900 to 3100.
- Engine size mostly lies in the range 60 to 190.
- Most vehicles have a horsepower between 50 and 125.
- Most vehicles are priced between 5000 and 18000.
- Peak rpm is mostly distributed between 4600 and 5700.
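Ranges like those in the findings above are read off histograms and can be confirmed numerically. The following is a minimal sketch, assuming randomly generated values in a hypothetical DataFrame `toy` (not the automobile data) and an assumed `figsize` and `bins` setting.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in Jupyter/Colab
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical values drawn at random; not the automobile data.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "horsepower": rng.normal(100, 25, size=200),
    "price": rng.normal(12000, 4000, size=200),
})

# DataFrame.hist draws one histogram per numeric column and returns the axes.
axes = toy[["horsepower", "price"]].hist(figsize=(10, 4), bins=15)
plt.tight_layout()

# Share of rows inside a range, to back up a statement such as
# "most vehicles have horsepower between 50 and 150".
share = toy["horsepower"].between(50, 150).mean()
print(round(share, 2))
```

In a notebook, `plt.show()` after `plt.tight_layout()` displays the figure.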
plt.figure(1)
plt.subplot(221)
df_automobile['engine-type'].value_counts(normalize=True).plot(figsize=(10, 8),kind='bar',c
plt.title("Number of Engine Type frequency diagram")
plt.ylabel('Number of Engine Type')
plt.xlabel('engine-type');
plt.subplot(222)
df_automobile['num-of-doors'].value_counts(normalize=True).plot(figsize=(10, 8),kind='bar',
plt.title("Number of Door frequency diagram")
plt.ylabel('Number of Doors')
plt.xlabel('num-of-doors');
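For categorical columns, the frequency diagram is a bar chart of `value_counts`. The sketch below assumes a hypothetical five-row table `toy` and uses `plt.subplots` to place two bar charts side by side. The titles and axis labels are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in Jupyter/Colab
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical categorical data, made up for illustration.
toy = pd.DataFrame({
    "num-of-doors": ["four", "two", "four", "four", "two"],
    "engine-type": ["ohc", "ohc", "dohc", "ohc", "ohcv"],
})

# normalize=True returns shares that sum to 1 instead of raw counts.
door_share = toy["num-of-doors"].value_counts(normalize=True)
engine_share = toy["engine-type"].value_counts(normalize=True)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
engine_share.plot(kind="bar", ax=ax1)
ax1.set_title("Engine type frequency")
ax1.set_xlabel("engine-type")
ax1.set_ylabel("Share of cars")
door_share.plot(kind="bar", ax=ax2)
ax2.set_title("Number of doors frequency")
ax2.set_xlabel("num-of-doors")
ax2.set_ylabel("Share of cars")
fig.tight_layout()
```

Because `normalize=True` is used, the bar heights are relative frequencies, so the y-axis is labelled as a share here and not as a count.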
Findings
- curb-weight, engine-size and horsepower are positively correlated with one another.
- city-mpg and highway-mpg are negatively correlated with these variables.
Bivariate Analysis
Price Analysis
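Before price is compared across categories, the correlation findings reported above can be checked numerically with `DataFrame.corr`. The sketch below assumes a hypothetical five-row table `toy` whose values were invented so that weight, engine size, horsepower and price rise together while fuel economy falls.

```python
import pandas as pd

# Hypothetical values, invented for illustration.
toy = pd.DataFrame({
    "curb-weight": [1900, 2200, 2500, 2900, 3300],
    "engine-size": [90, 98, 120, 140, 180],
    "horsepower": [68, 76, 100, 120, 160],
    "city-mpg": [38, 31, 26, 22, 17],
    "price": [6500, 7800, 11000, 15500, 24000],
})

# Pearson correlation between every pair of numeric columns.
corr = toy.corr()
print(corr.round(2))

# Correlation of each variable with price, from most negative to most positive.
print(corr["price"].sort_values())
```

A correlation matrix of this kind is what a heat map visualizes, with the sign of each entry showing the direction of the relationship.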
plt.rcParams['figure.figsize'] = (18, 9)
ax = sns.boxplot(x="make", y="price", data=df_automobile)
plt.rcParams['figure.figsize'] = (19, 7)
ax = sns.boxplot(x="body-style", y="price", data=df_automobile)
Positive linear relationship
plt.rcParams['figure.figsize'] = (10, 5)
ax = sns.boxplot(x="drive-wheels", y="price", data=df_automobile)
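A comparison of price across categories, such as the one above, can also be read numerically. The following is a minimal sketch, assuming a hypothetical eight-row table `toy` with invented prices. It uses a pandas `groupby` to produce the five-number summary per category, which is the information a box plot draws.

```python
import pandas as pd

# Hypothetical prices per drive-wheel type, invented for illustration.
toy = pd.DataFrame({
    "drive-wheels": ["fwd", "fwd", "fwd", "rwd", "rwd", "rwd", "4wd", "4wd"],
    "price": [6000, 7500, 9000, 15000, 20000, 31000, 8000, 11000],
})

# Minimum, quartiles and maximum of price within each category.
summary = toy.groupby("drive-wheels")["price"].describe()
print(summary[["min", "25%", "50%", "75%", "max"]])

# Median price per category, sorted so the most expensive group comes last.
medians = toy.groupby("drive-wheels")["price"].median().sort_values()
print(medians)
```

Sorting the medians gives a ranking of the categories that is easier to quote in the findings than a reading taken from the plot.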