Tasks

Download the following dataset: mpg-new.xlsx

This dataset is a subset of the fuel economy data provided by the EPA, accessible through fueleconomy.gov. It comprises 38 popular car models spanning from 1999 to 2008. Each entry includes detailed information about specific car models, including manufacturer, displacement, city MPG, highway MPG, and more.

Use Excel to compute and analyze descriptive statistics for the variables.

Answer the following questions.

1. Data Understanding:

Open the mpg dataset in Excel and answer the following questions:

  • What does each observation represent?
  • How many variables are there?
  • Which data attributes are categorical, and which are numeric?

2. Data Preprocessing:

  • Check for duplicate and missing data. Are there any duplicate rows? Are there any missing values?
  • Propose solutions for handling missing data.
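
As a cross-check on the Excel work for this step, the duplicate and missing-value checks can be sketched in Python. This is only an illustration: the sample rows, the column subset, and the helper names (`count_duplicate_rows`, `count_missing`, `fill_with_median`) are hypothetical and not taken from mpg-new.xlsx. Median imputation is shown as one possible way to handle missing values; deleting the affected rows is another.

```python
import statistics
from collections import Counter

# Hypothetical sample rows; the real columns of mpg-new.xlsx may differ.
rows = [
    {"model": "a4", "year": 1999, "cty": 18, "hwy": 29},
    {"model": "a4", "year": 1999, "cty": 18, "hwy": 29},       # exact duplicate
    {"model": "civic", "year": 2008, "cty": None, "hwy": 36},  # missing cty
    {"model": "jetta", "year": 2008, "cty": 21, "hwy": 29},
]


def count_duplicate_rows(rows):
    """Count rows that repeat an earlier row exactly."""
    counts = Counter(tuple(sorted(r.items())) for r in rows)
    return sum(n - 1 for n in counts.values())


def count_missing(rows):
    """Count missing (None) values per column."""
    missing = Counter()
    for r in rows:
        for col, value in r.items():
            if value is None:
                missing[col] += 1
    return dict(missing)


def fill_with_median(rows, col):
    """Return copies of the rows with missing values in `col` replaced by the column median."""
    observed = [r[col] for r in rows if r[col] is not None]
    med = statistics.median(observed)
    return [{**r, col: med if r[col] is None else r[col]} for r in rows]


print(count_duplicate_rows(rows))
print(count_missing(rows))
print(fill_with_median(rows, "cty")[2])
```

In Excel itself, the Remove Duplicates command and the `COUNTBLANK` function cover the same checks.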

3. Data Enrichment:

  • Create a new variable called "mpg" that represents the average of city ("cty") and highway ("hwy") miles per gallon.
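
The new variable is simply (cty + hwy) / 2 for each row. A minimal Python sketch, assuming rows with `cty` and `hwy` fields; the sample values and the helper name `add_mpg` are made up for illustration.

```python
# Hypothetical sample rows; only the cty and hwy column names come from the task.
rows = [
    {"cty": 18, "hwy": 29},
    {"cty": 21, "hwy": 29},
    {"cty": 16, "hwy": 20},
]


def add_mpg(rows):
    """Return copies of the rows with mpg = average of cty and hwy."""
    return [{**r, "mpg": (r["cty"] + r["hwy"]) / 2} for r in rows]


for r in add_mpg(rows):
    print(r)
```

In Excel the same column can be filled with `AVERAGE` applied to the two cells in each row.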

4. Understanding numerical variables:

  • Calculate and present descriptive statistics (mean, median, range, variance, and standard deviation) for the numeric variable: "mpg"
  • Compare and explain the mean and standard deviation.
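
The statistics requested here can be cross-checked outside Excel with Python's `statistics` module. The values below are invented sample data, not results from the dataset. The sketch uses the sample variance and standard deviation (dividing by n - 1), which corresponds to Excel's `VAR.S` and `STDEV.S`; the ratio of standard deviation to mean is included as one possible way to compare the two.

```python
import statistics

# Hypothetical mpg values for illustration only.
mpg = [23.5, 25.0, 20.0, 28.5, 18.0]


def describe(values):
    """Return the descriptive statistics asked for in the task."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "stdev": stdev,
        "stdev_over_mean": stdev / mean,  # spread relative to the mean
    }


for name, value in describe(mpg).items():
    print(f"{name}: {value:.3f}")
```

A standard deviation that is small relative to the mean indicates that most values sit close to the average.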

5. Understanding categorical variables:

  • What are the unique values of drive train type ("drv")? What is the mode of the "drv" variable?
  • Create bar plots to illustrate the distribution of "drv".
  • Compare the distribution of "drv" in 1999 and 2008. Summarize the difference.
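
The counts behind the bar plots can be sketched as a frequency table. The (drv, year) pairs below are hypothetical, as are the helper names; the actual categories and counts must come from the dataset. Shares are used for the year comparison so that the two years are comparable even if they contain different numbers of rows.

```python
from collections import Counter

# Hypothetical (drv, year) pairs for illustration only.
records = [
    ("f", 1999), ("4", 1999), ("4", 1999),
    ("f", 2008), ("f", 2008), ("r", 2008),
]


def drv_counts(records, year=None):
    """Count each drv value, optionally restricted to one year."""
    return Counter(drv for drv, y in records if year is None or y == year)


def drv_mode(records):
    """Return the most frequent drv value."""
    return drv_counts(records).most_common(1)[0][0]


def drv_shares(records, year):
    """Return each drv value's share of the rows for one year."""
    counts = drv_counts(records, year)
    total = sum(counts.values())
    return {drv: n / total for drv, n in counts.items()}


print(sorted(drv_counts(records)))  # unique values
print(drv_mode(records))
print(drv_shares(records, 1999))
print(drv_shares(records, 2008))
```

In Excel, `COUNTIF` and `COUNTIFS` (or a PivotTable) produce the same table, which can then be charted as a bar plot.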

6. Box Plots for numeric variables:

  • Use a box plot to show the summary distribution of the numeric variable "mpg".
  • Report the key statistics (Q1, median, Q3, max, min) displayed in the box plot.
  • What is the "mpg" range of the middle 50% of cars in the dataset (the interquartile range, Q1 to Q3)?
  • Box plots by year: Use a box plot to show the distribution of the "mpg" variable in 1999 and 2008. Summarize the difference.
  • Box plots by class: Use a box plot to show the distribution of the "mpg" variable in different classes. Summarize the difference.
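
The numbers a box plot displays can be computed directly as a five-number summary, and the middle 50% of cars corresponds to the span from Q1 to Q3. The sketch below uses invented rows and hypothetical helper names. It assumes the "inclusive" quartile method of Python's `statistics.quantiles`, which matches Excel's `QUARTILE.INC`; the exclusive method (`QUARTILE.EXC`) can give slightly different quartiles, so results may not match a chart exactly.

```python
import statistics

# Hypothetical rows for illustration only.
rows = [
    {"year": 1999, "class": "compact", "mpg": 23.5},
    {"year": 1999, "class": "suv", "mpg": 18.0},
    {"year": 1999, "class": "compact", "mpg": 25.0},
    {"year": 2008, "class": "suv", "mpg": 20.0},
    {"year": 2008, "class": "compact", "mpg": 28.5},
    {"year": 2008, "class": "suv", "mpg": 17.0},
]


def five_number_summary(values):
    """Return min, Q1, median, Q3, max and the interquartile range."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values), "q1": q1, "median": med,
        "q3": q3, "max": max(values), "iqr": q3 - q1,
    }


def summary_by_group(rows, key):
    """Compute the five-number summary of mpg for each value of `key`."""
    groups = {}
    for r in rows:
        groups.setdefault(r[key], []).append(r["mpg"])
    return {g: five_number_summary(v) for g, v in groups.items()}


print(five_number_summary([r["mpg"] for r in rows]))
print(summary_by_group(rows, "year"))
print(summary_by_group(rows, "class"))
```

Comparing the medians and the Q1 to Q3 spans across groups gives the differences that the grouped box plots show visually.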

7. Histogram for numeric variables:

  • Use a histogram to show the detailed distribution of the numeric variable "mpg".
  • Explore different bin widths and discuss what a proper bin width is.
  • Using a bin width of 4, how many models fall into the common range?
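
The effect of the bin width can be explored by counting values per bin. The sketch below uses invented mpg values and a hypothetical helper `bin_counts`. It assumes left-closed bins [lo, lo + width) that start at a multiple of the width; Excel's histogram chart may place values that fall exactly on a boundary in a different bin, so counts can differ at the edges.

```python
import math
from collections import Counter

# Hypothetical mpg values for illustration only.
mpg = [18.0, 20.0, 23.5, 25.0, 28.5, 21.0, 22.5]


def bin_counts(values, width):
    """Count values per bin [lo, lo + width), keyed by each bin's lower edge."""
    start = math.floor(min(values) / width) * width
    counts = Counter()
    for v in values:
        idx = math.floor((v - start) / width)
        counts[start + idx * width] += 1
    return dict(sorted(counts.items()))


for width in (2, 4, 8):
    print(width, bin_counts(mpg, width))

counts = bin_counts(mpg, 4)
fullest = max(counts, key=counts.get)
print(f"fullest bin: [{fullest}, {fullest + 4}) with {counts[fullest]} values")
```

A very narrow width produces many nearly empty bins, while a very wide one hides the shape of the distribution; a reasonable width sits between those extremes.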