Assignment_No1.pdf

School

University of Alberta

Course

342

Subject

Industrial Engineering

Date

Feb 20, 2024

Type

pdf

Pages

2

Uploaded by DoctorStraw26954 on coursehero.com

ECE 447: Data Analysis and Machine Learning for Engineers
Assignment A1
Due Thursday, February 8th, 2024, 11:55 PM
Two submission forms are possible: a document (answers “on paper”) or a Jupyter notebook file.
(The assignment is worth 100 pts, which is 5% of the final mark.)

A1-1. How would you define the terms restriction bias and preference bias? What is the difference between them? 10 pts

A1-2. The table below shows socioeconomic data for a selection of countries for the year 2009, using the following features:
COUNTRY: The name of the country
LIFE EXPECTANCY: The average life expectancy (in years)
INFANT MORTALITY: The infant mortality rate (per 1,000 live births)
EDUCATION: Spending per primary student as a percentage of GDP
HEALTH: Health spending as a percentage of GDP
HEALTH USD: Health spending per person converted into US dollars
Calculate the correlation between LIFE EXPECTANCY and each of the other features. Discuss the relationships and comment on the obtained results. 20 pts
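The correlation asked for in A1-2 is presumably the Pearson correlation coefficient; the handout does not name a specific measure, so that choice is an assumption. Below is a minimal sketch in plain Python. The function name pearson and the two short data lists are hypothetical illustrations, not values from the assignment's table.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations; the 1/(n-1) factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant feature (zero variance) would cause a division by zero here.
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for illustration only (NOT the assignment's data).
life_expectancy = [60.0, 70.0, 75.0, 80.0]
infant_mortality = [50.0, 20.0, 10.0, 4.0]

r = pearson(life_expectancy, infant_mortality)
print(f"r = {r:.3f}")  # strongly negative for these made-up numbers
```

A value near -1 or +1 indicates a strong linear relationship, and a value near 0 a weak one. Correlation alone does not establish that one feature causes the other.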
A1-3. A marketing company working for a charity has developed two different models that predict the likelihood that donors will respond to a mailshot asking them to make a special extra donation. The prediction scores generated for a test set for these two models are shown in the table below.
a) Using a classification threshold of 0.6, and assuming that true is the positive target level (value), construct a confusion matrix for each of the models. Use this threshold for all questions below. 15 pts
b) Calculate the simple accuracy and the average class accuracy (using an arithmetic mean) for each model. 10 pts
c) Based on the average class accuracy measures, which model appears to perform better at this task? 10 pts
d) Generate a cumulative gain chart for each model. 10 pts
e) The charity for which the model is being built typically has only enough money to send a mailshot to the top 20% of its contact list. Based on the cumulative gain charts generated in the previous part, would you recommend Model 1 or Model 2 for the charity? 10 pts
f) Generate ROC curves for both models (use a few threshold values). 15 pts
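The quantities named in A1-3 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a model answer: the function names, the targets and the scores are all hypothetical, and treating a score equal to the threshold as a positive prediction (>=) is an assumption the handout does not settle.

```python
def confusion_matrix(targets, scores, threshold=0.6):
    """Return (TP, FN, FP, TN). Assumption: score >= threshold predicts positive."""
    tp = fn = fp = tn = 0
    for target, score in zip(targets, scores):
        predicted = score >= threshold
        if target and predicted:
            tp += 1
        elif target:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn

def accuracy(tp, fn, fp, tn):
    """Simple accuracy: fraction of all instances classified correctly."""
    return (tp + tn) / (tp + fn + fp + tn)

def average_class_accuracy(tp, fn, fp, tn):
    """Arithmetic mean of the recall on the positive and the negative class."""
    return (tp / (tp + fn) + tn / (tn + fp)) / 2

def cumulative_gain(targets, scores, fraction):
    """Share of all positives found in the top `fraction` of instances by score."""
    ranked = sorted(zip(scores, targets), key=lambda pair: pair[0], reverse=True)
    k = round(fraction * len(ranked))
    captured = sum(1 for _, target in ranked[:k] if target)
    return captured / sum(1 for target in targets if target)

def roc_point(targets, scores, threshold):
    """One point on the ROC curve: (false positive rate, true positive rate)."""
    tp, fn, fp, tn = confusion_matrix(targets, scores, threshold)
    return fp / (fp + tn), tp / (tp + fn)

# Hypothetical test set for illustration only (NOT the assignment's table).
targets = [True, True, False, True, False, False, True, False, False, False]
scores = [0.9, 0.8, 0.7, 0.65, 0.5, 0.4, 0.3, 0.2, 0.15, 0.1]

cm = confusion_matrix(targets, scores, threshold=0.6)
print("TP, FN, FP, TN:", cm)
print("accuracy:", accuracy(*cm))
print("average class accuracy:", average_class_accuracy(*cm))
print("gain at top 20%:", cumulative_gain(targets, scores, 0.2))
print("ROC points:", [roc_point(targets, scores, t) for t in (0.25, 0.6, 0.75)])
```

Evaluating cumulative_gain over a range of fractions (for example 0.1, 0.2, ..., 1.0) gives the points of a cumulative gain chart, and evaluating roc_point over several thresholds gives the points of an ROC curve.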