CS 1173 Data Analysis and Visualization in MATLAB Laboratory 2
Cats vs Dogs

Objectives:

  • Analyze a new data set by applying tools from previous lessons and labs.
  • Further develop critical thinking skills.
  • Interpret data and draw conclusions.

Data.world, with the help of the American Veterinary Medical Association, conducted a study in 2018 on the popularity of cats versus dogs in the U.S. Their goal was to find out which pet is more popular in each state. This lab uses the same data but focuses instead on the number of pets within certain regions of the U.S.

Hand-in Requirements:

The lab should be submitted electronically through Blackboard under the Labs menu. Zip up your entire lab2 directory to submit. (Right-click the lab2 folder to zip all the files together.) Remember to put your Word document in the lab2 directory along with your script and the data.

File Description
Lab2Pets.zip The .zip file contains 3 files.

pets.mat
The dataset contains a 48x6 matrix. Each row represents a state, in alphabetical order, excluding Alaska and Hawaii. (For example, Alabama is row 1 and Wyoming is row 48.)

pets.mat has 6 columns, with the breakdown below:

  • Column 1 contains the number of households surveyed.
  • Column 2 contains the number of households with any cat or dog pets.
  • Column 3 contains the number of households that had at least 1 dog as a pet.
  • Column 4 contains the overall number of pet dogs.
  • Column 5 contains the number of households that had at least 1 cat as a pet.
  • Column 6 contains the overall number of pet cats.
us_states.csv lists the states in alphabetical order, each with its index number.

lab2.m is a template that you should use for Lab 2.
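
As a minimal sketch of how the six-column layout of pets.mat described above can be addressed in code, the snippet below names each column index once and then uses those names on a small made-up matrix. The constant names (COL_SURVEYED and so on) and the numbers in toy are illustrative assumptions; they are not part of pets.mat or of the lab2.m template.

```matlab
% Hypothetical names for the six columns described above.
COL_SURVEYED  = 1;  % households surveyed
COL_ANY_PET   = 2;  % households with any cat or dog
COL_DOG_HOMES = 3;  % households with at least 1 dog
COL_DOGS      = 4;  % overall number of pet dogs
COL_CAT_HOMES = 5;  % households with at least 1 cat
COL_CATS      = 6;  % overall number of pet cats

% Made-up 3x6 matrix standing in for the real 48x6 data.
toy = [100 60 40 55 30 50;
       200 90 50 70 60 95;
        50 20 10 12 15 22];

dogs_per_state = toy(:, COL_DOGS);   % 3x1 column of dog counts
cats_per_state = toy(:, COL_CATS);   % 3x1 column of cat counts
[n_states, n_cols] = size(toy);      % number of rows and columns
```

Naming the columns this way is only a readability choice; indexing with the bare numbers 1 through 6 behaves identically.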

Part I: Initial Setup

Part II: Identifying and extracting the regions

We will divide the US into 4 major regions: Northeast, Midwest, South, and West. To find which region each state belongs to, use the table below.

Region     List of States
Northeast  Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, Vermont, New Jersey, New York, and Pennsylvania
Midwest    Illinois, Indiana, Michigan, Ohio, Wisconsin, Iowa, Kansas, Minnesota, Missouri, Nebraska, North Dakota, and South Dakota
South      Delaware, Florida, Georgia, Maryland, North Carolina, South Carolina, Virginia, West Virginia, Alabama, Kentucky, Mississippi, Tennessee, Arkansas, Louisiana, Oklahoma, and Texas
West       Arizona, Colorado, Idaho, Montana, Nevada, New Mexico, Utah, Wyoming, California, Oregon, and Washington

Using the regional definitions above and the data in us_states.csv, determine which rows belong to which region, and create the variables below with the data for each region's states. For example, West has 11 states, so the resulting array should have 11 rows and 6 columns. Edit the code in lab2.m to create these variables.
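
One way to pull a group of rows out of a matrix is to index it with a vector of row numbers. The sketch below does this on a made-up 6x6 matrix. The names toy, demo_rows, and demo_region are hypothetical, and the row numbers are arbitrary: they are not the indices of any real region, which have to be read from us_states.csv.

```matlab
% Made-up 6x6 matrix standing in for the 48x6 pets data.
toy = magic(6);

% Hypothetical row numbers for one "region" (not real state indices).
demo_rows = [2 5 6];

% Keep the selected rows and all six columns.
demo_region = toy(demo_rows, :);

% One row per selected state, six columns.
disp(size(demo_region));
```

The same pattern scales to any region: the number of entries in the index vector becomes the number of rows in the result.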

Part III: Pie Chart of households surveyed by region

Create a new cell that uses the West, Midwest, Northeast and South variables you created above to draw a pie chart showing the percentage of households surveyed by region. Hint: There should be 4 pie pieces.
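
As a rough illustration of the idea, the sketch below sums column 1 (households surveyed) for four made-up groups and passes the totals to pie. The matrices demo_a through demo_d, their values, and the labels A to D are all invented for this example and do not correspond to the real regions.

```matlab
% Made-up region matrices (rows = states, column 1 = households surveyed).
demo_a = [100 0 0 0 0 0; 300 0 0 0 0 0];
demo_b = [200 0 0 0 0 0];
demo_c = [150 0 0 0 0 0; 50 0 0 0 0 0];
demo_d = [200 0 0 0 0 0];

% Total households surveyed in each group.
totals = [sum(demo_a(:, 1)), sum(demo_b(:, 1)), ...
          sum(demo_c(:, 1)), sum(demo_d(:, 1))];

% Share of each group as a percentage of all households surveyed.
shares = 100 * totals / sum(totals);

figure;
pie(totals, {'A', 'B', 'C', 'D'});   % pie scales the totals to a full circle
title('Households surveyed by group (made-up data)');
```

Because pie normalizes its input when the values sum to more than 1, the raw totals can be passed directly; the shares vector is computed here only to make the percentages available as numbers.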

Part IV: Comparing cats and dogs by region.

Plot 2 separate bar charts on the same figure (using subplot): the left chart shows the total number of cats by region, and the right chart shows the total number of dogs by region. Label and scale your axes appropriately.

To label each bar, use set(gca,'xticklabel',{'West','Midwest','Northeast','South'}). You must use the region_cats and region_dogs variables.
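
A minimal sketch of a two-panel layout is shown below. It assumes region_cats and region_dogs are 1x4 vectors of totals ordered West, Midwest, Northeast, South; the numbers are made up, and the shared y-axis limit (y_top) is one possible reading of "scale your axes appropriately", not a stated requirement.

```matlab
% Made-up totals, ordered West, Midwest, Northeast, South.
region_cats = [120 150 90 210];
region_dogs = [140 130 80 260];
names = {'West', 'Midwest', 'Northeast', 'South'};

% Shared upper limit so the two panels can be compared directly.
y_top = 1.1 * max([region_cats, region_dogs]);

figure;
subplot(1, 2, 1);                 % left panel
bar(region_cats);
set(gca, 'xticklabel', names);
ylim([0 y_top]);
xlabel('Region'); ylabel('Number of cats');
title('Cats by region (made-up data)');

subplot(1, 2, 2);                 % right panel
bar(region_dogs);
set(gca, 'xticklabel', names);
ylim([0 y_top]);
xlabel('Region'); ylabel('Number of dogs');
title('Dogs by region (made-up data)');
```

Using the same limits on both panels keeps bar heights visually comparable; letting each panel autoscale is also defensible if the two totals differ greatly.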

Part V: Statistics table

Populate the statistics variables in the script with the correct definitions. Once you are done, the script will print out a table of statistics similar to the one below. You are free to create other variables to help you compute these values.
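
Helper variables of this kind usually come from column-wise reductions such as sum, max, and mean. The sketch below applies them to a made-up 3x6 matrix and prints labelled values with fprintf. The variable names and row labels here are hypothetical; which reduction belongs to which row of the table is defined by the lab, not by this example.

```matlab
% Made-up 3x6 matrix in the pets.mat column layout.
toy = [100 60 40 55 30 50;
       200 90 50 70 60 95;
        50 20 10 12 15 22];

total_houses = sum(toy(:, 1));    % total of column 1
largest_dogs = max(toy(:, 4));    % largest single-row value in column 4
mean_cats    = mean(toy(:, 6));   % average of column 6 across rows

% Print labelled values; the fixed field widths keep the columns aligned.
fprintf('%-14s %10.2f\n', 'Total houses', total_houses);
fprintf('%-14s %10.2f\n', 'Largest dogs', largest_dogs);
fprintf('%-14s %10.2f\n', 'Mean cats', mean_cats);
```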


           All  Midwest  West  South  Northeast
Houses     ?    ?        ?     ?      ?
Max Dogs   ?    ?        ?     ?      ?
% of dogs  ?    ?        ?     ?      ?
Avg dogs   ?    ?        ?     ?      ?
Max Cats   ?    ?        ?     ?      ?
% of cats  ?    ?        ?     ?      ?
Avg cats   ?    ?        ?     ?      ?

where the rows mean:

Other requirements:

Implement each part of the lab in a separate cell. Document what each cell does.
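
In a MATLAB script, a line beginning with %% starts a new cell, and the text after it serves as the cell's title. The sketch below shows one possible way to lay out and document two cells; the cell titles and variable names are invented for illustration and are not taken from the lab2.m template.

```matlab
%% Cell 1: build made-up data
% Creates a small vector so the next cell has something to summarize.
demo_values = [3 1 4 1 5];

%% Cell 2: summarize the data
% Computes and displays the total of the values created in Cell 1.
demo_total = sum(demo_values);
disp(demo_total);
```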

Part VI: Analysis

Create a Microsoft Word document containing the following:

Grading rubric for Part I (point values)

Criterion                                                                    Performance indicator
                                                                             Missing  Needs improvement  Needs a little improvement  Meets expectations  Excellent
Part II graph is correct and has appropriate labeling                        0        2.5                4                           4.5                 5
Part III graph is correct and has appropriate labeling                       0        2.5                4                           4.5                 5
Part IV graph is correct and has appropriate labeling                        0        2.5                4                           4.5                 5
Part V statistics table has correct values                                   0        3.75               6                           6.75                7.5
Script runs without error                                                    0        3.75               6                           6.75                7.5
3 bullet points discussing why South has highest number of pets              0        2.5                4                           4.5                 5
3 bullet points comparing average number of cats and average number of dogs  0        2.5                4                           4.5                 5
3 bullet points discussing problems with the survey                          0        2.5                4                           4.5                 5
Paragraph on the implications                                                0        2.5                4                           4.5                 5

This project was created by Dave Patrick and input by Dawn Roberson of the University of Texas at San Antonio and last modified on 6 Oct 2019. Please contact dawnlee.roberson@utsa.edu with comments or suggestions.