LESSON 8: Vector logic for extracting data
FOCUS QUESTION: How can I extract the rows and columns of an array based on data characteristics?
This lesson demonstrates how to use relational and logical operators to extract data for analysis.
Contents
- DATA FOR THIS LESSON
- SETUP FOR LESSON 8
- EXAMPLE 1: Load the consolidated sleep diary data
- EXAMPLE 2: Calculate the number of students in section 3 (==)
- EXAMPLE 3: Calculate the average minutesToSleep of students in section 3 (indexing)
- EXAMPLE 4: Calculate the number of women in the cohort (strcmp to compare strings)
- EXAMPLE 5: Calculate the % of women in the cohort
- EXAMPLE 6: Calculate the number of women in section 3 ( & )
- EXAMPLE 7: Calculate the number of students in section 2 or in section 3 ( | )
- EXAMPLE 8: Calculate % of wakeups that used an alarm
- EXAMPLE 9: Calculate the number of wakeups that were 7:30 am or later ( >= )
- EXAMPLE 10: Calculate % of wakeups between 7:30 am and 9:45 am ( & )
- EXAMPLE 11: Calculate % of wakeups that are after 7:30 am or don't use an alarm ( | and ~)
- EXAMPLE 12: Find the subjects with the earliest average wakeup
- EXAMPLE 13: Find number of bedtimes between 10:30 pm and 2:30 am (relative date)
- SUMMARY OF SYNTAX
DATA FOR THIS LESSON
| File | Description |
| diaries.mat | Consolidated sleep diary data (loaded in EXAMPLE 1) |
SETUP FOR LESSON 8
- Set the Current Directory to Z:\working\MATLAB\Lesson8. (You will need to make a new directory for Lesson8.)
- Download diaries.mat from Blackboard.
- Create a new script called Lesson8Script.m. (Use File->New->Blank M-File from the main MATLAB menubar.) You will enter each of the examples in a new cell in this script.
EXAMPLE 1: Load the consolidated sleep diary data
Create a new cell in which you type and execute:
load diaries.mat; % Load the sleep diaries
You should see 8 variables in the Workspace Browser:
- bedTimes - an array with the bedtimes of individual students in the columns
- dayCaffeine - a logical array with columns indicating daytime caffeine use for individual students
- gender - a cell array of strings containing 'male' or 'female' designations for each student
- nightCaffeine - a logical array with columns indicating caffeine use after 6 pm for individual students
- section - a vector containing the section numbers of the individual students
- toSleepMinutes - an array with the number of minutes to fall asleep each night for the individual students
- useAlarm - a logical array with indications of alarm use for individual students in the columns.
- wakeTimes - an array with the wake times of individual students in the columns.
NOTE: All of the times are represented as doubles, which are real numbers. The integer part of the time gives the number of days since a reference day (in our case Jan 1, 0 AD) and the fractional part gives time on the current day represented as a fraction of 24 hours. You can use the datestr function to find out what date and time this double corresponds to (e.g., datestr(x) gives a string with the readable form of the date and time corresponding to the value x).
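As a small illustration of this representation, the sketch below builds one serial date number with datenum and separates its day part from its time-of-day part. The date used is an arbitrary choice for illustration and does not come from diaries.mat.

```matlab
% Hypothetical example value: 31-Dec-2010 at 7:30 am (not from the data set)
x = datenum(2010, 12, 31, 7, 30, 0);
dayPart = floor(x);               % whole days since the reference day
hourOfDay = (x - dayPart)*24;     % fraction of a day converted to hours
fprintf('%s is %g hours after midnight\n', datestr(x), hourOfDay);
```

The same subtraction of floor(x) is what later examples in this lesson use to turn wake times into hours of the day.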
EXAMPLE 2: Calculate the number of students in section 3 (==)
Create a new cell in which you type and execute:
sect3 = (section == 3);     % sect3 has 1's corresponding to section 3 students
totalSect3 = sum(sect3);    % Add up the true's (1's) to find number of students
fprintf('%g students in section 3\n', totalSect3);
You should see 2 variables in the Workspace Browser:
- sect3 - a vector with 1's corresponding to the students in section 3
- totalSect3 - the total number of students in section 3 (a single value)
You should also see the following output in the Command Window:
52 students in section 3
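To see what the comparison produces, here is a minimal sketch on a made-up vector of six section numbers (demoSection and the other names are hypothetical and are not part of diaries.mat):

```matlab
% Made-up section numbers for six students (illustration only)
demoSection = [1 3 2 3 3 1];
isThree = (demoSection == 3);   % logical vector: [0 1 0 1 1 0]
numThree = sum(isThree);        % each true counts as 1, so the sum is 3
fprintf('%g of %g students are in section 3\n', numThree, length(demoSection));
```

The result of == is a logical vector the same size as demoSection, which is why sum can count the matches directly.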
In the space below: Define a variable called sect4 that is a logical vector of the same length as section and has 1's (true) in the entries corresponding to students in section 4.
Enter your definition in this cell and execute the cell to create this variable.
EXAMPLE 3: Calculate the average minutesToSleep of students in section 3 (indexing)
Create a new cell in which you type and execute:
minutesSect3 = toSleepMinutes(:, sect3);   % Pick out columns of section 3 students
meanMinutes3 = mean(minutesSect3(:));      % Find overall mean
fprintf('Average minutes to sleep for section 3 students = %g\n', ...
    meanMinutes3);
You should see 2 variables in your Workspace Browser:
- minutesSect3 - array whose columns are minutes to fall asleep for section 3 students
- meanMinutes3 - the average number of minutes to fall asleep for students in section 3.
You should also see the following output in the Command Window:
Average minutes to sleep for section 3 students = 17.5321
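The column selection can be tried on a small made-up array first. In this sketch, demoMinutes and inGroup are hypothetical stand-ins for toSleepMinutes and sect3:

```matlab
% Made-up data: 2 nights (rows) by 4 students (columns)
demoMinutes = [10 20 30 40;
               12 22 32 42];
inGroup = logical([1 0 1 0]);             % true for students 1 and 3
groupMinutes = demoMinutes(:, inGroup);   % all rows, columns 1 and 3 only
groupMean = mean(groupMinutes(:));        % (:) stacks the values into one column
perStudent = mean(groupMinutes);          % without (:), one mean per column
fprintf('Overall mean = %g\n', groupMean);
```

Note that the logical vector used as a column index should have one entry per column of the array being indexed.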
- Define a variable called alarmUseSect3 that contains an array with the alarm use for section 3.
- Find the total number of times that students in section 3 woke up to an alarm.
EXAMPLE 4: Calculate the number of women in the cohort (strcmp to compare strings)
Create a new cell in which you type and execute:
women = strcmp(gender, 'female');   % women has 1's in positions corresponding to females
totalWomen = sum(women);            % Add up the trues (1's) to find number of women
fprintf('%g women in the cohort\n', totalWomen);
You should see 2 variables in your Workspace Browser:
- women - a logical vector with 1's corresponding to female students
- totalWomen - variable holding number of female students in cohort
You should also see the following output in the Command Window:
74 women in the cohort
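A short sketch of strcmp applied to a cell array of strings follows. The cell array demoLabels is made up for illustration. The == operator is not used here because it compares character arrays element by element and so is not suitable for strings of different lengths.

```matlab
% Made-up cell array of strings (illustration only)
demoLabels = {'red', 'green', 'red', 'blue', 'red'};
isRed = strcmp(demoLabels, 'red');   % logical array: [1 0 1 0 1]
numRed = sum(isRed);                 % count the matches
oneMatch = strcmp('red', 'red');     % comparing two strings gives a single true
fprintf('%g entries match\n', numRed);
```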
In the space below: Define a variable called men that holds a logical vector with 1's in the positions corresponding to male students.
Enter your definition in this cell and execute the cell to create the variable.
EXAMPLE 5: Calculate the % of women in the cohort
Create a new cell in which you type and execute:
totalStudents = length(gender);    % gender has an entry for each student
percentWomen = 100.*totalWomen./totalStudents;
fprintf('%g%% of the students in the cohort are women\n', percentWomen);
You should see 2 variables in your Workspace Browser:
- totalStudents - the total number of students in the cohort
- percentWomen - the percentage of students in the cohort that are female
You should also see the following output in the Command Window:
51.3889% of the students in the cohort are women
In the space below: Define a variable called fractMen that holds the fraction of students in the cohort that are male.
Enter your definition in this cell and execute the cell to create the variable.
EXAMPLE 6: Calculate the number of women in section 3 ( & )
Create a new cell in which you type and execute:
womenSect3 = women & sect3;       % 1's in positions of women in section 3
totalWomen3 = sum(womenSect3);    % Add up the trues (1's)
fprintf('%g women in section 3\n', totalWomen3);
You should see 2 variables in your Workspace Browser:
- womenSect3 - logical vector with 1's in positions of women in section 3
- totalWomen3 - total number of women in section 3
You should also see the following output in the Command Window:
30 women in section 3
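The element-wise AND can be checked by hand on two short logical vectors. The vectors below are made up for illustration; they must have the same size and orientation for & to pair up their entries.

```matlab
% Two made-up logical vectors describing the same five students
demoWomen = logical([1 1 0 0 1]);
demoSect3 = logical([1 0 1 0 1]);
both = demoWomen & demoSect3;   % true only where both are true: [1 0 0 0 1]
numBoth = sum(both);            % number of students satisfying both conditions
fprintf('%g students satisfy both conditions\n', numBoth);
```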
EXAMPLE 7: Calculate the number of students in section 2 or in section 3 ( | )
Create a new cell in which you type and execute:
sect2or3 = (section == 2) | (section == 3);   % sect2or3 has 1's for students in section 2 or 3
total2or3 = sum(sect2or3);                    % Add up the trues (1's)
fprintf('%g students in sections 2 and 3\n', total2or3);
You should see 2 variables in your Workspace Browser:
- sect2or3 - logical vector with ones corresponding to students in either section 2 or section 3
- total2or3 - total number of students in sections 2 and 3 combined
You should also see the following output in the Command Window:
98 students in sections 2 and 3
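A minimal sketch of the element-wise OR on made-up section numbers is shown below. It also shows ismember as one possible alternative when testing membership in a list of values; ismember is not used elsewhere in this lesson and appears here only for comparison.

```matlab
% Made-up section numbers (illustration only)
demoSection = [1 2 3 2 4 3];
in2or3 = (demoSection == 2) | (demoSection == 3);   % [0 1 1 1 0 1]
viaIsmember = ismember(demoSection, [2 3]);         % same logical result
fprintf('%g students in section 2 or 3\n', sum(in2or3));
```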
EXAMPLE 8: Calculate % of wakeups that used an alarm
Create a new cell in which you type and execute:
totalAlarms = sum(useAlarm(:));                % Add up the trues (1's)
[numDays, numDiaries] = size(bedTimes);        % How many rows and columns?
totalEntries = numDays*numDiaries;             % Total number of entries
percentAlarm = 100*totalAlarms/totalEntries;   % Percentage of total entries
fprintf('%g%% of the wake-ups used an alarm\n', percentAlarm);
You should see 5 variables in your Workspace Browser:
- totalAlarms - total number of days the cohort woke up to an alarm
- numDays - number of days in the study
- numDiaries - size of the cohort (number of participants)
- totalEntries - total number of useAlarm entries
- percentAlarm - percentage of times cohort members woke to an alarm
You should also see the following output in the Command Window:
66.2698% of the wake-ups used an alarm
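The percentage calculation can be verified on a tiny made-up logical array. This sketch also shows that taking the mean of the logical values gives the same fraction, which is an alternative formulation rather than the approach used in this lesson.

```matlab
% Made-up alarm use: 3 days (rows) by 2 students (columns)
demoAlarm = logical([1 0; 1 1; 0 1]);
[demoDays, demoStudents] = size(demoAlarm);          % 3 rows, 2 columns
percentByCount = 100*sum(demoAlarm(:))/(demoDays*demoStudents);
percentByMean = 100*mean(demoAlarm(:));              % mean of 0's and 1's is the fraction true
fprintf('%g%% of the entries are true\n', percentByCount);
```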
EXAMPLE 9: Calculate the number of wakeups that were 7:30 am or later ( >= )
Create a new cell in which you type and execute:
wakeupHours = (wakeTimes - floor(wakeTimes))*24;   % Fractional part of wakeTimes, in hours
wakeGE730 = (wakeupHours >= 7.5);                  % Which are >= 7:30 am?
totalWakeGE730 = sum(wakeGE730(:));                % Number of wake-ups at 7:30 am or later
fprintf('%g wake-ups are 7:30 am or later\n', totalWakeGE730);
You should see 3 variables in your Workspace Browser:
- wakeupHours - an array with the wake-up time of day for the cohort
- wakeGE730 - logical array with 1's corresponding to wake-ups 7:30 or later
- totalWakeGE730 - total times members of cohort got up at 7:30 am or later
You should also see the following output in the Command Window:
1971 wake-ups are 7:30 am or later
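The conversion from serial date numbers to hours of the day can be traced on a few made-up values. The day numbers below are arbitrary; only the fractional parts matter for the comparison.

```matlab
% Made-up serial dates: fractional parts are 6:00, 9:36, 7:30 and 12:00
demoWake = [734000.25   734001.40;
            734002.3125 734003.50];
demoHours = (demoWake - floor(demoWake))*24;   % approximately [6 9.6; 7.5 12]
lateWake = (demoHours >= 7.5);                 % true where wake-up is 7:30 am or later
numLate = sum(lateWake(:));                    % (:) so the sum covers the whole array
fprintf('%g wake-ups are 7:30 am or later\n', numLate);
```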
EXAMPLE 10: Calculate % of wakeups between 7:30 am and 9:45 am ( & )
Create a new cell in which you type and execute:
wakeBetween = (7.5 <= wakeupHours) & (wakeupHours <= 9.75);   % & means both
betweenPercent = 100*sum(wakeBetween(:))/totalEntries;        % Percentage of total entries
fprintf('%g%% of the wake-ups are between 7:30 am and 9:45 am\n', betweenPercent);
You should see 2 variables in your Workspace Browser:
- wakeBetween - logical array with ones corresponding to wakeups in [7:30 am, 9:45 am]
- betweenPercent - percentage of wakeups in [7:30 am, 9:45 am]
You should also see the following output in the Command Window:
34.7222% of the wake-ups are between 7:30 am and 9:45 am
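The following sketch, on made-up hours, shows why the range test is written as two comparisons joined by & rather than as a single chained expression. In MATLAB the chained form is evaluated left to right, so the logical result of the first comparison (0 or 1) is what gets compared against 9.75.

```matlab
% Made-up wake-up hours (illustration only)
demoHours = [6 8 9.75 11];
between = (7.5 <= demoHours) & (demoHours <= 9.75);   % [0 1 1 0]
chained = (7.5 <= demoHours <= 9.75);                 % [1 1 1 1], not a range test
fprintf('%g values in range, but the chained form reports %g\n', ...
    sum(between), sum(chained));
```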
EXAMPLE 11: Calculate % of wakeups that are after 7:30 am or don't use an alarm ( | and ~)
Create a new cell in which you type and execute:
orWakeups = (wakeupHours > 7.5) | ~useAlarm;      % | either one or both
orPercent = 100*sum(orWakeups(:))/totalEntries;   % Percentage of total entries
fprintf(['%g%% of the wake-ups are either after 7:30 am ', ...
    'or without an alarm\n'], orPercent);
You should see 2 variables in your Workspace Browser:
- orWakeups - logical array with ones corresponding to wakeups either after 7:30 am or without an alarm
- orPercent - percentage of wakeups that were either after 7:30 or didn't use an alarm
You should also see the following output in the Command Window:
75.5622% of the wake-ups are either after 7:30 am or without an alarm
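A small made-up example helps to check how ~ and | interact. The sketch also compares the result with an equivalent expression written with & and ~, which is simply another way to state the same condition.

```matlab
% Made-up hours and alarm use for four wake-ups (illustration only)
demoHours = [6 8 7 10];
demoAlarm = logical([1 1 0 0]);
either = (demoHours > 7.5) | ~demoAlarm;              % [0 1 1 1]
sameThing = ~((demoHours <= 7.5) & demoAlarm);        % NOT (early AND alarm)
fprintf('%g of %g wake-ups are late or without an alarm\n', ...
    sum(either), length(either));
```

Only the first wake-up is excluded: it is both before 7:30 am and used an alarm.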
EXAMPLE 12: Find the subjects with the earliest average wakeup
Create a new cell in which you type and execute:
averWakeup = mean(wakeupHours);               % Subject average wake up hour
earliest = min(averWakeup);                   % Earliest average wake up hour
earliestSub = find(averWakeup == earliest);   % Pick earliest subjects
fprintf('Earliest average wakeup time: %g\nEarliest subject(s):', earliest);
fprintf(' %g', earliestSub);                  % Separate print in case more than 1
fprintf('\n');                                % Start a new line
You should see 3 variables in your Workspace Browser:
- averWakeup - average wakeup time for each cohort member
- earliest - the earliest average wakeup time for members of cohort
- earliestSub - the subject number(s) who had earliest average wakeup
You should also see the following output in the Command Window:
Earliest average wakeup time: 5.50243
Earliest subject(s): 91
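The reason for combining min with find, rather than relying on the second output of min, shows up when there is a tie. The averages below are made up so that two subjects share the minimum.

```matlab
% Made-up average wake-up hours for four subjects, with a tie for earliest
demoAverages = [7.2 6.5 8.0 6.5];
lowest = min(demoAverages);                     % 6.5
lowestSubs = find(demoAverages == lowest);      % all positions of the minimum: [2 4]
[lowestAgain, firstSub] = min(demoAverages);    % second output gives only the first: 2
fprintf('Earliest subject(s):');
fprintf(' %g', lowestSubs);
fprintf('\n');
```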
EXAMPLE 13: Find number of bedtimes between 10:30 pm and 2:30 am (relative date)
Create a new cell in which you type and execute:
bed = (bedTimes - floor(wakeTimes))*24;                 % Hours relative to 0:00 of wake-up day
bedBetween = (-1.5 <= bed) & (bed <= 2.5);              % & means both are true
percentBetween = 100*sum(bedBetween(:))/totalEntries;   % Percentage of total entries
fprintf('%g%% of the bedtimes were between 10:30 pm and 2:30 am\n', ...
    percentBetween);
You should see 3 variables in your Workspace Browser:
- bed - each bedtime in hours relative to midnight (0:00) of the wake-up day; bedtimes before midnight are negative
- bedBetween - logical array with ones for bedtimes in [10:30 pm, 2:30 am]
- percentBetween - percentage of times cohort members went to bed between 10:30 pm and 2:30 am inclusive
You should also see the following output in the Command Window:
57.3743% of the bedtimes were between 10:30 pm and 2:30 am
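The relative-date calculation is easiest to follow with a few made-up bedtimes that fall on either side of midnight. In this sketch the wake-up day is the arbitrary day number 734000, and the three bedtimes are 11 pm the evening before, 1 am on the wake-up day, and 9 pm the evening before.

```matlab
% Made-up values: all three wake-ups are at 7:30 am on day 734000
demoWake = [734000.3125 734000.3125 734000.3125];
demoBed = [734000 - 1/24, 734000 + 1/24, 734000 - 3/24];   % 11 pm, 1 am, 9 pm
relHours = (demoBed - floor(demoWake))*24;       % approximately [-1 1 -3]
inRange = (-1.5 <= relHours) & (relHours <= 2.5);   % [1 1 0]
fprintf('%g of %g bedtimes are between 10:30 pm and 2:30 am\n', ...
    sum(inRange), length(inRange));
```

Measuring from midnight of the wake-up day puts 11 pm at about -1 and 1 am at about 1, so a single interval covers bedtimes on both sides of midnight. The plain hour of the day (23 and 1) would not.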
SUMMARY OF SYNTAX
| MATLAB syntax | Description |
| ind = find(x) | returns the positions of the nonzero elements of x. |
| Y = floor(X) | returns an array Y that is the same size as the array X. Each element of Y is the largest integer that is less than or equal to the corresponding element of X. For example, floor(3.5) is 3, floor(-2) is -2, and floor(-2.5) is -3. |
| n = length(A) | returns the number of elements in the longest dimension of A. |
| Logical element-wise operators: &, |, ~ | are used to combine arrays based on the logical element-wise operators AND, OR, and NOT. The logical value true corresponds to the value 1, and the logical value false corresponds to the value 0. If you apply a logical element-wise operator to an array of numerical values rather than logical values, MATLAB first creates a logical array with true corresponding to the nonzero elements and false corresponding to the zero elements. |
| Relational element-wise operators: >, <, >=, <=, ==, ~= | are used to compare arrays element by element. The result of applying a relational operator is a logical array of the same size as the input operands. The result has true in the positions where the relationship is true and false elsewhere. |
| [rows, col] = size(A) | returns the number of rows and the number of columns of the two-dimensional array A. |
| strcmp(s1, s2) | returns true if the string s1 is equal to the string s2 and false otherwise. |
| strcmp(s1, C) | returns a logical array of the same size as the cell array of strings C. The result has true in the positions corresponding to the entries of C that match s1 and false elsewhere. |
This lesson was written by Kay A. Robbins of the University of Texas at San Antonio and last modified on 31-Dec-2010. Please contact krobbins@cs.utsa.edu with comments or suggestions. The image is a photograph of a nocturnal instrument photographed by Michael Daly on 8/22/2009. The image is available on Wikipedia as http://en.wikipedia.org/wiki/Nocturnal_%28instrument%29.