LESSON 10: Vector logic questions
FOCUS QUESTION: How can I extract the rows and columns of an array based on data characteristics?
Contents
- EXAMPLE 1: Load the consolidated sleep diary data
- EXAMPLE 2: Calculate the number of students in section 2 ( == )
- EXAMPLE 3: Calculate the average minutesToSleep of students in section 2 (indexing)
- EXAMPLE 4: Calculate the number of men in the cohort (strcmp)
- EXAMPLE 5: Calculate the % of men in the cohort
- EXAMPLE 6: Calculate the number of men in section 2 (use &)
- EXAMPLE 7: Calculate the number of students in section 2 or in section 3 (|)
- EXAMPLE 8: Calculate the number of wake-ups that were 8:30 am or later (use >= )
- EXAMPLE 9: Calculate % of wake-ups between 7:30 am and 9:45 am ( &)
EXAMPLE 1: Load the consolidated sleep diary data
load diaries.mat;
Questions | Answers |
Where do the variables come from when this file is loaded? | The file was created by saving variables from a MATLAB workspace using the save command. When you load this type of file, MATLAB recreates the saved variables along with the values. |
What is the .MAT format? | The .MAT format is a binary format that allows you to save an entire workspace or multiple variables, including complex structures in a single file. |
What are the advantages of saving data in .MAT format? | The .MAT format efficiently stores variables and allows you to resume working in a workspace that you previously created. Thus, you don't have to reprocess data to put it in the form you need. |
What are the disadvantages of saving data in .MAT format? | The .MAT format is proprietary, meaning that it belongs to MathWorks. Files stored in .MAT format are not recognized by most other applications. Because the format is binary, you cannot examine the contents of such a file using a text editor. |
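The save/load round trip described in the answers above can be sketched as follows. This is a minimal illustration, not the way diaries.mat was actually produced: the variable values and the temporary file name are assumptions made up for the example.

```matlab
% Hypothetical variables standing in for real workspace data
section = [1 2 2 3];
gender = {'male', 'female', 'female', 'male'};

% Save both variables to a temporary .MAT file (file name is illustrative)
matFile = [tempname '.mat'];
save(matFile, 'section', 'gender');

% List the variables stored in the file without loading them
whos('-file', matFile);

% Remove the variables from the workspace, then recreate them from the file
clear section gender;
load(matFile);

fprintf('Reloaded %g section entries\n', length(section));
delete(matFile);   % clean up the temporary file
```

After the load, section and gender exist again with the values they had when they were saved.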
EXAMPLE 2: Calculate the number of students in section 2 ( == )
sect2 = (section == 2);
totalSect2 = sum(sect2);
fprintf('%g students in section 2\n', totalSect2);
46 students in section 2
Questions | Answers |
My Workspace Browser indicates that sect2 is a logical array. What does that mean? | Logical array element values are either true or false. |
Why are sect2's values displayed as 1 or 0 rather than true or false? | MATLAB represents the logical values true and false by 1 and 0, respectively. You can use either representation. |
Why not just make sect2 be integer or double? | Because sect2 is a logical array, you know that its values will only be 1 (true) or 0 (false) and not some other numerical value. Logical vectors are used to pick out rows and columns of other arrays and to express the answers to counting questions. |
Can I do arithmetic on logical values? | Yes, you can use logical values in arithmetic expressions. MATLAB just converts logical values to numeric 1's and 0's before doing the calculation. |
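The behavior of logical arrays in arithmetic can be checked on a small example. The section numbers below are made up for illustration and are not taken from diaries.mat.

```matlab
% Hypothetical section numbers for six students
section = [1 2 3 2 2 1];

inSect2 = (section == 2);      % logical vector: 0 1 0 1 1 0
disp(class(inSect2));          % logical

% sum treats each true as 1, so it counts the matches
countSect2 = sum(inSect2);     % 3
fprintf('%g students in section 2\n', countSect2);

% Arithmetic on logical values produces ordinary doubles
doubled = 2 * inSect2;         % 0 2 0 2 2 0
disp(class(doubled));          % double
```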
EXAMPLE 3: Calculate the average minutesToSleep of students in section 2 (indexing)
minutesSect2 = toSleepMinutes(:, sect2);
averMinutes2 = mean(minutesSect2(:));
fprintf('Average minutes to sleep for section 2 students = %g\n', ...
    averMinutes2);
Average minutes to sleep for section 2 students = 16.9482
Questions | Answers |
What is the purpose of using sect2 as the column specifier of toSleepMinutes? | This type of specifier allows you to select rows and columns based on a logical condition. MATLAB picks out the columns of toSleepMinutes corresponding to the positions where the specifier has 1's (true's). |
What is the size of minutesSect2 and why? | The toSleepMinutes array has 21 rows and 144 columns. The variable sect2 is a vector of length 144. (The index vector needs one element per column so that each element lines up with one student's column.) Since sect2 has 46 ones corresponding to the 46 students in section 2, minutesSect2 will have 21 rows and 46 columns. |
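Selecting columns with a logical vector is easier to follow on a small array. The sketch below uses made-up values, with rows standing for diary days and columns for students; the names toSleep and toSleepSect2 are illustrative.

```matlab
% Hypothetical data: 2 diary days (rows) by 4 students (columns)
toSleep = [10 20 30 40;
           15 25 35 45];
section = [2 3 2 1];               % one entry per student (column)

sect2 = (section == 2);            % 1 0 1 0
toSleepSect2 = toSleep(:, sect2);  % keeps columns 1 and 3

disp(size(toSleepSect2));          % 2 rows, 2 columns

% (:) strings the selected values into one column before averaging
averSect2 = mean(toSleepSect2(:)); % mean of 10, 15, 30, 35
fprintf('Average = %g\n', averSect2);
```

The number of columns kept equals the number of 1's in the logical vector, which is why the result has as many columns as there are students in the section.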
EXAMPLE 4: Calculate the number of men in the cohort (strcmp)
men = strcmp(gender, 'male');
totalMen = sum(men);
fprintf('%g men in the cohort\n', totalMen);
70 men in the cohort
Questions | Answers |
What does strcmp(A, s) do? | When A is a cell array of strings and s is a single string, the strcmp function creates a logical array that is the same size as A. The result has 1's in the positions where the element of A exactly matches s (the comparison is case-sensitive) and 0's elsewhere. |
Why is gender a cell array rather than an array of char? | The rows of a char array must all have the same length, whereas cell array elements can be of different lengths. We will almost always use cell arrays to represent arrays of strings. |
How can I distinguish a cell array from an ordinary array? | Use braces ({ }) to designate cell arrays and square brackets ([ ]) to designate ordinary arrays. |
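A small sketch of strcmp on a cell array of strings follows. The entries are made up; the capitalized 'Female' is included only to show that the match must be exact.

```matlab
% Hypothetical cell array of strings with entries of different lengths
gender = {'male', 'female', 'male', 'Female'};

isMale = strcmp(gender, 'male');      % 1 0 1 0
isFemale = strcmp(gender, 'female');  % 0 1 0 0 ('Female' differs in case)

disp(iscell(gender));                 % 1, because gender is a cell array
disp(class(gender{2}));               % char, because braces extract a cell's contents

fprintf('%g men\n', sum(isMale));
```

If case should be ignored, strcmpi performs the same comparison without regard to case.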
EXAMPLE 5: Calculate the % of men in the cohort
totalStudents = length(gender);
percentMen = 100.*totalMen./totalStudents;
fprintf('%g%% of the students in the cohort are men\n', percentMen);
48.6111% of the students in the cohort are men
EXAMPLE 6: Calculate the number of men in section 2 (use &)
menSect2 = men & sect2;
totalMen2 = sum(menSect2);
fprintf('%g men in section 2\n', totalMen2);
25 men in section 2
Questions | Answers |
What does A & B mean? | The & symbol represents the logical element-wise AND operator. The result of A & B is an array of 0's and 1's that is the same size as the arrays A and B. The result has 1 in entries corresponding to positions where both A and B are non-zero, and 0 otherwise. In the example, A (men) represents the students who are male and B (sect2) represents the students in section 2. The two conditions combine to give the men in section 2. In other words, menSect2 designates the subjects corresponding to men in section 2. |
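The element-wise AND can be traced by hand on five made-up students. The values below are illustrative and do not come from diaries.mat.

```matlab
% Hypothetical data for five students
section = [2 2 3 2 1];
gender = {'male', 'female', 'male', 'male', 'male'};

men = strcmp(gender, 'male');   % 1 0 1 1 1
sect2 = (section == 2);         % 1 1 0 1 0

% A position is true only when both conditions are true there
menSect2 = men & sect2;         % 1 0 0 1 0
fprintf('%g men in section 2\n', sum(menSect2));
```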
EXAMPLE 7: Calculate the number of students in section 2 or in section 3 (|)
sect2or3 = (section == 2) | (section == 3);
fprintf('%g students in sections 2 or 3\n', sum(sect2or3));
98 students in sections 2 or 3
Questions | Answers |
What does A | B mean? | The | symbol represents the logical element-wise OR operator. The result of A | B is an array of 0's and 1's that is the same size as the arrays A and B. The result has 1 in entries corresponding to positions where either A or B or both are non-zero, and 0 otherwise. In the example, A represents the section 2 students and B represents the section 3 students. The two conditions combine to give the students who are either in section 2 or in section 3. Note: Sometimes common usage would ask for the students in "sections 2 and 3" to mean students in either section. Be careful to understand what the true logical meaning is. |
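The difference between the everyday "and" and the logical AND shows up clearly on a small made-up vector. The use of ismember at the end is an alternative way to write the same test, shown as an aside rather than as the lesson's method.

```matlab
% Hypothetical section numbers for five students
section = [2 2 3 2 1];

% Students in section 2 or in section 3
sect2or3 = (section == 2) | (section == 3);   % 1 1 1 1 0
fprintf('%g students in sections 2 or 3\n', sum(sect2or3));

% The literal AND is never true: no student is in both sections at once
sect2and3 = (section == 2) & (section == 3);  % 0 0 0 0 0
fprintf('%g students in both sections\n', sum(sect2and3));

% Equivalent membership test against a list of sections
sameResult = ismember(section, [2 3]);        % 1 1 1 1 0
```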
EXAMPLE 8: Calculate the number of wake-ups that were 8:30 am or later (use >= )
wakeupHours = (wakeTimes - floor(wakeTimes))*24;
wakeGE830 = (wakeupHours >= 8.5);
totalWakeGE830 = sum(wakeGE830(:));
fprintf('%g wake-ups are 8:30 am or later\n', totalWakeGE830);
1484 wake-ups are 8:30 am or later
Questions | Answers |
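What is the floor function? | The floor function rounds each element of its operand down to the nearest integer that is less than or equal to it. For non-negative values, this is the same as throwing away the fractional part. Since wakeTimes is an array, floor creates an array of integers that is the same size as wakeTimes. |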
Why multiply by 24 to compute wakeupHours? | The expression wakeTimes - floor(wakeTimes) is an array containing the wake-up times in units of fraction of a day. Multiply by 24 to convert this expression to wake-up hours. |
What does A >= B mean? | The result of A >= B is an array of 0's and 1's that is the same size as the arrays A and B. If one operand is a scalar, as 8.5 is in the example, MATLAB compares it with every element of the other operand. The result has 1 in entries corresponding to positions where A is greater than or equal to B, and 0 otherwise. Use A >= B to find the locations where the element of A is at least as large as the corresponding element of B. |
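The conversion from day fractions to hours can be checked on a few made-up values. The sketch assumes that each wake-up time is stored as a whole number of days plus a fraction of a day; the specific day number 735000 is an arbitrary illustrative choice.

```matlab
% Assumption: times are whole days plus a day fraction (values are made up)
wakeTimes = 735000 + [0.25 0.3125 0.375 0.4375];    % 6:00, 7:30, 9:00, 10:30 am

% Strip the whole days, then convert the day fraction to hours
wakeupHours = (wakeTimes - floor(wakeTimes)) * 24;  % 6 7.5 9 10.5

% The scalar 8.5 is compared with every element
wakeGE830 = (wakeupHours >= 8.5);                   % 0 0 1 1
fprintf('%g wake-ups are 8:30 am or later\n', sum(wakeGE830(:)));

% floor rounds toward negative infinity; fix rounds toward zero
disp(floor(-2.5));   % -3
disp(fix(-2.5));     % -2
```

The fractions were chosen so that they are exactly representable in binary. With arbitrary times, rounding error can leave a value slightly below a boundary such as 8.5.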
EXAMPLE 9: Calculate % of wake-ups between 7:30 am and 9:45 am ( &)
wakeBetween = (7.5 <= wakeupHours) & (wakeupHours <= 9.75);
betweenPercent = 100*mean(wakeBetween(:));
fprintf('%g%% of the wake-ups are between 7:30 am and 9:45 am\n', betweenPercent);
34.7222% of the wake-ups are between 7:30 am and 9:45 am
Questions | Answers |
What does A & B mean? | The & symbol represents the logical element-wise AND operator. The result of A & B is an array of 0's and 1's that is the same size as the arrays A and B. The result has 1 in each entry where the corresponding elements of both A and B are non-zero, and 0 otherwise. In the example, A represents the wake-up times at or after 7:30 am and B represents the wake-up times at or before 9:45 am. The two conditions combine to give the wake-up times that satisfy both. In other words, wakeBetween designates the elements with wake-up times between 7:30 am and 9:45 am inclusive. |
Why not just write 7.5 <= wakeupHours <= 9.75 to designate the wake-up hours between 7:30 and 9:45 am? | Although this expression evaluates without an error, it does not give the correct result. For example, 3 <= 4 <= 2 is true. The reason is as follows. MATLAB evaluates the expression from left to right, and the <= operator takes two numerical values for comparison and returns a logical value. In the example, 3 <= 4 is true. MATLAB converts the true to a 1 for the second comparison. The second comparison then becomes 1 <= 2, which is true. |
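The pitfall with the chained comparison can be seen by running both forms on the same made-up hours. The names chained and wakeBetween below are illustrative.

```matlab
% Hypothetical wake-up hours: 6:00, 7:30, 9:00, 10:30 am
wakeupHours = [6 7.5 9 10.5];

% Chained form: (7.5 <= wakeupHours) gives 0 1 1 1, and every 0 or 1
% is <= 9.75, so the final result is true everywhere
chained = (7.5 <= wakeupHours <= 9.75);                       % 1 1 1 1

% Correct form: test each bound separately and combine with &
wakeBetween = (7.5 <= wakeupHours) & (wakeupHours <= 9.75);   % 0 1 1 0

fprintf('Chained form selects %g, correct form selects %g\n', ...
    sum(chained), sum(wakeBetween));
fprintf('%g%% of the wake-ups are in range\n', 100*mean(wakeBetween));
```

Taking the mean of a logical vector gives the fraction of true entries, so multiplying by 100 yields the percentage directly.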
This lesson was written by Kay A. Robbins of the University of Texas at San Antonio and last modified by Dawn Roberson on 1 March 2014. Please contact kay.robbins@utsa.edu with comments or suggestions. The image is a photograph of a nocturnal instrument photographed by Michael Daly on 8/22/2009. The image is available on Wikipedia as http://en.wikipedia.org/wiki/Nocturnal_%28instrument%29.