CS 1713 Week 8:
Searching and Sorting

Objectives:

Activities:

Reading: Chapter 10.4 - 10.5


Searching unordered arrays - the linear search

Searching means to find the position of a specified value in an array.

Linear search strategy: examine entries of an array in order until a match is found or all of the entries have been examined.

Example 1: The linear search returns the position of val in list.
     public static int linearSearch(int [] list, int val){
         for (int i = 0; i < list.length; i++)
             if (list[i] == val)
                return i;
         return -1;
     }

Exercise 1: Referring to Example 1, how many entries do you have to examine (on average) to find the position of val in list? How many entries do you have to examine (on average) to determine that val is not in list? How would you use linear search on an array of double?
Ans: Suppose list has n elements. Since it is equally likely that val will appear in the first half as the second, we examine n/2 on average as long as val is in list. If val is not in list we have to examine all n elements.
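For the last question in Exercise 1, one possible approach is to overload the method for an array of double. The following is only a sketch: the class name DoubleSearchSketch and the tolerance parameter eps are hypothetical and not part of the course code. Matching within a tolerance is a design choice, made here because an exact == test on double values can miss entries that differ only by rounding error.

```java
class DoubleSearchSketch {
    // Returns the first position whose entry is within eps of val, or -1.
    // eps is a hypothetical tolerance parameter; pass 0.0 for exact matching.
    public static int linearSearch(double[] list, double val, double eps) {
        for (int i = 0; i < list.length; i++)
            if (Math.abs(list[i] - val) <= eps)
                return i;
        return -1;
    }

    public static void main(String[] args) {
        double[] data = {1.5, 2.25, 0.1 + 0.2, 4.0};
        System.out.println(linearSearch(data, 0.3, 1e-9)); // prints 2
        System.out.println(linearSearch(data, 0.3, 0.0));  // prints -1 (0.1 + 0.2 is not exactly 0.3)
    }
}
```

The loop is the same as in Example 1, so the number of entries examined is unchanged: about n/2 on average for a successful search and n for an unsuccessful one.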

The linear search algorithm is said to be of order n. This is known as Big O of n and is written O(n).

Exercise 2: Write an overloaded linearSearch method that searches an array of Comparable. Add both versions of linearSearch to a Search class in a package containing the ArrayUtility class of Case Study 9.
Ans:
    public static int linearSearch(Comparable [] list, Comparable val){
        for (int i = 0; i < list.length; i++)
           if (list[i].compareTo(val) == 0)
              return i;
        return -1;
    }

Exercise 3: Describe a linear search strategy for looking up a name in the telephone book. Do you use a linear search when using the telephone book?
Ans: A linear search strategy would start at the very beginning of the phone book and scan every name in order until you found the name you were looking for. Of course, no one uses a linear search strategy. Instead, we rely on the fact that the names are in alphabetical order and move approximately to the place where the name should be located.

Searching a sorted array with the binary search

The binary search is a divide and conquer strategy that narrows down the search much like a telephone book search. The binary search requires the array values to be sorted either in ascending or descending order.

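Binary search strategy: compare the target value with the middle entry of the current range. If they match, the search is done. Otherwise discard the half that cannot contain the target and repeat on the remaining half until a match is found or the range is empty.

Binary search implementation: Write a binary search function for integer arrays. Assume that the arrays are sorted in ascending order.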

Example 2: The binarySearch for an array of int.
    public static int binarySearch(int [] list, int value) {
       int low = 0;
       int high = list.length - 1;
       while (low <= high) {
          int mid = low + (high - low)/2;   // same midpoint, but avoids int overflow of low + high
          if (list[mid] == value)
             return mid;
          else if (list[mid] < value)
             low = mid + 1;
          else
             high = mid - 1;
       }
       return -1;
    }
Exercise 4: How would the code change if the array list were sorted in descending order?
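One way to approach Exercise 4 is sketched below. Only the comparison that chooses which half to keep changes direction. The class name DescendingSearchSketch and the method name binarySearchDescending are hypothetical, chosen for illustration.

```java
class DescendingSearchSketch {
    // Binary search on an array sorted in DESCENDING order.
    public static int binarySearchDescending(int[] list, int value) {
        int low = 0;
        int high = list.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if (list[mid] == value)
                return mid;
            else if (list[mid] > value)  // larger values come first, so look to the right
                low = mid + 1;
            else                         // list[mid] is too small, so look to the left
                high = mid - 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {90, 70, 50, 30, 10};
        System.out.println(binarySearchDescending(data, 30)); // prints 3
        System.out.println(binarySearchDescending(data, 60)); // prints -1
    }
}
```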

Exercise 5: For an array of size n, how many times would you expect to repeat the dividing process?
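Ans: Let k be the number of times you halve the array. After k halvings about n/2^k entries remain, and the process stops when only one entry is left, so n/2^k = 1. Solving gives k = log2(n), so the binary search is O(log n).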
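The halving count in Exercise 5 can also be checked by experiment. The sketch below simply counts how many integer halvings it takes to get from n down to 1. The class and method names are hypothetical.

```java
class HalvingCountSketch {
    // Counts how many times n can be halved (integer division) before reaching 1.
    public static int countHalvings(int n) {
        int k = 0;
        while (n > 1) {
            n = n / 2;
            k++;
        }
        return k;
    }

    public static void main(String[] args) {
        int[] sizes = {32, 1024, 32768, 1048576};
        for (int n : sizes)
            System.out.println(n + " -> " + countHalvings(n));
        // prints:
        // 32 -> 5
        // 1024 -> 10
        // 32768 -> 15
        // 1048576 -> 20
    }
}
```

For sizes that are not powers of two the count is log2(n) rounded down, for example 9 for n = 1000.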

Exercise 6: Adapt the binarySearch for arrays of Comparable and add it to the Search class.
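A possible adaptation for Exercise 6 is sketched below, assuming the array is sorted in ascending order according to compareTo. It uses the raw Comparable type to match the other examples in these notes. The class name ComparableSearchSketch is hypothetical (in the case study the method would go in the Search class).

```java
@SuppressWarnings({"rawtypes", "unchecked"})
class ComparableSearchSketch {
    // Binary search on an array of Comparable sorted in ascending order.
    public static int binarySearch(Comparable[] list, Comparable value) {
        int low = 0;
        int high = list.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            int cmp = list[mid].compareTo(value);  // call compareTo once per step
            if (cmp == 0)
                return mid;
            else if (cmp < 0)      // list[mid] comes before value
                low = mid + 1;
            else                   // list[mid] comes after value
                high = mid - 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] names = {"Ann", "Bob", "Cy", "Dee"};
        System.out.println(binarySearch(names, "Cy")); // prints 2
        System.out.println(binarySearch(names, "Al")); // prints -1
    }
}
```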

Exercise 7: Fill in the following table (compare using a linear and binary search for looking up a name in a list for different size lists):
n                      log2(n)
2^5  = 32
2^10 = 1024
2^15 = 32,768
2^20 = 1,048,576

Sorting

Sorting is the process of rearranging a collection of items so that they appear in ascending or descending order. We will work on two sorts: the selection sort and the insertion sort.


The selection sort

Example 3: The following is an implementation of the selection sort for an array of Comparable. Comparing this to the textbook implementation, we have made a function for the inner loop.
    
   public static void selectionSort(Comparable [] list) {
      for (int index = 0; index < list.length - 1; index++) {
          int min = findMinPosition(list, index);
          Comparable temp = list[min];
          list[min] = list[index];
          list[index] = temp;
      }
   }

   public static int findMinPosition(Comparable [] list, int start) {
      int min = start;
      for (int scan = start + 1; scan < list.length; scan++)
         if (list[min].compareTo(list[scan]) > 0 )
            min = scan;
      return min;
   }
Exercise 7: Fill in the following table with the steps of the selection sort. Show the minimum position at each stage.

You can find a solution here.
list[0]  list[1]  list[2]  list[3]  list[4]  list[5]  list[6]  list[7]  list[8]  list[9]
98       90       28       40       82       24       14       94       53       69

Exercise 8: Implement the selection sort for an array of int. Why can't we use compareTo?
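Ans: An int is a primitive value, not an object, so it has no compareTo method. We compare the values directly with the > operator instead.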
 
  public static void selectionSort(int [] list) {
      for (int index = 0; index < list.length - 1; index++) {
          int min = findMinPosition(list, index);
          int temp = list[min];
          list[min] = list[index];
          list[index] = temp;
      }
   }

   public static int findMinPosition(int [] list, int start) {
      int min = start;
      for (int scan = start + 1; scan < list.length; scan++)
         if (list[min] > list[scan])
            min = scan;
      return min;
   }

Exercise 9: Compute the order of the selectionSort algorithm for n elements.
Ans: The first time through the loop, you examine n elements, the second time n - 1, etc. The sum of the integers from 1 to n is n*(n+1)/2. Therefore the selectionSort is O(n^2).
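The quadratic growth in Exercise 9 can be confirmed by counting. The sketch below counts comparisons in the inner loop rather than elements examined, so the total is (n-1) + (n-2) + ... + 1 = n*(n-1)/2, which grows like n^2 just as the sum above does. The class and method names are hypothetical.

```java
class SelectionCountSketch {
    // Runs a selection sort on list and returns the number of
    // comparisons made in the inner loop.
    public static long countComparisons(int[] list) {
        long count = 0;
        for (int index = 0; index < list.length - 1; index++) {
            int min = index;
            for (int scan = index + 1; scan < list.length; scan++) {
                count++;                       // one comparison per inner iteration
                if (list[min] > list[scan])
                    min = scan;
            }
            int temp = list[min];
            list[min] = list[index];
            list[index] = temp;
        }
        return count;
    }

    public static void main(String[] args) {
        int[] data = {98, 90, 28, 40, 82, 24, 14, 94, 53, 69};
        long n = data.length;
        System.out.println(countComparisons(data)); // prints 45
        System.out.println(n * (n - 1) / 2);        // prints 45
    }
}
```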

Exercise 10: The following static printArray method is useful for instrumenting a sort. Add this method to a Print class in the package with the Search class. Use it to instrument the selection sort. Run and check your hand results against the instrumented output.
    public static void printArray(Comparable[] list, String msg) {
      System.out.print("\n[");
      for (int i = 0; i < list.length; i++)
         System.out.print(" " + list[i]);
      System.out.println(" ]: " + msg);
   }
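One possible way to instrument the selection sort for Exercise 10 is sketched below. It prints the array before sorting and after each pass of the outer loop. The class name InstrumentedSortSketch and the message wording are hypothetical, and the inner loop is written inline instead of calling findMinPosition so that the block is self-contained.

```java
@SuppressWarnings({"rawtypes", "unchecked"})
class InstrumentedSortSketch {
    public static void printArray(Comparable[] list, String msg) {
        System.out.print("\n[");
        for (int i = 0; i < list.length; i++)
            System.out.print(" " + list[i]);
        System.out.println(" ]: " + msg);
    }

    // Selection sort that reports the array contents after every pass.
    public static void selectionSort(Comparable[] list) {
        printArray(list, "before sorting");
        for (int index = 0; index < list.length - 1; index++) {
            int min = index;
            for (int scan = index + 1; scan < list.length; scan++)
                if (list[min].compareTo(list[scan]) > 0)
                    min = scan;
            Comparable temp = list[min];
            list[min] = list[index];
            list[index] = temp;
            printArray(list, "after pass " + index + ", minimum was at position " + min);
        }
    }

    public static void main(String[] args) {
        Integer[] data = {98, 90, 28, 40, 82, 24, 14, 94, 53, 69};
        selectionSort(data);
    }
}
```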

The insertion sort

The insertion sort strategy (the card player's sort):
  • Example 4: The following uses the insertion sort to sort an array of Comparable. Again we make a helper function for the inner loop.
         
       public static void insertionSort (Comparable [] list) {
          for (int index = 1; index < list.length; index++) {
             insertItem(list, index);
          }
       }
    
       public static void insertItem(Comparable [] list, int index) {
           Comparable key = list[index];
           int position = index;
     
             // shift larger values to the right
           while (position > 0 && key.compareTo(list[position-1]) < 0)   {
              list[position] = list[position-1];
              position--;
           }
           list[position] = key;
        }
    
    Exercise 10: Fill in the following table with the steps of the insertion sort. You can find a solution here.
    list[0]  list[1]  list[2]  list[3]  list[4]  list[5]  list[6]  list[7]  list[8]  list[9]
    98       90       28       40       82       24       14       94       53       69


    You can find a sample to print out here and a solution here.

    You can also find additional examples here.

    Estimating running time

    It is useful to have some idea about how long it will take to run a program.
    Is it feasible to sort an array of size 1000 or 10,000, or 100,000?

    For many programs, the running time mainly depends on how many times the body of the (inner) loop is executed.

    Example 5: Suppose it takes at most 10 ms to do a linear search on an array of size 100,000. How long will it take to do a linear search on a similar array of size 1,000,000?
    Answer: Since the number of times through the loop is the size of the array (in the worst case) and the second array is 10 times as large, it should take 10 times as long, or 100 ms.

    In general:

        loop iterations of A     running time of A
       ---------------------- = -------------------
        loop iterations of B     running time of B
    
    In the above example, if the unknown running time is t:
         100,000      10 ms.
       ----------- = --------
        1,000,000       t
    
    Solving gives t = 100 ms.

    Exercise 11: It takes at most 2 seconds to sort an array of size 10,000 using a selection sort. How long would it take to sort a similar array of size 30,000?

    Answer: Since it takes about n^2/2 iterations of the inner loop to do a selection sort on an array of size n:

        (10,000)^2/2      2 seconds
        ------------- =  -----------
        (30,000)^2/2      t seconds

    Solving gives t = 2 * (30,000/10,000)^2 = 2 * 9 = 18 seconds.

    Notice that the factor of 1/2 on the left side cancels out, so we get the same answer if we use n^2 for the number of iterations instead of n^2/2. This is another way of saying that the selection sort algorithm is O(n^2).
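The proportion used in Example 5 and Exercise 11 can be written as a small helper. This is only a sketch: the class name RunningTimeSketch and the method name estimate are hypothetical, and the result is a rough estimate that assumes running time is proportional to the number of loop iterations.

```java
class RunningTimeSketch {
    // Scales a measured running time by the ratio of loop-iteration counts:
    //   knownIterations / newIterations = knownTime / newTime
    public static double estimate(double knownTime, double knownIterations, double newIterations) {
        return knownTime * newIterations / knownIterations;
    }

    public static void main(String[] args) {
        // Example 5: linear search, n iterations, 10 ms for n = 100,000
        System.out.println(estimate(10.0, 100000.0, 1000000.0));   // prints 100.0 (ms)

        // Exercise 11: selection sort, about n^2/2 iterations, 2 s for n = 10,000
        double small = 10000.0 * 10000.0 / 2;
        double large = 30000.0 * 30000.0 / 2;
        System.out.println(estimate(2.0, small, large));           // prints 18.0 (seconds)
    }
}
```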

    Here is a summary for the algorithms we have done so far.
    In each case the array has size n.
     algorithm        loop iterations
     linear search    n
     binary search    log2(n)
     selection sort   n^2
     insertion sort   n^2


    Case Study 14: Searching and sorting arrays

    Create a new project called cs14. All classes in this project will be in a package called sort. Add classes called Search and SearchSortTester. The Search class contains static methods for the searching techniques discussed when the array contains Comparable objects and when the array contains int values.

    Add a class called QuadraticSort to sort. The QuadraticSort class contains static methods for the sorting techniques discussed when the array contains Comparable objects and when the array contains int values.

    Timing in Java
    The nanoTime static method of the System class is useful for determining how long it takes to execute code in Java.
    The method returns the number of nanoseconds since some arbitrary time.
    A nanosecond is a billionth of a second (10^-9 seconds).
    Here is a simple method for printing out the number of seconds it takes to sort an array:
       private static final double BILLION = 1000000000.0;
    
       long startTime, endTime;
    
       startTime = System.nanoTime();
       QuadraticSort.insertionSort(list);
       endTime = System.nanoTime();
       System.out.println("Number of seconds to do the insertion sort is "+
                          (endTime - startTime)/BILLION); 
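The timing fragment above can be turned into a complete program along the following lines. This is a sketch under stated assumptions: the class name TimingSketch and the method secondsToSort are hypothetical, and the int version of the insertion sort is included inline so the block runs without the QuadraticSort class. Measured times vary from run to run and from machine to machine.

```java
import java.util.Random;

class TimingSketch {
    private static final double BILLION = 1000000000.0;

    // Insertion sort for int, following the structure of Example 4.
    public static void insertionSort(int[] list) {
        for (int index = 1; index < list.length; index++) {
            int key = list[index];
            int position = index;
            // shift larger values to the right
            while (position > 0 && key < list[position - 1]) {
                list[position] = list[position - 1];
                position--;
            }
            list[position] = key;
        }
    }

    // Fills an array of size n with random values and returns the
    // number of seconds the insertion sort takes on it.
    public static double secondsToSort(int n) {
        Random rand = new Random();
        int[] list = new int[n];
        for (int i = 0; i < n; i++)
            list[i] = rand.nextInt(10 * n);
        long startTime = System.nanoTime();
        insertionSort(list);
        long endTime = System.nanoTime();
        return (endTime - startTime) / BILLION;
    }

    public static void main(String[] args) {
        // Doubling n should roughly quadruple the time for a quadratic sort.
        for (int n = 1000; n <= 16000; n *= 2)
            System.out.println("n = " + n + ": " + secondsToSort(n) + " seconds");
    }
}
```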
    
    

    Add some timing tests to Case Study 14. How large can the arrays get before sorting takes a long time?

    Below are some useful methods for creating large arrays to sort:

        public static void printArray(int[] array) {
          for (int i = 0; i < array.length; i++)
            System.out.print(array[i] + " ");
          System.out.println();
        }
    
        // makeArray uses java.util.Random, so the file needs: import java.util.Random;
        public static int[] makeArray(int n) {
          Random rand = new Random();
          int[] array = new int[n];
          for (int i = 0; i < n; i++)
            array[i] = rand.nextInt(10*n);
          return array;
        }
    
        public static int[] cloneArray(int[] array) {
          int[] newArray = new int[array.length];
          for (int i = 0;i < array.length; i++)
            newArray[i] = array[i];
          return newArray;
        }
    

    The quadratic sorts we discussed here are too slow for large arrays because n^2 grows quickly for large values of n. Next semester we will look at sorting methods that take about n log2(n) steps.


    n                      log2(n)    n log2(n)    n^2
    2^5  = 32
    2^10 = 1024
    2^15 = 32,768
    2^20 = 1,048,576
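The entries of the table can be computed rather than worked out by hand. The sketch below does this for n = 2^k, where log2(n) is simply k. The class name GrowthTableSketch and the method row are hypothetical.

```java
class GrowthTableSketch {
    // Returns { n, log2(n), n*log2(n), n^2 } for n = 2^k.
    // Uses long because n^2 overflows int for the larger sizes.
    public static long[] row(int k) {
        long n = 1L << k;
        return new long[] { n, k, n * k, n * n };
    }

    public static void main(String[] args) {
        int[] exponents = {5, 10, 15, 20};
        System.out.println("n  log2(n)  n log2(n)  n^2");
        for (int k : exponents) {
            long[] r = row(k);
            System.out.println(r[0] + "  " + r[1] + "  " + r[2] + "  " + r[3]);
        }
    }
}
```

For n = 2^20 the last two columns differ by a factor of more than 50,000, which is why the quadratic sorts become impractical for large arrays.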

    The Java Arrays class has efficient static methods for sorting arrays. For example, to sort an array, list, of ints you can use:
    Arrays.sort(list).
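A short usage sketch follows. Arrays.sort and Arrays.toString are in java.util.Arrays; the class name ArraysSortSketch and the helper sortedCopy are hypothetical. Note that Arrays.sort rearranges the array in place, so the helper works on a copy to leave the original untouched.

```java
import java.util.Arrays;

class ArraysSortSketch {
    // Returns a sorted copy, leaving the argument unchanged.
    public static int[] sortedCopy(int[] list) {
        int[] copy = Arrays.copyOf(list, list.length);
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        int[] list = {98, 90, 28, 40, 82, 24, 14, 94, 53, 69};
        Arrays.sort(list);   // sorts list in place, ascending
        System.out.println(Arrays.toString(list));
        // prints [14, 24, 28, 40, 53, 69, 82, 90, 94, 98]
    }
}
```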