Sorting and Searching

A sort is an algorithm for ordering the elements of an array. Sorting is a fundamental component of many applications. We can base a sort on any property of a data type that imposes an ordering on its values. We usually sort numbers on their value (into either increasing or decreasing order). Strings are often sorted based on the ANSI values of the characters or alphabetically ignoring case, but other measures are possible. For example, they might be sorted based on string length.

We will look at two simple sorting methods, the bubble sort and the Shell sort. Both of these methods compare pairs of elements in an array and, if they are not in the correct order, swap the element positions. They differ in how they arrange the pair-wise comparisons to ensure that all elements are in order.

The swap is a basic operation required in sorting. It is generally implemented using a temporary variable to hold one of the values. To swap two array elements:

    Dim a(1 To 10) As Single
    Dim temp As Single
    …
    Let temp = a(5)
    Let a(5) = a(8)
    Let a(8) = temp

The Bubble Sort

The bubble sort compares adjacent elements through the array and swaps them if they are out of order. Several passes through the array are necessary. The steps for the first pass are:

1. Compare the first and second items. If they are out of order, swap them.
2. Compare the second and third items (the second item may have originally been the first). If they are out of order, swap them.
3. Repeat this pattern to the end of the list.

After one pass, we can be sure that the greatest element is at the end of the list (assuming we are sorting in increasing order). So when we do the second pass we repeat the steps, except that we do not have to compare the second to last element to the last element (since the last element is known to be the greatest).
After the second pass, we can be sure that the second greatest element has been moved to the second to last position, so it does not have to be compared in the third pass. We repeat this process, reducing the number of elements compared by one each pass, until in the final pass we only have two elements to compare.

Bubble sort example (in each table the leftmost column is the list at the start of the pass, and each further column shows the list after one more comparison):

First Pass

    Pebbles   Barney    Barney    Barney    Barney
    Barney    Pebbles   Pebbles   Pebbles   Pebbles
    Wilma     Wilma     Wilma     Fred      Fred
    Fred      Fred      Fred      Wilma     Dino
    Dino      Dino      Dino      Dino      Wilma

After the first pass, Wilma is correctly placed in the last position. Other elements have been swapped, but we cannot be sure that they are in the right position.

Second Pass

    Barney    Barney    Barney    Barney
    Pebbles   Pebbles   Fred      Fred
    Fred      Fred      Pebbles   Dino
    Dino      Dino      Dino      Pebbles
    Wilma     Wilma     Wilma     Wilma

On the second pass, Pebbles is put into the correct place. We do not have to compare the last element to the second last because we know the last element is in the correct position.

Third Pass

    Barney    Barney    Barney
    Fred      Fred      Dino
    Dino      Dino      Fred
    Pebbles   Pebbles   Pebbles
    Wilma     Wilma     Wilma

This time Fred is placed correctly. Although it happens in this case that the top two elements are in the right order, this is not generally true (try replacing Dino with Alpha in the original ordering).

Fourth Pass

    Barney    Barney
    Dino      Dino
    Fred      Fred
    Pebbles   Pebbles
    Wilma     Wilma

We check the ordering of only the top two elements on the last pass.

If on any pass there are no swaps, then the list is in order. We can use this to terminate the sort early whenever it is detected. In practice, however, this does not speed things up much. It may help if the list is "almost" in order, that is, if no entry is very far from its correct position. Notice that entries can move down the list quickly but can only move up one position per pass.

A bubble sort is implemented with nested loops. The outer loop controls the pass and is performed one less time than the number of elements (4 times in this case).
The inner loop controls the comparisons and is performed one less time for each iteration of the outer loop (4, 3, 2, and 1 times).

    Dim nom(1 To 5) As String

    Private Sub cmdSort_Click()
        Dim passNum As Integer, i As Integer, temp As String
        Rem Bubble sort names
        For passNum = 1 To 4            'Number of passes is 1 less than number of items
            For i = 1 To 5 - passNum    'Each pass needs 1 less comparison
                If nom(i) > nom(i + 1) Then
                    Let temp = nom(i)
                    Let nom(i) = nom(i + 1)
                    Let nom(i + 1) = temp
                End If
            Next i
        Next passNum
        Rem Display alphabetized list
        picNames.Cls
        For i = 1 To 5
            picNames.Print nom(i),
        Next i
    End Sub

    Private Sub Form_Load()
        Rem Fill array with names
        Let nom(1) = "Pebbles"
        Let nom(2) = "Barney"
        Let nom(3) = "Wilma"
        Let nom(4) = "Fred"
        Let nom(5) = "Dino"
    End Sub

Parallel Arrays

We often want to sort associated pieces of information based on the value of one of the pieces of information. For example, we might want to sort the names of cities based on their population. One way to do this is to use parallel arrays. Parallel arrays contain related information at each position. When we sort the arrays based on one piece of information we must maintain this relationship in all of the other arrays: whenever we swap in the array on which the sort is based, we must also swap in every parallel array. For example, if populations are in the array pop() and the city names are in the array city(), the comparison and swap would look like:

    'The < comparison puts the largest population first (decreasing order)
    If pop(index) < pop(index + 1) Then
        ' Swap the pop values
        Let popTemp = pop(index)
        Let pop(index) = pop(index + 1)
        Let pop(index + 1) = popTemp
        ' Swap the city strings
        Let cityTemp = city(index)
        Let city(index) = city(index + 1)
        Let city(index + 1) = cityTemp
    End If

The Shell Sort

The bubble sort turns out to be too slow for long lists. A more efficient sort for long lists is the Shell sort. The Shell sort starts by comparing elements at widely separated positions and works its way down to comparing nearby positions.
In its simplest form, the gap begins at (roughly) half the size of the list and is successively reduced by half until adjacent items are compared. The algorithm for an array of size n is:

1. Begin with a gap of g = Int(n/2).
2. Compare items 1 and 1 + g, 2 and 2 + g, … , n - g and n. Swap any pairs that are out of order.
3. Repeat Step 2 (without changing g) until no more swaps are made for gap g.
4. Let g = Int(g/2).
5. Repeat Steps 2, 3, and 4 until g = 0.

Shell sort applied to the example (as before, the leftmost column is the list at the start of the pass and each further column shows the list after one more comparison):

gap = Int(5/2) = 2

First Pass

    Pebbles   Pebbles   Pebbles   Pebbles
    Barney    Barney    Barney    Barney
    Wilma     Wilma     Wilma     Dino
    Fred      Fred      Fred      Fred
    Dino      Dino      Dino      Wilma

Swap Wilma and Dino, so try again with the same gap.

Second Pass (gap unchanged)

    Pebbles   Dino      Dino      Dino
    Barney    Barney    Barney    Barney
    Dino      Pebbles   Pebbles   Pebbles
    Fred      Fred      Fred      Fred
    Wilma     Wilma     Wilma     Wilma

Swap Pebbles and Dino, so try again with the same gap.

Third Pass (gap unchanged)

    Dino      Dino      Dino      Dino
    Barney    Barney    Barney    Barney
    Pebbles   Pebbles   Pebbles   Pebbles
    Fred      Fred      Fred      Fred
    Wilma     Wilma     Wilma     Wilma

No swaps, so calculate new gap.

gap = Int(gap/2) = 1

Fourth Pass

    Dino      Barney    Barney    Barney    Barney
    Barney    Dino      Dino      Dino      Dino
    Pebbles   Pebbles   Pebbles   Fred      Fred
    Fred      Fred      Fred      Pebbles   Pebbles
    Wilma     Wilma     Wilma     Wilma     Wilma

Swap Dino and Barney, Pebbles and Fred, so try again with the same gap.

Fifth Pass (gap unchanged)

    Barney    Barney    Barney    Barney    Barney
    Dino      Dino      Dino      Dino      Dino
    Fred      Fred      Fred      Fred      Fred
    Pebbles   Pebbles   Pebbles   Pebbles   Pebbles
    Wilma     Wilma     Wilma     Wilma     Wilma

No swaps, so calculate new gap.

gap = Int(gap/2) = 0

Sort is complete.

Note that this took 17 comparisons (three passes of 3 comparisons at gap 2, then two passes of 4 comparisons at gap 1), while the bubble sort took only 10. As lists get longer (more than about 30 elements) the Shell sort outperforms the bubble sort.

The Shell sort also uses nested loops, but in this case there are three loops and only the innermost loop can use the For…Next structure. The middle level loop repeats until no swaps are made in the inner loop.
A flag variable is used to control this loop. The outer loop repeats once for each gap size. The number of outer loop iterations required is known implicitly before the loop is executed (it is a function of numParts), but, since we have to calculate the gap anyway, it is easier to just check for a gap value of zero.

Shell sort code example:

    …
    Let gap = Int(numParts / 2)
    Do While gap >= 1
        Do
            Let doneFlag = 1    'Assume this pass makes no swaps
            For index = 1 To numParts - gap
                If part(index) > part(index + gap) Then
                    Let temp = part(index)
                    Let part(index) = part(index + gap)
                    Let part(index + gap) = temp
                    Let doneFlag = 0    'A swap was made, so repeat with the same gap
                End If
            Next index
        Loop Until doneFlag = 1
        Let gap = Int(gap / 2)
    Loop
    …

Searching

In a search we want to find the element in a list that matches a key value. Sequential search starts at the beginning of the list and compares each element of the list to the key. In a list in random order, we might find the key on the first comparison or on the last, but on average we would have to compare the key to half of the elements. This can take a long time for long lists.

If a list is sorted we can speed up the search considerably, using a binary search. Assume a list is sorted in ascending order. In a binary search, we look at the value of the middle element of the list. If its value is greater than the key value, then we know that the element we want is in the lower half of the list; if its value is less than the key value, then the element is in the upper half of the list; and if it is equal to the key value, we are done. We can then ignore half of the list and search the remaining half. This process is repeated, with the list being halved each time, until the element is found or no elements remain to be searched.

Search an array a of n items (1 to n) in ascending order for key:

1. Set first = 1, last = n, found = False
2. Set middle = Int((first + last)/2)
3. If a(middle) = key then set found true
4. Else if a(middle) > key then set last = middle - 1
5. Else set first = middle + 1 (a(middle) must be < key)
6. Repeat 2 - 5 until found is true or first > last.

After completion the found flag indicates whether an element with the key value was found. Note that if there is more than one matching element, only one is found and there is no guarantee which one it is. If the found flag is true then middle gives the location of the element found. This is particularly useful when using parallel arrays.

The following code modifies the example on page 360 of the text to make the binary search a general purpose function.

    ' Binary search (ignores case; target() must be sorted in ascending order)
    Private Function binarySearch(target() As String, key As String) As Integer
        Dim first As Integer, middle As Integer, last As Integer
        Dim foundFlag As Integer
        Dim upperKey As String
        Let upperKey = UCase(key)    'Compare upper case to upper case
        Let first = 1
        Let last = UBound(target)
        Do While (first <= last) And (foundFlag = 0)
            Let middle = Int((first + last) / 2)
            Select Case UCase(target(middle))
                Case upperKey
                    Let foundFlag = 1
                Case Is > upperKey
                    Let last = middle - 1
                Case Is < upperKey
                    Let first = middle + 1
            End Select
        Loop
        If foundFlag = 1 Then
            Let binarySearch = middle
        Else
            Let binarySearch = 0
        End If
    End Function

To use the function to find the population of a city, assuming the city array is sorted and pop is a parallel array holding the populations:

    …
    Dim city(1 To 10) As String, pop(1 To 10) As Single
    Dim index As Integer
    Let index = binarySearch(city(), "Dallas")
    If index <> 0 Then
        picBox.Print "The pop of Dallas is"; pop(index)
    Else
        picBox.Print "Dallas is not in the list of cities"
    End If
    …

One problem with this function is that it assumes the array starts at 1. We could compare LBound(target) to 1 and notify the user or the calling program in some way. …
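
One way around the lower-bound assumption, instead of checking LBound(target) against 1, is to start the search at LBound(target) itself. The following is only a sketch of that idea, not the textbook's code. The function name binarySearchAny and the choice of LBound(target) - 1 as the "not found" value are assumptions made for illustration. It also assumes the array is sorted in ascending order ignoring case, and that the lower bound is not the smallest possible Integer (otherwise LBound(target) - 1 would overflow).

```vb
' Sketch only: binary search that works for any lower bound.
' binarySearchAny is a hypothetical name, not a function from the text.
' Returns the index of a matching element, or LBound(target) - 1 if
' no element matches (an assumed convention for "not found").
Private Function binarySearchAny(target() As String, key As String) As Integer
    Dim first As Integer, middle As Integer, last As Integer
    Dim upperKey As String, item As String
    Let upperKey = UCase(key)
    Let first = LBound(target)                 'Start at the real lower bound
    Let last = UBound(target)
    Let binarySearchAny = LBound(target) - 1   'Stays this way unless a match is found
    Do While first <= last
        Let middle = Int((first + last) / 2)
        Let item = UCase(target(middle))
        If item = upperKey Then
            Let binarySearchAny = middle
            Exit Do
        ElseIf item > upperKey Then
            Let last = middle - 1              'Match, if any, is in the lower half
        Else
            Let first = middle + 1             'Match, if any, is in the upper half
        End If
    Loop
End Function
```

With this version the caller tests the result against the array's lower bound rather than against zero, for example: If index >= LBound(city) Then … . The trade-off is that zero no longer means "not found", so existing calling code written for binarySearch would need to change its test.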