CM10135 / Programming II:   Lecture 2


Example Algorithms:

Sorting

I.  Housekeeping:

  1. From last time:
    1. Every algorithm can be run on a Turing Machine
    2. But this is all less exciting than it seems when you realize an algorithm is now defined as anything that can run on a Turing machine.
      1. Is there any other kind of computation?
      2. Is the human brain a Turing machine?
      3. These are open questions!  (See this page for a description of the evidence for Church-Turing.)
      4. But not the topic of this course.
      5. An open question is one still being actively researched.  Intelligent, well educated experts in a field hold opposing views with each other.
    3. Reasons to do the tutorial
      1. Only says its optional because not in course description.
      2. Understand programming better if get a feel for what the components are.
      3. Understand memory better if you can reverse it -- should be able to draw what you are doing.
      4. Understanding process & memory is central to understanding algorithms & complexity.
      5. Understanding algorithms & complexity is central to the exam!
  2. Text books on web page (not mandatory).
  3. All of the stuff from lecture & much, much more are available from the course web page!
    1. http://www.cs.bath.ac.uk/~jjb/here/CM10135/course-sched.html
    2. Links to pages with movies of sort algorithms, graphs, other people's explanations of the same material, etc.
  4. Anyone not assigned to a tutorial?
  5. Rough structure for the next few lectures:
    1. Sorting algorithms today.
    2. Intro to algorithms & complexity tomorrow.
    3. Next week we'll look at the complexity of the algorithms we're learning today.
    4. Today's algorithms are in psuedo code, next week's labs you'll get to write them in Java.
      1. pseudo code is a fake generic programming language used to communicate in papers & talks.
      2. Some high-level languages are designed to be like psuedo code, e.g. python.

II.  Sorting in General

  1. Sorting is a useful skill, because it makes searching faster.
  2. 50 years of research in artificial intelligence indicate that searching is most of the problem of intelligence.
    1. You have to find the right thing to do.
    2. But looking for it takes time.
    3. And the right thing to do right now can be quite different from the right thing to do right now.
    4. If things are sorted, then individual things are easier to find (though still not trivial.)
      1. Phone book.
      2. Google.
    5. More about searching late next week or early the week after.
  3. You are mostly learning about how to search so you are familiar with some example algorithms.
    1. Gives us examples for learning how to analyze and compare algorithms.
    2. Helps you practice / exercise your brain / think like a programmer.
    3. These sometimes crop up on Computer Science tests
      1. e.g. the American Graduate Record Exam (GRE)
      2. standard training for CS majors since the days before there were easily accessible libraries of functions.
  4. Items are sorted on the basis of a key.
    1. Key is an attribute or set of attributes about a class of objects that can be used to create a useful ordering.
    2. Set of attributes may be necessary if no one attribute creates a sufficiently predictable ordering.
    3. Examples:
      1. Dictionary definitions
        1. Word they define is used as the key.
        2. Sorted alphabetically on the key.
      2. Students
        1. Student ID is most useful key, guaranteed unique.
        2. May be sorted by name, marks, age etc.
        3. Using non-unique keys can result in errors (e.g. misdelivery of post, pay).
      3. Google
        1. No obvious ordering for web pages.
        2. Creates indecies based on compression of content words.
        3. Also uses access information, links to pages, and other secret things.
    4. For the algorithms below, we'll just sort numbers
      1. But they'll work fine on more interesting objects...
      2. If you can figure out good keys for ordering them.

III.  Selection Sort

  1. Basic idea:
    1. You have an array of unsorted number.
    2. You have an array big enough to stick each thing in one slot.
    3. You put the smallest available number in the lowest available index.
  2. What do we need to do this?
    1. An integer to keep track of  our next available lowest index.  sortedIx
    2. Another integer to keep track of where we are while we're searching for the lowest number.  searchIx
    3. Another integer to keep track of what the current minimum value is.  min
    4. Another integer to keep track of where that value was. minLoc
  3. Initial psuedo code;
    array int messy, sorted;  // these have come from somewhere & must have equal length
    int sortedIx, searchIx, min, minLoc;
    // this also assumes the system defines a number BIGNUM that's bigger
    // than anything in messy.

    // outer loop over the sorted array -- when everything's sorted we're done
    for (sortedIx = 0; sortedIx < sorted.length; sortedIx++) {
    // initialize min to something bogus
    min = BIGNUM; minLoc = -1;
    // inner loop over messy array: find the smallest number in messy.
    for (searchIx = 0; searchIx < search.length; searchIx++) {
    // if it's the smallest yet seen.
    if (messy[searchIx] < min) {
    min = messy[searchIx];
    minLoc = searchIx;
    }
    } // for searchIx
    // now stick that small value in sorted
    sorted[sortedIx] = min;
    // and make sure it won't get found again
    messy[minLoc] = BIGNUM;
    } // for sortedIx
  4. This isn't actually the standard algorithm for insertion sort
    1. We can save space by doing it in just one array
    2. We can save time by not looking at the values we already sorted.
    3. Does anyone care if we save this much space or time?
      1. Probably not, but this is practice.
      2. Maybe, depending on how big the array is.
  5. Here's the standard one. The difference is that now we swap the smallest value with whatever was at the current point in the list:
    array int sorta;  // this comes from somewhere only one now!
    int sortedIx, searchIx, min, minLoc;
    int temp; // add a temporary location for swapping values.
    // don't need BIGNUM anymore...

    // outer loop over the sorted array -- when everything's sorted we're done
    for (sortedIx = 0; sortedIx < sorta.length; sortedIx++) {
    // now min is set to the thing in the slot we're currently trying
    // to fill.
    min = sorta[sortedIx]; minLoc = sortedIx;
    // inner loop over messy part of the array: find the smallest number
    for (searchIx = sortedIx; searchIx < search.length; searchIx++) {
    // if it's the smallest yet seen.
    if (sorta[searchIx] < min) {
    min = sorta[searchIx];
    minLoc = searchIx;
    }
    } // for searchIx
    // move the old value to temp
    temp = sorta[sortedIx];
    // now stick that small value in the right place
    sorta[sortedIx] = min;
    // finish the swap
    sorta[minLoc] = temp;
    } // for sortedIx

IV.  Insertion Sort

  1. Basic idea:
    1. You have an array of unsorted number.
    2. Split the array between sorted & unsorted parts (initially, take the first element as sorted).
    3. Stick the first number of the unsorted part of the array into the right place in the sorted part.
  2. What do we need to do this?
    1. An integer to keep track of sorted/unsorted boundary.  boundaryIx
    2. A place for the item we're currently trying to insert into the sorted bit.  currentItem
    3. Another index to keep track of where in the sorted bit we are. searchingIx
  3. Psuedo code;
    array int sorta;  // this comes from somewhere in a messy state
    int boundaryIx, searchingIx; // these are indecies
    int currentItem // this is the temporary storage for th item we're moving...
    // it's old home get's filled with stuff making space for it.

    // outer loop over the sorted array -- when everything's sorted we're done
    for (boundaryIx gets the values 1 through sorta.length) {
    // save new item
    currentItem = sorta[boundaryIx]
    // inner loop starts at boundary and works down
    for (searchingIx = boundaryIx - 1; (searchingIx >= 0) AND (sorta[searchingIx] < currentItem);
    searchIx--) {
    // haven't found the right place yet or wouldn't be in here, so
    // just move stuff.
    sorta[searchingIx+1] = sorta[searchingIx];
    } // for searchingIx
    // if we're out of the loop, so searchingIx is pointing just neg. of where we want to be
    sorta[searchingIx+1] = currentItem;
    } // for boundaryIx

V.  Quick Sort

  1. Basic idea:
    1. You have an array of unsorted number.
    2. Pick one number, the pivot point.
    3. Put all the numbers smaller than the pivot to its left, & all the larger numbers to its right.
    4. Call quick sort recursively on the numbers on each side of the pivot.
  2. What do we need to do this?
    1. A place for remembering the pivot value:  pivotValue
    2. And where it lives:  pivotIx
    3. This time we need a way to do that splitting thing --- the program is really in two parts.
    4. Algorithm & variables for splitting:
      1. Start one index, topIx from the top of the array, move it til it finds a number smaller than pivotValue.
      2. Start one index, botIx from the bottom of the array, move it til it finds a number larger than pivotValue.
      3. Swap these two values (requires a temp variable.)
      4. Terminate when the two Indecies cross, this is the new-found index location for the pivot value.
    5. Actually need some more variables to keep track of the ends when we do this recursively.
  3. Psuedo code main sort;

    QuickSort(int sorta[], first, last)

    int pivotIx, Split();

    // when first >= last, we're through -- we've sorted the smallest possible sized bin
    if (first < last) {
    // This is the real work, and it returns the location for the pivot point
    pivotIx = Split(sorta, first, last);

    // Now the recursive calls
    QuickSort(sorta, first, pivotIx-1);
    QuickSort(sorta, pivotIx+1, last);
    } // if first < last -- all done!

  4. Psuedo code split;

    // must have type integer since returns pivotIx
    int Split(int sorta[], first, last)

    int topIx, botIx; // indecies
    int
    pivotVal, temp; // items

    pivotVal = sorta[first]; // this is
    just chosen arbitrarily
    botIx = first+1, topIx = last; // start
    looking at the ends
    while (topIx > botIx) {
    // look for a missorted small number
    while (sorta[topIx] > pivotVal) {
    topIx--;
    }
    // now look for cross point OR missorted large number
    while ((topIx > botIx) AND (sorta[botIx] < pivotVal)) {
    botIx++
    }
    // if you now have two missorted numbers, switch them
    if (botIx < topIx) {
    temp = sorta[topIx];
    sorta[topIx] = sorta[botIx];
    sorta[botIx] = temp;
    }
    } // while topIx > botIx

    // now put the pivot in its correct location by swapping it with whatever is
    // under topIx (which will be <= the pivotVal, as top is now lower than bot
    temp = sorta[topIx];
    sorta[topIx] = pivotVal;
    sorta[first] = temp;
    return(topIx);


page author: Joanna Bryson
17 Feb 2005. Updated 10 April 2004, bug in Insertion Sort psuedo code, fixed by Chris