CM10135 / Programming II:   Assignment 1 (2005)

Using Hash Tables to Sort & Search

Assignment Date:  Tuesday,  1 March.  Due date:  Thursday, 17 March at 4:00pm


This assignment is the first of four you will have in this course.  Your mark in Programming II is composed of 40% coursework, 60% exam.  This assignment is worth 10% of your mark.

What to Sort & Search (and a note on plagiarism):

This assignment will build on some of the code you wrote for last week's tutorial.  In particular, you should reuse the objects that you created to be sorted, and also the code that generates arrays full of those objects.  If you have not completed last week's tutorial, you need to start off by completing instructions 3 & 4 of that tutorial.  If you have completed that tutorial but the code is in your partner's directory, either rewrite it or ask your partner to email it to you.  Note: we will not be marking the sorts you wrote last week, we will only be marking the basic class and it's hash methods.

An important note on plagiarism:  Plagiarism occurs any time that you use someone else's ideas without acknowledging them.  Plagiarism is much, much worse than asking for help.  If you get help on any part of your assignment you should acknowledge this in the comments.  This includes asking coursemates, tutors, or other staff, as well as reading things in books or on the Internet.  Say exactly where your help came from, and exactly how much of the code is your own.  In particular, in this assignment, you should acknowledge your partner from last week for helping you to build your initial object, or else explain why you didn't use the code you wrote in tutorial.  We will be looking for this comment!

What to do:

  1. Write an informal specification of your class.  It should include
    1. The key attribute you are using for your sort (e.g. age if you used last week's defaults).  Only this time, make sure the maximum key size is 30 (even if this means you have old children!). 
    2. The method for generating an array of different lengths of objects with random key values (this is also from last week).
    3. A class variable hashTable.
    4. A class method called hash which generates a hash-value for any  integer key.  
      1. Your hash function should be:  hash(key) = key mod 20.  Make sure hashTable is the appropriate size for this function!
      2. Be sure to implement some simple method for dealing with collisions!  Two possibilities were described in last week's lecture, I expect you will probably use the one described in the notes where that link points to.  
        1. Note: There's no reason to sort elements that are in a bucket together --- there should always be few enough that you can look through them in constant time.  Look at the analysis part of the lecture!
        2. Note: Because you are generating your key values randomly, you may have two objects with the same key value!  Feel free to ignore this if you want, you can get a good pass if you store all the instances but only ever retrieve the first one you stored.  If you want to move towards distinction, feel free to think of a more clever way to deal with this, such as employing a second key, or rejecting key values that already exist in the hash, etc.
    5. A class method called hashStore which puts an object into the hash table, using the hash function.
    6. A class method called hashFind which takes a key and returns an object or null if no such object is in the table.  
  2. Write the code for your new class.
  3. Test your class!
    1. You will be testing your function on arrays of size 12, 14, 16 & 18 of your class.
    2. For each array size, do the following
      1. Generate the array.
      2. Print out the key values & save them somewhere.
      3. Use hashStore on each object in the array (you probably want to write a method to do this for you!)
      4. For each number from 1-10, check to see:
        1. Does the function correctly say whether the number is in the array?
        2. How many objects did hashFind look at before it gave you the right answer?  (you will have to make hashFind report this number to you.)
NOTE:  As I said in lecture 7, the `correct' (neat) way to do this assignment would be with two classes -- one for the sorted objects, and one for generic hashes.  However, I am requiring in the coursework that you demonstrate an ability to use class variables and instance variables in the same class!  This  allows you to prove you understand the difference between class and instance variables.  If you would prefer to write it another way (and for good reason, see the notes above), please write about this in the analysis.  See particularly the description of distinction projects.  But still do it as described here.

New Note:  As will be discussed in class on 9 March, we have decided both two class & one class solutions to the hash part of this problem are acceptable provided that the hash table is still a class variable.

What to hand in:

  1. Your informal description of your classes & algorithms.
  2. A printout of your fully-commented code. 
  3. A short write-up of the results of your tests, including some analysis about how well the different-sized arrays worked with your hash function. 
  4. A disk containing an uncompiled version of your code.  Be certain the disk is not write protected!  The person marking your program will be compiling & running your program.
    1. Please make the program take a command line argument for the length of the list (# of things to be sorted) for the test cases (12...).   This will make it easier to test for the tutor/markers to run it.  Don't do elaborate exception testing for this, just check that you got a number.
    2. Since this is a late addition, you won't get marked down if you don't do it, but you will have a happier marker.
Remember, marks are not given just because a program compiles or runs.  It should run robustly, and it should be well commented & well documented as specified above.

How you will be marked:


The marks allocation for this assignment is as follows:
  1. Description / Specification 15% 
  2. Fully commented, working programs implementing your design 60% 
  3. Results and analysis 25 % 
Total 100%

Threshold for Pass (40%+)
  1. Brief discussion of requirements and program design.
  2. Simple program. Basic level of commenting. Code layout is mostly correct.  Code does the minimum possible functionality; much running and testing must be done by hand.
  3. Minimum of results, but results are largely correct.  Some conditions not tested.
  4. Little or no analysis.
Good Pass (~55%)
  1. Specification covers all aspects of problem. Good program design. Algorithms are given at an appropriate level of detail and are well explained. Clear description of how design will be implemented. 
  2. Program is a good implementation of design, or possibly better than the design. Layout of code is correct and clear. Code is generally concise but comprehensible, identifiers well named.
  3. Results are correct, clearly described, and include good analysis. 
  4. Some discussion of the program and the development process you undertook, ideas for improvement. 
Distinction (70%+)
  1. Specification covers all aspects of problem.  Detailed program design starting with a high level design and showing refinement into appropriate levels. Algorithms are given at an appropriate level of detail and are well explained. 
  2. A sophisticated program showing good use of Java, but not overly complex. Commenting is appropriate and gives sufficient information. Layout of code is correct and clear. 
  3. Results are clear and presented in an interesting way. Results are sufficient to give scope for excellent analysis. 
  4. Excellent analysis of results, possibly considering alternative implementations.  Shows a thoughtful discussion of the program and the development process you undertook. Detailed and thoughtful criticism of your work, and well-thought-out ideas for improvement.

page author: Joanna Bryson
checked by AMB for release, 28 February
Notes updated 8 March