CSci 8701: Final Examination (Take Home)

Guidelines

The objective of this problem is to give you a flavor of how to read, analyze, and report on a paper from a research viewpoint. Hence, the manner in which you write your report is almost as important as what you write in it. Some hints follow:

Problem

Consider the paper from the textbook: Fast Algorithms for Mining Association Rules. R. Agrawal et al.

Answer the following questions to demonstrate that you have understood the material presented in this paper:

  1. Consider the following sets of items and transactions:
    Items                      Transactions (i.e., subsets of items)
    I0: Required CS Courses    T0: Courses on a CS major student's transcript
    I1: CS Courses             T1: Courses on a student's transcript
    I2: CS Students            T2: Students in a course
    1. What is the typical size of transactions in T1? Provide an example of an association rule using I1 and T1. Explain the meaning of association rules using I1 and T1.
    2. What is the typical size of transactions in T2? Provide an example of an association rule using I2 and T2. Explain the meaning of association rules using I2 and T2.
  2. Compare and contrast the following pairs:
    1. association vs. statistical correlation
    2. confidence vs. support
    3. Apriori algorithm vs. AprioriTid algorithm
  3. Consider a brute-force algorithm to find association rules. Given N items, it first generates all possible subsets of items. The set of transactions is then scanned to compute the support for every subset. Subsets with support below the given threshold are discarded to produce the final set of association rules.
    Compare the computational complexity of this algorithm with that of the Apriori algorithm for finding association rules. When would you use the brute-force algorithm over the Apriori algorithm? Specifically comment on the three example pairs of items and transactions: (I0, T0), (I1, T1), and (I2, T2).
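
For reference, the brute-force procedure described in question 3 might be sketched as follows. This is only an illustrative sketch, not code from the paper: the function name `brute_force_frequent_itemsets`, the toy items "A", "B", "C", and the 0.5 threshold are all assumptions made for this example, and support is taken to be the fraction of transactions that contain a subset.

```python
from itertools import combinations


def brute_force_frequent_itemsets(items, transactions, min_support):
    """Return {itemset: support} for every subset meeting min_support.

    Hypothetical helper for illustration only. It enumerates all
    2^N - 1 non-empty subsets of the N items before looking at the data.
    """
    if not transactions:
        return {}

    ordered = sorted(items)
    # Step 1: generate every non-empty subset of the items.
    candidates = [
        frozenset(combo)
        for size in range(1, len(ordered) + 1)
        for combo in combinations(ordered, size)
    ]

    # Step 2: scan the transactions and count how many contain each subset.
    counts = dict.fromkeys(candidates, 0)
    for transaction in transactions:
        for candidate in candidates:
            if candidate <= transaction:  # subset test
                counts[candidate] += 1

    # Step 3: discard subsets whose support falls below the threshold.
    total = len(transactions)
    return {
        candidate: counts[candidate] / total
        for candidate in candidates
        if counts[candidate] / total >= min_support
    }


# Toy data (assumed for illustration): three items, four transactions.
transactions = [
    frozenset({"A", "B", "C"}),
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
    frozenset({"B", "C"}),
]
result = brute_force_frequent_itemsets({"A", "B", "C"}, transactions, 0.5)
```

Note that the candidate list is built without consulting the transactions at all, so its length depends only on N; this is the property to keep in mind when comparing against Apriori.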

Answer the following questions to demonstrate that you can analyze the paper towards carrying out research:

  1. What research contributions does this paper make?
  2. What assumptions have been made in this paper? Criticize each assumption with respect to its realism.
  3. Select an assumption you've identified and explain how it affects the proposed model and approach.
  4. Modify the selected assumption to make it more realistic (or you could do away with it entirely!).