Feedback on Project Draft (HW 4)

Feedback on Project Proposal P4, P5

  1. General Comments : The proposal (P5) is a union of the answers to questions P1, P2, P3 and P4. Thus you should include the list of readings and the summary of readings from the last homework in the proposal instead of creating new ones.
  2. Group 1 : You have chosen a timely topic of growing importance. However, the reading list is a bit short. The summary of readings is still missing even though you have created a fairly long list of examples of privacy violations related to location data. It may be useful to look at the arguments presented in the relevant court cases to identify the principles and guidelines that determine the privacy classes of location data. For example, home and office addresses are usually public information, but a hotel room number is usually not public. Pictures of homes listed in MLS are public information, but paparazzi may not be allowed to take pictures of a private backyard without permission. What principles differentiate these privacy classes?
    Consider adding a statement in your proposal to differentiate your proposed work from that of group 3. For example, one group may focus on policy (e.g. laws, ethics) and the other on implementation mechanisms (e.g. technologies).
    Consider reviewing papers on privacy frameworks from the following if your paper is proposing a framework for geo-data privacy: GIS and privacy article, UMN student data confidentiality policy, privacy links, privacy laws, LBS and privacy, health info. privacy, privacy frameworks (1, 2), general privacy workshop.
  3. Group 2 : Well-focussed proposal. May consider adding a paper or two on detecting outliers based on non-numeric (e.g. category values) attributes. Add citations to the papers/books describing WEKA. References need to provide more information, e.g. publishers, year, etc. Look at the references at the end of the published papers in our reading list for guidance.
  4. Group 3 : You have chosen a timely topic of growing importance. However, the reading list in the proposal focusses primarily on the location based services (LBS) offered by different vendors in various countries. Consider including in-vehicle navigation devices (e.g. GM Onstar, and similar 3rd party devices) as well. Are you planning to examine the privacy statements of these vendors and compare them? If many vendors do not provide specific privacy statements about customers' location information, you may consider adding other papers to the reading list. Relevant papers may do some of the following: (i) classify different kinds of location information with respect to privacy, (ii) discuss legal precedents and guidelines for determining the privacy classes for different kinds of personal information, (iii) describe court rulings related to location privacy, e.g. look at the court cases involving use of aerial imagery in Baltimore and California.
    Incidentally, I like the reading list in your answers to P2 and P3 better. Why are those readings not included in your proposal (i.e. the answers to P4 and P5)?
    Consider adding a statement in your proposal to differentiate your proposed work from that of group 1.
    Consider reviewing papers on privacy frameworks from the following if your paper is proposing a framework for geo-data privacy: GIS and privacy article, UMN student data confidentiality policy, privacy links, privacy laws, LBS and privacy, health info. privacy, privacy frameworks (1, 2), general privacy workshop.
  5. Group 4 : Detailed feedback provided in meeting with the group. Proposal is well-focussed.
  6. Group 5 : The summary of readings is still not in the desired format. It summarizes each paper individually. Try rewriting the summary of readings using a common classification scheme as discussed in the feedback for problems P2 and P3. In fact, the classification scheme of the papers in the related literature is the key contribution of survey papers and it is crucial for your group to develop a classification scheme. The proposal and report should be written in the 3rd person. The current draft has many sentences in the first person. Do include the problem statement, summary of readings, etc. from previous homeworks in your proposal.
  7. Group 6 : Proposal is reasonable. Detailed feedback provided in face to face meetings.
  8. Group 7 : It will be useful to provide a crisp problem definition as discussed in the office hours on Tuesday March 30th. The objective function, e.g. time to produce top K tuples in the join result, may be used to compare the proposed work with the related work. Ideally, a dominance zone of the proposed approach in terms of selected parameters, e.g. K, may be provided using analytical and/or experimental means. Incidentally, the proposal has references to other spatial join algorithms including tree-matching, space partitioning, plane-sweep, etc. but does not include text to summarize those. Comparison of the proposed approach with the classical spatial-join algorithm is needed.
  9. Group 8 : Articulation of innovation relative to related work needs more work. Define the "decision type problem" and "optimizing some type of cost metric" to bring out the novelty of the proposed approach better.
    Note that formal comparison of algorithms for a given problem often requires identification of an "objective function", which tends to be some kind of "cost metric", solution quality, computational response time, throughput (transactions per second), etc.
  10. Group 9 : Well-written proposal. Consider reviewing the work of Carla Brodley (Purdue) on use of decision trees for classifying remotely sensed images.
  11. Group 10 : Well-focussed proposal. Consider examining a simulation based approach to analyzing probabilistic failure scenarios for routes. For example, one may assume a time-based probability distribution (e.g. Poisson) of edge failure events for the edges in the roadmap. Given alternate routes, one may carry out Monte Carlo simulations to generate the failure events during the traversal of the routes, compute alternative paths and travel time after the failure events, and assess the expected travel time via each route. One may limit the failure scenarios to a single failure event during the traversal of each path, or to a fixed failure rate per unit time of path traversal. This simulation approach has been used to evaluate the reliability of computer systems and the performance of servers, e.g. operating systems, network protocols. This approach is also used to determine the behaviour of transportation networks under heavy demand in software packages such as DYNASMART and DynaMIT.
  12. Group 11 : Excellent categorization of related work. Well-written proposal. It may be useful to talk about the process of identifying the frequent queries on NHGIS. Will you be interviewing users? Are there different categories of users? Are there other ways of identifying frequent queries?
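The simulation approach suggested to Group 10 above may be sketched roughly as follows. This is only an illustration and is not taken from any group's proposal. It assumes a small hypothetical roadmap (ROADMAP), a single Poisson failure rate shared by all edges, at most one failure event per trip, and that a failure is discovered when the traveler reaches the start of the failed edge and then follows the shortest remaining path. All names are hypothetical.

```python
import heapq
import math
import random

# Hypothetical directed roadmap: ROADMAP[u][v] = travel time on edge (u, v).
ROADMAP = {
    "A": {"B": 4.0, "C": 5.0},
    "B": {"D": 4.0, "C": 1.0},
    "C": {"D": 5.0, "B": 1.0},
    "D": {},
}


def shortest_time(graph, source, target, blocked=None):
    """Dijkstra travel time from source to target, skipping the blocked edge."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, t in graph.get(u, {}).items():
            if blocked is not None and (u, v) == blocked:
                continue
            nd = d + t
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf  # target unreachable


def simulate_trip(graph, route, rate, rng):
    """One trial: follow the route; on the first failed edge, reroute once."""
    target = route[-1]
    elapsed = 0.0
    for u, v in zip(route, route[1:]):
        t = graph[u][v]
        # Poisson failures at `rate` per unit time:
        # P(at least one failure in a window of length t) = 1 - exp(-rate * t).
        if rng.random() < 1.0 - math.exp(-rate * t):
            # Assumption: single failure per trip, detected at node u.
            return elapsed + shortest_time(graph, u, target, blocked=(u, v))
        elapsed += t
    return elapsed


def expected_travel_time(graph, route, rate, trials=10000, seed=0):
    """Monte Carlo estimate of the expected travel time along a route."""
    rng = random.Random(seed)
    total = sum(simulate_trip(graph, route, rate, rng) for _ in range(trials))
    return total / trials
```

For example, expected_travel_time(ROADMAP, ["A", "B", "D"], rate=0.05) and the same call for ["A", "C", "D"] may be compared to choose the route with the lower estimated travel time. With rate=0 each estimate reduces to the nominal travel time of the route.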

Feedback on Projects P2, P3

  1. General Comment 1 : Please write a single narrative summarizing all the papers instead of writing one paragraph per paper. Try to design a common classification scheme or framework to explain the work from all the papers.
  2. General Comment 2 : Some project topics seem shared by multiple groups. It is desirable for the groups with common topics to coordinate their efforts.
  3. Group G1: Feedback provided in office hours. Please try to include the court cases involving the city of Baltimore using aerial imagery for revising property tax. Another interesting case relates to a Hollywood personality and aerial imagery. Consider examining the following papers relating to health information and HIPAA: (a) M. P. Armstrong et al, Geographically Masking Health Data to Preserve Confidentiality, Statistics in Medicine, 18, pp. 497-525.
  4. Group G2: The summary and list of readings on spatial outliers are reasonable.
  5. Group G3: The summary of readings is not in the desired form. See the general comment. The list of readings is quite large and it may be possible to shorten it after defining the focus of the project.
  6. Group G4: Please post pdf documents in place of postscript. Nice categorization of indexing methods for spatio-temporal data. The list of references is reasonable. It may be useful to look at the chapter on spatio-temporal indexing in the chorochronous workshop proceedings in our reading list.
  7. Group G5: See general comment. In addition, the summary of the reading list does not summarize all the papers listed in the reading list. Have you searched DBLP and CiteSeer?
  8. Group G6: The list of readings is a bit broad. Consider focussing on outlier detection techniques and their application in identifying anomalies in maps.
  9. Group G7: Cannot locate either a list of readings or a summary of the readings. Good summary of related work. However, the list of readings is a bit short. It may help to examine other spatial join algorithms including tree-matching, space partitioning, plane-sweep, etc. and evaluate those for the problem at hand.
  10. Group G8: List of readings and the summary look reasonable. Text summarizing the related work is in the right form and has addressed the general comment.
  11. Group G9: Cannot locate either a list of readings or a summary of the readings. Feedback provided in the office hours. Please expand the reading list to include papers (e.g. those by C. Brodley) on use of decision trees for classifying remote sensing imagery.
  12. Group G10: See general comment. The current literature survey is a list of summaries of individual papers. Please rewrite it to create a classification structure that describes the entire literature in a paragraph or two.
  13. Group G11: See general comment. Nice categorization of logical data models for spatio-temporal data.

Feedback on Web-pages

General Comments :

Feedback on Paper Analysis Slides

General Comments : A few comments apply to multiple papers. These comments are summarized below:

  1. A. A cover slide should include a citation (e.g. title, author, publication forum, year, etc.) to the paper you are presenting. Cover slide should also include information about the team, including the names of members, group's webpage url, etc. Motivation slides should either list application domains or long-standing open problems in the field of spatial databases.
  2. B. Use phrases instead of complete sentences. Each phrase should be 6 to 8 words. You may omit articles and reword phrases to reduce the number of words. Remember the slides are providing highlights and you would be able to fill in the details in the oral presentation.
  3. C. Limit the amount of text on each slide to 6 to 8 phrases. Too much text forces the audience to read the text and prevents them from listening to you. Consider covering parts of the slides with too much text during presentation.
  4. D. Put a slide on "outline" listing the major groups of slides. Know the number of slides in each group. Time your presentation to know the amount of time you spend in presenting the slides in each group. This will allow you to complete the presentation on time.
  5. E. Problem statement slide should identify four things: inputs (what information is given), outputs (what solution is to be found), objectives (how does one judge the quality of a solution), and constraints (which assumptions can be made on inputs and environment). The problem definition should be stated in a way that allows solutions other than what the authors have proposed. This is fundamental to understanding and critiquing any paper without being biased by the assumptions/biases of the authors. Many research papers will have a section on problem statement identifying these components. Others will list assumptions in scope, experiment design, proofs of theorems and lemmas, proposed algorithms/methods etc. If this information is missing, you should try to come up with best guesses for these components.
  6. F. Key concepts should be defined and explained. Consider using examples and figures in explaining the key concepts. Use examples from the paper. In addition you may add new examples. You may refer to figures in the papers by providing figure numbers and page numbers. Figures should be simple with few (e.g. half a dozen) components. Consider using 3 to 5 slides on key concepts since the audience should understand these well to appreciate the rest of your analysis.
  7. G. A validation methodology allows researchers to back up their claims in an objective manner. It allows readers to reproduce the results and conclusions of the authors. This is a fundamental requirement for science and scientific methods. Common methodologies used in computer science research include the following:
  8. Assumptions, Rewrite Today: It is good to add these two slides to distinguish between the conclusions of the authors and your conclusions.

Comments on individual groups