Seoul National University
M2177.003000 Advanced Data Mining
Fall 2019 - U Kang


1. Preliminaries

There are three graded parts to the project:

  1. Phase 1: the project proposal (10%),
  2. Phase 2: the progress report (10%), and
  3. Phase 3: the final report and presentation (80%).

The proposal is a short writeup describing what you plan to do and how you plan to do it. The progress report is a more extensive writeup describing the work performed so far and the revised plans for the whole project. It mainly serves as a checkpoint, to detect and prevent dead ends and other problems early on. The final report is a more detailed description of what you did, what results you obtained, and what you have learned or can conclude from your work.

The work will be carried out in teams of 2-4 people. Smaller or larger groups will be allowed only under special circumstances.

2. Choosing a Topic

Once you have selected a topic, you should do some background reading so that you are capable of describing, in some detail, what you expect to accomplish. For example, if you decide to implement a new algorithm for social network analysis, you will have to carefully read the papers that propose similar algorithms, pinpoint their weaknesses, and explain how your approach will address these weaknesses. Once you have read up on your topic, you will be ready to write your proposal.

3. First Deliverable: Phase 1 - The Proposal

The proposal should describe what you plan to do for your project. It should describe the problem that you will be addressing, how you plan to address it, what tools (e.g., "yacc", Postgres, Hadoop) you will need for your work, what you expect to produce as a result of your work, and anything else that you think the instructor should know to evaluate your plans. You should also describe what portion of the project each partner will be doing.

Your proposal should be approximately 6-8 pages long, typed (e.g., LaTeX, PDF, or MS Word), double-spaced, neat, and with pictures if they seem useful. The proposal should also be self-contained. For example, don't just say: "We plan to implement Smith's Foo-Tree data structure [Smith86], and we will study its performance." Instead, you should briefly review the key ideas in the references, and describe clearly the alternatives that you will be examining.

Important points - check-list:

4. Second Deliverable: Phase 2 - The Progress Report

This should be a report of 10-15 pages, and it serves as a checkpoint. It should consist of the same sections as your final report (introduction, survey, etc.), with a few sections "under construction". Specifically, the introduction and survey sections should be in their final form; the section on the proposed method should be almost finished; the sections on the experiments and conclusions should contain whatever results you have obtained, as well as placeholders for the results you plan or hope to obtain.

Grading scheme for the project report:

Again: Keep the graded Phase 2 report, and attach it to your Phase 3 submission.

5. Third Deliverable: Phase 3 - The Final Report and Presentation

The grade of the final phase of the project will have the following components:

  1. writeup: here you describe the novelties of your approach and your discoveries, insights, and experiments. Your final report is expected to be 20-30 pages long and to treat the agreed topic in depth.
  2. software: packaging, documentation, and portability. The goal is to provide enough material, so that other people can use it and continue your work.
  3. presentation

5.1. Grading Scheme for Final Report and Presentation

5.2. Specifications for packaging of software:

Please create a gzipped tar file (one that can be unpacked with gunzip followed by tar xvf). Check-list:
  1. after unpacking, the command 'make' should compile your system, install it if necessary, and run a small demo on a sample input file (included in your package)
  2. it should have a README file, corresponding to the "user's manual": this file should describe the package in a few paragraphs, as well as how to install it and how to use it.
  3. it should have a directory DOC, with your writeup and your slides (in your favorite form: LaTeX, PDF, PowerPoint, MS Word)
  4. 'make paper.pdf' should create the corresponding version of your writeup (skip this step if you use MS Word)
  5. 'make clean' should eliminate all the derived files (*.o, *.class, *.aux, etc.)
  6. 'make all.tar' (or 'make') should create a tar/zip file, ready for distribution.
  7. please make sure that your package includes only the absolutely necessary set of files!
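
Before creating the tar file, it may help to check the package against the list above automatically. The following is a minimal sketch in Python (3.9 or later), not an official grading tool: the function names check_package and makefile_targets are hypothetical, the build file is assumed to be named exactly "Makefile", and target detection is a simple line-based heuristic. It only inspects the file layout; it does not verify that 'make' actually compiles the system or runs the demo.

```python
import re
from pathlib import Path

# Targets named in the check-list. 'paper.pdf' is left out because the
# check-list allows skipping it for MS Word writeups.
REQUIRED_TARGETS = ("clean", "all.tar")

# Derived-file suffixes mentioned in the check-list (*.o, *.class, *.aux).
DERIVED_SUFFIXES = {".o", ".class", ".aux"}


def makefile_targets(makefile: Path) -> set[str]:
    """Return single-name targets declared at the start of a line.

    Heuristic only: it ignores rules with several targets on one line
    and pattern rules, and it skips ':=' variable assignments.
    """
    targets = set()
    for line in makefile.read_text().splitlines():
        match = re.match(r"^([A-Za-z0-9_.\-]+)\s*:(?!=)", line)
        if match:
            targets.add(match.group(1))
    return targets


def check_package(root: Path) -> list[str]:
    """Return a list of check-list problems found under root (empty if none)."""
    problems = []
    if not (root / "README").is_file():
        problems.append("missing README")
    if not (root / "DOC").is_dir():
        problems.append("missing DOC directory")

    makefile = root / "Makefile"
    if not makefile.is_file():
        problems.append("missing Makefile")
    else:
        declared = makefile_targets(makefile)
        for target in REQUIRED_TARGETS:
            if target not in declared:
                problems.append(f"Makefile has no '{target}' target")

    # Derived files should have been removed by 'make clean' before packaging.
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in DERIVED_SUFFIXES:
            problems.append(f"derived file in package: {path.relative_to(root)}")
    return problems
```

Calling check_package(Path(".")) from the package root returns an empty list when the layout matches the items covered here, and otherwise one message per problem found.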

6. Due Dates

Last modified Aug. 4, 2019, by U Kang.