Seoul National University
M2177.003000 Advanced Data Mining
Fall 2019 - U Kang
There are three graded parts to the project:
- Phase 1: the project proposal (10%),
- Phase 2: the progress report (10%), and
- Phase 3: the final report and presentation (80%).
The proposal will be a short writeup describing what you plan to do
and how you plan to do it. The progress report will be a more
extensive writeup, describing the work performed up to then, and
the revised plans for the whole project. It mainly serves as a
`checkpoint', to detect and prevent dead-ends and other problems
early on. The final report will be a more detailed description of what
you did, what results you obtained, and what you have learned
and/or can conclude from your work. The work will be carried out in
teams of 2-4 persons. Smaller or larger groups will only be allowed
under special circumstances.
2. Choosing a Topic
- Option 1: Start from the list of suggested projects in eTL, or from
other recommendations.
- Option 2: You and your project partners are free to pick a
topic to suit your interests. You will need to justify that the topic is interesting,
relevant to the course, and of suitable difficulty.
- Other ways:
  - Projects related to your dissertation/master's project are also
  possible, as long as there is no 'double-dipping', i.e., you clearly
  specify what the project will do in addition to what you were planning to do for your thesis anyway.
  - If you cannot find your own topic, discuss with the instructor.
Once you have selected a topic, you should do some background
reading so that you are capable of describing, in some detail, what
you expect to accomplish. For example, if you decide that you want
to implement a new algorithm for social network analysis,
you will have to carefully read the papers that propose
similar algorithms, pinpoint their weaknesses, and explain how your
approach will address these weaknesses. Once you have read up on
your topic, you will be ready to write your proposal.
3. First Deliverable: Phase 1 -
The Proposal
The proposal should describe what you plan to do for your
project. It should describe the problem that you will be
addressing, how you plan to address it, what tools (e.g., "yacc",
Postgres, Hadoop, etc.) you will need for your work, what you
expect to produce as a result of your work, and anything else that
you think the instructor should know to evaluate your plans. You
should also describe what portion of the project each partner will
do.
Your proposal should be approximately 6-8 pages long,
typed (e.g., LaTeX, PDF, or MS Word), double-spaced, neat,
and with pictures if they seem useful. Also, the proposal should be self-contained. For example,
don't just say: "We plan to implement Smith's Foo-Tree data
structure [Smith86], and we will study its performance." Instead,
you should briefly review the key ideas in the references, and
describe clearly the alternatives that you will be examining.
Important points - check-list:
- Grading scheme: 60% for the survey; 30% for innovation; 10% for
plan of activities
- Please provide a plan of activities and time estimates, per
group member.
- Attribution: list which group member did (or will do) what
- Your survey should have at least 3 papers or book
chapters per group member (outside of the reading list).
- Short papers, like PNAS, Nature, Science papers, count as
- Copying the abstract of the papers is obviously prohibited,
- For each paper, describe
  - (a) the main idea,
  - (b) why (or why not) it will be useful for your project, and
  - (c) its potential shortcomings, that you will try to improve upon.
- Clear problem definition: give a precise problem definition.
- Keep the graded Phase 1 report, and attach it to your phase 2 and phase 3
submissions.
- An optional LaTeX template is provided in eTL (you are strongly recommended to follow its outline).
4. Second Deliverable: Phase 2 -
The Progress Report
This should be a 10-15 page long report, and it serves as a
check-point. It should consist of the same sections as your final
report (introduction, survey, etc), with a few sections `under
construction'. Specifically, the introduction and survey sections
should be in their final form; the section on the proposed method
should be almost finished; the sections on the experiments and
conclusions will have whatever results you have obtained, as well
as `place-holders' for the results you plan/hope to obtain.
Grading scheme for the progress report:
- 70% for proposed method (should be almost finished)
- 25% for the design of upcoming experiments
- 5% for plan of activities (in an appendix, please show the old
one and the revised one, along with the activities of each group
member)
- Attach your graded phase 1 report
- Clear list of innovations: give a list of the best 2-4 ideas that your approach exhibits.
Again: Keep the graded Phase 2 report, and attach it to your phase 3 submission.
5. Third deliverable: Phase 3 -
The Final Report and Presentation
The grade of the final phase of the project will have the
following components:
- writeup: there, you would describe the novelties of your
approach and your discoveries/insights/experiments. Your final
report is expected to be 20-30 pages long, treating in
depth the agreed topic.
- software: packaging, documentation, and portability. The
goal is to provide enough material, so that other people can use it
and continue your work.
5.1. Grading Scheme for Final Report and Presentation
- Writeup [80%]
  - [2%] Introduction - Motivation
  - [3%] Problem definition
  - [5%] Survey
  - Proposed method
    - [10%] Intuition - why should it be better than the state of the art?
    - [30%] Description of its algorithms
  - [5%] Description of your testbed; list of questions your experiments are designed to answer
  - [20%] Details of the experiments; observations (as many as you can!)
  - [5%] Conclusions
- Software (testing, packaging and documentation) [10%]
- Poster presentation [10%]
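As a rough illustration of how the percentages above combine, the following Python sketch computes a weighted score. It assumes that each component is marked as a fraction between 0 and 1 and that components are combined linearly; the guidelines do not state how marks are actually aggregated, and the component names, the function `weighted_score`, and the sample marks are all hypothetical.

```python
# Phase weights (percent of the project grade), taken from the guidelines.
PHASE_WEIGHTS = {"proposal": 10, "progress_report": 10, "final": 80}

# Breakdown of the final phase (percent of the Phase 3 grade), from Section 5.1.
# The key names are made up for this sketch.
FINAL_PHASE_WEIGHTS = {
    "introduction": 2,
    "problem_definition": 3,
    "survey": 5,
    "method_intuition": 10,
    "method_algorithms": 30,
    "testbed_and_questions": 5,
    "experiments": 20,
    "conclusions": 5,
    "software": 10,
    "poster": 10,
}


def weighted_score(weights, earned):
    """Combine per-component fractions (0..1) into percentage points.

    Assumption: a simple linear combination; the real grading may differ.
    """
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100")
    missing = set(weights) - set(earned)
    if missing:
        raise ValueError(f"no mark given for: {sorted(missing)}")
    return sum(weights[name] * earned[name] for name in weights)


# Hypothetical marks: full credit everywhere except the experiments.
earned_final = {name: 1.0 for name in FINAL_PHASE_WEIGHTS}
earned_final["experiments"] = 0.5
final_points = weighted_score(FINAL_PHASE_WEIGHTS, earned_final)

# Hypothetical marks for the first two phases.
overall_points = weighted_score(
    PHASE_WEIGHTS,
    {"proposal": 0.9, "progress_report": 0.8, "final": final_points / 100},
)
print(final_points, overall_points)
```

Because the final phase carries 80% of the project grade, a shortfall in a heavily weighted Phase 3 component (such as the description of the algorithms) affects the overall score far more than the same shortfall in the proposal.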
5.2. Specifications for packaging of software:
Please create a gzipped tar-file (to be unpacked with gunzip; tar xvf).
- after un-tar-ing, the command 'make' should
compile your system, install it if necessary and run a small demo
on a sample input file (included in your package)
- it should have a README file, corresponding to the
`user's manual': This file should describe the package in a
few paragraphs, as well as how to install it and how to use
it.
- it should have a directory DOC, with your writeup, and
your slides (in your favorite form: latex, pdf, powerpoint,
etc.)
- 'make paper.pdf' should create the
corresponding version of your writeup (skip this step, if you use
- `make clean' should eliminate all the
derived files (*.o, *.class, *.aux, etc)
- `make all.tar' (or 'make all.zip') should
create a tar/zip-file, ready for distribution.
- please make sure that your package includes only the
absolutely necessary set of files!
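
Before submitting, the packaging requirements above can be partly checked automatically. The following Python sketch is one possible pre-submission check, not an official grading tool: the function names (`makefile_targets`, `check_package`, `check_tarball`) are hypothetical, the Makefile is scanned with a rough regular expression rather than by running `make`, and a README with an extension (e.g. README.md) is assumed to be acceptable.

```python
import re
import tarfile
from pathlib import PurePosixPath

MAKEFILE_NAMES = {"Makefile", "makefile", "GNUmakefile"}


def makefile_targets(makefile_text):
    """Collect explicit target names from rule lines such as 'clean:'.

    Heuristic only: recipe lines (starting with a tab), comments, pattern
    rules and ':=' assignments are ignored.
    """
    targets = set()
    for line in makefile_text.splitlines():
        match = re.match(r"([A-Za-z0-9_.\- ]+?)\s*:(?!=)", line)
        if match:
            targets.update(match.group(1).split())
    return targets


def check_package(member_names, makefile_text):
    """Return a list of problems; an empty list means all checks passed."""
    paths = [PurePosixPath(name) for name in member_names]
    problems = []
    if not any(p.name == "README" or p.name.startswith("README.") for p in paths):
        problems.append("missing README file")
    if not any("DOC" in p.parts for p in paths):
        problems.append("missing DOC directory")
    if not any(p.name in MAKEFILE_NAMES for p in paths):
        problems.append("missing Makefile")
    targets = makefile_targets(makefile_text)
    if "clean" not in targets:
        problems.append("Makefile has no 'clean' target")
    if not targets & {"all.tar", "all.zip"}:
        problems.append("Makefile has no 'all.tar' or 'all.zip' target")
    return problems


def check_tarball(path):
    """Run check_package on a (possibly gzipped) tar file on disk."""
    with tarfile.open(path, "r:*") as archive:
        names = archive.getnames()
        makefile_text = ""
        for name in names:
            if PurePosixPath(name).name in MAKEFILE_NAMES:
                extracted = archive.extractfile(name)
                if extracted is not None:
                    makefile_text = extracted.read().decode("utf-8", errors="replace")
                break
    return check_package(names, makefile_text)
```

Calling `check_tarball` on the path of your own archive returns a list of problems, with an empty list meaning that these particular checks passed. The sketch does not verify that 'make' actually compiles the system or runs the demo, nor that the package contains only the necessary files; those still need to be tested by hand on a clean machine.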
6. Due Dates
- Proposal: Oct. 2
- Progress Report: Nov. 4
- Final Report: Dec. 2
Last modified Aug. 4, 2019, by U Kang.