Final Project | COMP 150DL

These project guidelines originally accompanied the Stanford CS class CS231n, and are now provided here with minor changes reflecting our course contents. Many thanks to Fei-Fei Li and Andrej Karpathy for graciously letting us use their course materials!

Final Project Demo Schedule

12:00 - 12:20 pm: Amit Patel - Deep Reinforcement Learning for Dota 2
12:20 - 12:40 pm: Dylan, Nate, and Sam B. - Network Architectures for Multi-Label Video Tagging
12:40 - 1:00 pm: Team Mango (Beibei, Cole, Hogyan) - RNN for 2.5D CNN
1:00 - 1:20 pm: Jason, Lisa, and Sam - Sketch to Photo Translation
1:20 - 1:40 pm: Jason, Jay, and Alex - Lung Cancer Detection
1:40 - 2:00 pm: Justin, Jon, and Ben - Deep 3D+
2:00 - 2:20 pm: Jie, Jorge, and Chris
2:20 - 2:40 pm: Hossein and Xinmeng
2:40 - 3:00 pm: Takuto

Important Dates

Course project proposal: March 2 (due 11:59pm, email to comp150dl@gmail.com).
Course project milestone: April 4 (due 11:59pm, email to comp150dl@gmail.com).
Final project presentations will be held from 12pm-3pm on Friday May 12 at Microsoft New England Research and Development (NERD), 1 Memorial Drive 1st Floor, Cambridge, MA. Adams/Attucks Conference Room.
Final project write-up: Friday May 12 (due 11:59pm, email to comp150dl@gmail.com).

Groups

These are the suggested teams for the semester project. If scheduling or other significant concerns arise, please email comp150dl@gmail.com. Teams will receive an email from the instructor with recommended project topics and research references.

Team 1: Jason Krone, Lisa Fan, Sam Woolf
Team 2: Alex Tong, Jay DeStories, Jason Fan
Team 3: Beibei Du, Hogyan Wang, Cole Springate-Combs
Team 4: Nathan Watts, Dylan Cashman, Sam Bruck
Team 5: Ben Papp, Justin Lee, Jonathan Hohrath
Team 6: Mohammad Hossein Chaghazardi, Xinmeng Li, Sambit Pradhan
Team 7: Jie Li, Jorge Sendino Lopez de la Reina, Chris Mattoli
Team 8: Amit Patel
Team 9: Takato Sato

Overview

The Final Project is an opportunity for you to apply what you have learned in class to a problem of your interest. You are encouraged to select a topic with your group and work on your own project. Potential projects usually fall into these two tracks:

Applications. If you're coming to the class with a specific background and interests (e.g. biology, engineering, physics), we'd love to see you apply deep neural networks to problems related to your particular domain of interest. Pick a real-world problem and apply deep neural networks to solve it.
Models. You can build a new model (algorithm) with deep neural networks, or a new variant of existing models, and apply it to tackle vision tasks. This track might be more challenging, and sometimes leads to a piece of publishable work.

Here you can find some sample project ideas:

Any well known vision challenge such as COCO Detection or Keypoint Estimation, Visual Question Answering, or a Kaggle competition
Describe images with Natural Language (COCO Attributes or Visual Genome Dataset)
Yelp Dataset Challenge: describe restaurants from their Yelp images
Dating artifacts with deep learning (use dataset collected from Spring 2016 version of this class)
Deep Humor: generate funny captions for images (dataset collection required, contact instructor)
Breast Cancer Prediction from Mammograms (contact instructor for dataset)
Breast Cancer Prediction from Mammograms: Multiple Instance Learning (contact instructor for dataset)
Rules for Social Media: characterize social network behavior of trans teens on tumblr (contact instructor for dataset)
Geolocation: predict a photo's GPS coordinates (several datasets available)
Segment reflective surfaces with Light Field Data
Recyclables recognition -- build your own dataset for this!
Butterfly species recognition (ask instructor who to contact for this data)
High fashion understanding using repository of New York Times fashion week images from last 20 years (instructor has this dataset)
Alternate update scheme for faster training of Adversarial Nets (talk to instructor for more details)
New loss function that optimizes for better generalization of network solution over lowest possible loss (talk to instructor for more details)
Sample project ideas from UMass Amherst (Google Docs)

To inspire ideas, you might look at the papers read or recommended in this class, recent deep learning publications from top-tier vision conferences, as well as other resources below.

Awesome Deep Vision
CVPR: IEEE Conference on Computer Vision and Pattern Recognition
ICCV: International Conference on Computer Vision
ECCV: European Conference on Computer Vision
NIPS: Neural Information Processing Systems
ICLR: International Conference on Learning Representations
Kaggle challenges: An online machine learning competition website. For example, a Yelp classification challenge.

For applications, this type of project would involve careful data preparation, an appropriate loss function, details of training and cross-validation, and good test set evaluations and model comparisons. Don't be afraid to be creative. Some successful examples can be found below:

Deep neural networks also run in real time on mobile phones and Raspberry Pis, so feel free to go the embedded way. You may find this TensorFlow demo on Android helpful.

For models, deep neural networks have been successfully used in a variety of computer vision and NLP tasks. This type of project would involve understanding the state-of-the-art vision or NLP models, and building new models or improving existing models. The list below presents some papers on recent advances of deep neural networks in the computer vision community.

Object recognition: [Krizhevsky et al.], [Russakovsky et al.], [Szegedy et al.], [Simonyan et al.], [He et al.]
Object detection: [Girshick et al.], [Sermanet et al.], [Erhan et al.]
Image segmentation: [Long et al.]
Video classification: [Karpathy et al.], [Simonyan and Zisserman]
Scene classification: [Zhou et al.]
Face recognition: [Taigman et al.]
Depth estimation: [Eigen et al.]
Image-to-sentence generation: [Karpathy and Fei-Fei], [Donahue et al.], [Vinyals et al.]
Visualization and optimization: [Szegedy et al.], [Nguyen et al.], [Zeiler and Fergus], [Goodfellow et al.], [Schaul et al.]

We also provide a list of popular computer vision datasets:

Meta Pointer: A large collection organized by CV Datasets.
Yet another Meta pointer
ImageNet: a large-scale image dataset for visual recognition organized by WordNet hierarchy
SUN Database: a benchmark for scene recognition and object detection with annotated scene categories and segmented objects
Places Database: a scene-centric database with 205 scene categories and 2.5 million labeled images
NYU Depth Dataset v2: an RGB-D dataset of segmented indoor scenes
Microsoft COCO: a new benchmark for image recognition, segmentation and captioning
Flickr100M: 100 million creative commons Flickr images
Labeled Faces in the Wild: a dataset of 13,000 labeled face photographs
Human Pose Dataset: a benchmark for articulated human pose estimation
YouTube Faces DB: a face video dataset for unconstrained face recognition in videos
UCF101: an action recognition data set of realistic action videos with 101 action categories
HMDB-51: a large human motion dataset of 51 action classes

Grading Policy

  Final Project: 40%
  milestone: 5%
  write-up: 10%
   •  clarity, structure, language, references: 3%
   •  background literature survey, good understanding of the problem: 3%
   •  good insights and discussions of methodology, analysis, results, etc.: 4%
  technical: 12%
   •  correctness: 4%
   •  depth: 4%
   •  innovation: 4%
  evaluation and results: 10%
   •  sound evaluation metric: 3%
   •  thoroughness in analysis and experimentation: 3%
   •  results and performance: 4%
  presentation: 3% (+2% bonus for best few presentations)
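As a quick sanity check on the breakdown above, the following minimal Python sketch re-adds the listed percentages. The dictionary layout and the names `RUBRIC` and `component_totals` are illustrative assumptions, not part of any official grading tool, and the optional presentation bonus is left out.

```python
# Percentages copied from the grading policy above (points of the course grade).
# The structure and names here are illustrative only.
RUBRIC = {
    "milestone": {"milestone": 5},
    "write-up": {
        "clarity, structure, language, references": 3,
        "background literature survey": 3,
        "insights and discussion": 4,
    },
    "technical": {"correctness": 4, "depth": 4, "innovation": 4},
    "evaluation and results": {
        "sound evaluation metric": 3,
        "thoroughness": 3,
        "results and performance": 4,
    },
    "presentation": {"presentation": 3},  # the +2% bonus is not included
}


def component_totals(rubric):
    """Sum the sub-items of each top-level component."""
    return {name: sum(parts.values()) for name, parts in rubric.items()}


totals = component_totals(RUBRIC)
for name, points in totals.items():
    print(f"{name}: {points}%")
print(f"Final Project total: {sum(totals.values())}%")
```

The sub-items add up to the stated component weights (10%, 12%, 10%), and the components add up to the stated 40%.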

Project Proposal

The project proposal should be one paragraph (200-400 words). Your proposal should contain:

Who are the (1-3) group members? What will each person do? (This needs to be a separate detailed paragraph.)
What is the problem that you will be investigating? Why is it interesting?
What data will you use? If you are collecting new datasets, how do you plan to collect them?
What method or algorithm are you proposing? If there are existing implementations, will you use them and how? How do you plan to improve or modify such implementations?
What reading will you examine to provide context and background?
How will you evaluate your results? Qualitatively, what kind of results do you expect (e.g. plots or figures)? Quantitatively, what kind of analysis will you use to evaluate and/or compare your results (e.g. what performance metrics or statistical tests)?
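To make the quantitative part of the last question concrete, here is a minimal sketch of common performance metrics for a binary classification task, written in plain Python. The function name `binary_metrics` and the toy labels are hypothetical; your project may call for different metrics (for example mAP for detection or BLEU for captioning).

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    # Count the four confusion-matrix cells.
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    accuracy = (tp + tn) / len(y_true)
    # Fall back to 0.0 when a denominator is zero (no positive predictions or labels).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example with made-up labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Stating in the proposal which of these numbers you will report, and on which held-out split, makes the evaluation plan easy to assess.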

Each group should submit a plain-text proposal to comp150dl@gmail.com. If your proposed project is joint with another class' project (with the consent of the other class' instructor), make this clear in the proposal.

Project Milestone

Your project milestone report should be 2-3 pages long and use the provided template. The following is a suggested structure for your report:

Title, Author(s)
Introduction: this section introduces your problem, and the overall plan for approaching your problem
Problem statement: Describe your problem precisely, specifying the dataset to be used, expected results, and evaluation
Technical Approach: Describe the methods you intend to apply to solve the given problem
Intermediate/Preliminary Results: State and evaluate your results up to the milestone

Submission: Please email a PDF file named <your ID>_milestone.pdf to comp150dl@gmail.com. One submission for each group is sufficient.

Final Submission

Your final write-up should be 4-8 pages long and use the provided template. After the class, we will post all the final reports online so that you can read about each other's work. If you do not want your write-up to be posted online, please let us know at least a week in advance of the final write-up submission deadline.

You will submit one or two files:

A pdf file of your final report
(OPTIONAL) zip file (or pdf file) with Supplementary Materials

Report. The following is a suggested structure for the report:

Title, Author(s)
Abstract: It should not be more than 300 words
Introduction: this section introduces your problem, and the overall plan for approaching your problem
Background/Related Work: This section discusses relevant literature for your project
Approach: This section details the framework of your project. Be specific, which means you might want to include equations, figures, plots, etc.
Experiment: This section begins by describing what kind of experiments you're doing, what dataset(s) you're using, and how you measure or evaluate your results. It then shows the results of your experiments in detail. By detail, we mean both quantitative evaluations (show numbers, figures, tables, etc.) and qualitative results (show images, example results, etc.).
Conclusion: What have you learned? Suggest future ideas.
References: This is absolutely necessary.

Supplementary Material is not counted toward your 4-8 page limit.
Examples of things to put in your supplementary material:

Source code (if your project proposed an algorithm, or if the code is relevant and important for your project).
Cool videos, interactive visualizations, demos, etc.

Examples of things to not put in your supplementary material:

All of Caffe source code.
Various ordinary data preprocessing scripts.
Any code that is larger than 1MB.
Model checkpoints.
A computer virus.

Final Presentation

The culmination of this course will be Demo Day on May 12 from 12pm-3pm at Microsoft New England Research and Development (NERD), 1 Memorial Drive 1st Floor, Cambridge, MA. Adams/Attucks Conference Room. Each team will present the results of their project in a talk not exceeding 20 mins. Please leave 2-3 mins for questions. Teams that go over 20 mins will be given no mercy and will be hauled off stage. For examples of good talks, look for videos of oral presentations at NIPS, CVPR, ICCV, etc.

Honor Code

You may consult any papers, books, online references, or publicly available implementations for ideas and code that you may want to incorporate into your strategy or algorithm, so long as you clearly cite your sources in your code and your writeup. However, under no circumstances may you look at another group’s code or incorporate their code into your project.

If you are doing a similar project for another class, you must make this clear and write down the exact portion of the project that is being counted for COMP 150DL.