Project Proposal
Project Title: Image Captioning
Project Idea
We want to train a model that detects the details in an image (such as a dog and a toy) and then outputs a sentence describing the image.
For example, for the following image, the model should detect a brown dog and a blue and yellow toy, and then output a sentence such as “A dog is trying to catch a blue and yellow toy.”
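The two stages described above (detect details, then combine them into a sentence) can be illustrated with a small sketch. Nothing here is the model we plan to train: the Detection class, the describe and caption_from_detections helpers, and the fixed verb phrase are all hypothetical, and the "detections" are written by hand rather than produced by a network. The sketch only shows the kind of input and output each stage would exchange.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Detection:
    """One detected object (hypothetical structure, for illustration only)."""
    label: str                                            # e.g. "dog"
    attributes: List[str] = field(default_factory=list)   # e.g. ["blue", "yellow"]


def describe(det: Detection) -> str:
    """Turn a detection into a noun phrase such as 'a blue and yellow toy'."""
    words = " and ".join(det.attributes)
    phrase = f"{words} {det.label}".strip()
    # Simplification: always uses the article "a", never "an".
    return f"a {phrase}"


def caption_from_detections(subject: Detection, obj: Detection,
                            verb: str = "is trying to catch") -> str:
    """Fill a fixed sentence template; a stand-in for a learned language model."""
    sentence = f"{describe(subject)} {verb} {describe(obj)}."
    return sentence[0].upper() + sentence[1:]


# Hand-written detections standing in for the output of the detection stage.
# The dog's colour is left out so the result matches the example sentence.
dog = Detection("dog")
toy = Detection("toy", ["blue", "yellow"])
print(caption_from_detections(dog, toy))
```

A trained captioning model would replace both the hand-written detections and the sentence template; the template is only there to make the data flow concrete.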
Dataset Details
We will use Flickr8K, Flickr30K, and COCO.
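Each of these datasets pairs every image with several human-written captions, so a first data preparation step is to group the captions by image. The sketch below assumes a caption file in which each line has the form image_name#index, a tab, then the caption; this layout is an assumption about the download and should be checked against the actual files. The group_captions function and the sample file names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_captions(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group caption lines by image name.

    Assumed line format (not verified against the real files):
        <image_name>#<caption_index>\t<caption text>
    """
    captions: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        key, _, text = line.partition("\t")
        if not text:
            continue                      # skip lines without a caption
        image_name = key.split("#")[0]    # drop the "#index" suffix
        captions[image_name].append(text.strip())
    return dict(captions)


# Made-up sample lines in the assumed format.
sample = [
    "example_1.jpg#0\tA dog is trying to catch a blue and yellow toy .",
    "example_1.jpg#1\tA dog jumps for a toy .",
    "example_2.jpg#0\tTwo children play on a beach .",
]
grouped = group_captions(sample)
print(len(grouped["example_1.jpg"]))
```

Reading from a real file would only require passing an open file handle instead of the sample list, since the function accepts any iterable of lines.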
Software
Papers
Deep Visual-Semantic Alignments for Generating Image Descriptions
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Team Members
Progress Milestones
March 8th: Project Proposal Due
March 11th - March 17th: Read papers and prepare the data.
March 18th - March 24th: Design the model.
March 25th - March 31st: Train the model to detect details from the images.
April 5th: Progress Report Due
April 1st - April 7th: Improve the model. Prepare the progress report.
April 8th - April 14th: Design and train the model to combine the detected details into a sentence.
April 15th - April 21st: Improve the model.
April 22nd - April 28th: Improve the model.
April 29th - May 3rd: Finalize the project and prepare the final report.
May 3rd: Final Report Due
Progress Report
Click here to view our progress report.