Time: Tuesdays and Thursdays, 2:00pm-3:20pm
Location: VHE 217
Instructor: Zornitsa Kozareva
Teaching Assistant: Victor Barres
Guest Lecturers:
- Kenji Sagae, ICT
- Sujith Ravi, Google
- Anton Leuski, ICT
CSCI544-2013 Final Project Award Winners
(from left to right: Victor Barres, Ruoyang Wang, Wenqi Zhang, Greg Harris, Zornitsa Kozareva, Changhai Zheng, Yunqing Cao, Soonil Nagarkar, Sam Shuster and Bobi Pu)
Best Award for Most Creative Idea: Changhai Zheng and Yunqing Cao
Best Award for Presentation: Soonil Nagarkar
Best Award for System (results and algorithm): Greg Harris
Best Award for Most Likely to be Converted into Successful Business: Bobi Pu and Sam Shuster
Best Award for all of the above categories: Wenqi Zhang and Ruoyang Wang
The awards are based on the ratings given to each project by the 40 (forty) graduate students taking the CSCI544 2013 class, the TA Victor Barres, and the professor, Dr. Zornitsa Kozareva.
Class Questions:
Use Piazza to post class-related questions or to start a discussion:
https://piazza.com/class#spring2013/csci544
Goals:
This course covers both fundamental and cutting-edge research topics in Natural Language Processing (NLP) and delves into modern NLP applications, including information extraction, information retrieval, question answering systems like IBM's Watson, and sentiment analysis.
Audience:
This graduate course is intended for:
- students who want to understand the state of the art and current research in NLP
- students interested in tools for building NLP applications
- students interested in applications of NLP such as sentiment analysis, information extractors, and search engines
Prerequisites:
Proficiency in programming, algorithms, and data structures; basic knowledge of linear algebra and statistics.
Related Courses:
There is a sister course, Advanced Natural Language Processing, offered in the fall semester. You can take these two courses in either order.
Textbooks (optional reading)
- Daniel Jurafsky and James Martin. Speech and Language Processing (2nd ed), Prentice Hall, 2008.
- Christopher Manning and Hinrich Schütze. Foundations of Statistical Natural Language Processing, MIT Press, 1999.
- James Allen. Natural Language Understanding (2nd ed), Addison Wesley, 1994.
- Ian H. Witten and Eibe Frank. Data Mining: Practical Machine Learning Tools and Techniques (3rd ed), Morgan Kaufmann, 2005.
- Steven Bird, Ewan Klein, and Edward Loper. Natural Language Processing with Python - Analyzing Text with the Natural Language Toolkit, O'Reilly Media, 2009.
Classes from Previous Years
Syllabi and materials from previous years. Since those pages are no longer maintained, there is no guarantee of completeness.
Coursework:
Students will experiment with existing NLP software toolkits and write their own programs. Students will work with real datasets and will build their own NLP Information Extraction, Text Classification and Sentiment Analysis systems. Grades will be based on:
- Programming assignments (2 x 25%): the grade will depend on the technical report and on how the system performs relative to the rest of the class.
- Research project (50%): the grade will depend on the project's substantiality, correctness, relevance to the course, as well as the clarity and depth of the project report, which should follow standard ACL guidelines. Building a demo system will be optional, but will count as bonus points.
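The weighting above can be illustrated with a small worked example. This is only a sketch: it assumes each component is scored on a 0-100 scale, and the `bonus` parameter is a hypothetical stand-in for the optional demo credit, whose actual size is not specified here.

```python
def final_grade(hw1, hw2, project, bonus=0.0):
    """Combine component scores (assumed 0-100 scale) using the course weights.

    hw1, hw2 : programming assignment scores, 25% each
    project  : research project score, 50%
    bonus    : hypothetical demo bonus; the real amount is not stated
    """
    weighted = 0.25 * hw1 + 0.25 * hw2 + 0.50 * project
    return weighted + bonus


# Example: 80 and 90 on the assignments, 85 on the project, no demo bonus.
print(final_grade(80, 90, 85))  # 85.0
```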
Homeworks and Project Proposal Guidelines
Homework I: Named Entity Recognition
Homework II: Web Page Clustering of Ambiguous Names
Project Proposal
Syllabus
| Date | Instructor | Lecture |
| --- | --- | --- |
| January 15 | Kozareva | Introduction to NLP |
| January 17 | Kozareva | Morphology and Basic Text Processing |
| January 22 | Kozareva | Named Entity Recognition, Decision Trees |
| January 24 | Kozareva | Named Entity Recognition, k-NN, Features |
| January 29 | Kozareva | Introduction to Weka (Homework 1 is out) |
| January 31 | Kozareva | Name Discrimination |
| February 5 | Sagae | POS Tagging |
| February 7 | Sagae | Parsing |
| February 12 | Kozareva | Question Classification |
| February 14 | Kozareva | Sentiment Analysis |
| February 19 | Kozareva | Regression |
| February 21 | Kozareva | Bullying Detection |
| February 26 | Kozareva | Latent Semantic Analysis (Homework 2 is out) |
| February 28 | Kozareva | Applications of Latent Semantic Analysis |
| March 5 | Kozareva | Singular Value Decomposition |
| March 7 | Ravi | Unsupervised Learning for Structured Prediction |
| March 12 | Barres | Principal Component Analysis |
| March 14 | Kozareva | Latent Dirichlet Allocation |
| March 19 | | Spring Break |
| March 21 | | Spring Break |
| March 26 | Kozareva | Semantic Class Induction |
| March 28 | Kozareva | Graph Algorithms |
| April 2 | Kozareva | Taxonomies |
| April 4 | Kozareva | Semantic Relations |
| April 9 | Leuski | Information Retrieval |
| April 11 | Kozareva | Events |
| April 16 | Kozareva | Textual Entailment |
| April 18 | Satheeshkumar Karuppusam | #8: Sarcasm Identification in Social Network data |
| | Vaishnavi Dalvi and Nirmisha Bollampalli | #3: Affective Text for News Headlines |
| | Saravanan Ganesh and Harshvardhan | #4: News Summarization |
| | Soonil Nagarkar | #19: Author Attribution Through Stylometric Analysis |
| April 23 | Nabir Bora | #1: Sentiment Analysis on Conversational Speech Transcriptions: Do vocal cues play a role? |
| | Kiana Baradaran and Stefan Zeltner | #15: Semantic Analysis for TV Episodes Ratings |
| | Akanksha Gopinath, Laksh Gupta and Sanket Sabnis | #5: Question-Answer System with Search Engine Integration |
| | Greg Harris | #6: Finding Informative Comments in Forums Beset by Amateurs, Jokesters and Trolls |
| | Xing Shi and Ai He | #20: Building a Causality Database from Yelp Reviews |
| April 25 | Shashank Mandil and Linwei Zhu | #9: Stock Sentiment Analysis |
| | Pavan Gadam Manohar and Palvinder Singh | #10: Prediction of movie ratings |
| | Cristina Cano and Sayat Satybaldiyev | #2: Twitter Topic Modeling and Sentiment |
| | Ashish Jain | #7: Multi-Document Summarization |
| | Changhai Zheng and Yunqing Cao | #16: Opinion Mining based on Comparison |
| April 30 | Bobi Pu and Sam Shuster | #11: Using Natural Language Processing of Social Media in conjunction with Online Charts to Predict Billboard Top10 |
| | Shivasankari Kannan | #12: Emotion Tracking in Novels |
| | Jia Li and Chen Zhang | #18: Aspect-Sentiment Analysis of Amazon Product Review |
| | Yubing Dong, Yunru Huang and Shitian Shen | #17: A Case Study of Sentiment Analysis and Topic Detection in Chinese Tweets |
| | Teng Song | #22: Unsupervised part-of-speech tagging |
| May 2 | Wenqi Zhang and Ruoyang Wang | #13: Graph of Fame: Interpersonal Relationship Extraction and Social Network Construction of Celebrities |
| | Vladimir Zaytsev | #14: N7 - Hate Speech Classifier for Short Messages |
| | Swaroop Manjunath and Zinan Xing | #21: Predicting Stock Returns from Blog Sentiment |
| | Kozareva | Closing and Award Ceremony |
Statement for Students with Disabilities:
Any student requesting academic accommodations based on a disability is required to register with Disability Services and Programs (DSP) each semester. A letter of verification for approved accommodations can be obtained from DSP. Please be sure the letter is delivered to me (or to the TA) as early in the semester as possible. DSP is located in STU 301 and is open 8:30 a.m.-5:00 p.m., Monday through Friday. The phone number for DSP is (213) 740-0776.
Statement on Academic Integrity:
USC seeks to maintain an optimal learning environment. General principles of academic honesty include the concept of respect for the intellectual property of others, the expectation that individual work will be submitted unless otherwise allowed by an instructor, and the obligations both to protect one's own academic work from misuse by others as well as to avoid using another's work as one's own. All students are expected to understand and abide by these principles. Scampus, the Student Guidebook, contains the Student Conduct Code in Section 11.00, while the recommended sanctions are located in Appendix A: http://www.usc.edu/dept/publications/SCAMPUS/gov/. Students will be referred to the Office of Student Judicial Affairs and Community Standards for further review, should there be any suspicion of academic dishonesty. The Review process can be found at: http://www.usc.edu/student-affairs/SJACS/.
Emergency Preparedness/Course Continuity in a Crisis:
In case of a declared emergency if travel to campus is not feasible, USC executive leadership will announce an electronic way for instructors to teach students in their residence halls or homes using a combination of Blackboard, teleconferencing, and other technologies.