Course taught by Matt Crocker and Vera Demberg
Room and slots: -1.05 in C7.2 (basement) on Monday 8:30–10:00 and Wednesday 8:30–10:00 during the first half of the semester (until June 7th), plus one additional meeting on June 26th for poster presentations. See the calendar below for details.
Contact: crocker / vera at coli ...
Please subscribe to our course mailing list.
This course will cover the mathematical basis of information theory, and then proceed to information-theoretic approaches to the study of language, with respect to language comprehension, language production, and language evolution. We will also discuss methodologies for testing hypotheses related to these information-theoretic concepts.
The course will include tutorials where we look at ways to estimate surprisal from text corpora, in order to test for effects of surprisal or uniform information density.
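To make the tutorial topic concrete, here is a minimal sketch (not the course tutorial code) of estimating per-word surprisal, -log2 P(word | context), from a corpus using a bigram model with add-one smoothing. The toy corpus and smoothing choice are illustrative assumptions, not course materials.

```python
# Sketch: bigram surprisal estimation from a toy corpus.
# Surprisal(w_i) = -log2 P(w_i | w_{i-1}); higher = less predictable.
import math
from collections import Counter

# Illustrative toy corpus (an assumption, not from the course).
corpus = "the dog barks . the cat sleeps . the dog sleeps .".split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
vocab_size = len(unigrams)

def surprisal(prev, word):
    """Bigram surprisal in bits, with add-one (Laplace) smoothing."""
    prob = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
    return -math.log2(prob)

# A frequent continuation gets lower surprisal than an unseen one:
print(surprisal("the", "dog"))    # seen twice after "the"
print(surprisal("the", "barks"))  # never seen after "the"
```

In practice the tutorials would use much larger corpora and stronger language models, but the quantity being computed, the negative log probability of a word in its context, is the same.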
Each student needs to attend the meetings and participate in discussions. Grades will be determined based on a poster project. Posters will be presented at the last meeting on June 26th.
Poster templates: LaTeX poster, PPT poster
Students will form groups of two or three to prepare a poster. Each group should prepare a research proposal describing a linguistic phenomenon that could be investigated using surprisal or the UID hypothesis. In particular, students should propose which research method to use for tackling the question, what results they would expect, and what it would mean to find results that differ from the expected ones. (The research does not actually have to be carried out; the posters are about application ideas / proposals.)
Date | Topic | Speaker
24.4. | | Matt Crocker
26.4. | | Matt Crocker
3.5. | Tutorial on language models | Clayton Greenberg (in Room 2.11!)
8.5. | Is human language a good code? | Vera Demberg
10.5. | Is human language a good code? (cont'd) | Vera Demberg
15.5. | | Matt Crocker
17.5. | Tutorial | Clayton Greenberg
22.5. | | Vera Demberg
24.5. | | Jesus Calvillo
29.5. | | Matt Crocker
31.5. | | Vera Demberg
7.6. | | Clayton Greenberg
26.6. | Poster presentations | students
Piantadosi, S., Tily, H., and Gibson, E. (2011). Word lengths are optimized for efficient communication. Proceedings of the National Academy of Sciences, 108(9):3526.
Mahowald, K., Fedorenko, E., Piantadosi, S. T., and Gibson, E. (2013). Info/information theory: Speakers choose shorter words in predictive contexts. Cognition, 126:313–318.
Hale, J. (2001). A probabilistic Earley parser as a psycholinguistic model. In Proceedings of the Second Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL '01), pages 1–8. Association for Computational Linguistics.
Levy, R. (2008). Expectation-based syntactic comprehension. Cognition, 106(3):1126–1177. doi:10.1016/j.cognition.2007.05.006.
Demberg, V. and Keller, F. (2008). Data from eye-tracking corpora as evidence for theories of syntactic processing complexity. Cognition, 109:193–210.
Frank, S., Otten, L., Galli, G., and Vigliocco, G. (2013). Word surprisal predicts N400 amplitude during reading. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pages 878–883. Association for Computational Linguistics.
Hale, J. (2003). The information conveyed by words in sentences. Journal of Psycholinguistic Research, 32(2):101–123.
Genzel, D. and Charniak, E. (2002). Entropy rate constancy in text. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL '02), pages 199–206. Association for Computational Linguistics.
Roark, B., Bachrach, A., Cardenas, C., and Pallier, C. (2009). Deriving lexical and syntactic expectation-based measures for psycholinguistic modeling via incremental top-down parsing. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 324–333. Association for Computational Linguistics.
Linzen, T. and Jaeger, T. F. (2014). Investigating the role of entropy in sentence processing. To appear in Proceedings of the 2014 Workshop on Cognitive Modelling and Computational Linguistics (CMCL). Association for Computational Linguistics.
Jaeger, T. F. (2010). Redundancy and reduction: Speakers manage syntactic information density. Cognitive Psychology, 61:23–62.