Computational Linguistics for Low-Resource Languages

Winter Semester 2011/12
Wednesdays, 10am-12pm
Building C72, U15

instructor: Alexis Palmer
office: C74, 3.01
phone: +49-681-302-70027
email: apalmer@coli.uni-sb.de

News and important links
  1. This site is NOT being updated with scheduling changes.
    For current information, use the CL4LRL wiki: all users welcome!
    Contact me for access credentials.
  2. Students, remember to register via HISPOS.

Introduction

Much of the best-known work in computational linguistics has focused on English or one of a small number of other languages. What about the remaining 6000+ languages spoken in the world?

There is in fact a significant and growing body of work on computational linguistics for other languages. In this course we are interested not so much in individual systems for single languages, but rather in learning what kinds of approaches are most relevant for languages without abundant labeled data.

We will consider the following questions:
  1. what is a low-resource language?
  2. what are the special challenges posed for CL/NLP by low-resource languages?
  3. what are the current problems and major approaches for low-resource languages?
  4. how can linguistic knowledge inform systems for low-resource languages?
  5. how can semi- and unsupervised approaches be adapted for low-resource languages?


Schedule    
For links to papers (and other details), see the wiki page.

Date    Topic(s)                              Reading(s)                     Presenter(s)      Resources
26-Oct  Introduction                                                         Palmer            slides, details about case study
2-Nov   Language resource status assessments  case studies                   Palmer, all       slides
9-Nov   *** I'll be at IJCNLP, no meeting ***
16-Nov  Grammar engineering                   Grammar Matrix                 Antske Fokkens    slides
                                              Bender/etal 2010
23-Nov  More on data                          Abney/Bird 2010                Palmer            slides
                                              Bird/Simons 2003
30-Nov  Data collection and use,              a. Lewis/Xia 2010              Tan
        Resource bootstrapping                b. Poornima/Good 2010          Koleva
7-Dec   Unsupervised morphological analysis   a. Creutz/Lagus 2005           Celik
                                              b. Moon/etal 2009              Simova
14-Dec  POS tagging                           a. Petrov/etal 2011            Farner
                                              b. Biemann 2010                Stahl
21-Dec  Using IGT                             a. Xia/Lewis 2007              Fell
                                              b. Lewis/Xia 2008              discussion
11-Jan  Typological implications, universals  a. Daume/Campbell 2007         Littauer
                                              b. Naseem/etal 2010            Scheidel
18-Jan  Language phylogeny                    a. Daume 2009                  Schulder
                                              b. Berg-Kirkpatrick/etal 2010  Ardanuy
25-Jan  Parse projection, Crisis MT           a. TBD                         Bloem
                                              b. Lewis/etal 2011             Gorinski
1-Feb   Cross-lingual IR, Cross-lingual WSD   a. TBD                         Schwarz
                                              b. TBD                         Khoddammohammadi
8-Feb



last modified: November 17, 2011