Introduction to Scheme/Python in Computational Linguistics

Damir Ćavar, University of Zadar

This course offers an introduction to computational linguistics through concrete, practical examples in the programming languages Python and Scheme.

The concepts discussed in this course include:

- Processing text files (and various code pages)
- Extracting word lists and creating frequency profiles
- Generating and applying N-gram models built from the characters, morphemes, or words of a text
- Statistical methods for language identification, text similarity metrics, text classification, and clustering
- Analysis of corpora and treebanks
- Syntactic parsing with Context-Free Grammars and Probabilistic Context-Free Grammars
- Syntactic parsing with Categorial Grammars
- Top-down, bottom-up, chunk, and chart parsing

All concepts are accompanied by code examples (which will be available here soon) and are discussed in detail with respect to their implementation and formal properties.
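
As a rough illustration of two of the concepts listed above, word frequency profiles and character N-grams, here is a minimal Python sketch. It is not taken from the course materials: the function names word_frequencies and char_ngrams and the sample text are hypothetical, and the tokenization (lowercasing and splitting on non-word characters) is a simplifying assumption that a real corpus would likely need to refine.

```python
from collections import Counter
import re


def word_frequencies(text):
    """Return a Counter mapping each lowercased word to its count.

    Assumption: a "word" is any run of word characters (\\w+).
    """
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)


def char_ngrams(text, n):
    """Return a Counter of all overlapping character N-grams of length n."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Hypothetical sample data, chosen only for illustration.
sample = "the cat saw the dog and the dog saw the cat"
freq = word_frequencies(sample)
bigrams = char_ngrams("banana", 2)

# Print the frequency profile, most frequent words first.
for word, count in freq.most_common():
    print(word, count)
```

Counter is used here because it handles both counting and sorting by frequency; the same counts could be kept in a plain dictionary.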


Scheme - MzScheme and DrScheme
Python - and ActiveState

course web page