One of the main goals of this homework is to practice using regular expressions, both in your terminal environment and in AntConc. It is all too easy to fall back on the few regular expression syntax rules you already know; please make an effort to use as many different kinds of regular expression syntax as possible.
- Explore the Gutenberg corpus in your terminal environment, using unix commands and regular expression search patterns. For each item, provide your search syntax, the top 20-30 lines of your result, and a short analysis.
  - Search the corpus (original text files) for a linguistic expression of your choice. Use regular expression syntax.
  - Try another one of the above.
  - Search any N-gram (unigram, bigram, trigram, 4-gram) file for an expression/pattern of your choice.
  - Try another one of the above; this time, process the result further to produce a frequency list.
- Explore the Gutenberg corpus again, this time using AntConc. For each item, present a screenshot of your result along with your own analysis of it.
  - Look up concordances of an expression of your choice. This one does not have to involve regular expressions.
  - Look up concordances of an expression of your choice. Make sure to use regular expression syntax in your search.
  - Look up frequent clusters involving a search item of your choice.
  - Find collocates of a word of your choice.
NOTE: How to take a screenshot
* PCs: press Alt+PrtSc to capture the current window, then Ctrl+V to paste it into your document.
* Macs: see http://guides.macrumors.com/Taking_Screenshots_in_Mac_OS_X.
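
For the terminal tasks, the general shape of a regular expression search followed by a frequency list might look like the sketch below. It is only an illustration, not a model answer: the three sample sentences and the file name `sample.txt` are made up so that the script runs on its own, and you would point `$corpus` at a real corpus text file instead. It also assumes a `grep` that supports `-o`, which is not part of POSIX but is available in both GNU and BSD grep.

```shell
#!/bin/sh
# Sketch only: the sample sentences and the file name sample.txt are made up.
# Point "$corpus" at a real corpus text file to try this on actual data.
LC_ALL=C; export LC_ALL          # byte-wise sorting, so ties order predictably
tmpdir=$(mktemp -d)
corpus="$tmpdir/sample.txt"
printf '%s\n' \
  'She was walking and talking while he was reading.' \
  'They were walking home, singing loudly.' \
  'Nothing was stopping him from walking away.' > "$corpus"

# Search: "was" or "were" followed by a word ending in -ing.
# Uses alternation (a|b), bracket classes, the + quantifier, and ^ / $ anchors.
hits=$(grep -nE '(^|[^[:alpha:]])(was|were) [[:alpha:]]+ing([^[:alpha:]]|$)' "$corpus")
printf '%s\n' "$hits" | head -n 30

# Frequency list: extract every -ing word (-o prints only the match),
# lowercase it, then count and rank the distinct forms.
freq=$(grep -oE '[[:alpha:]]+ing' "$corpus" \
  | tr '[:upper:]' '[:lower:]' \
  | sort | uniq -c \
  | sort -k1,1nr -k2,2 \
  | awk '{print $1, $2}')
printf '%s\n' "$freq" | head -n 30

rm -rf "$tmpdir"
```

Note that the second pattern also picks up "nothing", which is not a verb form. False positives of this kind are exactly what your short analysis should comment on.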