## Python 2.7 Tutorial

With Videos by mybringback.com

## List Comprehension

On this page: list comprehension [f(x) for x in li if ...].

### Filtering Items In a List

Suppose we have a list. Often, we want to gather only the items that meet certain criteria. Below, we have a list of words, and we want to extract from it only the ones that contain 'wo'. For this, we first make a new empty list, and then iterate through the original list, appending each item that matches:
```
>>> wood = 'How much wood would a woodchuck chuck if a woodchuck could chuck wood?'.split()
>>> wood
['How', 'much', 'wood', 'would', 'a', 'woodchuck', 'chuck', 'if', 'a', 'woodchuck', 'could', 'chuck', 'wood?']
>>> wolist = []
>>> for x in wood:
...     if 'wo' in x:
...         wolist.append(x)
...
>>> wolist
['wood', 'would', 'woodchuck', 'woodchuck', 'wood?']
>>>
```
OK, that works, but that's a lot of lines of code. What if I told you you can accomplish it all with one line of Python code? Well you can! Behold the superpower of list comprehension:
```
>>> [x for x in wood if 'wo' in x]
['wood', 'would', 'woodchuck', 'woodchuck', 'wood?']
>>>
```
You want a list of words that are 5+ characters? That too can be done with list comprehension:
```
>>> [x for x in wood if len(x) >= 5]
['would', 'woodchuck', 'chuck', 'woodchuck', 'could', 'chuck', 'wood?']
>>>
```
Words that are 5+ characters AND end with 'ck':
```
>>> [x for x in wood if len(x) >= 5 and x.endswith('ck')]
['woodchuck', 'chuck', 'woodchuck', 'chuck']
>>>
```
You get the idea. A list comprehension for filtering starts with [x for x in li], which by itself creates a new list with the same items as li, and then adds an if ... clause at the end, which acts as the filtering criterion.
```
>>> [x for x in wood if len(x) <= 4]     # if ... clause for filtering
['How', 'much', 'wood', 'a', 'if', 'a']
>>>
```
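To make the "same items as li" point concrete, here is a minimal sketch. The list `li` and the variable names are placeholders chosen for illustration, not part of the original session. It checks that the bare comprehension produces an equal but separate list, and that the if clause then narrows it down.

```python
# Hypothetical sample data; `li` stands in for any list.
li = ['How', 'much', 'wood', 'would', 'a', 'woodchuck', 'chuck']

# With no if clause, every item is kept, so the result equals li...
copy_of_li = [x for x in li]
same_items = (copy_of_li == li)

# ...but it is a new list object, not another name for li.
same_object = (copy_of_li is li)

# Adding an if clause keeps only the items for which the test is true.
short_words = [x for x in li if len(x) <= 4]
```

Because the result is a separate list, appending to or sorting `copy_of_li` leaves `li` untouched.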

### Transforming Items in a List

Another popular type of task with a list is to transform each item. For example, suppose I want to create a new list where each 'o' is replaced by 'oo' in every word. As before, the usual for-loop process gets the job done but is tedious:
```
>>> wood
['How', 'much', 'wood', 'would', 'a', 'woodchuck', 'chuck', 'if', 'a', 'woodchuck', 'could', 'chuck', 'wood?']
>>> doubleo = []
>>> for x in wood:
...     doubleo.append(x.replace('o', 'oo'))
...
>>> doubleo
['Hoow', 'much', 'wooood', 'woould', 'a', 'woooodchuck', 'chuck', 'if', 'a', 'woooodchuck', 'coould', 'chuck', 'wooood?']
>>>
```
Again, with list comprehension, all you need is one line of code:
```
>>> [x.replace('o', 'oo') for x in wood]
['Hoow', 'much', 'wooood', 'woould', 'a', 'woooodchuck', 'chuck', 'if', 'a', 'woooodchuck', 'coould', 'chuck', 'wooood?']
>>>
```
Another example -- capitalizing every word:
```
>>> [x.capitalize() for x in wood]
['How', 'Much', 'Wood', 'Would', 'A', 'Woodchuck', 'Chuck', 'If', 'A', 'Woodchuck', 'Could', 'Chuck', 'Wood?']
>>>
```
A list of word lengths, one for every word in wood:
```
>>> [len(x) for x in wood]     # f(x) for transformation
[3, 4, 4, 5, 1, 9, 5, 2, 1, 9, 5, 5, 5]
>>>
```
So you can see how handy this is. The syntax works like this: start with [x for x in li], which creates a new list with the same items as li, then replace the initial x with f(x), some function or expression that takes x as its input. The result is a new list in which each x is transformed to f(x).
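The f(x) slot is not limited to built-in methods; any function you define yourself fits there too. The sketch below assumes a helper named `strip_punct`, which is made up for this illustration, and applies it to every word in the same sentence.

```python
def strip_punct(word):
    """Hypothetical helper: drop trailing '?', '!', '.' or ',' from a word."""
    return word.rstrip('?!.,')

wood = 'How much wood would a woodchuck chuck if a woodchuck could chuck wood?'.split()

# f(x) here is strip_punct(x); the result has one item per item in wood.
cleaned = [strip_punct(x) for x in wood]
```

Since there is no if clause, `cleaned` has exactly as many items as `wood`; only the last word, 'wood?', actually changes.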

### Filtering and Transformation, Applied Together

You might ask: can we filter AND transform at the same time? Sure we can. Below, we keep only those words that contain 'wo' and then uppercase them:
```
>>> [x.upper() for x in wood if 'wo' in x]
['WOOD', 'WOULD', 'WOODCHUCK', 'WOODCHUCK', 'WOOD?']
>>>
```
What we have here is this syntax: [f(x) for x in li if ...]. Here's another example:
```
>>> [x+'-away' for x in wood if len(x) <= 4]     # f(x) and if ...
['How-away', 'much-away', 'wood-away', 'a-away', 'if-away', 'a-away']
>>>
```
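It can help to see the combined form next to the for loop it replaces. The following sketch uses the same sentence; the names `short_away` and `result` are chosen here for illustration. Note that the if test is applied to the original x, before f(x) transforms it.

```python
wood = 'How much wood would a woodchuck chuck if a woodchuck could chuck wood?'.split()

# Comprehension: test the original x, then apply f(x) to the items kept.
short_away = [x + '-away' for x in wood if len(x) <= 4]

# The equivalent for loop, spelled out step by step.
result = []
for x in wood:
    if len(x) <= 4:                  # the if ... clause (filter)
        result.append(x + '-away')   # f(x) (transformation)
```

Both approaches build the same list; the comprehension simply folds the loop, the test, and the append into one expression.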
In the NLTK book, you will see a lot of examples of list comprehension in action, performing exciting operations on gigantic lists of words and other linguistic data. You should get comfortable with list comprehension: it will super-charge your text processing.