Go to: Na-Rae Han's home page  

Python 3 Notes

Importing Modules

On this page: import, the math module, math.sqrt(), the random module, random.choice(), the nltk package, nltk.word_tokenize(), importing functions with from m import x, aliasing with from m import x as y.

How to Import Python Modules

In this video tutorial, we learned how to import a Python script you yourself created as a module and re-use the functions you built earlier. You achieve this through the import statement. The utility of the import statement is more general, however: it is used to import any external module with pre-written functions you can easily access and use. (Hence the import antigravity line from the xkcd comic on the tutorial home! You should try the command, by the way.)
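
As a rough sketch of importing your own script: the example below assumes a hypothetical file named greet_tools.py with one hypothetical function, greet(). To keep the example self-contained it writes that file into a temporary folder and adds the folder to Python's search path; in everyday use you would simply save the file next to your main script and skip those steps.

```python
import sys
import tempfile
from pathlib import Path

# Write a tiny script to a temporary folder so this example is self-contained.
# Normally you would just save greet_tools.py (a made-up name) next to your main script.
folder = tempfile.mkdtemp()
Path(folder, "greet_tools.py").write_text(
    "def greet(name):\n"
    "    return 'Hello, ' + name + '!'\n"
)

sys.path.insert(0, folder)     # tell Python where to look for the module
import greet_tools             # the file name minus .py is the module name

print(greet_tools.greet("world"))   # Hello, world!
```

Once imported, the function is reached as greet_tools.greet(), with the module name as a prefix.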

We have seen some familiar math functions: operators such as +, / and functions such as sum() are used so frequently that they are included as built-ins. They are simply accessible, no additional step needed. (Can you imagine a programming language without + or -? Neither can I.) But additional math functions such as square root, logarithm, factorial, etc. are packaged as part of the math module, which must be imported before you can use them. Let's first import it, and then see what functions are available using dir():

>>> import math
>>> dir(math)
['__doc__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 
'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf', 
'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 
'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'ldexp', 'lgamma', 'log', 
'log10', 'log1p', 'log2', 'modf', 'nan', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 
'tan', 'tanh', 'trunc']
Some of them look familiar. Let's try .sqrt() 'square root'. Because this function is part of the math module, you have to invoke it with the module name prefix: math.sqrt() is how you do it. Without the math prefix, you run into an error.
>>> sqrt(256)
Traceback (most recent call last):
  File "<pyshell#16>", line 1, in <module>
NameError: name 'sqrt' is not defined
>>> math.sqrt(256)   
>>> math.sqrt(3849458582)        # module name math. must be prefixed
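
For reference, here is a small script-style sketch that tries a few more names from the dir(math) listing above. The variable names are only illustrative; the point is that every call carries the math. prefix.

```python
import math

# Every function is reached through the module name prefix.
root = math.sqrt(256)          # square root, returned as a float
fact = math.factorial(5)       # 5 * 4 * 3 * 2 * 1
log_base_2 = math.log2(8)      # logarithm base 2
rounded_up = math.ceil(2.1)    # smallest integer >= 2.1

print(root, fact, log_base_2, rounded_up)   # 16.0 120 3.0 3
```
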
Let's try another module, called random. This is a module that implements random number generators. What's in it? Again, you can find out using dir():
>>> import random
>>> dir(random)
['BPF', 'LOG4', 'NV_MAGICCONST', 'RECIP_BPF', 'Random', 'SG_MAGICCONST', 'SystemRandom', 
'TWOPI', '_BuiltinMethodType', '_MethodType', '_Sequence', '_Set', '__all__', '__builtins__', 
'__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 
'_acos', '_ceil', '_cos', '_e', '_exp', '_inst', '_log', '_pi', '_random', '_sha512', '_sin', 
'_sqrt', '_test', '_test_generator', '_urandom', '_warn', 'betavariate', 'choice', 'expovariate', 
'gammavariate', 'gauss', 'getrandbits', 'getstate', 'lognormvariate', 'normalvariate', 
'paretovariate', 'randint', 'random', 'randrange', 'sample', 'seed', 'setstate', 'shuffle', 
'triangular', 'uniform', 'vonmisesvariate', 'weibullvariate']
OK, that's a lot of ... stuff. Two functions that are often useful are .choice() and .shuffle(). Let's see what .choice() does, using the built-in help() function. Again, don't forget to prefix random when referencing choice(). Seems pretty straightforward: given a non-empty sequence such as a list as the argument, random.choice() chooses an element at random and returns it.
>>> help(random.choice)
Help on method choice in module random:

choice(seq) method of random.Random instance
    Choose a random element from a non-empty sequence.
>>> random.choice([1,2,3,4,5])
>>> random.choice([1,2,3,4,5])
Let's make it more interesting. We can use the random.choice() function to generate random adjective + noun pairings:
>>> adj = ['happy', 'sad', 'curious', 'green', 'colorless', 'evil']
>>> n = ['ideas', 'penguins', 'pandas', 'love', 'professors']
>>> random.choice(adj) + " " + random.choice(n)
'sad pandas'
>>> random.choice(adj) + " " + random.choice(n)
'green professors'
>>> random.choice(adj) + " " + random.choice(n)    # Don't re-type the whole thing! Use Alt+p or Ctrl+p.
'sad love'
You are probably compelled to keep on trying until you get 'evil professors'. I understand.
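
If you want the pairings in a reusable form, one possible sketch is below. The function name random_phrase() is made up for illustration, and the call to random.seed() is an addition not covered above: it fixes the starting point of the random number generator, so the same seed replays the same picks. The sketch also shows random.shuffle(), which was mentioned earlier but not demonstrated.

```python
import random

def random_phrase(adjectives, nouns):
    """Return one randomly chosen 'adjective noun' pairing."""
    return random.choice(adjectives) + " " + random.choice(nouns)

adj = ['happy', 'sad', 'curious', 'green', 'colorless', 'evil']
n = ['ideas', 'penguins', 'pandas', 'love', 'professors']

random.seed(1330)              # arbitrary seed; same seed -> same sequence of picks
first = random_phrase(adj, n)
random.seed(1330)
second = random_phrase(adj, n)
print(first == second)         # True: re-seeding replays the same choices

deck = [1, 2, 3, 4, 5]
random.shuffle(deck)           # reorders the list in place and returns None
print(sorted(deck))            # [1, 2, 3, 4, 5]: same elements, whatever the new order
```

Without the seed() calls, each run produces fresh pairings, which is what you want when hunting for 'evil professors'.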

Standard vs. 3rd-Party Modules

math and random are part of the Python Standard Library: even though you have to import them first before using them, they are nevertheless pre-installed as part of the standard Python installation package. Are there more? You bet. This is the exhaustive list. (Don't worry -- we will be using only a handful of them in this class.)

But beyond this standard library, what makes programming languages such as Python so powerful is the vast sea of libraries developed and generously shared by 3rd parties. Since they are not part of the standard Python distribution, you have to download and install them separately. After that, though, using these 3rd-party packages is done exactly the same way: through the import statement.

NLTK (Natural Language Toolkit), which we will be using extensively in the second half of the class, is one such suite of libraries. Once you download and install it (see this page), you can use its handy text-processing functions, such as word tokenization, part-of-speech tagging, and more. The example below shows how to import NLTK and use its .word_tokenize() function.

>>> import nltk
>>> nltk.word_tokenize("It's 5 o'clock somewhere.")
['It', "'s", '5', "o'clock", 'somewhere', '.']
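
Because a 3rd-party package may or may not be present on a given machine, it can help to check before importing. The sketch below is one way to do that, using importlib.util.find_spec() from the standard library; the helper name is_installed() is made up for this example.

```python
import importlib.util

def is_installed(module_name):
    """Return True if Python can locate a top-level module or package by this name."""
    return importlib.util.find_spec(module_name) is not None

print(is_installed("math"))     # True: part of the standard library
print(is_installed("nltk"))     # True only if NLTK has been installed
```

If the second line prints False, import nltk would fail with an error, and the package needs to be installed first.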

Importing Functions From a Module, Aliasing

Sometimes, when your module/package gets sufficiently complex (as it does with NLTK), referencing a function from a module with its full module path can become tedious, especially if you have to do it repeatedly. If you are going to be using only certain functions from a module, you can (1) import those functions individually, and (2) apply aliasing too while at it.

Importing only particular functions or submodules from a module is achieved through the from m import x statement, as shown below. This lets you reference the function x without having to prefix it with the module name every single time. Note that this imports only the particular function: the choice() function in our example. Unless you also import the random module as a whole, the name random and all other functions underneath it stay unavailable.

>>> from random import choice
>>> choice([1,2,3,4,5])              # No need to write random.choice()!
>>> help(random.shuffle)             # Oops! random module hasn't been imported
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
NameError: name 'random' is not defined
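
To see exactly which names a from m import x statement creates, one option is to run it inside a fresh, empty namespace and inspect the result. This is only a demonstration sketch: exec() and the dictionary named namespace are used here for inspection and are not something you would normally need in your own scripts.

```python
# Run the import inside a fresh, empty namespace so we can inspect
# exactly which names it creates.
namespace = {}
exec("from random import choice", namespace)

print("choice" in namespace)    # True: the function name is now available
print("random" in namespace)    # False: the module name itself is not
```
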

You can also import multiple functions in one import statement:

>>> from math import sqrt, log
>>> sqrt(1600)       # No need to prefix math!
>>> log(27, 3)       # log of 27, base 3

If your function name is still too long, you can also apply aliasing by appending as y, where y is the shorter name you yourself give to the function. The full syntax therefore is from m import x as y. In the NLTK example above, word_tokenize is still a lot to type, so let's give it a nice short nickname of "wtk":

>>> from nltk import word_tokenize as wtk
>>> wtk("I ain't nobody's fool.")       # much shorter than nltk.word_tokenize !!
['I', 'ai', "n't", 'nobody', "'s", 'fool', '.']
There it is... it surely beats writing nltk.word_tokenize() every single time!
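
The same aliasing pattern works with standard library modules, so you can try it without installing anything. In the sketch below, the nicknames fact and pick are arbitrary choices made for this example.

```python
from math import factorial as fact
from random import choice as pick

print(fact(5))                       # 120
word = pick(['ideas', 'penguins', 'pandas'])
print(word)                          # one of the three nouns, chosen at random
```

Note that after aliasing only the nickname is defined: factorial(5) on its own would raise a NameError here, just as sqrt(256) did earlier.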