How Python modules and packages work

One of the things I wish I had found a clear explanation of when I was learning Python is how packages and modules work, and what the __init__.py file is for.

When you start out, you usually just dump everything into one script. Although this is great for prototyping stuff, and works fine for programs up to a thousand lines or so, beyond that point your files are just too big to work with easily. You also have to resort to copying and pasting to reuse functions from one program to the next.

Take a simple script, for instance: it defines a function called add_nums and then prints the result of calling it. The code is saved in a file called add_nums.py.
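A minimal sketch of what such a script could look like, written in Python 3 syntax. The function body is an assumption; all the text tells us is that calling add_nums(2, 5) prints 7.

```python
# add_nums.py

def add_nums(a, b):
    """Return the sum of a and b."""
    return a + b

# Top-level code: runs whenever this file is executed (or imported).
print(add_nums(2, 5))
```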

This file on its own is a Python module. If you want to reuse the add_nums function in another program, you can import the script as a module with import add_nums and then call add_nums.add_nums.
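If you want to try the import without creating files by hand, here is a rough sketch that writes a throwaway add_nums.py into a temporary directory and then imports it. The temporary directory and the sys.path tweak are only there to make the example self-contained; in a real project the two files would simply sit next to each other.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a throwaway add_nums.py so this sketch runs on its own.
module_source = (
    "def add_nums(a, b):\n"
    "    return a + b\n"
    "\n"
    "print(add_nums(2, 5))\n"
)
tmp_dir = Path(tempfile.mkdtemp())
(tmp_dir / "add_nums.py").write_text(module_source)

sys.path.insert(0, str(tmp_dir))  # make the directory importable
importlib.invalidate_caches()

import add_nums  # the module's top-level code runs during this import

# The function lives in the module's namespace.
result = add_nums.add_nums(10, 20)
print(result)
```

Notice that the import statement itself triggers the print call inside the module, before our own code gets to call the function.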

But there’s a problem. When a module is imported, all the code at the top level is executed, so print(add_nums(2, 5)) will run and your program will print “7”. There’s a little trick we can use to prevent such unwanted behaviour. Just wrap the top-level code in a main function and only run it if add_nums.py is being run as a script and not imported as a module.

When your script is run directly, using ./add_nums.py or python add_nums.py, the __name__ global variable is set to "__main__". Otherwise it is set to the name of the module. So by wrapping the call to the main function in an if statement that checks __name__, you can make your script behave differently depending on whether it is being run as a script or imported.
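Put together, the guarded version of the script might look something like this sketch (Python 3 syntax; the name main is just the usual convention, not something Python requires):

```python
# add_nums.py

def add_nums(a, b):
    """Return the sum of a and b."""
    return a + b

def main():
    # Code that should only run when the file is used as a script.
    print(add_nums(2, 5))

# __name__ is "__main__" only when this file is run directly,
# so importing the module does not call main().
if __name__ == "__main__":
    main()
```

Running python add_nums.py still prints the result, while import add_nums just makes the function available.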

Ok, so what about packages?

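A package is just a directory with an __init__.py file in it. This file contains code that is run when the package is first imported. A package can also contain other Python modules and even subpackages. (Since Python 3.3, a directory without an __init__.py can also be imported as a namespace package, but a regular package still has one.)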

Let’s imagine we have a package called foo. It is composed of a directory called foo that contains an __init__.py file and another file called bar.py, which holds function definitions.

Ok. Now let’s imagine that __init__.py is empty and there is a function called baz defined in bar.py. People who are just getting started making packages and don’t really understand how they work tend to make an empty __init__.py and then they magically find that they can import their package. Often they are copying what they have seen the Django startapp command do.

To be able to call baz, you have to go through the bar module, for example with import foo.bar followed by foo.bar.baz().
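Here is a small sketch of that situation. It builds the foo package in a temporary directory so that it can be run as-is; the body of baz is made up, since all we know is that bar.py defines a function with that name.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the package in a temporary directory:
#   foo/
#       __init__.py   (empty)
#       bar.py        (defines baz)
root = Path(tempfile.mkdtemp())
package_dir = root / "foo"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")
(package_dir / "bar.py").write_text(
    "def baz():\n"
    "    return 'baz was called'\n"
)

sys.path.insert(0, str(root))  # make the package importable
importlib.invalidate_caches()

# With an empty __init__.py, baz is only reachable through the bar module.
import foo.bar

message = foo.bar.baz()
print(message)
```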

That’s quite ugly. What we really want is to import baz straight from the package, with from foo import baz.

What do we have to do to make that work? We have two options. Either we move the definition of baz into __init__.py or we just import it in __init__.py. We’ll go for the second option. Change __init__.py so that it imports baz from the bar module; in Python 3 that is the single line from .bar import baz.
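As a rough, self-contained sketch of the second option: the script below recreates the foo package in a temporary directory, this time with an __init__.py that re-exports baz. The relative import from .bar import baz is the Python 3 spelling, and the body of baz is again an assumption.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
package_dir = root / "foo"
package_dir.mkdir()

# foo/bar.py: where baz is actually defined (its body is made up).
(package_dir / "bar.py").write_text(
    "def baz():\n"
    "    return 'baz was called'\n"
)

# foo/__init__.py: re-export baz at the package level.
(package_dir / "__init__.py").write_text("from .bar import baz\n")

sys.path.insert(0, str(root))  # make the package importable
importlib.invalidate_caches()
for name in ("foo", "foo.bar"):
    sys.modules.pop(name, None)  # forget any earlier import of foo

from foo import baz  # no mention of bar needed

message = baz()
print(message)
```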

Now, you can import baz from the package directly without referencing the bar module.

So, what kind of stuff should you put in __init__.py? Firstly, anything to do with package initialization, for instance, reading a data file into memory. Remember that when you import a package, everything in __init__.py is executed, so it’s the perfect place for setting things up.
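To make the data file idea a bit more concrete, here is one way it could look. Everything in this sketch is hypothetical: the package name demo_words, the file words.txt and the WORDS variable are invented for illustration, and the package is built in a temporary directory so the example runs on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
package_dir = root / "demo_words"
package_dir.mkdir()

# A small data file that ships inside the package.
(package_dir / "words.txt").write_text("spam\neggs\nham\n")

# demo_words/__init__.py: read the data file once, at import time.
(package_dir / "__init__.py").write_text(
    "from pathlib import Path\n"
    "\n"
    "_data_file = Path(__file__).parent / 'words.txt'\n"
    "WORDS = _data_file.read_text().split()\n"
)

sys.path.insert(0, str(root))  # make the package importable
importlib.invalidate_caches()

import demo_words  # executes __init__.py, which fills in WORDS

print(demo_words.WORDS)
```

Because Python caches imported modules, the file is read only on the first import; later imports of demo_words reuse the already loaded list.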

I also like to use it to import or define anything that makes up the package’s public interface. Although Python doesn’t have the concept of private and public methods like Java has, you should still strive to make your package API as clean as possible. Part of that is making the functions and classes you want people to use easy to access.

That’s it for now. Go make a package!
