List, Dict and Set Comprehensions By Example

One type of syntactic sugar that sets Python apart from more verbose languages is comprehensions. Comprehensions are a special notation for building up lists, dictionaries and sets from other lists, dictionaries and sets, modifying and filtering them in the process.

They allow you to express complicated looping logic in a tiny amount of space.

List Comprehensions

List comprehensions are the best known and most widely used. Let’s start with an example.

A common programming task is to iterate over a list and transform each element in some way, e.g.:
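A minimal sketch of such a loop follows. The list contents and the names numbers and doubled are illustrative assumptions.

```python
# Double every number in a list using an explicit loop.
numbers = [1, 2, 3, 4, 5]
doubled = []
for n in numbers:
    doubled.append(n * 2)

print(doubled)  # [2, 4, 6, 8, 10]
```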

That’s the kind of thing you might do if you were a Java programmer. Luckily for us, though, list comprehensions allow the same idea to be expressed in far fewer lines.

The basic syntax for list comprehensions is this: [EXPRESSION FOR ELEMENT IN SEQUENCE].
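For instance, doubling every number in a list takes a single line. This is only a sketch with an assumed input list.

```python
numbers = [1, 2, 3, 4, 5]

# EXPRESSION is n * 2, ELEMENT is n, SEQUENCE is numbers.
doubled = [n * 2 for n in numbers]

print(doubled)  # [2, 4, 6, 8, 10]
```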

Another common task is to filter a list and create a new list composed of only the elements that pass a certain condition. The next snippet constructs a list of every number from 0 to 9 that has a modulus with 2 of zero, i.e. every even number.
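A filtering comprehension of that kind might look like this (the variable name evens is an assumption):

```python
# Keep only the numbers whose remainder after division by 2 is zero.
evens = [n for n in range(10) if n % 2 == 0]

print(evens)  # [0, 2, 4, 6, 8]
```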

Using an IF-ELSE construct works slightly differently to what you might expect. Instead of putting the ELSE at the end, after the loop, you need to put a conditional expression (Python’s ternary operator, x if y else z) at the front, in the EXPRESSION position.

The following list comprehension generates the squares of even numbers and the cubes of odd numbers in the range 0 to 9.
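One way to write it, with the conditional expression placed before the for clause (the variable name is illustrative):

```python
# Square the even numbers, cube the odd ones.
powers = [n ** 2 if n % 2 == 0 else n ** 3 for n in range(10)]

print(powers)  # [0, 1, 4, 27, 16, 125, 36, 343, 64, 729]
```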

List comprehensions can also be nested inside each other. Here is how we can generate a two-dimensional list, populated with zeros. (I have wrapped the comprehension in pprint to make the output more legible.)
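A sketch of such a nested comprehension, assuming a 3 x 3 grid. The inner comprehension builds one row and the outer one repeats it, so every row is a separate list object.

```python
from pprint import pprint

rows, columns = 3, 3

# Inner comprehension: one row of zeros. Outer comprehension: one row per line.
grid = [[0 for _ in range(columns)] for _ in range(rows)]

pprint(grid)

# Changing one row does not affect the others, because each row is its own list.
grid[0][0] = 1
```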

(As you have probably noticed, it is possible to create list comprehensions that are utterly illegible, so please think about who has to touch your code after you and exercise some restraint.)

On the other hand, the syntax of basic comprehensions might seem complicated to you now, but I promise that with time it will become second nature.

Generator Expressions

A list comprehension creates an entire list in memory. In many cases, that’s what you want because you want to iterate over the list again or otherwise manipulate it after it has been created. In other cases, however, you don’t want the list at all. Generator expressions – described in PEP 289 – were added for this purpose.

Let’s say you want to calculate the sum of the squares of a range of numbers. Without generator expressions, you would do this:

That creates a list in memory just to throw it away once the reference to it is no longer needed, which is wasteful. Generator expressions are essentially a way to define an anonymous generator function and call it, allowing you to ditch the square brackets and write this:
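The two approaches side by side, as a sketch with an assumed range of 1000 numbers:

```python
# List comprehension: builds the whole list of squares, then sums it.
total_from_list = sum([n * n for n in range(1000)])

# Generator expression: squares are produced one at a time and never stored.
total_from_generator = sum(n * n for n in range(1000))

print(total_from_list == total_from_generator)  # True
```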

They are also useful for other aggregate functions like min and max.

The set and dict constructors can take generator expressions too:

Dict Comprehensions

On top of list comprehensions, Python now supports dict comprehensions, which allow you to express the creation of dictionaries at runtime using a similarly concise syntax.

A dictionary comprehension takes the form {key: value for (key, value) in iterable}. This syntax was introduced in Python 3 and backported as far as Python 2.7, so you should be able to use it regardless of which version of Python you have installed.

A canonical example is taking two lists and creating a dictionary where the item at each position in the first list becomes a key and the item at the corresponding position in the second list becomes the value.
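A sketch of that pattern, assuming two short lists of letters and numbers:

```python
keys = ["a", "b", "c", "d", "e"]
values = [1, 2, 3, 4, 5]

# zip pairs up the items at the same position in each list.
mapping = {k: v for (k, v) in zip(keys, values)}

print(mapping)
```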

(On older versions of Python the output looks jumbled up, because dicts had no guaranteed ordering. Since Python 3.7, dicts preserve insertion order.)

The zip function used inside this comprehension returns an iterator of tuples, where each element in the tuple is taken from the same position in each of the input iterables. In the example above, the returned iterator contains the tuples (“a”, 1), (“b”, 2), etc.

Any iterable can be used in a dict comprehension, including strings. The following code might be useful if you wanted to generate a dictionary that stores letter frequencies, for instance.
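For example, the following sketch iterates over the characters of an assumed sample string and counts each one with str.count:

```python
text = "mississippi"

# Each letter becomes a key; repeated letters simply overwrite the same key.
frequencies = {letter: text.count(letter) for letter in text}

print(frequencies)  # {'m': 1, 'i': 4, 's': 4, 'p': 2}
```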

(The code above is just an example of using a string as an iterable inside a comprehension. If you really want to count letter frequencies, you should check out collections.Counter.)

Dict comprehensions can use complex expressions and IF-ELSE constructs too. This one maps the numbers in a specific range to their cubes:

And this one omits cubes that are not divisible by 4:
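Both variants might be written as follows. The range 1 to 5 is an assumption chosen to keep the output short.

```python
# Map each number to its cube.
cubes = {n: n ** 3 for n in range(1, 6)}

# Same mapping, but keep only the cubes that are divisible by 4.
cubes_divisible_by_4 = {n: n ** 3 for n in range(1, 6) if n ** 3 % 4 == 0}

print(cubes)                 # {1: 1, 2: 8, 3: 27, 4: 64, 5: 125}
print(cubes_divisible_by_4)  # {2: 8, 4: 64}
```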

Set Comprehensions

A set is an unordered collection of elements in which each element can only appear once. Although sets have existed in Python since 2.4, Python 3 introduced the set literal syntax.

Python 3 also introduced set comprehensions.

Prior to this, you could use the set built-in function.

The syntax for set comprehensions is almost identical to that of list comprehensions, but it uses curly brackets instead of square brackets. The pattern is {EXPRESSION FOR ELEMENT IN SEQUENCE}.

The result of a set comprehension is the same as passing the output of the equivalent list comprehension to the set function.
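A small sketch of that equivalence, using an assumed list of words:

```python
words = ["apple", "Avocado", "banana", "blueberry", "cherry"]

# Set comprehension: curly brackets, duplicates collapse automatically.
first_letters = {word[0].lower() for word in words}

# Equivalent: build a list first, then pass it to set().
same_thing = set([word[0].lower() for word in words])

print(first_letters == same_thing)  # True
```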

That’s it for the theory. Now let’s dissect some examples of comprehensions.

Examples

List of files with the .png extension

The os module contains a function called listdir that returns a list of filenames in a given directory. We can use the endswith method on the strings to filter the list of files.
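A possible implementation is sketched below. The function name list_png_files is an assumption, and the demo builds a throwaway directory so that it runs anywhere.

```python
import os
import tempfile


def list_png_files(directory):
    """Return the names of the entries in `directory` that end in .png."""
    return [name for name in os.listdir(directory) if name.endswith(".png")]


# Demo: create three empty files in a temporary directory and filter them.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.png", "notes.txt", "c.png"):
        open(os.path.join(tmp, name), "w").close()
    found = sorted(list_png_files(tmp))  # os.listdir order is arbitrary

print(found)  # ['a.png', 'c.png']
```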

Here it is in usage:

Merge two dictionaries

Merging two dictionaries together can be achieved easily in a dict comprehension:
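One way to write merge_dicts with a double loop inside a dict comprehension is sketched here. When a key appears in both inputs, the value from the second dictionary wins.

```python
def merge_dicts(d1, d2):
    """Return a new dict with the items of d1 and d2; d2 wins on clashes."""
    return {key: value for d in (d1, d2) for key, value in d.items()}


merged = merge_dicts({"a": 1, "b": 2}, {"b": 3, "c": 4})
print(merged)  # {'a': 1, 'b': 3, 'c': 4}
```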

Here is merge_dicts in action:

Sieve of Eratosthenes

The Sieve of Eratosthenes is an ancient algorithm for finding prime numbers. You might remember it from school. It works like this:

  • Starting at 2, which is the first prime number, exclude all multiples of 2 up to n.
  • Move on to 3. Exclude all multiples of 3 up to n.
  • Keep going like that until you reach n.

And here’s the code:
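One way to write the sieve with set comprehensions is sketched below. The function name eratosthenes and the choice to treat n as inclusive are assumptions made for this illustration.

```python
def eratosthenes(n):
    """Return the set of prime numbers up to and including n."""
    # Every multiple of i (starting at 2 * i, stepping by i) is not prime.
    not_primes = {j for i in range(2, n + 1) for j in range(i * 2, n + 1, i)}
    # Whatever is left between 2 and n is prime.
    return {i for i in range(2, n + 1) if i not in not_primes}


print(sorted(eratosthenes(20)))  # [2, 3, 5, 7, 11, 13, 17, 19]
```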

The first thing to note about the function is the use of a double loop in the first set comprehension. Contrary to what you might expect, the leftmost loop is the outer loop and the rightmost loop is the inner loop. The pattern for double loops in list comprehensions is [x for b in a for x in b].

In case you hadn’t seen it before, the third argument in the rightmost call to range represents the step size.

It would be possible to use a list comprehension for this algorithm, but the not_primes list would be filled with duplicates. It is better to use the automatic deduplication behaviour of the set to avoid that.

Exercises

I’ve included some exercises to help you solidify your new knowledge of comprehensions.

1. Write a function called generate_matrix that takes two positional arguments – m and n – and a keyword argument default that specifies the value for each position. It should use a nested list comprehension to generate a list of lists with the given dimensions. If default is provided, each position should have the given value, otherwise the matrix should be populated with zeroes.

2. Write a function called initcap that replicates the functionality of the string.title method, except better. Given a string, it should split the string on whitespace, capitalize each element of the resulting list and join them back into a string. Your implementation should use a list comprehension.

3. Write a function called make_mapping that takes two lists of equal length and returns a dictionary that maps the values in the first list to the values in the second. The function should also take an optional keyword argument called exclude, which expects a list. Values in the list passed as exclude should be omitted as keys in the resulting dictionary.

4. Write a function called compress_dict_keys that takes a dictionary with string keys and returns a new dictionary with the vowels removed from the keys. For instance, the dictionary {"foo": 1, "bar": 2} should be transformed into {"f": 1, "br": 2}. The function should use a list comprehension nested inside a dict comprehension.

5. Write a function called dedup_surnames that takes a list of surnames and returns a set of surnames with the case normalized to uppercase. For instance, the list ["smith", "Jones", "Smith", "BROWN"] should be transformed into the set {"SMITH", "JONES", "BROWN"}.

Solutions

1. Nest two list comprehensions to generate a 2D list with m rows and n columns. Use default for the value in each position in the inner comprehension.

2. Disassemble the sentence passed into the function using split, then call capitalize on each word, then use join to reassemble the sentence.

3. Join the two lists a and b using zip, then use the zipped lists in the dictionary comprehension.

4. Iterate over the key-value pairs from the passed-in dictionary and, for each key, remove the vowels using a comprehension with an IF construct.

5. Use the set comprehension syntax (with curly brackets) to iterate over the given list and call upper on each name in it. The deduplication will happen automatically due to the nature of the set data structure.
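The five solutions described above could be sketched as follows. These are one possible reading of each exercise, not the only valid answers. Details such as treating uppercase vowels as vowels are assumptions.

```python
def generate_matrix(m, n, default=0):
    """Return a list of m rows, each a list of n copies of default."""
    return [[default for _ in range(n)] for _ in range(m)]


def initcap(sentence):
    """Capitalize every whitespace-separated word in sentence."""
    return " ".join([word.capitalize() for word in sentence.split()])


def make_mapping(a, b, exclude=None):
    """Map items of a to items of b, skipping keys listed in exclude."""
    excluded = set(exclude or [])
    return {key: value for key, value in zip(a, b) if key not in excluded}


def compress_dict_keys(d):
    """Return a copy of d with the vowels removed from every key."""
    vowels = "aeiouAEIOU"
    return {
        "".join([char for char in key if char not in vowels]): value
        for key, value in d.items()
    }


def dedup_surnames(names):
    """Return the set of surnames, normalized to uppercase."""
    return {name.upper() for name in names}
```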

I’ll leave it there for now. If you’ve worked your way through this post and given the exercises a good try, you should be ready to use comprehensions in your own code.

If you’ve got any questions or other remarks, let me know in the comments.

How exactly do context managers work?

Context managers (PEP 343) are pretty important in Python. You probably use one every time you open a file:

But how well do you understand what’s going on behind the scenes?

Context manager classes

It’s actually quite simple. A context manager is a class that implements an __enter__ and an __exit__ method.

Let’s imagine you want to print a line of text to the console surrounded with asterisks. Here’s a context manager to do it:

The __exit__ method takes three arguments apart from self. Those arguments contain information about any errors that occurred inside the with block.
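A minimal sketch of such a class is shown below. The width of 32 characters is an assumption. Because __exit__ returns None, any exception raised inside the with block is not suppressed.

```python
class asterisks:
    """Print a row of asterisks before and after the with block."""

    def __enter__(self):
        print("*" * 32)

    def __exit__(self, exc_type, exc_value, traceback):
        # The three arguments are all None if the block finished without errors.
        print("*" * 32)


with asterisks():
    print("Context managers rock!")
```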

You can use asterisks in the same way as any of the built-in context managers:

Accessing the context inside the with block

If you need to get something back and use it inside the with block – such as a file descriptor – you simply return it from __enter__:

myopen works identically to the built-in open:
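A sketch of what such a myopen class and its use might look like. The constructor arguments and the temporary file used for the demo are assumptions.

```python
import os
import tempfile


class myopen:
    """A bare-bones imitation of using open() in a with statement."""

    def __init__(self, path, mode="r"):
        self.path = path
        self.mode = mode

    def __enter__(self):
        self.file = open(self.path, self.mode)
        return self.file  # this is what "as f" receives

    def __exit__(self, exc_type, exc_value, traceback):
        self.file.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with myopen(path, "w") as f:
        f.write("hello")
    with myopen(path) as f:
        contents = f.read()

print(contents)  # hello
```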

The contextmanager decorator

Thankfully, you don’t have to implement a class every time. The contextlib package has a contextmanager decorator that you can apply to generators to automatically transform them into context managers:

The code before yield corresponds to __enter__ and the code after yield corresponds to __exit__. A context manager generator should have exactly one yield in it.
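A generator-based version of an asterisk-printing context manager might look like this. It is a sketch: without a try/finally around the yield, the closing row is skipped if the with block raises.

```python
from contextlib import contextmanager


@contextmanager
def asterisks():
    print("*" * 32)  # runs on entry, like __enter__
    yield
    print("*" * 32)  # runs on exit, like __exit__


with asterisks():
    print("Context managers rock!")
```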

It works the same as the class version:

Roll your own contextmanager decorator

The implementation in contextlib is complicated, but it’s not hard to write something that works similarly with the exception of a few edge cases:

It’s not as robust as the real implementation, but it should be understandable. Here are the key points:

  • The inner function instantiates a copy of the nested CMWrapper class with a handle on the generator passed into the decorator.
  • __enter__ calls next() on the generator and returns the yielded value so it can be used in the with block.
  • __exit__ calls next() again and catches the StopIteration exception that the generator throws when it finishes.
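The key points above can be sketched as follows. The decorator name simple_contextmanager and the tracked demo generator are hypothetical. This version does not forward exceptions into the generator the way contextlib does.

```python
def simple_contextmanager(generator_function):
    class CMWrapper:
        def __init__(self, generator):
            self.generator = generator

        def __enter__(self):
            # Run the generator up to its yield and hand back the yielded value.
            return next(self.generator)

        def __exit__(self, exc_type, exc_value, traceback):
            # Resume after the yield; a finished generator raises StopIteration.
            try:
                next(self.generator)
            except StopIteration:
                pass
            return False  # never suppress exceptions from the with block

    def inner(*args, **kwargs):
        return CMWrapper(generator_function(*args, **kwargs))

    return inner


events = []


@simple_contextmanager
def tracked(name):
    events.append("enter " + name)
    yield name.upper()
    events.append("exit " + name)


with tracked("demo") as value:
    events.append("body " + value)

print(events)  # ['enter demo', 'body DEMO', 'exit demo']
```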

That’s it for now. If you want to learn more about context managers, I recommend you take a look at the code for contextlib.

How variable scope works in Python

Someone asked me to take a look at a piece of code recently and tell him why it wasn’t working. The problem was that he didn’t really understand Python variable scoping. That’s what I’m going to talk about today. It is quite basic, but you really need to have it down cold, and there are a few surprises in there too.

What you need to know

A variable in Python is defined when you assign something to it. You don’t declare it beforehand, like you can in C. You just start using it.

Any variable you declare at the top level of a file or module is in global scope. You can access it inside functions.

Before I go on I need to add a disclaimer: global variables are almost always a bad idea. Yes, sometimes you need them, but you almost always don’t. A good rule of thumb is that a variable should have the narrowest scope it needs to do its job. There’s a good discussion of global variables and the associated issues here.

Modifying the value of a global variable is less simple. Take a look at this example.

What happened? Why is the value of x 123 for the second print statement? It turns out that when we assigned the value 321 to x inside foo we actually declared a new variable called x in the local scope of that function. That x has absolutely no relation to the x in global scope. When the function ends, that variable with the value of 321 does not exist anymore.

To get the desired effect, we have to use the global keyword.
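Both behaviours can be seen side by side in this sketch. The function name bar and the helper variables are assumptions added for illustration.

```python
x = 123


def foo():
    x = 321  # creates a new local x; the global x is untouched
    return x


def bar():
    global x  # tells Python that assignments to x refer to the global
    x = 321
    return x


foo()
x_after_foo = x  # still 123
bar()
x_after_bar = x  # now 321

print(x_after_foo, x_after_bar)  # 123 321
```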

That’s more like it.

There is one more scope we have to worry about: the enclosing scope created by declaring one function inside another one. Watch.

What if you want to modify the value of x declared in the outer function? You’ll run into the same problem that made us use global. But we don’t want to use global here. x is not a global variable. It is in the local scope of a function.

Python 3 introduced the nonlocal keyword for this exact situation. I wrote a post about it on this page, but I’ll show you a quick example now.
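A small sketch of nonlocal in action, using a hypothetical counter factory:

```python
def make_counter():
    count = 0  # lives in the enclosing scope of increment

    def increment():
        nonlocal count  # assign to the enclosing count, not a new local
        count += 1
        return count

    return increment


counter = make_counter()
first = counter()
second = counter()
print(first, second)  # 1 2
```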

A simple way to remember Python scoping rules

In the book Learning Python, Mark Lutz suggests the following mnemonic for remembering how Python scoping works: LEGB

Going from the narrowest scope to the widest scope:

  • L stands for “Local”. It refers to variables that are defined in the local scope of functions.
  • E stands for “Enclosing”. It refers to variables defined in the local scope of functions wrapping other functions.
  • G stands for “Global”. These are the variables defined at the top level of files and modules.
  • B stands for “Built in”. These are the names that are loaded into scope when the interpreter starts up. You can look at them here: https://docs.python.org/3.5/library/functions.html
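The lookup order can be illustrated with a short sketch. All the names below are hypothetical and exist only to show which scope wins.

```python
name = "global"  # G: defined at the top level of the module


def outer():
    name = "enclosing"  # E: local to outer, enclosing for the inner functions

    def inner_with_local():
        name = "local"  # L: found first
        return name

    def inner_without_local():
        return name  # no local name, so the enclosing one is used

    return inner_with_local(), inner_without_local()


def uses_global():
    return name  # no local or enclosing name, so the global one is used


def uses_builtin():
    return len("abc")  # len is found last, in the built-in scope


print(outer(), uses_global(), uses_builtin())
```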

And that is everything you need to learn about this topic for the vast majority of Python programming tasks.

Comparing files in Python using difflib

Everybody knows about the diff command in Linux, but not everybody knows that the Python standard library contains a module, difflib, that provides similar functionality.

A basic diff utility

First, let’s see what a minimal diff implementation using difflib might look like:

The context_diff function takes two sequences of strings – here provided by readlines – and optional fromfile and tofile keyword arguments, and returns a generator that yields strings in the “context diff” format, which is a way of showing changes plus a few neighbouring lines for context.
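A minimal sketch of calling context_diff follows. To keep it self-contained it uses in-memory lists of lines instead of files read with readlines. The sample lines and the file names passed as labels are assumptions.

```python
import difflib

old_lines = ["one\n", "two\n", "three\n"]
new_lines = ["one\n", "2\n", "three\n"]

# context_diff returns a generator of strings, so collect it into a list.
diff = list(
    difflib.context_diff(
        old_lines, new_lines, fromfile="file1.txt", tofile="file2.txt"
    )
)

print("".join(diff), end="")
```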

The library also supports other diff formats, such as ndiff.

Let’s use the utility to compare two versions of F. Scott Fitzgerald’s famous conclusion to The Great Gatsby.

The exclamation marks (!) denote the lines with changes on them. file1.txt is of course the version we know and love.

Fuzzy matches

That’s not all difflib can do. It also lets you check for “close enough” matches between text sequences.
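Two entry points for this are get_close_matches and SequenceMatcher. The sketch below uses an assumed list of candidate words.

```python
import difflib

candidates = ["ape", "apple", "peach", "puppy"]

# Returns the best "good enough" matches, most similar first.
matches = difflib.get_close_matches("appel", candidates)

# ratio() gives a similarity score between 0 and 1 for two sequences.
ratio = difflib.SequenceMatcher(None, "appel", "apple").ratio()

print(matches, ratio)
```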

When I first saw this, I immediately thought “Levenshtein Distance”, but it actually uses a different algorithm. Here’s what the documentation says about it:

The basic algorithm predates, and is a little fancier than, an algorithm published in the late 1980’s by Ratcliff and Obershelp under the hyperbolic name “gestalt pattern matching”. The basic idea is to find the longest contiguous matching subsequence that contains no “junk” elements (R-O doesn’t address junk). The same idea is then applied recursively to the pieces of the sequences to the left and to the right of the matching subsequence. This does not yield minimal edit sequences, but does tend to yield matches that “look right” to people.

HTML diffs

The module includes a class called HtmlDiff that can be used to generate diff tables for files. This would be useful, for instance, for building a front end to a code review tool. This is the coolest thing in the module, in my opinion.

The class also has a method called make_file that outputs an entire HTML file, not just the table.
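A sketch of both methods, using assumed in-memory lines and descriptions rather than real files:

```python
import difflib

old_lines = ["one\n", "two\n", "three\n"]
new_lines = ["one\n", "2\n", "three\n"]

differ = difflib.HtmlDiff()

# make_table returns just the <table> markup for embedding in a page.
table = differ.make_table(
    old_lines, new_lines, fromdesc="file1.txt", todesc="file2.txt"
)

# make_file returns a complete HTML document containing the same table.
page = differ.make_file(
    old_lines, new_lines, fromdesc="file1.txt", todesc="file2.txt"
)

print(len(table), len(page))
```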

Here is what the rendered table looks like:

[Image: difflib_html]

Go forth and diff!

There are a few other subtleties, but I have covered the main functionality in this post. Check out the official documentation for difflib here.

The bool function in Python

Python’s built-in bool function comes in pretty handy for checking the truth and falsity of different types of values.

First, let’s take a look at how True and False are represented in the language.

True and False are numeric values

In Python internals, True is stored as a 1 and False is stored as a 0. They can be used in numeric expressions, like so:

They can even be compared to their internal representation successfully.

However, this is just a numeric comparison, not a check of truthiness, so the following comparison returns False:
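The behaviour described in this section can be checked with a few lines. The variable names are illustrative.

```python
# True and False behave like the integers 1 and 0 in arithmetic.
total = True + True          # 2
scaled = False * 10 + True   # 1

# They compare equal to their numeric counterparts.
true_is_one = (True == 1)    # True
false_is_zero = (False == 0) # True

# A numeric comparison, not a truthiness check: 5 is not equal to 1.
five_equals_true = (5 == True)  # False

print(total, scaled, true_is_one, false_is_zero, five_equals_true)
```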

bool to the rescue

The number 5 would normally be considered to be a truthy value. To get at its inherent truthiness, we can run it through the bool function.

The following are always considered false:

  • None
  • False
  • Any numeric zero: 0, 0.0, 0j
  • Empty sequences: "", (), []
  • Empty dictionaries: {}
  • Instances of classes whose __bool__() method returns False, or whose __len__() method returns 0 when __bool__() is not defined.

Everything else is considered true.
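A quick sketch that runs the values listed above through bool. The EmptyBox class is hypothetical.

```python
class EmptyBox:
    """A container that always reports a length of zero."""

    def __len__(self):
        return 0


falsy_values = [None, False, 0, 0.0, 0j, "", (), [], {}, EmptyBox()]
truthy_values = [5, -1, "0", [0], {"a": 1}, object()]

all_falsy = all(bool(value) is False for value in falsy_values)
all_truthy = all(bool(value) is True for value in truthy_values)

print(bool(5), all_falsy, all_truthy)  # True True True
```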

Python Regular Expression Basics

Regular expressions are one of those topics that confuse even advanced programmers. Many people have programmed professionally for years without getting to grips with them. Too often, people just copy and paste regexes from Stack Overflow or other websites without really understanding what’s going on. In this article, I’m going to explain regular expressions from scratch and introduce you to Python’s implementation of them in the re module.

Regular expressions describe sets of strings

A regular expression is a description of a set of strings. Regular expression matching is a method of finding out if a given string is in the set defined by a certain regular expression. Regular expression search is a method of finding occurrences of strings belonging to that set inside a larger string. Python’s re module provides facilities for search, matching and replacing matched substrings with something else.

The simplest regular expression is just a sequence of ordinary characters. Ordinary characters are those characters that do not have a special meaning in the regular expression syntax.

The re.match function returns a match object if the text matches the pattern. Otherwise it returns None.

Notice how I put the r prefix before the pattern string. Sometimes the regular expression syntax involves backslash-escaped characters that coincide with escape sequences. To prevent those portions of the string from being interpreted as escape sequences, we use the raw r prefix. We don’t actually need it for this pattern, but I always put it in for consistency.

To search in a larger string, we use re.search.
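A short sketch of both functions with assumed sample strings. Note that re.match only looks for a match at the beginning of the text, whereas re.search scans the whole string.

```python
import re

# The text matches the pattern from its first character: a match object.
matched = re.match(r"cheese", "cheese")

# The pattern does not occur at the start of the text: None.
not_matched = re.match(r"cheese", "I like cheese")

# re.search finds the pattern anywhere in the text.
found = re.search(r"cheese", "I like cheese")

print(matched is not None, not_matched, found.span())  # True None (7, 13)
```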

So far this is not very useful. We are just matching strings against other strings, which can be achieved more easily with == and in.

However, regular expressions really come into their own when we start using sets of characters and repetitions.

Sets of characters and repetitions

Let’s say we don’t just want to match the string "cheese", but any string of lowercase alphabetic characters that is six characters long. In that case we can use the pattern "[a-z]{6}". The bit in the square brackets – [a-z] – means that we should match any lowercase alphabetic character from a to z. The bit in the curly brackets – {6} – means that the match should repeat six times.

The dot character . matches any character except a newline.

By the way, if you want to match the dot character itself, you will have to escape it. Special characters can be escaped and made to match their ordinary equivalents by putting a backslash \ before them.

Any other restricted set of characters can be defined, such as the set of all digits – [0-9] – and the set of all alphanumeric characters – [a-zA-Z0-9]. There are some shorthand ways of specifying common sets of characters too. For instance, \w is equivalent to the set [a-zA-Z0-9_], i.e. the set of every alphanumeric character and the underscore. (In Python 3 this holds for str patterns when the re.ASCII flag is set; by default \w also matches Unicode word characters.)
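The patterns from this section can be tried out with re.fullmatch, which only succeeds if the whole string matches. The sample strings are assumptions.

```python
import re

# Exactly six lowercase letters.
six_lower = re.fullmatch(r"[a-z]{6}", "cheese")       # match
six_lower_fail = re.fullmatch(r"[a-z]{6}", "Cheese")  # None: capital C

# The dot matches any character except a newline.
dot = re.fullmatch(r"c.t", "cat")           # match
dot_newline = re.fullmatch(r"c.t", "c\nt")  # None

# An escaped dot matches only a literal dot.
literal_dot = re.fullmatch(r"3\.14", "3.14")       # match
literal_dot_fail = re.fullmatch(r"3\.14", "3x14")  # None

# \w covers letters, digits and the underscore.
word_chars = re.fullmatch(r"\w+", "snake_case_42")  # match
```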

For a full list of the special character classes supported by re, use the help function.

* and +

So far we have learned how to match a set of characters a specific number of times, but what if we want to match it an indeterminate number of times? That’s where * and + come in.

* is known as the Kleene star, after Stephen Kleene, who invented this notation to describe finite state automata. It means “match the previous character or set of characters zero or more times”.

For instance, the regex a* matches the following strings:

  • ""
  • "a"
  • "aa"
  • "aaa"
  • etc.

The + character, on the other hand, means “match the previous character or set of characters one or more times”.

So, the regex b+ matches the following strings.

  • "b"
  • "bb"
  • "bbb"
  • etc.

Unlike the previous pattern, this one does not match the empty string.
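A quick check of both quantifiers using re.fullmatch:

```python
import re

# a* accepts zero or more repetitions, so the empty string matches.
star_matches = [bool(re.fullmatch(r"a*", s)) for s in ["", "a", "aa", "aaa"]]

# b+ needs at least one repetition, so the empty string does not match.
plus_matches = [bool(re.fullmatch(r"b+", s)) for s in ["", "b", "bb", "bbb"]]

print(star_matches)  # [True, True, True, True]
print(plus_matches)  # [False, True, True, True]
```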

Repeating matches between x and y times

We’ve seen how to match characters either a definite number of times or an unlimited number of times, but we can also restrict the length of the match using the {x,y} syntax (written without a space after the comma), where x is the lower limit and y is the upper limit.

The pattern a{3,5} will match strings composed of the character a repeated between three and five times.

The strings below match the pattern:

  • "aaa"
  • "aaaa"
  • "aaaaa"

However, the strings "aa" and "aaaaaa" do not match.
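The same cases, verified with re.fullmatch in a short sketch:

```python
import re

pattern = r"a{3,5}"

# Map each test string to whether the whole string matches the pattern.
results = {
    s: bool(re.fullmatch(pattern, s))
    for s in ["aa", "aaa", "aaaa", "aaaaa", "aaaaaa"]
}

print(results)
```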

Excluding characters

Until now, we’ve been defining sets of characters by the characters that are included in them, but we can also define sets of excluded characters. That is done using the caret ^ character _inside_ the square brackets.
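For example (a sketch with assumed test strings):

```python
import re

pattern = r"[^abc]+"

no_abc = re.fullmatch(pattern, "xyz")    # match: none of a, b or c
contains_a = re.fullmatch(pattern, "xaz")  # None: contains an a
empty = re.fullmatch(pattern, "")        # None: + needs at least one character

print(bool(no_abc), contains_a, empty)  # True None None
```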

In the above example, the pattern [^abc]+ matches any string of length one or more that does not contain the characters a, b or c.

Matching the start and end of strings

Regular expressions also support another feature called “anchors”. The caret ^ and the dollar sign $ are the two most common anchors, used to match the start and end of strings respectively. This feature is relevant for searching within strings rather than matching the whole string.

Consider the following example:
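The sketch below uses an assumed search string that both starts and ends with "cheese".

```python
import re

text = "cheesecake and grilled cheese"

# ^ anchors the pattern to the start of the string.
at_start = re.search(r"^cheese", text)

# $ anchors the pattern to the end of the string.
at_end = re.search(r"cheese$", text)

print(at_start.span(), at_end.span())
```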

The first pattern – ^cheese – matches the first occurrence of the substring "cheese" within the search string. The second pattern – cheese$ – matches the second occurrence.

By using the two anchors together, we can match the whole string. Here is a pattern that will match any string starting and ending with "cheese".

Matching this or that

Sometimes we want to build a pattern that says “match this string, or match that string”. For that, we use the pipe | character.

We can confine it to a certain region using round brackets.
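Both forms in a short sketch. The sample sentences are assumptions.

```python
import re

# Without brackets, the pipe splits the whole pattern into two alternatives.
either = re.fullmatch(r"cat|dog", "dog")  # match

# With round brackets, the alternatives are confined to one region.
pattern = r"I like (cats|dogs)"
likes_cats = re.fullmatch(pattern, "I like cats")    # match
likes_dogs = re.fullmatch(pattern, "I like dogs")    # match
likes_birds = re.fullmatch(pattern, "I like birds")  # None

print(bool(either), bool(likes_cats), bool(likes_dogs), likes_birds)
```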

Optional items

What if we wanted to match, say, both the American and English spellings of the word “harbour”? The American version has no “u” in it. Here’s where the optional character ? comes in useful.

The ? in the pattern matches the preceding character u zero or one times.
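A sketch showing that ? makes the preceding item optional, so it may appear once or not at all:

```python
import re

pattern = r"harbou?r"

english = re.fullmatch(pattern, "harbour")     # match: one u
american = re.fullmatch(pattern, "harbor")     # match: no u
too_many = re.fullmatch(pattern, "harbouur")   # None: two u's

print(bool(english), bool(american), too_many)  # True True None
```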

Groups and named groups

Parts of a regex pattern bounded by round brackets are called “groups”.

These groups are numbered and can be accessed using indexes, but it is also possible to create named groups. These are accessible by name rather than just by an index.
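Both kinds of group are sketched below with an assumed date string. Named groups use the (?P<name>...) syntax.

```python
import re

date = "2016-03-09"

# Numbered groups: group(0) is the whole match, group(1) the first bracket.
numbered = re.match(r"(\d{4})-(\d{2})-(\d{2})", date)
year_by_index = numbered.group(1)
all_groups = numbered.groups()

# Named groups: the same pattern, but each group has a label.
named = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})", date)
year_by_name = named.group("year")
as_dict = named.groupdict()

print(all_groups, as_dict)
```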

Greedy and non-greedy matching

The normal way for regex searches to work is greedily, i.e. matching as much of the search string as possible. Here’s an example.

The pattern <.*> matched the whole string, right up to the second occurrence of >. However, if we only want to match the first <h1> tag, we can use the non-greedy qualifier *? that matches as little text as possible.
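The difference between the greedy * and the non-greedy *? can be seen in this sketch, which uses an assumed snippet of HTML:

```python
import re

html = "<h1>Title</h1>"

# Greedy: .* grabs as much as it can, so the match runs to the last >.
greedy = re.match(r"<.*>", html).group()

# Non-greedy: .*? grabs as little as it can, so the match stops at the first >.
non_greedy = re.match(r"<.*?>", html).group()

print(greedy)      # <h1>Title</h1>
print(non_greedy)  # <h1>
```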

Now we’re only matching the first tag.

The end

We certainly haven’t covered everything there is to know about regular expressions in this post, but we’ve covered enough to decipher the vast majority of patterns found in the wild, and to invent our own without falling back on cargo-cult copying and pasting.

However, there’s no need to reinvent the wheel, so if you find a good regex that does what you need, you may as well swipe it. Before you do though, test it out with a tool like Regexr or similar.

Python descriptors made simple

Descriptors, introduced in Python 2.2, provide a way to add managed attributes to objects. They are not used much in everyday programming, but it’s important to learn them to understand a lot of the “magic” that happens in the standard library and third-party packages.

The problem

Imagine we are running a bookshop with an inventory management system written in Python. The system contains a class called Book that captures the author, title and price of physical books.

Our simple Book class works fine for a while, but eventually bad data starts to creep into the system. The system is full of books with negative prices or prices that are too high because of data entry errors. We decide that we want to limit book prices to values between 0 and 100. In addition, the system contains a Magazine class that suffers from the same problem, so we want our solution to be easily reusable.


The descriptor protocol

The descriptor protocol is simply a set of methods a class must implement to qualify as a descriptor. There are three of them:

  • __get__(self, instance, owner)
  • __set__(self, instance, value)
  • __delete__(self, instance)

__get__ accesses a value stored in the object and returns it.

__set__ sets a value stored in the object and returns nothing.

__delete__ deletes a value stored in the object and returns nothing.

Using these methods, we can write a descriptor called Price that limits the value stored in it to between 0 and 100.
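A sketch of what such a descriptor might look like, together with a Book class that uses it. Raising ValueError for out-of-range prices and returning 0 for an unset price are assumptions, as are the exact constructor arguments of Book.

```python
from weakref import WeakKeyDictionary


class Price:
    """Descriptor that only accepts values between 0 and 100."""

    def __init__(self):
        # Maps each owning instance to its own price.
        self.values = WeakKeyDictionary()

    def __get__(self, instance, owner):
        if instance is None:
            return self  # accessed on the class, e.g. Book.price
        return self.values.get(instance, 0)

    def __set__(self, instance, value):
        if value < 0 or value > 100:
            raise ValueError("Price must be between 0 and 100.")
        self.values[instance] = value

    def __delete__(self, instance):
        del self.values[instance]


class Book:
    price = Price()  # class attribute, shared descriptor object

    def __init__(self, author, title, price):
        self.author = author
        self.title = title
        self.price = price  # goes through Price.__set__


book = Book("William Faulkner", "The Sound and the Fury", 12)
print(book.price)  # 12
```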

A few details in the implementation of Price deserve mentioning.

An instance of a descriptor must be added to a class as a class attribute, not as an instance attribute. Therefore, to store different data for each instance, the descriptor needs to maintain a dictionary that maps instances to instance-specific values. In the implementation of Price, that dictionary is self.values.

A normal Python dictionary stores references to objects it uses as keys. Those references by themselves are enough to prevent the object from being garbage collected. To prevent Book instances from hanging around after we are finished with them, we use the WeakKeyDictionary from the weakref standard module. Once the last strong reference to the instance passes away, the associated key-value pair will be discarded.

Using descriptors

As we saw in the last section, descriptors are linked to classes, not to instances, so to add a descriptor to the Book class, we must add it as a class variable.

The price constraint for books is now enforced.

How descriptors are accessed

So far we’ve managed to implement a working descriptor that manages the price attribute on our Book class, but how it works might not be clear. It all feels a bit too magical, but not to worry. It turns out that descriptor access is quite simple:

  • When we try to evaluate b.price and retrieve the value, Python recognizes that price is a descriptor and calls Book.price.__get__.
  • When we try to change the value of the price attribute, e.g. b.price = 23 , Python again recognizes that price is a descriptor and substitutes the assignment with a call to Book.price.__set__.
  • And when we try to delete the price attribute stored against an instance of Book, Python automatically interprets that as a call to Book.price.__delete__.

The number 1 descriptor gotcha

Unless we fully understand the fact that descriptors are linked to classes and not to instances, and therefore need to maintain their own mapping of instances to instance-specific values, we might be tempted to write the Price descriptor as follows:

But once we start instantiating multiple Book instances, we’re going to have a problem.
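The problem can be reproduced with this sketch of the naive version. The class name SharedPrice is hypothetical, chosen to make the flaw obvious.

```python
class SharedPrice:
    """Naive descriptor: stores one value for the whole class."""

    def __init__(self):
        self.value = 0

    def __get__(self, instance, owner):
        return self.value

    def __set__(self, instance, value):
        if value < 0 or value > 100:
            raise ValueError("Price must be between 0 and 100.")
        self.value = value  # overwrites the value seen by every instance


class Book:
    price = SharedPrice()

    def __init__(self, author, title, price):
        self.author = author
        self.title = title
        self.price = price


b1 = Book("William Faulkner", "The Sound and the Fury", 12)
b2 = Book("John Dos Passos", "Manhattan Transfer", 13)

print(b1.price)  # 13, not 12: creating b2 changed the price of b1
```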

The key is to understand that there is only one instance of Price for Book, so every time the value in the descriptor is changed, it changes for all instances. That behaviour in itself is useful for creating managed class attributes, but it is not what we want in this case. To store separate instance-specific values, we need to use the WeakKeyDictionary.

The property built-in function

Another way of building descriptors is to use the property built-in function. Here is the function signature:

fget, fset and fdel are methods to get, set and delete attributes, respectively. doc is a docstring.

Instead of defining a single class-level descriptor object that manages instance-specific values, property works by combining instance methods from the class. Here is a simple example of a Publisher class from our inventory system with a managed name property. Each method passed into property has a print statement to illustrate when it is called.

If we make an instance of Publisher and access the name attribute, we can see the appropriate methods being called.
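A sketch of such a Publisher class follows. The signature in the comment is the one given in the Python documentation. The method names and the sample publisher names are assumptions.

```python
# Signature: property(fget=None, fset=None, fdel=None, doc=None)


class Publisher:
    def __init__(self, name):
        self._name = name  # stored directly, so the setter is not called here

    def get_name(self):
        print("getting name")
        return self._name

    def set_name(self, value):
        print("setting name")
        self._name = value

    def delete_name(self):
        print("deleting name")
        del self._name

    name = property(get_name, set_name, delete_name, "Publisher name")


publisher = Publisher("Random House")
current = publisher.name       # prints "getting name"
publisher.name = "Hachette"    # prints "setting name"
renamed = publisher.name       # prints "getting name"
del publisher.name             # prints "deleting name"
```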

That’s it for this basic introduction to descriptors. If you want a challenge, take what you have learned and try to reimplement the @property decorator. There is enough information in this post to allow you to figure it out.

A quick guide to nonlocal in Python 3

Python 3 introduced the nonlocal keyword that allows you to assign to variables in an outer, but non-global, scope. An example will illustrate what I mean.

msg is declared in the outside function and assigned the value "Outside!". Then, in the inside function, the value "Inside!" is assigned to it. When we run outside, msg has the value "Inside!" in the inside function, but retains the old value in the outside function.

We see this behaviour because Python hasn’t actually assigned to the existing msg variable, but has created a new variable called msg in the local scope of inside that shadows the name of the variable in the outer scope.

Preventing that behaviour is where the nonlocal keyword comes in.

Now, by adding nonlocal msg to the top of inside, Python knows that when it sees an assignment to msg, it should assign to the variable from the outer scope instead of declaring a new variable that shadows its name.
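The two versions can be compared in one sketch. The function names outside_shadowing and outside_nonlocal are assumptions, and each returns the value of msg as seen inside and outside.

```python
def outside_shadowing():
    msg = "Outside!"

    def inside():
        msg = "Inside!"  # new local variable that shadows the outer msg
        return msg

    inner_msg = inside()
    return inner_msg, msg


def outside_nonlocal():
    msg = "Outside!"

    def inside():
        nonlocal msg     # assignments now target the outer msg
        msg = "Inside!"
        return msg

    inner_msg = inside()
    return inner_msg, msg


print(outside_shadowing())  # ('Inside!', 'Outside!')
print(outside_nonlocal())   # ('Inside!', 'Inside!')
```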

The usage of nonlocal is very similar to that of global, except that the former is used for variables in outer function scopes and the latter is used for variables in the global scope.

Some confusion might arise about when nonlocal should be used. Take the following function, for instance.

It would be reasonable to expect that without using nonlocal the insertion of the "inside": 2 key-value pair in the dictionary would not be reflected in outside. Reasonable, but incorrect, because the dictionary insertion is not an assignment, but a method call. In fact, inserting a key-value pair into a dictionary is equivalent to calling the __setitem__ method on the dictionary object.
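A sketch of that situation. The dictionary contents follow the description above and the function layout is an assumption.

```python
def outside():
    d = {"outside": 1}

    def inside():
        # Not an assignment to the name d: this calls d.__setitem__("inside", 2)
        # on the existing dictionary, so no nonlocal statement is needed.
        d["inside"] = 2

    inside()
    return d


print(outside())  # {'outside': 1, 'inside': 2}
```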

I will leave it there for now. If you want to learn more about the nonlocal keyword, check out PEP 3104.

The two ways to sort a list in Python

Today I’m going to take a look at another element of the language that tends to trip up Python beginners – the difference between sorted(my_list) and my_list.sort().

The built-in function sorted sorts the list that is passed into it, and returns a new list while preserving the old one.

On the other hand, the sort method on list objects sorts the list in place, destroying the original ordering.

Using a list’s sort method is equivalent to assigning the output of sorted back to the original list.
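The difference is easy to see in a short sketch. The list contents are assumed.

```python
numbers = [3, 1, 2]

# sorted returns a new list and leaves the original alone.
ordered = sorted(numbers)
original_after_sorted = list(numbers)  # still [3, 1, 2]

# list.sort reorders the list in place and returns None.
return_value = numbers.sort()

print(ordered, original_after_sorted, numbers, return_value)
# [1, 2, 3] [3, 1, 2] [1, 2, 3] None
```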

However, that particular way of doing things is frowned upon. Only use sorted

sorted and list.sort both accept the key and reverse parameters. The cmp parameter, which allowed you to pass in a custom comparator function, has been removed in Python 3. key should be used instead.
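A sketch of the key and reverse parameters. It also shows functools.cmp_to_key, which wraps an old-style comparator so it can be passed as key. The word list and the compare_length function are assumptions.

```python
from functools import cmp_to_key

words = ["pear", "Fig", "banana"]

# key: sort by the lowercased word, ignoring case.
case_insensitive = sorted(words, key=str.lower)

# key and reverse together: longest word first.
longest_first = sorted(words, key=len, reverse=True)


def compare_length(a, b):
    """Old-style comparator: negative, zero or positive."""
    return len(a) - len(b)


# cmp_to_key adapts a comparator function for use as a key.
shortest_first = sorted(words, key=cmp_to_key(compare_length))

print(case_insensitive)  # ['banana', 'Fig', 'pear']
print(longest_first)     # ['banana', 'pear', 'Fig']
print(shortest_first)    # ['Fig', 'pear', 'banana']
```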

The difference between range and xrange in Python

Today I’m going to take a look at another difference between Python 2 and 3 that can trip up people making the switch. Python 2 used to have two functions that could be used to iterate a certain number of times in for loops, range and xrange. In Python 3, there is no xrange, but the range function behaves like xrange in Python 2.

The way things were

You probably remember that in Python 2 you could generate indexes in for loops in two ways:

The difference between these two built in functions is not immediately obvious when used in this way. Let’s take a look at the output of each function in the interactive interpreter.

As you can see, range returns a normal list, but xrange returns an xrange object. An xrange object is similar to a generator: it produces the necessary index on demand instead of producing the entire list up front. Therefore it can be slightly faster and more memory efficient. According to the Python 2 documentation, the xrange type offers the following guarantee:

The advantage of the xrange type is that an xrange object will always take the same amount of memory, no matter the size of the range it represents.

xrange removed in Python 3

In Python 3, xrange has been removed and the only option for generating iterable sequences of consecutive numbers is range. Actually, it is more correct to say that the Python 2 range function has been removed and xrange has been renamed to range.

For the most part, this change is easy to handle: just use range when you would have used either range or xrange in Python 2. The only place you might be tripped up is if you actually need the list that range used to return. Luckily, all you have to do in that case is pass the Python 3 range object to the list constructor function.
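In Python 3 that looks like this. The sketch uses assumed sizes.

```python
numbers = range(5)

# A range object is not a list, but list() turns it into one.
is_list = isinstance(numbers, list)  # False
as_list = list(numbers)              # [0, 1, 2, 3, 4]

# Large ranges are cheap: no billion-element list is ever built.
big = range(10 ** 9)
last = big[-1]

print(is_list, as_list, len(big), last)
```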

Pythonic iteration

Before I finish, I’ll just mention a way to make your code more Pythonic. Quite frequently, when people want to use any of the range functions, it is because they want a way to index another sequence type, e.g.

Sometimes people even declare a counter variable outside the loop, just so they have an index.

There is no need to do either of these things. In particular, that range(len(seq)) idiom is one of the classic markers of amateur Python code. What you really need is the enumerate function, which automatically generates an index for whatever sequence you are iterating over.

Ta-da! Once you start using enumerate, you’ll never go back.
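A sketch comparing the index-based idiom with enumerate, using an assumed list of letters:

```python
seq = ["a", "b", "c"]

# The discouraged idiom: generate indexes, then look each item up.
indexed_the_hard_way = []
for i in range(len(seq)):
    indexed_the_hard_way.append((i, seq[i]))

# The Pythonic way: enumerate yields (index, item) pairs directly.
indexed_with_enumerate = []
for index, item in enumerate(seq):
    indexed_with_enumerate.append((index, item))

print(indexed_with_enumerate)  # [(0, 'a'), (1, 'b'), (2, 'c')]
```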