Why Is My Django/MySQL Application Showing Unicode as Question Marks?

Back up your database before you try anything here. Sometimes character set conversions can change your data in ways you don’t want. Be sensible and use mysqldump or something to safeguard it before you start messing around. Needless to say, you should try everything in a test environment first.

When you run a Django application (or any other web application, for that matter) on top of a stock MySQL install, you might hit a problem with storing Unicode characters. I saw it in a Django project that had to deal with Arabic text. Instead of the Arabic characters, it just showed a bunch of question marks.

Here’s how to fix it.

Check your MySQL character set

Out of the box, your MySQL character set is probably latin1. We’re going to change it to utf8.
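To see where the question marks come from, here is a small Python sketch. It is not MySQL itself, and the sample word is only an illustration: latin1 has no way to represent Arabic letters, so a lossy conversion substitutes a question mark for each one, while utf8 keeps them intact.

```python
# Illustration only: mimics what a lossy conversion to latin1 does to Arabic text.
arabic = "مرحبا"  # sample word ("hello"), chosen for this example

# latin1 cannot represent these letters; with errors="replace" each one becomes "?"
as_latin1 = arabic.encode("latin-1", errors="replace")
print(as_latin1)

# utf-8 can represent every Unicode character, so the text round-trips unchanged
as_utf8 = arabic.encode("utf-8")
print(as_utf8.decode("utf-8") == arabic)
```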

First, run this command to check that you are in fact dealing with an incorrect character set:

In the output, you will probably see the following line:

If you do, keep going. We’re going to sort it out.
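If you would rather check from Python, the following is a minimal sketch. It assumes you already have a DB-API cursor connected to the MySQL server (for example the one returned by Django’s connection.cursor()). The helper names charset_variables and uses_latin1 are made up for this example. It runs MySQL’s SHOW VARIABLES statement and returns the character set settings as a dictionary.

```python
def charset_variables(cursor):
    """Return MySQL's character_set_* variables as a dict.

    `cursor` is assumed to be any DB-API cursor connected to MySQL.
    """
    cursor.execute("SHOW VARIABLES LIKE 'character_set%'")
    # Each row is a (variable_name, value) pair
    return dict(cursor.fetchall())


def uses_latin1(variables):
    """True if the server or database character set is still latin1."""
    keys = ("character_set_server", "character_set_database")
    return any(variables.get(key) == "latin1" for key in keys)
```

In a Django shell this could be called as charset_variables(connection.cursor()) after from django.db import connection.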

Edit my.cnf

The main MySQL configuration file is called my.cnf. On Ubuntu it is located at /etc/mysql/my.cnf. You can check where it is on your own system by running locate my.cnf.

The file is divided into sections and the start of each section is indicated with a name in square brackets. We’re interested in the sections [client] and [mysqld].

After making a backup of the current state of the file, open it in your text editor of choice and find the [client] section. Add the following line to it:

Next, find the [mysqld] section and add the following three lines to it:

Be careful that you add the code to the right sections. If you make a mistake here then MySQL will not start and it won’t write any useful error message to the logs.
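As a rough guide to what such lines can look like, here is a sketch that uses Python’s configparser to render a fragment for the two sections. The option names are real MySQL options, but the choice of these particular options and of utf8_unicode_ci as the collation is an assumption made for this example. Check it against your own setup before copying anything into my.cnf.

```python
import configparser
import io

# Assumed settings for illustration only; adjust to your own server.
ASSUMED_SETTINGS = {
    "client": {"default-character-set": "utf8"},
    "mysqld": {
        "character-set-server": "utf8",
        "collation-server": "utf8_unicode_ci",
        "init-connect": "'SET NAMES utf8'",
    },
}


def render_fragment(settings):
    """Render the settings as my.cnf-style text, one [section] per key."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_dict(settings)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


fragment = render_fragment(ASSUMED_SETTINGS)
print(fragment)
```

The printed text shows which line belongs under which section header, which is the part that is easy to get wrong.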

Save my.cnf and restart MySQL. On many systems, you can do this with the service command:

Alter each table to use the new character set

First, you want to generate the script you are going to use to convert each table one by one to the new character set. Change the database name, username and password to the correct values and run this in the terminal.

It will generate the SQL you need to change each of your tables. For example, if your database contained three tables called users, comments and posts, the generated code would look like this:
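Statements of this shape can be sketched in Python as follows. The helper name and the utf8_unicode_ci collation are assumptions for this example; the table names are the three mentioned above, and ALTER TABLE ... CONVERT TO CHARACTER SET is standard MySQL syntax.

```python
def conversion_statements(tables, charset="utf8", collation="utf8_unicode_ci"):
    """Build one ALTER TABLE statement per table name."""
    template = (
        "ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset} "
        "COLLATE {collation};"
    )
    return [
        template.format(table=table, charset=charset, collation=collation)
        for table in tables
    ]


statements = conversion_statements(["users", "comments", "posts"])
for statement in statements:
    print(statement)
```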

Run that code against the database using your tool of choice. It might take a while, depending on the size of your tables. You’ll know when you try it on your test environment. When it’s done, those question marks should be history.

How variable scope works in Python

Someone asked me to take a look at a piece of code recently and tell him why it wasn’t working. The problem was that he didn’t really understand Python variable scoping. That’s what I’m going to talk about today. It is quite basic, but you really need to have it down cold, and there are a few surprises in there too.

What you need to know

A variable in Python is defined when you assign something to it. You don’t declare it beforehand, like you can in C. You just start using it.

Any variable you declare at the top level of a file or module is in global scope. You can access it inside functions.
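A short sketch of both points. The names greeting and show_greeting are invented for this example.

```python
greeting = "hello"  # defined by assignment at module level, so it is global


def show_greeting():
    # No assignment to greeting in here, so Python looks it up in global scope
    return greeting


print(show_greeting())
```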

Before I go on I need to add a disclaimer: global variables are almost always a bad idea. Yes, sometimes you need them, but you almost always don’t. A good rule of thumb is that a variable should have the narrowest scope it needs to do its job. There’s a good discussion of global variables and the associated issues here.

Modifying the value of a global variable is less simple. Take a look at this example.
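A minimal sketch of the kind of code in question. The function name foo and the values 123 and 321 follow the discussion below; the exact layout is an assumption.

```python
x = 123


def foo():
    x = 321   # assignment creates a new local variable called x
    print(x)  # first print statement: shows the local x


foo()
print(x)      # second print statement: the global x is unchanged
```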

What happened? Why is the value of x 123 for the second print statement? It turns out that when we assigned the value 321 to x inside foo we actually declared a new variable called x in the local scope of that function. That x has absolutely no relation to the x in global scope. When the function ends, that variable with the value of 321 does not exist anymore.

To get the desired effect, we have to use the global keyword.
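A sketch of the same example with the global keyword added. As before, the layout is an assumption and only the names foo and x and the two values come from the text.

```python
x = 123


def foo():
    global x  # the name x now refers to the module-level variable
    x = 321
    print(x)


foo()
print(x)      # the global x was modified inside foo
```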

That’s more like it.

There is one more scope we have to worry about: the enclosing scope created by declaring one function inside another one. Watch.
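A minimal sketch of an enclosing scope. The function names outer and inner are invented for this example.

```python
def outer():
    x = 123

    def inner():
        # x is not local to inner, so Python finds it in outer's scope
        return x

    return inner()


print(outer())
```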

What if you want to modify the value of x declared in the outer function? You’ll run into the same problem that made us use global. But we don’t want to use global here. x is not a global variable. It is in the local scope of a function.

Python 3 introduced the nonlocal keyword for this exact situation. I wrote a post about it on this page, but I’ll show you a quick example now.
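A minimal sketch of nonlocal in use, again with the invented names outer and inner.

```python
def outer():
    x = 123

    def inner():
        nonlocal x  # rebind x in the enclosing function, not a new local
        x = 321

    inner()
    return x


print(outer())
```

Without the nonlocal line, the assignment inside inner would create a separate local x and outer would still return 123.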

A simple way to remember Python scoping rules

In the book Learning Python, Mark Lutz suggests the following mnemonic for remembering how Python scoping works: LEGB

Going from the narrowest scope to the widest scope:

  • L stands for “Local”. It refers to variables that are defined in the local scope of functions.
  • E stands for “Enclosing”. It refers to variables defined in the local scope of functions wrapping other functions.
  • G stands for “Global”. These are the variables defined at the top level of files and modules.
  • B stands for “Built in”. These are the names that are loaded into scope when the interpreter starts up. You can look at them here: https://docs.python.org/3.5/library/functions.html
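The four levels can be seen in one small sketch. All names in it are invented for this example.

```python
label = "global"  # G: defined at the top level of the module


def outer():
    label = "enclosing"  # E: as seen from the nested functions below

    def shadowing():
        label = "local"  # L: a local name is found first
        return label

    def reading():
        return label  # no local label, so the enclosing one is used

    return shadowing(), reading()


def reads_global():
    return label  # no local or enclosing label, so the global one is used


def reads_builtin():
    # B: len is not defined in this file, so it comes from the built-in scope
    return len(label)


print(outer(), reads_global(), reads_builtin())
```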

And that is everything you need to learn about this topic for the vast majority of Python programming tasks.

How to fix database race conditions in Django views

Today I’m going to show you how to fix an extremely common error in Django applications. My guess is about 90% of Django applications deployed in the wild suffer from this error, and like 72% of statistics I just made that one up on the spot. Seriously though, it’s pretty common.

Imagine you’ve got an online bookstore application with a Book model that has a quantity attribute. When somebody buys a copy of one of your books, you want to decrease the quantity attribute by 1. Here is the naive way to do it:
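A sketch of that naive approach. The function name is made up, and the Book model is passed in as a parameter only so that the snippet does not depend on a particular project. In a real view you would import the model as usual.

```python
def buy_book_naive(Book, book_id):
    """Decrease the stock of one book by 1 (unsafe under concurrency)."""
    book = Book.objects.get(pk=book_id)  # read the row into memory
    book.quantity -= 1                   # modify the in-memory copy
    book.save()                          # write the copy back
    return book.quantity
```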

At the start when you’ve got a small load on your system, this will seem to work fine. Now imagine your bookstore grows, you open some new branches, and there are multiple updates being run on your application every second. That’s when strange things will start to happen. Here is how two concurrent updates might play out with our current code. book1 represents the first concurrent update and book2 represents the second:
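The interleaving can be imitated in plain Python, with a dictionary standing in for the database row. This is only a simulation under that assumption, not Django code, and the starting quantity of 10 is arbitrary.

```python
database = {"quantity": 10}

# Both updates load their own copy of the row before either one saves
book1 = dict(database)
book2 = dict(database)

# Each update decreases the quantity on its own copy
book1["quantity"] -= 1
book2["quantity"] -= 1

# Each update writes its copy back; the second write overwrites the first
database.update(book1)
database.update(book2)

print(database["quantity"])  # two sales, but the quantity only dropped by 1
```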

At the start of both concurrent updates, an identical copy of the data in the database is loaded into memory. The inventory quantity is decreased on each copy, then the new quantity is written back to the database, with the second update clobbering the first. Result: it is as if one of the updates never happened.

In database terms, what we need is called a SELECT FOR UPDATE. Basically, this locks the row in the database until the new information is written back, preventing a second instance from reading and modifying data that might be in the process of changing.

Since Django 1.4, implementing SELECT FOR UPDATE through the ORM is really simple:
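A sketch of how that might look. The function name is invented, and both the Book model and the atomic context manager (django.db.transaction.atomic in a real project) are passed in as parameters so that the snippet stands alone. select_for_update() itself is the real QuerySet method.

```python
def buy_book(Book, book_id, atomic):
    """Decrease the stock of one book by 1 while holding a row lock.

    `atomic` is expected to be django.db.transaction.atomic.
    """
    with atomic():
        # SELECT ... FOR UPDATE: other transactions trying to lock this row wait
        book = Book.objects.select_for_update().get(pk=book_id)
        book.quantity -= 1
        book.save()
    # The lock is released when the transaction block ends
    return book.quantity
```

In a project this could be called as buy_book(Book, some_id, transaction.atomic) after from django.db import transaction, where some_id is the primary key of the book being sold.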

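That will lock the row selected with get until the end of the enclosing transaction. Django 1.6 and later run in autocommit mode by default, so the query has to run inside a transaction, for example within transaction.atomic() or with ATOMIC_REQUESTS enabled so that each request is wrapped in one. Otherwise current Django versions raise a TransactionManagementError instead of taking the lock.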

select_for_update is compatible with the postgresql_psycopg2, oracle, and mysql database backends. It doesn’t work for the sqlite backend.

Text to speech with Python 3 on Linux and OSX

Recently I had a requirement to synthesise speech from text on two different operating systems. Here is what I came up with.

OSX

Synthesising speech is a simple matter for OSX users because the operating system comes with the say command. We can use subprocess to call it.
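A minimal sketch of that call. The function names are made up for this example, and it assumes the say command is available on the PATH, as it is on a standard OSX install.

```python
import subprocess


def say_command(text):
    """Build the argument list for the OSX say command."""
    return ["say", text]


def speak(text):
    # Passing a list avoids shell quoting problems with the text
    subprocess.call(say_command(text))
```

Calling speak("Hello world") on OSX should read the text aloud.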

Linux

On Linux, there are a few different options. I like to use the espeak Python bindings when I can. You can install it on Ubuntu using apt-get.

Then use it like so:

espeak supports multiple languages, so if you are not dealing with English text, you need to pass in the language code. Unfortunately, it looks like the Python bindings don’t support that yet, but we can still use subprocess like we did on OSX.
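A minimal sketch of the subprocess route. It assumes the espeak command-line tool is installed and on the PATH; the function names and the French example are invented here. The -v option is espeak’s voice selector, which accepts a language code.

```python
import subprocess


def espeak_command(text, language="en"):
    """Build the argument list for the espeak command-line tool."""
    # -v selects the voice, which for espeak is usually a language code
    return ["espeak", "-v", language, text]


def speak(text, language="en"):
    subprocess.call(espeak_command(text, language))
```

For example, speak("Bonjour tout le monde", language="fr") would ask espeak to use its French voice.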

The list of available languages can be found on the espeak website here.