As teams and codebases grow, things that seem straightforward can become more complex. For example, a series of database schema changes can work exactly as expected when deployed independently and in order, yet cause problems for other developers trying to get their local environments caught up. Let’s look at one example of this in Django, along with its solution: deprecating a database field while still using it in a backfill before it is removed from the database schema.
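The Scenario
Your workflow when deprecating a field in favor of a new one might look something like this:
In the first PR: add the new field to the model, and backfill it from the existing field in the same migration.
In the second PR: remove the deprecated field from the model.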
As an example, say we have a Student model with an email field that we want to deprecate in favor of a new primary_email field.
The migration file in our first PR might look like this:
# -*- coding: utf-8 -*-
from django.db import migrations, models
from django.db.models import F

from student.models import Student


def backfill_display_name(apps, schema_editor):
    Student.objects.update(primary_email=F('email'))


class Migration(migrations.Migration):

    dependencies = [
        ...
    ]

    operations = [
        migrations.AddField(
            model_name='student',
            name='primary_email',
            field=models.CharField(max_length=100, null=True),
        ),
        migrations.RunPython(backfill_display_name, migrations.RunPython.noop)
    ]
This is fairly straightforward: we’ve taken the migration file that Django automatically generates when a field is added to the model definition and added a backfill to it, setting the value of the new primary_email field to the value of the existing email field.
This PR can safely be deployed and we can carry on our merry way. So we open another PR that removes the deprecated field. That one might look something like this:
class Migration(migrations.Migration):

    dependencies = [
        ...
    ]

    operations = [
        migrations.RemoveField(
            model_name='student',
            name='email',
        ),
    ]
We deploy this PR next, and everything goes swimmingly! Nice work.
This sequence of migrations went through successfully when run independently and in order. But suppose another developer didn’t pull down the master branch of the repository and run migrations between these two PRs going out. When they pull down both and try to catch up by running migrations, they will see an error that looks like this:
django.core.exceptions.FieldError: Cannot resolve keyword 'email' into field. Choices are: <field choices listed here>
Well, the developer that just pulled down the code now has a version of the Student model definition that doesn’t have the email field. Our backfill still refers to that field, and it imports Student directly from the relevant models.py file, which represents the current state of the model.
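To make the mismatch concrete, here is a small framework-free sketch in plain Python. Nothing in it is a Django API: the two field sets, the resolve() helper, and the FieldError stand-in are illustrative assumptions that only mimic how a field name gets checked against the model definition rather than against the table.

```python
# Toy, framework-free sketch; these names are illustrative, not Django APIs.

# Columns in the catching-up developer's local table just after AddField runs
# in the first migration: the old column is still physically present.
db_columns_during_backfill = {"id", "email", "primary_email"}

# Fields on the Student class imported from models.py after pulling both PRs:
# the second PR has already deleted `email` from the model definition.
current_model_fields = {"id", "primary_email"}


class FieldError(Exception):
    """Stand-in for django.core.exceptions.FieldError."""


def resolve(field_name, model_fields):
    """Mimic a field name being checked against the model, not the table."""
    if field_name not in model_fields:
        choices = ", ".join(sorted(model_fields))
        raise FieldError(
            f"Cannot resolve keyword '{field_name}' into field. "
            f"Choices are: {choices}"
        )
    return field_name


def backfill(model_fields):
    """Stand-in for Student.objects.update(primary_email=F('email'))."""
    resolve("primary_email", model_fields)
    resolve("email", model_fields)
    return "backfilled"


try:
    outcome = backfill(current_model_fields)
except FieldError as exc:
    outcome = f"FieldError: {exc}"
```

The column is still in the database when the backfill runs, but the imported class no longer knows about it, so the lookup fails before any SQL is issued.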
But that’s where we normally import models from…what can we do instead?
We can use apps.get_model(), which is described in the Django migrations documentation.
You’ll notice that even in the original backfill we wrote, the backfill function takes two arguments: apps and schema_editor. We’re going to focus on the first one here. This is true of any function passed into migrations.RunPython (see the Django documentation). The apps argument gives us an alternative to importing our models directly from the models.py file: apps.get_model() returns the model as it was defined at that point in the migration history. This means that regardless of the current model definition of Student at the time this migration is run, the backfill will have access to the email field.
def backfill_display_name(apps, schema_editor):
    Student = apps.get_model('student', 'Student')
    Student.objects.update(primary_email=F('email'))
If we rewrite our migration like the above, a developer who pulls down the codebase and tries to run migrations days or weeks later will have no trouble doing so. 🎉 Happy migrating!