Models are the way we tell Django what we want our database schema to look like, and are the things through which we interact with rows, or instances, in our database.
One of the benefits of using a framework like Django is that it provides functionality making it easier to create and interact with our data than if we had to do so directly with the database and generate and validate our own SQL statements. This means that there are things that will look different in the database than they do in our code — this is okay! But it’s also interesting to look at some of them to better understand where Postgres (in this example, you may be using a different database engine) and Django diverge, and which is driving what behavior.
Let’s build an example model as we go through this, so that at the end we can take a comprehensive look at the differences between our model definition, the migration file, and the database schema.
from django.db import models class BlogPost(models.Model): # We'll build out our model definition here!
By default, models are given an
id field as the primary key without needing to define it at all (so nothing to add to our example model definition we’ll be using throughout this post!) — documented here.
To our example
BlogPost class, we’ll add a foreign key to the author:
author = models.ForeignKey(User)
However, what’s actually stored in the database for foreign keys is the value of the related object’s primary key (typically the
id field, as mentioned above). For this reason, the name of the field in the database will be slightly different than our model definition: the field will be
author_id. Both author and
author_id are available as properties on a model instance.
Optimization note: there’s a cool implication of this! Because of Django magic related to defining something as a
ForeignKey (instead of just an integer, which is what’s actually stored as the database), when we call
.author on an instance of a blog post, it will retrieve that instance from the database as well. If all we need is the id field, calling
.author_id can improve performance over calling
The database field will be of type
integer and will by default have an index, even though we did not define
db_index=True the way we would for non-foreign key fields on which we wanted an index.
constraint bit of a `ForeignKey is actually handled at the database level, not Django - even though this is just an integer field, trying to save an integer in it for which there is no corresponding row in the related table, will fail.
ManyToManyField is Django’s way of creating the many-to-many relationship without you needing to define a through table (though you can provide a
through argument if you have a table defined yourself). If you inspect your database schema after creating one of these, you will see that a join table is created for which you have no corresponding model.
To demonstrate this, we’ll add a
Tag table and create a
ManyToMany field on our
BlogPost table so we can see what this looks like at the end:
tags = models.ManyToManyField(Tag, blank=True)
I wrote more about these fields and their use for tracking when model instances were created and updated here, but it’s worth noting that they’re used strictly by Django. Django ensures that the fields are added or updated on each save, but in the database they’re just
timestamp fields. Let’s add them to our model definition so we can see what things look like at the end.
created_at = models.DateTimeField(auto_now_add=True) updated_at = models.DateTimeField(auto_now=True)
Both of these are options you can pass into a field as you’re initializing them, and neither has any impact on the schema definition of the field. Django uses them both when validating form submissions, and when trying to save data to the database via the ORM, and will only allow you to save values that comply.
However, because these are not enforced at the database level, you could definitely write an SQL statement that saves invalid data. This is one of the (many!) reasons always using the ORM to access data is advisable.
ManyToManyField above is already defined with
blank=True, so we’ll just add a new field here to demonstrate
access = models.IntegerField(choices=ACCESS_CHOICES)
Examples of custom field types include
SlugField, though it’s definitely worth perusing all the different field types Django offers (documentation here or source code here — looking at the source code can help clarify which base field type each of these inherits from, a hint at how it will actually be defined in the database).
All of these fields inherit from one of the basic field types (usually a
CharField) but have a default validator on them, giving you one less thing to worry about when validating form submissions or API serializers, and which handle the work for you of deciding how to define the validator as you’re adding the field. Let’s add a
SlugField to our example model:
path = models.SlugField()
Uniqueness is interesting: when a single field on a model needs to be unique, we define that by just applying
unique=True to the field definition, but at the database level, it doesn’t touch the field — instead Postgres creates an index to enforce uniqueness. The Django docs mention this in reference to the fact that adding unique=True means it’s unnecessary to add
db_index=True, since we definitionally get an index with the uniqueness constraint.
This means that unlike with things like
choices, even if you’re accessing the database directly with SQL, this constraint will be enforced.
Let’s add a unique field to our model:
title = models.CharField(max_length=255, unique=True)
help_text to a field definition doesn’t have any impact on the database at all. Django uses it when when drawing model forms. Perhaps unexpectedly though, if you change it, running makemigrations will generate a migration file that will be a no-op when it’s run. Let’s add help text to one of the fields we added above.
Ok, pulling together each of the fields we discussed above, this is what our model definitions look like now:
from django.contrib.auth import get_user_model from django.db import models User = get_user_model() class BlogPost(models.Model): created_at = models.DateTimeField(auto_now_add=True) updated_at = models.DateTimeField(auto_now=True) author = models.ForeignKey(User) title = models.CharField(max_length=255, unique=True) path = models.SlugField() ACCESS_CHOICES = ( (0, 'Public'), (1, 'Private'), (2, 'Password') ) access = models.IntegerField(choices=ACCESS_CHOICES, help_text='controls who this post is visible to') tags = models.ManyToManyField(Tag, blank=True) class Tag(models.Model): name = models.CharField(max_length=100)
Given the model definitions above, when we run
makemigrations, here’s are the operations that are added to the migration file:
operations = [ migrations.CreateModel( name='BlogPost', fields=[ ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')), ('created_at', models.DateTimeField(auto_now_add=True)), ('updated_at', models.DateTimeField(auto_now=True)), ('title', models.CharField(max_length=255, unique=True)), ('path', models.SlugField(max_length=255)), ('access', models.IntegerField(choices=[(0, b'Public'), (1, b'Private'), (2, b'Password')], help_text=b'controls who this post is visible to')), ('author', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to=settings.AUTH_USER_MODEL)), ], ), migrations.CreateModel( name='Tag', fields=[ ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')), ('name', models.CharField(max_length=100)), ], ), migrations.AddField( model_name='blogpost', name='tags', field=models.ManyToManyField(blank=True, to='Tag'), ), ]
You can see here some of the things we talked about, such as how both of our new models have an
id field, which is shown here as an
AutoField. For the most part though, things look an awful lot like what we defined in our
models.py file. That’s because migration files are solidly in the Django realm, instead of in the database realm — Django translates them into SQL under the hood.
Note: if you want to see the SQL that will be run when you run your migration file, you can use
sqlmigrate for that.
Things get more interesting when we look at what happens in the database itself. Let’s do that now.
First, let’s look at the tables we have. When we go into our Postgres shell and run \d, here’s what we see:
Schema | Name | Type --------+--------------------------------------------------------- public | blogpost | table public | blogpost_tags | table public | tag | table
One of the primary things that you’ll notice here is there’s a table in the database that we didn’t define at all in our models! That’s the
blogpost_tags table, which is the join table between
tag. It has its own primary key, in addition to foreign keys to those two tables.
Ok, let’s look at the database schema for the
blogpost table, which is where most of the interesting things are. We can do that by running
\d blogpost. Here’s what we see when we do that:
Column | Type | Modifiers ------------+--------------------------+-------------------------- id | integer | not null default nextval('blogpost_id_seq'::regclass) created_at | timestamp with time zone | not null updated_at | timestamp with time zone | not null title | character varying(255) | not null access | integer | not null author_id | integer | not null Indexes: "blogpost_pkey" PRIMARY KEY, btree (id) "blogpost_title_key" UNIQUE CONSTRAINT, btree (title) "blogpost_author_id_f4a5a919" btree (author_id) "blogpost_title_14359379_like" btree (title varchar_pattern_ops) Foreign-key constraints: "blogpost_author_id_f4a5a919_fk_auth_user_id" FOREIGN KEY (author_id) REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED Referenced by: TABLE "blogpost_tags" CONSTRAINT "blogpos_blogpost_id_306b49da_fk" FOREIGN KEY (blogpost_id) REFERENCES blogpost(id) DEFERRABLE INITIALLY DEFERRED
Let’s do a quick rundown of the things we notice here:
idfield, even though we didn’t define one
ForeignKeyfields are just
integerfields, but they have
CONSTRAINT FOREIGN KEYon them, which is how cascade deletion is handled
slugfield type, though that field was given a
max_lengthof 50, even though we didn’t define that
accessfield (which had the choices) is just an
integerfield, since that’s the type of data all of the options were
titlefield has a
UNIQUE CONSTRAINTindex on it.
There are plenty more interesting bits to the way Django and databases communicate with each other, and how things are translated from one to the other, but hopefully this was a useful primer. ✨