Database modeling

One of the most important exercises you can do to improve your development skills is to model out the data behind an application. When you do this, you first describe the application as you might use it. Then you try to identify every table you would need to store the data and the relationships between those tables.

The most difficult part of this is handling the translation from how we think about problems as humans into two-dimensional tables. As you do this, you'll have to break down each collection of like data into its own table. You will likely encounter questions along the way that you'll have to answer to move forward.

Photo gallery app example

For example, let's imagine a photo gallery application. In its simplest form, registered users can upload photos. The design of this is very simple.

Users will want to put photos into galleries. When you run into this part, you will have to decide: can a photo only be in one gallery, or can a photo be in multiple galleries? This choice will affect the database significantly.

Database with photos in one and only one gallery

Since photos are only in one gallery, a simple foreign key from photo to gallery will suffice.

Database with photos in multiple galleries

To allow photos to be in multiple galleries, we needed to add a join table, gallery_photos.

This example remains quite simple. Imagine if multiple users had permission to add photos to galleries. Each gallery would have an owner that added and removed people from the gallery (or they might even have multiple owners!) Here's one example of what that might look like.

Database with photos in multiple galleries and multiple users per gallery

Note the gallery_member table. It looks like a join table, and it does serve as one, but it has extra data (is_owner). This is represented differently in Django (a ManyToManyField for gallery_photos and a GalleryMember model with two ForeignKeys for gallery_member).

Rules to follow in modeling

There's a few rules that you have to keep in mind in order to model well.

  • Don't use multiple columns to store the same sort of thing. For example, no fields called category1, category2, etc. Two foreign keys to the same table are fine if they represent different things.
  • Move repeating data to its own table. For example, a category field shouldn't be a string field with the same 10 categories used over and over. Make a category table and use a foreign key.
  • Don't store derived data in the database. For example, if you were storing people for a car rental application, you wouldn't have a birthday field and a over_25 field. You would calculate if someone was over 25 by using their birthday.