Overview

Large datasets and databases can be organized at many levels. We aren’t going to dive too deeply (it’s a deep rabbit hole), but today we’re going to learn a bit about database normalization and how it can help us better organize the tables within a database.

Basic Learning Objectives

Before class, you should be able to:

  • Define database normalization
  • Give an example of first normal form and third normal form

Advanced Learning Objectives

After class, you should be able to:

  • Explain the trade-offs that should be considered when deciding how to structure database tables
  • Explain why database normalization is important

Readings

To achieve the basic learning objectives, you should read the following:

NOTE: You do not need to know the finer details of first normal form vs. fourth normal form, etc. Read these to get a sense of the ways you could, and should, be cutting down on redundancy and inefficiencies in database tables.
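As a rough illustration of the kind of redundancy normalization targets, here is a minimal Python sketch. The table, its column names, and the `normalize` helper are all hypothetical, invented for this example. It assumes a customer's name uniquely identifies the customer and that each customer has one city, so the city depends on the customer rather than on the order.

```python
# Hypothetical denormalized table: every order row repeats the customer's city.
orders_flat = [
    {"order_id": 1, "customer": "Ana", "city": "Lima", "item": "pen"},
    {"order_id": 2, "customer": "Ana", "city": "Lima", "item": "ink"},
    {"order_id": 3, "customer": "Bo", "city": "Oslo", "item": "pad"},
]


def normalize(rows):
    """Split the flat rows into a customers table and an orders table."""
    customers = {}  # customer name -> city, stored once per customer
    orders = []     # one row per order, pointing at the customer by name
    for row in rows:
        customers[row["customer"]] = row["city"]
        orders.append(
            {
                "order_id": row["order_id"],
                "customer": row["customer"],
                "item": row["item"],
            }
        )
    return customers, orders


customers, orders = normalize(orders_flat)
```

After the split, each city is stored once, so a customer who moves needs one update instead of one per order. The trade-off is that answering "which city did order 2 ship to?" now takes a lookup across two tables.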

Checks

Submit an answer to the following on Moodle:

  • Identify a real or hypothetical duplication of data in your team dataset (if you don’t think there is any duplication, come up with a way there could be) and explain how a normal form could reduce that duplication.