Goal

To get a good start on your project proposal by thinking about your team contract and potential datasets.

Team Contract

  1. Make a Google doc to record notes
  2. Discuss with your team your answers to the following:
    1. Who is interested in each of the required roles?
    2. When can you hold the 1-hour weekly meeting?
    3. How will you communicate with each other?
    4. How will you make decisions?
    5. Note: there is more to decide, but leave enough time to look for datasets as well.

Keep track of that Google doc; you’ll move the notes to your team repository next week.

Datasets

  1. Using our course research guide as a starting point, work with your team to identify potential datasets.
    • Important: Verify the sources of publicly available datasets. There are some legitimate people & institutions on Kaggle & GitHub (all of you are already legitimate people on GitHub after the Getting Started Lab last week). However, there are many not-quite-so-legitimate sources. Some datasets may even be AI-generated!
  2. Once you’ve found a dataset, record the following, since you’ll need this information later:
    • URL
    • date downloaded
    • authorship
    • exact name and version
    • terms of use
    • suggested citation
  3. Make sure at least one member of your team downloads a copy of the data!
  4. Remember that you’ll need two datasets with an overlapping column, so keep looking and think about which column could overlap.
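
One quick way to check step 4 is to compare the headers of two candidate files. The sketch below is only an illustration and assumes both datasets are CSV files with a header row; the function names (`read_column_names`, `shared_columns`, `shared_values`) are made up for this example and use only Python's standard `csv` module.

```python
import csv


def read_column_names(path):
    """Return the header row of a CSV file as a list of column names."""
    with open(path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f), [])


def shared_columns(path_a, path_b):
    """Column names that appear in both files (exact, case-sensitive match)."""
    names_b = set(read_column_names(path_b))
    return [name for name in read_column_names(path_a) if name in names_b]


def shared_values(path_a, path_b, column):
    """Values of `column` that occur in both files."""
    def values(path):
        with open(path, newline="", encoding="utf-8") as f:
            return {row[column] for row in csv.DictReader(f)}

    return values(path_a) & values(path_b)
```

A matching column name is only a hint. Two columns can share a name but hold different kinds of values, or hold the same values under different names (for example `state` and `State`), so it is worth looking at `shared_values` and at the data itself before counting a column as overlapping.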

(If the dataset is small enough, you can and should add the dataset to your team repository on Wednesday.)
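
The items listed in step 2 can be kept in a small file that sits next to the downloaded data, so they travel with it when the data moves to the repository. This is a minimal sketch, not a required format: the field names, the function names, and the choice of JSON are all assumptions made for illustration.

```python
import json
from datetime import date

# One entry per item in step 2; the key names are this sketch's own choice.
REQUIRED_FIELDS = [
    "url",
    "date_downloaded",
    "authorship",
    "name_and_version",
    "terms_of_use",
    "suggested_citation",
]


def make_dataset_record(url, authorship, name_and_version, terms_of_use,
                        suggested_citation, date_downloaded=None):
    """Build a record of the step 2 items, refusing to leave any of them blank."""
    record = {
        "url": url,
        # Default to today's date (ISO format) if no download date is given.
        "date_downloaded": date_downloaded or date.today().isoformat(),
        "authorship": authorship,
        "name_and_version": name_and_version,
        "terms_of_use": terms_of_use,
        "suggested_citation": suggested_citation,
    }
    missing = [key for key in REQUIRED_FIELDS if not record[key]]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return record


def save_record(record, path):
    """Write the record as human-readable JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2, ensure_ascii=False)
```

Calling `make_dataset_record(...)` with the details of a dataset and passing the result to `save_record(record, "dataset_record.json")` (a hypothetical file name) produces one file per dataset. The `ValueError` is a deliberate nudge: a blank field usually means the source still needs to be verified.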