Goal: Get an introduction to using RStudio to plot your data and run a basic non-parametric test.
You won’t actually hand anything in for this lab.
On Mantis, go back into your
teaching-alife git repository and type
git pull to get the changes that I’ve added. If it says that it can’t because of changes you’ve made to files, then for each file it lists as conflicting, type:
git checkout -f filename
This will overwrite changes that you made to these files, so if you want to keep them, you should rename them first:
mv fileToChange newName
Then run the git pull command again to make sure you have the things I’ve added. Don’t proceed until you get the message
Already up to date.
First you’ll need some data to analyze. I updated
native.cpp so that it actually uses the configuration settings, so you’ll get somewhat more interesting results.
a. Run
make since I’ve changed native.cpp.
b. Make sure you have a folder
Example and if not, type
python stats_scripts/simple_repeat.py to get some data
You can turn that data into something useful in R right away, but in the interest of using Python to do more, I have a Python script called
munge_data.py that takes all of the individual files and combines them into one file to load into R, reducing the amount of work you need to do in R.
a. Go into the
Example directory on the Console.
b. Then run the munge script:
python ../stats_scripts/munge_data.py. This creates a file
munged_basic.dat in this directory, which is good for keeping things organized.
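To make the idea of munging concrete, here is a minimal Python sketch of combining several per-run files into one table with an extra column saying which run each row came from. It is not the actual munge_data.py: the function name munge, the file names, and the two-column "update value" layout are all assumptions made up for this illustration, so check your real data files before adapting it.

```python
from pathlib import Path


def munge(run_files, output_path, columns=("update", "value")):
    """Combine per-run data files into one space-separated file.

    Assumes (hypothetically) that each input file holds whitespace-separated
    rows matching `columns`, with optional blank lines and '#' comment lines.
    Each output row is prefixed with the input file's name (minus extension)
    so the runs can be told apart after loading.
    Returns the number of data rows written.
    """
    rows_written = 0
    with open(output_path, "w") as out:
        # Header row so the loader can pick up the column names.
        out.write(" ".join(("run",) + tuple(columns)) + "\n")
        for path in sorted(run_files):
            run_label = Path(path).stem
            with open(path) as handle:
                for line in handle:
                    stripped = line.strip()
                    # Skip blank lines and comments.
                    if not stripped or stripped.startswith("#"):
                        continue
                    fields = stripped.split()
                    out.write(" ".join([run_label] + fields) + "\n")
                    rows_written += 1
    return rows_written
```

A file written this way has a header row and space-separated columns, so in R it could be loaded with read.table and header=TRUE.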
Mantis doesn’t actually have R installed, so you’ll need to open another browser and navigate to
maize.mathcs.carleton.edu and log in with your Carleton username and password.
a. You should now have an RStudio window in front of you. You’ll see that the files sync between Mantis and Maize, which is quite useful. Open the
teaching-alife folder and then the
stats_scripts folder and then the file
b. R is a language that was written by statisticians, not computer programmers, so it works a bit differently. Instead of running the whole file at once, you can run each line by having your cursor on it and typing
Cmd + Return (Ctrl + Enter on Windows or Linux).
c. Run the two lines that start with
require. We’ll use these packages to make plots that are prettier than the basic ones.
d. In R, assignment is done with
<- as you can see for the list of colors. The default R colors are fine, but they aren’t colorblind-friendly or printer-friendly, so I prefer my homebrew palette. Run the
e. Read through the rest of the lines and comments and run them each in order to see what they do. You are welcome to just adapt these commands to suit your needs and ask me if there are types of plots that you want to do that aren’t included here. You can also explore more ggplot options on your own if you’d like.
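For intuition about what a rank-based non-parametric test measures, here is a small Python sketch of the Mann-Whitney U statistic (the same test is also known as the Wilcoxon rank-sum test). This is only an illustration: which test the R script runs isn't stated here, the function names are made up for this example, and the sketch computes only the U statistic, not a p-value.

```python
def average_ranks(values):
    """Return 1-based ranks for values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Find the end of the run of values tied with the one at pos.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Average of the 1-based positions pos+1 .. end+1.
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b), the Mann-Whitney U statistics for the two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Rank all observations together, ignoring which sample they came from.
    ranks = average_ranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b
```

Because only the ranks matter, the test makes no assumption that the data are normally distributed. A U value near 0 for one sample means nearly all of its values fall below the other sample's values.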
The main thing that you’ll need to change first for your project is probably the
munge_data.py file. Open it and note how it works and what you’d need to change to get different data from your data files (sorry that it’s a bit messy!). Feel free to copy it over to your project and use it.
As soon as your project is printing out some basic data, try out this workflow to get a plot. Even if it isn’t an interesting plot, it’s still useful to see your data!