R/RStudio - Why Use Projects?
Projects (.Rproj files) are fundamental to organized coding and project management, and you should use one essentially every time you work in RStudio. In this post, we will discuss why you should use projects, how to create them, and some standards for organizing them. A few additional resources for learning more about the subject are listed at the very end.
Why Use It?
In my opinion, there are four fundamental reasons why you should use projects:
- They take less than 30 seconds to set up.
- They keep all relevant files in the same place.
- They set the working directory so you can use relative paths.
- They make it easy to add version control.
In the same way that you might make a unique folder for each class and store all the relevant homework assignments, lecture notes, and slides in it, you can create an R project for each unique assignment you take on. By storing all your relevant code, scripts, data, and output in the same project/directory/folder, you can keep everything organized and isolated from irrelevant documents.
Creating the Project
How long does it take to create a .Rproj? However long it takes for you to decide the folder name and where it should be stored on your computer.
Here’s a simple tutorial on how to create a project:
- Click the project icon in the top right corner of RStudio.
  Note: If this cannot be found, go to the menu bar and click File -> New Project.
- Select New Project.
- Select New Directory.
- Select New Project.
  Note: Certain types of projects (e.g. Shiny) may require a different selection here. New Project will work for essentially everything, but do keep in mind that there are additional options.
- Type the desired name of the project under Directory Name, and select where you want it to be saved under Create project as subdirectory of.
- Press Create Project.
Congratulations, you are done! There are additional steps you can take to incorporate version control, but these are the basics. Now, you can create files as usual (e.g. R scripts, R Markdown documents, etc.) in RStudio, and they'll be stored within this project directory.
If you manually access the location where you stored the project, you will see it represented as a folder with a .Rproj file located inside. The default working directory while working on the project will be this folder.
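To confirm this from the console, you can check the working directory and look for the .Rproj file in it. The snippet below is a minimal sketch using only base R; the helper name has_rproj is hypothetical and is not part of R or RStudio.

```r
# Hypothetical helper: TRUE if `dir` contains at least one .Rproj file.
has_rproj <- function(dir = getwd()) {
  length(list.files(dir, pattern = "\\.Rproj$")) > 0
}

# With a project open, getwd() should show the project folder
# and has_rproj() should return TRUE.
getwd()
has_rproj()
```

If has_rproj() returns FALSE, you are most likely working outside of a project, or the working directory was changed during the session.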
Accessing the Project After Creation
After a project has been created, you can reopen it by doing the following:
- Click the project icon in the top right corner of RStudio.
- Select Open Project and navigate to the .Rproj file. If the project was opened recently, it will appear directly in the dropdown menu.
Alternatively, if you find the .Rproj file on your computer, you can open it from there.
Applying the Benefits
As mentioned previously, creating a project “sets the working directory so you can use relative paths.” Let’s clarify what this means.
Without the use of projects, you may have run into issues with the working directory.
Previously, while trying to load in a dataset, you may have run into errors like the following:
read.csv('Dataset.csv')
## Warning in file(file, "rt"): cannot open file 'Dataset.csv': No such file or
## directory
## Error in file(file, "rt"): cannot open the connection
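When you hit this error, it helps to check where R is actually looking for the file. Below is a minimal sketch in base R that fails with a more informative message; read_csv_checked is a hypothetical helper name, not an existing function.

```r
# Hypothetical helper: read a CSV, but if the file is missing, stop with a
# message that shows which working directory the path was resolved against.
read_csv_checked <- function(path) {
  if (!file.exists(path)) {
    stop("Cannot find '", path, "' from working directory: ", getwd())
  }
  read.csv(path)
}
```

Calling read_csv_checked('Dataset.csv') from the wrong folder then reports the working directory directly, instead of only saying that the connection cannot be opened.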
Afterwards, you may have been forced to set the working directory with setwd(), or to specify the full path to the dataset (e.g. read.csv('C:/Users/Name/Desktop/RProject/Data/Dataset.csv')). However, what happens when someone else tries to use your code? They have to manually change the file path to match their own computer, and they may run into issues.
Projects resolve this issue.
When you use a project, the working directory is automatically set to the folder you created. So, whenever you create a file (data, scripts, figures, etc.) that is relevant to your assignment, store it in this directory. Then, if the data is stored in a Data folder (see example later in the post), you can use the following code:
# Use of the relative path. Accesses data in the Data folder
read.csv('./Data/Dataset.csv')
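Note that the ‘.’ represents the current working directory, and relative paths are resolved against it rather than against the location of the script. If the working directory is a subfolder instead (for example, R Markdown documents stored in a specific scripts folder are knit from their own folder by default), you have to use ‘..’ to go up a folder first and then access the data folder. An example is shown below.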
# Goes up a folder first, then into the Data folder
read.csv('../Data/Dataset.csv')
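To see how '.' and '..' behave, here is a self-contained sketch that you can run anywhere. It builds a throwaway folder tree inside tempdir() (the names DemoProject, Data, and Scripts are made up for the demo) and reads the same file from two different working directories. Note that setwd() is used here only to simulate those two situations; in a real project you should not need it.

```r
# Build a throwaway project tree (all names are illustrative).
project <- file.path(tempdir(), "DemoProject")
dir.create(file.path(project, "Data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project, "Scripts"), showWarnings = FALSE)
write.csv(data.frame(x = 1:3), file.path(project, "Data", "Dataset.csv"),
          row.names = FALSE)

old_wd <- setwd(project)               # working directory = project root
from_root <- read.csv("./Data/Dataset.csv")

setwd(file.path(project, "Scripts"))   # working directory = Scripts folder
from_scripts <- read.csv("../Data/Dataset.csv")

setwd(old_wd)                          # restore the original working directory
identical(from_root, from_scripts)     # both paths reach the same file
```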
Since this uses relative paths (./Data/Dataset.csv instead of the absolute path C:/Users/Name/Desktop/RProject/Data/Dataset.csv), other users can run your scripts without changing them. It also prevents issues in the future if you move the project folder to a different location, since the paths inside it stay the same.
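If you prefer not to type the separators by hand, base R's file.path() can assemble the relative path for you. This is only a small sketch; the folder and file names are the same illustrative ones used above.

```r
# Build the relative path from its parts.
data_path <- file.path("Data", "Dataset.csv")
data_path

# Show where R will look for the file on the current machine:
# the working directory followed by the relative path.
file.path(getwd(), data_path)
```

The first value is the same on every computer, while the second changes with the location of the project folder, which is exactly why scripts should only contain the first.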
Project Organization
There are many different ways to organize the project directory, so the structure you choose should be based on your needs and those of your organization.
Here is an extremely basic example:
All your data will go into the Data folder, all the code (R scripts, RMD, etc.) goes into the Code folder, and a description of the project will be written into the README.md file. The README.md file is a markdown file, and typically it discusses the purpose of the project, the versions of the programming language/packages used, and other necessary information.
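This layout can also be created from the R console. The following is a minimal sketch in base R; project_dir is a placeholder, and a temporary folder stands in for your real project folder so the code can be run safely as-is.

```r
# `project_dir` is a placeholder; replace it with your own project folder.
project_dir <- file.path(tempdir(), "MyProject")

# Create the Data and Code folders and an empty README.md file.
dir.create(file.path(project_dir, "Data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project_dir, "Code"), showWarnings = FALSE)
file.create(file.path(project_dir, "README.md"))

# List what was created.
list.files(project_dir)
```

Inside an open project, you could use getwd() in place of the temporary folder so that the folders are created in the project directory itself.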
More detailed organizations and descriptions can be found in the resources section.
Resources
To learn more about creating/using R Projects:
To learn about integrating version control:
Data management and organization (how should you structure folders, where do you place scripts vs plots, etc.):