Last updated on 2024-07-09
Overview
Questions
- What is programming?
- What is object oriented programming?
- How do you document code?
- What is a directory?
Objectives
- Learn basic concepts of programming
What is programming?
Programmers use programming languages to give instructions to their computers. In this course, we will learn how to use the open source language R to complete common tasks required in the field of official statistics. This includes the basics of R, data manipulation, and best practices.
There are a few reasons why programming with R is useful for official statistics. Data manipulation and analysis with R is:
- Time-saving: R can complete many computations on a large amount of data that would take a person a long time to do manually
- Reproducible: The code can be re-run with other data after small modifications, and shared with others to be applied to new purposes
- Transparent: When you’ve completed a script using best practices, you should be left with a clear list of instructions to complete the data analysis in the form of code. This avoids “black boxes” where an analyst is unsure what they’ve done to the data to get it to its final form
R is an object oriented programming language
Object oriented programming languages use objects as their main tools. These objects have classes, which describe their general properties. For example, in R you might work with numeric objects, which contain numbers. You could also work with characters, which are composed of text. We’ll explore classes and data types thoroughly in Episode 3 (Data Types and Structures). We can assign “labels” to these objects, creating a variable, and then use the label and the object interchangeably. We assign objects with an assignment operator. In R, the most commonly used assignment operator is <-. Try reproducing the example below on your machine by entering the code into the console and hitting the “run” button.
R
# Assign a number to a variable
number_flowers <- 8
# Print the variable's contents
print(number_flowers)
We can get the value stored within the variable by printing it.
[1] 8
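To make the idea of classes mentioned above a little more concrete, here is a minimal sketch using the base R function class(), which reports the class of an object. The variable flower_name and its value are hypothetical additions for illustration; only number_flowers comes from the example above.

```r
# A numeric object: a number stored under the label number_flowers
number_flowers <- 8
class(number_flowers)

# A character object: text stored under a hypothetical label
flower_name <- "tulip"
class(flower_name)
```

The first call should report "numeric" and the second "character".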
Assigning a new value to a variable breaks the connection with the old value; R forgets that number and applies the variable name to the new value.
When you assign a value to a variable, R only stores the value, not the calculation you used to create it. This is an important point if you’re used to the way a spreadsheet program automatically updates linked cells. Let’s look at an example.
R
# Reassign the variable
number_flowers <- 7
# Print the new contents
print(number_flowers)
OUTPUT
[1] 7
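The contrast with a spreadsheet can be sketched with a second variable that is computed from the first. This is only an illustration: number_petals and the figure of 5 petals per flower are hypothetical and not part of the lesson's data.

```r
# Start with 8 flowers and compute a value from it
number_flowers <- 8
number_petals <- number_flowers * 5  # hypothetical: 5 petals per flower

# Reassign the original variable
number_flowers <- 7

# number_petals keeps the value computed earlier; it is not recalculated
print(number_petals)
```

Here print(number_petals) shows 40 rather than 35, because R stored the result of the multiplication and not the formula. To update number_petals, the calculation would have to be run again.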
Variable Naming Conventions
Historically, R programmers have used a variety of conventions for naming variables. The . character in R can be a valid part of a variable name; thus an assignment such as weight.kg <- 57.5 is perfectly valid. This is often confusing to R newcomers who have programmed in languages where . has a more significant meaning. Today, most R programmers 1) start variable names with lowercase letters, 2) separate words in variable names with underscores, and 3) use only lowercase letters, underscores, and numbers in variable names. The Tidyverse Style Guide includes a section on this and other style considerations.
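As a short sketch of the two naming styles, the example below creates one variable in each. The underscore name weight_kg and the value 60 are assumptions added for illustration; the point is only that R treats the dot as an ordinary character in a name.

```r
# Style followed by most R programmers today: lowercase words joined by underscores
weight_kg <- 57.5

# Older style: the dot is simply part of the name, so this is a separate variable
weight.kg <- 60

# The two names refer to two independent objects
print(weight_kg)
print(weight.kg)
```

Because the names differ, changing one variable leaves the other untouched. Picking a single convention and using it throughout a script avoids this kind of near-duplicate.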
Documenting Code
Notice that in the above examples, hashtags (#) are used before giving instructions that are intended for you rather than R. Hashtags produce comments, which are handy for leaving information about the code that will follow. Commenting as much code as possible is part of best practices. Always comment your code! You owe it to your colleagues who may see your code (not to mention your future coding self).
R
# Hashtags go before commented code, which is not run
# print("This code will not be run")
print("Always comment your code!")
OUTPUT
[1] "Always comment your code!"
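Comments do not have to sit on their own line. The sketch below, with hypothetical variable names, shows a comment placed after code on the same line; R ignores everything from the hashtag to the end of that line.

```r
# A comment on its own line describes the step that follows
number_flowers <- 8

number_vases <- 2  # a comment can also follow code on the same line

# Only the code before a hashtag is run, so the division below is evaluated
flowers_per_vase <- number_flowers / number_vases
print(flowers_per_vase)
```

Running this prints 4, which confirms that the trailing comment did not interfere with the assignment.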
Directories
A directory is a location on your machine. Say you’d like to open a file that’s located in a folder on your computer. We need to tell R where to look for the file if we expect R to find it. Directories are usually listed by referencing nested folders separated by slashes. There are small differences due to operating system (OS), so refer to documentation specific to your OS when learning to work with folder structures.
For example: /Users/Documents/Learning-R points to a folder called “Learning-R” in a user’s documents folder. Depending on your IDE (Integrated Development Environment) and setup, you can print your current directory, known as the working directory. R automatically reads and writes files from and to your current working directory.
R
# print current working directory
getwd()
OUTPUT
[1] "/Users/Documents/
Before beginning our lessons, please set your working directory to the folder that we created in the setup section with setwd(). For example, if your folder is named Learning-R:
R
# change current working directory
setwd("~/Documents/Learning-R")
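If you would like to try changing directories without touching your real folders, the following sketch may help. It is an illustration under stated assumptions: it uses R's temporary directory (tempdir()) as a stand-in for your Documents folder, and the names old_dir and lesson_dir are hypothetical. All functions used are from base R.

```r
# Remember the current working directory so it can be restored later
old_dir <- getwd()

# Build a path from folder names; file.path() joins them with "/"
# (tempdir() stands in for a real Documents folder in this sketch)
lesson_dir <- file.path(tempdir(), "Learning-R")

# Create the folder if it does not exist yet, then move into it
if (!dir.exists(lesson_dir)) {
  dir.create(lesson_dir)
}
setwd(lesson_dir)

# Confirm the change: the last part of the path is the folder name
print(basename(getwd()))

# Return to the original working directory
setwd(old_dir)
```

Checking dir.exists() before calling setwd() is a design choice worth keeping, since setwd() stops with an error when the folder cannot be found.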
Key Points
- Programming makes our work faster, more reproducible, and more transparent.
- R is an object oriented programming language
- Document your code with comments
- A working directory is the active location on your computer where R can read and write files