15012021
Vectors Overview
Note: #0
R
| 4 Minutes
R
Vectors
A Vector in R is a one-dimensional array that can hold the following data types:
- Numeric
- Character (Strings)
- Logical (Booleans)
Simply put a vector is an easy way of storing a single dimension of data, a vector can be created with the combine function c()
.
numeric_vector <- c(1, 2, 3)
Naming Vectors
You can name a vector by using the names()
function in R like so:
example_vector <- c("Coner Murphy", "Web Developer")
names(example_vector) <- c("Name", "Profession")
When printed out this would yield:
Name Profession
"Coner Murphy" "Web Developer"
However, if you had to repeat this process over many vectors typing in each name, again and again, would get tiring quickly so instead we can store our names in a vector and use that to name our columns.
Take the below example where we have temperatures for multiple cities and we had to name them all:
london_weather <- c(-2, 2, 5, 10, 25)
dublin_weather <- c(-5, 0, 2, 5, 12)
paris_weather <- c(0, 5, 8, 14, 22)
If we were going to name these using the method shown above we would have to do:
names(london_weather) <- c("Jan", "Feb", "Mar", "Apr", "May")
names(dublin_weather) <- c("Jan", "Feb", "Mar", "Apr", "May")
names(paris_weather) <- c("Jan", "Feb", "Mar", "Apr", "May")
This gets tiring fast, imagine if we had to do this for all 12 months and 50 different cities... No one has time for that.
Instead by storing our names in their own vector, we can do something like this:
months_vector <- c("Jan", "Feb", "Mar", "Apr", "May")
names(london_weather) <- months_vector
names(dublin_weather) <- months_vector
names(paris_weather) <- months_vector
How much quicker and nicer is that!
Performing Calculations with Vectors
Performing calculations on vectors is syntactically the same as performing calculations with 2 numbers in normal math but where it slightly differs with R is in how the calculations are made. In the below example is how two vectors would get summed together to produce the final result broken down to show each step.
c(1,3,5) + c(2,4,6)
c(1 + 2, 3 + 4, 5 + 6)
c(3, 7, 11)
Essentially you take the same indexed value from both vectors and perform the chosen calculation on them to get a result for each pair. This then yields the same length of vector as was the input vectors.
You can also perform calculations using vectors stored in variables like so:
a <- c(1,3,5)
b <- c(2,4,6)
c <- a + b
# c = c(3, 7, 11)
Bringing it together
Imagine a scenario where you are running a business with 2 storefronts and you want to see your overall profit/loss across the business for the entire week. You can do this like so:
store_1_vector <- c(100, -50, 200, 10, 400)
store_2_vector <- c(60, -20, 150, 90, 350)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(store_1_vector) <- days_vector
names(store_2_vector) <- days_vector
total_profit_loss <- store_1_vector + store_2_vector
# total_profit_loss
# Monday Tuesday Wednesday Thursday Friday
# 160 -70 350 100 750
Okay but now what if you wanted to see the entire week in one figure to determine if you made a profit or loss for the overall week? This is where the sum()
function comes into play.
Let's take the same example from above but add in the sum()
function to see the overall figures:
store_1_vector <- c(100, -50, 200, 10, 400)
store_2_vector <- c(60, -20, 150, 90, 350)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(store_1_vector) <- days_vector
names(store_2_vector) <- days_vector
total_store_1 <- sum(store_1_vector)
total_store_2 <- sum(store_2_vector)
total_profit_loss <- total_store_1 + total_store_2
# total_profit_loss = 1290
Other than the + operator, you can also perform the other basic mathematical operators such as -, / and *.
But, you can also use the comparison operators as well, so following on from our example if we wanted to compare which shop performed better we could do:
store_1_better <- total_store_1 > total_store_2
# store_1_better = True
Selecting elements
When it comes to selecting elements from a vector we need to keep one piece of vital information to hand, vectors are not 0-indexed like many over programming languages.
For example, in JavaScript if you wanted to select the first element of an array you would do something like:
const arr = [1,2,3,4,5];
const firstElement = arr[0]
// firstElement = 1
Notice how we use 0 to grab element 1 of the array in JavaScript, this is the same for many other languages but not in R. In R we would do:
arr <- c(1,2,3,4,5)
firstElement <- arr[1]
# firstElement = 1
So, looking back at our example from earlier, if we wanted to say grab the total_profit_loss for Friday across both our stores we could do it like:
store_1_vector <- c(100, -50, 200, 10, 400)
store_2_vector <- c(60, -20, 150, 90, 350)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(store_1_vector) <- days_vector
names(store_2_vector) <- days_vector
total_profit_loss <- store_1_vector + store_2_vector
total_profit_loss_friday <- total_profit_loss[5]
print(total_profit_loss_friday)
# Friday
# 750
Okay, this is all well and good just getting the results from Friday but what if I want to get the results from Monday as well to see the opening and closing positions of the week.
No problem, here you go:
store_1_vector <- c(100, -50, 200, 10, 400)
store_2_vector <- c(60, -20, 150, 90, 350)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(store_1_vector) <- days_vector
names(store_2_vector) <- days_vector
total_profit_loss <- store_1_vector + store_2_vector
total_profit_loss_monday_friday <- total_profit_loss[c(1,5)]
print(total_profit_loss_monday_friday)
# Monday Friday
# 160 750
To select multiple entries from a vector we pass the entries we wish to access to the square brackets in another vector, so if you really wanted to, you could also do something like:
store_1_vector <- c(100, -50, 200, 10, 400)
store_2_vector <- c(60, -20, 150, 90, 350)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(store_1_vector) <- days_vector
names(store_2_vector) <- days_vector
total_profit_loss <- store_1_vector + store_2_vector
elements_to_fetch <- c(1,5)
total_profit_loss_monday_friday <- total_profit_loss[elements_to_fetch]
print(total_profit_loss_monday_friday)
# Monday Friday
# 160 750
This would yield the exact same result but the vector defining the elements you're fetching has been split out into its own variable.
Okay, hotshot what about getting every day in the week apart from Monday?
Once again, no sweat; we could do something like:
store_1_vector <- c(100, -50, 200, 10, 400)
store_2_vector <- c(60, -20, 150, 90, 350)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(store_1_vector) <- days_vector
names(store_2_vector) <- days_vector
total_profit_loss <- store_1_vector + store_2_vector
total_profit_loss_exc_monday <- total_profit_loss[c(2,3,4,5)]
print(total_profit_loss_exc_monday)
# Tuesday Wednesday Thursday Friday
# -70 350 100 750
But, let's face it. No one wants to spend their life defining each element you want to fetch. Imagine if you had to go up to 100 that would not be fun.
So, instead, R is nice and gives us a lazy way of doing it, like so:
store_1_vector <- c(100, -50, 200, 10, 400)
store_2_vector <- c(60, -20, 150, 90, 350)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(store_1_vector) <- days_vector
names(store_2_vector) <- days_vector
total_profit_loss <- store_1_vector + store_2_vector
total_profit_loss_exc_monday <- total_profit_loss[2:5]
print(total_profit_loss_exc_monday)
# Tuesday Wednesday Thursday Friday
# -70 350 100 750
This yields the exact same results but just means we only have to type in the starting element (2) and the finishing element (5), how sweet!
Finally, to round up selecting elements in vectors by indexes and names, we can select values by using the name we assigned to it. For example, if we wanted to grab the value for Monday we could do:
store_1_vector <- c(100, -50, 200, 10, 400)
store_2_vector <- c(60, -20, 150, 90, 350)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(store_1_vector) <- days_vector
names(store_2_vector) <- days_vector
total_profit_loss <- store_1_vector + store_2_vector
total_profit_loss_monday <- total_profit_loss[c("Monday")]
print(total_profit_loss_monday)
# Monday
# 160
Selecting elements by comparison
Thinking back to our scenario as the business owner, what if you set your stores a collective target of 300 profit/day, is there a way R can show us if they hit this target for every day of the week because no-one wants to be doing manual calculations, do they?
In fact, there is! Isn't this R stuff great?
store_1_vector <- c(100, -50, 200, 10, 400)
store_2_vector <- c(60, -20, 150, 90, 350)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(store_1_vector) <- days_vector
names(store_2_vector) <- days_vector
total_profit_loss <- store_1_vector + store_2_vector
hit_target <- total_profit_loss > 300
print(hit_target)
# Monday Tuesday Wednesday Thursday Friday
# FALSE FALSE TRUE FALSE TRUE
What we did is checked each value in the vector against our target using the greater than operator (>) from this we then returned the boolean value the comparison yielded allowing us to see that on Wednesday and Friday we hit our target of 300, how easy was that!
But, we can go one step further! By filtering our original total_profit_loss
vector down to just show the days that we passed our target on and the profit we got that day.
store_1_vector <- c(100, -50, 200, 10, 400)
store_2_vector <- c(60, -20, 150, 90, 350)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(store_1_vector) <- days_vector
names(store_2_vector) <- days_vector
total_profit_loss <- store_1_vector + store_2_vector
hit_target <- total_profit_loss > 300
hit_target_days <- total_profit_loss[hit_target]
print(hit_target_days)
# Wednesday Friday
# 350 750
No sweat at all, so not only have we selected what days we hit the target, we have filtered the original vector down to show just the days we hit the target and what we got on those days.
Conclusion
This was just an overview of vectors in R and how to select elements from within them. Overall, vectors are just the tip of the iceberg of R and I can't wait to get into more R.
If you have found this interesting please consider sharing it with others on social media and if you want to see more content like this please consider signing up to my newsletter below.