This chapter introduces agent-based modeling in socmod, starting with basic functional and object-oriented programming in R. It assumes knowledge of R basics, which can be gained by studying, for example, Hands-On Programming with R or another introductory R tutorial.
To model social behavior it helps to have intuitive software tools grounded in an empirically motivated, self-consistent scientific theory. socmod provides those tools, but to put them to best use it’s helpful to know how it works, especially for people used to using R in a data science context. Many of the same tools transfer, especially the declarative/functional programming used and encouraged by the tidy approach in R for Data Science.
To create simulations that generate social behavior we have to go beyond tabular representations of the world to more complex representations of people and their interactions that can be measured. These measurements are recorded in a standard tabular format, e.g., CSV, which can then be analyzed using tidy/R for Data Science strategies.
In agent-based modeling, we create software representations of simulated people, i.e. agents, or other interacting entities, including the agents’ environment. Object-oriented programming is the natural choice for software design in this case because it provides a structure for defining custom objects like socially interacting agents. An object is one bit of data that could be a number or character type, but could also be something more complicated. In object-oriented programming we define custom classes that specify various data fields and function methods for maintaining and modifying the state of objects. The fields and methods of a class therefore are sets of related data and functions to represent things in the world.
To summarize, objects are a way to track and modify the state of different software entities. We can create simulations of real-world systems by defining custom object types called classes whose state and behaviors are modeled on relevant real-world features and behaviors.
In socmod, we define custom R objects for agent-based models of social behavior using the R6Class function, which creates a new object type, i.e., a new class. We use tidy-style functional programming where helpful to represent model features and dynamics.
In the remainder of this chapter, we first review variable assignment and data structures in R, including vectors, lists, data.frames, and tibbles. We then review relevant topics in functional programming and demonstrate how to create custom classes in R with the R6 library. The tutorial closes with a demonstration of a simple four-agent agent-based model of social behavior, kept simple to highlight the functional and object-oriented design patterns in socmod, represented by the Agent, AgentBasedModel, and Trial classes defined using the R6Class function.
In R, there is a hierarchy of data collection types that are necessary to know about for functional programming, which often involves applying a function across every element of a collection of data. Data collections include vectors, lists, data.frames, and tibbles, which we’ll cover here.
In computer science, a collection is an abstract data type that organizes and stores instances of other data types. In R, the most basic data types are numbers (e.g., double or integer) and text, whose data type is character in R whether it’s a single character or several:
print(typeof('c'))
[1] "character"
print(typeof("social science is kewl"))
[1] "character"
R vectors are defined using the c() function, e.g., vec <- c(0, 0, 1). The data type of all vector elements must be the same. R enforces this by coercing elements to a common type, for example:
print(c(0, 0, 1))
[1] 0 0 1
print(c(0, 0, "yo"))
[1] "0" "0" "yo"
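As a small illustration of the coercion rule above, the following sketch mixes element types in c() and inspects the result with typeof(). It uses only base R; the ordering logical, integer, double, character is standard R behavior, and the particular values are arbitrary.

```r
# Mixing types in c() coerces every element to the most flexible type present.
t1 <- typeof(c(TRUE, 2L))    # logical + integer   -> "integer"
t2 <- typeof(c(2L, 3.5))     # integer + double    -> "double"
t3 <- typeof(c(3.5, "yo"))   # double + character  -> "character"
print(c(t1, t2, t3))

# Coercion can also be requested explicitly.
nums <- as.numeric(c("0", "1"))  # character -> double
print(nums)
```

If a vector unexpectedly prints with quotation marks around its numbers, a stray character element is the likely cause.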
We index vectors using single square brackets, with R indexing starting from 1:
vec <- c(1, 2, 3, 4, 5) # equiv to vec <- 1:5 or seq(1, 5)
print(vec[1])
[1] 1
print(vec[length(vec)])
[1] 5
If one wants to keep elements of different types in a single collection, use the R list:
l <- list(0, 542, "yo")
print(l)
[[1]]
[1] 0
[[2]]
[1] 542
[[3]]
[1] "yo"
Note the visual representation has changed on printout. We now need to use double-square brackets to index list elements themselves:
print(l[[3]])
[1] "yo"
If we leave off one of the square brackets we effectively get a sub-list with just one element:
print(l[3])
[[1]]
[1] "yo"
This is useful if you want to create a new sub-list with more than one element:
print(l[c(2, 3)])
[[1]]
[1] 542
[[2]]
[1] "yo"
Named lists are the primary key-value store in R, much like dict in Python. They allow us to label entries of the list and access them using double square brackets with the character name or using the $ access operator:
named_l <- list(a = c(0, 5, 6), b = c(7, 8, 9))
print(named_l$a == named_l[["a"]]) # compares element-by-element
[1] TRUE TRUE TRUE
print(all(named_l$b == named_l[["b"]]))
[1] TRUE
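To round out the key-value picture, here is a minimal sketch of adding, overwriting, and inspecting keys in a named list. The list and its keys (agent_info, name, fitness, behavior) are made up for illustration and are not socmod objects.

```r
# A named list used as a key-value store (keys here are illustrative only).
agent_info <- list(name = "Matt", fitness = 10)

# Add a new key with $, or overwrite an existing one with [[ ]].
agent_info$behavior <- "Adaptive"
agent_info[["fitness"]] <- 12

# List the keys and check whether a particular key is present.
keys <- names(agent_info)
print(keys)
print("behavior" %in% keys)
```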
tibble (and data.frame)
Both the tibble and data.frame classes represent tabular data, meaning data that can be represented in table format, e.g., in comma- or tab-separated value format. There are some subtle differences listed below, but we’ll use tibble for representing our tables.
Tibbles are like fancy lists that have special properties that make data manipulation more efficient. The details of how this works aren’t super important at this point. The important thing to know is how one can initialize tibbles and access tibble columns similarly to lists, as shown in the following example:
tbl <- tibble::tibble(a = c(0, 5, 6), b = c(7, 8, 9))
print(tbl$a)
[1] 0 5 6
print(tbl[["b"]])
[1] 7 8 9
Using tibbles ensures that data manipulation and analysis using the tidyverse will work as expected. For example, the tidyverse provides functions for analyzing different groups of data within a larger dataset. This is a common data analysis pattern called split-apply-combine, which in the tidyverse translates to group-by and summarise when using the dplyr library. For example, we can calculate the mean of measurements in “experimental” and “control” conditions in some fake data:
library(magrittr) # provides the %>% pipe
observations <- tibble::tibble(
  condition = c("experimental", "experimental", "control", "control"),
  measurement = c(13.5, 14.6, 3.4, 5.4)
)
mean_measurement_tbl <-
  observations %>%
  dplyr::group_by(condition) %>%
  dplyr::summarise(mean_measurement = mean(measurement))
print(mean_measurement_tbl)
# A tibble: 2 × 2
condition mean_measurement
<chr> <dbl>
1 control 4.4
2 experimental 14.0
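The same group-by and summarise pattern extends to several summaries at once. The sketch below reuses the fake data from above and assumes R 4.1 or later for the native |> pipe; the output column names (n_obs, sd_measurement) are illustrative choices.

```r
observations <- tibble::tibble(
  condition = c("experimental", "experimental", "control", "control"),
  measurement = c(13.5, 14.6, 3.4, 5.4)
)

# Split by condition, then compute several per-group summaries in one call.
summary_tbl <- observations |>
  dplyr::group_by(condition) |>
  dplyr::summarise(
    n_obs = dplyr::n(),                    # number of rows in each group
    mean_measurement = mean(measurement),
    sd_measurement = sd(measurement)
  )
print(summary_tbl)
```

Each row of the result corresponds to one group, so the table stays tidy and can be passed directly to plotting or further analysis.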
data.frame and tibble
The tibble library provides a table data representation (also called tibble) that is a bit more flexible and intuitive than the R built-in data.frame. One reason I prefer tibbles is that data.frames in versions of R before 4.0.0 automatically convert strings to factors unless you tell them not to. Tibbles never do, so you’re never surprised.
Column naming is also more flexible with tibbles. By default, data.frame() converts column names into syntactically valid R names, while tibbles keep column names that include spaces or other non-standard characters. I often use this feature to create column names like Mean adaptation success that print nicely when used as labels in ggplot.
Finally, subsetting a tibble with single square brackets always returns a tibble. For example, if df were a data.frame, the operation df[, 1] would return a vector by default. If it were a tibble it would return another tibble, making behavior more predictable in data analysis pipelines.
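The naming and subsetting differences can be checked directly. This is a minimal sketch using default arguments to data.frame() and tibble::tibble(); the column names and values are invented for illustration.

```r
df  <- data.frame("Mean adaptation success" = c(0.2, 0.9), b = c(1, 2))
tbl <- tibble::tibble("Mean adaptation success" = c(0.2, 0.9), b = c(1, 2))

# data.frame() rewrites the non-syntactic name by default; tibble keeps it.
df_name  <- names(df)[1]
tbl_name <- names(tbl)[1]
print(c(df_name, tbl_name))

# Single-bracket column selection: a vector from the data.frame,
# another tibble from the tibble.
df_col  <- df[, 1]
tbl_col <- tbl[, 1]
print(is.numeric(df_col))
print(tibble::is_tibble(tbl_col))
```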
Functional programming is especially useful for writing code that applies the same operation to several inputs but leaves it up to the user to specify exactly which function should be applied. In socmod we pass functions as arguments to other functions to specify how agents pick their interaction partners (i.e., teachers in a learning context) and how social learning works. Here are some simple examples demonstrating the key concepts and techniques of functional programming that could help with socmod programming.
In R, and other programming languages supporting the functional style, one can treat functions like any other data and pass them as arguments to other functions. A function that accepts a function as an argument is called a higher-order function. Here is a simple example of a higher-order function that takes some data as a first argument and applies the function f to that data twice, putting the result of each calculation in a two-element vector.
repeat_2_higher_order_func <- function(data, f) {
  return (c(f(data), f(data)))
}
Below we call this higher-order function by providing data = 2 as the first argument and an anonymous function as the second argument, which can be written using \(arg) { ...function body... } syntax:
# \(x) {...} is equivalent to function(x) { ... }; these are
# anonymous functions.
# Expecting to return c(4, 4)
repeat_2_higher_order_func(data = 2, f = \(x) { return (x * 2) })
[1] 4 4
The output above is how R prints a two-element vector, so our expectation was met. For comparison, here is c(4, 4) printed directly:
c(4, 4)
[1] 4 4
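To connect higher-order functions to the partner-selection idea mentioned earlier, here is a rough sketch in which the caller supplies the selection rule as a function. The names choose_teacher, select_fn, and candidates are hypothetical and are not part of the socmod API.

```r
# Hypothetical helper, not part of socmod: choose a teacher for a learner
# using whatever selection rule the caller passes in as `select_fn`.
choose_teacher <- function(candidates, select_fn) {
  return (select_fn(candidates))
}

# Candidate teachers, stored as a named vector of made-up fitness values.
candidates <- c(ana = 1.0, bo = 3.5, cy = 2.0)

# Rule 1: always pick the candidate with the highest fitness.
best_teacher <- choose_teacher(candidates, \(x) names(x)[which.max(x)])
print(best_teacher)

# Rule 2: pick uniformly at random, ignoring fitness.
random_teacher <- choose_teacher(candidates, \(x) sample(names(x), 1))
print(random_teacher)
```

The body of choose_teacher never changes; swapping the function argument is enough to change the model's behavior.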
map: a common, useful higher-order function used often in socmod
One of the most useful and common higher-order functions is the map function. This function maps a function, denoted .f below, onto every element in a collection, denoted .x. This notation is from the map family of functions in the purrr library for tidy functional programming.
library(purrr)
Here is an example of mapping an anonymous function that multiplies its input by 3 onto a vector with entries 2 and 8:
# Now we use map_vec, which applies .f to every element of .x,
# expecting the following to return a vector with elements 3*2
# and 3*8, i.e., c(6, 24).
purrr::map_vec(.x = c(2, 8), .f = \(x) { return(3 * x)})
[1] 6 24
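For comparison with map_vec, the sketch below shows a few other members of the purrr map family. map() always returns a list, while the typed variants return an atomic vector of a fixed type and raise an error if .f returns something that cannot be stored in that type.

```r
library(purrr)

# map() returns a list with one element per input.
as_list <- map(c(2, 8), \(x) 3 * x)
print(as_list)

# Typed variants return atomic vectors of a known type.
as_dbl <- map_dbl(c(2, 8), \(x) 3 * x)            # numeric vector
as_chr <- map_chr(c("a", "b"), \(s) toupper(s))   # character vector
as_lgl <- map_lgl(c(2, 8), \(x) x > 5)            # logical vector
print(as_dbl)
print(as_chr)
print(as_lgl)
```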
If we want to apply a function that takes two variables, the value and index of an element in a collection, we can use the imap family of functions as follows to return a vector that contains the original element multiplied by its place in the input vector:
input_vec <- c(2, 5050, 6)
purrr::imap_vec(input_vec, \(el, idx) { return (idx * el) })
[1] 2 10100 18
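One detail worth knowing about the imap family: the second argument is the integer position only when the input is unnamed. When the input has names, the name is passed instead. A minimal sketch with arbitrary values:

```r
# Unnamed input: the second argument is the integer position.
by_position <- purrr::imap_chr(c(10, 20), \(el, idx) paste0(idx, ":", el))
print(by_position)

# Named input: the second argument is the element's name.
by_name <- purrr::imap_chr(c(a = 10, b = 20), \(el, nm) paste0(nm, ":", el))
print(by_name)
```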
Classes encapsulate distinct but related processes, behaviors, data, attributes, and other types of information in a single object, i.e., a software representation of an instance of that entity. R6 is a library for creating our own custom classes, which serve as abstract templates that specify what distinguishes different types of objects/entities. Below we first show how to create a new agent, i.e., a new instance of the Agent class that is provided by socmod. After that is a fun example of how we can design a different social behavioral model, one of football/soccer matches. We write classes for players and teams and develop a play_match(team1, team2) function that pits two teams against each other.
Agent class in socmod
Below we create a new instance of the Agent class using the class constructor, socmod::Agent$new(), a function that creates a new instance of the class.
a1 <- socmod::Agent$new(1, name = "Matt",
                        behavior = "Adaptive",
                        fitness = 1e6)
We can use the access operator, $ in R, to access the fields (i.e., attributes) of agent a1 through its getter methods like so:
print(a1$get_fitness())
[1] 1e+06
print(a1$get_name())
[1] "Matt"
print(a1$get_id())
[1] 1
print(a1$get_behavior())
[1] "Adaptive"
We can also use purrr::map-style functions over neighbors like so:
# Assign neighbors to a1.
a1$set_neighbors(c(
  socmod::Agent$new(id = 2, name = "n2"),
  socmod::Agent$new(id = 3, name = "n3")
))
# Get the list of neighbors back to check it worked.
neighbors <- a1$get_neighbors()
print(class(neighbors)) # should be [1] "Neighbors" "R6"
[1] "Neighbors" "R6"
# Neighbors$map() returns a list...
neighbor_names <- neighbors$map(\(n) n$get_name())
print(neighbor_names)
[[1]]
[1] "n2"
[[2]]
[1] "n3"
# ...use `unlist` to convert it to a vector:
print(unlist(neighbor_names))
[1] "n2" "n3"
In class we started creating our model of a soccer player agent called Footballer, defined below. We ran out of time at the end of class to fully write methods for Footballer, i.e., ways that a football player could interact with the world, or that the world could act upon a football player, and only had time to create stubs for two methods. A stub is a minimal chunk of code that does little to nothing but doesn’t get in the way by causing errors. It enables us to document our plans for future development in the exact place where it would happen in the code.
Below we have stubs for the scored_goal and get_penalty_on_play methods for in-game behaviors. Other possibilities could include get_traded, which would change the player’s team, and perhaps get_signed for cases where a player is a free agent.
library(R6)
Footballer <- R6Class("Footballer",

  public = list(
    # Listing attributes as fields and
    # setting to zero for their definition.
    speed = 0.0,          # units of max km/h
    accuracy = 0.0,       # probability of scoring on a shot
    market_value = 0.0,
    aggressiveness = 0.0, # units of penalties per match
    team = "",
    initialize = function(speed = 15,
                          accuracy = 0.2, market_value = 1e6,
                          aggressiveness = 0.5,
                          team = "Free agent") {
      self$speed = speed
      self$accuracy = accuracy
      self$market_value = market_value
      self$aggressiveness = aggressiveness
      self$team = team
    },
    # Stub two Footballer class methods...
    # ...one for scoring a goal in a game...
    scored_goal = function() {
      return (ifelse(runif(1) < self$accuracy, 1, 0))
    },
    # ...and one for getting a penalty in a game.
    get_penalty_on_play = function() {
      return (runif(1) < self$aggressiveness)
    }
  )
)
Team <- R6Class("Team",

  public = list(
    name = "",
    players = list(),
    wins = 0,
    ties = 0,
    payroll = 0,
    games_played = 0,
    # Team name is required, with players optionally specified.
    initialize = function(name, players = list()) {
      self$name = name

      # Initialize players and payroll.
      for (player in players) {
        self$payroll <- self$payroll + player$market_value
        player$team <- name
      }
      self$players <- players
    },
    # Add a player to the roster.
    sign_player = function(player) {
      # Add the player to the team's existing roster.
      self$players <- c(self$players, player)
      # Update payroll.
      self$payroll <- self$payroll + player$market_value
      # Update the player's team to be this team's name.
      player$team <- self$name
    }
  )
)
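The Team methods above set player$team without returning the player. That works because R6 objects have reference semantics: a function that modifies an R6 object changes the original rather than a copy. The sketch below illustrates this with a deliberately tiny, hypothetical Player class and assign_team function that are separate from the classes defined above.

```r
library(R6)

# Hypothetical minimal class used only for this illustration.
Player <- R6Class("Player",
  public = list(
    team = "Free agent"
  )
)

# A plain function that modifies the object it receives.
assign_team <- function(player, team_name) {
  player$team <- team_name
  invisible(player)
}

p <- Player$new()
assign_team(p, "Whales")
print(p$team)   # the original object was changed by the function

# Use $clone() when an independent copy is wanted.
p_copy <- p$clone()
p_copy$team <- "Squirrels"
print(p$team)   # unchanged by edits to the copy
```

This differs from ordinary R values such as vectors and lists, which are copied on modification, and it is what lets teams and matches update shared player and team state.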
footballer <- Footballer$new(accuracy = 0.5)
footballer$scored_goal()
[1] 1
# sum(purrr::map_vec(1:10, \(.) footballer$scored_goal()))
We can define a function that simulates playing a football match in which the team that scores more goals wins, or the match is a tie if both teams have the same score. We assume each player gets ten chances to score a goal. Penalties are not yet tracked in the match. We leave it to the reader as an exercise to make use of the Footballer$get_penalty_on_play() method within the game to add penalties and consequences (e.g., penalty kicks or something), and to use the Team$sign_player() method to add more players to each roster, which should increase the mean total scores in each game based on the simple way we’ve modeled gameplay in play_match.
# Define a function that models a football match.
play_match <- function(team1, team2) {

  # Walk over each player to see how many scores they get,
  # summing to get the total team score.
  team1_score <- sum(
    purrr::map_vec(
      team1$players,
      \(player) {
        # Calculate the total goals scored by the current player.
        sum(purrr::map_vec(1:10, \(.) player$scored_goal()))
      }
    )
  )
  team2_score <- sum(
    purrr::map_vec(
      team2$players,
      \(player) {
        # Calculate the total goals scored by the current player.
        sum(purrr::map_vec(1:10, \(.) player$scored_goal()))
      }
    )
  )
  if (team1_score == team2_score) {
    team1$ties <- team1$ties + 1
    team2$ties <- team2$ties + 1
    cat("Tie game!\n")
  } else if (team1_score > team2_score) {
    cat(team1$name, "wins!!!\n")
    team1$wins <- team1$wins + 1
  } else {
    cat(team2$name, "wins!!!\n")
    team2$wins <- team2$wins + 1
  }
  cat(team1$name, ": ", team1_score, " ",
      team2$name, ": ", team2_score, "\n")

  team1$games_played <- team1$games_played + 1
  team2$games_played <- team2$games_played + 1
}
First we need teams, which we create and to which we assign players using the constructor, Team$new(), in the code below.
# Initialize two teams, Whales and Squirrels, each with two
# players. All players have identical default attributes.
whales <- Team$new(
  name = "Whales",
  players = c(Footballer$new(), Footballer$new())
)
squirrels <- Team$new(
  name = "Squirrels",
  players = c(Footballer$new(), Footballer$new())
)
# Play three matches.
play_match(whales, squirrels)
Whales wins!!!
Whales : 5 Squirrels : 4
play_match(squirrels, whales)
Whales wins!!!
Squirrels : 5 Whales : 6
play_match(squirrels, whales)
Whales wins!!!
Squirrels : 1 Whales : 2
# Print how many games each team won.
cat("\nAfter three games...", "\nThe Whales have won",
    whales$wins, "games and the Squirrels have won", squirrels$wins)
After three games...
The Whales have won 3 games and the Squirrels have won 0
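The earlier claim that adding players should raise mean total scores can be checked with a quick simulation. The sketch below does not use the Footballer or Team classes. It assumes, as play_match does, that each player takes ten independent shots and scores each with probability accuracy, so one player's goals follow a binomial distribution. The function name simulate_team_score and the choice of 2000 replicates are illustrative.

```r
# Hypothetical stand-in for one team's score in play_match: each of
# `n_players` takes `shots` independent shots, scoring with prob `accuracy`.
simulate_team_score <- function(n_players, accuracy = 0.2, shots = 10) {
  sum(rbinom(n_players, size = shots, prob = accuracy))
}

set.seed(1)  # make the random draws reproducible
two_player_scores  <- replicate(2000, simulate_team_score(2))
four_player_scores <- replicate(2000, simulate_team_score(4))

# Expected totals are n_players * shots * accuracy, i.e., 4 and 8 here.
print(mean(two_player_scores))
print(mean(four_player_scores))
```

Doubling the roster doubles the expected team score under this model, which is why signing players with Team$sign_player() should shift match scores upward.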