Let's take a hypothetical example where you have scraped data from a website displaying football/soccer results. You find yourself with this data:
df <- data.frame(
  match_id = c(1, 2, 3, 4),
  home_team = c("Chelsea", "Arsenal", "Liverpool", "Fulham"),
  away_team = c("Newcastle", "West Ham", "Tottenham", "Everton"),
  score = c("2-2", "2-1", "1-1", "1-3")
)
df
match_id  home_team  away_team  score
1         Chelsea    Newcastle  2-2
2         Arsenal    West Ham   2-1
3         Liverpool  Tottenham  1-1
4         Fulham     Everton    1-3
Suppose you want to split the score column into two columns: one for home goals scored and another for away goals scored. The tidyr::separate function does this. Its main arguments are:
data | the data frame or tibble
col | the source column to split
into | the names of the target columns that receive the split data
sep | the separator to split the column by
# install the tidyr package if not installed
# install.packages('tidyr')
tidyr::separate(data = df,
                col = 'score',
                into = c('home_score', 'away_score'),
                sep = '-')
match_id  home_team  away_team  home_score  away_score
1         Chelsea    Newcastle  2           2
2         Arsenal    West Ham   2           1
3         Liverpool  Tottenham  1           1
4         Fulham     Everton    1           3
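One thing to watch for: the new columns are split out of a text column, so they hold text ("2") rather than numbers. The sketch below is one way to get numeric goal counts, assuming that is what you need. It uses the convert argument of tidyr::separate, which is not used in the example above. The goal_diff column is a made-up name for illustration.

```r
library(tidyr)

# Same example data as above
df <- data.frame(
  match_id = c(1, 2, 3, 4),
  home_team = c("Chelsea", "Arsenal", "Liverpool", "Fulham"),
  away_team = c("Newcastle", "West Ham", "Tottenham", "Everton"),
  score = c("2-2", "2-1", "1-1", "1-3")
)

# convert = TRUE asks tidyr to run type conversion on the new columns,
# so "2" becomes the integer 2
scores <- separate(df,
                   col = "score",
                   into = c("home_score", "away_score"),
                   sep = "-",
                   convert = TRUE)

# With numeric columns, arithmetic works directly
# (goal_diff is a hypothetical column added for illustration)
scores$goal_diff <- scores$home_score - scores$away_score
```

Without convert = TRUE, the subtraction on the last line would fail, because the columns would still be character vectors.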
By default, the target columns replace the source column in the resulting data frame. To retain the source column, set remove = FALSE:
tidyr::separate(data = df,
                col = 'score',
                into = c('home_score', 'away_score'),
                sep = '-',
                remove = FALSE)
match_id  home_team  away_team  score  home_score  away_score
1         Chelsea    Newcastle  2-2    2           2
2         Arsenal    West Ham   2-1    2           1
3         Liverpool  Tottenham  1-1    1           1
4         Fulham     Everton    1-3    1           3
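If you are on a recent tidyr release, it is worth knowing that separate() is marked as superseded there in favour of separate_wider_delim(). The sketch below shows the same split with the newer function, keeping the source column. It assumes tidyr 1.3.0 or later is installed. The argument names (cols, delim, names, cols_remove) differ from those of separate(), so check the documentation for your installed version.

```r
library(tidyr)

# Same example data as above
df <- data.frame(
  match_id = c(1, 2, 3, 4),
  home_team = c("Chelsea", "Arsenal", "Liverpool", "Fulham"),
  away_team = c("Newcastle", "West Ham", "Tottenham", "Everton"),
  score = c("2-2", "2-1", "1-1", "1-3")
)

# Assumes tidyr >= 1.3.0, where separate_wider_delim() is available
wide <- separate_wider_delim(
  df,
  cols = score,                            # source column to split
  delim = "-",                             # treated as a fixed string
  names = c("home_score", "away_score"),   # names of the new columns
  cols_remove = FALSE                      # keep score, like remove = FALSE
)
```

As with separate(), the new columns are character vectors. Convert them with as.integer() if you need numbers.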