Part 1

I downloaded the “death rates from illicit drug use disorders” data from Our World in Data. I selected this dataset because I am interested in the fact that the United States has one of the highest death rates from illicit drug use in the world.
This is the link to the data.
Load the package needed for the analysis:

library(tidyverse)

Read the data in:

death_rates_from_illicit_drug_use_disorders <- read_csv(here::here("_posts/2022-05-04-project-part-1/death-rates-from-drug-use-disorders.csv"))

Use glimpse to see the data:

glimpse(death_rates_from_illicit_drug_use_disorders)

Rows: 6,840
Columns: 4
$ Entity                                                                   <chr> …
$ Code                                                                     <chr> …
$ Year                                                                     <dbl> …
$ `Deaths - Drug use disorders - Sex: Both - Age: Age-standardized (Rate)` <dbl> …

Use output from glimpse (and View) to prepare the data for analysis:

Create the object regions, a character vector of the countries to keep
Rename the 1st column to Region and the 4th column to DeathsfromDrugUse
Use filter to keep only the rows where Year >= 2010 and Region is in regions
Select the columns to keep: Region, Year, DeathsfromDrugUse
Assign the output to regional_drugdeaths
Print regional_drugdeaths, which displays its first 10 rows

regions <- c("United States",
             "Canada",
             "Australia",
             "Russia",
             "India",
             "China")

regional_drugdeaths <- death_rates_from_illicit_drug_use_disorders %>%
  rename(Region = 1, DeathsfromDrugUse = 4) %>%
  filter(Year >= 2010, Region %in% regions) %>%
  select(Region, Year, DeathsfromDrugUse)

regional_drugdeaths

# A tibble: 60 × 3
   Region     Year DeathsfromDrugUse
   <chr>     <dbl>             <dbl>
 1 Australia  2010              2.94
 2 Australia  2011              3.05
 3 Australia  2012              3.10
 4 Australia  2013              3.23
 5 Australia  2014              3.46
 6 Australia  2015              3.66
 7 Australia  2016              3.80
 8 Australia  2017              3.89
 9 Australia  2018              4.01
10 Australia  2019              4.11
# … with 50 more rows
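
One quick way to confirm that the filter kept every country named in regions is to compare that vector with the Region column and count the rows per country. The sketch below is only an illustration: toy is a small made-up tibble (its values are placeholders, not Our World in Data figures) so that the code runs on its own. With the real data, regional_drugdeaths would take the place of toy.

```r
library(dplyr)
library(tibble)

regions <- c("United States", "Canada", "Australia", "Russia", "India", "China")

# Made-up stand-in for the filtered data; values are placeholders, not OWID figures
toy <- tibble(
  Region = c("Australia", "Australia", "Canada", "Canada"),
  Year = c(2010, 2011, 2010, 2011),
  DeathsfromDrugUse = c(1, 2, 3, 4)
)

# Names in `regions` that have no rows in the data
# (a misspelled or differently named country would show up here)
missing_regions <- setdiff(regions, unique(toy$Region))
missing_regions

# Number of rows (one per year) kept for each country
rows_per_region <- toy %>% count(Region)
rows_per_region
```

With the real data, an empty missing_regions and the same row count for every country would suggest the filter behaved as intended.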

Check that the total for 2019 equals the total in the graph:

regional_drugdeaths %>%
  filter(Year == 2019) %>%
  summarise(total_rate = sum(DeathsfromDrugUse))

# A tibble: 1 × 1
  total_rate
       <dbl>
1       34.1

They do indeed match!

Add a picture:

Write the data to a file in the project directory:

write_csv(regional_drugdeaths, file = "regional_drugdeaths.csv")
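
To check that a file written this way can be read back unchanged, a round trip through a temporary file is one option. This is a minimal sketch under assumptions: toy and path are hypothetical names, the values are made up, and a temporary file is used instead of the project directory so nothing is overwritten.

```r
library(readr)
library(tibble)

# Made-up example rows; not the real regional_drugdeaths values
toy <- tibble(
  Region = c("Australia", "Canada"),
  Year = c(2019, 2019),
  DeathsfromDrugUse = c(1.5, 2.5)
)

# Write to a temporary file, then read it back in
path <- tempfile(fileext = ".csv")
write_csv(toy, file = path)
round_trip <- read_csv(path, show_col_types = FALSE)

# The column names and values should survive the round trip
identical(names(round_trip), names(toy))
all.equal(round_trip$DeathsfromDrugUse, toy$DeathsfromDrugUse)
```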
