Breast cancer awareness!

My little contribution …

October is Breast Cancer Awareness Month, an annual campaign to increase awareness of the disease.

This post is my contribution to the campaign, using Swedish breast cancer data.

The data come from NORDCAN, a database of cancer statistics for the Nordic countries: Denmark, Finland, Iceland, Norway and Sweden. The data are delivered by each national cancer registry and cause-of-death registry.

The primary goal was to visualise the Swedish trends in age-standardised breast cancer incidence and mortality. In the background, you can also see the data for the remaining four Nordic countries.

For cancer, the result is usually expressed as an annual rate per 100 000 persons at risk. I chose the age-standardised rate (ASR) using the Nordic standard, where the age distribution is that of the NORDCAN population in the year 2000. Age standardisation is useful when, as here, we want to compare countries whose populations have different age structures.
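
To make the idea of direct age standardisation concrete, here is a minimal sketch in base R. The helper `direct_asr()` and all numbers are made up for illustration: they are not NORDCAN figures, and the three age groups and their weights are not the Nordic standard population. The sketch compares two hypothetical populations with identical age-specific rates but different age structures.

```r
# Direct age standardisation: weight each age-specific rate by the share of
# that age group in a fixed standard population.
# NOTE: direct_asr() and every number below are hypothetical illustrations.
direct_asr <- function(cases, population, std_weights) {
  stopifnot(length(cases) == length(population),
            length(cases) == length(std_weights))
  age_specific_rates <- cases / population * 1e5   # rate per 100 000
  sum(age_specific_rates * std_weights / sum(std_weights))
}

# Hypothetical standard population shares for three broad age groups
std_weights <- c(young = 0.5, middle = 0.3, old = 0.2)

# Population A is old, population B is young; age-specific rates are the same
pop_a   <- c(young =  50000, middle = 100000, old = 200000)
cases_a <- c(young =      5, middle =    150, old =    600)

pop_b   <- c(young = 200000, middle = 100000, old =  50000)
cases_b <- c(young =     20, middle =    150, old =    150)

crude_a <- sum(cases_a) / sum(pop_a) * 1e5
crude_b <- sum(cases_b) / sum(pop_b) * 1e5
asr_a   <- direct_asr(cases_a, pop_a, std_weights)
asr_b   <- direct_asr(cases_b, pop_b, std_weights)

round(c(crude_a = crude_a, crude_b = crude_b, asr_a = asr_a, asr_b = asr_b), 1)
```

In this toy example the crude rates are about 215.7 and 91.4 per 100 000, purely because population A is older, while both age-standardised rates are 110.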

What can you do?

Ben Gorham is the designer of this year’s Swedish pink ribbon.

There are many national and international organisations that are actively engaged in this campaign.

R code

All R code used to create this plot can be seen below. Enjoy!

## load libraries
library(tidyverse)
library(XML)
library(gganimate)
> # Function to read the second HTML table on a page into a data frame
> check <- function(x) {
+   tables <- readHTMLTable(x)   # all tables found on the page
+   tables[[2]]                  # the second one holds the statistics
+ }
> 
> # time frame 
> year_list <- 1990:2016

Creating my color theme. Here is a link to a color palette Cheatsheet.

> # Chosen colors
> my_dark_pink <- "#e13d44"
> my_ligher_pink <- "#f6ced7"
> my_light_pink <- "#f4b1bb" 
> my_dark_grey <- "#5f5f5f"
> my_light_grey <- "#b2b2b2"
> 
> # colors for each line
> swe_color_order <- c(rep(c(my_light_pink, my_light_grey), 4), my_dark_pink, my_dark_grey)
> 
> # Create a new theme
> theme_bluewhite <- function (base_size = 11, base_family = "") {
+   theme_bw(base_size = base_size, base_family = base_family) %+replace% 
+     theme(
+       # background
+       plot.background = element_rect(fill = my_ligher_pink),    # Background of the entire plot
+       panel.background = element_rect(fill = my_ligher_pink, color = NA),
+       
+       panel.grid.major = element_blank(),
+       panel.grid.minor = element_blank(),
+       panel.border = element_blank(),
+       
+       # axis
+       axis.line = element_line(color = my_ligher_pink),
+       axis.ticks = element_line(color = my_ligher_pink),
+       axis.text.y = element_text(color = my_dark_grey),
+       axis.text.x = element_text(color = my_dark_grey, angle = 45),
+       axis.title.y = element_text(color = my_dark_grey, vjust = 0, angle = 90),
+       axis.title.x = element_text(color = my_dark_grey, hjust = 0),
+       
+       strip.background = element_rect(color = NA,  fill=NA),
+       strip.text = element_text(colour = my_dark_pink),
+       
+       plot.title = element_text(color=my_dark_grey, size=40, hjust=0, family = 'Open Sans ExtraBold'),
+       plot.subtitle = element_text(color = my_dark_pink, size=10, hjust=0, family="Tahoma"),
+       plot.caption = element_text(color = my_dark_grey, family = 'Tahoma', size=7, hjust = 0) 
+     )
+ }
> 
> theme_set(theme_bluewhite())
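
One thing to keep in mind: `swe_color_order` is matched to the lines by position, so it relies on the alphabetical order of the group labels (country name plus `_inc` or `_mort`) that the plot uses later. A more defensive variant, sketched here as an assumption rather than what this post does, is to name the palette vector so that `scale_color_manual()` can match colours by label. The object `named_colors` is a hypothetical name introduced for this sketch.

```r
# Colours as defined in the post
my_dark_pink  <- "#e13d44"
my_light_pink <- "#f4b1bb"
my_dark_grey  <- "#5f5f5f"
my_light_grey <- "#b2b2b2"
swe_color_order <- c(rep(c(my_light_pink, my_light_grey), 4), my_dark_pink, my_dark_grey)

# Group labels of the form <country>_<indicator>, using the Swedish country names
countries <- c("Danmark", "Finland", "Island", "Norge", "Sverige")
cats <- paste0(rep(countries, each = 2), "_", c("inc", "mort"))

# Hypothetical named palette: each label carries its own colour
named_colors <- setNames(swe_color_order, cats)
named_colors[c("Sverige_inc", "Sverige_mort", "Norge_inc")]
```

Passing such a vector as `scale_color_manual(values = named_colors)` should give the same colours even if the order of the groups changes.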

Importing data

> # URL for the incidence data
> url_list <- paste0("http://www-dep.iarc.fr/NORDCAN/SW/table2.asp?cancer=200&period=", 
+                    year_list,
+                    "&sex=2&type=0&age_from=1&age_to=18&sort=0&text=1&registry=3&submit=%A0%A0Utf%F6r%A0%A0")
> 
> # importing data - a list of df
> inc_list <- lapply(url_list, check)
> 
> # binding all the df files into one
> inc <- do.call(rbind, inc_list)
> 
> inc <- inc %>% mutate(year = rep(year_list, sapply(inc_list, nrow)), #adding year information
+                        indicator = "inc")
> # same for mortality data
> url_list <- paste0("http://www-dep.iarc.fr/NORDCAN/SW/table2.asp?cancer=200&period=", 
+                    year_list,
+                    "&sex=2&type=1&age_from=1&age_to=18&sort=0&text=1&registry=3&submit=%A0%A0Utf%F6r%A0%A0")
> 
> mort_list <- lapply(url_list, check)
> 
> mort <- do.call(rbind, mort_list)
> 
> mort <- mort %>% mutate(year = rep(year_list, sapply(mort_list, nrow)),
+                          indicator = "mort") 
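
The `rep(year_list, sapply(..., nrow))` idiom used above tags every row with the year of the table it came from, even when the tables have different numbers of rows. Here is a small self-contained sketch with toy data frames; the object names `tables`, `rows_per_table` and `combined`, and all values, are placeholders and not NORDCAN data.

```r
# Toy stand-ins for the yearly tables (hypothetical values)
year_list <- 1990:1992
tables <- list(
  data.frame(Population = c("Danmark", "Sverige"),            rate = c(1, 2)),
  data.frame(Population = c("Danmark", "Finland", "Sverige"), rate = c(3, 4, 5)),
  data.frame(Population = "Sverige",                          rate = 6)
)

# Number of rows contributed by each table: 2, 3 and 1
rows_per_table <- sapply(tables, nrow)

# Stack the tables, then repeat each year once per row of its table
combined <- do.call(rbind, tables)
combined$year <- rep(year_list, times = rows_per_table)

combined
```

Because `rbind` keeps the tables in list order, the repeated years line up with the stacked rows.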

Putting together the files and doing some adjustments.

> # Incidence data + Mortality data
> data <- rbind(inc, mort) 
> 
> # renaming some variables
> names(data)[4:6] <- c("asr_w", "asr_e", "asr_n" )
> 
> # transforming all factor variables to character
> i <- sapply(data, is.factor)
> data[i] <- lapply(data[i], as.character)
> 
> # and some to numeric
> data[2:6] <- lapply(data[2:6], as.numeric)
> 
> # keeping just country data 
> data <- data %>% 
+   mutate(cats = paste0(Population, "_", indicator)) %>%
+   filter(Population %in% c("Sverige", "Finland", "Danmark", "Norge", "Island")) 
> 
> str(data)
'data.frame':   270 obs. of  10 variables:
 $ Population    : chr  "Danmark" "Finland" "Island" "Norge" ...
 $ Antal         : num  2809 2411 111 1828 5284 ...
 $ Ojusterad rat : num  107.8 94.4 87.1 85.2 121.2 ...
 $ asr_w         : num  68.6 62.1 73.2 52.3 72.7 72.9 65.7 54.3 57.4 71.1 ...
 $ asr_e         : num  93.3 85.6 100.8 72 99.7 ...
 $ asr_n         : num  107.5 98.6 117.5 85.9 114.1 ...
 $ Kumulativ risk: chr  "-" "-" "-" "-" ...
 $ year          : int  1990 1990 1990 1990 1990 1991 1991 1991 1991 1991 ...
 $ indicator     : chr  "inc" "inc" "inc" "inc" ...
 $ cats          : chr  "Danmark_inc" "Finland_inc" "Island_inc" "Norge_inc" ...
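
The two-step conversion above (factor to character, then character to numeric) matters because calling `as.numeric()` directly on a factor returns the internal level codes, not the printed values. A small base R sketch with made-up values (the vector `f` is hypothetical, not taken from the NORDCAN tables):

```r
# A factor whose labels look like numbers (hypothetical values)
f <- factor(c("68.6", "107.5", "52.3"))

# Levels are sorted as text: "107.5", "52.3", "68.6"
levels(f)

# Direct conversion gives the position of each value among the levels
codes <- as.numeric(f)

# Going through character first recovers the actual numbers
values <- as.numeric(as.character(f))

codes
values
```

Here `codes` is 3, 1, 2 while `values` is 68.6, 107.5, 52.3, which is why the character step comes first.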

asr_w, asr_e and asr_n are the three age-standardised rates, using the World (w), European (e) and Nordic (n) standard populations respectively.

> # creating the plot
> p <- 
+   ggplot(
+   data,
+   aes(as.factor(year), asr_n, group = cats, color = factor(cats))) +
+   geom_line(size=.8) +
+   labs(x = NULL, y = NULL) +
+   coord_cartesian(ylim=c(0, 200)) + 
+   scale_y_continuous(breaks=seq(0, 200, 25)) +  
+   theme(legend.position = "none") +
+   scale_color_manual(values=swe_color_order) +
+   labs(title = "Sweden", 
+        subtitle = "Age-standardised rates per 100 000" 
+        ) +
+   annotate(geom="text", x=3, y=145, label="Incidence", color=my_dark_pink,  family="Verdana", size=5.5) +
+   annotate(geom="text", x=3, y=65, label="Mortality", color=my_dark_grey,  family="Verdana", size=5.5) +
+   annotate(geom="text", x=3, y=0, label="Source: NORDCAN", color=my_dark_grey,  family="Open Sans", size=3.5) 
> 
> # Creating the animation: reveal the lines year by year and keep the points drawn so far
> p + 
+   geom_point(aes(group = seq_along(year))) +
+   transition_reveal(year)
> 
> #animate(p, renderer = gifski_renderer("gganim.gif"))

Reproducibility

─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 3.5.3 (2019-03-11)
 os       macOS Mojave 10.14.6        
 system   x86_64, darwin15.6.0        
 ui       X11                         
 language (EN)                        
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       Europe/Stockholm            
 date     2019-11-10                  

─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
 package     * version    date       lib source                         
 assertthat    0.2.1      2019-03-21 [1] CRAN (R 3.5.2)                 
 backports     1.1.5      2019-10-02 [1] CRAN (R 3.5.2)                 
 blogdown      0.16       2019-10-01 [1] CRAN (R 3.5.2)                 
 bookdown      0.14       2019-10-01 [1] CRAN (R 3.5.2)                 
 broom         0.5.2      2019-04-07 [1] CRAN (R 3.5.2)                 
 callr         3.3.2      2019-09-22 [1] CRAN (R 3.5.2)                 
 cellranger    1.1.0      2016-07-27 [1] CRAN (R 3.5.0)                 
 cli           1.1.0      2019-03-19 [1] CRAN (R 3.5.2)                 
 colorspace    1.4-1      2019-03-18 [1] CRAN (R 3.5.2)                 
 crayon        1.3.4      2017-09-16 [1] CRAN (R 3.5.0)                 
 desc          1.2.0      2018-05-01 [1] CRAN (R 3.5.0)                 
 devtools    * 2.2.1      2019-09-24 [1] CRAN (R 3.5.2)                 
 digest        0.6.21     2019-09-20 [1] CRAN (R 3.5.2)                 
 dplyr       * 0.8.3      2019-07-04 [1] CRAN (R 3.5.2)                 
 ellipsis      0.3.0      2019-09-20 [1] CRAN (R 3.5.2)                 
 evaluate      0.14       2019-05-28 [1] CRAN (R 3.5.2)                 
 farver        1.1.0      2018-11-20 [1] CRAN (R 3.5.0)                 
 forcats     * 0.4.0      2019-02-17 [1] CRAN (R 3.5.2)                 
 fs            1.3.1      2019-05-06 [1] CRAN (R 3.5.2)                 
 generics      0.0.2      2018-11-29 [1] CRAN (R 3.5.0)                 
 gganimate   * 1.0.3      2019-04-02 [1] CRAN (R 3.5.2)                 
 ggplot2     * 3.2.1      2019-08-10 [1] CRAN (R 3.5.2)                 
 glue          1.3.1.9000 2019-10-12 [1] Github (tidyverse/glue@71eeddf)
 gtable        0.3.0      2019-03-25 [1] CRAN (R 3.5.2)                 
 haven         2.1.1      2019-07-04 [1] CRAN (R 3.5.2)                 
 highr         0.8        2019-03-20 [1] CRAN (R 3.5.2)                 
 hms           0.5.1      2019-08-23 [1] CRAN (R 3.5.2)                 
 htmltools     0.4.0      2019-10-04 [1] CRAN (R 3.5.2)                 
 httr          1.4.1      2019-08-05 [1] CRAN (R 3.5.2)                 
 jsonlite      1.6        2018-12-07 [1] CRAN (R 3.5.0)                 
 knitr       * 1.25       2019-09-18 [1] CRAN (R 3.5.2)                 
 lattice       0.20-38    2018-11-04 [1] CRAN (R 3.5.3)                 
 lazyeval      0.2.2      2019-03-15 [1] CRAN (R 3.5.2)                 
 lifecycle     0.1.0      2019-08-01 [1] CRAN (R 3.5.2)                 
 lubridate     1.7.4      2018-04-11 [1] CRAN (R 3.5.0)                 
 magrittr      1.5        2014-11-22 [1] CRAN (R 3.5.0)                 
 memoise       1.1.0      2017-04-21 [1] CRAN (R 3.5.0)                 
 modelr        0.1.5      2019-08-08 [1] CRAN (R 3.5.2)                 
 munsell       0.5.0      2018-06-12 [1] CRAN (R 3.5.0)                 
 nlme          3.1-141    2019-08-01 [1] CRAN (R 3.5.2)                 
 pillar        1.4.2      2019-06-29 [1] CRAN (R 3.5.2)                 
 pkgbuild      1.0.6      2019-10-09 [1] CRAN (R 3.5.2)                 
 pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 3.5.2)                 
 pkgload       1.0.2      2018-10-29 [1] CRAN (R 3.5.0)                 
 prettyunits   1.0.2      2015-07-13 [1] CRAN (R 3.5.0)                 
 processx      3.4.1      2019-07-18 [1] CRAN (R 3.5.2)                 
 progress      1.2.2      2019-05-16 [1] CRAN (R 3.5.2)                 
 ps            1.3.0      2018-12-21 [1] CRAN (R 3.5.0)                 
 purrr       * 0.3.2      2019-03-15 [1] CRAN (R 3.5.2)                 
 R6            2.4.0      2019-02-14 [1] CRAN (R 3.5.2)                 
 Rcpp          1.0.2      2019-07-25 [1] CRAN (R 3.5.2)                 
 readr       * 1.3.1      2018-12-21 [1] CRAN (R 3.5.0)                 
 readxl        1.3.1      2019-03-13 [1] CRAN (R 3.5.2)                 
 remotes       2.1.0      2019-06-24 [1] CRAN (R 3.5.2)                 
 rlang         0.4.0      2019-06-25 [1] CRAN (R 3.5.2)                 
 rmarkdown     1.16       2019-10-01 [1] CRAN (R 3.5.2)                 
 rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.5.0)                 
 rstudioapi    0.10       2019-03-19 [1] CRAN (R 3.5.2)                 
 rvest         0.3.4      2019-05-15 [1] CRAN (R 3.5.2)                 
 scales        1.0.0      2018-08-09 [1] CRAN (R 3.5.0)                 
 sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.5.0)                 
 stringi       1.4.3      2019-03-12 [1] CRAN (R 3.5.2)                 
 stringr     * 1.4.0      2019-02-10 [1] CRAN (R 3.5.2)                 
 testthat      2.2.1      2019-07-25 [1] CRAN (R 3.5.2)                 
 tibble      * 2.1.3      2019-06-06 [1] CRAN (R 3.5.2)                 
 tidyr       * 1.0.0      2019-09-11 [1] CRAN (R 3.5.2)                 
 tidyselect    0.2.5      2018-10-11 [1] CRAN (R 3.5.0)                 
 tidyverse   * 1.2.1      2017-11-14 [1] CRAN (R 3.5.0)                 
 tweenr        1.0.1      2018-12-14 [1] CRAN (R 3.5.0)                 
 usethis     * 1.5.1      2019-07-04 [1] CRAN (R 3.5.2)                 
 vctrs         0.2.0      2019-07-05 [1] CRAN (R 3.5.2)                 
 withr         2.1.2      2018-03-15 [1] CRAN (R 3.5.0)                 
 xfun          0.10       2019-10-01 [1] CRAN (R 3.5.2)                 
 XML         * 3.98-1.20  2019-06-06 [1] CRAN (R 3.5.2)                 
 xml2          1.2.2      2019-08-09 [1] CRAN (R 3.5.2)                 
 yaml          2.2.0      2018-07-25 [1] CRAN (R 3.5.0)                 
 zeallot       0.1.0      2018-01-28 [1] CRAN (R 3.5.0)                 

[1] /Library/Frameworks/R.framework/Versions/3.5/Resources/library

Leyla Nunez
Statistician
