Sharing code & tools

Research compendium, R Package & Shiny App

October 2024

Nicolas Casajus

Senior data scientist
@FRB-CESAB

Introduction

Research compendium

R package

Shiny App

Introduction

You released a new database (with metadata)
You published a data paper

And you wrote a lot of code…

Why not share your code?

Share your code to reproduce your pipeline

Share your code to access data

Share your code to add new data

Code hosting platforms

GitHub and similar platforms are cloud-based Git repository hosting services

Perfect solutions to host projects (code) tracked by git

Services

Full integration of version control (commits, history, differences)
Easy collaboration with branches, forks, and pull requests
Issue tracking system
Enhanced documentation rendering (README, Wiki)
Static website hosting
Automation & monitoring (CI/CD)

Main platforms

GitHub (owned by Microsoft)

GitLab (open source)

GitHub account: https://github.com/ahasverus/

GitHub organization: https://github.com/frbcesab/

Research compendium

The goal of a research compendium is to provide a standard and easily recognisable way for organizing the digital materials of a project to enable others to inspect, reproduce, and extend the research.

Marwick B, Boettiger C & Mullen L (2018)

Three generic principles

Files organized according to the conventions of the community

Clear separation of data, method, and output

Specify the computational environment that was used

A research compendium should be self-contained
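
One lightweight way to follow the third principle is to save a description of the R session next to the results. The sketch below uses base R only. The helper name `save_session_info()` and the file name `session-info.txt` are illustrative assumptions, not conventions taken from the slides.

```r
## Record the computational environment ----

# Hypothetical helper: writes the R version, platform and attached packages
# to a plain text file
save_session_info <- function(path = "session-info.txt") {
  info <- utils::capture.output(utils::sessionInfo())
  writeLines(info, con = path)
  invisible(path)
}

# Example: store the session description in a temporary folder
info_file <- save_session_info(file.path(tempdir(), "session-info.txt"))

# Preview the first lines of the file
head(readLines(info_file), n = 3)
```

In a real compendium the file would more likely be written at the root of the project or in `outputs/`.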

Strong flexibility in the structure of a compendium

Small compendium

.
├─ .git/
├─ .gitignore
│
├─ project.Rproj
│ 
├─ data/ 🔒
│ 
├─ code/
│  └─ script.R
│ 
├─ outputs/
│ 
├─ LICENSE
└─ README.md

Medium compendium

.
├─ .git/
├─ .gitignore
│
├─ project.Rproj
│
├─ data/
│  ├─ raw-data/ 🔒
│  └─ derived-data/
│
├─ R/
│  ├─ function-x.R
│  └─ function-y.R
│
├─ analyses/
│  ├─ script-1.R
│  └─ script-n.R
│
├─ outputs/
│
├─ make.R
│
├─ DESCRIPTION
├─ LICENSE
└─ README.md
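
In this layout, `make.R` typically acts as the single entry point of the project. The following is a minimal sketch of that idea, assuming that functions live in `R/` and that the scripts in `analyses/` are meant to run in alphabetical order. The helper name `run_compendium()` is hypothetical.

```r
## make.R (sketch) ----

# Hypothetical helper: loads the functions stored in R/, then runs the
# scripts stored in analyses/ one after the other
run_compendium <- function(root = ".") {

  # Objects created by the functions and scripts are kept in one environment
  env <- new.env(parent = globalenv())

  # Load project functions
  function_files <- sort(list.files(file.path(root, "R"),
                                    pattern = "\\.R$", full.names = TRUE))
  for (f in function_files) source(f, local = env)

  # Run analysis scripts in alphabetical order
  analysis_files <- sort(list.files(file.path(root, "analyses"),
                                    pattern = "\\.R$", full.names = TRUE))
  for (f in analysis_files) source(f, local = env)

  invisible(env)
}

# Usage (from the root of the compendium):
# run_compendium()
```

Naming scripts with a numeric prefix (as in `script-1.R`) keeps the execution order explicit under this assumption.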

Large compendium

.
├─ .git/
├─ .gitignore
├─ .github/
│  └─ workflows/
│     ├─ workflow-1.yaml
│     └─ workflow-n.yaml
│
├─ project.Rproj
│
├─ renv/
├─ renv.lock
│
├─ Dockerfile
├─ .dockerignore
│
├─ data/
│  ├─ raw-data/ 🔒
│  └─ derived-data/
│
├─ R/
│  ├─ function-x.R
│  └─ function-y.R
│
├─ analyses/
│  ├─ script-1.R
│  └─ script-n.R
│
├─ outputs/
│
├─ figures/
│
├─ paper/
│  ├─ references.bib
│  ├─ style.csl
│  └─ paper.Rmd
│
├─ make.R
│
├─ DESCRIPTION
├─ CITATION.cff
├─ CODE_OF_CONDUCT.md
├─ CONTRIBUTING.md
├─ LICENSE
└─ README.md

`README` please

A README is a text file that introduces and explains your project

Each research compendium should contain a README
You can write several READMEs (project, data, etc.)

Source: https://github.com/frbcesab/chessboard/blob/main/README.md

GitHub and other code hosting platforms recognize and render READMEs written in Markdown (README.md)

Source: https://github.com/frbcesab/chessboard

`README` please

A good README should answer the following questions:

Why should I use it?
How do I get it?
How do I use it?

Main sections (for a research compendium)

Title
Description
Content (file organization)
Prerequisites
Installation
Usage
License
Citation
Acknowledgements
References
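
The section list above can be turned into a starting file with a few lines of base R. This is only a sketch: the helper name `readme_skeleton()` and the placeholder title are assumptions, and the file is written to a temporary folder rather than to a real project.

```r
## Create a README skeleton ----

# Main sections suggested above for a research compendium (without the title)
sections <- c("Description", "Content", "Prerequisites", "Installation",
              "Usage", "License", "Citation", "Acknowledgements", "References")

# Hypothetical helper: returns the lines of a Markdown README skeleton
readme_skeleton <- function(title, sections) {
  headings <- unlist(lapply(sections, function(s) c(paste("##", s), "")))
  c(paste("#", title), "", headings)
}

readme <- readme_skeleton("Title of the project", sections)

# Written to a temporary folder here; in a real project the file would be
# README.md at the root of the compendium
readme_path <- file.path(tempdir(), "README.md")
writeLines(readme, readme_path)
```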

R package

What’s an R Package?

In R, the fundamental unit of shareable code is the package. A package bundles together code, data, documentation, and tests, and is easy to share with others.

Hadley Wickham - R Packages (1st ed.)

An R package:

is a collection of well-documented functions
makes your work more reproducible
makes your code useful for you and for others

As of today (2024-10-31):

21585 packages are available on CRAN
2289 packages on Bioconductor

Recommended environment

Development workflow

Package structure

A package contains two main components:

a DESCRIPTION file with package metadata
a folder R/ with documented functions

.
├─ R/
│  └─ fun.R
│ 
└─ DESCRIPTION

devtools::document()

.
├─ R/
│  └─ fun.R
│ 
├─ man/
│  └─ fun.Rd
│ 
├─ NAMESPACE
│ 
└─ DESCRIPTION

The function devtools::document() automatically generates a folder man/ (function documentation) and the NAMESPACE file.
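
To make the two-component layout concrete, here is a minimal sketch that writes it by hand with base R. The package name `mypkg`, the function `fun()` and the metadata values are hypothetical, and everything is created in a temporary folder.

```r
## Minimal package skeleton (written by hand) ----

# Hypothetical package name and location (temporary folder)
pkg_path <- file.path(tempdir(), "mypkg")
dir.create(file.path(pkg_path, "R"), recursive = TRUE, showWarnings = FALSE)

# DESCRIPTION file with a few metadata fields
writeLines(
  c("Package: mypkg",
    "Title: A Minimal Example Package",
    "Version: 0.0.0.9000",
    "Description: Illustrates the two main components of a package.",
    "License: GPL (>= 2)"),
  file.path(pkg_path, "DESCRIPTION")
)

# One documented function stored in R/
writeLines(
  c("#' Add one",
    "#'",
    "#' @param x a `numeric` vector",
    "#'",
    "#' @return A `numeric` vector equal to `x + 1`.",
    "#'",
    "#' @export",
    "",
    "fun <- function(x) {",
    "  x + 1",
    "}"),
  file.path(pkg_path, "R", "fun.R")
)

# List the resulting files
list.files(pkg_path, recursive = TRUE)

# Next step (not run here, requires devtools): devtools::document(pkg_path)
```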

What’s a function?

A function is a block of code organized together to perform a specific task and only runs when it is called. It can have parameters and can return a result.

Automate common and repetitive tasks

Advantages

You can give a function an evocative name that makes your code easier to understand.
As requirements change, you only need to update code in one place, instead of many.
You eliminate the chance of making incidental mistakes when you copy and paste.
It makes it easier to reuse work from project-to-project, increasing your productivity over time.

Creating a function

## Function definition ----

function_name <- function(input) {
  
  # Code block
  # Code block
  # Code block
  
  return(output)
}

A function is defined by calling function()
A function should have an explicit name
A function can have 0, 1 or many parameters (inputs)
A function can return a value (output)

Defining a function

## Arithmetic mean ----

arithmetic_mean <- function(x) {
  
  y <- sum(x) / length(x)
  
  return(y)
}

## Simplification ----

arithmetic_mean <- function(x) {
  
  sum(x) / length(x)
}

Calling the function

## Arithmetic mean ----

arithmetic_mean(x = c(4, 6, 5, 10))

[1] 6.25

## Comparison ----

mean(x = c(4, 6, 5, 10))

[1] 6.25
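
The definition above assumes a clean numeric vector. As a hedged extension, the sketch below adds an input check and an optional removal of missing values. The name `arithmetic_mean_safe()`, the `na_rm` parameter and the error message are illustrative and not part of the original function.

```r
## Arithmetic mean with input checks (sketch) ----

arithmetic_mean_safe <- function(x, na_rm = FALSE) {

  # Stop early if the input is not numeric
  if (!is.numeric(x)) {
    stop("Argument 'x' must be a numeric vector", call. = FALSE)
  }

  # Optionally drop missing values before computing the mean
  if (na_rm) {
    x <- x[!is.na(x)]
  }

  sum(x) / length(x)
}

## Usage ----

arithmetic_mean_safe(x = c(4, 6, 5, 10))
arithmetic_mean_safe(x = c(4, 6, NA, 10), na_rm = TRUE)
```

Because the check lives in one place, every call benefits from it, which is the "update code in one place" advantage listed earlier.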

Documenting a function

Specially-structured comments preceding each function definition
Lightweight syntax easy to write and to read
Syntax: #' @field value
Keep function definition and documentation in the same file
Automatically write .Rd files (in man/) and NAMESPACE

Get started with roxygen2 (see the roxygen2 documentation)

#' Compute the arithmetic mean
#'
#' This function computes the arithmetic mean of a numeric variable.
#'
#' @param x a `numeric` vector
#'
#' @return A `numeric` value representing the arithmetic mean of `x`.
#'
#' @export
#'
#' @examples
#' x <- 1:10
#' arithmetic_mean(x)

arithmetic_mean <- function(x) {
  
  sum(x) / length(x)
}

Then, run devtools::document() to automatically generate .Rd files in man/ and the NAMESPACE file

The `DESCRIPTION` file

The main component of an R package, the DESCRIPTION file holds the package metadata.

Package: nameofthepackage
Type: Package
Title: The Title of the Package
Authors@R: c(
    person(given   = "John",
           family  = "Doe",
           role    = c("aut", "cre", "cph"),
           email   = "john.doe@domain.com",
           comment = c(ORCID = "9999-9999-9999-9999")))
Description: A paragraph providing a full description of 
    the package.
License: GPL (>= 2)

External packages required by the package will be listed in this file.
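
Dependencies are usually declared in fields such as `Imports`. The sketch below writes a small DESCRIPTION file to a temporary location and reads the field back with base R. The listed packages (`ggplot2`, `sf`) and the version number are purely illustrative assumptions.

```r
## Declare and read package dependencies (sketch) ----

# Hypothetical DESCRIPTION written to a temporary file
desc_path <- file.path(tempdir(), "DESCRIPTION")

writeLines(
  c("Package: nameofthepackage",
    "Title: The Title of the Package",
    "Version: 0.0.0.9000",
    "Imports: ggplot2, sf"),
  desc_path
)

# read.dcf() parses the 'Field: value' format used by DESCRIPTION files
imports <- read.dcf(desc_path, fields = "Imports")[1, "Imports"]

# Split the comma-separated list into a character vector
required_packages <- strsplit(imports, ",\\s*")[[1]]
required_packages
```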

Example

Database hosted on
Zenodo

Data paper published in
Scientific Data

Package hosted on GitHub
(coming soon on CRAN)

Software paper submitted at
Methods in Ecology and Evolution

Must-read resources

https://r-pkgs.org/

https://cran.r-project.org/doc/manuals/r-release/R-exts.html

Shiny App

`shiny` package

Shiny is an R package that makes it easy to build interactive web applications (apps) straight from R.

Source: Mastering Shiny

Features

Provides a curated set of user interface (UI) functions that generate the HTML, CSS, and JavaScript needed for common tasks.

No knowledge of HTML, CSS, or JavaScript required

Introduces a new style of programming called reactive programming which automatically tracks the dependencies of pieces of code.

Automatically update output if input changes

Available at: https://github.com/rstudio/shiny/

Structure of a Shiny App

A Shiny app is contained in a single script called app.R and has three components:

a ui (user interface) object
a server() function
a call to the shinyApp() function

## Required package ----

library(shiny)


## User interface ----

ui <- fluidPage(  # or any other *Page() layout function
  # UI inputs and outputs go here
)


## Server component ----

server <- function(input, output) {
  # Server logic goes here
}


## Create Shiny app object ----

shinyApp(ui = ui, server = server)

## Launch the Shiny app ----

runApp()

UI Components

More information here

UI Layouts

More information here

Reactive programming

User interacts with UI inputs (click a button, enter text, select an option, etc.)
The server handles input changes and modifies the output value
The server updates the UI output
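
The three steps above can be illustrated with a small sketch, assuming the shiny package is installed. The input id `n`, the output id `total` and the reactive expression `values` are illustrative names chosen for this example.

```r
## Reactive programming (sketch) ----

library(shiny)

## User interface: one input, one output ----

ui <- fluidPage(
  numericInput("n", "Sample size:", value = 10, min = 1),
  textOutput("total")
)

## Server logic ----

server <- function(input, output, session) {

  # Reactive expression: re-evaluated only when input$n changes
  values <- reactive(seq_len(input$n))

  # Output: updated automatically whenever values() changes
  output$total <- renderText(
    paste("Sum of 1 to", input$n, "is", sum(values()))
  )
}

## Create the app object (assigned, so nothing is launched here) ----

app <- shinyApp(ui = ui, server = server)

# To launch the app in an interactive session: runApp(app)
```

Nothing in the server code calls the output explicitly: Shiny tracks that `output$total` depends on `values()`, which depends on `input$n`, and re-runs only what is needed.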

Minimal Shiny app

## Required package ----

library(shiny)


## User interface ----

ui <- fluidPage(

    # Application title
    titlePanel("Old Faithful Geyser Data"),

    # Sidebar with a slider input for number of bins 
    sidebarLayout(
        sidebarPanel(
            sliderInput("bins",
                        "Number of bins:",
                        min   = 1,
                        max   = 50,
                        value = 30
            )
        ),

        # Show a plot of the generated distribution
        mainPanel(
           plotOutput("distPlot")
        )
    )
)

## Server logic ----

server <- function(input, output) {

    output$distPlot <- renderPlot({
      
        # Generate bins based on input$bins
        x    <- faithful[, 2]
        bins <- seq(min(x), max(x), length.out = input$bins + 1)

        # Draw the histogram with the specified number of bins
        hist(x, breaks = bins, col = 'darkgray', border = 'white',
             xlab = 'Waiting time to next eruption (in mins)',
             main = 'Histogram of waiting times')
    })
}

## Create the application ----

shinyApp(ui = ui, server = server)

RStudio IDE: New Project > New Directory > Shiny Application

Examples

https://github.com/ahasverus/bioclimatic-atlas/
https://ahasverus.shinyapps.io/bioclimaticatlas/

https://github.com/ahasverus/zeplacetobe/
https://ahasverus.shinyapps.io/zeplacetobe/

Resources

Shiny website

https://mastering-shiny.org/

https://engineering-shiny.org/

Thanks
