Writing Vignettes with APIs

Package vignettes (like this!) are a valuable way to show how to use your code. But if you’re demonstrating a package that communicates with a remote API, it has been difficult to write useful vignettes. R CMD check tests that package vignettes, when they are dynamically generated by Sweave or R Markdown, can successfully be rebuilt. If your API requires authentication to use, you’d need to distribute your login credentials for the vignette to build, and that’s generally not a good idea. Plus, building would require a stable network connection, without which you might get spurious build failures and CRAN submission rejections. Workarounds for these challenges, such as writing static vignettes that only appear to do work, present other problems: they’re lots of work to maintain and easily become out of sync with the package.

httptest solves these problems. By adding as little as one line of code to your vignette, you can safely record API responses from a live session. These API responses are scrubbed of sensitive personal information and stored in a subfolder in your vignettes directory. Subsequent vignette builds, including on continuous-integration services, CRAN, and your package users’ computers, use these recorded responses, allowing the document to regenerate without a network connection or API credentials. To record fresh API responses, delete the subfolder of cached responses and re-run.

This vignette shows you how. For an example in the wild, see the introduction vignette to pivotaltrackR (source). While this discussion focuses on package vignettes, the same behavior should work in any R Markdown document.

The basics: start_vignette()

Getting started is easy. At the beginning of your R Markdown document, add this code chunk:

`​``{r, include=FALSE}
library(httptest)
start_vignette("vignette-name")
```

changing vignette-name to something meaningful, such as the name of your .Rmd file. start_vignette() works by checking for the existence of a directory with the name you provided. If no directory exists, the vignette proceeds by making real API requests and recording the responses as fixtures inside the vignette-name directory (that is, it calls start_capturing()). If the directory does exist, great: you’ve previously recorded API responses, so it uses them, loading them with the same use_mock_api() mode you can use in your test suite.
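The record-or-replay decision described above can be illustrated in base R. This is only a sketch of the idea, not httptest's actual implementation: vignette_mode() is a hypothetical helper, and the directory name is made up for the demonstration.

```r
# Hypothetical helper, not part of httptest: decide which mode a vignette
# build would run in, based only on whether the fixture directory exists.
vignette_mode <- function(path) {
  if (dir.exists(path)) {
    "mock"    # fixtures already recorded: replay them, no network needed
  } else {
    "record"  # no fixtures yet: make real requests and save the responses
  }
}

# Demonstration in a temporary directory
fixtures <- file.path(tempdir(), "vignette-name-demo")
unlink(fixtures, recursive = TRUE)   # make sure we start clean
first_build <- vignette_mode(fixtures)

dir.create(fixtures)                 # stands in for the recording step
second_build <- vignette_mode(fixtures)

unlink(fixtures, recursive = TRUE)   # deleting the fixtures forces a re-record
third_build <- vignette_mode(fixtures)
```

The last step mirrors the advice given earlier: deleting the folder of cached responses puts the next build back into recording mode.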

Curious about how these recording and mocking contexts work? See vignette("httptest") for an overview; it’s focused on testing rather than vignettes, but the mechanics are the same.

That’s about it! It is a good idea to add an end_vignette() at the end of the document, like

`​``{r, include=FALSE}
end_vignette()
```

This turns off the request recording or mocking and cleans up the R session state. It’s not necessary if you build each vignette in a clean R process and quit on completion (everything is cleaned up when R exits), but having the end_vignette() call is good in case you build your documents in an interactive session.

Note that these code chunks have include=FALSE. This prevents them from being printed in the resulting Markdown, HTML, PDF, or whatever document format you produce. They’re doing work behind the scenes, so you don’t need to show them to your readers.

Handling server state changes

If all your vignette does is query an API to get data from it, start_vignette() is all you need. Your actions don’t change the state of anything on the server, so every time you make the same request (at least within your current session), you get the same response.

Sometimes, though, the purpose of your code is to alter server state: you are creating a database entry, sending a tweet, or performing some similar action. Suppose you are querying the Twitter API, and you first search for the #rstats hashtag, then you send a tweet with that hashtag, and finally you repeat your search. You’d expect the second search to contain the tweet you just sent, but the two search requests are identical, so a single recorded response could not serve both.

To make this work, before any code chunk that will alter server state, call change_state():

`​``{r, include=FALSE}
change_state()
```

When recording, this adds a new “layer” of recorded responses, and when reading previously recorded responses, it changes to the next layer.
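To make the idea of layers concrete, here is a toy model in base R. It does not use httptest at all: the layers list, mock_get(), and next_state() are all hypothetical names, and the "responses" are plain strings standing in for recorded files.

```r
# Toy model of layered fixtures; every name here is hypothetical.
# Each layer maps a request (here just a URL string) to a recorded response.
layers <- list(
  list("search?q=rstats" = "2 tweets"),   # state before the tweet is sent
  list("search?q=rstats" = "3 tweets")    # state after the tweet is sent
)
current_layer <- 1

# Look up a response in whichever layer is currently active
mock_get <- function(url) layers[[current_layer]][[url]]

# Advance to the next layer, as the text describes for replay
next_state <- function() current_layer <<- current_layer + 1

before <- mock_get("search?q=rstats")
next_state()
after <- mock_get("search?q=rstats")
```

The same request yields a different response after the state change because the lookup now happens in a different layer.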

For a working example, see the pivotaltrackR vignette. It does a query, then creates a record on the server, modifies that record, and then deletes it. All of this is captured in the vignette data and is fully replayable.

Advanced topics

Because you’re recording API responses for replay offline, there are a few additional considerations. First, you’ll want to make sure not to expose your personal credentials or other private details in the cached API responses. httptest provides the ability to “redact” responses you record, and by default, all standard authentication methods are redacted from recorded responses. You probably don’t need to do anything further to have clean responses, but it’s worth verifying. Beyond credentials, there may be other attributes of API responses that you want to modify, such as finding-and-replacing record IDs with a shorter or obfuscated value.

To modify these responses, you can provide a custom redacting function. A good way to do this that works for both your test suite and your vignettes is to put your custom function in inst/httptest/redact.R in your package, and it will be automatically used whenever your package is loaded. See more about redacting in vignette("redacting").
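As a rough sketch, such a file might look like the following. The header name and the API root URL are placeholders invented for illustration; redact_headers() and gsub_response() are httptest functions, called here with the package prefix so the file does not depend on what is attached.

```r
# inst/httptest/redact.R (sketch)
# The file holds a single function that takes a response and returns the
# cleaned-up version to be written to disk.
function(response) {
  # Drop a custom authentication header (hypothetical header name)
  response <- httptest::redact_headers(response, "X-Api-Token")
  # Strip the API root (placeholder URL) wherever it appears
  httptest::gsub_response(response, "https://api.example.com/v1/", "", fixed = TRUE)
}
```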

Second, depending on how long the URLs in your API requests are, you may need to shorten them programmatically if you plan to submit your package to CRAN, which requires file names to be 100 characters or fewer. Just as you can provide a custom function to modify recorded responses, you can provide a function that alters each request before it is mapped to a file in the mocked context. Importantly, this lets you truncate the URLs, and therefore the file names they map to.
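One way to find out whether shortening is needed is to measure the recorded files directly. The helper below is a base R sketch, not part of httptest. The name long_fixture_paths() is hypothetical, the limit of 100 simply mirrors the figure quoted above, and measuring the path relative to the fixture directory (plus an optional prefix) is an assumption about what should be counted.

```r
# Hypothetical helper: list recorded files whose path is longer than `limit`.
# `prefix` lets you account for leading directories such as "mypkg/vignettes".
long_fixture_paths <- function(dir, prefix = "", limit = 100) {
  files <- list.files(dir, recursive = TRUE)
  full <- if (nzchar(prefix)) file.path(prefix, files) else files
  files[nchar(full, type = "bytes") > limit]
}

# Demonstration with one short and one deliberately long path
root <- file.path(tempdir(), "fixture-length-demo")
unlink(root, recursive = TRUE)
dir.create(file.path(root, strrep("a", 60)), recursive = TRUE)
file.create(file.path(root, "short.json"))
file.create(file.path(root, strrep("a", 60), paste0(strrep("b", 60), ".json")))

too_long <- long_fixture_paths(root)
```

Here too_long contains only the nested file, whose relative path is well over the limit.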

To do this, similarly place a function in inst/httptest/request.R. Any time your package is loaded and you’re reading mock (previously recorded) responses, this function will be called on the request object before mapping it to a file.
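A minimal sketch of such a file, assuming a placeholder API root URL, could be as short as this. gsub_request() is the httptest counterpart of gsub_response().

```r
# inst/httptest/request.R (sketch)
# The file holds a single function that takes a request and returns the
# modified request used to locate the mock file.
function(request) {
  # Remove the API root (placeholder URL) so mock file paths are shorter
  httptest::gsub_request(request, "https://api.example.com/v1/", "", fixed = TRUE)
}
```

For recording and replay to agree, the response redactor would presumably need to apply the same substitution, as the set_redactor() and set_requester() example in this document does.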

If you don’t want to set these request/response processors globally for your tests and vignettes, there are a couple of options. Both of these functions (response redactor and request preprocessor) can be defined and set in your document’s initial code chunk, using set_redactor and set_requester, like this:

```r
library(httptest)
library(magrittr)  # provides the %>% pipe

api_root <- "https://www.pivotaltracker.com/services/v5/"

# Clean each response before it is written to disk
set_redactor(function(response) {
  response %>%
    redact_headers("X-TrackerToken") %>%
    gsub_response(api_root, "", fixed = TRUE) %>%
    gsub_response(getOption("pivotal.project"), "123")
})

# Apply the same substitutions to requests so they map to the recorded files
set_requester(function(request) {
  request %>%
    gsub_request(api_root, "", fixed = TRUE) %>%
    gsub_request(getOption("pivotal.project"), "123")
})

start_vignette("stories")
```

This is useful if you’re writing an R Markdown document outside of the context of a package.

Alternatively, you can put vignette-specific setup and teardown code for a package in inst/httptest/start-vignette.R and inst/httptest/end-vignette.R, respectively, and like the other inst/httptest files, these will be found and used whenever your package is loaded. This is a good option when you have more than one vignette and you want to share setup code across them without copy-and-paste.
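As an illustration only, the two files might set and then clear a placeholder option. The option name mypkg.token is made up, and this sketch assumes the files are run as ordinary R scripts, so it avoids sharing variables between them.

```r
# --- inst/httptest/start-vignette.R (sketch) ---
# Install a placeholder value so that replaying fixtures never needs
# real credentials. "mypkg.token" is a hypothetical option name.
options(mypkg.token = "not-a-real-token")

# --- inst/httptest/end-vignette.R (sketch) ---
# Remove the placeholder again once the vignette has finished.
options(mypkg.token = NULL)
```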