
Bibliography Management with boilerplate
Source:vignettes/boilerplate-bibliography-workflow.Rmd
Introduction
The boilerplate package provides bibliography management features that integrate with your scientific writing workflow. This vignette shows how to set up centralised bibliography management, validate citations, and keep references consistent across your manuscripts.
Overview: Why Centralised Bibliography Management?
Managing references across multiple manuscripts can be challenging. Common problems include:
- Inconsistent citation formatting
- Missing references in bibliography files
- Duplicated effort updating references across projects
- Version control conflicts with large .bib files
The boilerplate package solves these problems by:
- Centralising your bibliography in one location (e.g., GitHub)
- Caching bibliography files locally for performance
- Validating that all citations exist in your bibliography
- Automating bibliography distribution to project directories
Setting Up Bibliography Management
Step 1: Configure Your Bibliography Source
Add bibliography information to your boilerplate database:
library(boilerplate)
# Load your database
db <- boilerplate_import()
# Add bibliography configuration
db <- boilerplate_add_bibliography(
  db,
  url = "https://raw.githubusercontent.com/go-bayes/templates/refs/heads/main/bib/references.bib",
  local_path = "references.bib",
  validate = TRUE
)
# Save the updated database
boilerplate_save(db)
Step 2: Download and Cache the Bibliography
The bibliography is automatically cached for performance:
# Download/update bibliography (cached for 7 days by default)
bib_file <- boilerplate_update_bibliography(db)
# Force update if needed
bib_file <- boilerplate_update_bibliography(db, force = TRUE)
# Check cache age
boilerplate_update_bibliography(db, force = FALSE)
#> ℹ Using cached bibliography from ~/.boilerplate/cache/references.bib
#> ⚠ Bibliography cache is 5.2 days old. Consider using force=TRUE to update.
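The age-based caching described above can be approximated in base R. The sketch below illustrates the general technique only and is not the package's internal implementation; `cache_is_fresh()` and `max_age_days` are hypothetical names, and the 7-day default simply mirrors the comment above.

```r
# Hypothetical helper (not part of boilerplate): TRUE when `path` exists
# and was last modified less than `max_age_days` days ago.
cache_is_fresh <- function(path, max_age_days = 7) {
  if (!file.exists(path)) {
    return(FALSE)
  }
  age_days <- as.numeric(difftime(Sys.time(), file.mtime(path), units = "days"))
  age_days < max_age_days
}

# A throwaway file stands in for the cached .bib file
demo_cache <- tempfile(fileext = ".bib")
writeLines("@article{smith2023, title = {Example}}", demo_cache)
fresh_now <- cache_is_fresh(demo_cache)

# Backdate the file by 10 days to simulate a stale cache
Sys.setFileTime(demo_cache, Sys.time() - 10 * 24 * 60 * 60)
fresh_later <- cache_is_fresh(demo_cache)
```

A check like this could decide whether to reuse the cached file or download again; passing `force = TRUE` corresponds to skipping the check entirely.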
Step 3: Copy Bibliography to Your Project
When working on a manuscript, copy the bibliography to your project directory:
# Copy to current project
boilerplate_copy_bibliography(db, target_dir = ".")
# Copy and update from source first
boilerplate_copy_bibliography(db, target_dir = ".", update_first = TRUE)
# The bibliography is now available as ./references.bib
Validating References
Check All Citations Exist
Ensure all citations in your boilerplate text exist in the bibliography:
# Validate references across all text categories
validation <- boilerplate_validate_references(db)
# Check specific categories only
validation <- boilerplate_validate_references(
  db,
  categories = c("methods", "results")
)
# Review validation results
if (!validation$valid) {
  cat("Missing references:\n")
  print(validation$missing)
}
# See all available references
length(validation$available)
#> [1] 1847 # Example: large bibliography
# See which references are actually used
validation$used
#> [1] "@smith2023" "@jones2024" "@doe2022meta"
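To make the idea behind validation concrete, here is a minimal base R sketch that compares citation keys found in text against entry keys found in BibTeX lines. It is an illustration under stated assumptions, not the logic of `boilerplate_validate_references()`: the helpers `extract_citation_keys()` and `read_bib_keys()` are hypothetical, the key pattern is a simplification, and entries such as `@comment{...}` or `@string{...}` would be picked up as if they were references.

```r
# Hypothetical helper: pull citation keys such as "@smith2023" out of text
extract_citation_keys <- function(text) {
  matches <- regmatches(text, gregexpr("@[A-Za-z0-9_:-]+", text))
  unique(unlist(matches))
}

# Hypothetical helper: read entry keys from BibTeX lines,
# e.g. "@article{smith2023," gives "@smith2023"
read_bib_keys <- function(bib_lines) {
  entry_starts <- grep("^\\s*@\\w+\\s*\\{", bib_lines, value = TRUE)
  keys <- sub("^\\s*@\\w+\\s*\\{\\s*([^,[:space:]]+).*$", "\\1", entry_starts)
  paste0("@", keys)
}

# Made-up example text and bibliography
example_text <- c(
  "We follow @smith2023 and @jones2024.",
  "See [@doe2022meta; @smith2023]."
)
example_bib <- c(
  "@article{smith2023,", "  title = {A},", "}",
  "@book{doe2022meta,", "  title = {B},", "}"
)

used_keys <- extract_citation_keys(example_text)
available_keys <- read_bib_keys(example_bib)
missing_keys <- setdiff(used_keys, available_keys)
```

Here `missing_keys` contains only `"@jones2024"`, the one key cited in the text without a matching entry.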
Handle Missing References
When validation finds missing references:
# Example validation with missing references
validation <- boilerplate_validate_references(db, quiet = TRUE)
if (length(validation$missing) > 0) {
  cat("Please add these references to your bibliography:\n")
  cat(paste0("- ", validation$missing, "\n"), sep = "")
  # Print empty BibTeX skeletons for the missing references
  # (fill these in manually, then add them to your central .bib file)
  for (ref in validation$missing) {
    # Strip the leading "@" so only the citation key remains
    cat("\n@article{", sub("^@", "", ref), ",\n", sep = "")
    cat("  title = {},\n")
    cat("  author = {},\n")
    cat("  journal = {},\n")
    cat("  year = {},\n")
    cat("}\n")
  }
}
Integration with Document Generation
Automatic Bibliography Distribution
When generating text, automatically copy the bibliography:
# Generate methods text with automatic bibliography copying
methods_text <- boilerplate_generate_text(
  category = "methods",
  sections = c("sample.default", "analysis.primary"),
  global_vars = list(n = 1000),
  db = db,
  copy_bibliography = TRUE,
  bibliography_path = "." # Copy to project root
)
# The bibliography is now available for your Quarto/R Markdown document
Advanced Workflows
Project-Specific Bibliography Subsets
For large bibliographies, create project-specific subsets:
# Get citations used in current project
validation <- boilerplate_validate_references(db)
used_refs <- validation$used
# Read full bibliography
bib_lines <- readLines("references.bib")
# Extract entries for used citations
# (This is a simplified example - real implementation would need proper BibTeX parsing)
project_bib <- extract_bibtex_entries(bib_lines, used_refs)
# Write project-specific bibliography
writeLines(project_bib, "project_references.bib")
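The `extract_bibtex_entries()` call above is not defined anywhere in the code shown here, so it needs a definition before the example will run. One possible line-based sketch follows. It assumes every entry begins on its own line with `@type{key,` and runs until the next such line, which holds for most exported .bib files but is not a full BibTeX parser (`@string` or `@comment` blocks are treated like ordinary entries).

```r
# Hypothetical helper: keep only the entries whose key appears in `used_refs`.
# `used_refs` may carry a leading "@", as in the validation output above.
extract_bibtex_entries <- function(bib_lines, used_refs) {
  wanted <- sub("^@", "", used_refs)
  starts <- grep("^\\s*@\\w+\\s*\\{", bib_lines)
  if (length(starts) == 0) {
    return(character(0))
  }
  # Each entry ends on the line before the next entry starts
  ends <- c(starts[-1] - 1, length(bib_lines))
  keys <- sub("^\\s*@\\w+\\s*\\{\\s*([^,[:space:]]+).*$", "\\1", bib_lines[starts])
  keep <- which(keys %in% wanted)
  as.character(unlist(lapply(keep, function(i) bib_lines[starts[i]:ends[i]])))
}

# Made-up bibliography with three entries
demo_bib <- c(
  "@article{smith2023,", "  title = {A},", "}", "",
  "@article{jones2024,", "  title = {B},", "}", "",
  "@book{doe2022meta,", "  title = {C},", "}"
)
demo_subset <- extract_bibtex_entries(demo_bib, c("@smith2023", "@doe2022meta"))
```

For bibliographies with nested braces on entry lines or unusual formatting, a dedicated BibTeX parser would be a safer choice than this line-based approach.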
Multi-Author Collaboration
For collaborative projects with shared boilerplate:
# 1. Team lead sets up central bibliography
team_db <- boilerplate_import()
team_db <- boilerplate_add_bibliography(
  team_db,
  url = "https://github.com/our-lab/shared-refs/raw/main/lab_references.bib",
  local_path = "lab_references.bib"
)
# 2. Each team member updates their local cache
bib_file <- boilerplate_update_bibliography(team_db, force = TRUE)
# 3. Validate before submission
validation <- boilerplate_validate_references(team_db)
stopifnot(validation$valid) # Ensure no missing references
Automated Reference Checking
Add to your CI/CD pipeline:
# check_references.R
# Call this script from a CI workflow (e.g., .github/workflows/check-references.yml)
# so that the check runs on every pull request
library(boilerplate)
db <- boilerplate_import()
validation <- boilerplate_validate_references(db, quiet = TRUE)
if (!validation$valid) {
  stop(
    "Missing references found: ",
    paste(validation$missing, collapse = ", ")
  )
}
message("All references validated successfully!")
Best Practices
1. Maintain a Central Bibliography
- Keep your bibliography in version control (e.g., GitHub)
- Use a consistent naming scheme for citation keys
- Update and maintain it regularly
- Consider using tools like Zotero with Better BibTeX for key management
2. Cache Management
# Check cache location
cache_dir <- "~/.boilerplate/cache"
# View cached files
list.files(cache_dir, pattern = "\\.bib$")
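# Clear cached files older than 30 days if needed
old_files <- list.files(
  cache_dir,
  pattern = "\\.bib$",
  full.names = TRUE
)
# file.mtime() returns POSIXct, so compare against a POSIXct cutoff rather than a Date
cutoff <- Sys.time() - as.difftime(30, units = "days")
file.remove(old_files[file.mtime(old_files) < cutoff])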
Troubleshooting
Common Issues and Solutions
Cache Issues
# Force fresh download
bib_file <- boilerplate_update_bibliography(db, force = TRUE)
# Check cache directory permissions
file.access("~/.boilerplate/cache", mode = 2) # 0 = success
Validation Errors
# Debug validation issues
validation <- boilerplate_validate_references(db, quiet = FALSE)
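# Check specific text for citations
text <- db$methods$sample$default
# grep(value = TRUE) would return the whole matching string,
# so extract the citation keys themselves with regmatches()
citations <- unlist(regmatches(text, gregexpr("@[a-zA-Z0-9_:-]+", text)))
print(citations)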
Complete Example Workflow
Here’s a complete workflow from setup to document generation:
# 1. Initial setup (run once)
library(boilerplate)
# Initialise new project
boilerplate_init(create_dirs = TRUE)
# Import database
db <- boilerplate_import()
# Configure bibliography
db <- boilerplate_add_bibliography(
  db,
  url = "https://raw.githubusercontent.com/go-bayes/templates/refs/heads/main/bib/references.bib",
  local_path = "references.bib"
)
# Save configuration
boilerplate_save(db)
# 2. Daily workflow
# Update bibliography if needed
boilerplate_update_bibliography(db)
# Copy to project
boilerplate_copy_bibliography(db, ".")
# 3. Before submission
# Validate all references
validation <- boilerplate_validate_references(db)
if (validation$valid) {
  message("Ready for submission!")
} else {
  warning("Missing references: ", paste(validation$missing, collapse = ", "))
}
# 4. Generate final document
final_text <- boilerplate_generate_text(
  category = "methods",
  sections = c("all"),
  db = db,
  copy_bibliography = TRUE
)
Summary
The boilerplate package’s bibliography management features provide:
- Centralised management - One bibliography, many projects
- Automatic distribution - Bibliography copied when needed
- Validation - Ensure all citations are defined
- Caching - Fast local access with periodic updates
- Integration - Works seamlessly with Quarto/R Markdown
By following this workflow, you can maintain consistent, accurate references across all your manuscripts while reducing duplicate effort and potential errors.