Reference Collection
Library
A curated reference collection for engineers who read beyond the documentation. The Library holds the research papers that changed how we build, the whitepapers that define best practice, canonical architecture diagrams for the patterns you encounter in production, and raw datasets for analysis and modelling.
Unlike the Archive — which is long-form editorial writing — the Library is a reference shelf. Come here to find, download, and study. Currently 11 resources.
Peer-reviewed work that moved the field.
Academic papers from institutions and conferences that have meaningfully advanced software engineering, systems design, and machine learning. Each entry is curated for practical relevance to working engineers, not just researchers.
The seminal 2017 paper that introduced the Transformer, the self-attention architecture underpinning modern LLMs.
Microsoft Research paper reporting that 1-bit quantised LLMs can match full-precision models at scale.
The 2007 paper describing the design of Amazon Dynamo, foundational reading for distributed systems engineers.
Authoritative guidance from the organisations building the infrastructure.
Industry whitepapers from cloud providers, standards bodies, and major engineering organisations. These documents define the best practices, architectural frameworks, and security standards that underpin production systems worldwide.
Google's comprehensive whitepaper covering reliability, security, performance, and cost optimisation on GCP.
Amazon's six-pillar framework for building secure, performant, resilient, and efficient cloud infrastructure.
The NIST special publication defining zero trust principles, components, and deployment models for modern security.
Diagrams worth a thousand words — and a thousand decisions.
Annotated system diagrams for common and uncommon architectural patterns: microservices topologies, edge deployments, AI pipelines, data infrastructure, and more. Each diagram is designed to be a practical starting point, not an idealised abstraction.
Event-Driven Microservices — Reference Architecture
A canonical reference diagram for event-driven microservices with a message broker, event store, and multiple consumers. Includes Kafka topology layers.
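As a rough illustration of the three roles named in this diagram (broker, event store, multiple consumers), here is a minimal in-memory sketch. It is not taken from the diagram: the `Broker` and `Event` classes and the `order.placed` topic are hypothetical names, and a production system would use a real broker such as Kafka rather than a Python list.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Event:
    topic: str
    payload: dict


@dataclass
class Broker:
    """In-memory stand-in for a message broker with an append-only event store."""

    event_store: list = field(default_factory=list)
    subscribers: dict = field(default_factory=lambda: defaultdict(list))

    def subscribe(self, topic: str, handler: Callable[[Event], None]) -> None:
        # Register one more consumer for this topic.
        self.subscribers[topic].append(handler)

    def publish(self, event: Event) -> None:
        # Persist the event first, then fan it out to every consumer of the topic.
        self.event_store.append(event)
        for handler in self.subscribers[event.topic]:
            handler(event)

    def replay(self, topic: str, handler: Callable[[Event], None]) -> None:
        # Feed the stored history of a topic to a consumer that joined late.
        for event in self.event_store:
            if event.topic == topic:
                handler(event)
```

Because every event is stored before it is delivered, a consumer added later can call `replay` to rebuild its state from history, which is the main reason for pairing a broker with an event store.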
Edge-First API Architecture
Reference topology for deploying API logic at the network edge using Cloudflare Workers and Durable Objects, with origin fallback.
RAG Pipeline — Production Architecture
End-to-end reference architecture for a Retrieval-Augmented Generation system: ingest, embed, index, query, and rerank stages.
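The five stages listed here can be sketched in a few lines of standard-library Python. This is a toy under stated assumptions, not the architecture itself: the bag-of-words `embed`, the list-based index, and the term-overlap `rerank` are all illustrative stand-ins for a learned embedding model, a vector database, and a trained reranker.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words term counts (assumption, not a learned model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(documents: list) -> list:
    """Ingest, embed, and index: store each document next to its vector."""
    return [(doc, embed(doc)) for doc in documents]


def query(index: list, question: str, k: int = 3) -> list:
    """Query: return the k documents most similar to the question."""
    q = embed(question)
    scored = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in scored[:k]]


def rerank(question: str, candidates: list) -> list:
    """Rerank: prefer candidates sharing more distinct terms with the question.

    The sort is stable, so ties keep their retrieval order.
    """
    q_terms = set(question.lower().split())
    return sorted(
        candidates,
        key=lambda doc: len(q_terms & set(doc.lower().split())),
        reverse=True,
    )
```

Keeping retrieval and reranking as separate functions mirrors the usual split between a cheap first pass over the whole index and a more expensive second pass over a short candidate list.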
Raw material for analysis, modelling, and experimentation.
Structured datasets compiled for data science, ML research, and benchmarking. Sources are documented, schemas are included, and files are versioned. Covers LLM benchmarks, cloud pricing, performance baselines, and more.
LLM Benchmark Results — Feb 2026
Structured CSV of MMLU, HumanEval, and HellaSwag scores across 40 open-weight models, normalised for comparison.
Cloud Egress Cost Dataset — 2024
Per-region egress pricing data for AWS, GCP, and Azure, scraped quarterly. Useful for cost modelling.
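For cost modelling, a flat per-GB estimate is a reasonable first pass. The sketch below assumes a hypothetical three-column schema (`provider`, `region`, `usd_per_gb`) and uses placeholder prices; neither the column names nor the numbers come from the dataset, and real egress pricing is often tiered by volume, which this ignores.

```python
import csv
import io

# Hypothetical schema with placeholder values. These are NOT real prices
# and NOT rows from the dataset.
SAMPLE = """provider,region,usd_per_gb
example-cloud-a,region-1,0.10
example-cloud-b,region-1,0.08
"""


def load_rates(csv_text: str) -> dict:
    """Parse CSV text into a {(provider, region): usd_per_gb} lookup."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        (row["provider"], row["region"]): float(row["usd_per_gb"])
        for row in reader
    }


def monthly_egress_cost(
    rates: dict, provider: str, region: str, gb_per_month: float
) -> float:
    """Flat-rate estimate: price per GB multiplied by monthly volume."""
    return rates[(provider, region)] * gb_per_month
```

To use it with the real file, read the CSV into a string and adjust the column names in `load_rates` to match the published schema.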
Submissions welcome. Found a paper, diagram, or dataset that belongs here? Get in touch →