Library

Data

Raw material for analysis, modelling, and experimentation.

Structured datasets compiled for data science, ML research, and benchmarking. Sources are documented, schemas are included, and files are versioned. Covers LLM benchmarks, cloud pricing, performance baselines, and more.

2 resources

LLM Benchmark Results — Feb 2026

csv

Structured CSV of MMLU, HumanEval, and HellaSwag scores across 40 open-weight models, normalised for comparison.

Bitstream Journal · Internal / Aggregated · Feb 2026 · 128 KB

LLMs · Machine Learning · AI Inference

Cloud Egress Cost Dataset — 2024

json

Per-region egress pricing data for AWS, GCP, and Azure, scraped quarterly. Useful for cost modelling.

Bitstream Journal · Internal / Aggregated · Dec 2024 · 44 KB

AWS · GCP · Azure · Data Engineering

Submissions welcome. Found a dataset that belongs here? Get in touch →