How Well Does Your Agent Read the Web?
An open benchmark measuring token efficiency across 37 real websites. Fewer tokens mean faster agents, lower costs, and more context for reasoning.
Top Compression Leaders
[Bar chart: SOM tokens (teal) versus raw HTML tokens (dark) for the top-compressing sites; lower SOM is better.]
Full Benchmark Results
All 37 sites, sorted by compression ratio (highest first)
| SITE | CATEGORY | HTML TOKENS | SOM TOKENS | COMPRESSION |
|---|---|---|---|---|
| cloud.google.com | SaaS & Cloud | 862K | 6K | 133.7x |
| arstechnica.com | News & Media | 141K | 1K | 108.9x |
| kubernetes.io/docs | Dev Tools | 125K | 1K | 100.8x |
| techcrunch.com | News & Media | 139K | 1K | 97.5x |
| nytimes.com | News & Media | 438K | 5K | 97.3x |
| linear.app | SaaS & Cloud | 918K | 11K | 84.3x |
| stripe.com/docs | SaaS & Cloud | 356K | 7K | 53.3x |
| docker.com | SaaS & Cloud | 127K | 3K | 48.7x |
| tailwindcss.com | SaaS & Cloud | 396K | 9K | 46.2x |
| httpbin.org | General | 3K | 79 | 37.6x |
| nodejs.org | General | 185K | 5K | 36.9x |
| wired.com | News & Media | 460K | 15K | 30.5x |
| vercel.com | SaaS & Cloud | 348K | 12K | 30.0x |
| typescriptlang.org | Dev Tools | 103K | 4K | 23.4x |
| nextjs.org | Dev Tools | 123K | 6K | 21.3x |
| aws.amazon.com | SaaS & Cloud | 108K | 5K | 20.1x |
| theguardian.com | News & Media | 443K | 27K | 16.3x |
| azure.microsoft.com | SaaS & Cloud | 158K | 15K | 10.5x |
| github.com/plasmate-labs/plasmate | Dev Tools | 180K | 19K | 9.3x |
| angular.dev | Dev Tools | 32K | 4K | 7.4x |
| en.wikipedia.org/wiki/Rust_(programming_language) | General | 189K | 28K | 6.9x |
| vuejs.org | Dev Tools | 34K | 9K | 3.9x |
| getbootstrap.com | Dev Tools | 29K | 10K | 3.0x |
| developer.mozilla.org/en-US/docs/Web/JavaScript | Dev Tools | 53K | 22K | 2.4x |
| svelte.dev | Dev Tools | 38K | 18K | 2.1x |
| lobste.rs | General | 18K | 9K | 1.9x |
| medium.com | News & Media | 3K | 1K | 1.8x |
| docs.rs | Dev Tools | 5K | 4K | 1.2x |
| rust-lang.org | Dev Tools | 5K | 5K | 1.0x |
| pypi.org | Dev Tools | 6K | 7K | 0.9x |
| news.ycombinator.com | General | 12K | 15K | 0.8x |
| jsonplaceholder.typicode.com | General | 2K | 3K | 0.8x |
| python.org | General | 9K | 15K | 0.7x |
| postgresql.org | Dev Tools | 6K | 9K | 0.7x |
| example.com | General | 152 | 331 | 0.5x |
| crates.io | General | 70 | 348 | 0.2x |
| producthunt.com | General | 3K | 26K | 0.1x |
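The COMPRESSION column is consistent with raw HTML tokens divided by SOM tokens, so a value below 1.0x means the SOM output was larger than the raw HTML. The Python sketch below shows that arithmetic. `parse_tokens` and `compression_ratio` are hypothetical helper names, not part of Plasmate, and because most cells are rounded to the nearest thousand, ratios recomputed from them only approximate the published figures.

```python
def parse_tokens(cell: str) -> int:
    """Convert a table cell such as '862K' or '79' to an integer token count."""
    cell = cell.strip().upper()
    if cell.endswith("K"):
        # 'K' cells are rounded to the nearest thousand in the table.
        return int(round(float(cell[:-1]) * 1000))
    return int(cell)


def compression_ratio(html_tokens: int, som_tokens: int) -> float:
    """Raw HTML tokens divided by SOM tokens (assumed definition of the column)."""
    if som_tokens <= 0:
        raise ValueError("som_tokens must be positive")
    return html_tokens / som_tokens


# Rows whose cells are exact counts rather than rounded 'K' values.
for site, html_cell, som_cell in [("example.com", "152", "331"),
                                  ("crates.io", "70", "348")]:
    ratio = compression_ratio(parse_tokens(html_cell), parse_tokens(som_cell))
    print(f"{site}: {ratio:.1f}x")
```

The two smallest pages report exact counts, so they reproduce the published ratios: 152 / 331 rounds to 0.5x and 70 / 348 rounds to 0.2x.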
Category Breakdown
Average compression ratio by site category
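One plausible way to produce a per-category average is the arithmetic mean of the per-site ratios. The page does not state which averaging it uses, so treat that as an assumption. The sketch below uses six rows copied from the results table, so its output is illustrative and is not the published category breakdown; `mean_ratio_by_category` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# (site, category, compression ratio): a six-row subset of the results table.
ROWS = [
    ("arstechnica.com", "News & Media", 108.9),
    ("techcrunch.com", "News & Media", 97.5),
    ("kubernetes.io/docs", "Dev Tools", 100.8),
    ("docs.rs", "Dev Tools", 1.2),
    ("httpbin.org", "General", 37.6),
    ("example.com", "General", 0.5),
]


def mean_ratio_by_category(rows):
    """Group per-site ratios by category and return the arithmetic mean of each group."""
    grouped = defaultdict(list)
    for _site, category, ratio in rows:
        grouped[category].append(ratio)
    return {category: mean(ratios) for category, ratios in grouped.items()}


for category, avg in sorted(mean_ratio_by_category(ROWS).items()):
    print(f"{category}: {avg:.1f}x")
```

An arithmetic mean is pulled upward by a few very high ratios; swapping `mean` for `statistics.median` gives a view that is less sensitive to outliers.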
Cost at Scale
Estimate daily savings switching from raw HTML to SOM
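A back-of-the-envelope version of this estimate multiplies the tokens saved per page by the daily page volume and by the price per input token. In the sketch below, the function name, the 10,000 pages per day, and the 3.00 USD per million input tokens are all placeholder assumptions; substitute your own model pricing and traffic. The token counts are the rounded nytimes.com figures from the results table.

```python
def daily_savings_usd(html_tokens: int, som_tokens: int,
                      pages_per_day: int, usd_per_million_tokens: float) -> float:
    """Estimated daily input-token cost saved by sending SOM instead of raw HTML."""
    tokens_saved_per_page = html_tokens - som_tokens
    return tokens_saved_per_page * pages_per_day / 1_000_000 * usd_per_million_tokens


# nytimes.com row: 438K HTML tokens vs 5K SOM tokens (rounded table values).
# Volume and price are hypothetical placeholders.
estimate = daily_savings_usd(438_000, 5_000, pages_per_day=10_000,
                             usd_per_million_tokens=3.00)
print(f"${estimate:,.2f} per day")  # prints "$12,990.00 per day"
```

For sites where the SOM output is larger than the raw HTML (ratios below 1.0x), the function returns a negative number, meaning an extra cost and not a saving.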
Methodology
| PARAMETER | VALUE |
|---|---|
| Plasmate version | 0.5.1 |
| HTML baseline | curl -sL (raw HTTP, no rendering) |
| Token counter | tiktoken cl100k_base (GPT-4 tokenizer) |
| Date | May 18, 2026 |
| Platform | Linux x86_64 |
| Sites | 37 attempted, 37 successful, 0 failed (anti-bot) |
| Source | github.com/plasmate-labs/plasmate-benchmarks |
SOM is defined by the open SOMspec specification.
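The measurement described above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the benchmark's actual script: `fetch_raw_html` stands in for `curl -sL` using the standard library, `load_cl100k_encoder` requires the third-party `tiktoken` package, and the SOM text is taken as an input string because the Plasmate invocation is not shown here. All function names are hypothetical.

```python
import urllib.request
from typing import Callable, Sequence


def fetch_raw_html(url: str, timeout: float = 30.0) -> str:
    """Fetch a page over plain HTTP with no rendering (rough analogue of curl -sL).

    urlopen follows redirects by default.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def load_cl100k_encoder() -> Callable[[str], Sequence[int]]:
    """Return an encode function for cl100k_base (needs `pip install tiktoken`)."""
    import tiktoken  # imported lazily so the rest of the sketch runs without it

    enc = tiktoken.get_encoding("cl100k_base")
    # disallowed_special=() encodes strings like "<|endoftext|>" as ordinary
    # text instead of raising, which matters for arbitrary web content.
    return lambda text: enc.encode(text, disallowed_special=())


def compare(html: str, som: str, encode: Callable[[str], Sequence]) -> dict:
    """Count tokens in both representations and report their ratio."""
    html_tokens = len(encode(html))
    som_tokens = len(encode(som))
    return {
        "html_tokens": html_tokens,
        "som_tokens": som_tokens,
        "compression": html_tokens / som_tokens if som_tokens else float("inf"),
    }
```

Passing the encoder in as a parameter keeps the counting logic independent of any one tokenizer, so the same comparison can be repeated with a different model's tokenizer.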
Contribute
Add your own sites to the benchmark:
git clone https://github.com/plasmate-labs/plasmate-benchmarks
cd plasmate-benchmarks
# Add your URL to urls.txt
./run-benchmark.sh
# Submit a PR with your results
Observatory Vision
Re-run weekly against the latest Plasmate release; watch the GitHub repo for update notifications. The goal is to track how the web is changing for AI agents: which sites are becoming more agent-friendly and which are getting worse. Results follow the WebTaskBench Protocol v1.0, a reproducible methodology open to third-party submissions.
Badges & Certifications
For SOM compliance scoring, badges, and certifications, see somordom.com — the community's SOM compliance tool.