Content & Marketing
Blog Writer
Blog Writer creates polished, SEO-aware blog drafts that are ready for editorial review and fast publication. It can shape the angle, metadata, structure, and body copy around a primary keyword without turning the result into generic filler. Marketing teams, content operators, agencies, and founders benefit when they need a dependable drafting engine for thought leadership, case studies, and practical guides. The skill is designed to help produce content that ranks, reads well, and stays aligned to business intent. What makes it production-grade is the combination of strategy and structure: title and slug generation, meta description support, body copy, and internal-link thinking in one output. It behaves like a content system component, not a one-off writing prompt.
One-Time Purchase
$19.99
title: "API Rate Limiting — 5 Patterns That Actually Scale"
slug: api-rate-limiting-patterns-that-scale
meta_description: "Five production-proven API rate limiting patterns with code examples. From token buckets to sliding windows — pick the right approach for your traffic profile."
target_keyword: "API rate limiting"
secondary_keywords:
- "rate limiting patterns"
- "token bucket algorithm"
- "API throttling best practices"
word_count: 2847
estimated_read_time: "12 min"
category: "Engineering"
tags:
- api-design
- rate-limiting
- scalability
cta_type: newsletter
API Rate Limiting — 5 Patterns That Actually Scale
Editor's read
Five rate-limiting patterns, ranked by how well they survive contact with real traffic. The headline is unromantic: token bucket wins for most APIs, sliding-window log wins for fairness-sensitive APIs, and fixed-window counters should die in a fire. The full piece walks through code, failure modes, and how each pattern behaves during a retry storm.
At a glance
Your API handles 10,000 requests per second until it doesn't. One aggressive scraper, one retry storm from a misbehaving client, and your database connection pool is exhausted before your on-call engineer finishes reading the PagerDuty alert. Rate limiting isn't optional infrastructure — it's the difference between a graceful 429 response and a cascading outage.
This post walks through the five patterns we have actually run in production at Northbeam Analytics — when each one is the right choice, when it stops being one, and the small implementation details that turn a clean algorithm into a 2 AM page.
TL;DR — Pattern Selection
Default choice
Token bucket
Burst-tolerant, sustained ceiling, easy to reason about
When fairness matters
Sliding-window log
Exact, expensive, worth it for billing-grade limits
1. Token Bucket — The Default Choice for Most APIs
The token bucket algorithm is the most widely deployed rate-limiting pattern for a reason: it handles bursts gracefully while enforcing a sustained rate ceiling. Tokens accumulate at a fixed rate (say, 100/second), and each request consumes one token. When the bucket is empty, requests receive a 429 until tokens replenish.
The implementation is short, the failure modes are well understood, and Redis MULTI/EXEC makes it safe under concurrency, provided the transaction is paired with WATCH (or swapped for a Lua script) so the read-refill-write sequence stays atomic. The one trap: setting the bucket too large to "be generous" turns a rate limit into an availability problem, because a single bad actor can drain the bucket and starve well-behaved clients before the refill catches up.
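The refill-and-consume loop described above can be sketched in a few lines. This is a minimal single-process, in-memory illustration, not the Redis-backed version the post refers to; the class name `TokenBucket` and the injectable `clock` parameter are assumptions made here so the behavior can be exercised deterministically.

```python
import time


class TokenBucket:
    """In-memory token bucket: refills continuously, capped at `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # largest burst the bucket admits
        self.clock = clock              # injectable so tests can control time
        self.tokens = float(capacity)   # start full
        self.updated = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Credit the tokens earned since the last call, never above capacity.
        earned = (now - self.updated) * self.rate
        self.tokens = min(self.capacity, self.tokens + earned)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # the caller would answer with HTTP 429
```

A shared deployment would need the same read, refill, and write steps to run atomically in the backing store, which is the part this sketch leaves out.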
Key takeaway
If you don't have a specific reason to pick something else, pick token bucket. The "specific reason" is usually one of: you need exact fairness, as with billing-grade limits (use sliding-window log), you need boundary-safe limits without the log's per-request cost (use sliding-window counter), or you need per-user concurrency limits rather than per-second limits (use leaky bucket).
2. Sliding-Window Log — Exact, Expensive, Worth It Sometimes
The sliding-window log enforces the same per-second ceiling as token bucket, but it eliminates the edge-of-window double-spend that fixed-window counters allow. The cost is a Redis ZSET write on every request, which is measurable at 10K+ RPS but worth it for fairness-sensitive APIs.
The sliding-window log keeps a sorted set of recent request timestamps and counts how many fall inside the window for each new request. It is the most accurate of the five patterns and the most expensive. We run it on the public API where customers are billed per call; we do not run it on the internal RPC mesh where token bucket is fine.
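A rough in-memory sketch of the timestamp log follows. It uses a `deque` in place of the Redis sorted set mentioned above, and the class name `SlidingWindowLog` and the `clock` parameter are illustrative assumptions. In Redis the same steps would typically map onto sorted-set commands such as `ZREMRANGEBYSCORE`, `ZCARD`, and `ZADD`.

```python
import time
from collections import deque


class SlidingWindowLog:
    """Keeps one timestamp per admitted request and counts those still in the window."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit        # max requests per window
        self.window = window      # window length in seconds
        self.clock = clock        # injectable so tests can control time
        self.stamps = deque()     # admitted request times, oldest first

    def allow(self):
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.stamps and self.stamps[0] <= now - self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            self.stamps.append(now)
            return True
        return False
```

Memory grows with the limit rather than staying constant, which is the expense the section describes.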
3. Fixed-Window Counter — Don't
| Pattern | Accuracy | Memory | Production-ready? |
|---|---|---|---|
| Token bucket | Good | Low | Yes |
| Sliding-window log | Exact | High | Yes |
| Sliding-window counter | Good | Low | Yes |
| Leaky bucket | Good for concurrency | Low | Yes |
| Fixed-window counter | Burst-prone | Lowest | Tutorials only |
Fixed-window counters allow 2× the configured rate at the window boundary. We learned this the painful way during a retry storm: a misbehaving client got through 200 requests in two seconds against a "100 requests per minute" limit because it caught the final second of one window and the first second of the next. Use a sliding-window counter instead: same memory profile, no boundary attack.
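The boundary problem is easy to reproduce. The sketch below is a hypothetical in-memory counter (the names `FixedWindowCounter` and `boundary_burst` are made up for illustration) that replays the scenario described above: 100 requests in the last second of one minute and 100 more in the first second of the next.

```python
class FixedWindowCounter:
    """Counts requests per aligned window and resets at each boundary."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.current_window = None
        self.count = 0

    def allow(self, now):
        window_id = int(now // self.window)
        if window_id != self.current_window:
            # New window: the counter forgets everything that came before.
            self.current_window = window_id
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


def boundary_burst():
    limiter = FixedWindowCounter(limit=100, window=60)
    # 100 requests between t=59.00 and t=59.99, then 100 between t=60.00 and t=60.99.
    late = sum(limiter.allow(59.0 + i / 100) for i in range(100))
    early = sum(limiter.allow(60.0 + i / 100) for i in range(100))
    return late + early
```

Calling `boundary_burst()` returns 200: every request is admitted, because each batch lands in a different window even though the two batches are less than two seconds apart.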
4. Sliding-Window Counter — The Pragmatic Compromise
The sliding-window counter approximates the sliding-window log with two integer counters and a weighted average. Memory and per-request work look like fixed-window counter; the boundary behavior looks like sliding-window log. This is the pattern we recommend when token bucket isn't a fit and the log is too expensive.
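One way to read "two integer counters and a weighted average" is sketched below. The weighting shown, where the previous window's count is scaled by how much of it still overlaps the sliding window, is a common formulation and is assumed here rather than taken from the post. The class name `SlidingWindowCounter` is illustrative.

```python
class SlidingWindowCounter:
    """Approximates a sliding window from the current and previous window counts."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.window_id = 0
        self.current = 0     # requests admitted in the current window
        self.previous = 0    # requests admitted in the window before it

    def allow(self, now):
        window_id = int(now // self.window)
        if window_id != self.window_id:
            # Roll forward; the old count only carries over from the adjacent window.
            adjacent = window_id == self.window_id + 1
            self.previous = self.current if adjacent else 0
            self.current = 0
            self.window_id = window_id
        # Fraction of the current window that has already elapsed.
        elapsed = (now % self.window) / self.window
        # Weight the previous window by the share of it still inside the sliding window.
        estimated = self.previous * (1 - elapsed) + self.current
        if estimated < self.limit:
            self.current += 1
            return True
        return False
```

Replaying a burst that straddles a window boundary against this limiter admits only a handful of requests in the new window, because the previous window's count still carries nearly full weight.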
5. Leaky Bucket — Concurrency, Not Throughput
Leaky bucket is often described as a token-bucket variant; it isn't. It enforces an outflow rate, not an arrival rate, and is the right primitive when the constraint is "this downstream can only handle N concurrent calls" rather than "this client gets N requests per second." We use it in front of a payment processor that throttles us at 50 in-flight transactions.
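The outflow-rate idea can be illustrated with a bounded queue that releases at most one item per interval. This is a simplified sketch under stated assumptions: the names `LeakyBucketQueue`, `offer`, and `poll` are hypothetical, the caller is expected to invoke `poll` on a timer, and the sketch paces releases only; it does not track how many calls are still in flight downstream.

```python
from collections import deque


class LeakyBucketQueue:
    """Bounded queue that releases at most one item per `leak_interval` seconds."""

    def __init__(self, capacity, leak_interval):
        self.capacity = capacity            # how many requests may wait
        self.leak_interval = leak_interval  # minimum spacing between releases
        self.queue = deque()
        self.next_release = 0.0             # earliest time the next item may leave

    def offer(self, item):
        # Arrivals are only rejected when the bucket is full.
        if len(self.queue) >= self.capacity:
            return False
        self.queue.append(item)
        return True

    def poll(self, now):
        """Return the next queued item if one is due at `now`, else None."""
        if self.queue and now >= self.next_release:
            self.next_release = now + self.leak_interval
            return self.queue.popleft()
        return None
```

Arrivals can be as bursty as the queue capacity allows, while the downstream never sees more than one release per interval.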
Pitfalls We've Hit
Pitfall 1 — counting before authenticating
Counting against a token-bucket key before the request is authenticated lets an attacker exhaust other users' quotas by spoofing identifiers. Rate-limit by the authenticated identity, not the client-provided one.
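As a small illustration of keying on verified identity, the helper below builds the limiter key only from values the server established itself. The function name, the key prefixes, and the fallback to the connection's IP address for unauthenticated traffic are all assumptions made for this sketch.

```python
def rate_limit_key(authenticated_user_id, remote_ip):
    """Build a limiter key from server-verified values only.

    `authenticated_user_id` is whatever the auth layer verified (None if the
    request is anonymous). Client-supplied identifiers such as a custom header
    are deliberately not parameters here, so they cannot influence the key.
    """
    if authenticated_user_id is not None:
        return f"rl:user:{authenticated_user_id}"
    # Anonymous traffic shares a per-address bucket instead of a user quota.
    return f"rl:ip:{remote_ip}"
```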
Pitfall 2 — Redis as a SPOF
Every pattern here assumes Redis. When Redis is degraded, "fail open" or "fail closed" is a product decision, not an implementation detail. Default to fail open with a circuit breaker and a noisy alert; defaulting to fail closed turns a Redis blip into an outage.
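A fail-open wrapper with a simple circuit breaker might look like the sketch below. Everything named here (`FailOpenLimiter`, `failure_threshold`, `cooldown`, the `alert` callback) is hypothetical, and `check` stands in for whatever function asks the backing store whether a key is within its limit.

```python
import time


class FailOpenLimiter:
    """Wraps a limiter check; admits traffic and alerts when the backend keeps failing."""

    def __init__(self, check, failure_threshold=3, cooldown=30.0,
                 clock=time.monotonic, alert=print):
        self.check = check                          # callable(key) -> bool, may raise
        self.failure_threshold = failure_threshold  # consecutive errors before tripping
        self.cooldown = cooldown                    # seconds to skip the backend
        self.clock = clock
        self.alert = alert                          # should page or log loudly
        self.failures = 0
        self.open_until = 0.0

    def allow(self, key):
        now = self.clock()
        if now < self.open_until:
            return True  # breaker is open: skip the backend and fail open
        try:
            allowed = self.check(key)
        except Exception as exc:  # broad on purpose: any backend error fails open
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.open_until = now + self.cooldown
                self.failures = 0
                self.alert(f"rate limiter backend failing, open for {self.cooldown}s: {exc}")
            return True
        self.failures = 0
        return allowed
```

Swapping the two `return True` lines for `return False` gives the fail-closed variant, which is why the choice reads as a product decision rather than a code detail.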
Pitfall 3 — bucket sizing by gut feel
The bucket size and refill rate should fall out of an actual measurement of legitimate burst behavior at the 99th percentile, not a round number that "feels generous." Generous is the same word every team uses right before a self-inflicted outage.
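One way to turn that measurement into a number is to bucket request timestamps from known-good traffic into fixed intervals and take the 99th percentile of the per-interval counts. The sketch below assumes a nearest-rank percentile and only considers intervals that saw at least one request; the function name `p99_burst` is illustrative.

```python
from collections import Counter


def p99_burst(timestamps, interval=1.0):
    """99th-percentile request count per `interval` seconds (nearest-rank method).

    Only intervals containing at least one request are counted, so idle
    periods do not drag the percentile down.
    """
    counts = sorted(Counter(int(t // interval) for t in timestamps).values())
    if not counts:
        return 0
    # Nearest-rank: smallest rank covering 99% of intervals, using integer ceiling.
    rank = (99 * len(counts) + 99) // 100
    return counts[rank - 1]
```

The result is a candidate bucket capacity for a single client's timestamps; the refill rate would come from that client's sustained average instead.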
Internal Linking Suggestions
| Anchor text | Target page | Location in post |
|---|---|---|
| connection pool exhaustion | Database connection management guide | Intro, paragraph 1 |
| retry storms | Circuit breaker pattern article | Pitfalls, callout 1 |
| Redis MULTI/EXEC | Atomic Redis ops reference | Section 1, paragraph 2 |
Social Distribution Copy
Twitter/X (under 280 chars): Most rate-limiting tutorials stop at token buckets. Five patterns we've actually run in production at 10K+ RPS — including the one that turned a retry storm into a quiet alert instead of a page. Thread.
LinkedIn (under 700 chars): Rate limiting seems simple until you're debugging a cascading failure at 2 AM. After implementing all five major patterns across different production systems, here's the unromantic ranking — including the one pattern that should not exist outside of tutorials, and the one most teams reach for too early.
This sample illustrates the skill's output format. Northbeam Analytics is a fictional company used repeatedly across these sample outputs. Real client data is never included in sample outputs.
All sales final. No refunds on digital products.
Includes support for Claude Code, Codex, OpenClaw, and Google Antigravity in the same license.
Also in Content Publishing
Bundle price: $55. Compare this skill with the full workflow bundle or Pro access.
Best for
Marketing and DevRel teams maintaining a content calendar of 4–12 posts per month, founders building organic SEO presence in a category their buyers actually search, and agency operators producing client deliverables where keyword research and on-page structure both need to land. Most useful when the topic is one a competent human could write — the skill amplifies expertise, it doesn’t fabricate it.
Not ideal for
Personal-essay or opinion content where the value is a specific human perspective and the structured SEO scaffolding gets in the way. Also a poor fit for technical deep-dives in domains where the writer has no firsthand experience — without expert review, the draft will read confident in places it should not.
Included in this purchase
- Claude Code, Codex, OpenClaw, and Google Antigravity skill files.
- Setup guidance for the right adapter in your workspace.
- One-time license for the purchased skill version.
Setup
Plan for a short copy-and-configure setup in your preferred agent workspace. No custom integration is required for the skill file itself.
Future Updates
This purchase includes the current version of the skill. If you want future adapter updates — meaning compatibility and packaging updates as supported platforms evolve — plus new catalog additions included automatically, upgrade to Pro.