indielist (beta)

About IndieListBot

User-Agent: IndieListBot/1.0 (+https://indielist.io/about/bot)

What we do

indielist is a directory of indie game studios, publishers, and games. To keep listings accurate, we periodically fetch publicly available data from sources that explicitly allow third-party access.

What we don't do

Rate limits

We honor source-specific rate limits with conservative buffers — for example, our Steam Storefront request interval is locked at 2.5 seconds with exponential backoff on any 429 response.
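As an illustration of the pattern described above, here is a minimal Python sketch of a fixed request interval combined with exponential backoff on HTTP 429. It is not our production fetcher: the function names (polite_get, backoff_delay), the retry count, and the 300-second cap are assumptions made for the example. Only the 2.5-second interval and the User-Agent string come from this page.

```python
import time
import urllib.error
import urllib.request

MIN_INTERVAL = 2.5  # seconds between requests (value stated on this page)
USER_AGENT = "IndieListBot/1.0 (+https://indielist.io/about/bot)"


def backoff_delay(attempt, base=MIN_INTERVAL, cap=300.0):
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped.

    The cap of 300 seconds is an assumption for this sketch.
    """
    return min(cap, base * (2 ** attempt))


def polite_get(url, max_retries=5, sleep=time.sleep, opener=urllib.request.urlopen):
    """Fetch `url`, retrying with exponential backoff whenever the server returns 429.

    `sleep` and `opener` are injectable so the logic can be tested offline.
    """
    for attempt in range(max_retries + 1):
        request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        try:
            with opener(request) as response:
                body = response.read()
            sleep(MIN_INTERVAL)  # keep the fixed gap before the next request
            return body
        except urllib.error.HTTPError as err:
            # Only 429 is retried; other errors, or exhausted retries, propagate.
            if err.code != 429 or attempt == max_retries:
                raise
            sleep(backoff_delay(attempt))
```

With these defaults the waits after consecutive 429 responses are 2.5, 5, 10, 20 and 40 seconds, so the interval never drops below the 2.5-second floor.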

How to block us

Add to your robots.txt:

User-agent: IndieListBot
Disallow: /

Removal & corrections

If a listing about you is wrong, or you'd like it removed, email bot@indielist.io. Include the URL and what you'd like changed. Use the same address for any access concerns.

Why we exist

The indie ecosystem moves fast and the last solid public directory (IndieDB) hasn't kept up. We're trying to fill that gap with structured data plus white-box tooling — for example, every sales estimate is shown alongside the formula that produced it, never as a single black-box number.

FAQ

How often does the bot crawl?
  • Steam reviews — daily (top 10K titles).
  • Steam Storefront metadata — weekly full sweep, daily for top 1K.
  • ITAD prices — twice weekly per game.
  • OpenCritic — monthly batches (50/day free-tier limit).
  • itch.io — RSS only, every few hours; we do not crawl itch.io HTML.

Per-source request intervals never go below the rate limit advertised by that source.
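The rule above amounts to a simple clamp. A hedged Python sketch, where effective_interval and its parameters are hypothetical names and the numbers in the usage lines are illustrative rather than any source's published limit:

```python
def effective_interval(configured_seconds, advertised_min_seconds, buffer_seconds=0.0):
    """Return the request interval to use for a source.

    The result is never below the source's advertised minimum; an optional
    buffer (an assumption in this sketch) pushes it further above that floor.
    """
    if min(configured_seconds, advertised_min_seconds, buffer_seconds) < 0:
        raise ValueError("intervals must be non-negative")
    return max(configured_seconds, advertised_min_seconds + buffer_seconds)


# Illustrative values only.
print(effective_interval(1.0, 2.0))        # configured too low, raised to the floor
print(effective_interval(2.0, 2.0, 0.5))   # floor plus buffer wins
```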

Where does the bot fetch from? What is the source IP?

Most fetchers run from Cloudflare's edge (Workers); IP ranges are listed at cloudflare.com/ips. High-volume / Playwright-based crawls (currently itch.io HTML, paused) run from a dedicated Hetzner VPS with a fixed IP we will publish here once that pipeline is live.

Will the bot bypass robots.txt?

No. We honor robots.txt for every host we touch. If your robots.txt blocks IndieListBot or *, we stop fetching from your site within one crawl cycle (≤ 24 hours).
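If you want to confirm that your rules actually cover our user agent, a check along these lines should work. It is a sketch using Python's standard urllib.robotparser, not the bot's own parser; is_allowed and the example.com URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="IndieListBot"):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Rule that blocks only IndieListBot.
BLOCK_US = """User-agent: IndieListBot
Disallow: /
"""

# Rule that blocks every crawler.
BLOCK_ALL = """User-agent: *
Disallow: /
"""

page = "https://example.com/games/1"  # placeholder URL
print(is_allowed(BLOCK_US, page))                         # False
print(is_allowed(BLOCK_US, page, user_agent="OtherBot"))  # True
print(is_allowed(BLOCK_ALL, page))                        # False
```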

My game's data is wrong. How do I fix it?
  1. Email bot@indielist.io with the page URL and what's wrong.
  2. If the field is sourced from Steam (review count, price, etc.), the next daily sync will pick up changes you make on the storefront within 24 hours.
  3. If the field is editorial (descriptions, team-size estimate, tier), we manually correct it within one business day.

Are sales estimates real? Can I trust them?

They are estimates, not authoritative numbers. We apply a multi-factor Boxleiter method to public Steam review counts, and the full formula is visible on every game page (the project's white-box commitment; see our methodology article). Actual sales can deviate from the estimate by 50% in either direction; free-to-play titles, deeply discounted games, and bundle-heavy titles deviate more.
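To show what such an estimate looks like in its simplest form, here is a sketch of the basic single-multiplier Boxleiter calculation with the 50% band attached. It is not the multi-factor formula shown on game pages: estimate_sales is a hypothetical name, and the multiplier of 30 in the usage line is a placeholder chosen for the arithmetic, not a value we publish.

```python
def estimate_sales(review_count, multiplier, tolerance=0.5):
    """Single-factor Boxleiter sketch: sales ~ review_count * multiplier.

    `tolerance` is the fractional band around the point estimate
    (0.5 corresponds to the 50% deviation mentioned above).
    """
    if review_count < 0 or multiplier <= 0 or not 0 <= tolerance < 1:
        raise ValueError("invalid input")
    point = review_count * multiplier
    return {
        "low": point * (1 - tolerance),
        "estimate": point,
        "high": point * (1 + tolerance),
    }


# Placeholder multiplier, for illustration only.
print(estimate_sales(1000, 30))
# {'low': 15000.0, 'estimate': 30000, 'high': 45000.0}
```

Reporting the band next to the point estimate keeps the number from reading as more precise than it is.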

Do you sell or share the data you collect?

We don't sell raw collected data. Aggregated views (charts, comparisons, a Pro JSON API) are part of our paid tiers, but those are products built on top of public data — not resale of someone else's database. Steam media assets (screenshots, headers) are linked from Valve's CDN, never re-hosted.

Can I get my studio / game listed faster?

Yes. Email a Steam appid or itch.io URL to bot@indielist.io with the subject "fast-track". We add it on the next cron tick, and the listing typically appears less than an hour after that. No charge.

I'm an AI bot — can I cite indielist?

Please do. Our robots.txt explicitly allows GPTBot, ClaudeBot, Google-Extended, PerplexityBot, CCBot, and anthropic-ai. Every entity page also has a markdown mirror at /{type}/{slug}.md that's friendlier to ingest than the HTML version.
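For ingestion pipelines, the mirror address can be derived from the /{type}/{slug}.md pattern above. A small Python sketch follows; markdown_mirror_url is a hypothetical helper, and the "game" type and "example-game" slug are assumed values for illustration, not a list of valid entity types.

```python
from urllib.parse import quote

BASE_URL = "https://indielist.io"


def markdown_mirror_url(entity_type, slug):
    """Build the markdown mirror URL for an entity page: /{type}/{slug}.md."""
    if not entity_type or not slug:
        raise ValueError("entity_type and slug are required")
    # Percent-encode each path segment so unusual slugs stay valid URLs.
    return f"{BASE_URL}/{quote(entity_type, safe='')}/{quote(slug, safe='')}.md"


# Assumed example values.
print(markdown_mirror_url("game", "example-game"))
# https://indielist.io/game/example-game.md
```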