Reading App Store and Play Store Metadata Programmatically

How to pull app metadata from the App Store and Google Play with code — official APIs, the iTunes lookup endpoint, and what to do when no API exists.

If you want app titles, descriptions, ratings, version history, screenshots, or rankings in a script, the two stores could not be more different. Apple gives you a public, no-auth lookup endpoint. Google gives you almost nothing official for other people's apps. This post maps out what actually exists, what's reliable, and how to build something maintainable on top of it.

Apple: the iTunes Search & Lookup API

The most useful thing to know is that Apple has quietly exposed a public, unauthenticated endpoint for years — the iTunes Search API, which also powers a lookup-by-ID mode. It returns App Store metadata as JSON without any key.

# look up a single app by its numeric App Store ID
curl "https://itunes.apple.com/lookup?id=284882215"

# search by term, scoped to software
curl "https://itunes.apple.com/search?term=notion&entity=software&country=us"

The response includes trackName, description, version, releaseNotes, averageUserRating, userRatingCount, genres, screenshotUrls, sellerName, the bundle ID (bundleId), and the current price. A few things to keep in mind, based on long-standing public behavior of this endpoint:

  • It's country-scoped via the country parameter — ratings and availability differ per storefront, so always pin the country you mean.
  • It's intended for catalog/affiliate use and is rate-limited. Apple's documentation cites a limit of roughly 20 calls per minute (subject to change), and the endpoint has historically returned 403s under heavy bursts. Treat it as best-effort, cache aggressively, and back off on errors rather than hammering it.
  • It returns the current snapshot. There's no historical ranking or download data here.
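The points above (pin the country, back off on errors) can be sketched in a few lines of standard-library Python. This is a minimal sketch, not a reference client: the function names, the retry count, the delay schedule, and the choice to treat 403, 429, and 5xx as retryable are all assumptions made for illustration.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request

LOOKUP_URL = "https://itunes.apple.com/lookup"


def build_lookup_url(app_id, country="us"):
    """Build a lookup URL with the storefront pinned explicitly."""
    query = urllib.parse.urlencode({"id": app_id, "country": country})
    return f"{LOOKUP_URL}?{query}"


def fetch_json(url, timeout=10.0):
    """Plain HTTP GET returning parsed JSON; the endpoint needs no key."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def lookup_app(app_id, country="us", fetch=fetch_json,
               retries=4, base_delay=2.0, sleep=time.sleep):
    """Return the first result dict, or None if the storefront has no match.

    Retries with exponential backoff on HTTP 403/429/5xx (an assumed policy);
    any other HTTP error is raised immediately.
    """
    url = build_lookup_url(app_id, country)
    for attempt in range(retries):
        try:
            payload = fetch(url)
        except urllib.error.HTTPError as err:
            retryable = err.code in (403, 429) or err.code >= 500
            if not retryable or attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ...
            continue
        results = payload.get("results", [])
        return results[0] if results else None
    return None
```

The `fetch` and `sleep` parameters exist only so the retry logic can be exercised without touching the network; in a scheduled job you would call `lookup_app(284882215, country="us")` and store the returned dict.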

Do not confuse this with the App Store Connect API, which authenticates with signed JWTs (not OAuth) and is scoped to apps you own. It's for managing your own metadata, TestFlight, and sales reports, not for researching competitors.

Google Play: no official public metadata API

This is the part that trips people up. Google does not publish an official API for reading arbitrary apps' Play Store listings. The Google Play Developer API exists, but like App Store Connect it's scoped to apps you control (publishing, releases, reviews of your own app).

So for competitor or market data on Play, your realistic options are:

  1. Community scraper libraries. The widely used google-play-scraper (Node) and its Python ports parse the public Play web listing into structured objects — title, description, installs band, score, ratings count, developer, last-updated, and the data-safety section. They work because the data is public HTML/JSON-in-page, but they're inherently fragile: when Google changes the page structure, the library breaks until it's patched. Pin a version and watch the repo.
  2. Direct HTML parsing. You can fetch the listing yourself and parse it, but you're then re-implementing what those libraries already maintain, including Google's habit of embedding data in obfuscated inline JSON blobs.
  3. A data provider. Commercial app-intelligence APIs (and tools like appluck's market intelligence) do the scraping, normalization, and historical tracking for you, which is the difference between a one-off snapshot and a time series you can trust.

Reality check: anything that reads the public Play listing is technically scraping. Keep request volume modest, identify your client honestly where you can, respect robots.txt, and cache. Be aware that storefront terms restrict bulk extraction — see the FAQ.
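A rough sketch of the "respect robots.txt and cache" advice, using only the standard library. The helper names, the TTL value, and the example rules are hypothetical; the rules in the usage note are made up and say nothing about what any real store's robots.txt contains.

```python
import time
import urllib.robotparser


def is_allowed(robots_lines, user_agent, url):
    """Check a URL against robots.txt lines you have already downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


class TTLCache:
    """Tiny in-memory cache so repeat requests inside the TTL skip the network."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[key]  # expired entry
            return None
        return value

    def put(self, key, value):
        self._store[key] = (self.clock(), value)


def cached_fetch(url, fetch, cache, robots_lines, user_agent):
    """Serve from cache when fresh; otherwise fetch only if robots rules allow."""
    hit = cache.get(url)
    if hit is not None:
        return hit
    if not is_allowed(robots_lines, user_agent, url):
        raise PermissionError(f"robots.txt disallows {url} for {user_agent}")
    body = fetch(url)
    cache.put(url, body)
    return body
```

With made-up rules such as `["User-agent: *", "Disallow: /private/"]`, `is_allowed` rejects anything under `/private/` and permits other paths. A real job would persist the cache to disk and refresh the robots rules periodically; both are left out to keep the sketch short.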

Normalize before you store anything

The two stores disagree on almost every field name and unit, so define a canonical schema up front and map each source into it. A workable shape:

{
  "platform": "ios | android",
  "app_id": "284882215 | com.example.app",
  "title": "",
  "developer": "",
  "version": "",
  "rating": 4.6,
  "rating_count": 120345,
  "installs": "10,000,000+ (android only)",
  "category": "",
  "updated_at": "2026-06-01",
  "country": "us",
  "fetched_at": "2026-06-16T00:00:00Z"
}

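Watch the mismatches: Apple gives you a precise rating count but no install figure; Google gives you an install band (1,000,000+) and never an exact number. Ratings are per-storefront on iOS, and on Android the displayed rating can also vary by country and device type (Google moved away from a single global rating in 2021), so don't assume one worldwide figure on either side. Always store country and fetched_at so you can reason about drift later.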
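One way to map both sources into that shape is sketched below. The iOS keys are fields of the iTunes lookup response. The Android keys (appId, score, ratings, genre, updated as a Unix timestamp) are an assumption modeled on what the google-play-scraper ports commonly return, so check them against the version you pin. The function names are illustrative.

```python
from datetime import datetime, timezone


def _now_iso():
    """Current UTC time in the same format as the schema's fetched_at."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def normalize_ios(item, country, fetched_at=None):
    """Map one iTunes lookup result into the canonical record."""
    released = item.get("currentVersionReleaseDate") or ""
    return {
        "platform": "ios",
        "app_id": str(item.get("trackId", "")),
        "title": item.get("trackName", ""),
        "developer": item.get("sellerName", ""),
        "version": item.get("version", ""),
        "rating": item.get("averageUserRating"),
        "rating_count": item.get("userRatingCount"),
        "installs": None,  # Apple exposes no install figure
        "category": item.get("primaryGenreName", ""),
        "updated_at": released[:10],  # keep only the YYYY-MM-DD part
        "country": country,
        "fetched_at": fetched_at or _now_iso(),
    }


def normalize_android(item, country, fetched_at=None):
    """Map one scraped Play record (assumed key names) into the canonical record."""
    updated = item.get("updated")
    if isinstance(updated, (int, float)):
        updated_at = datetime.fromtimestamp(updated, tz=timezone.utc).strftime("%Y-%m-%d")
    else:
        updated_at = ""
    return {
        "platform": "android",
        "app_id": item.get("appId", ""),
        "title": item.get("title", ""),
        "developer": item.get("developer", ""),
        "version": item.get("version", ""),
        "rating": item.get("score"),
        "rating_count": item.get("ratings"),
        "installs": item.get("installs"),  # a band such as "1,000,000+", kept as text
        "category": item.get("genre", ""),
        "updated_at": updated_at,
        "country": country,
        "fetched_at": fetched_at or _now_iso(),
    }
```

Keeping installs as the raw band string and leaving it as None for iOS avoids implying a precision neither store provides.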

Build for change, because the page will change

Treat metadata ingestion like any flaky upstream:

  • Cache and snapshot. Store the raw response alongside the parsed record. When a parser breaks, you can re-process history instead of re-fetching.
  • Schedule, don't poll in a loop. A daily or weekly pull captures version bumps and rating drift without burning rate limits. Track deltas (new version, description rewrite, rating swing) — those deltas are often more interesting than the snapshot.
  • Expect breakage on the Play side. Wrap scrapers in monitoring that alerts when fields go null en masse; that's your signal the page layout moved.
  • Separate "mine" from "theirs." Use App Store Connect / Play Developer APIs for your own apps (richer, authenticated, reliable) and lookup/scrape for everyone else.
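Two of those habits, tracking deltas and alerting when fields go null en masse, reduce to small pure functions over the canonical records. The sketch below assumes records are plain dicts; the function names and the 50% threshold are arbitrary illustrative choices.

```python
def diff_snapshots(old, new, fields):
    """Return {field: (old_value, new_value)} for every watched field that changed."""
    return {
        f: (old.get(f), new.get(f))
        for f in fields
        if old.get(f) != new.get(f)
    }


def null_rates(records, fields):
    """Fraction of records in which each field is missing, None, or empty."""
    total = len(records)
    if total == 0:
        return {f: 0.0 for f in fields}
    return {
        f: sum(1 for r in records if r.get(f) in (None, "")) / total
        for f in fields
    }


def broken_fields(records, fields, threshold=0.5):
    """Fields whose null rate meets the threshold: a hint the page layout moved."""
    rates = null_rates(records, fields)
    return sorted(f for f, rate in rates.items() if rate >= threshold)
```

Run `broken_fields` over each batch before writing it to storage. A non-empty result on the Play side usually means the parser needs attention, not that hundreds of apps lost their ratings overnight.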

If maintaining scrapers isn't a project you want to take on, lean on a tool that already tracks listings and history across both stores: start from the appluck journal for the practical playbooks, or the platform itself for the data.

FAQ

Is there really no key needed for the iTunes lookup endpoint? Correct — itunes.apple.com/lookup and /search are public and unauthenticated. But they're rate-limited and meant for catalog/affiliate use, so cache results and back off on 403s rather than treating it as an unlimited firehose.

Why doesn't Google offer the same thing? Google's official Play APIs are scoped to apps you publish. There's no sanctioned endpoint for arbitrary listings, which is why the ecosystem relies on scraper libraries or commercial data providers for competitor data.

Is scraping the Play Store legal? Reading public pages is a gray area: the data is public, but store terms of service restrict bulk extraction and automated access. Keep volume low, respect robots.txt, don't redistribute raw scraped data, and for anything commercial consider a licensed data provider instead.