Why it exists

Comics can appear on more than one registry. Because registries are independent and don't share a namespace, the same comic may have different id values on different registries (e.g. mycomic on one, my-comic on another). Without a shared identity, client applications cannot tell whether two comic objects from different registries represent the same work.

canonical_id solves this by deriving a stable fingerprint from the URL of the first page's image file. Every registry that indexes the same comic scrapes the same first page and therefore produces the same hash, regardless of which registry it is or what id it assigned to the comic.

Algorithm

The canonical_id is a SHA-256 hash of the normalized first-page image URL.

Step 1 — Obtain the first page's image URL

This is the direct URL of the image file served on page 1 of the comic (i.e. the image_url field from the first ComicPage object), not the URL of the webpage containing it.

Example: https://www.mycomic.com/images/chapter-1-page-001.jpg?cache=1

Step 2 — Normalize the URL

Apply the following transformations in order:

  1. Lowercase the entire URL.
  2. Remove query string and fragment — discard everything from the first ? or # character onwards. CDN signing tokens and cache-busters change over time; the path alone identifies the image.
  3. Remove the protocol prefix — strip https:// or http://.
  4. Remove the www. prefix if present.
  5. Remove trailing slashes.

Continuing the example:

  Input              https://www.mycomic.com/images/chapter-1-page-001.jpg?cache=1
  Lowercase          https://www.mycomic.com/images/chapter-1-page-001.jpg?cache=1 (unchanged)
  Strip query        https://www.mycomic.com/images/chapter-1-page-001.jpg
  Strip protocol     www.mycomic.com/images/chapter-1-page-001.jpg
  Strip www.         mycomic.com/images/chapter-1-page-001.jpg
  Strip trailing /   mycomic.com/images/chapter-1-page-001.jpg (unchanged)
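
The five normalization rules can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name normalize_image_url is hypothetical, and the code applies the rules literally as written above, with no extra handling for whitespace, ports, or percent-encoding.

```python
def normalize_image_url(url: str) -> str:
    """Apply the five normalization rules, in order (illustrative sketch)."""
    # 1. Lowercase the entire URL.
    s = url.lower()

    # 2. Discard everything from the first '?' or '#' onwards.
    #    Cutting at each separator in turn leaves only the text
    #    before whichever of the two appears first.
    for sep in ("?", "#"):
        idx = s.find(sep)
        if idx != -1:
            s = s[:idx]

    # 3. Strip the protocol prefix.
    for prefix in ("https://", "http://"):
        if s.startswith(prefix):
            s = s[len(prefix):]
            break

    # 4. Strip a leading "www." if present.
    if s.startswith("www."):
        s = s[len("www."):]

    # 5. Remove trailing slashes.
    return s.rstrip("/")


example = "https://www.mycomic.com/images/chapter-1-page-001.jpg?cache=1"
print(normalize_image_url(example))
# mycomic.com/images/chapter-1-page-001.jpg
```

The order matters: the protocol must be stripped before the www. check, because www. is only removed when it sits at the very start of the remaining string.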

Step 3 — Hash

Compute the SHA-256 hash of the normalized string (UTF-8 encoded). Represent the result as a 64-character lowercase hexadecimal string.

canonical_id = SHA256("mycomic.com/images/chapter-1-page-001.jpg")
             → "e7c3f1a2…" (64 hex chars)
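
As a sketch of the hashing step, Python's standard hashlib module can produce the digest directly. The helper name hash_normalized_url is an assumption made for illustration; it expects a string that has already been normalized as described in Step 2.

```python
import hashlib


def hash_normalized_url(normalized: str) -> str:
    """Return the SHA-256 of an already-normalized URL string.

    The string is encoded as UTF-8 before hashing. hexdigest() yields
    64 lowercase hexadecimal characters, the representation the
    specification asks for.
    """
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


canonical_id = hash_normalized_url("mycomic.com/images/chapter-1-page-001.jpg")
print(len(canonical_id))  # 64
```

Because the input is hashed byte for byte, any deviation in normalization (a leftover slash, an uppercase letter) produces a completely different canonical_id.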

Setting canonical_id in your registry

If you are building a custom registry implementation, you must compute and store canonical_id yourself. Use the canonical_id generator to verify your implementation produces the correct hash for a given image URL.

Generator tool

Paste any first-page image URL into the canonical_id generator to compute the hash instantly in your browser. No data is sent to a server.