canonical_id — Cross-Registry Identity
A stable, registry-agnostic identifier derived from the first page of a comic.
Why it exists
Comics can appear on more than one registry. Because registries are independent and don't share a
namespace, the same comic may have different id values on different registries
(e.g. mycomic on one, my-comic on another). Without a shared identity,
client applications cannot tell whether two comic objects from different registries represent the same
work.
canonical_id solves this by deriving a stable fingerprint from the URL of the first page's
image file. Every registry that indexes the same comic scrapes the same first page and therefore
normalizes the same image URL, producing the same hash regardless of what id it assigned or which registry it is.
Algorithm
The canonical_id is a SHA-256 hash of the normalized first-page image URL.
Step 1 — Obtain the first page's image URL
This is the direct URL of the image file served on page 1 of the comic
(i.e. the image_url field from the first ComicPage object), not
the URL of the webpage containing it.
Example: https://www.mycomic.com/images/chapter-1-page-001.jpg?cache=1
Step 2 — Normalize the URL
Apply the following transformations in order:
- Lowercase the entire URL.
- Remove query string and fragment: discard everything from the first ? or # character onwards. CDN signing tokens and cache-busters change over time; the path alone identifies the image.
- Remove the protocol prefix: strip https:// or http://.
- Remove the www. prefix if present.
- Remove trailing slashes.
Continuing the example:
| Input | https://www.mycomic.com/images/chapter-1-page-001.jpg?cache=1 |
| Lowercase | https://www.mycomic.com/images/chapter-1-page-001.jpg?cache=1 |
| Strip query | https://www.mycomic.com/images/chapter-1-page-001.jpg |
| Strip protocol | www.mycomic.com/images/chapter-1-page-001.jpg |
| Strip www. | mycomic.com/images/chapter-1-page-001.jpg |
| Strip trailing / | mycomic.com/images/chapter-1-page-001.jpg (unchanged) |
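The normalization steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function name normalize_image_url is hypothetical, and the sketch assumes the input is already the direct image URL from Step 1.

```python
import re


def normalize_image_url(url: str) -> str:
    """Apply the five normalization steps in the documented order.

    Hypothetical helper for illustration; the name is not part of the spec.
    """
    s = url.lower()                            # 1. lowercase the entire URL
    s = re.split(r"[?#]", s, maxsplit=1)[0]    # 2. drop everything from the first ? or #
    for prefix in ("https://", "http://"):     # 3. strip the protocol prefix
        if s.startswith(prefix):
            s = s[len(prefix):]
            break
    if s.startswith("www."):                   # 4. strip a leading www.
        s = s[len("www."):]
    return s.rstrip("/")                       # 5. remove trailing slashes


print(normalize_image_url(
    "https://www.mycomic.com/images/chapter-1-page-001.jpg?cache=1"
))
# prints: mycomic.com/images/chapter-1-page-001.jpg
```

The order matters: lowercasing first means the protocol and www. checks only need to match lowercase prefixes.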
Step 3 — Hash
Compute the SHA-256 hash of the normalized string (UTF-8 encoded). Represent the result as a 64-character lowercase hexadecimal string.
canonical_id = SHA256("mycomic.com/images/chapter-1-page-001.jpg")
→ "e7c3f1a2…" (64 hex chars)
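As a rough sketch of the hashing step, Python's standard hashlib module can produce the 64-character lowercase hex digest. The function name hash_normalized_url is hypothetical, and the input is assumed to be a string that has already been normalized as described in Step 2.

```python
import hashlib


def hash_normalized_url(normalized: str) -> str:
    """Return the SHA-256 digest of the UTF-8 encoded string as lowercase hex.

    Hypothetical helper for illustration; expects an already-normalized URL.
    """
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


canonical_id = hash_normalized_url("mycomic.com/images/chapter-1-page-001.jpg")
print(len(canonical_id))  # 64
```

hexdigest() already returns lowercase hexadecimal, so no further case conversion is needed.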
Setting canonical_id in your registry
If you are building a custom registry implementation, you must compute and store
canonical_id yourself. Use the
canonical_id generator to verify your implementation
produces the correct hash for a given image URL.
Generator tool
Paste any first-page image URL into the canonical_id generator to compute the hash instantly in your browser. No data is sent to a server.