Fundamentals
How bookmaker odds data is structured
Odds data is a four-level hierarchy: event to market to selection to price, plus bookmaker, region and freshness metadata. Once the shape clicks, integrating a feed is straightforward.
· 5 min read
Different books name the same event and selection differently, so their prices don't line up on their own. This is how you normalise odds across bookmakers, and where it goes wrong.
Normalising odds means lining up the same event, market and selection across books that name them differently, so their prices are genuinely comparable. It is the hard part of any comparison or arbitrage product. Get it right and two prices for the same outcome sit side by side. Get it wrong and you produce false matches that look like opportunities and aren't. This guide covers the problem, the approach, and the pitfalls that cost teams the most time.
Normalisation is hard because every book models the same reality with its own names, IDs and market structures. The prices are decimal numbers that compare cleanly. Everything around them does not. Before you can say "these two prices are for the same thing," you have to reconcile three separate layers, and each one drifts independently between books.
Events: one book lists Man Utd v Man City, another Manchester United vs Manchester City, a third Man. United - Man. City. Kick-off times can be a minute apart, and abbreviations, punctuation and word order all vary.
Selections: the same outcome, such as Over 2.5 Goals, appears under a slightly different label at each book. Draw versus Tie, or a player's name spelled two ways, is enough to break a naive join.
Markets: one book's match_odds is another's three-way 1X2; totals lines, handicap notations and each-way terms are all modelled differently. Two books can even split the same market at different line values.
None of this is exotic. It is the ordinary state of raw odds data from more than one source. For how a single book's data is laid out before any of this begins, see how odds data is structured.
Normalisation means assigning a canonical identifier to each event, market and selection, then mapping every book's own version onto that canonical form. Once two books' prices both point at the same canonical event, market and selection, they are comparable. The canonical layer is the join key you don't get for free.
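One way to picture the canonical layer is as a set of alias tables keyed by bookmaker and raw label. The Python sketch below is a minimal illustration under stated assumptions: the book names (book_a, book_b), the table contents and the canonical_key helper are hypothetical, and the event ID is assumed to have been resolved already.

```python
from typing import Optional, Tuple

# Hypothetical alias tables: (bookmaker, raw label) -> canonical ID.
# The entries are illustrative, not a real feed's vocabulary.
MARKET_ALIASES = {
    ("book_a", "match_odds"): "match_odds",
    ("book_b", "1X2"): "match_odds",
}
SELECTION_ALIASES = {
    ("book_a", "Draw"): "draw",
    ("book_b", "Tie"): "draw",
}


def canonical_key(
    book: str, event_id: str, market: str, selection: str
) -> Optional[Tuple[str, str, str]]:
    """Map one book's labels onto (event, market, selection) canonical IDs.

    Returns None when a label has no known mapping, so unmapped data is
    surfaced for review instead of being guessed at.
    """
    market_id = MARKET_ALIASES.get((book, market))
    selection_id = SELECTION_ALIASES.get((book, selection))
    if market_id is None or selection_id is None:
        return None
    return (event_id, market_id, selection_id)


# Two books, two vocabularies, one join key.
key_a = canonical_key("book_a", "evt_1", "match_odds", "Draw")
key_b = canonical_key("book_b", "evt_1", "1X2", "Tie")
unmapped = canonical_key("book_b", "evt_1", "1X2", "X")
```

Returning None for an unknown label is a deliberate choice in this sketch: a dropped row is cheaper than a wrong join.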
In practice that is three matching problems stacked on top of each other:
Event matching: decide that book A's Man Utd v Man City and book B's Manchester United vs Manchester City are the same fixture, using team identity, competition and kick-off time together rather than any one alone.
Market matching: decide that book A's match_odds and book B's 1X2 are the same market, and align totals and handicap lines so the same line compares to the same line.
Selection matching: confirm that Over 2.5 from one book pairs with Over 2.5 from the other, not with Under.
The output is deceptively simple: every price now carries a stable event, market and selection you can group and compare on. That grouping is what a comparison table, an oddsmatcher or an arbitrage scanner is built on. It is also what comparison sites use data for at their core.
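The event-matching step can be sketched in Python as a check that requires team identity, competition and kick-off time to agree together. Everything named here is an assumption for illustration: the TEAM_ALIASES table, the RawEvent type, the same_fixture helper, the competition codes, the dates and the five-minute tolerance. Competition labels are assumed to be canonical already.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical alias table: a book's team spelling -> canonical team ID.
TEAM_ALIASES = {
    "man utd": "man_utd",
    "manchester united": "man_utd",
    "man. united": "man_utd",
    "man city": "man_city",
    "manchester city": "man_city",
    "man. city": "man_city",
}


@dataclass(frozen=True)
class RawEvent:
    home: str
    away: str
    competition: str  # assumed already canonical in this sketch
    kickoff: datetime


def team_id(name: str) -> Optional[str]:
    """Resolve a raw team name to its canonical ID, or None if unknown."""
    return TEAM_ALIASES.get(name.strip().lower())


def same_fixture(
    a: RawEvent, b: RawEvent, tolerance: timedelta = timedelta(minutes=5)
) -> bool:
    """True only when teams, competition and kick-off all agree."""
    ids = (team_id(a.home), team_id(a.away), team_id(b.home), team_id(b.away))
    if None in ids:
        return False  # an unknown team is never matched by guesswork
    teams_match = ids[0] == ids[2] and ids[1] == ids[3]
    competition_match = a.competition == b.competition
    kickoff_match = abs(a.kickoff - b.kickoff) <= tolerance
    return teams_match and competition_match and kickoff_match


book_a = RawEvent("Man Utd", "Man City", "league_x", datetime(2026, 1, 17, 15, 0))
book_b = RawEvent("Manchester United", "Manchester City", "league_x",
                  datetime(2026, 1, 17, 15, 1))
other_cup = RawEvent("Man. United", "Man. City", "cup_y",
                     datetime(2026, 1, 17, 15, 0))

matched = same_fixture(book_a, book_b)         # names differ, fixture agrees
not_matched = same_fixture(book_a, other_cup)  # same teams, other competition
```

Requiring all three signals is what stops a name match alone from merging two different fixtures between the same teams.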
The dangerous failures are the ones that look like success. A missed match costs you a row you never see. A false match shows you an opportunity that does not exist, and that is far worse, because your product acts on it. Two pitfalls cause most of the damage.
The first pitfall is false positives: aligning two selections that aren't the same. A same-name player in two different fixtures, a totals line read at the wrong value, or a market matched three-way when it should be two-way, all yield a comparison that is arithmetically valid and factually wrong.
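A guard against these false positives can be sketched as a strict equality check on everything that defines a selection, not just its name. The Quote type, its field names and the comparable helper below are hypothetical, and canonical IDs are assumed to have been resolved upstream.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quote:
    event_id: str          # canonical event ID, assumed resolved upstream
    market: str            # canonical market, e.g. "totals"
    outcomes: int          # number of outcomes in the market (2 or 3)
    selection: str         # canonical selection, e.g. "over"
    line: Optional[float]  # totals or handicap line, None if the market has none


def comparable(a: Quote, b: Quote) -> bool:
    """True only if both quotes describe exactly the same outcome."""
    return (
        a.event_id == b.event_id      # same-name player, different fixture
        and a.market == b.market
        and a.outcomes == b.outcomes  # two-way versus three-way market
        and a.selection == b.selection
        and a.line == b.line          # Over 2.5 is not Over 3.5
    )


over_25 = Quote("evt_1", "totals", 2, "over", 2.5)
over_25_other_book = Quote("evt_1", "totals", 2, "over", 2.5)
over_35 = Quote("evt_1", "totals", 2, "over", 3.5)
home_two_way = Quote("evt_1", "match_result", 2, "home", None)
home_three_way = Quote("evt_1", "match_result", 3, "home", None)

ok = comparable(over_25, over_25_other_book)
wrong_line = comparable(over_25, over_35)
wrong_shape = comparable(home_two_way, home_three_way)
```

Each rejected pair here would have produced a comparison that is arithmetically valid and factually wrong, which is the failure the check exists to prevent.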
The second is near-duplicate events: the same fixture appearing twice with subtly different metadata, or two genuinely different fixtures that look almost identical (a first-team match and a reserve match, or two legs of a double-header). Collapse the wrong pair and you merge unrelated prices; split the right pair and you scatter one event across two canonical IDs. Both corrupt the comparison quietly.
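One hedged way to handle near-duplicates is to stop treating matching as a yes-or-no decision and add a third outcome for ambiguous pairs. In the Python sketch below, the Event type, the classify_pair helper, the competition codes, the dates and the five-minute tolerance are all assumptions for illustration, and team IDs are assumed to be canonical already.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    home: str          # canonical team ID, assumed resolved upstream
    away: str          # canonical team ID, assumed resolved upstream
    competition: str
    kickoff: datetime


def classify_pair(
    a: Event, b: Event, tolerance: timedelta = timedelta(minutes=5)
) -> str:
    """Return "merge", "review" or "distinct" for a pair of events."""
    if (a.home, a.away) != (b.home, b.away):
        return "distinct"
    same_competition = a.competition == b.competition
    close_kickoff = abs(a.kickoff - b.kickoff) <= tolerance
    if same_competition and close_kickoff:
        return "merge"    # same fixture listed with slightly different metadata
    if same_competition or close_kickoff:
        return "review"   # could be a reserve match or a second leg
    return "distinct"


first = Event("team_a", "team_b", "league_x", datetime(2026, 1, 17, 15, 0))
drifted = Event("team_a", "team_b", "league_x", datetime(2026, 1, 17, 15, 1))
second_game = Event("team_a", "team_b", "league_x", datetime(2026, 1, 17, 19, 0))
reserves = Event("team_a", "team_b", "reserve_league", datetime(2026, 1, 17, 15, 0))

results = [
    classify_pair(first, drifted),
    classify_pair(first, second_game),
    classify_pair(first, reserves),
]
```

Sending ambiguous pairs to review rather than merging or splitting them automatically trades some coverage for fewer quiet corruptions.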
These are not one-off bugs. Fixtures, team names and market layouts change continuously, so the matching logic needs constant maintenance to stay correct. That maintenance is the real, recurring cost of doing normalisation yourself.
A normalised feed does this work for you, and that is the point of buying one rather than building it. A normalised feed returns consistent identifiers across every book, so you receive prices that are already lined up on a shared event, market and selection. The reconciliation work described above happens before the data reaches you. You do not resolve names, you do not maintain a matcher, and you do not chase false positives.
Concretely, the same selection from two different books arrives sharing one canonical selection identifier, so grouping them is a plain key match rather than a fuzzy guess:
[
  {
    "event": "Man Utd vs Man City",
    "market": "match_odds",
    "selection": "man_utd",
    "back": { "bookmaker": "bet365", "odds": 2.50 }
  },
  {
    "event": "Man Utd vs Man City",
    "market": "match_odds",
    "selection": "man_utd",
    "back": { "bookmaker": "william_hill", "odds": 2.45 }
  }
  // both rows carry the same canonical event/market/selection,
  // so comparing bet365 to william_hill is a key match, not a guess
]
OddsRelay goes one step further and pairs each back price with an exchange lay price for the same canonical selection, so the row is ready for matched betting or arbitrage on arrival:
{
  "event": "Man Utd vs Man City",
  "market": "match_odds",
  "selection": "man_utd",
  "back": { "bookmaker": "bet365", "odds": 2.50 },
  "lay": { "exchange": "betfair", "odds": 2.54, "liquidity": 1620 },
  "rating": 97.8,
  "qualifying_loss": -0.14
  // ... region, feed_type and freshness fields elided
}
The back and lay blocks only pair correctly because both sides resolve to the same canonical selection first. The rating and qualifying_loss fields exist because that pairing is already done. Coverage is 60+ UK books with bet365 included, matched against three exchanges (Betfair, Smarkets and Matchbook), with coverage built to extend into the domestic South African and Nigerian books that the large aggregators skip. The full field list is in the response envelope and the API docs.
One thing worth stating plainly: this describes what the feed returns, not how the matching is performed. The matcher is the part that is genuinely hard to build and keep correct, and it is what you are licensing when you take a normalised feed instead of assembling one.
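On the consuming side, shared identifiers reduce a best-price comparison to a dictionary lookup. The Python sketch below reuses the two back-price rows shown earlier; the best_back_prices helper is hypothetical, and the sketch assumes the event, market and selection fields together form the canonical key.

```python
from collections import defaultdict

# The two normalised rows from the example above, one per bookmaker.
rows = [
    {
        "event": "Man Utd vs Man City",
        "market": "match_odds",
        "selection": "man_utd",
        "back": {"bookmaker": "bet365", "odds": 2.50},
    },
    {
        "event": "Man Utd vs Man City",
        "market": "match_odds",
        "selection": "man_utd",
        "back": {"bookmaker": "william_hill", "odds": 2.45},
    },
]


def best_back_prices(rows):
    """Group rows on the canonical key and keep the highest back price."""
    grouped = defaultdict(list)
    for row in rows:
        key = (row["event"], row["market"], row["selection"])
        grouped[key].append(row["back"])
    return {
        key: max(backs, key=lambda back: back["odds"])
        for key, backs in grouped.items()
    }


best = best_back_prices(rows)
top = best[("Man Utd vs Man City", "match_odds", "man_utd")]
```

No string similarity or time-window logic appears anywhere in this code, which is the practical difference between consuming a normalised feed and building the matcher.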
Normalising odds is the work of making prices from different books genuinely comparable: one canonical identity for each event, market and selection, with false positives and near-duplicate events treated as the primary risk. You can build and maintain that yourself, or take a feed that returns consistent identifiers so the prices arrive already lined up. OddsRelay does the second, and it powers a leading UK matched-betting platform today. A free trial gives you the full UK feed, matched against exchange lay prices, so you can see normalised rows before you write a line of matching code.
Written by
Founder, OddsRelay
James is the founder of OddsRelay — the odds-data feed behind matched betting, arbitrage and odds-comparison products: 60+ UK bookmakers with bet365 included, matched against exchange lay prices and delivered as one clean, documented API. He writes here about how that data layer actually behaves — coverage, matching, freshness and the trade-offs — from the side that builds and runs it. The same feed powers a leading UK matched-betting platform today.
Part of the Fundamentals cluster
What is an odds API? A 2026 guide for builders
18+ · Data product for licensed operators. Please gamble responsibly.