Arbitrage betting data, explained: how live arb works
Arbitrage is a relationship between prices across books and the exchange, not a single number. Here is the data behind it, how a real arb is detected, and why most of the signal is noise.
James · 7 min read
Arbitrage means backing every outcome of an event at prices that lock in a margin whatever the result. You do that either across different bookmakers, or by backing at a book and laying the same selection on an exchange. The data behind it is many books' prices plus exchange lay prices, fresh and complete enough to act on. Miss any part of that and the opportunity you think you see is not really there. This is the overview of what that data is, how a real arb is detected, and why the signal is far rarer than the search results suggest.
What is arbitrage betting, in data terms?
An arb is a set of prices, not a single number. To cover a football match you need a back price for every outcome: home, draw and away. If those prices come from three different books and they combine favourably, you can stake each outcome so that one of them always returns more than your total outlay. The result of the match no longer matters to your position.
The two-sided version is cleaner and more common in practice. You back a selection at one book and lay the same selection on an exchange. If the back price is higher than the lay price, and the lay has enough liquidity to fill, the pair locks a margin. That is why the exchange lay side is not an optional extra: for most arbs it is one half of the trade. The distinction between this and betting on prices you merely think are wrong is covered in value betting vs arbitrage, with the precise terms in the value betting vs arbitrage definition.
How is an arb detected?
An arb exists when the implied probabilities of a complete set of outcomes sum to less than 100%. Every decimal price implies a probability: divide 1 by the price. A price of 2.00 implies 50%, a price of 4.00 implies 25%. Add the implied probabilities for every outcome of the event. If the total is below 100%, the market as a whole is priced generously enough that backing all outcomes returns a profit.
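The check described above can be sketched in a few lines. This is a minimal illustration, not a production detector; the function names and the two-outcome example prices are assumptions made for the sketch.

```python
from typing import Mapping


def implied_probability(decimal_price: float) -> float:
    """Implied probability of a decimal price: 1 / price."""
    if decimal_price <= 1.0:
        raise ValueError("decimal prices must be greater than 1.0")
    return 1.0 / decimal_price


def book_sum(best_prices: Mapping[str, float]) -> float:
    """Sum of implied probabilities over a complete set of outcomes."""
    return sum(implied_probability(p) for p in best_prices.values())


def is_arb(best_prices: Mapping[str, float]) -> bool:
    """True when the implied probabilities sum to less than 100%."""
    return book_sum(best_prices) < 1.0


# Hypothetical two-outcome market with both sides priced above evens.
print(is_arb({"over": 2.10, "under": 2.05}))
```

The sum is only meaningful when `best_prices` holds every outcome of the event; a missing outcome lowers the sum and fakes an arb.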
A worked example makes it concrete. Take the best back price available for each outcome of a three-way match:
| Outcome | Best back price | Implied probability (1 / price) |
|---|---|---|
| Home | 2.60 | 38.5% |
| Draw | 3.70 | 27.0% |
| Away | 3.20 | 31.3% |
| Total | | 96.8% |
Illustrative only. A sum below 100% is the arithmetic signature of an arb; 96.8% leaves 3.2 points of headroom, which works out to a return of a little over 3% on total outlay (1 / 0.968 ≈ 1.033), before stake rounding.
The total is 96.8%, below 100%, so the three prices together imply an arb. In the back-and-lay version the same idea applies to two prices: the arb is real when the back price exceeds the exchange lay price by enough to cover the exchange commission. Either way, detection is arithmetic. The hard part is not the sum. It is being sure the prices you fed into the sum are true.
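To turn the worked example into stakes, one common approach is to split the outlay in proportion to each outcome's implied probability, so every outcome pays the same amount. The sketch below assumes that equal-return split and a total outlay of 100; the function name is illustrative. Computed from the unrounded prices, the sum comes to about 96.7% (the table's 96.8% adds the rounded rows), so the payout is a little over 103 per 100 staked.

```python
from typing import Dict, Mapping, Tuple


def equal_return_stakes(
    best_prices: Mapping[str, float], total_stake: float
) -> Tuple[Dict[str, float], float]:
    """Split total_stake so that every outcome returns the same payout.

    Each stake is proportional to the outcome's implied probability (1 / price).
    Returns the stakes per outcome and the payout received whatever the result.
    """
    implied = {outcome: 1.0 / price for outcome, price in best_prices.items()}
    total_implied = sum(implied.values())
    stakes = {
        outcome: total_stake * prob / total_implied
        for outcome, prob in implied.items()
    }
    payout = total_stake / total_implied
    return stakes, payout


prices = {"home": 2.60, "draw": 3.70, "away": 3.20}
stakes, payout = equal_return_stakes(prices, 100.0)
for outcome, stake in stakes.items():
    # stake * price is the same for every outcome, up to float rounding
    print(f"{outcome}: stake {stake:.2f}, returns {stake * prices[outcome]:.2f}")
print(f"locked profit on 100.00 staked: {payout - 100.0:.2f}")
```

In practice stakes are rounded to amounts a book will accept, which trims the margin slightly and makes the returns unequal by a few pence.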
Why does the data decide whether the signal is real?
The arithmetic is trivial; the data quality is everything. A sum below 100% built on a stale or missing price is a ghost: it looks like an arb and disappears the moment you try to place it. Three properties of the data separate a real signal from a ghost.
Coverage breadth: an arb only appears if you can see both sides of it. Miss the one book pricing an outcome generously, or miss the exchange, and the sum never crosses the line. This is why breadth matters: OddsRelay carries 60+ UK books with bet365 included, and lay prices from three exchanges (Betfair, Smarkets, Matchbook), with coverage expanding into the domestic South African and Nigerian books the large aggregators skip.
Freshness: prices move in seconds, and an arb is the first thing to vanish when they do. A price seen two minutes ago may already be gone. We poll pre-match prices on a cycle of roughly a few seconds, which suits pre-match arbitrage; we do not claim in-play streaming, because that is not what we ship.
Completeness: a market with a dropped selection cannot be summed correctly. If the draw price is missing from a three-way market, you cannot know whether the set arbs. A feed that occasionally loses selections produces both false positives and silent misses.
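Freshness and completeness are easy to enforce as a gate in front of the sum. The sketch below is one way to do it, under assumptions: the `Quote` shape, the required outcome set and the 10-second age limit are all hypothetical choices for illustration, not part of any particular feed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet, Iterable

# Assumed outcome set for a three-way football market.
REQUIRED_OUTCOMES: FrozenSet[str] = frozenset({"home", "draw", "away"})


@dataclass(frozen=True)
class Quote:
    outcome: str
    price: float
    seen_at: datetime  # when this price was last observed


def is_summable(
    quotes: Iterable[Quote],
    now: datetime,
    max_age: timedelta = timedelta(seconds=10),  # assumed freshness limit
    required: FrozenSet[str] = REQUIRED_OUTCOMES,
) -> bool:
    """True only if every required outcome has a quote no older than max_age."""
    fresh_outcomes = {q.outcome for q in quotes if now - q.seen_at <= max_age}
    return required <= fresh_outcomes


now = datetime.now(timezone.utc)
quotes = [
    Quote("home", 2.60, now - timedelta(seconds=3)),
    Quote("draw", 3.70, now - timedelta(minutes=2)),  # stale
    Quote("away", 3.20, now - timedelta(seconds=1)),
]
# The draw price is two minutes old, so this set is rejected rather than summed.
print(is_summable(quotes, now))
```

Rejecting the whole market when one selection is stale or missing trades a few missed arbs for far fewer ghosts.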
The exchange lay side deserves its own emphasis. A back price alone tells you nothing about whether you can close the position. The lay price, and crucially the liquidity behind it, decides whether the arb is fillable or merely theoretical. A generous lay price with no money behind it is not a trade. What a usable feed does with these three properties is the subject of what makes an arb feed usable.
What does a matched arb row look like?
A usable arb row carries both sides of the trade and the numbers that qualify it. Rather than a raw back price, each row pairs the back price at one book with the best opposing price, here an exchange lay, and attaches a rating and the qualifying loss. This is the shape of a single two-sided row (illustrative, not live data):
[Figure: one arb row, back at a book vs best opposing lay (illustrative)]
The back price exceeds the lay price, so rating sits above 100 and qualifying_loss turns positive: the row locks a small margin rather than a small cost. The liquidity figure is what tells you the lay is real and fillable, not a phantom price. A raw prices API gives you the back block and nothing else; the paired lay, the rating and the sign of qualifying_loss are the product. If you want to see how these rows feed a detector, how to build an arbitrage scanner walks the loop.
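For readers who want to reason about such a row in code, here is a hedged sketch. Only `rating`, `qualifying_loss` and the idea of a liquidity figure come from the text above; every other field name, the example values and the formulas are assumptions. The formulas are the standard back-and-lay arithmetic with commission charged on net exchange winnings, and the sign convention follows this article: a positive `qualifying_loss` is a locked margin, a negative one a cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArbRow:
    selection: str
    back_book: str
    back_price: float     # decimal back price at the book
    lay_exchange: str
    lay_price: float      # best decimal lay price on the exchange
    lay_liquidity: float  # money available to match at lay_price
    commission: float     # exchange commission on net winnings, e.g. 0.02


def lay_stake(row: ArbRow, back_stake: float) -> float:
    """Lay stake that equalises the result whether the selection wins or loses."""
    return back_stake * row.back_price / (row.lay_price - row.commission)


def rating(row: ArbRow) -> float:
    """Percentage of the back stake returned by the pair; above 100 locks a margin."""
    return 100.0 * row.back_price * (1.0 - row.commission) / (row.lay_price - row.commission)


def qualifying_loss(row: ArbRow, back_stake: float) -> float:
    """Locked result of the pair: positive is a margin, negative is a cost."""
    return lay_stake(row, back_stake) * (1.0 - row.commission) - back_stake


def is_fillable(row: ArbRow, back_stake: float) -> bool:
    """The lay side must have enough money behind it to take the full lay stake."""
    return lay_stake(row, back_stake) <= row.lay_liquidity


# Hypothetical row: book and exchange names are placeholders.
row = ArbRow("Home", "book_a", 3.10, "exchange_x", 3.00, 250.0, 0.02)
print(f"rating {rating(row):.2f}, locked on a 10.00 back stake: {qualifying_loss(row, 10.0):+.2f}")
print("fillable:", is_fillable(row, 10.0))
```

With commission in the formula, a back price only slightly above the lay price can still rate below 100, which is why the gap has to cover commission before the pair counts as an arb.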
How common are real arbs, honestly?
Genuine arbs are uncommon and short-lived, and no honest source will tell you otherwise. Bookmakers price to avoid them, so when one appears it is usually small, and it is gone in seconds as prices adjust. A feed that reports arbs constantly is far more likely to be showing you stale prices than a genuine edge. Rarity is the normal state, not a fault in the data.
Account limits are the constraint people underestimate. A book that spots consistent arbing will cut stakes or close the account, which shortens the useful life of any single account regardless of data quality. This is a structural reality of the activity, not something a feed removes. We describe what the data delivers; we never promise outcomes.
What should an arbitrage data feed give you?
An arbitrage feed's job is to let you detect real opportunities instead of ghosts. That means delivering both sides of every candidate trade, fresh and complete, with the qualifying numbers already computed, so your scanner spends its time on genuine signals rather than on cleaning data. In concrete terms:
Both sides paired: each row carries the back price and the best opposing price, whether that is an exchange lay or another book, not a raw price you must match yourself.
Exchange lay with liquidity: lay prices from Betfair, Smarkets and Matchbook, with a liquidity figure so you can tell a fillable arb from a phantom one.
Breadth that surfaces the arb: 60+ UK books with bet365 included, plus domestic books the big aggregators skip, because an arb you cannot see is an arb you cannot take.
Freshness you can check: a published cycle and freshness stamps, so a rating above 100 reflects prices that are still live rather than a minute stale.
Qualifying numbers precomputed: rating and qualifying_loss on every row, so detection is a filter, not a build project.
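When the qualifying numbers arrive precomputed, detection reduces to a filter like the one sketched here. The `FeedRow` shape, the `age_seconds` field and the default thresholds are assumptions for illustration; a real feed's field names and sensible limits will differ.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FeedRow:
    selection: str
    rating: float           # above 100 means the pair locks a margin
    qualifying_loss: float  # positive is a margin, negative is a cost
    lay_liquidity: float    # money available at the lay price
    age_seconds: float      # time since the prices were last confirmed


def candidate_arbs(
    rows: Iterable[FeedRow],
    min_liquidity: float = 50.0,     # assumed floor for a fillable lay
    max_age_seconds: float = 10.0,   # assumed freshness limit
) -> List[FeedRow]:
    """Keep rows that arb, are fresh and have money behind the lay; best rating first."""
    kept = [
        row
        for row in rows
        if row.rating > 100.0
        and row.lay_liquidity >= min_liquidity
        and row.age_seconds <= max_age_seconds
    ]
    return sorted(kept, key=lambda row: row.rating, reverse=True)


rows = [
    FeedRow("Home", 101.9, 0.19, 250.0, 2.0),   # arbs, fresh, liquid
    FeedRow("Draw", 103.5, 0.35, 4.0, 1.0),     # phantom: no money behind the lay
    FeedRow("Away", 102.2, 0.22, 300.0, 75.0),  # ghost: prices over a minute old
    FeedRow("Over", 98.4, -0.16, 500.0, 1.0),   # a cost, not an arb
]
print([row.selection for row in candidate_arbs(rows)])
```

Most rows fail one of the three tests, which is the expected result given how rare genuine arbs are.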
A feed that does these things turns arbitrage from a data-cleaning problem into a filtering problem. You still face rarity and account limits, which no supplier can wish away. What you no longer face is chasing prices that were never really there. OddsRelay delivers bet365 and 60+ UK books matched against three exchanges, and it powers a leading UK matched-betting platform today.
See it before you build on it
The honest way to judge an arbitrage feed is to look at the coverage and freshness for yourself rather than take a claim on trust. You can see what is live right now on the coverage dashboard, and put the real matched rows through your own scanner with a free trial key: the full UK feed, bet365 included, back prices paired with exchange lay. Rarity and account limits are yours to manage. What we supply is clean, complete data, so you act on real arbs instead of ghosts.
James is the founder of OddsRelay — the odds-data feed behind matched betting, arbitrage and odds-comparison products: 60+ UK bookmakers with bet365 included, matched against exchange lay prices and delivered as one clean, documented API. He writes here about how that data layer actually behaves — coverage, matching, freshness and the trade-offs — from the side that builds and runs it. The same feed powers a leading UK matched-betting platform today.
18+ · Data product for licensed operators. Please gamble responsibly.