*This is the fifth blog post in a series that examines key theories and concepts about Real-Time Bidding (RTB), and how Datacratic leverages these items as part of our RTB strategies. In this post, we take a look at a bidding strategy that gets the same results as the ROI-based strategy we examined in Part 4 of this series, without needing a clearing-price model.*

In Part 2 of this series, we said that in real-time bidding, we should “bid truthfully”. In other words, you should bid whatever it is worth to you to win. To compute this truthful value, given a target cost per action (CPA) for a campaign, we said you could just multiply that target by the computed probability of seeing an action after the impression, and that would give you your bid value.

We added that, by calculating the expected cost of winning an auction, you could compute the expected surplus for that auction, and that to pace your spending efficiently you would only bid truthfully when this expected surplus was above some threshold value. Otherwise, you would not bid at all. This threshold value would be the output of a closed-loop pace control system (described in Part 1), whose job it is to keep the spend rate close to some target.

In Part 4 of this series, we then showed that in fact, the second claim of Part 2 was not optimal, and that you should set an expected return-on-investment (ROI) threshold, instead of an expected surplus threshold.

In this post, Part 5 of the series, we show that the meaning of “bidding truthfully” can be more slippery than expected, and that you can get the same results as an ROI-based pacing strategy that has a perfect expected-cost model, without using an expected-cost model at all.

## Same but Different

Let’s say you’ve implemented the ROI-based strategy described in Part 4. You have an opportunity to bid on an impression that you compute has a 0.1% chance of resulting in an action, such as a click or conversion. The target CPA is $1, so you’re willing to bid 1000 microdollars. Your pacing system tells you that right now, the minimum expected ROI you should accept is 50%. Do you bid or not?

The answer depends on how much you think the impression will cost: if you think it will cost less than about 666 micros (1000/1.5), then yes. At any higher cost, the expected ROI falls below 50%.

Now say that you estimate that the cost of winning this auction will be 500 micros (pre-auction expected ROI of 100%) so you bid the 1000 micros and… too bad, your cost estimate was off by 50% and you paid 750 micros. Your post-auction expected ROI is now 33%, which is less than you were willing to accept before the auction. Bummer.
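The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch: the function and variable names are ours, introduced only for illustration, and all amounts are in microdollars.

```python
def expected_roi(value: float, cost: float) -> float:
    """Expected ROI of winning an impression: (V - C) / C."""
    return (value - cost) / cost


def max_acceptable_cost(value: float, min_roi: float) -> float:
    """Highest cost at which the expected ROI is still at least min_roi."""
    return value / (min_roi + 1.0)


# Numbers from the example above, in microdollars.
target_cpa_micros = 1_000_000            # $1 target CPA
p_action = 0.001                         # 0.1% chance of an action
value = target_cpa_micros * p_action     # what winning is worth: 1000 micros

cost_ceiling = max_acceptable_cost(value, min_roi=0.5)   # about 666.67 micros
roi_estimated = expected_roi(value, cost=500)            # estimate before the auction
roi_realized = expected_roi(value, cost=750)             # what actually happened

print(f"value={value:.0f} micros, cost ceiling={cost_ceiling:.2f} micros")
print(f"estimated ROI={roi_estimated:.0%}, realized ROI={roi_realized:.0%}")
```

The realized ROI lands below the 50% threshold even though the estimate cleared it comfortably, which is exactly the failure mode described above.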

But wait, given that you were willing to bid 1000 micros if the cost was less than 666 micros and nothing otherwise, then you don’t actually need to estimate the cost and run the risk of being wrong: you could just bid 666 micros. The cost is determined by the next-highest bid:

- If it’s lower than 666: You win, and you pay less than 666 for something which was worth 1000 to you (ROI is greater than 50%, as desired!)
- If it’s higher than 666: You lose, and you pay nothing (ROI not well defined).

Essentially, the auction mechanism is computing the cost for you and always coming up with the right answer.
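To make the two cases concrete, here is a small sketch of a single second-price auction. The helper names are hypothetical, and we assume that ties go to the other bidder and that there is no reserve price; real exchanges may differ on both points.

```python
from typing import Optional, Tuple


def second_price_outcome(bid: float, next_highest_bid: float) -> Tuple[bool, float]:
    """Return (won, price_paid) for one second-price auction.

    Assumption: ties go to the other bidder and there is no reserve price.
    """
    if bid > next_highest_bid:
        return True, next_highest_bid
    return False, 0.0


def realized_roi(value: float, price_paid: float) -> Optional[float]:
    """ROI of a won impression, or None when nothing was spent."""
    if price_paid <= 0:
        return None
    return (value - price_paid) / price_paid


value = 1000.0       # micros, from the example above
shaded_bid = 666.0   # micros

for next_highest in (400.0, 650.0, 700.0, 900.0):
    won, paid = second_price_outcome(shaded_bid, next_highest)
    print(next_highest, won, paid, realized_roi(value, paid))
```

Every auction this bid wins has an ROI above 50%, and every auction it loses costs nothing, without any cost estimate being involved.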

Let’s switch to symbols: under the scheme described in Part 4 (let’s call it the “truth-or-nothing” bidding policy), you’re willing to bid your expected value (V) when the expected ROI is X or higher. Otherwise, you would bid nothing. If C is the expected cost (and assuming that C < V otherwise we wouldn’t be bidding, so the P(win) is 1), then the expected ROI is (V-C)/C, and the ROI is X or higher when the cost is at, or lower than, V/(X+1).
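Spelled out, the step from the ROI condition to the cost ceiling is a short rearrangement, which is valid because C > 0 and X + 1 > 0:

$$
\frac{V - C}{C} \ge X
\;\Longleftrightarrow\; V - C \ge XC
\;\Longleftrightarrow\; V \ge (X + 1)\,C
\;\Longleftrightarrow\; C \le \frac{V}{X + 1}
$$

With V = 1000 micros and X = 0.5, this gives C ≤ 666.67 micros, the figure used in the example above.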

Given the mechanics of second-price auctions, if you happen to have a great cost estimator and are willing to bid some amount B when the cost C is less than or equal to D then, assuming B>D, you should always be willing to bid D: you will win and lose the exact same auctions at exactly the same cost. We can call the policy of bidding at D when you were willing to bid at B the *“shaded” bidding policy*, because bid-shading is the technical term for bidding lower than you’re willing to pay. So the following bidding policies are equivalent:

- Truth-or-nothing: bid B if C ≤ D, otherwise bid nothing
- Shaded: always bid D

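Here’s a rough proof for all costs (ignoring ties at C = D):

- If C ≤ D: truth-or-nothing bids B > D, wins and pays C; shaded bids D, wins and pays C
- If C > D: truth-or-nothing bids nothing and pays nothing; shaded bids D, loses and pays nothing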

Putting it all together, given that X is always greater than 0 (we don’t accept a negative ROI), the following bidding policies are equivalent:

- Truth-or-nothing: bid V if C ≤ V/(X+1), otherwise bid nothing
- Shaded: always bid V/(X+1)

The big surprise here is that the shaded bid policy gives you the same results as the truth-or-nothing policy, and *you don’t actually need to compute an expected cost at all*. In fact, you get the same outcomes bidding with the shaded policy as you do with the truth-or-nothing policy *if you have a perfect cost model*.

If you have a less-than-perfect cost model, the truth-or-nothing policy could perform much worse. Given that no one has a perfect cost model, and that building even a mediocre one is a lot of work, the shaded policy is clearly a big improvement!
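A toy simulation can illustrate the difference. This is only a sketch under assumptions of our own: clearing prices are drawn uniformly between 100 and 1500 micros, the hypothetical cost model is off by a uniformly distributed relative error, ties lose, and all function names are invented for illustration.

```python
import random


def second_price(bid, clearing_price):
    """Return the price paid, or None if the auction is lost (ties lose)."""
    return clearing_price if bid > clearing_price else None


def truth_or_nothing_bid(value, min_roi, estimated_cost):
    """Bid the full value only if the *estimated* ROI clears the threshold."""
    if estimated_cost <= value / (min_roi + 1):
        return value
    return 0.0


def shaded_bid(value, min_roi):
    """Always bid the cost ceiling V / (X + 1); no cost estimate needed."""
    return value / (min_roi + 1)


def simulate(noise, n_auctions=10_000, value=1000.0, min_roi=0.5, seed=0):
    """Return (wins, spend, wins_below_min_roi) for each policy.

    noise is the relative error of the hypothetical cost model: the estimate
    is the true clearing price times a factor drawn from [1 - noise, 1 + noise].
    """
    rng = random.Random(seed)
    stats = {"truth_or_nothing": [0, 0.0, 0], "shaded": [0, 0.0, 0]}
    for _ in range(n_auctions):
        clearing = rng.uniform(100.0, 1500.0)   # assumed price distribution
        estimate = clearing * rng.uniform(1 - noise, 1 + noise)
        bids = {
            "truth_or_nothing": truth_or_nothing_bid(value, min_roi, estimate),
            "shaded": shaded_bid(value, min_roi),
        }
        for name, bid in bids.items():
            paid = second_price(bid, clearing)
            if paid is not None:
                stats[name][0] += 1
                stats[name][1] += paid
                if (value - paid) / paid < min_roi:
                    stats[name][2] += 1
    return {name: tuple(s) for name, s in stats.items()}


for noise in (0.0, 0.5):
    print(noise, simulate(noise))
```

With `noise=0.0` the two policies win the same auctions at the same prices. With a noisy estimate, the truth-or-nothing policy both wins some impressions below the ROI threshold and skips some that would have cleared it, while the shaded policy is unaffected because it never consults the estimate.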

How should we reconcile this with the theory that bidding truthfully is the surplus-optimal thing to do in a single auction? The proof of this optimality boils down to the fact that for a given single auction:

- If you bid higher than your true value, you might overpay (i.e. get a negative surplus)
- If you bid lower, you might lose out on an opportunity to get what you want at a lower cost than its value (i.e. get a positive surplus).
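The two cases above can be checked by brute force. The sketch below uses a hypothetical `surplus` helper and assumes a second-price auction in which ties lose and there is no reserve price. It compares the truthful bid against a grid of alternative bids and competing bids.

```python
def surplus(bid: int, value: int, next_highest_bid: int) -> int:
    """Surplus from one second-price auction (ties lose, no reserve price)."""
    if bid > next_highest_bid:
        return value - next_highest_bid
    return 0


value = 1000  # micros

# Overbidding can overpay; underbidding can miss a profitable win.
print(surplus(1200, value, 1100), surplus(value, value, 1100))
print(surplus(800, value, 900), surplus(value, value, 900))

# Compare every alternative bid against the truthful bid over a grid of
# competing bids; the truthful bid should never do worse.
worst_gap = min(
    surplus(value, value, price) - surplus(bid, value, price)
    for bid in range(0, 2001, 50)
    for price in range(0, 2001, 50)
)
print(worst_gap)
```

A non-negative `worst_gap` means that no alternative bid on the grid beats the truthful bid in any single auction, which is the single-auction optimality claim. It says nothing about how to spread a budget across many auctions.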

In a context like Real-Time Bidding (RTB), though, where you spread out your spend over millions of auctions, it’s OK to miss an opportunity to get a bit of positive surplus if there’s a higher-ROI opportunity coming up, so shading your bids is not a bad idea. There’s another problem with bidding truthfully, though, which comes up if you don’t actually know the true value of winning.

## The Lower the Better

The problem we laid out in Part 2 of this series was one in which you had a set budget to spend over a given time period, and you were trying to get actions that were worth a certain known amount. In direct-response campaigns, sometimes there is a concrete, known target CPA. But in something like a branding campaign, it can be a little fuzzier: the target CPA is not actually the value that the advertiser ascribes to getting an action; instead, it comes out of comparisons with other channels/campaigns (on an ROI basis). In an arbitrage or variable-margin context, the “target” CPA is a maximum allowable CPA: no one will complain if the CPA is lower than the target. So in fact, you often just want the lowest CPA you can get, and the hard constraint is to spend the media budget or achieve the delivery objective. In a case like this, it’s actually very difficult to bid “truthfully”.

Conveniently, though, the result above yields a very simple approach to this type of situation. The shaded policy described above calls for bidding V/(X+1), where V is the target CPA * P(action) and X is the minimum expected ROI you’re willing to accept, according to a closed-loop pace controller, as described in Part 1. The pace controller doesn’t really know or care about ROI, though; it just outputs a control value that correlates well with spend rate, so that it can keep the pace on track. That means that we can rearrange the equations a little bit to say that for every auction:

bid = K * P(action), where K = target CPA / (X+1)

and take the output of the pace controller to be K instead of X.

Bidding in this way spends the budget while minimizing the CPA, given the constraints of liquidity and the availability of a good P(action) predictor. Given its lack of reference to ROI, value or cost, it even works when the “budget” is not dollars to be spent but impressions or actions to be bought (i.e. an arbitrage situation rather than a branding campaign). The only consideration if there is a known target CPA is to ensure that the system is able to push the CPA below the target; if it isn’t, then there is not enough liquidity to make the campaign work under the given constraints.
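As a rough illustration, bidding this way amounts to multiplying the controller output K by the predicted P(action). The sketch below pairs that with a deliberately simple proportional update for K. The update rule, its `gain` parameter and all function names are assumptions made for this example; it is not the pace controller described in Part 1.

```python
def bid_price(k: float, p_action: float) -> float:
    """Bid proportional to the predicted action probability: K * P(action)."""
    return k * p_action


def update_k(k: float, spend_rate: float, target_rate: float, gain: float = 0.1) -> float:
    """Nudge K toward the target spend rate with a simple proportional step.

    Assumption: this stands in for a closed-loop pace controller purely for
    illustration; a real controller would be more careful about stability.
    """
    error = (target_rate - spend_rate) / target_rate
    return max(0.0, k * (1.0 + gain * error))


k = 500_000.0   # micros per action; arbitrary starting point

# Under-spending (80 against a target of 100): K rises, so bids rise.
k = update_k(k, spend_rate=80.0, target_rate=100.0)
print(k, bid_price(k, p_action=0.001))

# Over-spending (120 against a target of 100): K falls, so bids fall.
k = update_k(k, spend_rate=120.0, target_rate=100.0)
print(k, bid_price(k, p_action=0.001))
```

Nothing in this loop refers to ROI, value or cost: the only inputs are the spend rate, its target and the P(action) prediction.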

## Conclusion

This post is Part 5 of a so far five-part series in which we’ve described the evolution of Datacratic’s real-time bidding algorithm over time. We started with a closed-loop control system, which has been a constant feature of our system over the years, and we’ve used it to modulate our bidding behavior according to an ever-more sophisticated economic understanding of how to succeed in an RTB environment. The latest iteration of our algorithm is deceptively simple but, by this point, theoretically quite well-grounded. Throughout this series, we haven’t put a lot of emphasis on how we actually compute the probability that an impression being bought will result in some user action like a click or conversion. But clearly, this is where the biggest value-add of our system lies, and we will be describing the internals of that part of our system on this blog in the near future.

*Make sure to check out our other blog posts as part of our Peeking Into The Black Box Series:*

- *Part 1 - RTB Pacing - Is Everyone Doing It Wrong?: How we apply control theory to pace our spending*
- *Part 2 - Datacratic's RTB Algorithms: The statistical and economic theory that underlies Datacratic's real-time bidding strategies*
- *Part 3 - Algorithm Meets World: How the algorithm we mention in Part 2 works in real-world situations*
- *Part 4 - Beyond Surplus: Why you should use ROI, instead of surplus, to choose when to bid*

*Banner image source: Unsplash*