Airbnb Builds a Second Neural Network to Diversify Listings

Homestay broker Airbnb found that the key to creating diversity with its machine learning algorithms is to have one neural network for standard learning and another to specifically diversify the results. Diversification means, in this sense, increasing the variety of options that a user sees when looking to book a stay.
This concept took years to develop because it went directly against the company engineers’ original core belief: that the probability of someone booking a listing could be determined independently of the other listings shown alongside it. Airbnb published a technical paper and a blog post detailing the quiet shortcomings of that approach: a single neural network produces a precise but homogeneous set of results.
To increase diversity, Airbnb engineers created and iterated on additional neural networks. Once adding a new network per listing position on the results page proved successful, the engineers iterated again for practicality and scalability. The result is a search results page powered by two neural networks that work together to return a more diverse set of listings, a measurable gain for users.
This new approach produced good results: Airbnb observed a 0.29% increase in uncanceled bookings and a 0.8% increase in booking value. The increase in booking value provides a reliable proxy for an increase in quality, although it isn’t the only target: a 0.4% increase in five-star ratings indicated higher guest satisfaction across the entire trip.
Variety Is the Spice of Life
A neural network is a deep learning model that mimics the human brain and can recognize relationships between data. In Airbnb’s case, it’s great at determining which listing has a higher booking probability based on price, location or reviews, with each criterion weighted differently.
The neural network learns which listings and criteria are more successful and which are not. The graph below shows how Airbnb bookings favor lower prices.
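To make that concrete, here is a minimal sketch of what such a scorer could look like. The architecture, feature names and sizes are illustrative assumptions, not Airbnb’s published model:

```python
# A minimal, hypothetical listing scorer -- not Airbnb's actual architecture.
import torch
import torch.nn as nn

class ListingScorer(nn.Module):
    def __init__(self, num_features: int = 3):  # e.g. price, location, reviews
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, 16),
            nn.ReLU(),
            nn.Linear(16, 1),  # a single logit: the booking-probability score
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features).squeeze(-1)

# Features per listing: [normalized price, location score, review score].
scorer = ListingScorer()
listings = torch.tensor([[0.3, 0.9, 0.8],   # cheap, well-located, well-reviewed
                         [0.9, 0.9, 0.8]])  # the same listing, but expensive
# After training on booking data, the cheaper listing should score higher,
# matching the pattern in the graph above.
print(scorer(listings))
```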
It’s one thing for a neural network to learn the relationships between successful listings and another for it to understand the following warring business principles:
- The majority principle: the majority preference should drive ranking.
- The Pareto principle: user preferences are smoothly distributed with a long tail, roughly an 80/20 split. (This speaks to me because I am reminded annually that I am the 20% when we book our friends’ vacation.)
- The pressure is turned up by the need to make sure the first page speaks to the user, because attention wanes after that.
The original neural network’s results were so skewed toward the majority that, to quote the technical paper, “it’s the tyranny of the majority.”
Tyranny !== diversity.
N Models Added For Us 20 Percenters
The lack of diversity meant one thing: the OG neural network was very good at identifying and exploiting listings that exactly matched the search criteria. Rather than start over, the original model was kept and N models for N additional listing positions were added for comparison.
Position One: The First New Model
Listing position one only takes center stage if the original choice, or “position zero,” didn’t work out (and the same holds for every subsequent position). That means position one has an additional input — the listing in position zero, now dubbed the “antecedent listing.”
To build training examples for the new conditional pairwise booking-probability model, all searches where the listing at position zero was booked are discarded. For the remaining cases, where position zero wasn’t booked, pairs are created from ‘booked’ and ‘not booked’ listings, similar to how the original neural network ranks booking probability.
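Here’s a rough sketch of how those training pairs could be assembled. The `Search` structure and its field names are assumptions for illustration, not Airbnb’s actual data pipeline:

```python
# Hypothetical construction of training pairs for the position-one model.
from dataclasses import dataclass

@dataclass
class Search:
    listings: list            # listings in ranked order; listings[0] is position zero
    booked_index: int | None  # index of the booked listing, or None if nothing was booked

def build_position_one_pairs(searches: list[Search]) -> list[tuple]:
    pairs = []
    for search in searches:
        # Discard searches where position zero itself was booked; searches
        # with no booking at all contribute no pairs either (an assumption here).
        if search.booked_index == 0 or search.booked_index is None:
            continue
        antecedent = search.listings[0]  # the "antecedent listing"
        booked = search.listings[search.booked_index]
        for i, listing in enumerate(search.listings[1:], start=1):
            if i == search.booked_index:
                continue
            # (antecedent, booked, not booked): the model learns to score
            # `booked` above `listing`, conditioned on the antecedent.
            pairs.append((antecedent, booked, listing))
    return pairs
```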
The setup for position two is similar to position one, but now even more information is available. By the time this position is reached, neither the listing in position zero nor the one in position one was booked, making two antecedents available to compare against. The process repeats N times for N models.
As one may suspect, the theory is simple but impractical to implement, with O(N³) time complexity.
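A back-of-the-envelope reading of where the cubic cost comes from: filling N positions means one model per position, each scanning on the order of N candidates, and each score conditioning on up to N antecedents. The sketch below (with an assumed `model.score` interface) shows the three nested O(N) factors:

```python
# Sketch of the naive N-model ranking loop; interfaces are assumed for illustration.
# Three nested O(N) factors -- positions x candidates x antecedents -- yield O(N^3).
def rank_naive(candidates, models):
    placed = []                    # listings already ranked: these are the antecedents
    for model in models:           # one model per position: O(N)
        remaining = [c for c in candidates if c not in placed]
        best = max(                # scan every remaining candidate: O(N)
            remaining,
            key=lambda c: model.score(c, antecedents=placed),  # each score reads O(N) antecedents
        )
        placed.append(best)
    return placed
```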
The Refactor
Describing the refactor, the researchers reported in the technical paper, “we start with the N distinct models constructed for each of the N positions and simplify them, one by one.” The result is two models: the original neural network and a new similarity neural network.
Airbnb expects results from the original neural network and the “similarity network” to be fairly close, since they share a large part of their training. But the expectation is that the similarity model will outperform, since similar listings are downranked and diverse listings are upranked.
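One plausible way to picture how the two networks’ outputs combine at ranking time is an MMR-style greedy re-rank. The combination rule, function names and the similarity penalty below are assumptions, not the paper’s actual formulation:

```python
# Hypothetical re-ranking: a base scorer plus a similarity penalty.
def rerank(candidates, base_scores, similarity, top_k):
    results = []                              # indices of listings already placed
    remaining = list(range(len(candidates)))
    while remaining and len(results) < top_k:
        def adjusted(i):
            score = base_scores[i]            # the original network's score
            for j in results:
                # Downrank listings similar to what's already on the page;
                # diverse listings keep more of their base score.
                score -= similarity(candidates[i], candidates[j])
            return score
        best = max(remaining, key=adjusted)
        results.append(best)
        remaining.remove(best)
    return [candidates[i] for i in results]
```

The penalty grows with each similar listing already placed, which is what pushes diversity onto the page.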
Important note: this only works if listings zero and one are different. If they’re identical, only one antecedent needs to be accounted for, since a duplicate adds nothing and the goal is d-i-v-e-r-s-i-t-y.
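As a rough illustration, collapsing duplicate antecedents is a simple dedup before scoring. This sketch assumes each listing carries an `id` field:

```python
# Collapse duplicate antecedents: two identical listings contribute
# no more diversity signal than one, so only distinct ones are kept.
def distinct_antecedents(antecedents):
    seen = set()
    unique = []
    for listing in antecedents:
        if listing.id not in seen:  # `id` field assumed for illustration
            seen.add(listing.id)
            unique.append(listing)
    return unique
```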
Give the People What They Want
The probe into diversity didn’t start with this project. It was a multiyear process at Airbnb that began in 2017 with category-based diversification: first along a single dimension (price, location, amenities), then along multiple dimensions. Failure, and failure again.
The breakthrough came in 2019, when the model itself was given the freedom to decide how to diversify. It’s also important to note that the training data itself is biased against diversity. Airbnb expects future training to include richer examples to learn from, enabling a virtuous cycle of diversification.