Standardising data
Zumata uses AI to help hotel booking platforms like Expedia manage listings efficiently
To the layperson, a hotel booking platform like Expedia is simply a marketplace that lists available hotels matching the user's preferences. But what goes on in the backend, specifically the distribution chain, is actually pretty messy.
The hotel business operates much like most conventional industries. The hotels are the manufacturers of the product. The middlemen (wholesalers and distributors) store, organise and sell the inventory to hotel booking platforms such as Expedia, which are the end retailers.
“The middlemen market is actually defined by old technology and zero standardisation and really bad data quality. For example, supplier number one has a hotel called the Changi Crown Plaza, supplier number two has Hotel Changi Crown Plaza at Singapore. It’s the same hotel and you and I can figure that out. But programmatically that’s very difficult,” says Josh Ziegler, CEO and co-founder of Zumata, an AI-powered hotel distribution company.
Each supplier has its own naming convention for the same hotel, and often lists different addresses and images as well. Mapping the data accurately is therefore challenging.
Having run discounts publisher Good Times for seven years in Hong Kong, Ziegler received feedback from numerous corporate clients asking for discounts on travel products. He then looked into how to build a travel programme and realised that, in order to achieve continual discounts, he had to work with multiple suppliers and map the data together to compare supply and retail prices.
The concept of Zumata was then born. In a nutshell, it aims to normalise and standardise disparate sets of hotel data. Having a unified set of data gives hotel booking platforms the ability to access a larger inventory and reap better margins, since they are able to compare prices from more suppliers.
“They can then dynamically adjust their prices and their margins and steer customers to the ones that make them the most money,” he says.
Standardising data sets
There are two layers to Zumata’s system of mapping data. First, it uses programmatic tech to find and match keywords. Then the expert systems kick in to exclude terms such as brand names or the word ‘hotel’. Machine learning and deep learning are then applied so that patterns can be identified more quickly and effectively.
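The keyword-matching and exclusion steps described above can be illustrated with a minimal Python sketch. It is not Zumata's implementation: the stop-word list, the `normalise` and `similarity` helpers and the choice of Jaccard similarity are all assumptions made for illustration.

```python
import re

# Hypothetical stop words: generic terms assumed to carry no identifying information.
STOP_WORDS = {"hotel", "at", "the", "singapore"}


def normalise(name: str) -> frozenset[str]:
    """Lower-case a hotel name, split it into words and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return frozenset(t for t in tokens if t not in STOP_WORDS)


def similarity(a: str, b: str) -> float:
    """Jaccard similarity between the normalised word sets of two names."""
    ta, tb = normalise(a), normalise(b)
    if not ta and not tb:
        return 0.0  # nothing left to compare
    return len(ta & tb) / len(ta | tb)


# The two supplier names from the article reduce to the same word set.
score = similarity("Changi Crown Plaza", "Hotel Changi Crown Plaza at Singapore")
```

A rule-based pass like this only catches differences in filler words and word order. Misspellings, abbreviations and translated names would still slip through, which is presumably where the machine learning layer earns its keep.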
Ziegler says the AI is trained to think, act, and understand like a human. So the system of quantifying and arranging data sets becomes more concept-based than keyword-based.
This picture may look rather abstract, so consider a real-world example: imagine that 15 suppliers offer rates at the Changi Crown Plaza. All of their data points could be totally different. Zumata maps all of this data to one common name, then matches the records from each of the 15 suppliers to that unique identifier before passing the result to the end user.
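As a rough sketch of that mapping step, the Python below groups supplier records under one canonical key and then compares rates. The supplier names, rates, stop-word list and the `canonical_key` and `group_by_hotel` helpers are all invented for illustration; a production system would need far more robust matching.

```python
from collections import defaultdict

# Hypothetical stop words ignored when building the canonical key.
STOP_WORDS = {"hotel", "at", "the", "singapore"}


def canonical_key(name: str) -> str:
    """Lower-case the name, drop stop words and sort the remaining words."""
    words = [w.strip(",.") for w in name.lower().split()]
    return " ".join(sorted(w for w in words if w and w not in STOP_WORDS))


def group_by_hotel(records):
    """Group (supplier, hotel_name, rate) records under one canonical key."""
    groups = defaultdict(list)
    for supplier, name, rate in records:
        groups[canonical_key(name)].append((supplier, rate))
    return dict(groups)


# Made-up sample data: two suppliers list the same hotel under different names.
records = [
    ("supplier_1", "Changi Crown Plaza", 210.0),
    ("supplier_2", "Hotel Changi Crown Plaza at Singapore", 198.5),
    ("supplier_3", "Marina Bay Inn", 150.0),
]
grouped = group_by_hotel(records)

# Once records share a key, the cheapest supplier rate is a simple lookup.
best = min(grouped["changi crown plaza"], key=lambda pair: pair[1])
```

With every record filed under a single key, a booking platform can compare supplier rates for the same hotel directly, which is the margin advantage described earlier.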
Unifying the data
For the end user, being able to search for hotels using everyday words is imperative, and this is where Zumata’s Natural Language Processing (NLP) tech comes in.
“So you can either speak or you can type: ‘I want a hotel that’s within walking distance to the Eiffel tower, it should have a gym and let’s say, good value for money’. So you can type in those types of things. And what it is going to do is parse that sentence, to understand what is the intent of what you’re asking for,” says Ziegler.
“So Zumata is going to recognise the Eiffel tower as a point of interest. And it says basically ‘here is a radius that is walking distance, so we have narrowed it down to those hotels’,” he says.
“Number 2, you have asked for a gym. So you probably want to see what the gym looks like. Maybe you’d like to run on the treadmill or lift weights so if you see an image of the gym, you will be able to decide if that’s the gym you’re looking for. And so we present the image of the gym straight away. Image recognition basically tags every image that we have – we have 300 million images – and then compares them to concepts that you were asking for.”
Finally, with regard to “value for money”, since there is no dollar amount specified, Zumata trawls through reviews, blogs and news articles where people talk about their experience at these hotels and the value they received. The NLP is then able to filter out all the irrelevant hotels and match the right ones according to the user’s intent, organising them from most relevant to least relevant.
NLP is also incorporated into the customer service layer. Instead of human customer service officers handling all the queries, chatbots embedded with NLP capabilities can answer general queries, such as hotel cancellation deadlines, more efficiently.
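The walking-distance step in the example above amounts to a radius filter around a point of interest. The Python sketch below shows one way to do this with the haversine great-circle formula. The 1 km radius, the hotel names and their coordinates are assumptions for illustration, not details of Zumata's system.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
WALKING_RADIUS_KM = 1.0  # assumption: treat 1 km as "walking distance"


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def within_walking_distance(poi, hotels, radius_km=WALKING_RADIUS_KM):
    """Return the names of hotels no further than radius_km from the point of interest."""
    return [
        name
        for name, lat, lon in hotels
        if haversine_km(poi[0], poi[1], lat, lon) <= radius_km
    ]


EIFFEL_TOWER = (48.8584, 2.2945)  # approximate coordinates

# Hypothetical hotels: one a few hundred metres away, one several kilometres away.
hotels = [
    ("Hotel A", 48.8600, 2.2950),
    ("Hotel B", 48.8867, 2.3431),
]
nearby = within_walking_distance(EIFFEL_TOWER, hotels)
```

Straight-line distance is only a proxy for walking distance. Rivers, rail lines and street layout can make the real walk considerably longer, so a production system might use routing data instead.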
“It’s a much better experience and they (customers) get their answers quickly; and the operator benefits because they have lower overheads. It’s easier to train, you won’t have sick days, it’s on demand and it’s in multiple languages so you don’t have to maintain different customer officers,” says Ziegler.
Solving big problems
Zumata works with Expedia, as well as large hotel distribution companies such as Dhisco, which handles 13 billion hotel transactions a year; and Amadeus, which is used by over 400,000 travel agents.
“These are the largest players in the industry, and we’re solving problems for them. So you can just imagine if it’s a challenge for them, tier 2 and tier 3 players have no idea how to even begin. It’s pervasive across the entire industry,” says Ziegler.
Not all travel products suffer from this problem. Ziegler says airlines generally have much cleaner and more standardised data sets. But for car rentals, tours and transfers, organising data is equally messy.
“The most difficult part, on the distribution side, the hotel part is really the technology that exists, so within the travel business of Asia. I would say it is about 20 years behind. So where we are at is a huge leap from where they are today. In Asia it’s much more of an educational process because they are so far behind,” he concludes.