GumGum is a company whose platform delivers online advertisements related to the context in which potential customers are already shopping or searching. (For example: it will send advertisements for restaurants in Zurich to someone who has booked a trip to Switzerland.) To manage this granular targeting, it relies on its proprietary machine learning platform, Verity.
“For all of our publishers, we send a list of URLs to Verity,” said Keith Sader, GumGum’s director of engineering. “Verity essentially goes in and categorizes these URLs as different [Interactive Advertising Bureau] categories. So the IAB has tons of taxonomies, automobile-based, apparel-based, entertainment-based. And then that’s how we do our targeting.”
Verity’s targeting data is stored in DynamoDB, but the rest of GumGum’s data is stored in managed MySQL and its daily tracking data is stored in ScyllaDB, a database designed for data-intensive applications. Scylla, Sader said, helps his company avoid showing the same ads to the public over and over again, by keeping track of which ads customers have already seen.
“That’s where Scylla comes in for us,” he said. “Scylla is our rate limiter on ad serving.”
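The rate-limiting role Sader describes, tracking which ads a person has already seen and declining to serve them again, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `FrequencyCap` class, its method names, and the in-memory dictionary standing in for the database are hypothetical and are not GumGum's schema or code.

```python
class FrequencyCap:
    """Hypothetical in-memory stand-in for a per-user impression store."""

    def __init__(self, max_impressions: int) -> None:
        # Maximum number of times one user may be shown one ad (assumed policy).
        self.max_impressions = max_impressions
        # Maps (user_id, ad_id) to the number of impressions recorded so far.
        self._seen: dict[tuple[str, str], int] = {}

    def should_serve(self, user_id: str, ad_id: str) -> bool:
        """Return True if the user is still under the cap for this ad."""
        return self._seen.get((user_id, ad_id), 0) < self.max_impressions

    def record_impression(self, user_id: str, ad_id: str) -> None:
        """Count one more impression of this ad for this user."""
        key = (user_id, ad_id)
        self._seen[key] = self._seen.get(key, 0) + 1


cap = FrequencyCap(max_impressions=1)
if cap.should_serve("user-1", "ad-zurich"):
    cap.record_impression("user-1", "ad-zurich")
# A second request for the same user and ad is now declined.
print(cap.should_serve("user-1", "ad-zurich"))
```

In a production system the counts would live in a replicated database rather than in process memory, so that the same answer is available wherever the request is served.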
In this episode of The New Stack’s Makers podcast, Sader and Dor Laor, CEO and co-founder of ScyllaDB, discuss how GumGum used ScyllaDB to shift more computing resources to its core business and to avoid repeating ads to audiences who have already seen them, no matter where they travel.
This case study episode of Makers, hosted by TNS Features Writer Heather Joslyn, was sponsored by ScyllaDB.
“Where do we spend our limited funds?”
Before adding ScyllaDB to his stack, Sader said, “We had a Cassandra-based system that some very smart people put together. But Cassandra is counting on you to have a team of engineers to support it.”
“That’s great. But like many types of systems, managing Cassandra databases isn’t really what makes our business money.”
GumGum hosted its Cassandra database, installed on Amazon Web Services, on its own, and the drain on resources brought the company’s teams to a crossroads, Sader said. “Where do we spend our limited funds? Are we spending it on Cassandra’s upkeep? Or do we hire someone to do it for us? And that’s really what determined the move from a sort of self-installed, self-managed Cassandra to another vendor.”
A central issue for GumGum, Sader said, was ensuring it didn’t overserve consumers, even as they moved around the world. “If you see an ad in one place, we need to make sure that if you travel across the country, you won’t see it again,” he said.
It’s a problem Cassandra solved for his business, he said. Because ScyllaDB is a replacement for Apache Cassandra, it has also prevented overserving in all regions of the world, thus keeping GumGum from losing money.
Along with managing the database for GumGum and other clients, Laor said, one benefit ScyllaDB brings is an “always-on” guarantee.
“We have a great legacy of infrastructure that was meant to be resilient,” he said. “For example, each of our implementations has configurable consistency, so you can have multiple replicas.”
Laor added, “Very often organizations have multiple data centers. Sometimes it’s for disaster recovery, sometimes it’s also to shorten latency and be closer to the customer.” Database replicas located in geographically distributed data centers, he said, protect against failures in any data center.
See the results
Bringing ScyllaDB to GumGum has not been without its challenges, Sader and Laor said. When ScyllaDB is added to an organization’s stack, Laor said, he likes to start with as small a deployment as possible.
“But in the GumGum case, all of those clients were new processes,” Laor said. “So hundreds or thousands of processes, all trying to connect to the database, it’s really a connection storm.”
The Scylla team created a private version of its database to work on the problem and eventually solved it. “We had to fiddle with the algorithm and make sure that all [open source] upstream code committers accepted it,” Laor said.
The team eventually devised an admission control mechanism that measures the number of parallel queries being processed by the distributed database and slows down queries that arrive for the first time from a new process. “We tried to have the complexity on our side,” Laor said.
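The idea behind that mechanism can be sketched in a few lines. This is only an illustration of the description above, not ScyllaDB's implementation: the `AdmissionController` class, its parameters, and the fixed delay are all hypothetical, and a real distributed database would track load per node and per shard.

```python
class AdmissionController:
    """Hypothetical sketch: delay first-time queries from new processes under load."""

    def __init__(self, max_in_flight: int, new_client_delay: float) -> None:
        self.max_in_flight = max_in_flight        # assumed load threshold
        self.new_client_delay = new_client_delay  # assumed delay, in seconds
        self.in_flight = 0                        # queries currently being processed
        self._known_clients: set[str] = set()     # processes seen at least once

    def admit(self, client_id: str) -> float:
        """Register a query and return how long to delay it, in seconds."""
        is_new = client_id not in self._known_clients
        self._known_clients.add(client_id)
        overloaded = self.in_flight >= self.max_in_flight
        self.in_flight += 1
        # Only a first query from a new process is slowed, and only under load.
        return self.new_client_delay if (is_new and overloaded) else 0.0

    def release(self) -> None:
        """Mark one query as finished."""
        self.in_flight = max(0, self.in_flight - 1)


controller = AdmissionController(max_in_flight=2, new_client_delay=0.5)
delays = [controller.admit(name) for name in ("p1", "p2", "p3", "p1")]
print(delays)
```

In this sketch the third process arrives when two queries are already in flight, so its first query is delayed, while a repeat query from an already-known process is not.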
GumGum has seen the results of transferring this complexity and labor to a managed database. “We’ve basically reduced our entire operational effort with Scylla, to next to nothing,” Sader said.
He added: “We are coming to our busiest point of the year, the ads are really picking up in the fourth quarter. So we reach out and say, ‘Hey, we need more nodes in these regions, can you make that happen for us?’ They go, ‘Yeah.’ Give us the things, we pay the money. And it happens.”
In 2021, Sader said, “we increased our volume by probably 75% plus 50%, compared to our norm. The hardest thing to do in this industry is to make it easy. And Scylla has helped us simplify ad serving.”
Check out the podcast for more details on GumGum’s move to a managed database.