Monday, 02 June 2025 17:23

"Why we had to build Snowflake" - an interview with co-founder Benoit Dageville Featured

By David M Williams

It's June, which means the data and AI cloud Snowflake is about to run its annual Snowflake Summit, where it makes major product announcements. iTWire spoke with Snowflake co-founder Benoit Dageville, who explained why he and Thierry Cruanes simply had no choice but to create the cloud database which now enjoys multi-billion dollar revenues with 8,000 employees across 50 countries.

iTWire has been around the world and interviewed big tech company executives who are flanked by security, or who require questions to be submitted in advance, or who will only address a group. Yet, despite his obvious success, Dageville sat down with iTWire for a chat as if we were equals, as if we were old buddies. An onlooker might have mistaken us for two engineers on their lunch break, such is the humility and simplicity with which Dageville carries himself. After our talk, iTWire watched as Dageville, though on his way to lunch, happily stopped every couple of metres to greet enthusiastic conference attendees who called out to him in celebrity fashion.

So, iTWire asked, what's the Snowflake story? What was the catalyst that led Dageville and Cruanes - or perhaps "Ben and Thierry without the ice cream" - to say there must be a better way than all this B-Tree stuff? And is Larry Ellison kicking himself for letting you guys go?

"Thierry and I were very good friends working as architects in the core Oracle database team," Dageville explained to iTWire. However, "two revolutions were happening at that time, and Oracle was not part of them."

The first was the cloud. "For us, the cloud was a miracle for analytics, resources, and elasticity," he said. "I always said that as older software guys we were at the mercy of hardware and its resources. The cloud was a game changer for analytic workloads that have lots of peaks and troughs in usage. They really needed that elasticity the cloud offered."

Secondly, and "even more important, was service," Dageville said, referring to the emerging and evolving "as a service" mindset. "We wanted to create a solution that was super easy for customers to use and could be consumed as a service, managing its infrastructure by itself. This new era of fully-managed services was dramatically lowering complexity."

In fact, "the other revolution Oracle missed at the time was the big data revolution," Dageville added. "Hadoop was really good at analysing machine-generated data and combining structured and non-structured data in one system."

While relational models have always dominated the database world, there has been a growing appreciation of the importance of unstructured and semi-structured data. "This is super critical because most customer interaction is like this. If you miss this, you miss a lot of information."

Big data was, at that time, slow and complicated. "It didn't offer a lot of the things you would expect, like transactions," Dageville said, "but they were critical."

The vision formed: "We wanted to unify these key ideas."

In fact, the conviction that such a platform needed to exist became so strong that the pair knew if nobody else was going to bring it to life, they would. They had to.

"We didn't talk to Larry Ellison because we knew there was no way we could build a product like this at Oracle which already has a mature database offering. Dageville knew his vision needed resources, but saw there was no reason for Oracle to put in those resources. Hence, something new had to be built in order to bring to life these burning thoughts about a modern data platform.

"For us, it was about building a product, not a company," he said. "We created a company as collateral damage," Dageville joked, before adding more seriously, "we had to build a company to build a product."

The pair ended up leaving Oracle in 2012, spending months together in San Mateo endlessly scrawling designs on a whiteboard. "How do you build a completely revolutionary system?" they asked themselves.

The first big plan was to build a full-fledged data platform completely in the cloud. The second was to make the data immutable. "You can only create new data," Dageville said. This refers to Snowflake's native storage format, in which data is written to micro-partitions and tables hold pointers to those micro-partitions. If you want to delete or update a record, the system quickly duplicates the affected micro-partition, makes your changes in the copy, and repoints the table to this new version. The original micro-partition remains just as it was, where it was. Think it through and the power of this method comes to light.

First, unlike traditional database engines, you don't need to lock a table while writing to it or reading from it; with versioned micro-partitions you can read and write simultaneously. Importantly, you can undo any delete, any insert, any update, in milliseconds. Drop a table? Undrop it - literally, using Snowflake's own "undrop" command - and all the engine does is reset the table's micro-partition pointers to the previous version. Deleted a whole slab of production data because you forgot a WHERE clause? No problem in Snowflake: undelete in fractions of a second. Want to clone a production database, no matter how big? Snowflake does it in less than the blink of an eye, because all it has to do is copy pointers to micro-partitions, not the data itself. Immutable micro-partitions are just one of the many creative and brilliant things Snowflake's creators introduced to the world. The technology warms the hearts of database administrators everywhere, while the pointer arithmetic under the hood brings a smile to even the most grizzled computer science professor.
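To make the pointer-swap idea concrete, here is a minimal sketch in plain Python - invented names, purely illustrative, and in no way Snowflake's actual engine - showing why undo and clone reduce to pointer operations rather than data movement:

```python
from dataclasses import dataclass

# Copy-on-write table versions, as a toy model. Illustration only;
# not Snowflake's storage engine. All names here are invented.

@dataclass(frozen=True)
class MicroPartition:
    rows: tuple  # immutable: once written, never modified in place

@dataclass(frozen=True)
class TableVersion:
    partitions: tuple  # pointers to immutable micro-partitions

class Table:
    def __init__(self):
        self.versions = [TableVersion(partitions=())]

    @property
    def current(self) -> TableVersion:
        return self.versions[-1]

    def insert(self, rows) -> None:
        # Writes go into a brand-new micro-partition; nothing existing is touched.
        part = MicroPartition(rows=tuple(rows))
        self.versions.append(TableVersion(self.current.partitions + (part,)))

    def delete_where(self, predicate) -> None:
        # "Delete" rewrites only affected partitions into new copies and
        # repoints the table; the originals stay exactly where they were.
        new_parts = []
        for p in self.current.partitions:
            kept = tuple(r for r in p.rows if not predicate(r))
            if kept == p.rows:
                new_parts.append(p)  # untouched partition: reuse the pointer
            elif kept:
                new_parts.append(MicroPartition(kept))  # partial: new copy
        self.versions.append(TableVersion(tuple(new_parts)))

    def undo(self) -> None:
        # Undelete/undrop in effectively constant time: repoint to the
        # previous version; no data moves at all.
        if len(self.versions) > 1:
            self.versions.pop()

    def clone(self) -> "Table":
        # "Zero-copy" clone: copy pointers to partitions, not the data.
        t = Table()
        t.versions = [self.current]
        return t

t = Table()
t.insert([1, 2, 3])
t.delete_where(lambda r: r > 1)  # oops, wrong predicate in production...
t.undo()                         # ...instantly back to [1, 2, 3]
assert t.current.partitions[0].rows == (1, 2, 3)
```

Because each table version is nothing more than a tuple of pointers, rolling back a botched delete or cloning a huge table never touches the stored data itself.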

Next, "it took us a long time to grok how to do elasticity," Dageville said. This is the concept where a platform expands to provide more compute when needed, and shrinks when not needed.

Of course, "talk to any database person at the time," Dageville said, "and they'd say you run a cluster and add nodes. This form of 'elasticity' only grows."

"For us that was not interesting. It's not super cool to add more nodes in a cluster."

The pair kept at it. They were determined their new product had to be elastic in the true sense. "After many days of discussion we determined elasticity was really about plugging in new workloads," he said. "Plug a workload in and run it, and it takes milliseconds to allocate new resources, completely isolated from other workloads."

"It would run. We'd stop it, and then the 'aha' moment came. If we can do that, why not dedicate 10x more compute resources for that workload? Or 100x? If we can very quickly allocate these resources we can accelerate software by 10x or 100x and parallelise workloads, while keeping the price the same" - that is, "if you price up 10x more resources but it's 10x faster, then it will be the same price as if you rented those resources but took ten times longer to get a result."

Here's where Snowflake really began to take shape - "our really innovative architecture was decoupling compute and data, and having dedicated resources."

The first iteration of Snowflake went out the door. "We understood very well at that time that the cloud was opening doors and analytical systems would be completely changed."

"The way we built our architecture was with a very different premise to others," Dageville said. "We realised a lot of startups building new data systems were taking open source and making some databases better; they would take Postgres as the source - and RedShift is based on that, for example - with a distributed but very traditional architecture that wasn't designed at all for the cloud. They're simply traditional database systems stored on a cluster."

"When you start with the wrong architecture, you can never get what you ask for," he said. "You can work super hard but never get it."

Instead, Snowflake began with a very different premise. "We knew from the start we wanted to allocate compute resources at scale and start running workloads in sub-seconds. We didn't want them to be pre-started; they would just start, and the hardware would show up magically."

"That's really powerful. Imagine a world where you don't need to install software. You just run it and it finds a place to run without impacting existing things."

"Snowflake does this because we completely decoupled data and compute." In fact, Snowflake has three independent layers covering transaction state as well as data and compute. "It's a completely different architecture that no one else has," Dageville said.

The first release of Snowflake was "laser-focused on workloads," Dageville said, but the founders had a bigger vision. "Our vision of data was to create a full-fledged cloud where you can run data apps in our cloud; create an iPhone experience where you put your apps on our App Store and have the consumer install them just as they would on their phone. The key aspect here was to make the platform really extensible and open, and offer native apps that run completely on the consumer side, so the provider has no access to the consumer's data. The consumer is completely protected, and at the same time the provider is protected from the consumer too. They can have models, data, and really amazing data apps - not just code, but code and data packaged together that can then do something further on the consumer side."

This vision of data apps came to the public two years ago, and has continued to expand since. "It's a new type of app. The simplicity you get is you just click and it's completely managed by the platform."

"With Snowflake you get the best of both worlds; the world of the iPhone and the world of your cloud."

In addition, "from day one I wanted to make data as self-service as possible. We had to connect data to business users, who needed a direct connection, so applications were the ideal bridge - and now apps powered by AI are so critical to furthering these interactions."

With the inclusion of Streamlit, "you can build an app in days, even minutes, and put it in the hands of business people," he said. Streamlit is an open-source Python framework for rapidly developing graphical, data-driven applications, and was acquired outright by Snowflake. While Streamlit is integrated into the platform, it remains a freely available toolkit for use outside Snowflake too.
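For a sense of how little code such an app takes, here is a minimal Streamlit sketch; the CSV file and its column names are hypothetical placeholders:

```python
# app.py - a minimal Streamlit data app. The file "sales.csv" and its
# columns ("region", "month", "revenue") are hypothetical placeholders.
import pandas as pd
import streamlit as st

st.title("Monthly sales")

df = pd.read_csv("sales.csv")  # any tabular data source would do
region = st.selectbox("Region", sorted(df["region"].unique()))

filtered = df[df["region"] == region]
st.bar_chart(filtered, x="month", y="revenue")
st.dataframe(filtered)
```

Running "streamlit run app.py" serves this as an interactive web app; inside Snowflake, the same framework powers apps hosted on the platform itself.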

Dageville sees the next generation of data apps as AI apps. "AI allows us to go beyond a finite set of use cases. Builders don't need to know exactly what users want; they can provide a generative AI layer and magically translate questions into code."

"It's not written by me as the developer; it's written by you as the business user to execute what you want to do. It gives a lot more power and democratises data. Interactions don't have to wait for the developer. AI-powered apps can really achieve much more with data."

What's next? "We're going to make apps much better, we're going to make Snowflake much better," he said. "We need to provide all the core capabilities every app will need, so creating one takes hours, not days - and eventually minutes."

"Our goal is to make it really seamless to develop apps. We have the core building blocks, including Neeva" - an AI search platform that Snowflake acquired bringing not only its tech but providing a new Snowflake CEO in the form of Sridhar Ramaswamy who has been in the role for nearing two and a half years now. With Neeva, "you can run effectively a Google search inside Snowflake."

"Then, streaming data, more analytics ... our focus is to provide a complete and super-integrated platform such that amazing data and AI apps can be built on top of it," Dageville said.

"The iPhone didn't change dramatically in the last 10 years but the quality of apps have changed, and that's the experience and what's next for Snowflake."

"We are going to do amazing things, but the most amazing thing will be the apps that run on Snowflake."

"The iPhone is our North Star - to build an amazing platform for amazing apps to delight customers."

 

Stay tuned to iTWire to see the news and announcements made at Snowflake Summit as the week progresses.

 

 

Pictured: David M Williams, iTWire (left); Benoit Dageville, Snowflake (right)

David M Williams

David has been computing since 1984, when he instantly gravitated to the family Commodore 64. He completed a Bachelor of Computer Science degree from 1990 to 1992, commencing full-time employment as a systems analyst at the end of that year. David subsequently worked as a UNIX Systems Manager, Asia-Pacific technical specialist for an international software company, Business Analyst, IT Manager, and other roles. David has been the Chief Information Officer for national public companies since 2007, delivering IT knowledge and business acumen, seeking to transform the industries within which he works. David is also involved in the user group community, the Australian Computer Society technical advisory boards, and education.
