Value Drivers in Data Warehouse Selection

This is the second in the series of blogs that accompany the joint webinars we ran with Yellowbrick about data warehouse modernisation. You can see the second webinar here.

Previously we talked about some of the challenges that customers face when their data warehouse goes end of life, such as getting support, evaluating possible alternatives, and migrating to a replacement. In this blog, we're going to be focusing on the various ways that the Yellowbrick database specifically can add value to your organisation.

Saving – Spend Less, Make More

Value is not just a function of cost, but let's begin with the most obvious way that Yellowbrick can help derive value, which is by saving money.


A great thing about Yellowbrick is that they are completely transparent about their pricing. With a lot of vendors, particularly in the Cloud, you almost need a PhD to calculate the prices. Yellowbrick price very simply in terms of vCPUs per hour: the more vCPUs you use, and the longer you use them, the more you pay. There are a few pricing models to choose from. You can subscribe for one or more years to fix your cost at a particular level, or you can Pay-As-You-Go. You can also combine the two if you have periodic bursts whose cost you don't want to fix.
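As a rough illustration of the per-vCPU-hour model described above, the sketch below adds a fixed subscription to a Pay-As-You-Go burst. The function name, the 730-hour month, and both rates are hypothetical placeholders chosen for the example, not Yellowbrick's actual prices.

```python
def monthly_cost(committed_vcpus, burst_vcpu_hours,
                 subscription_rate, payg_rate, hours_in_month=730):
    """Estimate a monthly bill for a combined subscription + Pay-As-You-Go model.

    Rates are hypothetical currency units per vCPU-hour.
    """
    # Committed vCPUs are billed for every hour of the month.
    committed = committed_vcpus * hours_in_month * subscription_rate
    # Bursts are billed only for the vCPU-hours actually consumed.
    burst = burst_vcpu_hours * payg_rate
    return committed + burst


# Hypothetical example: 32 vCPUs committed all month, plus 500 burst vCPU-hours.
estimate = monthly_cost(committed_vcpus=32, burst_vcpu_hours=500,
                        subscription_rate=0.10, payg_rate=0.15)
print(f"Estimated monthly cost: {estimate:.2f}")
```

Swapping in real rates from the vendor's calculator would turn this into a quick sanity check on a quote.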

For estimating the cost, we like the calculator Yellowbrick provides. Customers don't even have to speak to a salesperson; they can just dial in their parameters and out will pop a number. It gives a good indication of pricing, but we’d still recommend talking to Yellowbrick, as the calculator doesn’t really help you understand the full range of options.

For Cloud, most Yellowbrick customers already have a good relationship with their Cloud vendor, having committed to spending a certain amount with them. Yellowbrick doesn’t try to resell you Cloud: you pay Yellowbrick a subscription fee and your Cloud costs are your own. Your usage counts against any commitment that you've already made with your Cloud provider, including any discounts that you've bought into and any other incentive programs the Cloud vendor offers to help you grow in the Cloud, such as credits for R&D.

Yellowbrick also operates on-premises, where the hardware is priced in a Cloud-like way too. So even though it's running in your data centre, you get that same Cloud mentality of subscription-based pricing and there are no upfront costs - you effectively rent the hardware from Yellowbrick at a well-defined cost. The software subscription per vCPU is the same. On-premises you pay Yellowbrick additionally for storage and hardware, whereas in the Cloud you pay the Cloud provider for them.

Leverage the Proof of Concept

Customers hate doing a proof of concept, forming the impression that the new platform is extremely cheap, and then discovering, once they start using it in anger with real users and real volumes of data, that it's eye-wateringly expensive. This happens because they only really bought the demo and did not run real-world workloads against it.

If you are evaluating a data warehouse technology, either in the Cloud or on-premises, you should give it as close to a real-world workload as you can. If you just take a small subset and try to extrapolate from it, the real cost can differ significantly from your estimate as you start to throw more users onto the platform. In our fourth webinar we'll be discussing how Yellowbrick and Smart Associates will help you conduct a meaningful proof of concept that provides accurate costings.

If, however, you do need to scale up or down because you got your capacity wrong at the outset, it’s very straightforward and can be done in a few seconds through the Yellowbrick manager without impacting your existing workloads. If you experience variable workloads during the day, for example, you can script the increase or decrease in capacity using your preferred automation tool.
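To make the idea of scripted capacity changes concrete, here is a minimal sketch of a time-of-day schedule. Everything in it is an assumption for illustration: the schedule, the vCPU counts, and in particular `apply_capacity`, a hypothetical callback standing in for whatever resize mechanism your automation tool or the Yellowbrick manager exposes.

```python
from datetime import time

# Hypothetical schedule: (start, end, target vCPUs). End times are exclusive.
SCHEDULE = [
    (time(8, 0), time(18, 0), 64),   # business hours
    (time(18, 0), time(23, 0), 32),  # evening batch window
]
DEFAULT_VCPUS = 16  # overnight baseline


def target_vcpus(now):
    """Return the capacity the schedule asks for at the given time of day."""
    for start, end, vcpus in SCHEDULE:
        if start <= now < end:
            return vcpus
    return DEFAULT_VCPUS


def rescale(now, current_vcpus, apply_capacity):
    """Invoke the (hypothetical) resize callback only when capacity must change."""
    wanted = target_vcpus(now)
    if wanted != current_vcpus:
        apply_capacity(wanted)
    return wanted
```

A scheduler such as cron could call `rescale` every few minutes, passing in the current time and a function that performs the real resize.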

The Special Offer

We can’t talk about value without mentioning the promotion that Yellowbrick is running, which guarantees that Yellowbrick will be 75% cheaper than your existing cloud data warehouse; otherwise they will discount your software by the difference, up to a year completely free. Such is their level of confidence, based on what they’ve seen in existing customer benchmarks and the proofs of concept that they've run. Yellowbrick changes the game. They have a unique technology which makes Yellowbrick more efficient and so can deliver these savings.

Making money – Know More, Win More


The whole point of a data warehouse platform is to help an organisation improve its profitability, retain its customers, and gain an advantage over the competition.

Lowering your cost base is one way to make more money. So putting yourself on a solid, low-cost foundation for growth becomes increasingly important, particularly as organisations start to give people more access to data. You should ensure that you're able to grow your analytics capability rather than just replace what you're doing today, and give more people access to that data so they can really derive the business benefit, especially those who help the organisation make more money.

If you're still running on-premises, there are costs to running a data centre - all-day air conditioning, renting space, etc. Energy costs are high on people's agendas right now, so the ability to shrink that environment can result in huge savings. Some Yellowbrick customers have shrunk their environment by up to 97% - from 20 cabinets of equipment down to half a cabinet or so, which is an amazing saving from an energy perspective.

Simplification – Less Waste, Improved Productivity

Yellowbrick is simpler than other database technologies, including Netezza. A database veteran would be used to spending their time optimising their databases, creating indexes, deciding which disks to write to and how to spread files between different storage devices, etc. When Netezza arrived, it was revolutionary because it offered raw performance out of the box and users loved that simplicity.

With Yellowbrick, you get a greater step change in raw performance out of the box than Netezza because of Yellowbrick’s increased efficiency. They have taken away a lot of that operational overhead caused by grooming and statistics-generation, and, like Netezza, there are no indexes by default.


In fact, you can't create an index on Yellowbrick even if you wanted to; they don't think they need them. The performance of the platform deals with that. Admittedly this is a bold statement, but other platforms have things like materialised views, which are objects that use system resources and must be built and maintained by your team to optimise performance. Yellowbrick don't feel the need for them; the architecture renders such things unnecessary.

Furthermore, Yellowbrick is not trying to do too much. A lot of database vendors are trying to accommodate a whole gamut of data categories like integration, governance, some of the new Big Data Spark-type technologies, AI, and machine learning, etc., and so to be all things to all people. Yellowbrick is really a drop-in replacement for a data warehouse technology like Netezza. They are solely focused on being the best data warehouse platform: the most efficient, the most cost-effective, the most price-performant data warehouse out there.

Yellowbrick has the same solution on-premises and in the Cloud so moving backwards and forwards isn't a big shift. This is valuable where organisations aspire to Cloud but still have on-premises technology, or where they are reconsidering their mix of on-prem and Cloud and considering bringing things back on-premises. You don’t have to deploy a different solution on-premises and a different solution on Cloud. This symmetry really provides a simplification that gives you the agility to move with the times, whatever your business demands are.

Performance – Do More, Go Faster, Get Ahead

Why is being able to get answers more quickly so valuable to the Business? The answer is that as humans we're curious, we like to delve deep and diagnose things. That often means first identifying a problem, then searching for why and ultimately finding the unidentified root cause.


Machine learning and other technologies can do some of that for you, but digging down means asking more questions, analysing trends, and not just taking things at face value. When answering those questions takes a long time there is the temptation, as we've seen over the last number of decades, to move the data outside of the data warehouse into some other platform, be it Excel or another database.

Data living outside the warehouse is a nightmare for data governance and security. It takes people's time to extract the data, and your data warehouse can become a glorified data export machine, which is costly and self-defeating. We once had a customer where a user was trying to dump 120 terabytes of data from their Netezza system to their PC so they could do some local analysis in Tableau, which just killed the system and the network. Being able to get answers quickly discourages that kind of behaviour because you don't need to do it.

People are now accustomed to getting instant gratification from their mobile devices for a whole number of things. Experienced IT professionals who have been around for years know about databases and understand why some things take longer than others, but newcomers don’t necessarily have that experience and they don’t understand why refreshing a report, for example, may take a relatively long time compared to the query they run in Google that searches billions of documents and returns an answer in a second.

So, the business value is in empowering users to delve deeper into the data and find those nuggets that really change the business rather than just focusing on reporting. The key is workload management that intelligently allocates the right resources to run queries and enforces policies that control the types of queries people can run. If a query is going to export two million rows of data, it’s not a query that should be run, so send the user a message asking them to rewrite the query, or throw the query into a penalty queue, and/or kill it. Workload management effectively acts as a traffic cop for data.
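The "traffic cop" policy described above can be sketched as a simple decision function. This is an illustration in Python only and is not Yellowbrick's workload management rule syntax; the class names, the action labels, and the lower 500,000-row threshold are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RUN = "run"
    PENALTY_QUEUE = "penalty_queue"
    REJECT = "reject"


@dataclass
class QueryInfo:
    user: str
    estimated_rows_returned: int


# Hypothetical thresholds; only the two-million figure comes from the text.
EXPORT_ROW_LIMIT = 2_000_000
PENALTY_ROW_LIMIT = 500_000


def decide(query):
    """Return (action, message) for a query based on its estimated result size."""
    if query.estimated_rows_returned >= EXPORT_ROW_LIMIT:
        # Bulk exports are refused and the user is asked to rewrite the query.
        return (Action.REJECT,
                "Result set too large: please filter or aggregate and resubmit.")
    if query.estimated_rows_returned >= PENALTY_ROW_LIMIT:
        # Large but tolerable queries run with lower priority.
        return (Action.PENALTY_QUEUE, "Query moved to a low-priority queue.")
    return (Action.RUN, "")


action, message = decide(QueryInfo(user="analyst", estimated_rows_returned=3_000_000))
print(action.value, message)
```

In a real deployment the row estimate would come from the database's query planner, and the actions would map onto whatever the platform's workload manager supports.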


Flexibility – Fewer Constraints, More Choice

If you're a customer that buys into the entire ecosystem story, how long does it take before you realise that you're locked in and unable to ever get off that platform? With Yellowbrick, customers can choose whatever ETL tools, reporting tools or analytics capabilities they want to mix and match with the new database environment and integrate them simply by repointing to Yellowbrick.

Yellowbrick supports AWS and Azure and will have support for Google Cloud at the end of 2023/early 2024. It supports on-premises as well, an appliance-less technology that lands in your data centre on a subscription basis, so it is Cloud-like with no up-front cost. Moving workloads between all of those can be straightforward. You can replicate data from on-premises to Cloud, or between Cloud providers. Deployments are managed through the web-based Yellowbrick manager, which provides single-pane-of-glass control across different instances.

Modernisation – Better, Faster, Smaller, Easier


Yellowbrick's technology is decades newer than the legacy platforms that customers may currently be using, but what does that mean for the customer? To answer this question, you must look to the foundation of Yellowbrick. Their founders saw a gap between the advances in hardware technologies such as CPUs, memory, storage and IO, and database technology, which was not keeping up with those advances. Hardware capability grew at a lightning rate, but data warehouse performance only grew at 15 or 20%, meaning that customers were missing out on potential performance and potential cost savings by running older technology.

Yellowbrick looked to bridge that gap and so reinvented core database technologies to take advantage of those hardware capabilities and maximise the potential in your underlying infrastructure. This results in a more efficient, lower-cost database platform and a smaller footprint, which in turn reduces energy and space-related costs.

This greener approach to technology was recognised by Intel, who have partnered with Yellowbrick on their Intel Disruptors program. As a gold member, Yellowbrick gets early access to Intel technology and so can build technology quickly to meet those advances with joint engineering initiatives. These technical advances are immediately passed on to Yellowbrick customers.

Remember that Yellowbrick is acquired through a subscription to on-premises and Cloud instances, so customers automatically reap the benefits of those advances as Yellowbrick continuously updates its infrastructure with the latest technology. And because customers pay for vCPUs per hour, rather than through the credit-based or unit-based pricing models that most Cloud providers use, if a query that took seven seconds now runs in three because of improvements to the underlying technology, the customer enjoys that efficiency saving. It is a subtle but important difference.
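The seven-seconds-to-three example can be worked through as simple arithmetic. The sketch below assumes the query runs on the same number of vCPUs before and after; the cluster size of 16 is a hypothetical figure, and how the reduction shows up on a bill depends on the pricing model chosen.

```python
def vcpu_seconds(vcpus, runtime_seconds):
    """Compute time consumed by one query: vCPUs multiplied by runtime."""
    return vcpus * runtime_seconds


VCPUS = 16  # hypothetical cluster size

before = vcpu_seconds(VCPUS, 7)  # query at seven seconds
after = vcpu_seconds(VCPUS, 3)   # same query at three seconds
saving = (before - after) / before

print(f"{before} -> {after} vCPU-seconds ({saving:.0%} less)")
# prints "112 -> 48 vCPU-seconds (57% less)"
```

The saving is four sevenths regardless of the cluster size, because the vCPU count cancels out of the ratio.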

If you’d like to learn more about value drivers in data warehouse selection, why not watch the recording of the full webinar (49 mins)?

Or, if you’d like to have a chat with us about how we can help facilitate your free Yellowbrick proof of concept with your own data, contact us here.
