Behind the Scenes of Our 2023 Cloud Data Platform Benchmark & Analysis
- January 25, 2023
Most comprehensive performance and cost benchmark analysis of top five cloud data platforms
Choosing a cloud data platform for your business is a big decision that requires careful consideration and a thorough understanding of your options. Knowing how each platform compares to its competitors based on performance and cost is essential. That's where our 2023 Cloud Data Platform Benchmark & Analysis comes in.
This is the third year we have conducted this analysis and comparison. Our team of data experts devoted over 400 hours of lab time to compare the leading cloud data platforms: Amazon Redshift, Azure Synapse Analytics, Databricks, Google BigQuery and Snowflake.
This report offers an unbiased, vendor-neutral analysis and is ideal for organizations that are evaluating cloud data platforms, validating a selection, or reviewing their data management strategy.
Our benchmarking methodology
- Fair tests were conducted with similar out-of-the-box platform configurations.
- Performance was objectively assessed using the TPC-DS v2.11.0 benchmark (an industry standard for measuring the performance of database systems).
- A data set that simulates a typical enterprise workload was chosen for the assessment. The platforms were tested with a TPC-approved 10TB data set containing 7 fact tables and 17 dimension tables. The largest fact table had 28 billion rows, and the largest dimension table had 65 million rows.
- Queries were run in SQL with only the necessary adjustments.
- The configurations were scaled up to 2x to evaluate how each platform's performance scales.
- Response times, query execution times, and overall performance were analyzed.
- Performance benchmarking tests were conducted on both structured and semi-structured data. These benchmarks replicated typical operations of a modern data warehouse setup. The time taken and the cost incurred by each cloud data platform to process analytical queries of varying complexity were analyzed.
- For the structured data set, the study focused on assessing:
  - The data load and extraction capabilities of the cloud data platform
  - The time taken by the cloud data platform to process analytical queries of varying complexity
  - Each platform's capability to extract tables from an existing big table
- For the semi-structured data set, the study focused on assessing the time taken by the cloud data platform to process the analytical queries.
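As a rough illustration of how per-query timings, run cost, and the effect of a 2x configuration could be computed, here is a minimal Python sketch. It is not the harness used for the report: `run_query`, the hourly rate, and the choice of a geometric mean as the summary statistic are all assumptions made for illustration.

```python
import math
import time
from typing import Callable, Dict, List


def time_queries(run_query: Callable[[str], None],
                 queries: Dict[str, str]) -> Dict[str, float]:
    """Return wall-clock seconds per query id.

    `run_query` is a hypothetical callable that submits one SQL
    statement to a platform and blocks until the result is ready.
    """
    timings: Dict[str, float] = {}
    for query_id, sql in queries.items():
        start = time.perf_counter()
        run_query(sql)
        timings[query_id] = time.perf_counter() - start
    return timings


def geometric_mean(values: List[float]) -> float:
    """Summarize timings so a few very long queries do not dominate.

    Assumes every value is strictly positive.
    """
    return math.exp(sum(math.log(v) for v in values) / len(values))


def run_cost(total_seconds: float, hourly_rate_usd: float) -> float:
    """Cost of a run, assuming simple time-based billing (an assumption)."""
    return total_seconds / 3600.0 * hourly_rate_usd


def scaling_speedup(base_seconds: float, scaled_seconds: float) -> float:
    """Speedup when moving from the base to the larger configuration.

    A value of 2.0 on a 2x configuration would mean linear scaling.
    """
    return base_seconds / scaled_seconds
```

Comparing `scaling_speedup` with the change in `run_cost` between the base and 2x configurations is one way to see whether the extra capacity pays for itself. Real platforms bill in different ways (per second, per slot, per bytes scanned), so the cost function would need to be adapted for each one.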
To give our clients a deeper understanding of the capabilities of the different cloud data platforms, we went beyond the usual performance benchmarking and cost analysis. The platforms were also scored against 33 criteria across three dimensions:
- Usability: Simplicity/Self-Serve, True SaaS, Structured and Semi-Structured Data Support, Low/No Management
- Technical: Architecture, Security, Elasticity, SQL Compatibility, Data Sharing
- Business: Cost Controls, Roadmap, Enterprise Deployments, OEM Network
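The report's scoring formula is not described here, so the following Python sketch shows only one plausible way to roll per-criterion scores up into dimension scores and an overall score. The equal default weighting, the function names, and the 1 to 5 scale in the usage example are assumptions, and the example values are placeholders rather than results from the report.

```python
from statistics import mean
from typing import Dict, Optional


def dimension_scores(scores: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Average the criterion scores within each dimension."""
    return {dim: mean(criteria.values()) for dim, criteria in scores.items()}


def overall_score(dim_scores: Dict[str, float],
                  weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted average of dimension scores (equal weights by default)."""
    if weights is None:
        weights = {dim: 1.0 for dim in dim_scores}
    total_weight = sum(weights[dim] for dim in dim_scores)
    return sum(dim_scores[dim] * weights[dim] for dim in dim_scores) / total_weight


# Placeholder scores for a hypothetical platform, on an assumed 1 to 5 scale.
example = {
    "Usability": {"True SaaS": 4, "Low/No Management": 2},
    "Technical": {"Security": 5},
}
per_dimension = dimension_scores(example)
total = overall_score(per_dimension)
```

Passing explicit `weights` lets an organization emphasize the dimension that matters most to it, for example weighting Business more heavily when cost controls are the main concern.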