Clustered tables are BigQuery tables with at least one clustered column. Querying clustered tables eliminate scans of unnecessary data, and can calculate aggregates quicker because the blocks colocate records with similar values. Key features:

  • Cluster column order is very important and affects performance. If a table is clustered by columns A,B,C and we filter the table by B,C, then the query won’t be optimized for clustering.
  • Clustered tables can help you to reduce query costs by pruning data so it’s not processed by the query. This process is called block pruning. BigQuery sorts the data in a clustered table based on the values in the clustering columns and organizes them into blocks. Details:
  • Types of partitioning: BIGNUMERIC, BOOL, DATE, DATETIME, GEOGRAPHY, INT64, NUMERIC, RANGE, STRING, TIMESTAMP
  • Clustering DOES NOT allows to determine query costs before a query runs (not the case for Partitioned tables are tables divided into segments)
  • Important: You cannot optimize storage with clusters.

References

Introduction to clustered tables | BigQuery. [Online]. Available at: https://cloud.google.com/bigquery/docs/clustered-tables [Accessed 15 May 2025]. Create and use clustered tables | BigQuery. [Online]. Available at: https://cloud.google.com/bigquery/docs/creating-clustered-tables [Accessed 15 May 2025].