概念

[HRU96]观察到数据立方体的可能物化集合形成一个格，并提出了选择一组好的物化集合的算法。Calcite的推荐算法源自此算法。

Calcite不检查这些约束。如果违反了这些约束，Calcite将返回错误的结果。

[“sales”, “unit_sales”] [“ship_date”, “time_id”] [“order_date”, “time_id”]

“瓦片”是格中的一个物化表，具有特定的维度。

演示

{
"version": "1.0",
"defaultSchema": "foodmart",
"schemas": [
{
"type": "jdbc",
"name": "foodmart",
"jdbcUser": "FOODMART",
"jdbcUrl": "jdbc:hsqldb:res:foodmart",
"jdbcSchema": "foodmart"
},
{
"lattices": [
{
"name": "star",
"sql": [
"select 1 from \"foodmart\".\"sales_fact_1997\" as \"s\"",
"join \"foodmart\".\"product\" as \"p\" using (\"product_id\")",
"join \"foodmart\".\"time_by_day\" as \"t\" using (\"time_id\")",
"join \"foodmart\".\"product_class\" as \"pc\" on \"p\".\"product_class_id\" = \"pc\".\"product_class_id\""
],
"auto": true,
"algorithm": true,
"rowCountEstimate": 86837,
"defaultMeasures": [
{
"agg": "count"
}
]
}
]
}
]
}


\$ sqlline version 1.3.0
sqlline> !connect jdbc:calcite:model=core/src/test/resources/hsqldb-foodmart-lattice-model.json "sa" ""


sqlline> select "the_year","the_month", count(*) as c
. . . .> from "sales_fact_1997"
. . . .> join "time_by_day" using ("time_id")
. . . .> group by "the_year","the_month";

sqlline> explain plan for
. . . .> select "the_year","the_month", count(*) as c
. . . .> from "sales_fact_1997"
. . . .> join "time_by_day" using ("time_id")
. . . .> group by "the_year","the_month";


sqlline> !describe "adhoc"."m{16, 17, 27, 31, 32, 36, 37}"

sqlline> select count(*) as c
. . . .> from "adhoc"."m{16, 17, 27, 31, 32, 36, 37}";


sqlline> !tables


格建议者 Lattice suggester

{
"version": "1.0",
"defaultSchema": "foodmart",
"schemas": [
{
"type": "jdbc",
"name": "foodmart",
"jdbcUser": "FOODMART",
"jdbcUrl": "jdbc:hsqldb:res:foodmart",
"jdbcSchema": "foodmart"
},
{
"autoLattice": true
}
]
}


进一步的方向

1. 构建瓦片的算法考虑了过去查询的日志。
2. 物化视图管理器查看传入的查询并为它们构建瓦片。
3. 物化视图管理器删除不活跃使用的瓦片。
4. 格建议者根据传入的查询添加格，将现有格的瓦片转移到新格，并删除不再使用的格。
5. 覆盖表的水平切片的瓦片；以及一种重写算法，可以通过拼接多个瓦片并访问原始数据来填补空洞来回答查询。
6. 提供API来在基础数据发生变化时使瓦片或瓦片的水平切片无效化。

参考资料

https://calcite.apache.org/docs/lattice.html