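neo4j apoc series
Neo4j GDS-01-graph-data-science: overview of the Graph Data Science plugin library
Neo4j GDS-02-graph-data-science: hands-on installation notes for the plugin library
Neo4j GDS-03-graph-data-science: a short talk about the Graph Data Science plugin library
Neo4j GDS-06-neo4j GDS library: introduction to community detection algorithms
Neo4j GDS-07-neo4j GDS library: implementing community detection algorithms
Neo4j GDS-08-neo4j GDS library: introduction to path finding algorithms
Neo4j GDS-09-neo4j GDS library: implementing path finding algorithms
Neo4j GDS-10-neo4j GDS library: introduction to similarity algorithms
Neo4j GDS-11-neo4j GDS library: implementing similarity algorithms
Neo4j GDS-12-neo4j GDS library: introduction to node embedding algorithms
Neo4j GDS-13-neo4j GDS library: implementing node embedding algorithms
Neo4j GDS-14-neo4j GDS library: introduction to link prediction algorithms
Neo4j GDS-15-neo4j GDS library: implementing link prediction algorithms
Neo4j GDS-16-neo4j GDS library: creating graph projections
Neo4j GDS-17-neo4j GDS library: creating graph projections for more complex scenarios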
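Hands-on test
Data initialization
Clearing all nodes
Note: this deletes all data and is here only for the demo. Do not run it casually against a real database!
MATCH (n) DETACH DELETE n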
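On anything larger than a demo database, deleting everything in a single transaction can exhaust memory. The following is a minimal sketch of a more cautious variant, assuming Neo4j 4.4 or later (where `CALL { ... } IN TRANSACTIONS` is available); the batch size of 10000 is an arbitrary choice for illustration.

```cypher
// Count first, so it is clear how much is about to be removed.
MATCH (n) RETURN count(n) AS nodesToDelete;

// Delete in batches of 10000 nodes, each batch in its own transaction.
// In Neo4j Browser this statement has to be prefixed with :auto,
// because it must run in an implicit transaction.
MATCH (n)
CALL {
  WITH n
  DETACH DELETE n
} IN TRANSACTIONS OF 10000 ROWS;
```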
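Node description
i_alarm: alarm record, with appName, ip, title and eventId properties. An i_alarm points to an i_app node by appName, and to an i_vm or i_phy node by ip.
i_app: application node, holding the application name (stored as name in the statements below and matched against the alarm's appName). An i_app points to an i_vm.
i_vm: virtual machine node, with an ip property. An i_vm points to an i_phy.
i_phy: physical machine node, with an ip property.
Base node initialization
Create 4 i_app nodes, each pointing to one of 4 i_vm nodes.
The i_vm nodes are grouped in pairs, and each pair points to one i_phy node.
The Cypher initialization statement: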
CREATE (phy1:i_phy {ip: '192.168.1.1'}),
(phy2:i_phy {ip: '192.168.1.2'}),
(vm1:i_vm {ip: '10.0.0.1'})-[:BELONGS_TO]->(phy1),
(vm2:i_vm {ip: '10.0.0.2'})-[:BELONGS_TO]->(phy1),
(vm3:i_vm {ip: '10.0.0.3'})-[:BELONGS_TO]->(phy2),
(vm4:i_vm {ip: '10.0.0.4'})-[:BELONGS_TO]->(phy2),
(app1:i_app {name: 'app1'})-[:POINTS_TO]->(vm1),
(app2:i_app {name: 'app2'})-[:POINTS_TO]->(vm2),
(app3:i_app {name: 'app3'})-[:POINTS_TO]->(vm3),
(app4:i_app {name: 'app4'})-[:POINTS_TO]->(vm4);
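A quick way to check the base topology after running the statement above is to walk the full app -> vm -> phy chain. This is only a verification sketch and relies on nothing beyond the labels, relationship types and properties created above.

```cypher
// List every application together with its VM and physical host.
MATCH (app:i_app)-[:POINTS_TO]->(vm:i_vm)-[:BELONGS_TO]->(phy:i_phy)
RETURN app.name AS app, vm.ip AS vm, phy.ip AS phy
ORDER BY app;
```

With the four applications created above, this should return four rows: app1 and app2 on 192.168.1.1, app3 and app4 on 192.168.1.2.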
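Alarms, part 1
Requirement:
Give the Cypher statements for the corresponding i_alarm nodes.
alarm1 points to app1 and vm1;
alarm2 points to app2 and vm2;
alarm3 points to phy1;
alarm4 points to app3 and vm3.
Statements
// Alarm1: points to app1 and vm1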
CREATE (a1:i_alarm {
eventId: 'alarm1',
appName: 'app1',
ip: '10.0.0.1',
title: 'Alarm 1'
})
WITH a1
MATCH (app1:i_app {name: 'app1'}), (vm1:i_vm {ip: '10.0.0.1'})
MERGE (a1)-[:ALARM_OF_APP]->(app1)
MERGE (a1)-[:ALARM_OF_VM]->(vm1);
// Alarm2: points to app2 and vm2
CREATE (a2:i_alarm {
eventId: 'alarm2',
appName: 'app2',
ip: '10.0.0.2',
title: 'Alarm 2'
})
WITH a2
MATCH (app2:i_app {name: 'app2'}), (vm2:i_vm {ip: '10.0.0.2'})
MERGE (a2)-[:ALARM_OF_APP]->(app2)
MERGE (a2)-[:ALARM_OF_VM]->(vm2);
// Alarm3: points only to phy1
CREATE (a3:i_alarm {
eventId: 'alarm3',
appName: null,
ip: '192.168.1.1',
title: 'Alarm 3'
})
WITH a3
MATCH (phy1:i_phy {ip: '192.168.1.1'})
MERGE (a3)-[:ALARM_OF_PHY]->(phy1);
// Alarm4: points to app3 and vm3
CREATE (a4:i_alarm {
eventId: 'alarm4',
appName: 'app3',
ip: '10.0.0.3',
title: 'Alarm 4'
})
WITH a4
MATCH (app3:i_app {name: 'app3'}), (vm3:i_vm {ip: '10.0.0.3'})
MERGE (a4)-[:ALARM_OF_APP]->(app3)
MERGE (a4)-[:ALARM_OF_VM]->(vm3);
Alarms, part 2
Add one more alarm that points only to an app.
// Add app5 -> vm5 -> phy1
MATCH (phy1:i_phy {ip: '192.168.1.1'})
CREATE (vm5:i_vm {ip: '10.0.0.5'})-[:BELONGS_TO]->(phy1),
(app5:i_app {name: 'app5'})-[:POINTS_TO]->(vm5);
// Alarm5: points only to app5
CREATE (a5:i_alarm {
eventId: 'alarm5',
appName: 'app5',
title: 'Alarm 5'
})
WITH a5
MATCH (app5:i_app {name: 'app5'})
MERGE (a5)-[:ALARM_OF_APP]->(app5);
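To confirm that every alarm is wired to the intended targets, the direct relationships can be listed as below. This is a verification sketch only; it uses the properties defined in the statements above (`eventId` on alarms, `name` on apps, `ip` on VMs and physical hosts).

```cypher
// One row per alarm relationship: which alarm, which relationship type, which target.
MATCH (a:i_alarm)-[r]->(t)
RETURN a.eventId AS alarm,
       type(r) AS relType,
       coalesce(t.name, t.ip) AS target
ORDER BY alarm, relType;
```

For the five alarms above this should give eight rows, with alarm3 and alarm5 appearing once each.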
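Querying everything related to the alarms with APOC
MATCH (start:i_alarm)
CALL apoc.path.expandConfig(start, {
  relationshipFilter: '>', // follow outgoing relationships only
  labelFilter: '/i_phy|/i_vm|/i_app', // optional: '/' marks termination labels, so a path ends at the first node carrying one of them; set to null to drop the filter
  minLevel: 1, // minimum depth
  maxLevel: 1, // maximum depth
  uniqueness: 'NODE_GLOBAL' // each node is visited at most once per expansion
}) YIELD path
RETURN DISTINCT nodes(path) AS nodes, relationships(path) AS rels;
There is a problem, though: alarm5 is related only to app5, so the relationships from app5 onward to other nodes, such as i_phy, are cut off.
How can the nodes that should be related be connected? This can be done in several steps with apoc + gds.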
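The cut-off is easy to see by following alarm5 a few hops further with a plain variable-length pattern. This is a small illustrative query, not part of the original solution; the upper bound of 3 hops is chosen because the chain alarm -> app -> vm -> phy has exactly that length.

```cypher
// Follow outgoing relationships from alarm5 for up to three hops.
MATCH path = (a:i_alarm {eventId: 'alarm5'})-[*1..3]->(n)
RETURN length(path) AS hops,
       [x IN nodes(path) | coalesce(x.eventId, x.name, x.ip)] AS chain
ORDER BY hops;
```

The one-hop row is all that a query limited to depth 1 sees; the two-hop and three-hop rows show the vm5 and phy1 nodes that are missing from its result.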
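A different approach
Idea
Let's try a different approach and first query all the nodes directly related to the alarms:
MATCH (start:i_alarm)
CALL apoc.path.expandConfig(start, {
  relationshipFilter: '>', // follow outgoing relationships only
  labelFilter: '/i_phy|/i_vm|/i_app', // optional: '/' marks termination labels, so a path ends at the first node carrying one of them; set to null to drop the filter
  minLevel: 1, // minimum depth
  maxLevel: 1, // maximum depth
  uniqueness: 'NODE_GLOBAL' // each node is visited at most once per expansion
}) YIELD path
RETURN DISTINCT nodes(path) AS nodes, relationships(path) AS rels;
Then, starting from the nodes found this way, build the shortest-path relationships between them. Take alarm5->app5->i_vm->i_phy as an example.
In that case, return alarm5->app5->i_phy directly.
The result is a subgraph whose relationships are complete and from which unrelated nodes have been removed. Give an implementation, step by step.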
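Understood, you want a different approach:
- Step one: find the nodes directly related to each alarm
- Step two: build shortest paths between these nodes (skipping unrelated intermediate nodes) to get a "compressed subgraph"
- Step three: return the nodes plus the logical relationships, for visualization
Below is a step-by-step implementation plan.
Step 1: query the nodes directly related to each alarm
MATCH (start:i_alarm)
CALL apoc.path.expandConfig(start, {
  relationshipFilter: '>', // follow outgoing relationships only
  labelFilter: '/i_phy|/i_vm|/i_app', // keep only these kinds of nodes
  minLevel: 1,
  maxLevel: 1, // direct neighbours only
  uniqueness: 'NODE_GLOBAL'
}) YIELD path
WITH start, last(nodes(path)) AS node // the neighbour at the end of each one-hop path, so the alarm itself is not listed
RETURN DISTINCT start AS alarm, collect(DISTINCT node) AS directNodes;
- This gives the set of nodes directly related to each alarm
- For example alarm5 -> app5, alarm1 -> app1, and so on
Step 2: build the set of alarm-related nodes
- Put all alarm-related nodes into one set, relatedNodes
- It is used to build the subgraph in the next step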
MATCH (start:i_alarm)
CALL apoc.path.expandConfig(start, {
relationshipFilter: '>',
labelFilter: '/i_phy|/i_vm|/i_app',
minLevel: 1,
maxLevel: 1,
uniqueness: 'NODE_GLOBAL'
}) YIELD path
UNWIND nodes(path) AS node
WITH collect(DISTINCT node) + collect(DISTINCT start) AS relatedNodes
RETURN relatedNodes;
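- Note that the alarm itself must also be in the set; nodes(path) already contains the start node, so alarms with at least one outgoing relationship are included even without the extra collect(DISTINCT start)
Step 3: build shortest paths between these nodes
- Iterate over alarm -> directNodes
- Use apoc.algo.dijkstra or shortestPath to find the shortest path
- Intended: skip unrelated nodes (only pass through nodes in relatedNodes)
// $relatedNodes is the list produced in Step 2, passed in as a query parameter
WITH $relatedNodes AS nodes
UNWIND [n IN nodes WHERE 'i_alarm' IN labels(n)] AS alarmNode
UNWIND [n IN nodes WHERE 'i_app' IN labels(n) OR 'i_phy' IN labels(n)] AS targetNode
CALL apoc.algo.dijkstra(alarmNode, targetNode, '>', 'weight') YIELD path, weight
RETURN DISTINCT nodes(path) AS nodes, relationships(path) AS rels;
- The sample relationships carry no weight property; apoc.algo.dijkstra accepts a default weight as its fifth argument, so 1 can be supplied there, or the returned weight can simply be ignored
- The intent is to pass only through nodes in relatedNodes and to skip unrelated intermediate nodes (for example vm5); as written, though, dijkstra follows the real relationships and takes no node list, so vm5 still appears on the path from alarm5 to phy1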
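Step 3 names the built-in shortestPath as an alternative to apoc.algo.dijkstra. A minimal sketch of that alternative for a single alarm and a single target is shown below; the pair alarm5 / phy1 and the upper bound of 5 hops are assumptions made for illustration, and every relationship counts as one hop.

```cypher
// Shortest directed path from alarm5 to phy1, at most 5 hops long.
MATCH (a:i_alarm {eventId: 'alarm5'}), (p:i_phy {ip: '192.168.1.1'})
MATCH path = shortestPath((a)-[*..5]->(p))
RETURN [n IN nodes(path) | coalesce(n.eventId, n.name, n.ip)] AS chain,
       length(path) AS hops;
```

Like the dijkstra call, this returns the real chain through vm5 (10.0.0.5); it does not compress the path by itself.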
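Step 4: combine everything into the complete subgraph
- Merge nodes(path) and relationships(path) from all the shortest paths
- The result is the complete compressed alarm subgraph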
MATCH (start:i_alarm)
CALL apoc.path.expandConfig(start, {
relationshipFilter: '>',
labelFilter: '/i_phy|/i_vm|/i_app',
minLevel: 1,
maxLevel: 1,
uniqueness: 'NODE_GLOBAL'
}) YIELD path
UNWIND nodes(path) AS node
WITH collect(DISTINCT node) + collect(DISTINCT start) AS relatedNodes
UNWIND [n IN relatedNodes WHERE 'i_alarm' IN labels(n)] AS alarmNode
UNWIND [n IN relatedNodes WHERE 'i_app' IN labels(n) OR 'i_phy' IN labels(n)] AS targetNode
CALL apoc.algo.dijkstra(alarmNode, targetNode, '>', 'weight') YIELD path, weight
RETURN DISTINCT nodes(path) AS nodes, relationships(path) AS rels;
Fine-tuning
UNWIND [n IN nodes WHERE 'i_app' IN labels(n) OR 'i_phy' IN labels(n)] AS targetNode
This is very close.
One small adjustment here: use every node that is not an i_alarm as targetNode.
That makes the query more generally applicable.
// Step 1: find the set of nodes directly related to the alarms
MATCH (start:i_alarm)
CALL apoc.path.expandConfig(start, {
  relationshipFilter: '>', // follow outgoing relationships only
  labelFilter: '/i_phy|/i_vm|/i_app', // keep only these nodes
  minLevel: 1,
  maxLevel: 1, // direct neighbours only
  uniqueness: 'NODE_GLOBAL'
}) YIELD path
UNWIND nodes(path) AS node
WITH collect(DISTINCT node) + collect(DISTINCT start) AS relatedNodes
// Step 2: for each alarm, build a path to every non-i_alarm node
UNWIND [n IN relatedNodes WHERE 'i_alarm' IN labels(n)] AS alarmNode
UNWIND [n IN relatedNodes WHERE NOT 'i_alarm' IN labels(n)] AS targetNode
CALL apoc.algo.dijkstra(alarmNode, targetNode, '>', 'weight') YIELD path, weight
// Step 3: return the subgraph
RETURN DISTINCT nodes(path) AS nodes, relationships(path) AS rels;
In practice this is still not ideal, because the shortest path still returns chains such as i_alarm->i_vm->i_phy.
On a large graph it also runs every alarm against every target node, and the query simply times out.
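One possible way to get the compressed edges described earlier (alarm5->app5->i_phy instead of alarm5->app5->i_vm->i_phy) is to connect two related nodes only when no other related node lies between them. The sketch below uses plain Cypher rather than dijkstra; the 5-hop bound and the `src`/`dst`/`hops` names are assumptions for illustration, and it has not been tuned for large graphs.

```cypher
// Related nodes = every alarm plus everything an alarm points to directly.
MATCH (a:i_alarm)-->(d)
WITH collect(DISTINCT a) + collect(DISTINCT d) AS related
UNWIND related AS src
// Walk outward from each related node, up to 5 hops.
MATCH p = (src)-[*1..5]->(dst)
WHERE dst IN related
  // Interior nodes of the path must all be unrelated,
  // otherwise a shorter logical edge already covers this stretch.
  AND none(n IN nodes(p)[1..-1] WHERE n IN related)
RETURN DISTINCT
       coalesce(src.eventId, src.name, src.ip) AS src,
       coalesce(dst.eventId, dst.name, dst.ip) AS dst,
       length(p) AS hops
ORDER BY src, dst;
```

Rows with hops = 1 are real relationships; rows with hops greater than 1 are logical edges that skip unrelated nodes. With the sample data, app5 -> 192.168.1.1 should come back with hops = 2, because vm5 is not related to any alarm.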
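v2: performance optimization (this one is not quite right)
The ask here was to start dijkstra from i_app instead, since an alarm and its app are already directly related, and to exclude alarm and app nodes as end nodes.
Step 1: find all nodes directly related to the alarm
Step 2: start from the alarm, go at most 5 hops, and pass only through the nodes found in Step 1
MATCH (alarm:i_alarm)
CALL {
  WITH alarm
  // Step 1: find the nodes directly related to this alarm
  CALL apoc.path.expandConfig(alarm, {
    relationshipFilter: '>', // follow outgoing relationships only
    labelFilter: '/i_phy|/i_vm|/i_app', // keep only the relevant nodes
    minLevel: 1,
    maxLevel: 1,
    uniqueness: 'NODE_GLOBAL'
  }) YIELD path
  UNWIND nodes(path) AS node
  RETURN collect(DISTINCT node) AS directNodes
}
// Step 2: expand again from the alarm, staying on the Step 1 nodes
CALL apoc.path.expandConfig(alarm, {
  relationshipFilter: '>',
  minLevel: 1,
  maxLevel: 5, // at most 5 hops
  // The original used filterNodes with a lambda-style string, which is not an expandConfig option.
  // whitelistNodes restricts traversal to the given nodes; newer APOC releases call it allowlistNodes.
  whitelistNodes: directNodes,
  uniqueness: 'NODE_GLOBAL'
}) YIELD path
WHERE last(nodes(path)) IN directNodes // keep only paths that end on a node in directNodes
RETURN DISTINCT nodes(path) AS nodes,
       relationships(path) AS rels;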
Procedure list
Procedures
These may differ between versions
CALL gds.list()
YIELD name
WHERE name CONTAINS 'ortest'
RETURN name
ORDER BY name;
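Once the shortest-path procedures show up in that list, the GDS side of the "apoc + gds" idea could look roughly like the sketch below. It assumes GDS 2.x procedure names (`gds.graph.project`, `gds.shortestPath.dijkstra.stream`); older releases use different names, so check the output of `gds.list()` first. The graph name 'alarmGraph' and the alarm5 / phy1 pair are hypothetical choices for this example.

```cypher
// Project the four labels and every relationship type into an in-memory graph.
CALL gds.graph.project(
  'alarmGraph',
  ['i_alarm', 'i_app', 'i_vm', 'i_phy'],
  '*'
);

// Unweighted shortest path from alarm5 to phy1 on the projection.
MATCH (source:i_alarm {eventId: 'alarm5'}), (target:i_phy {ip: '192.168.1.1'})
CALL gds.shortestPath.dijkstra.stream('alarmGraph', {
  sourceNode: source,
  targetNode: target
})
YIELD totalCost, nodeIds
RETURN totalCost,
       [id IN nodeIds | coalesce(gds.util.asNode(id).eventId,
                                 gds.util.asNode(id).name,
                                 gds.util.asNode(id).ip)] AS chain;

// Release the in-memory projection when done.
CALL gds.graph.drop('alarmGraph');
```

As with the APOC version, the returned chain contains the real intermediate node vm5; compressing it remains a separate step.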
Graph screenshots
Base graph
Graph with alarm nodes
Graph after adding alarm5
Nodes directly related to the alarms
References
https://github.com/neo4j/graph-data-science
