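We assume the reader already has some familiarity with Gremlin. A Traversal is then simply a Gremlin statement assembled by chaining Java method calls. Here is how the official JanusGraph documentation uses a traversal: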
JanusGraph graph = JanusGraphFactory.open('conf/janusgraph-cql.properties')
GraphTraversalSource g = graph.traversal()
saturn = g.V().has('name', 'saturn').next()
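The traversal() method first returns a GraphTraversalSource. This type can run graph-level searches, that is, it searches the whole graph by condition and returns a traversal. Any number of further steps can then be chained on, and the final result is returned only once next() is called. Such a method is called a terminal step. Nine calls play this role, listed here: they end the process of adding steps to the traversal and start the actual graph query using the conditions set up so far.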
g.V().out('created').hasNext()
g.V().out('created').next()
g.V().out('created').next(2)
g.V().out('nothing').tryNext()
g.V().out('created').toList()
g.V().out('created').toSet()
g.V().out('created').toBulkSet()
results = ['blah',3]
g.V().out('created').fill(results)
g.addV('person').iterate()
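The lazy behaviour of terminal steps can be illustrated without JanusGraph at all. The following is a minimal sketch in plain Java, assuming a hypothetical LazyTraversal class (it is not part of TinkerPop or JanusGraph): chaining a step only records the condition, and nothing is evaluated until a terminal step such as hasNext(), next() or toList() asks for results.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical toy model, NOT the TinkerPop API: steps are recorded lazily
// and only a terminal step (hasNext/next/toList) triggers evaluation.
final class LazyTraversal<T> {
    private final Iterable<T> source;                     // stands in for "the whole graph"
    private final List<Predicate<T>> filters = new ArrayList<>();
    private Iterator<T> results;                          // built on the first terminal call
    int evaluations = 0;                                  // how many elements have been tested

    LazyTraversal(Iterable<T> source) {
        this.source = source;
    }

    // Intermediate step: remembers the condition, runs nothing.
    LazyTraversal<T> has(Predicate<T> condition) {
        filters.add(condition);
        return this;
    }

    // Runs the recorded conditions the first time a terminal step needs results.
    private Iterator<T> results() {
        if (results == null) {
            List<T> matched = new ArrayList<>();
            for (T element : source) {
                evaluations++;
                if (filters.stream().allMatch(f -> f.test(element))) {
                    matched.add(element);
                }
            }
            results = matched.iterator();
        }
        return results;
    }

    // Terminal steps.
    boolean hasNext() { return results().hasNext(); }

    T next() { return results().next(); }

    List<T> toList() {
        List<T> out = new ArrayList<>();
        results().forEachRemaining(out::add);
        return out;
    }
}

final class LazyTraversalDemo {
    public static void main(String[] args) {
        LazyTraversal<String> t = new LazyTraversal<>(List.of("saturn", "jupiter", "sky"))
                .has(name -> name.startsWith("s"));
        System.out.println(t.evaluations);  // still zero: no terminal step has run
        System.out.println(t.toList());     // the terminal step runs the query
    }
}
```

The evaluations counter exists only to make the laziness observable; a real traversal also streams results instead of collecting them up front.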
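In this section we focus on how the GraphTraversalSource class works during a global search.
I. Terminal Step Code Walkthrough
(I) Graph-centric
We follow the toList method all the way down to the point where it talks to HBase. I split this path into five parts.
(1) Get the last step and call its next method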
// Traversal step
g.V().toList()
Traversal.toList ==> this.fill();
Traversal.fill ==> endStep.next();
(2) Call executeGraphCentricQuery in the JanusGraphStep class; as the name suggests, this is a graph-centric query
// Step step
AbstractStep.next ==> this.processNextStart();
GraphStep.processNextStart ==> this.iteratorSupplier.get();
JanusGraphStep.setIteratorSupplier ===> executeGraphCentricQuery()
JanusGraphStep.executeGraphCentricQuery ===> builder.iterables().iterator();
(3) For now, think of this stage as the one that produces the iterator
// Transaction step
QueryProcessor.iterator() ===> new ResultSetIterator()
QueryProcessor.getNewIterator ===> return executor.execute();
StandardJanusGraphTx.execute ===> return (Iterator) getVertices().iterator();
Iterables.filter ===> return Iterators.filter(unfiltered.iterator(), predicate);
VertexIterable.iterator ===> final RecordIterator<Long> iterator = graph.getVertexIDs(tx.getTxHandle());
(4) Get the vertex IDs through StandardJanusGraph, returned as an iterator
// Graph step
StandardJanusGraph.getVertexIDs ===> tx.edgeStoreKeys(vertexExistenceQuery);
(5) Query HBase and return the vertex IDs as an iterator
// Backend step
BackendTransaction.edgeStoreKeys ===> return executeRead(new Callable() { public KeyIterator call() throws Exception { return edgeStore.getKeys(new KeyRangeQuery()); } });
BackendTransaction.edgeStoreKeys ===> return store.getKeys()
HBaseKeyColumnValueStore.getKeys ===> return executeKeySliceQuery()
HBaseKeyColumnValueStore.executeKeySliceQuery ===> return new RowIterator()
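The five layers above share one pattern: each layer hands an iterator up to the layer above it, and nothing touches the backend until the terminal step pulls the first element. The following is a rough sketch of that pattern in plain Java. ToyBackend, ToyGraphStep and GraphCentricSketch are hypothetical names invented for illustration, and the sketch collapses the transaction and graph layers into a single supplier.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

// Hypothetical stand-in for the storage backend: hands out vertex IDs as an iterator.
final class ToyBackend {
    private final List<Long> vertexIds;
    int scans = 0;                                   // how many times the store was scanned

    ToyBackend(List<Long> vertexIds) {
        this.vertexIds = vertexIds;
    }

    Iterator<Long> getKeys() {
        scans++;
        return vertexIds.iterator();
    }
}

// Hypothetical stand-in for the graph step: it keeps a supplier and only asks
// it for an iterator when the first element is requested.
final class ToyGraphStep {
    private final Supplier<Iterator<Long>> iteratorSupplier;
    private Iterator<Long> iterator;

    ToyGraphStep(Supplier<Iterator<Long>> iteratorSupplier) {
        this.iteratorSupplier = iteratorSupplier;
    }

    boolean hasNext() {
        if (iterator == null) {
            iterator = iteratorSupplier.get();       // first contact with the backend
        }
        return iterator.hasNext();
    }

    Long next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return iterator.next();
    }
}

final class GraphCentricSketch {
    // Mirrors toList ==> fill ==> endStep.next: drain the end step into a list.
    static List<Long> toList(ToyGraphStep endStep) {
        List<Long> out = new ArrayList<>();
        while (endStep.hasNext()) {
            out.add(endStep.next());
        }
        return out;
    }

    public static void main(String[] args) {
        ToyBackend backend = new ToyBackend(List.of(4L, 8L, 15L));
        ToyGraphStep step = new ToyGraphStep(backend::getKeys);
        System.out.println(backend.scans);   // the backend has not been scanned yet
        System.out.println(toList(step));    // draining the step triggers the scan
    }
}
```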
(II) Vertex-centric
(1)
// Traversal step
g.V().toList()
Traversal.toList ==> this.fill();
Traversal.fill ==> endStep.next();
(2)
// Step step
AbstractStep.next ==> this.processNextStart();
JanusGraphVertexStep.processNextStart
FlatMapStep.processNextStart ===> this.flatMap(this.head);
JanusGraphVertexStep.flatMap ==> query.vertices()
(3)
VertexCentricQueryBuilder.vertices ===> execute()
VertexCentricQueryBuilder.execute ===> return resultConstructor.getResult(vertex,bq);
VertexConstructor.getResult ===> return executeVertices(v,bq);
BasicVertexCentricQueryBuilder.executeVertices ===> return executeIndividualVertices(vertex,baseQuery);
SimpleVertexQueryProcessor.vertexIds ===> Iterables.transform(
CacheVertex.loadRelations ===> lookup.get(query)
SimpleVertexQueryProcessor.getBasicIterator ===> tx.getGraph().edgeQuery()
(4)
// Graph
StandardJanusGraph.edgeQuery ===> tx.edgeStoreQuery()
(5)
// Backend
BackendTransaction.edgeStoreQuery ===> call
KCVSProxy.getSlice ===> store.getSlice(query, unwrapTx(txh))
HBaseKeyColumnValueStore.getSlice ===> getHelper()
HBaseKeyColumnValueStore.getHelper ===> table.get(requests);
(III) Comparison
1.
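Both paths go through fill ===> AbstractStep.next. They differ in the step class involved:
The graph-centric query uses the inheritance chain AbstractStep ===> GraphStep ===> JanusGraphStep
The vertex-centric query uses the inheritance chain AbstractStep ===> FlatMapStep ===> VertexStep ===> JanusGraphVertexStep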
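The two inheritance chains can be pictured with a small template-method sketch. It is a simplified model in plain Java rather than the real classes: SketchStep, SketchGraphStep and SketchFlatMapStep are hypothetical names, and the adjacency map is an assumption that stands in for the per-vertex edge lookup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

// Hypothetical simplified base class: next() always goes through processNextStart().
abstract class SketchStep<E> {
    final E next() {
        return processNextStart();
    }

    protected abstract E processNextStart();
}

// Graph-centric flavour: elements come from a supplier that scans the whole graph.
class SketchGraphStep<E> extends SketchStep<E> {
    private final Supplier<Iterator<E>> iteratorSupplier;
    private Iterator<E> iterator;

    SketchGraphStep(Supplier<Iterator<E>> iteratorSupplier) {
        this.iteratorSupplier = iteratorSupplier;
    }

    @Override
    protected E processNextStart() {
        if (iterator == null) {
            iterator = iteratorSupplier.get();
        }
        if (!iterator.hasNext()) {
            throw new NoSuchElementException();
        }
        return iterator.next();
    }
}

// Vertex-centric flavour: each element of the previous step is expanded via flatMap.
class SketchFlatMapStep<S, E> extends SketchStep<E> {
    private final SketchStep<S> previous;
    private final Map<S, List<E>> adjacency;     // assumed stand-in for a per-vertex lookup
    private Iterator<E> current = Collections.emptyIterator();

    SketchFlatMapStep(SketchStep<S> previous, Map<S, List<E>> adjacency) {
        this.previous = previous;
        this.adjacency = adjacency;
    }

    protected Iterator<E> flatMap(S head) {
        return adjacency.getOrDefault(head, List.of()).iterator();
    }

    @Override
    protected E processNextStart() {
        // Pull heads from the previous step until one of them yields a neighbour.
        // When the previous step is exhausted its exception propagates upward.
        while (!current.hasNext()) {
            current = flatMap(previous.next());
        }
        return current.next();
    }
}

final class StepHierarchySketch {
    // Drains a step until it signals exhaustion.
    static <E> List<E> drain(SketchStep<E> endStep) {
        List<E> out = new ArrayList<>();
        try {
            while (true) {
                out.add(endStep.next());
            }
        } catch (NoSuchElementException exhausted) {
            return out;
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> created =
                Map.of("marko", List.of("lop"), "josh", List.of("ripple", "lop"));
        SketchStep<String> v = new SketchGraphStep<>(() -> List.of("marko", "josh").iterator());
        SketchStep<String> out = new SketchFlatMapStep<>(v, created);
        System.out.println(drain(out));
    }
}
```

In this model the graph step is the only place that scans everything, while the flat-map step performs one lookup per incoming vertex, which is the distinction the two call chains above trace down to getKeys and getSlice respectively.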
2.
The two implementation classes differ in how they override processNextStart, which next calls:
graph centric:
GraphStep defines that the iterator we return comes from this.iteratorSupplier.get(). The get method is shown below; within JanusGraphStep it can be described as the following chain:
Map<List<HasContainer>, QueryInfo> hasLocalContainers
===> Multimap
===> final List
===> MultiDistinctOrderedIterator
hasLocalContainers.entrySet().forEach(c -> queries.put(c.getValue().getLowLimit(), buildGraphCentricQuery(tx, c)));
queries.entries().forEach(q -> executeGraphCentricQuery(builder, responses, q));
return new MultiDistinctOrderedIterator(lowLimit, highLimit, responses, orders);
The key call is executeGraphCentricQuery(builder, responses, q), where q is an entry of queries whose value is a GraphCentricQuery.
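To make the shape of this supplier easier to picture, here is a toy model of its last stage: several per-query result iterators are merged into one de-duplicated result that respects a low and a high limit. It is a sketch under stated assumptions, not the real MultiDistinctOrderedIterator: mergeDistinct is a hypothetical helper, it ignores the orders argument, it is eager rather than lazy, and the half-open [lowLimit, highLimit) window is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class MergeResponsesSketch {
    // Hypothetical helper: concatenates the per-query iterators, drops duplicates,
    // then keeps the elements whose position falls inside [lowLimit, highLimit).
    static <E> List<E> mergeDistinct(int lowLimit, int highLimit, List<Iterator<E>> responses) {
        Set<E> seen = new LinkedHashSet<>();          // preserves first-seen order
        for (Iterator<E> response : responses) {
            response.forEachRemaining(seen::add);
        }
        List<E> out = new ArrayList<>();
        int position = 0;
        for (E element : seen) {
            if (position >= lowLimit && position < highLimit) {
                out.add(element);
            }
            position++;
        }
        return out;
    }

    public static void main(String[] args) {
        // Two sub-queries whose results overlap on "v2".
        List<Iterator<String>> responses = new ArrayList<>();
        responses.add(List.of("v1", "v2").iterator());
        responses.add(List.of("v2", "v3").iterator());
        System.out.println(mergeDistinct(0, 2, responses));
    }
}
```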
vertex centric:



