-
Notifications
You must be signed in to change notification settings - Fork 238
Depth First vs. Breadth First
[ Note: Image linked to from original webpage ]
Depth-first traverse traverses down an entire path as specified by a path expression before turning to the next legal path. On the other hand breadth-first traverse traverses legal paths “in parallel,” where at each step, all legal steps are computed before moving onto the next step of the path. Gremlin provides support for both types of traversals.
Gremlin is naturally a depth-first traversal language/engine. Typical path expression (as used in this documentation) evaluate in a depth-first manner. For example, given the “set up” code:
gremlin> $_g := tg:open()
==>tinkergraph[vertices:0]
gremlin> g:load('data/graph-example-1.xml')
==>true
gremlin> $_ := g:id-v(1)
==>v[1]
The following “co-created” expression is a depth-first expression.
gremlin> ./outE[@label='created']/inV/inE[@label='created']/outV
==>v[1]
==>v[4]
==>v[6]
In the expression above, vertex 1
is passed to step outE
which yields three outgoing edges (see Defining a Property Graph for diagram of graph). The first of these outgoing edges is then passed into the label filter. It is a “created” edge, so it then goes to the next step inV
which produces one vertex (vertex 3
). That one vertex is then passed to inE
which yields three edges. The first edge is passed to the label filter. It is a “created” edge so it is passed to outV
which yields vertex 1
. At which point, the path expression is complete for the full depth of the expression. Thus, v[1]
is returned first by the evaluator. The evaluator then goes onto the remaining two edges created from the first outE
and the process continues until all legal branches of the path expression have been evaluated.
In a breadth-first traversal, all results for each step of the process are gathered before the next step is evaluated. The equivalent breadth-first traversal for the previous “co-created” expression is as follows.
./outE[@label='created']/gather[0]/inV/gather[0]/inE[@label='created']/gather[0]/outV/gather[0]
The gather
step is used to aggregate all the results of the previous step into a list. Given that gather
exhausts all the objects of the previous step and emits a list, [0]
is used to unroll that list and pass it to the next step.
With gather
, the generated list can be analyzed by a step to determine some desired property that is a function of all current objects in the breadth of the path. For example.
gremlin> step print
g:print('processing: ' + .)
.
end
==>true
gremlin> ./outE/gather/print/inV
processing: [e[7][1-knows->2], e[9][1-created->3], e[8][1-knows->4]]
==>v[2]
==>v[3]
==>v[4]
Note that with breadth-first traverse being controlled by the gather
step, its possible to intermix the use of depth- and breath-first traversing in a single path expression. In other words, its not necessary to gather all results from the previous step at every step. In fact, its only necessary when all objects of a particular part of the path expression are needed in a computation.