pgr_prim

pgr_prim — Returns the minimum spanning tree of graph using Prim algorithm. In particular, the Prim algorithm implemented by Boost.Graph.

_images/boost-inside.jpeg

Boost Graph Inside

Synopsis

The prim algorithm was developed in 1930 by Czech mathematician Vojtěch Jarník. It is a greedy algorithm that finds a minimum spanning tree for a weighted undirected graph. This means it finds a subset of the edges that forms a tree that includes every vertex, where the total weight of all the edges in the tree is minimized. The algorithm operates by building this tree one vertex at a time, from an arbitrary starting vertex, at each step adding the cheapest possible connection from the tree to another vertex.

This algorithms find the minimum spanning forest in a possibly disconnected graph; in contrast, the most basic form of Prim’s algorithm only finds minimum spanning trees in connected graphs. However, running Prim’s algorithm separately for each connected component of the graph, then it is called minimum spanning forest.

Characteristics

The main Characteristics are:

  • Process is done only on edges with positive costs.
  • It’s implementation is on only undirected graph.
  • Span start from chosen root_vertex resulting subgraph.
    • Default value of root_vertex is -1 then result is minimun spannig tree of disconnected graph.
  • Values are returned when there is a minimum spanning tree.
    • When there is no edge in graph then EMPTY SET is return.
  • Running time: \(O(E*log V)\)

Signatures

pgr_prim(edges_sql)
pgr_prim(edges_sql, root_vertex)

RETURNS SET OF (seq, prim_tree, start_node, end_node, edge, cost, agg_cost)
   or EMPTY SET

The signature is for a undirected graph.

Example:

Here root_vertex is -1. So, result is MST of disconnected graph.

SELECT * FROM pgr_prim(
    'SELECT id, source, target, cost, reverse_cost FROM edge_table'
);
 seq | prim_tree | start_node | end_node | edge | cost | agg_cost 
-----+-----------+------------+----------+------+------+----------
   1 |         1 |          1 |       -1 |   -1 |    0 |        0
   2 |         1 |          1 |        2 |    1 |    1 |        1
   3 |         1 |          2 |        3 |    2 |    1 |        2
   4 |         1 |          2 |        5 |    4 |    1 |        3
   5 |         1 |          3 |        6 |    5 |    1 |        4
   6 |         1 |          5 |       10 |   10 |    1 |        5
   7 |         1 |          6 |       11 |   11 |    1 |        6
   8 |         1 |         10 |       13 |   14 |    1 |        7
   9 |         1 |         11 |       12 |   13 |    1 |        8
  10 |         1 |          6 |        9 |    9 |    1 |        9
  11 |         1 |          5 |        8 |    7 |    1 |       10
  12 |         1 |          3 |        4 |    3 |    1 |       11
  13 |         1 |          8 |        7 |    6 |    1 |       12
  14 |         2 |         14 |       -1 |   -1 |    0 |        0
  15 |         2 |         14 |       15 |   17 |    1 |        1
  16 |         3 |         16 |       -1 |   -1 |    0 |        0
  17 |         3 |         16 |       17 |   18 |    1 |        1
(17 rows)

Example:Additional example

Results is subgraph which contain root_vertex.

SELECT * FROM pgr_prim(
    'SELECT id, source, target, cost, reverse_cost FROM edge_table', 4
);
 seq | prim_tree | start_node | end_node | edge | cost | agg_cost 
-----+-----------+------------+----------+------+------+----------
   1 |         1 |          4 |       -1 |   -1 |    0 |        0
   2 |         1 |          4 |        3 |    3 |    1 |        1
   3 |         1 |          4 |        9 |   16 |    1 |        2
   4 |         1 |          3 |        6 |    5 |    1 |        3
   5 |         1 |          9 |       12 |   15 |    1 |        4
   6 |         1 |          6 |       11 |   11 |    1 |        5
   7 |         1 |          6 |        5 |    8 |    1 |        6
   8 |         1 |         11 |       10 |   12 |    1 |        7
   9 |         1 |          5 |        8 |    7 |    1 |        8
  10 |         1 |         10 |       13 |   14 |    1 |        9
  11 |         1 |          8 |        7 |    6 |    1 |       10
  12 |         1 |          3 |        2 |    2 |    1 |       11
  13 |         1 |          2 |        1 |    1 |    1 |       12
(13 rows)

SELECT * FROM pgr_prim(
    'SELECT id, source, target, cost, reverse_cost FROM edge_table where id > 10', 9
);
 seq | prim_tree | start_node | end_node | edge | cost | agg_cost 
-----+-----------+------------+----------+------+------+----------
   1 |         1 |          9 |       -1 |   -1 |    0 |        0
   2 |         1 |          9 |       12 |   15 |    1 |        1
   3 |         1 |          9 |        4 |   16 |    1 |        2
   4 |         1 |         12 |       11 |   13 |    1 |        3
   5 |         1 |         11 |        6 |   11 |    1 |        4
   6 |         1 |         11 |       10 |   12 |    1 |        5
   7 |         1 |         10 |       13 |   14 |    1 |        6
(7 rows)

Description of the edges_sql query for prim functions

edges_sql:an SQL query, which should return a set of rows with the following columns:
Column Type Default Description
id ANY-INTEGER   Identifier of the edge.
source ANY-INTEGER   Identifier of the first end point vertex of the edge.
target ANY-INTEGER   Identifier of the second end point vertex of the edge.
cost ANY-NUMERICAL  

Weight of the edge (source, target)

  • When negative: edge (source, target) does not exist, therefore it’s not part of the graph.
reverse_cost ANY-NUMERICAL -1

Weight of the edge (target, source),

  • When negative: edge (target, source) does not exist, therefore it’s not part of the graph.

Where:

ANY-INTEGER:SMALLINT, INTEGER, BIGINT
ANY-NUMERICAL:SMALLINT, INTEGER, BIGINT, REAL, FLOAT

Description of the parameters of the signatures

Parameter Type Default Description
edges_sql TEXT   SQL query as described above.
root_vertex BIGINT -1

Root vertex from where spanning of tree start.

  • When root vertx is -1 then result is MST of disconnected graph.

Description of the return values for prim algorithms

Returns set of (seq, prim_tree, start_node, end_node, edge, cost, agg_cost)

Column Type Description
seq INT Sequential value starting from 1.
prim_tree BIGINT It represent subgraph in disconnected graph.
start_node BIGINT Identifier of the starting vertex of spanning tree.
end_node BIGINT Identifier of the ending vertex of spanning tree. -1 for the root vertex.
edge BIGINT Identifier of the edge used to go from start_node to end_node or vice-versa. -1 for the root vertex.
cost FLOAT Cost to traverse from start_node using edge to the end_node.
agg_cost FLOAT Aggregate cost of edges that is covered in spanning.

See Also

Indices and tables