P4: Network Analysis

Objectives

By the end of this practical you should be able to:

  1. create and retrieve information about objects based on their spatial relations in networks;
  2. perform specific network analysis tasks; and
  3. write simple PL/SQL code to execute SQL-based procedures and routines.
This practical should require two weeks to complete.

Overview

This practical will introduce you to the basics of spatial analysis of networks in SQL. We will use the spatial capabilities of Oracle 10g DBMS. Basic spatial operations in networks will be covered. 

Spatial networks are sets of nodes connected by edges, such as might be used to represent cities connected by roads or railways. In network models, the Euclidean distance between nodes does not have to equal the network distance. Edges (links) in the network may be directed or undirected. In addition to connectivity, networks in Oracle Spatial have geometric information associated with them (for instance, geographic coordinates of the nodes).

Some of the excercises will use the Procedural Language (PL) extension of the SQL language from Oracle. PL enables you to run more complex, structured programs with conditions, loops, etc. The basic structure of a PL/SQL block is:
DECLARE
/*optional */
/*this is a comment */
BEGIN
/*core section of the block, compulsory*/
EXCEPTION
/*Error handling,... Optional*/
END;
.
run;         /* this runs the program*/

Exercise P4.1: Network data entry

Spatial networks are sets of nodes connected by edges, such as cities connected by roads or railways. In network models, the Euclidean distance between nodes is not necessarily equal to the network distance. In this exercise, we will create a simple network, fill it with data, and perform some basic analysis.
  1. Network creation: There are several ways to create a network in Oracle. We will use the automatic way, which lets Oracle create all the necessary table structures with the procedure SDO_NET.CREATE_SDO_NETWORK. The parameters to this procedure provide the name of the network ("tutorial"), the number of hierarchical levels (1), whether the graph is directed (true), and whether costs are assigned to nodes (false). Use the following PL procedure to create the empty tutorial network:
    exec SDO_NET.CREATE_SDO_NETWORK ('tutorial',1,true,false)
    This automated procedure also creates all the necessary entries in the network metadata tables. It does not, however, create entries in the spatial metadata tables in case of a spatial network.
  2. Get an overview of the available data structures: Verify that the following tables are present and describe their structure: TUTORIAL_NODE$, TUTORIAL_LINK$, TUTORIAL_PATH$ and TUTORIAL_PLINK$. Detailed description of some of the columns can be found by referring to the lecture notes and to other documentation (e.g., Kothuri et al. 2004). The tables containing "paths" and "plinks" will not be used for the timebeing, so can be ignored. Validate the network created through the following command:
    SELECT SDO_NET.VALIDATE_NETWORK ('tutorial') FROM DUAL;
  3. The only errors should be those indication missing spatial ("geom") metadata. Note that "DUAL" is just a dummy table necessary to make the SELECT statement run.
  4. Input of network data: We will fill the network "tutorial" with the network provided in the figure on the right. For convenience, where two links connect a pair of nodes in opposite directions, one link is preceded with a negative sign (e.g., L6 connects N8 to N7; -L6 connects N7 to N8). The costs of the links are shown in the figure (all links in opposite directions connecting the same nodes have equal costs). No costs are assigned to nodes. The (non-geographical) coordinates of the nodes are indicated on the margins of the figure.
    1. Populate the node table: Use the following template to populate the table with node data (note that the SRID parameter is NULL):
      INSERT INTO tutorial_node$ (node_id, node_name, geometry) VALUES (1,'N1',SDO_GEOMETRY(2001,NULL, SDO_POINT_TYPE(0,2,NULL),NULL,NULL)
      );
      Remember to execute the COMMIT command at the end of all major operations to save your data. Because these commands are relatively repetitive and you are almost certain to make a mistake or two initiall, it is essential to construct your commands in a text editor (like notepad) using cut-and-paste. Then execute your commands using the "start" command (as in previous weeks). In case your get stuck and want to completely restart the creation of the network, drop the network with the command:
      exec SDO_NET.DROP_NETWORK('NETWORK_NAME');
      before trying to re-create it again. Sometimes it is necessary to also issue a "PURGE RECYCLEBIN" command before recreating the network.
    2. Populate the link table: Now populate the tutorial_link$ table with link data. You need to enter data for the columns link_id, link_name, start_node_id, end_node_id, cost, and geometry. Inputing links follows the template below. Note that the direction of edges is significant (e.g., an edge from N2 to N1 is different from an edge from N1 to N2): 
      INSERT INTO tutorial_link$ (link_id, link_name, start_node_id, end_node_id, cost, geometry) VALUES
      (1,'L1',2,1,1,
      SDO_GEOMETRY (2002,NULL,NULL,
      SDO_ELEM_INFO_ARRAY (1,2,1),            -- Straight line string element info
      SDO_ORDINATE_ARRAY (1,2,0,2)        -- Coodinates for link
      ));
    3. Verification of the network: Oracle offers several commands to verify the entry of the data in your network. You can retrieve the information about the nodes, links, non-connected nodes and invalid links. 
      SELECT SDO_NET.GET_NO_OF_NODES('network_name') FROM DUAL;
      Other commands include .GET_NO_OF_LINKS(network_name), .GET_ISOLATED_NODES(network_name), .GET_INVALID_LINKS(network_name), .GET_LINK_DIRECTION(network_name,link_id). Verify that your network has 9 nodes and 15 links, with no isolated nodes or invalid links.
    4. Finding information about individual elements of the network: Finally, it is possible to retrieve basic topological information about the elements of the network as such. This is important information for verifying that the network connectivity is correct. The basic measures are node degree (total number of links incident with a node). Node in/out-degree is the number of incoming and outgoing links from a specific node respectively (therefore for any node n, degree(n)=in_degree(n)+out_degree(n)). The basic commands all follow the same consistent pattern:
      SELECT SDO_NET.GET_NODE_DEGREE('network_name', node_id) FROM DUAL;
      The other commands are: .GET_NODE_IN_DEGREE, .GET_NODE_OUT_DEGREE, .GET_IN_LINKS, .GET_OUT_LINKS. Build an SQL statement to list all the nodes in numerical order along with their degree, in-degree, and out-degree. Make sure your output looks exactly like the following table. (Note: you can use the SQL*Plus command "COLUMN node_name FORMAT a9" to reformat the display of the node_name column so it fits neatly on your screen.)
      NODE_NAME
      DEGREE
      IN_DEGREE
      OUT_DEGREE
      --------------
      -------------- -------------- --------------
      N1
      2
      1
      1
      N2
      2
      1
      1
      N3
      3
      1
      2
      N4
      2
      1
      1
      N5
      3
      1
      2
      N6
      5
      3
      2
      N7
      4
      3
      1
      N8
      5
      2
      3
      N9
      4
      2
      2

Exercise P4.2: Network analysis using Network Editor

Once the network is created, we can start to perform network analyses. Oracle provides a Network Editor application which we shall use in this exercise to help familiarize yourself with the available network analyses. You won't be able to view the network you have just created, since we didn't give it the necessary spatial reference information and metadata. However, you will be able to build and analyse your own networks, and you should be aware that you can also use Network Editor to connect to Oracle and display existing spatial networks in the database.  
  1. Start Network Editor: On lab computers, the Network Editor can be started via the Windows start menu. Start the Network Editor. 
  2. Create a new network: Create a new SDO network from the "Edit", "Create new network..." menu item. Note that you can change a variety of parameters for your network. Call your network "mynet," and then press the "create" button. 
  3. Add nodes to the network: Add the nodes roughly as in the figure above. Each node can be created by selecting "Edit", "Add new node to network...", entering any additional information your node may require, and then pressing "Create".
  4. Add links to the network: Add the links to the network, again as in the figure above. Each link can be created by selecting "Edit", "Add new link to the network...". Note that you don't have to type in the IDs for every pair node which a link connects. Simply select the nodes where you want your new link to start and end in the editor window.
  5. Turn labels on: From the "view" menu turn the node and link labels on. Make sure you understand what you are looking at.
  6. Compute the shortest path: From the "Analysis" menu, you can access a variety of analyses. For example, compute the shortest path from node 4 to node 2. What is it's cost? Check with your neighbors? Do you all have the same routes/costs? If not, why not? What is the cost of the shortest path from node 2 to 4?
  7. Experiment with other network analyses: Use the Network Editor "Analysis" menu to experiment with some other network analyses. What do these different analyses do?

Exercise P4.3: Network analysis using PL/SQL

Network analysis is also possible using the Java API for Oracle (which the Network Editor uses) or in PL/SQL. Learning Java is beyond the focus of this course, so we will use simple PL/SQL scripts to analyse the network in a similar way to using the Network Editor. PL/SQL procedures are far too complex to type in directly to the command line. From this point onwards, if you are not already doing so, you must type your exercise answers into a text file, for example using notepad, and then run this file with the "start" command. Note that earlier statements which began with the "exec" keyword (like "exec SDO_NET.CREATE_SDO_NETWORK ('tutorial',1,true,false)") can still be used in a PL/SQL procedure by omitting the "exec" keyword (simply "SDO_NET.CREATE_SDO_NETWORK ('tutorial',1,true,false)"). You will need to use this fact to complete your assessment.
  1. Loading the network in a cache memory: To perform network analysis in PL/SQL, it is necessary to first load the network in a cache memory. This creates a network memory object. Loading the network can be performed using the following command (replace "network_name" with "tutorial"):
    BEGIN
       SDO_NET_MEM.NETWORK_MANAGER.READ_NETWORK('network_name','TRUE');
    END;
    .
    run;
    Note that every PL/SQL program is executed only after the two lines, the first containing only a single dot ".", the second containing the command "run;". It is also possible to use the character "/" in place of these two lines (although "/" behaves slightly differently in some circumstances). Note also, it is easy to forget the semicolon after "END;". If you do this your code will not execute properly. If the command executes successfully, you should see the lines of your code printed out, followed by the output: "PL/SQL procedure successfully completed." Any other messages will be errors. If you receive an error, double check you have correctly entered the procedure. Then ask a demonstrator for assistance. Also execute the following SQL command to ensure server system listings in the following excercises:
    SET SERVEROUTPUT ON FORMAT WRAP;
    Bear in mind that you can have only one network in memory at a time. To drop the memory cache at the end of network processing use the PL/SQL command:
    BEGIN
        SDO_NET_MEM.NETWORK_MANAGER.DROP_NETWORK('network_name');
    END;
    .
    run;
  2. Connectivity: It is possible that a network consists of several mutually disconnected sub-networks. To find out how many individual sub-networks the network contains, it is useful to get the number of connected components:
    DECLARE
        res_numeric NUMBER;
    BEGIN
        res_numeric := SDO_NET_MEM.NETWORK_MANAGER.FIND_CONNECTED_COMPONENTS('network_name');
        DBMS_OUTPUT.PUT_LINE('The number of connected components is: ' || res_numeric);
    END;
    .
    run;
    Note the declaration of variables and their types, as well as the special output line for outputting information to the system console in a formatted manner.
  3. Neighborhood analysis in networks: To retrieve the nearest neighbors of a node, follow the template below. Also find the nearest three nodes for node 4. 
    DECLARE
        res_array SDO_NUMBER_ARRAY;
        res_numeric NUMBER;
    BEGIN
        res_array := SDO_NET_MEM.NETWORK_MANAGER.NEAREST_NEIGHBORS('network_name',6,3);
        DBMS_OUTPUT.PUT_LINE('IDs for the nearest 3 neighbors of node 6 are: ');
        FOR indx IN res_array.FIRST..res_array.LAST
        LOOP
           res_numeric := res_array(indx);
           DBMS_OUTPUT.PUT(SDO_NET_MEM.PATH.GET_END_NODE_ID('network_name',res_numeric) || ',');
        END LOOP;
    END;
    .
    run;
  4. Shortest path: One of the most important network analysis operations is the shortest path. The shortest path between a source and destination node is the path with the lowest cost. More advanced shortest paths can take account of costs of turns and node traversal, but in our example only the costs of links are considered. The following code will retrieve the cost of the shortest path. Adapt it to find the cost of the path between nodes 4 and 2. Make sure that you the answer you get agrees with the answer you found in 4.2.6.
    DECLARE
        cost NUMBER;
        path_id NUMBER;
        res_numeric NUMBER;
    BEGIN
        path_id := sdo_net_mem.network_manager.shortest_path('network_name', source_node_id, dest_node_id);
        cost := SDO_NET_MEM.PATH.GET_COST('network_name', path_id);
        DBMS_OUTPUT.PUT_LINE('The ID of the shortest path from X to Y is: ' || path_id || ' and it costs ' || cost);
    END;
    .
    run;
  5. Further PL/SQL analysis: Given the following PL/SQL functions, and based on previous examples, you should be able to create a program that returns the names of all the links in the network, and their start and end nodes, in a "table" with columns LINK/START_NODE/END_NODE. The listing should be similar in structure to that in P4.3.4 above:
    SDO_NET_MEM.LINK.GET_NAME(
    net_mem IN VARCHAR2,
    link_id IN NUMBER
    ) RETURN VARCHAR2;

    SDO_NET_MEM.LINK.GET_START_NODE_ID(
    net_mem IN VARCHAR2,
    link_id IN NUMBER
    ) RETURN NUMBER;

    SDO_NET_MEM.LINK.GET_END_NODE_ID(
    net_mem IN VARCHAR2,
    link_id IN NUMBER
    ) RETURN NUMBER;
    HINT: Use the following template to determine all the link IDs in the network:
    var1_array := SDO_NET_MEM.PATH.GET_LINK_IDS('network_name', res_numeric);
    FOR indx1 IN var1_array.FIRST..var1_array.LAST
    LOOP
        var1_numeric := var1_array(indx1);
        DBMS_OUTPUT.PUT(var1_numeric || ' ');
    END LOOP;
  6. Reachable nodes: Write another program to retrieve all the nodes reachable from a specified node (5), using the template:
    res_array := SDO_NET_MEM.NETWORK_MANAGER.FIND_REACHABLE_NODES('tutorial',5);

Exercise P4.4: Printing output in PL/SQL

We've already seen in previous exercises the use of the DBMS_OUTPUT.PUT_LINE function to output the value of PL/SQL variables to the screen. However, in PL/SQL it's not possible to directly output the results of a SELECT statement to the screen. Thus, the following SQL code will produce an error:
BEGIN
  SELECT id, name FROM my_table;
END
.
run;
Instead, you can copy single values from a SELECT statement into a variable, and then print those out using the following template:
BEGIN
   SELECT count(id) INTO x FROM my_table;
   DBMS_OUTPUT.PUT_LINE(x);
END
.
run;
However, to print out all the values from a table of results returned from a SELECT statement, you need the following syntax which puts each line from the SELECT statement into the "rec" record variable, and then prints out elements from this record.
BEGIN
    FOR rec IN (SELECT id a, name b FROM my_table) 
    LOOP
        DBMS_OUTPUT.PUT_LINE(rec.a || ', ' || rec.b);
    END LOOP;
END
.
run;
Try printing out some more values inside your PL/SQL procedures. Repeat P4.1.3d, formatting the information about nodes and their degrees, using a PL/SQL procedure.

Assessment A4: Network Analysis

Assignment

Assessment figureThis assignment concerns the network shown in the figure on the right. Your assignment is to provide a script to answer the following 10 questions. It is essential that your entire script run correctly using the "start" command in SQL*Plus (e.g., "start <studentno>.txt"). If it does not run correctly, the markers will not debug your script and will give you zero marks for any questions that don't work. You should also check very carefully that your script creates network exactly as shown in the Figure. Note that the coordinates of each node are provided in parenthesis after the node name, e.g., "g (7,1)". The numbers associated with the links are the costs for those links. Note also that the links in this case do not have specified labels: you may call them any sensible names. Where two links go in both directions between the same pair of nodes, these links have the same costs (e.g., cost(ih)=cost(hi)=25).
  1. Create a new network called "assessment4" (see hint below).
  2. Create the nodes for this network, as shown in the figure.
  3. Create the links for this network, as shown in the figure. 
  4. Print out the number of nodes and links.
  5. Print out the names of every node, along with their degree, in degree, and out degree.
  6. Load the network into memory.
  7. Print out the number of connected components in the network.
  8. Print out the length of the shortest path from node b to g.
  9. Print out the names of the five nearest neighbor nodes from node g.
  10. Drop the network, deleting all tables associated with this network.

Hint

Note: for questions 1 and 10, a problem may arise because the SDO_CREATE_NETWORK and SDO_DROP_NETWORK functions can take some time to execute and are multi-threaded, meaning that subsequent statements may be executed before the network has been created, leading to errors. To avoid this problem, you should use the following general structure for your assessed assignment to guarantee that the network has been created before further statements are executed and deleted only after earlier statements have been executed.  
BEGIN
  /* Your answer to question 1 (create network) */
END;
.
run;
DECLARE
  /* Your variables */
BEGIN
  /* Your answers to questions 2-9 */
END;
.
run;
BEGIN
  /* Your answer to question 10 (drop network in memory and on the disk) */
END;
.
run;
Creative Commons License
This work is created by Matt Duckham (http://www.duckham.org) and licensed under a Creative Commons Attribution 3.0 United States License.