P2: Using Oracle Spatial

Objectives

By the end of this practical you should be able to:

  1. understand Oracle spatial data types;
  2. retrieve information about objects based on their spatial properties; and
  3. create new spatial data.
This practical should require two weeks to complete.

Overview

This practical will introduce you to the basics of spatial analysis in SQL. We will use the spatial capabilities of Oracle 10g DBMS to perform basic spatial operations on the sample data from the US Census Bureau, the Digital Chart of the World (covered in Kothuri et al., 2004), and some data created by yourself.

Exercise P2.1: Viewing spatial data

Oracle Spatial has no spatial data viewing capabilities. One of the reasons why Oracle Spatial is such a powerful system is that unlike GIS software packages, Oracle Spatial treats the storage, querying, and management of spatial data as completely separate functions from viewing that data. However, when learning about Oracle spatial it is occasionally nice to be to display the data you are using graphically. There are many different ways that you can view Oracle Spatial data, including web mapping using Oracle Spatial MapViewer, and linking it to GIS software packages like GIS. However, in this section we use a data viewer called Spatial Console. Spatial Console is only available in University lab spaces. However, there is an unsupported, free data viewer you can download and use at home if you wish. Details are available on the course web site.   

  1. Prepare data to be viewed: Viewing spatial data in Spatial Console requires that the data is in your schema, has metadata, and has an index. We shall look at creating metadata and indexes later in this practical. However, for now simply follow the steps below to prepare the world_continents data from the "spatial" schema for viewing in Spatial Console. To view other data, you may simply replace "world_continents" in the instructions below with the name of another data set in the spatial schema. However, you have a limited disk quota and so some data sets may be too large to store. When you have finished, you should always delete any data you have copied using the "DROP TABLE" command.     
    1. Log into your account using SQL*Plus in the usual way
    2. Make a copy of world_continents in your own schema using the command "CREATE TABLE my_world_continents AS SELECT * FROM spatial.world_continents"
    3. Copy across the metadata for world_continents from spatial's schema using the command "INSERT INTO user_sdo_geom_metadata (SELECT 'my_world_continents', column_name, diminfo, srid FROM all_sdo_geom_metadata WHERE  table_name='WORLD_CONTINENTS' AND owner='SPATIAL')"
    4. Create an index for the data using the command "CREATE INDEX world_continents_index ON my_world_continents(geom) INDEXTYPE IS mdsys.spatial_index"
  2. Start Spatial Console: The Oracle Spatial data viewer is called Spatial Console and can be started using the Spatial Console menu item in Windows. 
  3. Set connection: To view any spatial data, Spatial Console first requires information about the connecting to an Oracle server. Using the connection wizard, or by selecting "Add" from the "Action" menu, must enter the details of the connection to the Oracle server, as used in Practical 1.
  4. Set data to view: You must also enter the data to display. Each data set to display is referred to as a "layer" in Spatial Console. With the "Layers" tab selected (right hand side of screen) "Add" a new layer from the "Action" menu. 
  5. Layer caching: For small data sets you can answer "No" to the question "Would you like to cache the layer?"
  6. Move around the map: The map should now be displayed. The icons at the top of the screen allow you to zoom and pan around the data. Try this.  
  7. Select continents: Spatial Console also has a select function where you can select objects and view the data associated with those objects. Work out how to achieve this and explore any other capabilities of spatial viewer you can find. 
We shall not use the data viewer as the basis for future exercises. But it is always there to allow you to graphically view, explore, and perform simple queries on data if you wish. A free, limited, trial version of the software may also be downloaded via the course web site in case you want to use it at home.

Exercise P2.2: Overview of spatial objects

This exercise will shows you how to get overview information about a database's spatial data contents, using the SQL*Plus instant client. You will retrieve information about the basic spatial properties of objects described by the database. The focus is on the geometry processing functions of Oracle. We will use the US data from the Digital Chart of the World dataset.
  1. Login to the course database server (in Oracle SQL*Plus instant client). Log in to the server using technique covered in the last practical.
  2. Get an overview of the available data: List all the data attributed to the owner "Spatial" (this was covered in the last practical). Verify if the following tables are present: world_continents, world_countries, us_cities, us_interstates, us_parks, us_rivers, us_states, us_restaurants. 
  3. Attributes of the table containing spatial data: Find out the name of the column containing attributes of the spatial data type for each of the tables listed above. Provide the answer in the form of a SELECT statement. HINT: The geometry columns in Oracle are of the type SDO_GEOMETRY.
  4. Information about spatial coordinates: We will find out some basic information about the coordinate systems used to store spatial objects. We will use the state of New York (us_states.state_abrv='NY') and other US states.
    1. Find out the type of the spatial object: one of the most important information we need about a specific spatial object is what type of object it is (point, line, polygon, ...). Find out what is the spatial data type of the state of New York? Use the following query template:
      SELECT a.geom.SDO_GTYPE geom FROM spatial.us_states a WHERE ...;
      Note the output of the geometry type. The format is D00T. 2003 - '2' identifies the number of dimension of the spatial object (1,2,3 or 4 if temporal), '3' identifies the type of geometry (1 - point, 2 - line, 6 - multiline, 3 - polygon, 7 - multipolygon,...). More information about the codes used for geometry types is contained in the lecture notes and in Kothuri et al., 2004.
      HINT: note the use of the alias a for the table us_states. You can use this technique in and SELECT statement to avoid typing long identifiers. Also note the output, where the selected information has a new assigned column name, GEOM. You can use this approach for customized system outputs.
    2. Find out the spatial reference system in which the geometry is defined: The SRID is a code specifying in what coordinate system the spatial objects are stored. Find out what coordinate system Georgia's coordinates are stored in.
      SELECT a.geom.SDO_SRID FROM spatial.us_states a WHERE ...;
    3. Identify the SRID used: the cryptic identifiers of the Oracle SRID can be translated into standard notation and vice versa. Find out the SRID code for the standard Australian Geodetic 1966 system used for the official mapping.
      Hint: use the query as a template:
      SELECT wktext FROM MDSYS.CS_SRS WHERE srid='8307';
      You can use the columns wktext or cs_name to identify the coordinate system. HINT: Use the SQL 'LIKE' operator, if you do not know the exact input string (you can use '%pattern%' if you do want to match only a pattern and not the exact string): 
      SELECT column_name1 FROM table_name WHERE column_name2 LIKE '%pattern%';
  5. Information about spatial properties: In addition to coordinates, we can find out a variety of spatial information for our spatial data.
    1. Compute areas: Find the area of the US state of NY. HINT: to work with the geometries in Oracle, you have to provide a spatial function and its parameter as arguments of the SELECT statement. The usual template is: SDO_FUNCTION (parameter1,parameter2,...). The area function looks like this:
      SELECT SDO_GEOM.SDO_AREA (geometry_column, tolerance, units) FROM ...
      The units parameter is optional. Set the tolerance to 1 (see below for information about tolerance). Provide the answer in both square yards and square meters in the same table ('unit=sq_yard' and 'unit=sq_meter'). Make sure you give the output columns sensible headings. Once you've got the area of New York, write a query to list the names of all US states, in descending order of area (largest first). What is the fourth largest state?
    2. Compute perimeters: The function returning length for lines also returns the perimeter for polygons. The function is SDO_GEOM.SDO_LENGTH(geometry_column, tolerance, units). Find out the perimeter of the state of New York. Provide the answer in meters and miles (in one query). Now write a query to list the names of all US states, in descending order of perimeter (longest first). What is the state with the fourth longest perimeter? 
    3. Determine the bounding box of a spatial object: Bounding boxes are an important concept in spatial data handling. They provide the extent of the object along the ordinates of the coordinate system (northernmost point of a continent, and so on...). The function SDO_MBR(geometry) returns the minimum bounding rectangle (MBR) of a geometry. Find the MBR of New York. You can also retrieve the individual values along each dimension (x=1, y=2, z=3; or lon, lat, height).
      SELECT SDO_GEOM.SDO_MIN_MBR_ORDINATE (geometry_column, dimension_no) FROM ...
      Retrieve the north-south extent of NY. Write a (single) SQL statement to give the minimum and maximum latitude for each state. Label the columns in your statement NAME, MAXLAT, and MINLAT. HINT: Assign aliases to the outputs of the SELECT statement
      SELECT (Function1 (parameters...)) alias1, (Function2 (parameters...)) alias2 from ...
The tolerance value (in P2.2.5a) is needed to specify the degree of precision used to compute the answer to several different spatial functions, including area. For spatial data using geodetic coordinates, a tolerance value of 1 means that only points further apart than 1m will be considered to be distinct points. Setting the tolerance too large will result in strange answers (where distinct points are "snapped" together). Setting the tolerance too small will decrease the speed and efficiency of the spatial operation. Happily, the gap between "too large" and "too small" is usually very generous. In most cases, any tolerance from a wide variety of values will result in efficient retrieval of the correct answer. Repeat P2.2.5a with different tolerance values to reassure yourself you get the same answer (try tolerances of 10, 100, 1000, 10000, 0.1, 0.001, 0.0001, ...). Some spatial operations are more sensitive to tolerance values than others. However, in this course it's safe to always select the default tolerance of "1".

Exercise P2.3: Spatial data storage

In this exercise, you will define and create your own table with sample spatial data. You will have to create a metadata entry for those tables: all spatial data in oracle needs metadata to describe its coordinate system. At the end of the lab, don't forget to remove the metadata entries, drop the table and indexes.
  1. Inserting spatial data: create a table "petrolstation" with spatial data in the column location. You will create a table containing the coordinates of three "petrol stations", with the following coordinates:
    Name (varchar2(8)) ID (NUMBER) SDO_GEOMETRY
    Lon Lat
    Agip 1 -77.01669 33.90124
    Shell 2 -77.017 33.90120
    Texaco 3 -77.018 33.9013
    HINT: Use the following template:
    CREATE TABLE petrolstation(
    name     VARCHAR2(8),
    ID                NUMBER,
    location        SDO_GEOMETRY
    );

    INSERT INTO petrolstation VALUES (
    ...., SDO_GEOMETRY(
    2001,     --SDO_GTYPE attribute specifying points of dimension 2
    8307,    --other parameters of SDO_GEOMETRY, set to 8307 for now
    SDO_POINT_TYPE   -- point type specification, coordinates follow
    (value1,                -- value of ordinate 1 (longitude)
    value2,                 -- value of ordinate 2 (latitude) 
    NULL                    -- 3rd ordinate, NULL if 2D object
    ),
    NULL,
    NULL
    )
    );
  2. Some of the codes and NULL values in the insert statement may seem a bit strange. These values are all required by oracle to parameterize the new geometry. The lectures will provide more information on which codes to use and exactly what they mean. However, the Oracle Spatial documentation and Kothuri et al. (2004) are definitive texts to refer to if in doubt. Note the "--" notation. You can use it to insert comments into longer SQL statements for your own orientation. Try to store the insert segment in a text document with the extension *.sql. You can load it using: 
    START path_to_file.sql
  3. Create metadata entry for the spatial table: Oracle spatial treats all the objects with a spatial column in one table as a spatial layer. If you insert metadata information about this layer in a metadata table, Oracle will be able to create spatial indexes, reproject data, and perform more advanced analysis on this objects. Metadata information in Oracle spatial covers the type of the layer (its dimensions), its geographic bounds, tolerance, and spatial reference system information.
    INSERT INTO user_sdo_geom_metadata VALUES(
    'PETROLSTATION', --table name
    'location',      -- name of spatial column
    SDO_DIM_ARRAY    -- DIMINFO attribute, stores dimension bounds and tolerance
    (
    SDO_DIM_ELEMENT
    (
    'LONGITUDE', -- dimension name for first dimension
    -180,        -- SDO_LB - lower bound of the dimension values
    180,         -- upper bound for the dimension values
    0.5          -- tolerance for values - 0.5 meters  
    ),
    SDO_DIM_ELEMENT
    (
    'LATITUDE', -- dimension name for second dimension
    -90,        -- SDO_LB - lower bound of the dimension values
    90,         -- upper bound for the dimension values
    0.5          -- tolerance for values - 0.5 meters  
    )
    ),
    8307        -- SRID value for WGS84
    );
    Verify the creation of the table by selecting its records. If you like, you should also be able to view your new spatial data using the JBO viewer.

Exercise P2.4: Spatial indexes 

Spatial indexes speed up the retrieval of spatial object based on their geometric properties, and enable faster execution of some spatial analysis functions. Many ingenious spatial indexes have been proposed, some of which we will examine in more detail later in the course. However, it is not necessary to understand how an index works in order to use it in Oracle. In this section we will create an index for our spatial data. Although you won't see and change to your data, creating an index is required for many spatial analyses. (If you don't believe this, you might like to try to skip ahead to P2.5.2 and see what happens!)
  1. Creation of spatial index: Create a spatial index on the location column of the petrolstation table using the following template:
    CREATE INDEX indexname ON tablename(spatial_column) INDEXTYPE IS MDSYS.SPATIAL_INDEX;
    It is recommended to always perform a:
    DROP INDEX indexname;
    statement for a specific index before creating a new one. Note that depending on the nature of the spatial index (Quadtree, R-tree, ...) after deletion/insertion of data in the table the index may need to be re-created. Also, different geometric layouts of the data can justify the use of different indexes. The default index in Oracle Spatial is the R-Tree, but other indexes are also possible. R-trees, quadtrees, and many other indexes will all be covered later in the course.
  2. List all indexes: Sometimes it is useful to know which indexes have already been created. The "user_indexes" table contains this information, as shown in the following query:
    SELECT index_name, table_name FROM user_indexes;

Exercise P2.5: Simple geometric analysis

In this section we perform some simple geometric analyses covered in the lectures. In the next practical we shall perform some more complex analyses.
  1. Querying distances: The distance between any geometry (point, line, or polygon) and point can be determined using the function SDO_GEOM.SDO_DISTANCE(geom1, geom2, tolerance, units). Note that in the case of the distance between a line or polygon and a point the procedure retrieves the minimum distance (see lecture notes for more information). For example, try the following query which will provide the distance in miles between two of the petrol stations. What are the distances between all pairs of the petrol stations?
    SELECT SDO_GEOM.SDO_DISTANCE(a.location, b.location, 1, 'unit=mile') FROM petrolstation a, petrolstation b WHERE a.name='Agip' and b.name='Texaco';
  2. Advanced distance: Select the names of the 10 closest restaurants from the restaurant POI_ID=33625120 (us_restaurants table). The general template for the distance clause is as for that above:
    SELECT SDO_GEOM.SDO_DISTANCE(query_objects, origin_object,tolerance) FROM ...;
    However, you will need to use a nested statement and the operator ROWNUM from Practical 1 to answer this question. Assign an alias 'DISTANCE' to the function SDO_GEOM.SDO_DISTANCE and order by distance for the first subquery.
  3. Within distance: We can also find places that are within a certain distance of each other (assuming an index has been created) using the SDO_WITHIN_DISTANCE function. Try the following query, inserting a range of sensible values in place of the distance "x":
    SELECT * FROM petrolstation a, petrolstation b WHERE SDO_WITHIN_DISTANCE(a.location, b.location, 'distance=x unit=METER')='TRUE';
  4. Buffering: A buffer creates a new geometry containing all the locations within a specified distance of a input geometry. Create buffers of 1 mile around each of the petrol stations, using the following template:
    SELECT SDO_GEOM.SDO_BUFFER(geometry, distance, tolerance, units) FROM ...
  5. Centroid: We can create the centroid (center of gravity) for a polygonal shape using the following template:
    SELECT SDO_GEOM.SDO_CENTROID(geometry, tolerance) ...
    Use this template to find the centroid of the buffer of 1000m for the shell petrol station. What do you notice about this centroid and why?

Assessment A2: Spatial SQL

Marking

Assignment

The questions you must answer for this assignment all concern the sample data from the US Census Bureau the Digital Chart of the World used in the practicals. Your assignment is to provide SQL/Oracle statements to answer to the following 10 questions (see "Submission" section above for more details).
  1. List in alphabetical order the country names and spatial object geometry types (SDO_GTYPE) for all the countries beginning with "An".
  2. List the spatial reference system (SDO_SRID) in which the coordinates of Australia are stored.
  3. List the official name (not wktext!) of the coordinate system with SRID 2236.
  4. List the country name (in alphabetical order), area (in square kilometers), perimeter (in kilometers), and the maximum latitudes of all countries beginning with the letter "U". You must give your column headings meaningful and concise aliases (i.e., "Area", "Perimeter", "Maxlat").
  5. List the name, area (in square meters, computed from the geometry) and perimeter (in meters) of all US states with populations ("totpop" attribute) of more than 15 million people. Your list must be in decreasing order of population (most populous first) and must again use sensible aliases for your column headings.
  6. Find the minimum distance in miles from the city of "Fresno" ("us_cities" table, "city" attribute) to the "Potomac" river ("us_rivers" table, "name" attribute).
  7. List the names, state abbreviation, and distances (in miles) of all US cities that are within 100 miles of New York city. List your answer in order of distance (nearest first). Note: you must use the SDO_GEOM.SDO_DISTANCE function to answer this query instead of the SDO_WITHIN_DISTANCE function (because you do not have enough privileges to create the index required for the SDO_WITHIN_DISTANCE function for tables in the "spatial" schema).
  8. List the names of the cities in the US state of "Oregon" in order of distance to the center (centroid) of the state (closest first).
  9. Create a buffer of 100 miles around the "Hudson" river.
  10. List in alphabetical order the names of the countries whose s have areas that are more than 15% larger than the area reported in the "sqmi_cntry" attribute of the "world_countries" table.
Creative Commons License
This work is created by Matt Duckham (http://www.duckham.org) and licensed under a Creative Commons Attribution 3.0 United States License.