A spatial database management system (SDBMS) is an extension, some might say specialization, of a conventional database management system (DBMS). Every DBMS (hence SDBMS) uses a data model specification as a formalism for software design, and establishing rigor in data management. Three components compose a data model, 1) constructs developed using data types which form data structures that describe data, 2) operations that process data structures that manipulate data, and 3) rules that establish the veracity of the structures and/or operations for validating data. Basic data types such as integers and/or real numbers are extended into spatial data types such as points, polylines and polygons in spatial data structures. Operations constitute capabilities that manipulate the data structures, and as such when sequenced into operational workflows in specific ways generate information from data; one might say that new relationships constitute the information from data. Different data model designs result in different combinations of structures, operations, and rules, which combine into various SDBMS products. The products differ based upon the underlying data model, and these data models enable and constrain the ability to store and manipulate data. Different SDBMS implementations support configurations for different user environments, including single-user and multi-user environments.
Nyerges, T. (2021). Spatial Database Management Systems. The Geographic Information Science & Technology Body of Knowledge (1st Quarter 2021 Edition), John P. Wilson (ed.). DOI: 10.22224/gistbok/2021.1.11.
This Topic is also available in the following editions: DiBiase, D., DeMers, M., Johnson, A., Kemp, K., Luck, A. T., Plewe, B., and Wentz, E. (2006). Basic Data Structures. The Geographic Information Science & Technology Body of Knowledge. Washington, DC: Association of American Geographers. (2nd Quarter 2016, first digital)
1. Introduction to Spatial Database Management Systems
Spatial database management systems, both software and hardware sub-components, organize data for inventorying and querying databases, conducting spatial analysis, and creating map visualizations within an integrated manner for managing large data stores (Yeung and Hall 2007). Database management is a subset of a larger category of technology called data management technology. Data are managed using two types of computer-based files, physical files and logical files. A physical file is a collection of records managed by the operating system software as stored on disk; a data file being different than a database file. A logical file is a collection of records managed by application software, most fundamentally database management system software. Many logical files can be combined into a physical file. One advantage for using logical files is the increase in access speed to individual data elements, as opening a physical file takes considerable time in contrast to accessing individual elements within a logical file. When data are organized into physical files to be managed, we call this ‘data file management’ (or simply file management). When we use logical data files organized within physical files, we call this database management. When a logical file is the same as a physical file then the file is called a ‘data file’. When multiple logical files are included in a physical file then we refer to the file as a database file (Rigaux, Scholl and A. Voisard 2002).
Spatial database management adds the spatial aspect (dimensions of space) to database management (Shekhar and Chawla 2003). Database management software is designed specifically with a spatial aspect in mind, as three dimensions of physical space are core to existence. These three dimensions are managed (stored and retrieved) in a special manner in data management software, making spatial database management software an enhanced-type of data management software based on the data model design.
A data model is essentially a design framework for a data management system. One of the most comprehensive definitions of a data model was provided by Edgar Codd (1980) ten years after he developed the design of relational data models (Codd 1970). Codd’s interest stemmed from clarifying the logical character of a data model, as opposed to its physical implementation; as such, the general concept of data model is not restricted to any particular approach to data management. From a database design perspective, a more common and popular understanding of data model is that it defines the structure and intended meaning of data (West 2011, p. 5). However, Codd’s (1980, p. 112) more comprehensive view characterizes a data model as consisting of three components: 1) a collection of data structure types (the building blocks of any database that conforms to the model) for describing data; 2) a collection of operators or inferencing rules for manipulating data, which can be applied to any valid instances of the data types listed in (1), to retrieve or derive data from any parts of those structures in any combinations desired; 3) a collection of general integrity rules for validating data, which implicitly or explicitly define the set of consistent database states or changes of state or both -- these rules may sometimes be expressed as insert-update-delete rules. Herein, the Codd (1980) framework is used to describe SDBMS due to its completeness, whereas the West (2011) interpretation and all others like it provides just the first third of the framework.
Data logical structures such as tables, objects, attribute fields and relationships as descriptions of data are implemented as physical data storage structures with data access mechanisms for primary and foreign keys. Basic data types such as integers (e.g., 1, 2, 3) and real numbers (e.g., 1.1, 1.2, and 1.3), and/or character strings (e.g., ‘text string’) are extended into spatial data types (e.g., points, polylines and polygons), and are used to form spatial data structures for data storage.
Basic DBMS operations for manipulating data include data creation (C), retrieval (R), update (U), and delete (D), referred to as the CRUD suite of operations. Logical operations and their physical implementation are used to derive logical structures and store them in terms of storage structures.
Rules constitute the third component of a data model, hence the DBMS. DBMS rules protect against corruption of the data by validating data (hence data structures) during CRUD operations. Validity rules are critically important for establishing the veracity of databases, protecting against unintended changes by users. Atomicity, consistency, isolation, and durability is a set of properties in database transactions that are intended to guarantee data validity despite errors, e.g. power failures. A sequence of operations using these properties is called a "valid transaction."
In summary of the above characterization, three levels of data abstraction combined with the three components of a data model summarize the aspects of DBMS (see Table 1). Thus, all DBMS software implementations should contain explicit capabilities for three components 1) constructs, 2) operations, and 3) rules for all three levels of conceptual (meaning), logical (structure), physical (data formatting) levels of data abstraction, respectively.
Levels of Abstraction | Three Components | ||
Constructs for describing | Operations for manipulating | Rules for validating | |
Conceptual | ... worldly features | ... worldly processes | ... features and processes |
Logical | ... data primitives of the database | ... data primitives of the database | ... data and operations on the database |
Physical | ... disk storage formats | ... data stored as bytes and bits | ... reads and writes to disk |
In further clarification, many people understand data model as a collection of data categories and relationships (West 2011). As such, that interpretation is simply the first component offered by Codd (1980). However, it should be clear that operations on data structures plus rules for qualifying data structure and/or operations are essential in operational database management systems. Without the operations there is no ‘change’ in data being managed. Without the rules, the veracity of the data and operations can be easily called into question. With rules constituting the third component of a data model, DBMS rules protect against corruption of the data by validating data (hence data structures) during CRUD operations. Consequently, it is important to embrace the Codd (1980) interpretation for complete implementation and use of a SDBMS.
2. Example DBMS and Spatial DBMS Software
Logical data models underpin the designs of DBMS software. Consequently, these data models underpin implementations of Spatial DBMS package implementations.
2.1 Logical Data Models as DBMS Types
A variety of logical data model types for implementing DBMS exist, each type being a different implementation of a logical data language with a physical context. (See the GIS&T BoK entry for logical data model description.) To provide an idea of the most popular DBMS software systems across the world based on logical data models, DB-Engines (2020) maintains a website documenting general rank of popularity (using six criteria to form the ranks) among 300+ DBMS. The top-ten ranked DBMS and associated data models show that the relational model is the most popular (See Table 2).
DB-Engine Rank | DBMS Name | Data Models Supported* (but not all are SDBMS capable) |
---|---|---|
1 | Oracle | Relational, Document Store, Graph, and RDF Store |
2 | MySQL | Relational and Document Store |
3 | Microsoft SQL Server | Relational, Document Store, and Graph |
4 | PostgreSQL | Relational and Document Store |
5 | MongoDB | Document Store and Search Engine |
6 | IBM Db2 | Relational, Document Store, and RDF Store |
7 | Redis | Key-value Store, Document Store, Graph, Search Engine, Time Series |
8 | Elasticsearch | Search Engine and Document Store |
9 | SQLite | Relational |
10 | Cassandra | Wide column |
* Definitions of the data models are provided in the text below.
The majority of DBMS available have been implemented based on the relational data model, or a derivation thereof, due to its long history of success as one can observe from the above table. This success and thus popularity is due to its simplicity of data storage for maintaining validity of database elements. However, there are many other DBMS implementations based on other logical data models as well because they offer richer data storage structures. The simpler the data structure storage, the more manipulation is needed to achieve an end result. With computers being faster over the decades, the rich data structure (non-relational) approaches have been gaining in popularity. The data model types described below appear in alphabetical order. There is no implied recommendation in the listing.
Every spatial database management system makes use of spatial data types that are ‘built-on-top’ of general data types. The GIS&T Body of Knowledge physical data model entry offers the list of general data types.
2.2 Spatial DBMS Products
A Wikipedia (2020) page about spatial databases describes a wide variety of spatial DBMS products (https://en.wikipedia.org/wiki/Spatial_database). Below we categorize that list in terms of the principal logical data model used, as some software products support multiple data models. Again, we use alphabetical order to list the types as above, and within each data model category we alphabetize the software products, with no priority order intended. A December 2020 ranking of popularity as scored by DB-Engines website appears in parentheses, wherein NR is not ranked because DB-Engines ranks general DBMS only as opposed to more specific SDBMS. As such, the rank does not imply popularity of the SDBMS, only the DBMS used to host the SDBMS.
Graph
Document Store
Key-value Store
Object-oriented
Open Standards
Relational
Relational Column Store
3. Examples of Enterprise Spatial DBMS
Several situations exist for user environments, including single-user, workgroup, enterprise, and consortium activities. Single-user SDBMS involves a single person at a time making use of a database environment. Workgroup database management activity involves multiple people performing database management on the same project records within a single unit (division) of an organization, that is, an intra-organizational same unit context. Enterprise database management activities involve multiple people performing databases management on the same project records within multiple units across an organization, that is, an organization-wide, but different unit context. Consortium database management activities involve multiple people performing data management on the same project records across organizations, that is, an inter-organizational context. In all multiple user contexts, conflicts with record access can occur when multiple users try to update the same database record at the same time. Those circumstances require record-locking capabilities, wherein record-locking protects users ‘stepping on’ one another changes, potentially resulting in database corruption. Enterprise SDBMS are among the most common types of data management implementations across the GIS industry.
By combining a list of DBMS supported by Esri and the ranked list from the DB-Engines website we gain a sense of the popularity of a DBMS being used to host a GIS enterprise approach with the Esri geodatabase DBMS environment (See Table 3). Table 2 and 3 present world-wide lists. Only two of the DBMS solutions fell in rank from December 2019 to December 2020. This might indicate that SDBMS is on the rise world-wide.
Esri DBMS-Compatibility** | Rank on DB Engines Website (363 DB Engines ranked Dec 2020) | Data Model(s) Listed on DB Engines Website, for each of respective DBMS | |
December 2020 | December 2019 | ||
Oracle | 1 | 1 | Relational, Document Store, Graph, and RDF Store |
Microsoft SQL Server | 3 | 3 | Relational, Document Store, and Graph |
PostgreSQL | 4 | 4 | Relational and Document Store |
IBM Db2 | 6 | 6 | Relational, Document Store, and RDF Store |
SQLite | 9 | 11 | Relational |
Teradata Data Warehouse Appliance | 14 | 15 | Relational, Document Store, Graph, and Time Series |
Microsoft Azure SQL Database | 16 | 25 | Relational, Document Store, and Graph |
SAP HANA | 20 | 20 | Relational, Document Store, and Graph |
IBM Informix | 30 | 26 | Relational, Document Store, and Time Series |
Netezza Data Warehouse Appliance | 34 | 33 | Relational |
ALTIBASE | 142 | 147 | Relational |
Dameng | Nor ranked | Not ranked | Relational |
* Relational data model is supported by Esri DBMS software. Other data models might be supported through customized software. Dameng is not compatible with the geodatabase data model.
** Esri's ArcMap (10.8) and ArcGIS Pro (2.7)