Natural language processing

Topics

[CP-03-010] Social Media Analytics

Social media streams have emerged as new data sources for a variety of geospatial applications. However, traditional geospatial tools and systems lack the capacity to process such streams, which are generated dynamically, in extremely large volumes, and with highly varied content. Innovative approaches and frameworks are therefore needed to detect emerging events discussed on social media, understand the extent and consequences of each event as well as its time-evolving nature, and eventually discover useful patterns. To harness social media for geospatial applications, this entry introduces social media analytics technologies for harvesting, managing, mining, analyzing, and visualizing the spatial, temporal, textual, and network information in social media data.

[PD-01-010] Natural Language Processing in GIScience Applications

Natural Language Processing (NLP) has experienced explosive growth in recent years. While the field has been around for decades, recent advances in NLP techniques and in computational resources have re-engaged academics, industry, and the general public. The field of Geographic Information Science has played a small but important role in the growth of this domain. Combining NLP techniques with existing geographic methodologies and knowledge has contributed substantially to many geospatial applications in use today. In this entry, we provide an overview of current application areas for natural language processing in GIScience, give some examples, and discuss some of the challenges in this area.

[DC-02-037] Texts

The integration of Geographic Information Retrieval (GIR) with advancements in Natural Language Processing (NLP) and Large Language Models (LLMs) has revolutionized the use of unstructured text as a data source for Geographic Information Systems (GIS). Historically, unstructured text, unlike structured text such as XML documents or SQL queries, was predominantly leveraged by search engines and within the broader field of Information Science. However, the ubiquity of user-generated content on social media, combined with accessible online news outlet APIs, has prompted the integration of textual data in GIS applications. The fundamental shift in NLP technologies, particularly the advent of LLMs like GPT models and the evolution of text recognition algorithms, has enhanced the reliability of place name recognition, a subset of Named Entity Recognition (NER). These technologies enable the effective extraction of geographic references from vast quantities of textual data, offering substantial potential for enriching GIS databases. The primary challenges in this field include resolving ambiguous and vague place names and adapting to the dynamic nature of geographic names and boundaries. Despite these challenges, GIR promises to unlock powerful new dimensions of spatial analysis and decision-making by integrating textual and geographic data.

[DM-03-074] Modeling Semi-Structured and Unstructured Spatial Data

This chapter surveys semi-structured and unstructured geospatial data, emphasizing their formats, challenges, and analytical approaches. Semi-structured data formats, such as JSON, do not follow rigid schemas but retain internal organization that supports spatial processing. These formats underpin many widely used datasets, including OpenStreetMap, and can represent both object-based and network-based spatial models. Unstructured data, including text, imagery, sensor streams, and point clouds, lack standardized formatting and must be transformed or enriched before spatial analysis is possible. For instance, crowdsourced or drone-collected imagery can be processed using Structure from Motion (SfM) to reconstruct 3D surfaces and terrain models. Textual data, such as social media posts or institutional reports, can be mined for geographic content using natural language processing techniques like named entity recognition and geoparsing. The chapter also considers recent developments in AI, including deep learning methods for image classification, segmentation of point clouds, and modeling spatiotemporal patterns from sensor data. Finally, it discusses the emerging role of multimodal models that integrate visual and textual information in geospatial workflows. Together, these tools and methods enable the use of increasingly diverse data sources in spatial analysis, broadening both the scope and depth of geographic inquiry.
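
As a rough illustration of the geoparsing step mentioned above, the following minimal sketch matches place names in free text against a tiny lookup table. The gazetteer contents, the `geoparse` function name, and the sample sentence are all assumptions made for this example; they are not taken from the chapter, and a real workflow would typically use a trained named entity recognition model and a full gazetteer.

```python
import re

# Hypothetical toy gazetteer (an assumption for illustration):
# lower-cased place name -> (latitude, longitude).
GAZETTEER = {
    "paris": (48.8566, 2.3522),
    "new york": (40.7128, -74.0060),
    "york": (53.9600, -1.0873),
}


def geoparse(text):
    """Return (matched name, character offset, coordinates) for each gazetteer hit."""
    # Try longer names first so "New York" is not matched as "York".
    names = sorted(GAZETTEER, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    return [
        (m.group(0), m.start(), GAZETTEER[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


matches = geoparse("Flooding reported in New York; trains from Paris delayed.")
for name, offset, coords in matches:
    print(name, offset, coords)
```

This sketch only recognizes names and attaches one fixed coordinate to each. It does not attempt disambiguation (for example, deciding which of several places named Paris is meant), which remains one of the harder parts of geoparsing.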
