Privacy Risks in the Collection, Brokerage, and Use of Geospatial Location Data

By Zayanna Serrano

With smartphones and sensors continuously recording human movements in real time, geospatial data collection presents systemic risks that extend well beyond traditional understandings of privacy. This unprecedented rate of data collection has created one of the most all-encompassing and least understood security challenges of the technological era. Smartphones, mobile apps, advertising software development kits (SDKs), and Internet of Things (IoT) sensors generate extraordinarily detailed records of individuals’ movements. These data streams are funneled into a commercial ecosystem of aggregators and data brokers who compile massive centralized databases of mobility information. This sensitive information can pinpoint the routes, relationships, and behaviors of millions of individuals. A technology that began as a resource for navigation now doubles as a tool for behavioral surveillance, whose risks go beyond traditional privacy concerns.

Location data is uniquely sensitive not only in terms of where individuals move but also in what can be inferred from those movements. An individual’s health status, political engagement, religious activities, and relationships can all be inferred from such data. In a study conducted at the MIT Media Lab, researchers found that location data are highly susceptible to re-identification: “just four spatiotemporal points are sufficient to identify 95% of individuals within a dataset” (de Montjoye et al., 2013). Even nominally anonymous data offers little protection, because individual mobility traces are so unique. These issues are further exacerbated by vulnerabilities throughout the data supply chain, ranging from permissive app permissions to unregulated data broker markets (Shilton & Greene, 2019). As a result, geospatial data has become a valuable commodity exploited not only by advertisers and commercial analytics firms but also by law enforcement and malicious actors (Fellow, 2022).

Security failures involving location data have left individuals vulnerable to stalking, discrimination, and violence. They have disclosed visits to reproductive healthcare clinics and undermined national security by exposing military personnel’s movement patterns and base perimeters through fitness-tracking apps (Childs, 2023; FTC, 2022). These incidents underscore how seemingly routine mobility data can lead to dire consequences, particularly when it passes through many hands without proper scrutiny. This paper investigates three major questions: (1) How is location data particularly susceptible to identification? (2) What systemic security failures and real-world incidents illuminate the vulnerabilities in the location data ecosystem? (3) What technical and policy interventions could mitigate these vulnerabilities?

Ultimately, while geospatial data has legitimate uses in navigation, logistics, and urban planning, the current supply chain in which that data is gathered, stored, and traded carries major security risks. Coordinated technical reforms, including enhanced differential privacy protections, data-minimization requirements, and secure-by-design systems, combined with policy action, are urgently needed to reduce systemic risk and safeguard both individual rights and collective security.

II. The Geospatial Data Collection Ecosystem

Modern geospatial data collection operates through an expansive and often opaque technological ecosystem. At the most fundamental layer, mobile operating systems (OS) continuously supply location data through GPS, Wi-Fi scanning, and Bluetooth. These signals enable precise tracking even when users are not actively using location-based services.

Research in human-computer interaction reveals that OS-level data flows are often difficult for users to detect, as multiple apps access location data through operating system permissions that are frequently misunderstood or accepted without scrutiny (Devaraja & Patil, 2025). This lack of transparency allows mobility traces to be collected in the background without explicit user awareness. Numerous mobile apps embed advertising software development kits (SDKs) that harvest precise GPS data under the guise of providing analytics, personalization, and targeted advertising features. Studies of the mobile advertising ecosystem reveal that SDKs collect data far beyond what app usage requires, feeding a separate market for mobility traces unrelated to the host application (Shilton & Greene, 2019). These SDKs can also transmit detailed movement patterns to numerous third parties, enabling cross-application aggregation and long-term profiling. Beyond smartphones, Internet of Things (IoT) devices form an additional layer of mobility trace collection: many passively gather location data through companion apps or embedded sensors, further expanding the scope of mobility surveillance. IoT environments exacerbate the transparency problem, as users are unaware of what data devices gather, how frequently they transmit it, and to whom (Devaraja & Patil, 2025).

Location data typically moves through a multilayered pipeline consisting of application developers, SDK providers, data aggregators, and finally, large-scale data brokers. Phones generate GPS and network-based location readings; this data proceeds from individual apps to embedded SDKs, which transmit it to aggregators that pool information from thousands of sources. These aggregators clean, label, and enrich the data, which is eventually sold to location data brokers. These brokers, operating with minimal transparency or regulatory oversight, help commercial firms develop behavioral models and long-term movement profiles for millions of individuals. They monetize this data by offering location-based advertising, foot-traffic analytics for retail industries, and products closely related to surveillance (Thompson & Warzel, 2019). Through this system, everyday mobility traces are transformed into lucrative commodities without explicit user awareness or consent.

This data is usually stored in large cloud services such as Amazon Web Services (AWS), which offer scalability but also present substantial security risks. This area of data management likewise lacks regulation, and most individuals are uninformed about how their everyday online activities result in long-term data storage, processing, and reselling. The data is used not only for ads and analytics but also by government institutions and law enforcement. According to the ACLU, agencies have at times circumvented warrant requirements by buying commercially available location data instead of requesting it from telecom providers (Fellow, 2022). Such access raises significant privacy concerns, since bulk mobility data can reveal sensitive behaviors, associations, and patterns of daily life.

III. Why Is Geospatial Data Uniquely Susceptible to Identification? 

Geospatial data is highly susceptible to re-identification because individual mobility traces are extremely distinct. Even when datasets are anonymized, these unique patterns act as quasi-identifiers, making it feasible to infer an individual’s identity from minimal information. MIT Media Lab researchers found that four spatiotemporal points from a user’s smartphone were sufficient to identify 95% of individuals (de Montjoye et al., 2013). In an earlier, related linkage attack, researchers re-identified the health record of the governor of Massachusetts simply by joining a de-identified healthcare database with a public voter list. Mobility traces thus form a behavioral fingerprint that is continuous, unique, and easily traceable to the people who created them.
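The mechanics of such linkage attacks can be sketched in a few lines. The dataset and function below are hypothetical toy constructs, not drawn from the cited study; real attacks operate on telecom-scale traces:

```python
# Toy sketch of spatiotemporal re-identification (hypothetical data).
# Each trace is a set of (cell_id, hour) points, echoing the
# quasi-identifier idea in de Montjoye et al. (2013).

def matching_users(observed_points, traces):
    """Return the users whose trace contains every observed point."""
    return [user for user, trace in traces.items() if observed_points <= trace]

traces = {
    "user_a": {(10, 8), (42, 9), (7, 18), (10, 22)},
    "user_b": {(10, 8), (50, 9), (7, 18), (11, 22)},
    "user_c": {(12, 8), (42, 9), (9, 18), (10, 22)},
}

# A single observed point leaves ambiguity; two points already single out user_a.
print(matching_users({(10, 8)}, traces))           # ['user_a', 'user_b']
print(matching_users({(10, 8), (42, 9)}, traces))  # ['user_a']
```

Even in this three-user toy, two points suffice to isolate one person; at population scale, traces are so distinct that four points identified 95% of individuals in the cited study.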

The re-identifiability of location data matters because mobility patterns enable highly sensitive personal inferences. A series of visits to places of worship, political events, or community organizing spaces, for example, can expose an individual’s beliefs and associations, raising concerns about freedom of speech and assembly. Medical inferences are equally alarming: precise location traces can reveal visits to reproductive health clinics, HIV treatment sites, mental health providers, or substance-use programs, and such revelations can become a vector for exploitation. Investigations have shown that commercial data retailers like Acxiom and CoreLogic have sold datasets revealing visits to abortion clinics and addiction recovery centers, often without users’ knowledge or consent (Bhatia, 2024). Location data can also reveal an individual’s sexual orientation or intimate relationships, and by exposing home routines and travel patterns it heightens the risks of stalking, harassment, and domestic violence. Ultimately, the danger of re-identification lies not only in identifying a person but also in exposing what they do, where they go, and what they believe.

Anonymization of geospatial data has repeatedly proven unsuccessful because of the inherent properties of location data. High spatial and temporal resolution allows detailed trajectories to be reconstructed even without direct identifiers. In other words, the very properties that make this data valuable also make its anonymization ineffective.

IV. Security Failures in the Location Data Ecosystem

Security failures in the location data ecosystem occur at every layer of the collection and distribution pipeline. The ecosystem starts with the mobile apps that serve as the primary interface for geospatial data collection. Many of these apps exhibit poor permission management, requesting excessive location permissions and often retaining background access (Wijesekera et al., 2015). These weaknesses are amplified by insecure transmission and storage practices and by embedded third-party SDKs. Together, they create a large number of unregulated data collection pathways: advertising and analytics SDKs often harvest precise coordinates, device identifiers, and wireless scans independently of the host app’s stated purpose (Shilton & Greene, 2019).

At the top of the ecosystem, commercial data brokers introduce structural risks through minimal transparency, ineffective security measures, and large-scale aggregation of sensitive geospatial data. Data brokers rarely implement comprehensive security audits and maintain unclear data-sharing agreements with advertisers, analytics agencies, and government contractors (Neally, 2021). Studies of mobility datasets make clear the risks associated with centralized data ecosystems: a research team at Princeton University found that geolocation data generated by IoT devices and maintained on cloud infrastructure is extremely susceptible to inference attacks (Apthorpe et al., 2017). Collectively, these findings show that the structural conditions of commercial data brokerage, such as centralization and lack of transparency, are incompatible with robust security.

Real-world incidents underscore the consequences of these systemic weaknesses. The Strava Global Heatmap incident revealed how even “anonymized” visualizations can expose sensitive operational information when movement patterns correlate with distinctive spatial signatures. Subsequent analysis of the heatmap demonstrated that activity hotspots unveiled the locations of military bases, patrol routes, and personnel habits, proving that aggregation does not eliminate identifiability (Childs, 2023). Regulatory investigations into data brokers further highlight the risks: the U.S. Federal Trade Commission has documented instances in which brokers collected and sold location data reflecting visits to medical facilities, religious institutions, and military sites, demonstrating that commercial datasets often reach a level of precision capable of uncovering deeply sensitive behaviors (FTC, 2022).

Together, these technical failures and real-world incidents illustrate something fundamental: the location data ecosystem is structurally insecure. From apps and SDKs to data brokers and municipal systems, vulnerabilities enable the extraction, aggregation, and misuse of highly sensitive mobility data, with profound implications for both privacy and safety.

V. Evaluation of Technical Approaches to Protect Location Data

Four technical approaches are widely considered to mitigate the privacy and security risks of location data: decentralized personal data stores (PDS), on-device computation, differential privacy (DP) applied to mobility data, and encrypted geofencing with secure multiparty computation (SMPC).

Decentralized personal data stores (PDS) propose moving raw location data from central servers to user-controlled pods. This allows services to query or compute on data without extracting it, strengthening user consent. Research on PDS architectures, however, identifies fundamental challenges related to complex workflows (Fallatah et al., 2023). As a result, PDS models remain promising but difficult to deploy.

The second approach, on-device or local computation, retains raw mobility traces on the device and transmits only aggregated outputs or model updates. This technique is useful for population analytics but limited in scope, and it poses computational and communication challenges (Melis et al., 2019). Further, without extra safeguards, model updates can disclose meaningful information about user data via reconstruction attacks.
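The secure-aggregation idea that often accompanies on-device computation can be sketched minimally: each client perturbs its contribution with pairwise masks that cancel when the server sums everything, so the server learns only the aggregate. This is a simplified, single-round illustration (the function name, modulus, and seeded RNG are illustrative; deployed protocols also handle client dropouts and derive masks via cryptographic key agreement):

```python
import random

MOD = 2 ** 32  # modulus for mask arithmetic (illustrative choice)

def mask_updates(local_counts, seed=None):
    """Add pairwise random masks (+r to client i, -r to client j) so that
    individual counts are hidden but the modular sum is unchanged."""
    rng = random.Random(seed)
    masked = list(local_counts)
    n = len(masked)
    for i in range(n):
        for j in range(i + 1, n):
            r = rng.randrange(MOD)
            masked[i] = (masked[i] + r) % MOD
            masked[j] = (masked[j] - r) % MOD
    return masked

clients = [3, 5, 7]              # e.g., per-device visit counts for a region
masked = mask_updates(clients, seed=1)
print(sum(masked) % MOD)         # 15 -- the server recovers only the total
```

Because every mask appears once with each sign, the masks cancel in the sum while each individual masked value is indistinguishable from random.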

The third approach, differential privacy (DP), provides mathematically rigorous privacy guarantees by intentionally introducing noise into aggregates or model outputs, bounding the impact of any single user’s location data. However, for high-dimensional data such as mobility traces, DP cannot be implemented without significantly compromising utility, which remains an unresolved challenge in the research literature (Gursoy et al., 2017).
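The Laplace mechanism at the heart of DP can be illustrated for a single regional visit count. The function name and parameters below are illustrative; sensitivity 1 assumes each user contributes at most one visit to the count:

```python
import random

def dp_visit_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Release a count with epsilon-differential privacy via the Laplace
    mechanism: noise scale = sensitivity / epsilon, so one user's presence
    or absence shifts the output distribution by at most exp(epsilon)."""
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    # A Laplace(0, scale) sample is the difference of two exponentials.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_count + noise

rng = random.Random(0)
noisy = dp_visit_count(120, epsilon=1.0, rng=rng)  # roughly 120, off by a few
```

Smaller epsilon means more noise and stronger privacy; for high-dimensional mobility traces, the noise needed to protect every spatiotemporal cell is precisely what destroys utility.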

The fourth approach, encrypted geofencing and secure multiparty computation (SMPC), enables the computation of spatial assertions without revealing sensitive coordinates (Zhao et al., 2019). These tools are powerful but carry significant computational overhead, making them better suited to highly sensitive or regulated contexts.
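The building block beneath SMPC, additive secret sharing, can be sketched briefly. Two servers each hold a random-looking share of a coordinate; neither learns it alone, yet linear operations can be computed share-wise, which full protocols extend to the comparisons needed for geofence tests. The modulus and coordinate values here are illustrative:

```python
import secrets

P = 2 ** 61 - 1  # prime modulus for share arithmetic (illustrative choice)

def share(value, n=2):
    """Split an integer into n additive shares mod P; any n-1 shares are
    uniformly random and reveal nothing about the value."""
    parts = [secrets.randbelow(P) for _ in range(n - 1)]
    parts.append((value - sum(parts)) % P)
    return parts

def reconstruct(parts):
    return sum(parts) % P

# Each server holds one share of a user's quantized coordinate.
x_shares = share(51_507)      # hypothetical grid cell for the user's latitude
fence_shares = share(51_505)  # hypothetical geofence boundary cell
# Differences are computed share-wise, without either server seeing x.
diff_shares = [(a - b) % P for a, b in zip(x_shares, fence_shares)]
print(reconstruct(diff_shares))  # 2
```

The expensive part of real protocols is the non-linear step (comparing the reconstructed-in-secret difference to zero), which is why SMPC remains costly relative to plaintext computation.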

Comparatively, no single approach sufficiently meets these varied challenges. For aggregators, on-device computation with secure aggregation and differential privacy offers strong protection at reasonable utility. Ultimately, an optimal solution would combine technical minimization, cryptographic safeguards, differential privacy, and governance measures.

VI. Policy, Regulatory, and Governance Solutions

Although the United States has begun to develop more robust data privacy protections at the state level, existing frameworks remain narrow in scope compared with the European Union’s General Data Protection Regulation (GDPR). The California Privacy Rights Act (CPRA) expands on the California Consumer Privacy Act (CCPA) by classifying geolocation as “personal information” and giving residents the right to limit and correct uses of that data under an opt-out consent model; however, this protection extends only to California residents and covered businesses, leaving a large segment of U.S. consumers unprotected.

In contrast, the GDPR applies to any personal data of individuals in the EU, regardless of the processor’s location, and it mandates a legal basis for processing. The GDPR also requires explicit opt-in consent, a stronger restraint than the CPRA’s primarily opt-out framework. Its expansive definition of personal data, including location and device identifiers, coupled with strict purpose limitation, creates stronger baseline protections for mobility data than the CPRA’s more limited regime allows (Mallet, 2025). These differences show that although the CPRA is an advance, it remains narrow in scope and enforcement compared to the GDPR and cannot comprehensively address the systemic risks posed by pervasive geospatial data collection.

VII. Conclusion

As geospatial technologies continue to improve, emerging surveillance markets will increase the risks around mobility data. Commercial tracking of political activity, including visits to protests, campaign events, or ideological spaces, will further threaten fundamental freedoms. These threats will be magnified by the continued advance of machine learning: fusing partial mobility traces from different sources can dramatically increase re-identification against apparently anonymized datasets, and inference attacks based on learned trajectory signatures have shown that models can identify individuals and sensitive behaviors with uncanny precision (Liu et al., 2022). As mobility data become more deeply integrated into IoT infrastructures, these capabilities will only expand. Future research must therefore bridge GEOINT, cybersecurity, machine learning, and legal scholarship. Key priorities include evaluating technical protections such as encrypted geofencing and differential privacy against realistic adversarial conditions.

In closing, while geospatial data enables valuable public and commercial applications, the systems that collect and monetize mobility traces create structural vulnerabilities that current U.S. practices do not adequately address. Because mobility patterns are uniquely re-identifiable and semantically rich, even “anonymized” datasets expose sensitive information about individuals’ health, politics, religion, and relationships. These risks demand both technical minimization and regulatory reform, from enforcing data minimization and governing data brokers to restricting high-risk uses. The governance of geospatial data is essential not only for privacy but for civil liberties and national security.

Works Cited 

Apthorpe, Noah, and Dillon Reisman. A Smart Home is No Castle: Privacy Vulnerabilities of Encrypted IoT Traffic. 2017. Princeton University Department of Computer Science, https://arxiv.org/pdf/1705.06805. 

Bhatia, Rhea. “A Loophole in the Fourth Amendment: The Government’s Unregulated Purchase of Intimate Health Data.” Washington Law Review, vol. 98, no. 1, 2024, pp. 1-35. https://digitalcommons.law.uw.edu/cgi/viewcontent.cgi?article=1076&context=wlro.

Childs, Kevin, et al. Heat Marks the Spot: De-Anonymizing Users’ Geographical Data on the Strava Heatmap. 2023. https://www.cise.ufl.edu/~k.childs/Papers/HeatMarksTheSpot.pdf.

de Montjoye, Yves-Alexandre, and Cesar A. Hidalgo. “Unique in the Crowd: The Privacy Bounds of Human Mobility.” Scientific Reports, 2013.

Devaraja, Manila, and Sameer Patil. Understanding User Prioritization and Comprehension of Smartphone Permissions. 2025. ACM Digital Library, https://dl.acm.org/doi/10.1145/3743739. 

Fallatah, Khalid, and Mahmoud Barhamgi. Personal Data Stores (PDS): A Review. 2023. NIH National Center for Biotechnology Information, https://pmc.ncbi.nlm.nih.gov/articles/PMC9921726/.

Fellow, Brennan. “New Records Detail DHS Purchase and Use of Vast Quantities of Cell Phone Location Data.” ACLU, 18 July 2022, https://www.aclu.org/news/privacy-technology/new-records-detail-dhs-purchase-and-use-of-vast-quantities-of-cell-phone-location-data. Accessed 11 December 2025.

“FTC Sues Kochava for Selling Data that Tracks People at Reproductive Health Clinics, Places of Worship, and Other Sensitive Locations.” Federal Trade Commission, 29 August 2022, https://www.ftc.gov/news-events/news/press-releases/2022/08/ftc-sues-kochava-selling-data-tracks-people-reproductive-health-clinics-places-worship-other. Accessed 11 December 2025.

Liu, Yiyong, et al. Membership Inference Attacks by Exploiting Loss Trajectory. 2022. ACM Digital Library.

Mallet, Pierre. “Comparative Analysis of Data Privacy Legislation: Convergence and Divergence Between the GDPR and CCPA.” Tech Fusion in Business and Society, vol. 3, no. 1, 2025, pp. 465-475. ResearchGate, https://www.researchgate.net/publication/390994398_Comparative_Analysis_of_Data_Privacy_Legislation_Convergence_and_Divergence_Between_the_GDPR_and_CCPA.

Neally, Daniel. Data Brokers and Privacy: An Analysis of the Industry and How It’s Regulated. 2021. HeinOnline.

Shilton, Katie, and Daniel Greene. Linking Platforms, Practices, and Developer Ethics: Levers for Privacy Discourse in Mobile Application Development. 2019. Journal of Business Ethics.

Warzel, Charlie, and Stuart A. Thompson. “Opinion | Twelve Million Phones, One Dataset, Zero Privacy.” The New York Times, 19 December 2019, https://www.nytimes.com/interactive/2019/12/19/opinion/location-tracking-cell-phone.html. Accessed 11 December 2025.

Wijesekera, Primal, and Arjun Baokar. Android Permissions Remystified: A Field Study on Contextual Integrity. 2015. USENIX, https://www.usenix.org/system/files/conference/usenixsecurity15/sec15-paper-wijesekera.pdf.

Zhao, Chuan, et al. “Secure Multi-Party Computation: Theory, practice and applications.” Information Sciences, vol. 476, no. 476, 2019, pp. 357-372. Science Direct, https://www.sciencedirect.com/science/article/abs/pii/S0020025518308338.