Your Data Portfolio: What Companies Actually Have on You

The surveillance apparatus is vast, automated, and mostly indifferent to you personally. The algorithms that analyze your behavioral data are not reading your messages. They are classifying your patterns into market segments, predicting your next purchase, and selling those predictions to advertisers who will never know your name. But the scope of what has been collected — quietly, passively, over the decade or more you have been using these platforms — is worth understanding in concrete terms, not because it should frighten you, but because you cannot make proportional decisions about sovereignty without first knowing what you have already given away.

This is not a theoretical exercise. Every data category described here has been confirmed through GDPR data access requests, platform documentation, legal discovery in court proceedings, or investigative journalism. We are not speculating about what companies might collect. We are cataloging what they have already confirmed they do collect.

Why This Matters for Sovereignty

Shoshana Zuboff’s central insight in The Age of Surveillance Capitalism is that the data extracted beyond what is needed to improve a service — the behavioral surplus — becomes the raw material for prediction products sold on a behavioral futures market. The companies listed below are not storing your data as a favor. They are refining it into a product. Your location history, your search patterns, your purchase records, and your social graph are inventory in a market you never agreed to supply.

Understanding the scope of that inventory is the first step in a proportional response. You do not need to delete every account and move to a cabin in Vermont. You do need to know what exists before you can decide what to reclaim, what to limit, and what to accept as the cost of participation. Sovereignty begins with an honest audit, and an honest audit requires specifics.

How It Works: The Data Categories, Platform by Platform

Google

Google’s data collection is the most extensively documented, partly because of its scale and partly because of legal proceedings that forced disclosure. Through Google Takeout, the company’s own data export tool, a typical user can confirm the following categories: complete search history (every query, timestamped); location history, including GPS coordinates recorded at regular intervals, with some location data logged even when the user has disabled the setting labeled “Location History” (confirmed by the 2018 Associated Press investigation and the subsequent Arizona Attorney General lawsuit); YouTube watch history with duration data; stored email content (Google scanned consumer Gmail for ad-targeting signals until it announced an end to that practice in 2017); voice recordings captured through Google Assistant interactions; and detailed app usage logs from Android devices.

The location tracking deserves particular attention because it illustrates the gap between interface design and actual practice. In 2018, AP reporters demonstrated that Google continued logging precise location data through a separate setting called “Web & App Activity” even when users turned off the setting explicitly named “Location History.” Google’s own support pages at the time stated that disabling Location History meant “the places you go are no longer stored.” This was not accurate, and the Arizona AG’s office pursued legal action on deceptive-practice grounds. The case was settled in October 2022 for $85 million.

For anyone who has used Google services for a decade or more, the cumulative dataset is substantial. It includes not just what you searched for, but where you were when you searched, how long you spent on each result, and what you did next. This is behavioral surplus in Zuboff’s precise sense — data generated far beyond what is necessary to return a search result.

Meta (Facebook and Instagram)

Meta’s data portfolio extends beyond what users directly post. Confirmed categories include: all posts, comments, and messages (including deleted messages retained for a period), every ad clicked or viewed with engagement duration, and the “Off-Facebook Activity” log — a record of websites you visited that had the Facebook tracking pixel installed. This last category was made visible to users only after sustained pressure, and when users first accessed it, many were startled by the breadth.

Facebook’s shadow profiles represent a particularly notable practice. These are data records created for people who do not have Facebook accounts, assembled from contact lists uploaded by existing users. When your friend uploads their phone contacts to Facebook, the platform creates or enriches a profile associated with your phone number and email address — whether or not you have ever consented to anything. Mark Zuckerberg confirmed the existence of this practice during his 2018 congressional testimony, though he avoided using the term “shadow profiles.”
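The mechanism is easier to see in miniature. The sketch below is a toy model only, not Facebook's implementation: the function name, the data layout, and the sample contacts are all hypothetical. It shows how a record keyed by phone number accumulates detail from other people's address books without the number's owner taking any action.

```python
from collections import defaultdict


def build_contact_index(uploads):
    """Group uploaded address-book entries by phone number.

    `uploads` maps an uploader's name to a list of (phone, label) pairs.
    Everything here is a hypothetical illustration, not a real platform API.
    """
    index = defaultdict(lambda: {"labels": set(), "known_by": set()})
    for uploader, address_book in uploads.items():
        for phone, label in address_book:
            index[phone]["labels"].add(label)       # what others call this person
            index[phone]["known_by"].add(uploader)  # who has them as a contact
    return index


# Invented sample data: three users upload contacts; "Sam" never signs up.
uploads = {
    "alice": [("555-0100", "Sam (work)"), ("555-0199", "Mom")],
    "bob": [("555-0100", "Sam Rivera")],
    "carol": [("555-0100", "Sam R. landlord")],
}

index = build_contact_index(uploads)
record = index["555-0100"]
print(sorted(record["known_by"]))
print(sorted(record["labels"]))
```

Three unrelated uploads are enough to attach a full name, a workplace connection, and a landlord relationship to one number. The person behind that number contributed nothing.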

Prior to 2021, Facebook also maintained facial recognition templates derived from tagged photos. In November 2021, the company announced it would delete these templates as part of a broader shift, but the years of biometric data already collected had been in active use for auto-tagging suggestions and, according to Zuboff’s analysis, for behavioral prediction modeling.

Amazon

Amazon’s data collection spans its retail, voice assistant, reading, and home security products. Through data access requests, users have confirmed the following: complete purchase and browsing history (including items viewed but not purchased), Alexa voice recordings stored in the cloud (Amazon initially retained these indefinitely; the company now offers deletion options after regulatory scrutiny), Kindle reading data including highlights, bookmarks, and reading speed, and Ring doorbell footage which can be stored on Amazon’s cloud servers.

The integration across Amazon’s product lines creates a composite picture that is broader than any single service suggests. Your purchase history reveals economic behavior. Your Alexa recordings reveal household routines. Your Kindle data reveals intellectual interests and reading habits. Your Ring footage reveals who visits your home and when. No single data point is alarming in isolation. In aggregate, the profile is remarkably detailed.
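A small sketch can make the aggregation point concrete. It assumes nothing about how Amazon stores or joins its data. The source names, timestamps, and event descriptions are invented for illustration, and the only technique shown is merging separately collected event logs into one chronological timeline.

```python
from datetime import datetime


def merge_timelines(sources):
    """Merge per-service event lists into one chronologically sorted timeline.

    `sources` maps a service name to a list of (ISO timestamp, detail) pairs.
    """
    events = []
    for source, records in sources.items():
        for timestamp, detail in records:
            events.append((datetime.fromisoformat(timestamp), source, detail))
    return sorted(events)


# Invented sample data: each list looks unremarkable on its own.
sample = {
    "purchases": [("2024-03-02T09:15:00", "ordered infant car seat")],
    "voice": [
        ("2024-03-02T07:05:00", "alarm dismissed"),
        ("2024-03-02T22:40:00", "asked for a lullaby playlist"),
    ],
    "reading": [("2024-03-02T21:10:00", "highlighted a passage on sleep training")],
    "doorbell": [("2024-03-02T18:30:00", "visitor detected")],
}

timeline = merge_timelines(sample)
for when, source, detail in timeline:
    print(when.strftime("%H:%M"), source, detail)
```

Read in order, the merged list describes a household's day and suggests a new baby, an inference that none of the four logs supports alone.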

Your Phone Carrier

Carrier data collection is less visible than platform data because carriers do not offer consumer-facing data export tools comparable to Google Takeout. However, carrier privacy policies and the investigative work of Senator Ron Wyden’s office have confirmed the following categories: cell tower location data that tracks your general position continuously while your phone is active, call metadata (who you called, when, for how long), and in some cases, browsing data for traffic that is not encrypted. The specifics vary by carrier and jurisdiction.

The Wyden investigation specifically documented cases where carriers sold precise location data to third-party brokers without meaningful user consent. T-Mobile, AT&T, and Sprint all faced scrutiny for these practices between 2018 and 2020. The FCC subsequently proposed fines, though the enforcement process moved slowly.

Data Brokers

The data broker industry operates largely outside public awareness. Companies like Babel Street, LexisNexis Risk Solutions, and Acxiom aggregate data from public records, purchase histories, location data, and dozens of other sources into comprehensive profiles that can be purchased by almost anyone willing to pay. The Electronic Frontier Foundation and Senator Wyden’s office have documented cases where broker data was precise enough to track specific individuals to specific buildings — including visits to medical facilities, places of worship, and protest sites.

A 2023 report from the data broker transparency project Databrokers.com cataloged over 4,000 data brokers operating in the United States. Most consumers have never heard of any of them. The profiles these brokers compile typically include: inferred demographics (age, income, household composition), purchase behavior aggregated across retailers, location patterns, vehicle registration data, property records, and in some cases, inferred political and religious affiliations.

Credit Bureaus

The three major credit bureaus — Equifax, Experian, and TransUnion — maintain financial behavior records that are regulated under the Fair Credit Reporting Act but broadly shared with lenders, landlords, employers (with consent), and insurers. These records include payment histories, credit utilization, address history, employer information, and account details. While regulated more tightly than data broker profiles, credit bureau data has been subject to massive breaches (Equifax, 2017, affecting approximately 147 million people) and is shared more widely than most consumers realize.

The Proportional Response

The purpose of this inventory is not alarm. It is calibration. Most people underestimate how much data they have generated through passive use over ten to fifteen years of digital life. The surprise is not that companies collect data — that much is obvious. The surprise is the cumulative volume, the cross-platform integration, and the degree to which passive behaviors (carrying a phone, browsing a website, having a friend who uses Facebook) generate data without any deliberate act of sharing.

The proportional response has three tiers. First, request your data. Google Takeout, Facebook’s “Download Your Information” tool, and Amazon’s data request process are the practical routes for exercising the access rights that GDPR and CCPA establish. The exercise itself is clarifying, because seeing the actual export is different from reading about it. Second, identify the categories that matter most to you. For most people, location data and email content represent the highest-sensitivity categories because they reveal patterns of movement and communication that are difficult to obscure retroactively. Third, begin making deliberate choices about which platforms continue to receive your behavioral surplus and which do not. This is not an all-or-nothing decision. It is a portfolio rebalancing.
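Once an export has been downloaded and unzipped, a quick inventory helps with the second tier. The following is a minimal sketch that assumes only that the export is an ordinary folder of files. It makes no assumptions about any platform's folder names or file formats, and the function names are my own.

```python
import os
from collections import defaultdict


def summarize_export(root):
    """Return {top_level_folder: (file_count, total_bytes)} for an extracted export."""
    totals = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Files sitting directly in the root are grouped under one label.
        category = "(top level)" if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            totals[category][0] += 1
            totals[category][1] += os.path.getsize(os.path.join(dirpath, name))
    return {category: tuple(counts) for category, counts in totals.items()}


def largest_first(summary):
    """Sort categories by total size, biggest first."""
    return sorted(summary.items(), key=lambda item: item[1][1], reverse=True)


def print_summary(root):
    """Print one line per top-level folder in the export."""
    for category, (count, size) in largest_first(summarize_export(root)):
        print(f"{category}: {count} files, {size / 1_000_000:.1f} MB")
```

Calling print_summary with the path to the extracted folder lists each top-level category with its file count and size. The largest folders are a reasonable place to start looking, though size is only a rough proxy for sensitivity.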

What to Watch For

The landscape of data collection changes as platforms add products and as regulatory frameworks evolve. As of early 2026, the categories described here are confirmed. Several developments bear watching.

Google’s AI integration across its product suite is expanding the categories of behavioral data generated by ordinary use. Gemini interactions, for example, generate conversational data that is qualitatively different from search queries: longer, more contextual, and more revealing of intent. The privacy implications of AI assistant data are still being defined by regulators and litigated in courts.

Meta’s investment in mixed-reality hardware (Quest headsets) introduces biometric data streams — eye tracking, hand tracking, room mapping — that did not exist in the platform era. Whether these data categories will be subject to the same behavioral surplus extraction logic is an open question, but the business model incentives are clear.

Data broker regulation remains spotty. Several US states have passed data broker registration laws, but registration is not restriction. The gap between the number of brokers operating and the number subject to meaningful regulation remains wide.

The honest assessment is this: your data portfolio is larger than you think, more distributed than you can easily track, and more valuable to its holders than it is to you. The sovereign response is not to panic about what has already been collected. It is to understand the scope, make informed decisions going forward, and focus your efforts on the categories that matter most — which is what the rest of this series addresses.


This article is part of the Surveillance Capitalism & The Proportional Response series at SovereignCML.

Related reading: What’s Documented vs. What’s Assumed, The Enforcement Gap: Laws That Exist but Don’t Protect You, The Five Things That Actually Matter
