What Are Deterministic and Probabilistic IDs?

By Brian LaRue February 14, 2018


Cookies are a foundational part of digital advertising, but their application is limited in a cross-device environment. Cookies are browser-specific, they aren’t supported in OTT, and they aren’t easily ported between mobile apps. Cookies, then, are just one type of identifier among several that go into targeting users across multiple screens.

Those identifiers can be sorted into two groups: deterministic and probabilistic.

Deterministic identifiers are based on identifiable data: you know who this specific user is. These identifiers include log-ins and other registration data, and sometimes offline customer data or IDs, which is information the user and the data collector have shared with each other. It’s possible to determine with certainty that this data relates to a particular user. That raises potential privacy concerns, because deterministic data may contain personally identifiable information, so to protect the user’s privacy the ID is typically coded into a long string of digits or characters rather than kept in readable form. Regardless, every time the user logs back into the site, on any device, the publisher or platform can recognize that individual user and tailor their experience accordingly. Deterministic IDs are, by and large, made up of first-party data the publisher or platform owns. Platform- or software-based deterministic IDs include those from Facebook, Google and Twitter, as well as Apple’s IDFA and the Android ID. Prominent publisher-based deterministic IDs include those from Amazon, The Weather Company and AOL.
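As a rough illustration of how a log-in can be turned into a coded ID, here is a minimal Python sketch. It assumes the ID is derived by hashing a normalized email address with a publisher-specific salt. The function name `deterministic_id`, the salt value and the choice of SHA-256 are all assumptions made for this example, not a description of how any particular platform builds its IDs.

```python
import hashlib


def deterministic_id(email: str, salt: str = "example-publisher-salt") -> str:
    """Derive a stable, coded ID from a log-in email (hypothetical scheme)."""
    # Normalize so the same account always maps to the same input string.
    normalized = email.strip().lower()
    # Hash the salted value; the readable email never appears in the ID.
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


# The same log-in produces the same ID on any device.
phone_id = deterministic_id("Jane.Doe@example.com ")
laptop_id = deterministic_id("jane.doe@example.com")
other_id = deterministic_id("john.roe@example.com")

print(phone_id == laptop_id)  # same user recognized across devices
print(phone_id == other_id)   # a different user gets a different ID
```

Because the match depends on an exact input (the log-in), the result is either the same ID or a different one, with no scoring or guessing involved.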

Probabilistic identifiers use a wide range of signals, sometimes hundreds, across multiple channels to build user profiles by matching anonymous data points with data from known users who exhibit similar behaviors. It’s hard or impossible to say with certainty who these data points pertain to, so the match is a matter of probability (hence the name). What you do know is that the profile behaves like that of a known (deterministic) user, and you can use what you know about your known users to make assumptions or predictions about the user behind the probabilistic ID.

Probabilistic data points could pertain to browser version, ad serving processes, device type, time zone, shared IP addresses, JavaScript commands or piggybacked iFrames. These are everyday elements of the web that don’t point to any particular user, but that allow the digital ecosystem to function as it does. They normally don’t include email or registration data that might identify a specific user. Probabilistic IDs are created by algorithmically analyzing all of these regularly occurring signals to build out cross-platform user profiles. Every provider of probabilistic IDs has its own methodology. Well-reputed companies that develop probabilistic IDs may boast accuracy of 70% to 95% when their matches are measured against deterministic IDs.
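To make the idea of signal matching concrete, here is a deliberately simplified Python sketch. It assumes only four signals, hand-picked weights and a fixed threshold. The names `Signals`, `WEIGHTS`, `match_score` and `best_match`, along with every value in the example, are hypothetical. Real providers use far more signals and their own proprietary methods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    browser_version: str
    device_type: str
    time_zone: str
    ip_address: str


# Hypothetical weights (points out of 100) for each matching signal.
WEIGHTS = {"browser_version": 10, "device_type": 10, "time_zone": 20, "ip_address": 60}


def match_score(a: Signals, b: Signals) -> int:
    """Add up the weights of every signal the two observations share."""
    return sum(w for name, w in WEIGHTS.items() if getattr(a, name) == getattr(b, name))


def best_match(anonymous: Signals, known: dict, threshold: int = 70):
    """Return (profile_id, score) for the closest known profile,
    or (None, score) if no profile reaches the threshold."""
    best_id, best = None, 0
    for profile_id, signals in known.items():
        score = match_score(anonymous, signals)
        if score > best:
            best_id, best = profile_id, score
    return (best_id, best) if best >= threshold else (None, best)


known = {
    "user-a": Signals("Chrome 64", "desktop", "America/New_York", "203.0.113.7"),
    "user-b": Signals("Safari 11", "phone", "America/Los_Angeles", "198.51.100.22"),
}
# An anonymous phone that shares an IP address and time zone with user-a's desktop.
anonymous = Signals("Safari 11", "phone", "America/New_York", "203.0.113.7")

print(best_match(anonymous, known))  # ('user-a', 80)
```

The result is a score rather than a certainty: lowering the threshold links more devices but produces more false matches, which is one reason accuracy differs from provider to provider.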

Probabilistic IDs are not as precise as deterministic IDs, but most publishers (anyone short of a Facebook, a Netflix or something similarly massive) have known, deterministic data on only a limited share of their users. Plus, proprietary deterministic ID systems can’t read each other’s IDs across platforms. That can lead to, say, the same user appearing to an advertiser to be two different people, depending on whether they’re on Facebook or Amazon. Probabilistic data, then, allows for scale, and it also provides a means to map out behaviors across devices with limited or no reliance on personally identifiable information.
