existence is a particularly special core attribute within Factual’s data. It is a machine-learned numerical value of up to 1.0 (rounded to the nearest tenth) applied to every POI record, and it indicates how confident we are that the POI is real, open, and not a duplicate. Records that are deemed to be bad are set to 0.0. We derive these scores by training ML models on a variety of signals, including social signals such as user check-ins and tags.
By providing a range of confidence values, we enable our partners to filter data below a threshold to suit their particular needs. The higher the threshold, the more accurate and precise the data will be, but coverage will be lower (this tends to be the strategy for mapping and display use cases). The lower the threshold, the more comprehensive the data will be, but you will have a higher quantity of bad records (this tends to be the strategy for search and active-user use cases).
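The threshold trade-off described above can be sketched in a few lines of Python. This is only an illustration, not Factual's tooling: the record layout (plain dictionaries), the sample values, and the helper name filter_by_existence are all assumptions, as is the choice to keep records whose score is at or above the threshold.

```python
def filter_by_existence(records, threshold):
    """Keep records whose existence score is at or above the threshold.

    Hypothetical helper: assumes each record is a dict with a numeric
    "existence" key, which is an assumption about how the data is loaded.
    """
    return [r for r in records if r["existence"] >= threshold]


# Made-up sample records, for illustration only.
records = [
    {"name": "Cafe A", "existence": 1.0},
    {"name": "Shop B", "existence": 0.7},
    {"name": "Bar C", "existence": 0.3},
    {"name": "Closed D", "existence": 0.0},
]

# Higher threshold: fewer records, higher confidence (e.g. mapping and display).
strict = filter_by_existence(records, 0.7)

# Lower threshold: more coverage, more bad records (e.g. search).
lenient = filter_by_existence(records, 0.1)

print([r["name"] for r in strict])
print([r["name"] for r in lenient])
```

In this sketch the stricter threshold keeps two of the four sample records, while the lenient one keeps three and drops only the record scored 0.0.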
Because records deemed to be bad are set to 0.0, it is strongly recommended that all partners use an existence threshold greater than 0.0 to filter out known closed businesses, duplicates, and junky data.
existence is not a probabilistic score; records with 0.9 existence will not necessarily be real, open, and non-duplicates 90% of the time.
existence model and distribution (e.g., a
US likely is not the same level of confidence as a