`existence` is a core attribute of Factual's data. It is a machine-learned numerical value between `0.0` and `1.0` (rounded to the nearest tenth) applied to every POI record, and it indicates how confident we are that the POI is real, open, and not a duplicate. Records that are deemed not to meet those criteria are set to `existence` equal to `0.0`. We derive these scores by training ML models on a variety of inputs, including social signals such as user check-ins and tags.
By providing a range of confidence values, we enable our partners to filter out data below a threshold that suits their particular needs. The higher the filter threshold, the more accurate and precise the data will be, but the lower the coverage will be (this tends to be the strategy for mapping and display use cases). The lower the threshold, the more comprehensive the data will be, but the more bad records it will contain (this tends to be the strategy for search and active-user use cases).
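As a minimal sketch of the threshold filtering described above, the snippet below keeps only records whose `existence` meets a chosen cutoff. The record layout (dictionaries with `name` and `existence` keys), the helper name `filter_by_existence`, and the cutoff values `0.7` and `0.2` are assumptions made for illustration; they are not recommended settings or part of any Factual API.

```python
def filter_by_existence(records, threshold):
    """Return the records whose existence score is at or above the threshold.

    Assumption: each record is a dict with an "existence" key holding a
    number between 0.0 and 1.0. A missing score is treated as 0.0.
    """
    return [r for r in records if float(r.get("existence", 0.0)) >= threshold]


# Hypothetical sample records, for illustration only.
sample_records = [
    {"name": "Cafe A", "existence": 0.9},
    {"name": "Shop B", "existence": 0.4},
    {"name": "Closed C", "existence": 0.0},
]

# Higher cutoff: fewer records, higher confidence (display-style use case).
display_ready = filter_by_existence(sample_records, 0.7)

# Lower cutoff: more records, more noise (search-style use case).
search_ready = filter_by_existence(sample_records, 0.2)
```

With this sample, the higher cutoff keeps only "Cafe A", while the lower cutoff also keeps "Shop B"; the record scored `0.0` is dropped in both cases.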
- Factual will deliver country data in its entirety to our partners (i.e., our data files will include records with `existence` of `0.0`). Therefore, it is strongly recommended that all partners use an `existence` threshold greater than zero to filter out known closed businesses, duplicates, and junky data.
- `existence` is not a probabilistic score; records with `0.9` existence will not necessarily be real, open, and non-duplicates 90% of the time.
- Every country has its own `existence` model and distribution (e.g., a given score in `US` likely does not represent the same level of confidence as the same score in another country).
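Because each country has its own model and distribution, one way to apply the notes above is to choose a cutoff per country rather than a single global one. The sketch below assumes each record carries a `country` code alongside `existence`; the field names, the `COUNTRY_THRESHOLDS` table, and every threshold value are hypothetical placeholders, not recommendations.

```python
# Hypothetical per-country cutoffs; the values are placeholders only.
COUNTRY_THRESHOLDS = {"US": 0.3, "JP": 0.5}

# Fallback for countries without a tuned cutoff. Any value above zero
# removes records that were scored 0.0.
DEFAULT_THRESHOLD = 0.1


def passes_threshold(record):
    """Check a record against the cutoff chosen for its country.

    Assumption: the record is a dict with "country" and "existence" keys.
    A missing score is treated as 0.0.
    """
    threshold = COUNTRY_THRESHOLDS.get(record.get("country"), DEFAULT_THRESHOLD)
    return float(record.get("existence", 0.0)) >= threshold


# Hypothetical sample records, for illustration only.
records = [
    {"name": "A", "country": "US", "existence": 0.4},
    {"name": "B", "country": "JP", "existence": 0.4},
    {"name": "C", "country": "FR", "existence": 0.0},
    {"name": "D", "country": "FR", "existence": 0.2},
]

kept = [r["name"] for r in records if passes_threshold(r)]
```

Here the same score of `0.4` passes under the placeholder `US` cutoff but not under the placeholder `JP` cutoff, which is the kind of per-country tuning the differing distributions may call for.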