Frequently Asked Questions


Download Data Clients


When will we receive the data files?

We will deliver the data upon notice of payment from our accounting department. We aim to deliver your files within 48 hours of receiving payment confirmation. Please note that we require payment prior to releasing the data – contract execution is not sufficient.


How do we access the data?

We provide the data via a Dropbox folder created specifically for you. You will need to give us the email address(es) that should have access to the folder and that will receive notifications from Dropbox. Factual data files are provided in compressed, tab-delimited format, and each filename includes a UNIX timestamp in milliseconds, for example:

us_places.developer_preview.1345052880000.tab.gz

The files are compressed in gzip format, which can be opened with most Windows, Linux, and Mac compression utilities.
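
If you want to know when a given file was generated, the timestamp in the filename can be decoded. The following Python sketch assumes that every filename follows the layout of the single example above (a run of digits immediately before ".tab.gz"); the function name release_time is ours, not part of any Factual tooling.

```python
import re
from datetime import datetime, timezone

# Assumed layout, inferred from the one example filename in this FAQ:
#   <name parts>.<UNIX epoch in milliseconds>.tab.gz
FILENAME_PATTERN = re.compile(r"\.(\d+)\.tab\.gz$")


def release_time(filename: str) -> datetime:
    """Return the UTC datetime encoded in a data file name."""
    match = FILENAME_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"no millisecond timestamp found in {filename!r}")
    millis = int(match.group(1))
    # The timestamp is in milliseconds, so divide by 1000 to get seconds.
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


if __name__ == "__main__":
    print(release_time("us_places.developer_preview.1345052880000.tab.gz"))
```

For the example filename this yields a time on 15 August 2012 (UTC).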


I did not receive a Dropbox notification. Why not?

It’s possible that our accounting team has not yet processed your payment. If that’s not the case, please check your spam folder to make sure the notification from Dropbox did not skip your inbox and go straight to spam.


When do you update and replace the data?

We release updated data files every 4 to 6 weeks, at which point we replace the old data files in your Dropbox folder with the latest ones.


Will you let us know each time a new data file is released?

We do not send a notification each time we release updated data files. However, you can expect new data files to be uploaded to your Dropbox folder every 4 to 6 weeks.


Do you replace the files with bulk downloads of the data, or do you replace them with diff files?

We replace the old file with a new file containing the entirety of the data. Our standard arrangement does not include diffs.


How do we get access to the new data?

When we release new data files we will upload them to your Dropbox folder.


What happens to the old files?

The previous data files will be placed into the will_remove_30_days folder until the next data file is released, at which point they will be removed from the folder entirely.


What type of files do you provide, and how would you recommend we open them?

We furnish our data in plain text format encoded with UTF-8. Fields are delimited by the Tab character.


File Size

Factual files contain millions of records. If you’re trying to use Excel, don’t. Do not load the entire file into memory, and don’t try to view it with an editor; use head, tail, sed, awk, grep, and other utilities instead. These are all available for Windows at http://gnuwin32.sourceforge.net.
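
If you prefer a script to the command-line utilities, the same "never load the whole file" rule applies. The Python sketch below streams a gzip-compressed, tab-delimited file one row at a time and returns only the first few rows, much like head. The function names are ours, and the assumption that the files contain no quoted fields (csv.QUOTE_NONE) should be checked against your own data.

```python
import csv
import gzip
from itertools import islice
from typing import Iterator, List


def iter_rows(path: str) -> Iterator[List[str]]:
    """Yield one row at a time from a gzip-compressed, tab-delimited file."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE treats quote characters as ordinary data. Whether the
        # real files use quoting is an assumption to verify.
        yield from csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def peek(path: str, count: int = 5) -> List[List[str]]:
    """Return the first `count` rows without reading the rest of the file."""
    return list(islice(iter_rows(path), count))
```

Because iter_rows is a generator, memory use stays roughly constant regardless of file size.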


How do I open a text file that is tab delimited?

You will most likely not be able to open these tab-delimited files in programs like Excel. We can recommend a number of text editors and command-line tools, such as emacs and vim. If you continue to experience issues, contact Factual and we can help.


Loading to SQL or Other System

See our code-specific examples in PHP.


Determining Field Width

Factual does not have formal specifications for field lengths; there is always the possibility that field lengths will change between versions (but rarely by much). However, we do provide a CSV analysis script that allows you to calculate value lengths for any CSV. See our code-specific examples in PHP.

If you are on Windows and don’t want to run a PHP script, you can use a third-party Windows utility that does the same thing.
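
As a rough illustration of what such an analysis involves (this is not the Factual script, just a minimal Python sketch), the function below streams a gzip-compressed, tab-delimited file and records the longest value seen in each column. It assumes the first row contains column names and that fields are not quoted; both are assumptions to verify against your file.

```python
import csv
import gzip
from typing import Dict


def max_field_lengths(path: str) -> Dict[str, int]:
    """Map each column name to the longest value length (in characters) seen."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        header = next(reader)  # assumes the first row holds column names
        widths = [0] * len(header)
        for row in reader:
            # Ignore any extra trailing fields beyond the header width.
            for index, value in enumerate(row[: len(header)]):
                widths[index] = max(widths[index], len(value))
    return dict(zip(header, widths))
```

Lengths are counted in characters, not bytes. If your database sizes columns in bytes, measure len(value.encode("utf-8")) instead, and leave some headroom since lengths can change between releases.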


How should I use the Existence field?

One of the fields we make available in a download of Factual Global Places data is “Existence”. Existence is a machine-learned score that represents the likelihood that a place currently exists. Every place in the data has an Existence score ranging from 0.0 to 1.0, where 0.0 means the place definitely does not exist, and 1.0 means the place definitely does exist. Unless you have specified a threshold for Existence, you will have access to all places from existence 0.0 to 1.0. If you’re interested in setting a threshold please contact your account manager.

Learn more about existence score here.
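
If you receive the full range of scores and want to apply your own cut-off after download, a filter along the following lines is one option. This is a sketch only: the column name "existence" and the choice to skip rows with an empty score are assumptions, and the right threshold depends on your use case.

```python
from typing import Dict, Iterable, Iterator


def filter_by_existence(
    rows: Iterable[Dict[str, str]],
    threshold: float,
    field: str = "existence",  # assumed column name
) -> Iterator[Dict[str, str]]:
    """Yield rows whose Existence score is at or above `threshold`."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    for row in rows:
        raw = row.get(field, "")
        if raw == "":
            continue  # rows without a score are skipped in this sketch
        if float(raw) >= threshold:
            yield row
```

The rows can come from csv.DictReader over the tab-delimited file, so the filter works on a stream and never needs the whole file in memory.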


Common API Questions

Alert: We are sun-setting access to the V3 APIs (Read, Schema, Facets, Submit). Learn more.


Why do I receive an HTTP 301 response?

A 301 response indicates a redirect. It means we discovered that we had two records for the same business, or an old and a new address for the same business, and opted to merge the two. When we merge two records, only one of the two Factual Ids persists. The one that goes away is set to redirect to the one that remains. If you make a direct request using the Factual Id that went away, you’ll get the 301. Note that some app servers automatically follow 301 redirects, while others throw an exception. In the latter case, you can still read the body of the Response object, which contains all of the current data for that business. The body includes the current Factual Id for that business so that you can update your records.


Why do I receive an HTTP 410 response?

You’ve requested a record we deleted. This is typically because the business was fake (spam), because the business owner requested that we remove it, or because the record was generated by some kind of technical mistake. If you get a 410, assume the record is gone for good.


Why do I receive an HTTP 503 response?

You’ve encountered an error condition that we didn’t think to handle. Since we never predicted this error, we don’t really have any more information about it. Your best bet is to report it to support.factual.com.
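
The three answers above can be summarized as a small dispatch table in client code. The Python sketch below is illustrative only: the Action names are ours, and it deliberately does not parse the response body, because the body format is not described in this FAQ.

```python
from enum import Enum


class Action(Enum):
    USE_RESPONSE = "use the returned data"
    UPDATE_ID = "read the body and switch to the surviving Factual Id"
    DROP_RECORD = "treat the record as permanently deleted"
    REPORT = "report the error to Factual support"
    OTHER = "not covered by this FAQ"


# Status codes and their meanings as described in the answers above.
_ACTIONS = {
    301: Action.UPDATE_ID,
    410: Action.DROP_RECORD,
    503: Action.REPORT,
}


def action_for_status(status: int) -> Action:
    """Suggest what a client should do for a given HTTP status code."""
    if 200 <= status < 300:
        return Action.USE_RESPONSE
    return _ACTIONS.get(status, Action.OTHER)
```

If your HTTP library follows redirects automatically you may never see the 301 itself, so it can be worth comparing the Factual Id in the response with the one you requested.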


Factual Categories


How do restaurant categories relate to the cuisine field for Restaurants?

Factual assigns categories to all businesses and places of interest using a hierarchically arranged taxonomy of over 450 categories. These are broad categories, intended to make it easy to search and filter businesses in our global place data. Among the categories, you’ll find a number of granular categories for types of restaurants, for example: Japanese, Indian, Sushi, and Steakhouses.

Factual’s restaurant data contains extended attributes that relate specifically to restaurants. Businesses contained in the restaurants table are also contained within our global place data, mapped to the exact same Factual Ids, and are assigned the same categories in both their global and extended attribute representations. However, the extended attribute table also contains a more granular field called cuisine that may list various types of cuisine available at each restaurant. Data in the cuisine field is not guaranteed to be consistent with the restaurant categorization. For example, it is valid (and likely) to find a restaurant categorized as “American” whose cuisine includes “Pizza” or “Sushi”. When you use the Factual API, the q parameter automatically searches both the category and the cuisine. If you are using a download of the data, you may want to do the same.
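
For download users, "doing the same" could look something like the sketch below, which checks a free-text term against both fields. It assumes each place has already been parsed into lists of strings under the keys category_labels and cuisine; how those fields are serialized in the download is not covered here, so treat the parsing step as yours to fill in. This is plain substring matching and does not reproduce the API's full-text search.

```python
from typing import Mapping, Sequence


def matches_term(place: Mapping[str, Sequence[str]], term: str) -> bool:
    """Return True if `term` appears in any category label or cuisine value."""
    needle = term.casefold()
    # Field names are assumed; each is expected to hold a list of strings.
    for field in ("category_labels", "cuisine"):
        for value in place.get(field, ()):
            if needle in value.casefold():
                return True
    return False


if __name__ == "__main__":
    place = {
        "category_labels": ["Social > Food and Dining > Restaurants > American"],
        "cuisine": ["Pizza", "Sushi"],
    }
    print(matches_term(place, "pizza"))  # found via the cuisine field
```

This kind of label matching is suitable for keyword search only. For filtering by category, use category_ids.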


How do I know when to use category_ids versus category_labels?

Factual uses an Id-based system to define the taxonomic relationship between categories. Businesses are assigned to any number of category_ids. Correspondingly, English-readable representations of the category, including hierarchical relationships, are provided for convenience in the category_labels field. Relying on Ids provides an abstraction layer with a number of significant advantages.

We advise that you NEVER filter data based on category_labels. When you use the Factual API to search by category_ids, Factual ensures that data assigned to subcategories of any category you search is also returned in the results. For example, a search of category id 432 (“Travel > Lodging”) would also include 433-438 (“Bed and Breakfasts”, “Cottages and Cabins”, etc.). While searching for category labels beginning with “Travel > Lodging” may, on the surface, appear to yield the same result, it makes for potentially brittle code. Factual has refined, and will continue to periodically refine, its taxonomy in an effort to improve the usability of our product. Names of categories may change, as may the hierarchical arrangement. Similarly, do not assume that ordinal relationships are permanent. For example, “category_ids between 433 and 438” is not the same as “category_ids includes 432”. If you look at the “Social > Food and Dining > Restaurants” category, you’ll see that there are late additions to the taxonomy for 458 (“Social > Food and Dining > Restaurants > Food Trucks”) and 457 (“Social > Food and Dining > Restaurants > Asian”).
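
If you work from a download and need to reproduce this subcategory behavior yourself, one approach is to walk an explicit child-to-parent map rather than relying on Id ranges. The Python sketch below uses only the relationships mentioned in this answer; the PARENT map is a hand-written excerpt for illustration, and in practice you would build it from the published category taxonomy.

```python
from typing import Dict, Iterable, Set

# Illustrative excerpt only: child id -> parent id, taken from the examples
# in this answer. Load the full, current taxonomy in real code.
PARENT: Dict[int, int] = {
    433: 432, 434: 432, 435: 432, 436: 432, 437: 432, 438: 432,
    348: 347, 349: 347, 351: 347, 457: 347, 458: 347,
}


def subtree(category_id: int, parent: Dict[int, int]) -> Set[int]:
    """Return `category_id` plus every category that descends from it."""
    members = {category_id}
    for child in parent:
        node = child
        hops = 0
        # Walk upward from each category; the hop limit guards against cycles.
        while node in parent and hops <= len(parent):
            node = parent[node]
            hops += 1
            if node == category_id:
                members.add(child)
                break
    return members


def in_category(place_ids: Iterable[int], category_id: int,
                parent: Dict[int, int]) -> bool:
    """True if a place is assigned to the category or any of its descendants."""
    return not subtree(category_id, parent).isdisjoint(place_ids)
```

With this excerpt, subtree(347, PARENT) includes 457 and 458, which a numeric range starting at 348 would miss.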


How are the categorical counts performed by Factual’s Facet API?

The facet API can return counts of places, grouped by category. When grouping individual places together in a particular category, the facet API “rolls up” places that belong to subcategories of the chosen category, provided that you count correctly using the $includes_any operator on category_ids (not category_labels).

For example, given places assigned to the category_ids below:

name              category ids [1]  category labels [2]
Juicy Joe’s       346               Social > Food and Dining > Juice Bars and Smoothies
McDermal’s        347               Social > Food and Dining > Restaurants
                  348               Social > Food and Dining > Restaurants > American
                  351               Social > Food and Dining > Restaurants > Burger
Texas Tom’s       348               Social > Food and Dining > Restaurants > American
                  349               Social > Food and Dining > Restaurants > Barbecue
The Country Club  347               Social > Food and Dining > Restaurants
                  369               Social > Country Clubs

The rolled up counts would be as follows:

category id  category name             count  places counted
308          Social                    4      All (no duplicate counts, despite multiple assignments per place to Social categories)
338          Food & Dining             4      All (no duplicate counts)
347          Restaurants               3      All but Juicy Joe’s
348          American                  2      McDermal’s & Texas Tom’s
346          Juice Bars and Smoothies  1      Juicy Joe’s
349          Barbecue                  1      Texas Tom’s
351          Burgers                   1      McDermal’s
369          Country Clubs             1      The Country Club

[1] Factual places can be assigned to multiple category Ids.

[2] Unlike category Ids, category labels show implicit categories. For example, if a place is assigned to category id 346 (Juice Bars and Smoothies), the labels will include the ancestors of that category (Social, Food and Dining).
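
The roll-up in the example above can be reproduced with a short script, which may be useful if you want similar counts from a download. This Python sketch is our own illustration of the counting rule, not a description of how the Facet API is implemented. The PARENT map is reconstructed from the category labels and ids shown in the two tables.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set

# child id -> parent id, reconstructed from the example tables above.
PARENT: Dict[int, int] = {
    338: 308, 369: 308, 346: 338, 347: 338, 348: 347, 349: 347, 351: 347,
}

PLACES: Dict[str, List[int]] = {
    "Juicy Joe's": [346],
    "McDermal's": [347, 348, 351],
    "Texas Tom's": [348, 349],
    "The Country Club": [347, 369],
}


def with_ancestors(category_ids: Iterable[int], parent: Dict[int, int]) -> Set[int]:
    """Expand a place's category ids to include every ancestor category."""
    expanded: Set[int] = set()
    for category_id in category_ids:
        node = category_id
        # Stop early once we reach a category whose ancestors are already in.
        while node is not None and node not in expanded:
            expanded.add(node)
            node = parent.get(node)
    return expanded


def rolled_up_counts(places: Dict[str, List[int]], parent: Dict[int, int]) -> Counter:
    """Count each place once per category it belongs to, directly or by roll-up."""
    counts: Counter = Counter()
    for category_ids in places.values():
        # Using a set per place prevents duplicate counts within a category.
        counts.update(with_ancestors(category_ids, parent))
    return counts
```

Calling rolled_up_counts(PLACES, PARENT) gives the same counts as the table: 4 for Social and Food & Dining, 3 for Restaurants, 2 for American, and 1 for each remaining category.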


What should I eat for dinner?

Fried chicken (also referred to as “Southern fried chicken”) is a dish consisting of chicken pieces usually from broiler chickens which have been floured or battered and then pan-fried, deep fried, or pressure fried. The breading adds a crisp coating or crust to the exterior. What separates fried chicken from other fried forms of chicken is that generally the chicken is cut at the joints and the bones and skin are left intact. Crisp well-seasoned skin, rendered of excess fat, is a hallmark of well made fried chicken. [3]

Inexplicably, Factual’s restaurant data cuisine field does NOT contain fried chicken. Therefore, diners with finer palates are advised that the best way to find fried chicken is to lean on the q parameter in the Factual API. Full-text search in the Factual API has access to hidden tags that likely mention this worldly dish, even though you won’t see any mention of it in our category or cuisine fields.

[3] https://en.wikipedia.org/wiki/Fried_chicken


Further Reading:


Schemas

Global Places
Restaurants
Hotels
Doctors


Categories

Working with categories
Category taxonomy