What data is available?
We offer data from a wide variety of sources, which are detailed in the documentation. Much of the data is from the United States government: the Census Bureau, Office of Management and Budget, Federal Election Commission, Environmental Protection Agency, and others. Other data comes from open data projects such as OpenStreetMap and Wikidata. Some of the data has been aggregated to specific geographic areas by the original data source. Other data (from OpenStreetMap, Wikidata, various surveys, and other so-called “microdata”) has been aggregated to various geographic areas by us.
How can I evaluate FastOpenData?
The quickstart guide has the details, but in short there are three steps. First, install the Python client via pip; it includes a simple command-line utility if you prefer to use that. Second, request a free API key with a single command in the command-line utility. You will be asked only for your email address, and the utility will immediately provide you with the API key. Third, use your new API key to request data, either through the command-line utility or by writing a few lines of very simple Python code. The whole process should take no more than a couple of minutes. Nothing prevents you from using the free API key for production purposes, but we cannot guarantee any specific SLA for the free tier.
How much does FastOpenData cost?
There is a free tier that is rate-limited but has no other restrictions. To use the free tier, you are asked only for an email address when you request an API key. Paid tiers remove the rate limits and come with SLAs suitable for critical production use cases. Paid tiers are billed per request, and the price per request decreases as your request volume increases.
Are there restrictions on use of this data?
All the data can be used for any legitimate business purpose. Licenses are permissive.
So is this data about zip codes?
No. Although zip codes are commonly used as proxies for an address, they are problematic for two reasons. First, zip code boundaries are defined by the US Postal Service for the purpose of delivering mail efficiently. Because that is their only intended purpose, a single zip code can include highly heterogeneous households with different demographics and other characteristics. This makes it difficult to learn much from the population averages of a zip code. In short, information about the average household in a zip code is unreliable because the boundaries are arbitrary with respect to the people who live inside them.
Second, there are actually two types of zip codes, and zip code data often conflates them. The zip code on an address is determined by the US Postal Service, which has the authority to redefine zip code boundaries at any moment. To have more stable and reliable boundaries, agencies such as the Census Bureau use so-called “zip code tabulation areas,” or ZCTAs. These often coincide with the US Postal Service’s zip codes, but they can diverge when the Postal Service redraws its boundaries. Thus, analyses based on zip codes often take data compiled for one type of zip code and attribute it to the other.
For these reasons, FastOpenData does not use either form of zip code to aggregate any data whatsoever. The service only uses zip codes to help locate the longitude and latitude for a specific address.
What happens when I send an address to FastOpenData?
When you send an address to FastOpenData through the API, the command-line utility, or the Python client, a few things happen. First, the service takes the address, normalizes it, and finds its exact location as longitude and latitude. Second, we use the longitude and latitude to identify which geographic areas contain that specific address. These geographic areas include state, county, census tract, school district, congressional district, and others. Third, we query our database for all the data available for each of those geographic areas. Finally, we send back that data in an easy-to-parse JSON structure.
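The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FastOpenData's implementation: the function names, the rectangular "areas", the coordinates, and the data values are all made up, and a real service would use a proper geocoder and true polygon boundaries rather than bounding boxes.

```python
from __future__ import annotations

import json

# Hypothetical lookup tables standing in for a geocoder and a spatial database.
GEOCODER = {"1 MAIN ST, SPRINGFIELD": (-89.65, 39.80)}  # address -> (lon, lat)

# Each made-up area is a bounding box: (min_lon, min_lat, max_lon, max_lat).
AREAS = {
    "state:example": (-91.0, 37.0, -87.0, 42.0),
    "county:example": (-90.0, 39.5, -89.0, 40.0),
    "county:elsewhere": (-80.0, 30.0, -79.0, 31.0),
}

# Made-up data values attached to each area.
AREA_DATA = {
    "state:example": {"population": 1000},
    "county:example": {"population": 100},
    "county:elsewhere": {"population": 50},
}


def normalize(address: str) -> str:
    """Step 1a: collapse whitespace and upper-case the address."""
    return " ".join(address.split()).upper()


def geocode(normalized: str) -> tuple[float, float]:
    """Step 1b: resolve a normalized address to (longitude, latitude)."""
    return GEOCODER[normalized]


def containing_areas(lon: float, lat: float) -> list[str]:
    """Step 2: find every area whose bounding box contains the point."""
    return [
        name
        for name, (min_lon, min_lat, max_lon, max_lat) in AREAS.items()
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    ]


def lookup(address: str) -> str:
    """Steps 3 and 4: collect the data for each area and return it as JSON."""
    normalized = normalize(address)
    lon, lat = geocode(normalized)
    response = {
        "address": normalized,
        "longitude": lon,
        "latitude": lat,
        "areas": {name: AREA_DATA[name] for name in containing_areas(lon, lat)},
    }
    return json.dumps(response)


result = json.loads(lookup("  1 main st,   springfield "))
print(sorted(result["areas"]))
```

The point falls inside the example state and county boxes but not the unrelated county, so only the first two areas contribute data to the response.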
What if I have many addresses and I want data on all of them?
With FastOpenData’s API, Python client, or command-line utility, you can either request data about one specific address at a time or send any number of addresses to our batch endpoint, which is more efficient for retrieving data on a larger scale. If you use either the Python client or the command-line utility, you can specify a CSV file. The columns containing address information will be sent in batches to the batch endpoint, the responses will automatically be joined to the original table, and a new CSV file will be written. The same process is followed if you use our Python client to process a pandas DataFrame, but in this case the additional columns containing FastOpenData’s responses will be appended automatically to the original DataFrame in place.
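The batch workflow described above can be approximated as follows. This sketch does not use the real FastOpenData client or endpoint: `fake_batch_endpoint`, the `BATCH_SIZE` value, and the column names are hypothetical stand-ins. It only illustrates splitting rows into batches and joining the responses back onto the original table.

```python
import csv
import io

BATCH_SIZE = 2  # hypothetical value; the real batch size is not stated here


def fake_batch_endpoint(addresses):
    """Stand-in for the batch endpoint: one response dict per address, in order."""
    return [{"normalized_address": a.upper()} for a in addresses]


def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def enrich_csv(text, address_column):
    """Read CSV text, look up addresses in batches, and return enriched CSV text."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if not rows:
        return text  # nothing to enrich

    responses = []
    for batch in batches([row[address_column] for row in rows], BATCH_SIZE):
        responses.extend(fake_batch_endpoint(batch))

    # Join each response to the row it came from (responses keep input order).
    enriched = [{**row, **resp} for row, resp in zip(rows, responses)]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(enriched[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(enriched)
    return out.getvalue()


source = "id,address\n1,1 Main St\n2,2 Oak Ave\n3,3 Elm Rd\n"
print(enrich_csv(source, "address"))
```

With three rows and a batch size of two, the stand-in endpoint is called twice, and the output keeps every original column plus the new one.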
My tables contain proprietary data that I don’t want to send to FastOpenData. What should I do?
You don’t need to worry about that. The Python client and command-line utility only send the necessary columns that contain address information. Other data from your table(s) are never sent. The client and command-line utility do join all the data together into a single table, but this is done locally on your machine.
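One way to picture this separation is shown below. It is a simplified illustration with hypothetical names (`send_to_service`, `ADDRESS_COLUMNS`, and the sample columns and values), not the client's actual code: only the address fields are placed in the outgoing payload, and the join with the remaining columns happens locally.

```python
ADDRESS_COLUMNS = ("street", "city", "state")  # hypothetical column names

sent_payloads = []  # records everything that would leave the machine


def send_to_service(payload):
    """Stand-in for a network call; returns one made-up field per address."""
    sent_payloads.append(payload)
    return [{"county": "Example County"} for _ in payload]


def enrich(rows):
    """Send only the address columns, then join the responses locally."""
    payload = [{col: row[col] for col in ADDRESS_COLUMNS} for row in rows]
    responses = send_to_service(payload)
    return [{**row, **resp} for row, resp in zip(rows, responses)]


table = [
    {
        "customer_id": 7,
        "revenue": 1200.0,
        "street": "1 Main St",
        "city": "Springfield",
        "state": "IL",
    },
]
enriched = enrich(table)
print(sent_payloads[0])
```

The recorded payload contains only `street`, `city`, and `state`, while the enriched row still carries `customer_id` and `revenue` alongside the returned field.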
What about security? I don’t want addresses to be floating around on the FastOpenData servers.
Neither do we. FastOpenData does not store, log, or otherwise retain any of the addresses it processes. They are not even kept in memory after the request has been served. This policy has the drawback of making troubleshooting a little more difficult, but we want to avoid any unnecessary privacy or security concerns.
What about personally identifiable information (PII)?
Data from the API does not include any PII whatsoever, so you won’t have to worry about privacy when it comes to using this information. Of course, to receive the data you will have to send address information, which is considered PII, to the API. As part of the API response, this address information is normalized and sent back to you as well. But in short, you only ever receive PII that you already had.
I still have questions. How do I get in touch?
Feel free to send an email to zac@fastopendata.com.