Getting started
POST /v1/places/match
The Match API v1 uses fuzzy logic with various sets of rules to match your records and append data to them.
The difference between Search API and Match API is that Search API returns exact results from filters, whereas Match API matches your entities to the Data Axle database.
curl https://api.data-axle.com/v1/places/match -d '{
"identifiers": {
"name": "Best Business Ever",
"street": "123 Main St",
"city": "Seattle",
"state": "WA",
"postal_code": "98103"
}
}'
The response includes a persistent infogroup_id along with appended attributes:
{
"document": {
"infogroup_id": "939010853",
"rules": ["address"],
"score": 0.90,
"attributes": {
"name": "Best Business Ever",
"street": "123 Main St",
"city": "Seattle",
"state": "WA",
"postal_code": "98103",
...
}
}
}
- GET or POST can be used.
- By default, one match is returned. Set the limit option to retrieve multiple matches.
- All available fields are included in the output. This can be changed with the fields option.
- If a match is not found, a null document is returned.
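As a minimal sketch of the notes above, the following Python helpers build a request body and read the result. The helper names build_match_request and extract_match are hypothetical and not part of the API. Only the identifiers, limit, and fields options and the null document behavior come from this page. Sending the request (URL, authentication, HTTP client) is deliberately left out.

```python
import json


def build_match_request(identifiers, limit=None, fields=None):
    """Build the JSON body for a single match request.

    Hypothetical helper: only the option names (identifiers, limit,
    fields) are taken from the documentation above.
    """
    body = {"identifiers": identifiers}
    if limit is not None:
        body["limit"] = limit
    if fields is not None:
        body["fields"] = fields
    return json.dumps(body)


def extract_match(response_text):
    """Return the matched document, or None when no match was found.

    The API returns a null document for "no match", which json.loads
    turns into None.
    """
    return json.loads(response_text).get("document")
```

Calling extract_match on the body '{"document": null}' yields None, so callers can test for a missing match with a plain "is None" check.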
Parameters
Parameter | Description | Default |
identifiers | The set of fields used to find a match for your record. | |
required_rule_groups | The set of rules the result must match. | |
match_rule_groups_exactly | How required_rule_groups matches rules. | false |
filter | Reduce potential results with the Filter DSL. | |
fields | The fields returned with the matched document. | Any Data Dictionary fields |
include_labels | Get the labels for encoded fields. | false |
limit | Return multiple results up to the given limit. Max of 400 results. | 1 |
minimum_match_score | Exclude results below a certain match score. | |
packages | Select packages of fields returned in the results. | |
Identifiers
The fields you can submit for matching are:
Field | Description | Example |
reference_id | Optional field used to reference IDs from your system. | 1cd345b729, Store #12567 |
infogroup_id | The unique identifier for a Place. | 123456789 |
name | The place's common, recognized name, or "doing business as" name. | Starbucks |
street | The location address of the place. | 301 W Main Street |
suite | The unit or apartment number. | 13B |
city | The city name for the location. | Seattle |
state | The state or province of the place. | WA, BC, ... |
postal_code | The postal code for the location address. | 98134, 98134-1436, V2Y1P3, ... |
country_code | The country code of the location. | US |
mailing_address | The PO Box or RR Box of the place. | PO Box 123 |
mailing_address_city | The town or municipality where the place receives mail. | Seattle |
mailing_address_state | The state or province of the mailing address. | WA, BC, ... |
mailing_address_postal_code | The postal code for the mailing address. | 98134, 98134-1436, V2Y1P3, ... |
landmark_address | The name of the complex, building, or mall where the place resides. | University Mall, JFK International Airport, ... |
phone | The telephone number for the place. | (206) 555-1212, 2065551212, ... |
website | The primary homepage URL of the business. | https://starbucks.com/seattle, starbucks.com, ... |
email | The email address of the contact. | someone@example.com, example.com, ... |
email_md5 | MD5 hash of the email address. | 16d113840f999444259f73bac9ab8b10 |
email_sha256 | SHA-256 hash of the email address. | 72497f475e4f76d0b28f57c73a084ece576... |
contact_first_name | The first name of the individual. | John, Mary Ann, ... |
contact_last_name | The last name of the individual. | Smith |
See the rule descriptions for which fields are required for each rule to match.
reference_id is not used for matching, but is an optional field that can be used to reference requests using an ID from your own system. It will be returned with the results of batch requests.
Rules
Rules are automatically selected based on the data provided.
Rule | Description | Notes |
infogroup_id | Match by infogroup_id. Use this by itself if you already have an infogroup_id and would like data appended. | high confidence |
name | Match by name. Any one of name, alternative_name, or legal_name can be provided. | |
address | Match by street and region (city, state, or postal_code) or mailing address street and region. | |
phone | Match by phone number. | high confidence |
website | Match by website. | high confidence |
email | Match by email address. Any one of email, email_md5, or email_sha256 can be provided. | high confidence |
contact | Match by contact first name and last name. | |
Records that only match a single rule will only be returned if the rule is marked as "high confidence". If only one rule was run, such as when only an address is provided, address matches will be included in the result.
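The paragraph above can be read as a small decision rule. The sketch below is one interpretation of it, not the service's actual implementation. The function name is_returned is hypothetical, and the rule name "email" is assumed from the identifier names.

```python
# Rules marked "high confidence" in the Rules table.
# Assumption: the email rule is named "email".
HIGH_CONFIDENCE_RULES = {"infogroup_id", "phone", "website", "email"}


def is_returned(matched_rules, rules_run):
    """Approximate whether a record would be returned.

    matched_rules: rules the record matched.
    rules_run: rules selected from the identifiers that were provided.
    """
    matched = set(matched_rules)
    if not matched:
        return False
    if len(matched) > 1:
        # Multi-rule matches are not restricted by the paragraph above.
        return True
    if len(set(rules_run)) == 1:
        # Only one rule was run, e.g. only an address was provided.
        return True
    # A single-rule match among several rules must be high confidence.
    return matched <= HIGH_CONFIDENCE_RULES
```

Under this reading, an address-only match is kept when address was the only rule run, but dropped when a name was also submitted and did not match.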
Score
Match scores are provided to make it easier to compare the similarity of the input to the matched record. Records are scored 0-1, with a higher score indicating greater similarity.
Different use cases may call for different score thresholds. We recommend using matches that have a score of 0.8 or above and were matched using at least two rules. Matching on multiple rules increases match quality.
To exclude low-confidence matches from your results, use the minimum_match_score parameter.
{
"minimum_match_score": 0.8
}
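The minimum_match_score parameter covers the score threshold only. The recommendation of at least two rules can also be checked on the client. The following is a minimal sketch of that check. The function name is_recommended is hypothetical, and it assumes each document carries the score and rules fields shown in the example responses.

```python
def is_recommended(document, min_score=0.8, min_rules=2):
    """Check a matched document against the recommended thresholds.

    Returns False for a null document (no match found).
    """
    if document is None:
        return False
    return document["score"] >= min_score and len(document["rules"]) >= min_rules
```

With these defaults, a document scored 0.90 that matched only the address rule is rejected, because it falls short of the two-rule recommendation.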
Required Rule Groups
Use the required_rule_groups parameter to specify groups of rules. Only results that match one or more of the groups are returned.
{
"required_rule_groups": [
["address", "name"],
["address", "phone"]
]
}
In this example, a record matches in the following cases:
- The record matches the "address" and "name" rules.
- The record matches the "address" and "phone" rules.
- The record matches the "address", "name", and "phone" rules.
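The three cases above are consistent with a subset test: a record qualifies when every rule in at least one group is among the rules it matched. The sketch below illustrates that reading. The function name matches_required_groups is hypothetical, and the service's actual implementation is not documented here.

```python
def matches_required_groups(matched_rules, required_rule_groups):
    """Return True if any required group is fully contained in matched_rules."""
    matched = set(matched_rules)
    return any(set(group) <= matched for group in required_rule_groups)
```

For the groups in the example, a record that matched only "name" and "phone" fails, because neither group is fully covered.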
Match Rule Groups Exactly
When match_rule_groups_exactly is true, a record is returned only if the set of rules it matched is exactly one of the groups specified in required_rule_groups.
{
"required_rule_groups": [
["address", "name"],
["address", "phone"]
],
"match_rule_groups_exactly": true
}
In this example, a record matches in the following cases:
- The record matches the "address" and "name" rules.
- The record matches the "address" and "phone" rules.
A record that matches the rule set ["address", "name", "phone"] is not returned.
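Exact matching can be pictured as set equality rather than a subset test. The sketch below shows that interpretation. The function name matches_groups_exactly is hypothetical, and treating the order of rules within a group as irrelevant is an assumption.

```python
def matches_groups_exactly(matched_rules, required_rule_groups):
    """Return True if matched_rules equals one of the required groups.

    Assumption: rule order within a group does not matter.
    """
    matched = set(matched_rules)
    return any(set(group) == matched for group in required_rule_groups)
```

A record that matched all three rules no longer qualifies, since its rule set equals neither group.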
Filters
The filter parameter reduces results to records matching specified criteria, using the Filter DSL.
Fields
By default, all fields in the Data Dictionary are included in the output. Use the fields parameter to reduce the number of elements returned:
{
"fields": ["name", "street", "city", "state", "postal_code"]
}
Some contact information, including email addresses, will be used for matching, but will not be included with the matched record. Some data may be suppressed for certain records and will not be returned.
Include Labels
The fields returned within records frequently contain encoded values that reference lookup data. To retrieve the labels for lookups, add the include_labels option:
{
"include_labels": true
}
Read the Lookups API documentation for more information.
Limit
By default, one match is returned. Specify a limit parameter to retrieve multiple matches. When limit is specified, an array of documents is returned.
{
"identifiers": {
"name": "Best Business Ever",
"street": "Main St",
"city": "Seattle",
"state": "WA"
},
"limit": 3
}
{
"documents": [
{
"infogroup_id": "243185311",
"rules": [
"name",
"address"
],
"score": 0.85,
"attributes": {
"city": "Seattle",
"name": "Best Business Ever",
"state": "WA",
"street": "1544 Main St"
}
},
{
"infogroup_id": "243185311",
"rules": [
"name",
"address"
],
"score": 0.82,
"attributes": {
"city": "Seattle",
"name": "Best Business Forever",
"state": "WA",
"street": "15 Main Street"
}
},
{
"infogroup_id": "243185311",
"rules": [
"name",
"address"
],
"score": 0.80,
"attributes": {
"city": "Seattle",
"name": "Best Business Forever-Forever",
"state": "WA",
"street": "544 Main St"
}
}
]
}
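Because the response shape changes with limit (a single document versus a documents array), client code may want to normalize both into a list. The sketch below is one way to do that. The function name documents_from_response is hypothetical, and the input is assumed to be the response already parsed into a dict.

```python
def documents_from_response(response):
    """Normalize a parsed match response into a list of documents.

    Handles both shapes: {"document": ...} when no limit is given and
    {"documents": [...]} when a limit is given. A null document or a
    null/empty documents value becomes an empty list.
    """
    if "documents" in response:
        return response["documents"] or []
    document = response.get("document")
    return [] if document is None else [document]
```

The returned list can then be processed the same way regardless of whether limit was set, for example by iterating over it and reading each score.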
Packages
Select packages of fields by providing a package param_key to the packages parameter. By default, every field in a package is returned. The packages parameter can be combined with the fields parameter to return only specified fields.
{
"packages": ["base_v1", "contacts_v1"]
}
Bulk Match
Use bulk requests to process large volumes of match requests in a batch:
- Create a batch
- Add match requests
- Retrieve results
Create a Batch
POST /v1/places/match/batch
Start by creating a batch. The initial request can include up to 1,000 match requests:
curl -XPOST https://api.data-axle.com/v1/places/match/batch -d '{
"identifiers": [
{
"reference_id": "123",
"name": "Best Business Ever",
"street": "123 Main St",
"city": "Seattle",
"state": "WA",
"postal_code": "98103"
},
{
"reference_id": "124",
"name": "Best Coffee Ever",
"street": "456 Central Way",
"city": "Seattle",
"state": "WA",
"postal_code": "98103"
}
]
}'
The immediate response includes a batch_id and an array of objects, each containing a match_id. The match_id values are returned in the same order as the submitted match request identifiers:
{
"batch_id": "8a140451fe3f095f1c205cf185efffec",
"matches": [
{
"match_id": "5cbb3b15c8bee0f706239b45cb763fed"
},
{
"match_id": "2e849855d32e1e20eb33e3b74a7785b2"
},
{
"match_id": "5f0b1070c3e4761cabf116bbed2b49c4"
}
]
}
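Since match_id values come back in submission order, they can be paired with your own reference_id values by position. The sketch below also splits a large list into chunks of 1,000, the limit stated for the initial request. Applying the same size to later additions is an assumption. Both function names are hypothetical.

```python
def chunk_identifiers(identifiers, size=1000):
    """Split a list of identifier sets into chunks of at most `size`."""
    return [identifiers[i:i + size] for i in range(0, len(identifiers), size)]


def pair_match_ids(identifiers, batch_response):
    """Map each match_id to the reference_id of the request at the same position.

    Relies on the documented guarantee that match_ids are returned in
    submission order. Raises ValueError if the counts differ.
    """
    matches = batch_response["matches"]
    if len(matches) != len(identifiers):
        raise ValueError("expected one match_id per submitted identifier set")
    return {
        match["match_id"]: identifier.get("reference_id")
        for match, identifier in zip(matches, identifiers)
    }
```

Keeping this mapping is optional, because reference_id is also returned with batch results, but it helps when some requests were submitted without one.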
Adding Requests
PUT /v1/places/match/batch/:batch_id
Add match requests to an existing batch_id:
curl -XPUT https://api.data-axle.com/v1/places/match/batch/:batch_id -d '{
"identifiers": [...]
}'
Getting Bulk Match Results
GET /v1/places/match/batch/:batch_id
Use the Match Results API with the batch_id to fetch completed results for the batch. Matches that are pending will not appear in the results.
Use the "status" object to determine batch progress. The batch has completed when processed is the same as requests.
{
"next_token": "13835315192676945401741312",
"status": {
"requests": 500,
"processed": 223
},
"documents": [
{
"match_id": "5cbb3b15c8bee0f706239b45cb763fed",
"reference_id": "123",
"document": {
"infogroup_id": "939010853",
"rules": ["address"],
"score": 0.90,
"attributes": {
"name": "Best Business Ever",
"street": "123 Main St",
"city": "Seattle",
"state": "WA",
"postal_code": "98103"
}
}
},
{
"match_id": "5f0b1070c3e4761cabf116bbed2b49c4",
"reference_id": "125",
"document": null
}
]
}
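A client polling for results can derive progress from the status object and separate matched from unmatched requests. The sketch below assumes the response has been parsed into a dict with the shape shown above. The function names batch_is_complete and split_results are hypothetical.

```python
def batch_is_complete(result):
    """Return True when every request in the batch has been processed."""
    status = result["status"]
    return status["processed"] == status["requests"]


def split_results(result):
    """Split result entries into (matched, unmatched) lists.

    An entry is unmatched when its "document" is null (None in Python).
    """
    matched, unmatched = [], []
    for entry in result["documents"]:
        (unmatched if entry.get("document") is None else matched).append(entry)
    return matched, unmatched
```

For the example response, batch_is_complete returns False (223 of 500 processed), and split_results places the entry with reference_id "125" in the unmatched list.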
Scrolling Through Match Results
Each request returns up to 1,000 results. To read the next set of results, use the next_token from the previous request and append it to the request URL via the since parameter:
curl "https://api.data-axle.com/v1/places/match/batch/:batch_id?since=13835315192676945401741312"
Repeat this process until an empty list of documents is returned. Store the final next_token for use in future requests from the same batch:
{
"next_token": "13835315192676945401741312",
"status": {
"requests": 500,
"processed": 500
},
"documents": []
}
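The scrolling procedure can be sketched as a loop that keeps passing next_token as since until a page comes back with no documents. To stay independent of any HTTP library, the sketch takes a caller-supplied fetch_page function. That function and the name collect_results are hypothetical. fetch_page(since) is assumed to perform the GET request and return the parsed JSON response as a dict.

```python
def collect_results(fetch_page, since=None):
    """Read all currently available batch results.

    fetch_page: callable taking the `since` token (or None for the first
    page) and returning the parsed response dict.
    Returns (documents, final_token); store final_token to resume later.
    """
    documents = []
    while True:
        page = fetch_page(since)
        # Carry the token forward; keep the old one if none is returned.
        since = page.get("next_token", since)
        if not page["documents"]:
            return documents, since
        documents.extend(page["documents"])
```

If the batch is still being processed, calling collect_results again later with the stored token should return only results that completed since the last call.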
Bulk Match Parameters
Parameter | Description | Default |
identifiers | The set of fields that are used to find a match for your record. | |
required_rule_groups | The set of rules the result must have to match. We recommend doing this either at batch creation time or result time, not both. | |
filter | Reduce potential results with a filter. | |
limit | Return multiple results up to the given limit. Max of 400 results. | 1 |
Bulk Match Result Parameters
Parameter | Description | Default |
fields | The fields that are returned with the matched document. This overrides any fields specified when the batch was created. | All fields in the Data Dictionary |
include_labels | Get the labels for encoded fields. | false |
since | The token of the earliest result you would like to receive. | |
required_rule_groups | The set of rules the result must match. We recommend doing this either at batch creation time or result time, not both. | |
match_rule_groups_exactly | How required_rule_groups matches rules. We recommend doing this either at batch creation time or result time, not both. | false |
packages | Select packages of fields returned in the results. | |
Bulk Match Result Response
The match result includes the following fields:
Field | Description |
next_token | The next token to use when requesting more results. |
documents | The documents that matched your identifiers. If there were no matches, this will be null, or an empty array if a limit is provided. |
status.requests | The count of requests in the batch. |
status.processed | The count of requests that have been processed. The batch is complete when this number equals the requests count. |
match_id | The ID of the match request. |
rules | The rule that provided the best match to your identifiers. |
score | The score of the match result. Use this to further filter your results. |
infogroup_id | The ID of the matched record, included with each match. |
attributes | The fields specified by fields in the request or those specified by your contract. |