Using Native Resolve
Welcome
This guide will walk you through how to use FullContact’s Resolve offering that is natively integrated inside of Snowflake.
If you run into any challenges working through this guide, please send an email to your FullContact customer success manager or to [email protected] for assistance.
Setup
If you haven't already, you must first install and configure the application. View the Application Setup Documentation
Step 1: Understanding how to Prepare your Data to Resolve
The Resolve stored procedure expects certain names for columns that contain a type of data. For example, a column containing the city part of an address needs to be named CITY. See below for accepted column names.
Accepted Resolve Input Column Names:
- Email Address (plaintext or hashed) - begins with
EMAIL
- Phone Number - contains
PHONE_NUMBER
- First/Given Name -
FIRST_NAME
- Last/Family Name -
LAST_NAME
- Primary Address Info (Street and number) -
ADDRESS_LINE_1
- Secondary Address Info -
ADDRESS_LINE_2
- City -
CITY
- State -
STATE
orREGION
- State Abbreviation (e.g. Colorado → CO) -
REGION_CODE
- Zip Code -
ZIP_CODE
orPOSTAL_CODE
- Country -
COUNTRY
- (Unique ID) -
RECORD_ID
Other Considerations - Prepping your data
To resolve with multiple emails or phone numbers make sure the columns are named or remapped with a number after them. Ex: EMAIL0
, EMAIL1
, EMAIL2
, PHONE_NUMBER0
, PHONE_NUMBER1
. The order or number after the EMAIL or PHONE column name does not indicate a preference or order.
RECORD_ID
Is required, and expects a column populated with unique ids for each customer record. Our application will perform a quick validation check prior to resolving your data to ensure the values are actually unique.
Multiple name values or addresses are not supported by Native Resolve.
Step 2: Prep your Data to Resolve
We have provided a stored procedure called GET_SQL_CREATE_INPUT_VIEW
that will help you create a semantic view of your table and provide the grants necessary to allow the application to operate within the database you created in Step 1.
GET_SQL_GRANT_INPUT_TABLE
will grant the application privileges to your input table. This function will print out a series of sql statements for you to copy, paste and execute on the worksheet.
Calling GET_SQL_GRANT_INPUT_TABLE
:
GET_SQL_GRANT_INPUT_TABLE
:CALL FC_NATIVE_APP.APP_SCHEMA.GET_SQL_GRANT_INPUT_TABLE('FULLY_QUALIFIED_INPUT_TABLE');
Where:
FULLY_QUALIFIED_INPUT_TABLE
is the full path to your input table
Calling GET_SQL_GRANT_INPUT_TABLE
with plausible inputs:
CALL FC_NATIVE_APP.APP_SCHEMA.GET_SQL_GRANT_INPUT_TABLE(
/* input table name */
'SAMPLE_DB.SAMPLE_SCHEMA.SAMPLE_TABLE'
);
Here is a sample of the SQL output you could expect from calling this stored procedure:
-- Allows app to select the specified input table only
-- NOTE: grants are only on CURRENT tables. Run again for replaced or future tables.
GRANT usage on database SAMPLE_DB to APPLICATION FC_NATIVE_APP;
GRANT usage on schema SAMPLE_DB.SAMPLE_SCHEMA to APPLICATION FC_NATIVE_APP;
GRANT select on table SAMPLE_DB.SAMPLE_SCHEMA.SAMPLE_TABLE to APPLICATION FC_NATIVE_APP;
Your input table will now be available for use in the following step.
Calling GET_SQL_CREATE_INPUT_VIEW
:
GET_SQL_CREATE_INPUT_VIEW
:GET_SQL_CREATE_INPUT_VIEW
reads your input table’s data, tries to determine what kind of data is in each column, and outputs a SQL view create statement with the columns it could predict. You will then copy, paste, make any necessary tweaks to the SQL and then run these statements to setup your data to resolve.
It is important to call this stored procedure to ensure that you are properly setup to Resolve your data.
CALL FC_NATIVE_APP.APP_SCHEMA.GET_SQL_CREATE_INPUT_VIEW(
'INPUT_TABLE_NAME', -- fully qualified input table name,
'OUTPUT_VIEW_NAME', -- [OPTIONAL] fully qualified semantic input view name. If null, the default will append '_VIEW' to the input table name in the same db and schema.
'RECORD_ID'); -- name of column to treat as RECORD_ID
Where:
INPUT_TABLE_NAME
is a string with the full path to your input tableOUTPUT_VIEW_NAME
is an optional string with the full name of the output view, NULL to create a view with a default name in the same db and schema as INPUT_TABLE_NAMERECORD_ID_COL_NAME
is a string with the name of the column in your input table to consider as your internal customer/record ID
Calling GET_SQL_CREATE_INPUT_VIEW
with plausible inputs:
CALL FC_NATIVE_APP.APP_SCHEMA.GET_SQL_CREATE_INPUT_VIEW(
/* input table name */
'TEST_INTERNAL.PUBLIC.BENCHMARK_127K',
/* output view name */
'TEST_INTERNAL.PUBLIC.BENCHMARK_127K_SEMANTIC',
/* name of column on input table that stores your customer/record id */
'UNIQUE_ID'
Here is a sample of the SQL output you would see from calling this stored procedure:
CREATE OR REPLACE VIEW TEST_INTERNAL.PUBLIC.BENCHMARK_127K_SEMANTIC AS
SELECT UNIQUE_ID AS RECORD_ID,
CITY,
STATE,
ADDRESS_LINE_1,
FIRST_NAME,
LAST_NAME,
EMAIL,
ZIP_CODE_BASE AS ZIP_CODE,
PRIMARY_PHONE_NUMBER AS PHONE_NUMBER0,
PREFERRED_CUSTOMER_CONTACT_INFO
IDX,
ADDRESS_LINE_2
FROM TEST_INTERNAL.PUBLIC.BENCHMARK_127K;
In this example most of our column names were already named using the acceptable column names described above. ROW_ID
, ZIP_CODE_BASE
and PRIMARY_PHONE_NUMBER
were mapped to new column names that our application knows how to deal with.
The GET_SQL_CREATE_INPUT_VIEW
procedure was not able to determine the type of the column: PREFERRED_CUSTOMER_CONTACT_INFO
. If you leave it as an unmapped column it will not be used to resolve your data but will be still carried forward with the data produced into the resolved output set.
If you know from experience that PREFERRED_CUSTOMER_CONTACT_INFO
contains PHONES of your customer you could map it by modifying that line of sql to be:
PREFERRED_CUSTOMER_CONTACT_INFO AS PHONE_NUMBER1
Step 3: Resolving
Calling Resolve
CALL FC_NATIVE_APP.APP_SCHEMA.RESOLVE_WITH_API_KEY(
'viewName', -- semantic input view
'fcApiKey', -- api key
'customResultsTableName' -- [OPTIONAL] output table name. If null, the output table name will be the name of your semantic input view table with `_RESOLVE_RESULTS` appended
Where:
viewName
is a string with the fully qualified path to the table or view to use as inputfcApiKey
is a string with your fullcontact api keycustomResultsTableName
is an optional paramater. If null, the output table name will be the name of your semantic input view table with_RESOLVE_RESULTS
appended.
The results will be stored in the followingFC_NATIVE_APP_IO.RESOLVE_OUT.customResultsTableName
When this procedure succeeds, it should return a message like Success! Resolved records were saved to: FC_NATIVE_APP_IO.RESOLVE_OUT.RESOLVE_RESULTS.
This message indicates that the results, which contain the original input rows and any Person Ids found, are stored in an example output table FC_NATIVE_APP_IO.RESOLVE_OUT.RESOLVE_RESULTS.
Metrics
For every successfully complete batch of records resolved through the app, we will provide a set of summary metrics about that batch in a metrics table in the application: FC_NATIVE_APP.METRICS.FC_RESOLVE_METRICS
SELECT * FROM FC_NATIVE_APP.METRICS.FC_RESOLVE_METRICS;
The following are metrics that we will provide for each run:
- Unique (run) ID
- Timestamp
- Input row count
- Opted out records - Consumers who have exercised their ‘right to delete’
- Number of newly generated Person IDs
- Output row count
- Resolve rate
The resolved dataset will also have the RUN_ID
column appended, so that you can key the run metrics to the resolved output if you so desire.
Closing
That is it, you are ready to start integrating FullContact’s Identity Resolution into other data processing workflows.
We’re always looking for feedback on the products we develop for customers. Please send feedback on ways to enhance this application for your usage to [email protected].
Updated over 1 year ago