Using Native Resolve

Welcome

This guide will walk you through how to use FullContact’s Resolve offering that is natively integrated inside of Snowflake.

If you run into any challenges working through this guide, please send an email to your FullContact customer success manager or to [email protected] for assistance.

Setup

If you haven't already, you must first install and configure the application. View the Application Setup Documentation

Step 1: Understanding how to Prepare your Data to Resolve

The Resolve stored procedure expects certain names for columns that contain a type of data. For example, a column containing the city part of an address needs to be named CITY. See below for accepted column names.

Accepted Resolve Input Column Names:

  • Email Address (plaintext or hashed) - begins with EMAIL
  • Phone Number - contains PHONE_NUMBER
  • First/Given Name - FIRST_NAME
  • Last/Family Name - LAST_NAME
  • Primary Address Info (Street and number) - ADDRESS_LINE_1
  • Secondary Address Info - ADDRESS_LINE_2
  • City - CITY
  • State - STATE or REGION
  • State Abbreviation (e.g. Colorado → CO) - REGION_CODE
  • Zip Code - ZIP_CODE or POSTAL_CODE
  • Country - COUNTRY
  • (Unique ID) - RECORD_ID

Other Considerations - Prepping your data

To resolve with multiple emails or phone numbers make sure the columns are named or remapped with a number after them. Ex: EMAIL0, EMAIL1, EMAIL2, PHONE_NUMBER0, PHONE_NUMBER1 . The order or number after the EMAIL or PHONE column name does not indicate a preference or order.

RECORD_ID Is required, and expects a column populated with unique ids for each customer record. Our application will perform a quick validation check prior to resolving your data to ensure the values are actually unique.

Multiple name values or addresses are not supported by Native Resolve.

Step 2: Prep your Data to Resolve

We have provided a stored procedure called GET_SQL_CREATE_INPUT_VIEW that will help you create a semantic view of your table and provide the grants necessary to allow the application to operate within the database you created in Step 1.

GET_SQL_GRANT_INPUT_TABLE will grant the application privileges to your input table. This function will print out a series of sql statements for you to copy, paste and execute on the worksheet.

Calling GET_SQL_GRANT_INPUT_TABLE:

CALL FC_NATIVE_APP.APP_SCHEMA.GET_SQL_GRANT_INPUT_TABLE('FULLY_QUALIFIED_INPUT_TABLE');

Where:

  • FULLY_QUALIFIED_INPUT_TABLE is the full path to your input table

Calling GET_SQL_GRANT_INPUT_TABLE with plausible inputs:

CALL FC_NATIVE_APP.APP_SCHEMA.GET_SQL_GRANT_INPUT_TABLE(
    /* input table name */
    'SAMPLE_DB.SAMPLE_SCHEMA.SAMPLE_TABLE'
 );

Here is a sample of the SQL output you could expect from calling this stored procedure:

-- Allows app to select the specified input table only
-- NOTE: grants are only on CURRENT tables. Run again for replaced or future tables.
GRANT usage on database SAMPLE_DB to APPLICATION FC_NATIVE_APP;
GRANT usage on schema SAMPLE_DB.SAMPLE_SCHEMA to APPLICATION FC_NATIVE_APP;
GRANT select on table SAMPLE_DB.SAMPLE_SCHEMA.SAMPLE_TABLE to APPLICATION FC_NATIVE_APP;

Your input table will now be available for use in the following step.

Calling GET_SQL_CREATE_INPUT_VIEW:

GET_SQL_CREATE_INPUT_VIEW reads your input table’s data, tries to determine what kind of data is in each column, and outputs a SQL view create statement with the columns it could predict. You will then copy, paste, make any necessary tweaks to the SQL and then run these statements to setup your data to resolve.

It is important to call this stored procedure to ensure that you are properly setup to Resolve your data.

CALL FC_NATIVE_APP.APP_SCHEMA.GET_SQL_CREATE_INPUT_VIEW(
'INPUT_TABLE_NAME',  	-- fully qualified input table name,  
'OUTPUT_VIEW_NAME',  	-- [OPTIONAL] fully qualified semantic input view name. If null, the default will append '_VIEW' to the input table name in the same db and schema.
'RECORD_ID');        -- name of column to treat as RECORD_ID

Where:

  • INPUT_TABLE_NAME is a string with the full path to your input table
  • OUTPUT_VIEW_NAME is an optional string with the full name of the output view, NULL to create a view with a default name in the same db and schema as INPUT_TABLE_NAME
  • RECORD_ID_COL_NAME is a string with the name of the column in your input table to consider as your internal customer/record ID

Calling GET_SQL_CREATE_INPUT_VIEW with plausible inputs:

CALL FC_NATIVE_APP.APP_SCHEMA.GET_SQL_CREATE_INPUT_VIEW(
    /* input table name */
    'TEST_INTERNAL.PUBLIC.BENCHMARK_127K', 
    /* output view name */
    'TEST_INTERNAL.PUBLIC.BENCHMARK_127K_SEMANTIC', 
    /* name of column on input table that stores your customer/record id */ 
    'UNIQUE_ID'

Here is a sample of the SQL output you would see from calling this stored procedure:

CREATE OR REPLACE VIEW TEST_INTERNAL.PUBLIC.BENCHMARK_127K_SEMANTIC AS
SELECT UNIQUE_ID AS RECORD_ID,
       CITY,
       STATE,
       ADDRESS_LINE_1,
       FIRST_NAME,
       LAST_NAME,
       EMAIL,
       ZIP_CODE_BASE        AS ZIP_CODE,
       PRIMARY_PHONE_NUMBER AS PHONE_NUMBER0,
       PREFERRED_CUSTOMER_CONTACT_INFO
       IDX,
       ADDRESS_LINE_2
FROM TEST_INTERNAL.PUBLIC.BENCHMARK_127K;

In this example most of our column names were already named using the acceptable column names described above. ROW_ID, ZIP_CODE_BASE and PRIMARY_PHONE_NUMBER were mapped to new column names that our application knows how to deal with.

The GET_SQL_CREATE_INPUT_VIEW procedure was not able to determine the type of the column: PREFERRED_CUSTOMER_CONTACT_INFO . If you leave it as an unmapped column it will not be used to resolve your data but will be still carried forward with the data produced into the resolved output set.

If you know from experience that PREFERRED_CUSTOMER_CONTACT_INFO contains PHONES of your customer you could map it by modifying that line of sql to be:

       PREFERRED_CUSTOMER_CONTACT_INFO AS PHONE_NUMBER1

Step 3: Resolving

Calling Resolve

CALL FC_NATIVE_APP.APP_SCHEMA.RESOLVE_WITH_API_KEY(
'viewName',     							-- semantic input view
'fcApiKey',    								-- api key
'customResultsTableName'      -- [OPTIONAL] output table name. If null, the output table name will be the name of your semantic input view table with `_RESOLVE_RESULTS` appended

Where:

  • viewName is a string with the fully qualified path to the table or view to use as input
  • fcApiKey is a string with your fullcontact api key
  • customResultsTableName is an optional paramater. If null, the output table name will be the name of your semantic input view table with _RESOLVE_RESULTS appended.

The results will be stored in the followingFC_NATIVE_APP_IO.RESOLVE_OUT.customResultsTableName

When this procedure succeeds, it should return a message like Success! Resolved records were saved to: FC_NATIVE_APP_IO.RESOLVE_OUT.RESOLVE_RESULTS. This message indicates that the results, which contain the original input rows and any Person Ids found, are stored in an example output table FC_NATIVE_APP_IO.RESOLVE_OUT.RESOLVE_RESULTS.

Metrics

For every successfully complete batch of records resolved through the app, we will provide a set of summary metrics about that batch in a metrics table in the application: FC_NATIVE_APP.METRICS.FC_RESOLVE_METRICS

SELECT * FROM FC_NATIVE_APP.METRICS.FC_RESOLVE_METRICS;

The following are metrics that we will provide for each run:

  • Unique (run) ID
  • Timestamp
  • Input row count
  • Opted out records - Consumers who have exercised their ‘right to delete’
  • Number of newly generated Person IDs
  • Output row count
  • Resolve rate

The resolved dataset will also have the RUN_ID column appended, so that you can key the run metrics to the resolved output if you so desire.

Closing

That is it, you are ready to start integrating FullContact’s Identity Resolution into other data processing workflows.

We’re always looking for feedback on the products we develop for customers. Please send feedback on ways to enhance this application for your usage to [email protected].