Understanding the Field Usage of Any Object in Salesforce

Sebastian Undurraga · December 14, 2021

Originally published in 2021. Full source on GitHub (linked at the end).

One of the biggest problems I've tackled while working with Salesforce is understanding and evaluating the field usage of a custom object. This application does the work for you, generating a CSV/Excel file with the date of the last record that used each field, and the percentage of use across all of them.

To make this app work, you'll need System Administrator credentials to log into Salesforce. This version runs in the Spyder IDE (part of Anaconda).

Let's understand how it works.

Dependencies

First, the dependencies. We'll use Pandas, datetime, and Simple Salesforce.

from simple_salesforce import Salesforce
import pandas as pd
import datetime

Credentials

Next, we connect to Salesforce with Simple Salesforce:

sf = Salesforce(password='password',
                username='username',
                organizationId='organizationId')

Your organizationId should look like 00JH0000000tYml. To find it (Lightning Experience):

Log into Salesforce with your System Administrator credentials
Click the gear icon → Setup
In the Quick Find box (on the left), type Company Information and open it
Look for Salesforce.com Organization ID, it will look like 00JH0000000tYml

The Object

Now plug in the object name, the API Name of the object. If it's a custom object, it usually ends in __c. To find the API name:

Log into Salesforce with your System Administrator credentials
Click the gear icon → Setup
Click Object Manager in the header
Find your object and copy the API Name shown next to it

This part of the code uses the object name to bring back all the fields:

object_to_evaluate = "object"
object_fields = getattr(sf, object_to_evaluate).describe()

The Date

This part is important and worth thinking about. By default, the code looks at the last year of data. That window matters: if you released a new field a week ago, the report will show it was used recently, but the usage rate will be low, around 2% (7/365). To change the window, just replace 365 with the number of days you want.

last_year = (datetime.datetime.now() + datetime.timedelta(days=-365)).strftime("%Y-%m-%d"+"T"+"%H:%M:%S"+"Z")

The Result

Now we iterate over all the fields and get the created date of the last record that used each field, along with the number of records that used it during the period (one year).

for field in object_fields['fields']:
    print(field['name'])
    try:
        field_detail = pd.DataFrame(
            sf.query("SELECT Id, createddate, SystemModStamp \
                      FROM {} \
                      WHERE createddate > {} \
                        AND {} != null \
                      ORDER BY Id DESC \
                      LIMIT 1".format(object_to_evaluate, last_year, field['name'])
                      )['records'])

        field_detail['Field Name'] = field['name']
        field_detail['Field Label'] = field['label']
        field_detail['Found?'] = 'Yes'

        field_quantity = pd.DataFrame(
            sf.query("SELECT count(Id) \
                    FROM {} \
                    WHERE createddate > {} \
                    AND {} != null".format(object_to_evaluate, last_year, field['name'])
                    ))['records'][0]['expr0']

        field_detail['Quantity'] = field_quantity
        data.append(field_detail)

        if field_detail.empty:
            error_data = {'Field Name': [field['name']],
                          'Field Label': [field['label']],
                          'Found?': ['Yes, no data']}
            data.append(pd.DataFrame(error_data))
    except:
        error_data = {'Field Name': [field['name']],
                      'Field Label': [field['label']],
                      'Found?': ['No']}
        data.append(pd.DataFrame(error_data))

# Concatenate the list of results into one dataframe
data_to_csv = pd.concat(data, ignore_index=True)

Some Formatting

Formatting is a nice-to-have for understanding the result, especially if you're going to share the insights. We rename some columns, format the date columns so CSV/Excel can read them, and add a % of use column.

data_to_csv.rename(columns={'CreatedDate': 'Created Date', 'SystemModstamp': 'Modified Date'}, inplace=True)
data_to_csv['Created Date'] = pd.to_datetime(data_to_csv['Created Date']).dt.date
data_to_csv['Modified Date'] = pd.to_datetime(data_to_csv['Modified Date']).dt.date
data_to_csv = data_to_csv.drop('attributes', axis=1)
max_value = data_to_csv['Quantity'].max()
data_to_csv['% of use'] = data_to_csv['Quantity'] / max_value

The Files

Finally, we export to CSV and Excel so you can choose whichever you prefer. The files are stored in the same folder as the app.

data_to_csv.to_csv('last Field Usage Date.csv')
data_to_csv.to_excel('last Field Usage Date.xlsx', float_format="%.3f")

The Complete Code

from simple_salesforce import Salesforce
import pandas as pd
import datetime

# Connection to Salesforce
sf = Salesforce(password='password',
                username='username',
                organizationId='organizationId')

# Change the name to the object you want to evaluate.
# If it's a custom object, remember to end it with __c
object_to_evaluate = "object"

# Get all the fields from the object
object_fields = getattr(sf, object_to_evaluate).describe()

# Empty list to append the information
data = []

# Date variable defining how far back we want to get data
last_year = (datetime.datetime.now() + datetime.timedelta(days=-365)).strftime("%Y-%m-%d"+"T"+"%H:%M:%S"+"Z")

# Iterate over the fields and bring the last record's created date where the field wasn't empty.
# If the record is not found, store it in the CSV/Excel file as not found.
for field in object_fields['fields']:
    print(field['name'])
    try:
        field_detail = pd.DataFrame(
            sf.query("SELECT Id, createddate, SystemModStamp \
                      FROM {} \
                      WHERE createddate > {} \
                        AND {} != null \
                      ORDER BY Id DESC \
                      LIMIT 1".format(object_to_evaluate, last_year, field['name'])
                      )['records'])

        field_detail['Field Name'] = field['name']
        field_detail['Field Label'] = field['label']
        field_detail['Found?'] = 'Yes'

        field_quantity = pd.DataFrame(
            sf.query("SELECT count(Id) \
                    FROM {} \
                    WHERE createddate > {} \
                    AND {} != null".format(object_to_evaluate, last_year, field['name'])
                    ))['records'][0]['expr0']

        field_detail['Quantity'] = field_quantity
        data.append(field_detail)

        if field_detail.empty:
            error_data = {'Field Name': [field['name']],
                          'Field Label': [field['label']],
                          'Found?': ['Yes, no data']}
            data.append(pd.DataFrame(error_data))
    except:
        error_data = {'Field Name': [field['name']],
                      'Field Label': [field['label']],
                      'Found?': ['No']}
        data.append(pd.DataFrame(error_data))

# Concatenate the list of results into one dataframe
data_to_csv = pd.concat(data, ignore_index=True)

# Format the CSV/Excel report
data_to_csv.rename(columns={'CreatedDate': 'Created Date', 'SystemModstamp': 'Modified Date'}, inplace=True)
data_to_csv['Created Date'] = pd.to_datetime(data_to_csv['Created Date']).dt.date
data_to_csv['Modified Date'] = pd.to_datetime(data_to_csv['Modified Date']).dt.date
data_to_csv = data_to_csv.drop('attributes', axis=1)
max_value = data_to_csv['Quantity'].max()
data_to_csv['% of use'] = data_to_csv['Quantity'] / max_value

# Export the data to CSV/Excel
data_to_csv.to_csv('last Field Usage Date.csv')
data_to_csv.to_excel('last Field Usage Date.xlsx', float_format="%.3f")

I hope it helps!

The full source code is on GitHub: github.com/undu82/Salesforce_Integrations.

Sebastian Undurraga is a Senior Technical Program Manager working on enterprise AI deployment. This tool and write-up were originally published in 2021, during his earlier work on process automation and analytics. He writes at unduslabs.com.

arrow_backAll writing