SQL to Split CSV in Marketing Cloud: Twopir Consulting
In Salesforce Marketing Cloud, dealing with data that doesn’t conform to the desired format is a common challenge. One recurrent use case discussed on Salesforce Stack Exchange involves splitting values from a comma-delimited list into separate rows. Unfortunately, conventional SQL methods like STRING_SPLIT are not supported in Marketing Cloud, leading to a reliance on alternative solutions.
Here, we present two SQL-based solutions for splitting comma-separated values in a Data Extension field. However, it’s important to note a limitation with both methods—they rely on the CROSS APPLY function and require predicting the maximum number of comma-separated values, as dynamic looping is not supported.
Data Schema for the Use Case
Consider a simple use case where a Data Extension contains unique SubscriberKeys, and each subscriber has a list of comma-separated OrderIDs. The goal is to have each OrderID in a separate row, associated with its corresponding SubscriberKey.
SubscriberKey | OrderIDs |
---|---|
Test_1 | apples, bananas, oranges |
Test_2 | potatoes, tomatoes |
Desired Result:
SubscriberKey | OrderID |
---|---|
Test_1 | apples |
Test_1 | bananas |
Test_1 | oranges |
Test_2 | potatoes |
Test_2 | tomatoes |
Solution 1: Cross Apply with Substring
Adapted from Adam Spriggs’ SFSE answer, this solution uses the SUBSTRING function to find and split each comma-separated value. The query includes multiple SELECT statements within a UNION, and you need to predict the maximum number of values (up to 6 in this example).
select orders.SubscriberKey, orders.OrderID from ( SELECT a.SubscriberKey , SUBSTRING(a.OrderIDs, 1, ISNULL(p1 - 1, LEN(a.OrderIDs))) OrderID from AllOrders a CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs), 0)) b(p1) union SELECT a.SubscriberKey , SUBSTRING(a.OrderIDs, p1 + 1, ISNULL(p2, LEN(a.OrderIDs) + 1) - p1 - 1) OrderID from AllOrders a CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs), 0)) b(p1) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p1 + 1), 0)) c(p2) union SELECT a.SubscriberKey , SUBSTRING(a.OrderIDs, p2 + 1, ISNULL(p3, LEN(a.OrderIDs) + 1) - p2 - 1) OrderID from AllOrders a CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs), 0)) b(p1) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p1 + 1), 0)) c(p2) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p2 + 1), 0)) d(p3) union SELECT a.SubscriberKey , SUBSTRING(a.OrderIDs, p3 + 1, ISNULL(p4, LEN(a.OrderIDs) + 1) - p3 - 1) OrderID from AllOrders a CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs), 0)) b(p1) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p1 + 1), 0)) c(p2) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p2 + 1), 0)) d(p3) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p3 + 1), 0)) e(p4) union SELECT a.SubscriberKey , SUBSTRING(a.OrderIDs, p4 + 1, ISNULL(p5, LEN(a.OrderIDs) + 1) - p4 - 1) OrderID from AllOrders a CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs), 0)) b(p1) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p1 + 1), 0)) c(p2) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p2 + 1), 0)) d(p3) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p3 + 1), 0)) e(p4) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p4 + 1), 0)) f(p5) union SELECT a.SubscriberKey , SUBSTRING(a.OrderIDs, p5 + 1, ISNULL(p6, LEN(a.OrderIDs) + 1) - p5 - 1) OrderID from AllOrders a CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs), 0)) b(p1) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p1 + 1), 0)) c(p2) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p2 + 1), 0)) d(p3) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p3 + 1), 0)) e(p4) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p4 + 1), 0)) f(p5) CROSS APPLY (select NULLIF(CHARINDEX(',', a.OrderIDs, p5 + 1), 0)) g(p6) ) orders where orders.OrderID is not null
Solution 2: Cross Apply with XML
This solution, sourced from Gregory Gifford’s blog, transforms the comma-separated values into XML and then splits them into separate rows. Similar to Solution 1, you need to predict the maximum number of values (up to 9 in this example).
SELECT p.SubscriberKey , c.OrderIDs FROM [AllOrders] p CROSS APPLY ( Select CAST('<x>' + REPLACE(p.OrderIDs,',','</x><x>') + '</x>' AS XML).value('/x[1]','nvarchar(max)') as OrderIDs1 UNION ALL Select CAST('<x>' + REPLACE(p.OrderIDs,',','</x><x>') + '</x>' AS XML).value('/x[2]','nvarchar(max)') as OrderIDs2 UNION ALL Select CAST('<x>' + REPLACE(p.OrderIDs,',','</x><x>') + '</x>' AS XML).value('/x[3]','nvarchar(max)') as OrderIDs3 UNION ALL Select CAST('<x>' + REPLACE(p.OrderIDs,',','</x><x>') + '</x>' AS XML).value('/x[4]','nvarchar(max)') as OrderIDs4 UNION ALL Select CAST('<x>' + REPLACE(p.OrderIDs,',','</x><x>') + '</x>' AS XML).value('/x[5]','nvarchar(max)') as OrderIDs5 UNION ALL Select CAST('<x>' + REPLACE(p.OrderIDs,',','</x><x>') + '</x>' AS XML).value('/x[6]','nvarchar(max)') as OrderIDs6 UNION ALL Select CAST('<x>' + REPLACE(p.OrderIDs,',','</x><x>') + '</x>' AS XML).value('/x[7]','nvarchar(max)') as OrderIDs7 UNION ALL Select CAST('<x>' + REPLACE(p.OrderIDs,',','</x><x>') + '</x>' AS XML).value('/x[8]','nvarchar(max)') as OrderIDs8 UNION ALL Select CAST('<x>' + REPLACE(p.OrderIDs,',','</x><x>') + '</x>' AS XML).value('/x[9]','nvarchar(max)') as OrderIDs9 ) c (OrderIDs) WHERE ISNULL(c.OrderIDs,'') <> ''
Performance and Ease of Use
In testing both queries with a dataset of 1 million records, each containing 4 comma-delimited values in the OrderIDs field:
Cross Apply with Substring took ~2 minutes
Cross Apply with XML took ~6 minutes
The Cross Apply with Substring solution appears more performant, but the choice between the two depends on the dataset size and complexity considerations.
These SQL-based solutions offer workarounds for splitting comma-separated values in Marketing Cloud Data Extensions, addressing a common challenge faced by users.
Conclusion :
In Salesforce Marketing Cloud, splitting comma-separated values in a Data Extension can be achieved using SQL functions to parse and manipulate the data. The SQL function commonly used for this purpose is STRING_SPLIT(). This function allows you to break down a comma-separated list into individual values, enabling you to work with each element separately within your queries.