Pre-filtering data on data source transformations
‎04-25-2013 01:50 AM
When you process data in a transform map, you are pretty much forced to go through every imported record, and that can be a lot of records. The test conditions performed on each record can also be quite complex.
I wanted a performant way of pre-filtering data from a data source in the transform map while also keeping the scripting simple.
My solution:
In the onStart script of your transform map, add filter criteria to the source object and re-run the query. You will then have a filtered source for use in the other event scripts, such as onBefore.
This can lead to substantial performance gains because you no longer have to read through all your data record by record.
Example of an onStart "filter" script:
/* Only process the current pillar */
// thisInstance is assumed to be set earlier to the environment name,
// for example from a system property read with gs.getProperty().
if (thisInstance == "dev") {
    source.addQuery('u_pillar', 'D');
}
else if (thisInstance == "test") {
    source.addQuery('u_pillar', 'T');
}
else if (thisInstance == "acc") {
    source.addQuery('u_pillar', 'A');
}
else if (thisInstance == "prod") {
    source.addQuery('u_pillar', 'P');
}
source.query(); // Re-run the query so the filter takes effect
// Now all actions in this transform map work on only a small subset of the source records 😉
// If you use multiple transform maps, you can do this in each one with its own filter.
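The if/else chain above could also be written as a lookup table, which may be easier to extend when more instances are added. This is only a sketch under assumptions: PILLAR_BY_INSTANCE, applyPillarFilter and makeFakeSource are hypothetical names, and the fake source object exists purely so the snippet can run outside ServiceNow. In a real onStart script you would pass the transform map's own source object instead.

```javascript
// Hypothetical lookup table: instance name -> pillar code (values taken from the script above)
var PILLAR_BY_INSTANCE = { dev: 'D', test: 'T', acc: 'A', prod: 'P' };

// Adds the pillar filter to `source` and re-runs the query.
// Returns the pillar code applied, or null when the instance name is unknown
// (the source is then left unfiltered, as in the if/else version).
function applyPillarFilter(source, instanceName) {
    var pillar = Object.prototype.hasOwnProperty.call(PILLAR_BY_INSTANCE, instanceName)
        ? PILLAR_BY_INSTANCE[instanceName]
        : null;
    if (pillar !== null) {
        source.addQuery('u_pillar', pillar);
    }
    source.query(); // re-run the query, as the original script does unconditionally
    return pillar;
}

// Stand-in for the transform map's `source` object (assumption: only
// addQuery() and query() are needed), so the sketch runs without ServiceNow.
function makeFakeSource() {
    return {
        queries: [],
        queried: false,
        addQuery: function (field, value) { this.queries.push([field, value]); },
        query: function () { this.queried = true; }
    };
}

var fake = makeFakeSource();
applyPillarFilter(fake, 'test');
console.log(JSON.stringify(fake.queries)); // prints [["u_pillar","T"]]
```

In an actual onStart script the last three lines would shrink to a single call such as applyPillarFilter(source, thisInstance).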
Labels: Integrations

‎04-25-2013 03:43 AM
Really cool idea. Thanks for sharing this!

‎02-03-2015 05:30 AM
I know it's an old discussion but it is really helpful.
I wanted to check the conditions for only 5k records out of 250k, and couldn't find a good way to do so. This simple solution should do the trick.
Thanks for sharing.
-Mandar

‎02-05-2015 03:15 AM
I observed something strange in the behaviour of the source object. I used addQuery() as suggested above to filter out unnecessary records.
The onBefore scripts did see only the limited set of records at first; however, once the filtered records had been processed, all the remaining records were processed as well.
Suppose I have loaded 50,000 records, and in the onStart script I add a condition to pull only 500 of them.
When I call source.getRowCount() in the onBefore script to check the returned row count, it initially logs 500 records. However, after those 500 records are processed, the system then logs the remaining 49,500 records.
Has anyone tried the above trick and experienced this odd behaviour?
Thanks,
Mandar
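
If the onStart filter does not hold for the whole run, as described above, one possible fallback is to repeat the check per row in the onBefore script and skip rows that do not match. The sketch below is not a confirmed fix for the behaviour reported here. It assumes the same u_pillar field as the original post; shouldIgnoreRow is a hypothetical helper, and the plain objects stand in for source rows so the snippet can run outside ServiceNow.

```javascript
// Returns true when the row does not belong to the wanted pillar.
// String() is used so the comparison also works when the field is an
// element object rather than a plain string (assumption about the row type).
function shouldIgnoreRow(row, wantedPillar) {
    return String(row.u_pillar) !== wantedPillar;
}

// Stand-in rows for demonstration only
var rows = [{ u_pillar: 'D' }, { u_pillar: 'T' }, { u_pillar: 'T' }];
var kept = rows.filter(function (row) {
    return !shouldIgnoreRow(row, 'T');
});
console.log(kept.length); // prints 2
```

In an onBefore script this would be used along the lines of: if (shouldIgnoreRow(source, 'T')) { ignore = true; }. Setting ignore to true there tells the transform to skip the current row. Every imported row is still visited this way, so it trades away the performance gain of the onStart filter in exchange for correctness.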