What are Source Native Key and Source Recency Timestamp in ETL when we map classes?

Hrishabh Kumar
Giga Guru

When we start to map classes in the ETL, for any class we always find two fields in the mapping, namely "Source Native Key" and "Source Recency Timestamp". I don't understand their significance.

 

Refer to the following screenshot.

In this case I am mapping to the cmdb_ci_vcenter_cluster class.

I need answers to the following:

1) Are "Source Native Key" and "Source Recency Timestamp" attributes of the "cmdb_ci_vcenter_cluster" class?

2) What is their significance?

3) Are these fields mandatory to map?

ETL source native key..PNG

5 REPLIES

Clara Lemos
Mega Sage

Hi @Hrishabh Kumar ,

 

  • Source Native Key: IRE uses this to uniquely identify a record and to build relationships and references. It also improves the performance of insert and update operations. When processing a payload, IRE generates an error if this field is empty.
  • Source Recency Timestamp: IRE uses this to identify records that are older than the current record and can therefore be ignored, which helps resolve conflicting attribute values. If a value is provided, it is used only if it is later than the value currently stored in the CMDB. If no value is provided, IRE updates the attribute with the current timestamp.
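To make the timestamp rule in the second bullet concrete, here is a minimal Python sketch of it. This is only a toy model of the documented behaviour, not ServiceNow code: the function name resolve_recency and its parameters are made up for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def resolve_recency(
    stored: Optional[datetime],
    incoming: Optional[datetime],
    now: datetime,
) -> datetime:
    """Return the timestamp that ends up stored (hypothetical helper).

    - No incoming value: fall back to the current time.
    - Incoming value later than the stored one (or nothing stored): use it.
    - Otherwise: keep the stored value.
    """
    if incoming is None:
        return now
    if stored is None or incoming > stored:
        return incoming
    return stored


# Example: a payload carrying an older timestamp does not replace the stored one.
stored = datetime(2024, 2, 1, tzinfo=timezone.utc)
incoming = datetime(2024, 1, 1, tzinfo=timezone.utc)
now = datetime(2024, 3, 1, tzinfo=timezone.utc)
kept = resolve_recency(stored, incoming, now)  # same value as `stored`
```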

You can find more details in this doc: Create an ETL transform map (servicenow.com)

 

If that helps please mark my answer as correct / helpful!
And if further help is needed please let me know

Cheers

 

 

Hi @Clara Lemos 

 

Unfortunately it's not that helpful to just quote extracts from the ServiceNow documentation.  While you might know exactly what the documentation means, other people often won't.  If you can explain in your own words and from your own experience what "Source Native Key" and "Source Recency Timestamp" mean and do, that would be really helpful.

 

On Source Native Key, my understanding is that this is normally the unique id or key for this record in the source system that's feeding into the CMDB.  When IRE creates or updates a CI for the very first time, it inserts a record in the sys_object_source table, where there is one record per CI per discovery source.  If a Source Native Key was supplied to IRE then it records that string in the ID field on the sys_object_source record.  Next time IRE is called, before checking any of the normal CI Identifiers to find an existing CI to match to, it first looks to see if there is a sys_object_source record with the Source Native Key value in the ID field, and for the correct Name (which shows the discovery source). If it finds it then that means IRE has seen that Source Native Key before and already knows the CI it's associated with.  So it can go straight to that CI, and doesn't need to bother checking any of the normal CI Identifiers.

 

But I don't understand what Source Recency Timestamp means.  The documentation says "IRE uses [it] to identify records that are older than the current record and therefore can be ignored".  How does it do that?  Is that from the last_discovered field on the CI?  Or maybe the "Last scan" field on the sys_object_source record?  Or perhaps it's to do with the "Last update" field on the cmdb_datasource_last_update record?  And why should it ignore records older than the current record?  When it says current record, does it mean the current record that has been passed into IRE for processing, or the current CI/record being matched to?

 

So if you have experience of Source Recency Timestamp and what it means, how it works and what it does, I'd be really interested to understand more about it.

 

Thanks

Michael

 

P.S. From looking at the description here of system property glide.identification_engine.skip_updating_last_scan_if_older, I've now seen that it is indeed about the last_scan field on the sys_object_source record.  It's the equivalent of the legacy last_discovered field on the overall CI, but being in sys_object_source it's recorded per-source.  With last_discovered, if a source comes in with a date that is older than the CI's last_discovered then by default IRE doesn't set discovery_source. That's because discovery_source is the source which most recently "discovered" this CI, so if a source comes in with what looks like older data, then it can't be the source that most recently discovered that CI.  Depending on the value of system property "glide.identification_engine.skip_updating_last_scan_if_older", IRE does something similar with last_scan on sys_object_source. Question then is whether last_scan affects anything else. I'd have thought it would do, but I don't know.
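As a rough illustration of the P.S., here is a Python toy model of the two timestamps. It is a sketch under assumptions, not ServiceNow code: ToyCi and apply_source are invented names, and the skip_if_older flag only mimics what the property name glide.identification_engine.skip_updating_last_scan_if_older suggests.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class ToyCi:
    discovery_source: Optional[str] = None       # source that most recently discovered the CI
    last_discovered: Optional[datetime] = None   # CI-level timestamp
    last_scan: Dict[str, datetime] = field(default_factory=dict)  # per-source timestamp


def apply_source(ci: ToyCi, source: str, seen_at: datetime, skip_if_older: bool = True) -> None:
    """Apply one incoming record from `source` stamped `seen_at` (hypothetical helper)."""
    # CI level: only data that is not older may claim discovery_source.
    if ci.last_discovered is None or seen_at >= ci.last_discovered:
        ci.discovery_source = source
        ci.last_discovered = seen_at
    # Per source: optionally skip the last_scan update when the data is older.
    previous = ci.last_scan.get(source)
    is_older = previous is not None and seen_at < previous
    if not (is_older and skip_if_older):
        ci.last_scan[source] = seen_at


jan = datetime(2024, 1, 1, tzinfo=timezone.utc)
feb = datetime(2024, 2, 1, tzinfo=timezone.utc)
ci = ToyCi()
apply_source(ci, "ServiceNow", feb)
apply_source(ci, "SCCM", jan)  # older data: discovery_source stays "ServiceNow"
```

Whether last_scan influences anything beyond this in the real engine is exactly the open question above; the sketch does not try to answer it.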

Thanks @mbourla 

That helped me a lot. 

FIn1
Tera Contributor

Source Native Key (SNK)

  • This is the unique ID for a record in the source system (the system sending data into the CMDB).

  • When IRE gets data, it saves this key in the sys_object_source table, linked to the CI and the source.

  • Next time data comes in from that source, IRE checks for this key first.

    • If it finds it, it knows exactly which CI to update, with no need to search using other identifiers.

    • If not, it uses the usual CI identifiers to try and find a match.

Source Recency Timestamp

  • This is a date/time value that tells IRE how recent the data is from the source.

  • It’s usually stored as the last_scan field on the sys_object_source record (like “last_discovered” but per source).

  • Why does it matter?

    • If new data comes in with an older timestamp than what’s already recorded, IRE ignores it.

      • This prevents “old” or outdated data from overwriting newer, more accurate information.

    • The “current record” means the data being processed right now by IRE.

  • There’s a system property (glide.identification_engine.skip_updating_last_scan_if_older) that controls whether IRE should skip updating last_scan if the incoming data is older.

Summary:

  • Source Native Key helps IRE quickly find the right CI from a specific source.

  • Source Recency Timestamp makes sure only the most up-to-date info is used, avoiding accidental overwrites with old data.