Flow Designer version of this pseudocode (GlideRecord chooseWindow)

Zack Sargent
ServiceNow Employee

Problem:

Normally, it is recommended that you do not query more than 10,000 records at a time. Therefore, when scripting against a large table, I write something like this:

 

// Process a large table in ~9,900-record windows to stay under the
// recommended 10,000-record query size.
var outerCount = 0;
var batchSize = 9900;

// The total record count was left undefined in my pseudocode; a
// GlideAggregate COUNT is one way to get it.
var agg = new GlideAggregate('some_table');
agg.addAggregate('COUNT');
agg.query();
var totalRecords = agg.next() ? parseInt(agg.getAggregate('COUNT'), 10) : 0;

// Outer loop: one windowed query per batch
while (outerCount < totalRecords) {
    var gr = new GlideRecord('some_table');
    gr.chooseWindow(outerCount, outerCount + batchSize);
    gr.query();

    // Inner loop: per-record work
    while (gr.next()) {
        // single record stuff
    }

    outerCount += gr.getRowCount(); // with chooseWindow, this is the window's row count
}

 

What if I want to do this in Flow Designer? The OOB "Look up records" action doesn't seem to support "chooseWindow." You can paginate external sources using a "data source," but that seems like the long way around. This is all on-platform, and I want 9,900-row "pages" of records to process at a time.

Question:

How can you build a Flow Designer flow (or subflow) to work like this?
Do I have to make a custom Action?

Edit:
Oh. It doesn't even look like you can make a custom action to do it, because you have to choose a table type for the records. Even though I can take a table name as an input and use it in my query, I can't match the output, because I can't define the record type of the output. Ideally, I could have Records.(table input pill) as the Type here, but no love on this, either. This is why people script everything. I'm trying to build this flow to use parallel workers as it loops through the table, so it processes everything rapidly, but it doesn't look like that's in the cards.

[screenshot: custom Action output type selection]

 


7 Replies

James Chun
Kilo Patron
(Accepted Solution)

Hi @Zack Sargent,

 

Interesting question. Are you able to use the 'Array.Object' type as the output for the custom Action?
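
If Array.Object does work as an output type, the custom Action's script step could look roughly like the sketch below. It assumes string inputs table_name, first_row, and last_row, and an output named records declared as Array.Object; all of these names are illustrative.

(function execute(inputs, outputs) {
    // Query any table by name and return a window of lightweight objects,
    // sidestepping the need to declare a fixed Record table type.
    var results = [];
    var gr = new GlideRecord(inputs.table_name);
    gr.chooseWindow(parseInt(inputs.first_row, 10), parseInt(inputs.last_row, 10));
    gr.query();
    while (gr.next()) {
        results.push({
            sys_id: gr.getUniqueValue(),
            display_value: gr.getDisplayValue()
        });
    }
    outputs.records = results; // declared as Array.Object in the Action outputs
})(inputs, outputs);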

 

On the other hand, you should be able to create something similar with little (or no) scripting in Flow:

  • Create a loop in Flow (e.g. use an If statement)
  • In the loop, use the 'Look up Records' action.
    • Order by a field (e.g. number, updated_at)
    • Return < 10,000 records

[screenshot: 'Look up Records' action configuration]

  • Once an iteration is complete, update the Flow variable that tracks the last record's sort value (e.g. the number or updated_at date/time of the last record)
  • Go back to the start of the iteration, and use the updated Flow variable within the 'Conditions' of the 'Look up Records' action (e.g. updated_at >= xxxx/xx/xx)
  • Repeat the process until all records have been iterated (see the script sketch after this list)
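
In script form, the same cursor-style pagination looks roughly like the sketch below. This is a minimal illustration, assuming sys_updated_on as the sort field and 'some_table' as a stand-in table name. Note that a strict '>' on a timestamp can skip records that share the cursor value, which is one reason a unique field makes a safer cursor.

var lastValue = '';  // Flow variable: sort value of the last record processed
var hasMore = true;

while (hasMore) {
    var gr = new GlideRecord('some_table');        // stand-in table name
    if (lastValue)
        gr.addQuery('sys_updated_on', '>', lastValue);
    gr.orderBy('sys_updated_on');                  // stable sort field
    gr.setLimit(9900);                             // stay under the 10,000-record guideline
    gr.query();

    hasMore = gr.hasNext();                        // an empty window means we are done
    while (gr.next()) {
        // per-record work goes here
        lastValue = gr.getValue('sys_updated_on'); // advance the cursor
    }
}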

There may be a maximum number of actions that can be executed within a single flow instance; if so, this Flow could time out if the data set is very large.

 

Do you have a use case for this? I'm interested to know how you were planning to use it.

 

Thanks

 

Thanks for your reply. I think I will use the "order by" & "use last record" version of this. I'll probably just use sys_id since there are no better sortable fields, like "serial number."
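
With sys_id as the cursor, the only changes to the sketch above are the query condition and the sort field; a strict '>' is safe here because sys_id values are unique. Again, just a sketch (table name taken from the use case below):

var lastSysId = '';                            // Flow variable: last sys_id processed
var gr = new GlideRecord('cmdb_ci_vm_instance');
if (lastSysId)
    gr.addQuery('sys_id', '>', lastSysId);     // strict '>' is safe: sys_id is unique
gr.orderBy('sys_id');
gr.setLimit(9900);
gr.query();
while (gr.next()) {
    // per-record checks go here
    lastSysId = gr.getUniqueValue();           // advance the cursor
}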

 

The application is what I call "VM Reconciliation." Our [cmdb_ci_vm_instance] table is massive. However, not every VM has a proper owner, an easily associated Cloud Service Account, or other such properties. What if an employee leaves the company? All of these little "checks" can be run in parallel for each VM. However, because the table is approaching seven digits, we can't just run it all in one big batch.

 

Another idea I had tonight: instead of 9,900-record batches, I could run something like 10-record batches and parallelize each member of the batch as well. Since we have ~15 worker nodes, this should spread the workload around more readily than my old 500-line script, which is single-threaded. Ideally, this should result in much faster overall execution ... and a much more understandable process left behind for whoever ends up looking at this workflow in a couple of years. 🙂

 

Current draft (using probably unnecessary custom action):

[screenshot: current flow draft]

 

From your use case, it looks like you want to monitor and maintain the data quality/health of those CIs. If so, have you looked into CMDB Health or Data Certification? I don't see a need to create a custom Flow, and I "assume" these OOTB capabilities are designed to handle large amounts of data.

The full answer to that question probably wouldn't be appreciated if I were to state it publicly. 😉

 

We have many sources for the data, and sometimes things are missed. Data Certification and Attestation are definitely on the menu once this process is in place, as are some workflows to handle ownership of the VMs through the Service Catalog (i.e., self-service requests for ownership, or pushing ownership to another employee). However, there are many corner cases that slip through, not to mention the need to "seed" the first set of owners (which is a little complicated). All of that will be handled by this workflow, as well as a few other odds and ends.