No internet connection
  1. Home
  2. General

Slow Iteration performance when iterating through json members with a "For each" action

By @Josh
    2022-09-28 13:47:19.163Z

    Hi All,

    I'm attempting to parse a json file with just over 100K records. Think of it as a table with 20 columns x 100,000 entries. My end goal is to parse the records and either put them in an XLSX spreadsheet or dump them to database table. This can be done in excel pretty quickly with a data import, but it makes more sense to try to move some of this lifting to think automation as we may want to eventually build more complex parsing around this process.

    So the problem I'm encountering is... I've created a "For each json member in %variableholdingjson%" that then proceeds to extract the fields I want and "update an excel spreadhseet" with the extracted fields as the final "for each" step. On a procedural level everything in this part is working as intended but VERY SLOWLY. We are talking 3 seconds per record slow; at that rate it would take the better part of a week to iterate through the whole data heap when excel could essentially do it in 10-20 seconds.

    I suspect that the part that is slowing us down is "Update Excel" step as it needs to open and write the changes to the disk VS keeping all of the processed data in memory until its ready to write it all. Unfortunately, I don't see a way to accomplish this with the tooling provided. Is there a more direct way to go about this conversion that will not take so long?

    Thanks in advance!
    Josh

    • 3 replies
    1. S
      Stephen Parker @stephenparker
        2022-09-28 14:02:10.221Z

        Hi Josh,

        Currently the for..each json member re-parses the json on each interation - which will be the cause of the slowness. Typically the for..each json member has been used for relatively small json. We can look at caching the parsed json object until the loop completes.

        We will check and update here.

        Many Thanks
        Stephen

        1. S
          In reply toJosh:
          Stephen Parker @stephenparker
            2022-09-29 12:53:34.640Z

            Hi Josh,

            We have updated pre-release 874 which now includes caching of the Json object in For..Each loops. Please update to this and let us know if improves performance.

            You can get pre-release updates by enabling the 'Include Pre-Release Updates' option in the server settings - Customer Participation tab. Then select File - Check For Updates.

            Thanks
            Stephen

            1. J@Josh
                2022-09-29 14:10:09.625Z

                Thanks, that definitely improved performance. I estimate that we went from 1 record every 3 seconds to approximately 5 records / second. May still not be fast enough for our needs but it could at least be a contender.