Dear Team,
During inference on high-volume datasets in Azure Machine Learning—particularly those spanning multiple countries, departments, and similar entities—we have encountered procedure failures caused by timeout errors. These occur because the inference process does not complete within Azure ML’s default maximum execution window of 180 seconds.
To address this, we propose implementing a batch-based inference mechanism. This approach involves dividing the dataset into smaller, manageable chunks, each capable of being processed within the 180-second window. Once inference is completed for a batch, the current connection would be closed, and a new one established to process the next batch—effectively resetting the 180-second timer for each segment. This would ensure complete inference coverage without triggering timeouts.
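For illustration, a minimal sketch of the batching loop (in Python) could look like the following. Please note that batch_size, open_connection, and run_inference are placeholders standing in for our actual chunk size, connection setup, and scoring call, not specific Azure ML APIs:

    import math

    BATCH_SIZE = 500  # hypothetical chunk size, tuned so one batch finishes well under 180 seconds

    def run_in_batches(records, batch_size=BATCH_SIZE):
        # Split the dataset into chunks and score each chunk on a fresh connection,
        # so every chunk gets its own 180-second execution window.
        results = []
        num_batches = math.ceil(len(records) / batch_size)
        for i in range(num_batches):
            chunk = records[i * batch_size:(i + 1) * batch_size]
            conn = open_connection()   # placeholder: establish a new inference connection
            try:
                results.extend(run_inference(conn, chunk))  # placeholder: existing scoring call
            finally:
                conn.close()           # closing the connection resets the timeout window
        return results

The chunk size would be chosen empirically so that a single batch reliably completes within the 180-second limit.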
We also attempted to process the data sequentially by looping through country-level entities, which required a fixed wait of 180 seconds between iterations so that a new inference connection is initiated only after the previous one has closed. However, this method is inefficient: even when inference completes in significantly less time (e.g., 30 seconds), we are still forced to wait the full 180 seconds to avoid concurrent connection issues and procedure failures.
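For comparison, the current country-by-country workaround is roughly equivalent to the sketch below (again with open_connection and run_inference as placeholders for our actual connection handling and scoring call); the fixed 180-second sleep dominates the runtime even when scoring a single country takes only about 30 seconds:

    import time

    def run_per_country(countries):
        results = {}
        for country in countries:
            conn = open_connection()                             # placeholder connection setup
            try:
                results[country] = run_inference(conn, country)  # often finishes in ~30 seconds
            finally:
                conn.close()
            time.sleep(180)  # fixed wait so the next connection starts only after the previous window has expired
        return results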
We believe the batch-based strategy will offer a more scalable and efficient solution for inference on large datasets.