Just wanted to give a brief update on the issue from @mortenf_3736 that we discussed via the support ticket.
We were never able to get to the bottom of it, but it was entirely related to running on ACA.
| Container Setup | VM Setup |
|---|---|
| Azure Container Apps: minimum 4 replicas, maximum 8 replicas, each with 3 cores and 6 GB memory | Two D4 VMs (4 cores, 16 GB memory) with an internal load balancer in front of them |
| Mapped storage from an Azure Storage Account v2 file share | Managed Premium Shared Disk, connected to a failover cluster and shared between both VMs |
| Azure SQL Database with 4 dedicated cores | Azure SQL Database serverless, scaling between 0.5 and 4 cores |
This doesn't seem to be related to Linux/Docker/Kubernetes, as we have several high-traffic users on Kubernetes clusters without issues like this. However, we have seen a handful of Azure-related problems over the years that manifested in ProGet:
a bad hard drive that was constantly corrupting packages
one server (out of a handful) that had really slow disk I/O
a buggy storage driver that caused some big impacts on a certain kind of storage configuration
So we believe the issue is the Azure platform itself, similar to the hardware/software glitches above that we've uncovered in the past.
We're doing our best to research and identify issues, and Inedo/ProGet users aren't the only ones experiencing pain like this. Consider this report from an Azure "big data" user:
I have suffered from chronic socket exceptions in multiple Azure platforms - just as everyone else is describing. The main pattern I've noticed is that they happen within Microsoft's proprietary VNET components (private endpoints). They are particularly severe in multi-tenant environments where several customers can be hosted on the same hardware (or even in containers within the same VM).
The problems are related to bugs in Microsoft's software-defined-networking components (SDN proxies like "private endpoints" or "managed private endpoints"). I will typically experience these SDN bugs in a large "wave" that impacts me for an hour and then goes away. The problems have been trending for the worse over the past year, and I've opened many tickets (Power BI, ADF, and Synapse Spark).
Other Azure users (who are much more technical than we are) have confirmed that there are indeed severe issues with their SDN infrastructure. Microsoft does appear to be aware of these endemic issues with their platform, and for the time being we simply cannot recommend using Azure's container services for anything that will have any kind of load.
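If anyone wants to check whether their own failures arrive in the kind of "waves" described in that report, here is a rough Python sketch. It is not part of ProGet or any Azure tooling: the function names (`probe`, `group_failures`, `monitor`), the 120-second grouping gap, and the polling interval are all assumptions made up for illustration. It simply attempts a TCP connection on a schedule and groups the failed attempts by time.

```python
import socket
import time
from typing import List, Tuple


def probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def group_failures(
    samples: List[Tuple[float, bool]], max_gap: float = 120.0
) -> List[Tuple[float, float, int]]:
    """Group failed probes into waves.

    samples: (timestamp, ok) pairs in chronological order.
    Failures no more than max_gap seconds apart (an assumed threshold)
    are counted as the same wave.
    Returns one (start, end, failure_count) tuple per wave.
    """
    waves: List[Tuple[float, float, int]] = []
    for ts, ok in samples:
        if ok:
            continue
        if waves and ts - waves[-1][1] <= max_gap:
            start, _, count = waves[-1]
            waves[-1] = (start, ts, count + 1)
        else:
            waves.append((ts, ts, 1))
    return waves


def monitor(host: str, port: int, interval: float = 10.0, rounds: int = 360):
    """Probe host:port every `interval` seconds and return the failure waves."""
    samples = []
    for _ in range(rounds):
        samples.append((time.time(), probe(host, port)))
        time.sleep(interval)
    return group_failures(samples)
```

Running `monitor` against the database or storage endpoint from inside a container and from a VM at the same time would show whether the failures cluster on one side only. A result with a few long waves points more toward a platform networking problem than a steady trickle of isolated failures would, though it is only a hint and not proof.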
Hope that gives some insight in case anyone stumbles across this thread.
Alana