Service unavailable/timeout
-
Hi everyone,
Recently we started having performance issues with our ProGet installation (running on Docker/Linux). We have a reverse proxy that routes the requests, and we often get either service unavailable errors or timeouts when multiple clients try to restore packages at the same time (which is the case when multiple projects execute CI pipelines at once; the local NuGet cache might not be in use by the CI runners, which results in a lot of requests to the ProGet server).
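One thing we are considering is raising the timeouts on the proxy so that slow restores at least finish instead of failing hard. Assuming nginx as the reverse proxy (upstream name, port, and values below are just illustrative), that would look something like this:

    # assuming nginx; upstream host/port and timeout values are placeholders
    location / {
        proxy_pass http://proget:8624;
        proxy_connect_timeout 60s;
        proxy_send_timeout    300s;
        proxy_read_timeout    300s;   # let slow package restores complete
    }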
A typical project of ours needs about 73 packages (this includes dependencies, analyzers, etc.). We also have multiple feeds to separate development/production packages.
One request (e.g. a package search) can take as much as 30 seconds (even longer in some situations) to complete on the server (significant load can be seen on both Postgres and the Mono process running ProGet).
There are no messages in the ProGet log, so nothing points to an issue in ProGet itself. We originally had a NuGet.org connector cache that we disabled recently (this improved performance but didn't fix the issue completely). We have only about 200 packages (including different versions) registered inside ProGet, which doesn't seem like so many that it should cause performance issues.
Any tips on how we could tune ProGet or change some other part of the infrastructure to mitigate the issue? Maybe compacting the DB after removing the NuGet connector might help.
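By "compacting" I mean something like a full vacuum on the Postgres side; a minimal sketch (VACUUM FULL rewrites the tables and takes exclusive locks, so it should only run off-hours):

    -- compact and re-analyze the whole ProGet database (Postgres);
    -- VACUUM FULL takes exclusive locks, so run it off-hours
    VACUUM FULL ANALYZE;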
Thanks for any help...
Product: ProGet
Version: 5.1.22
-
I have noticed that the performance of ProGet in Docker is dramatically bad, especially when restoring a project with a bunch of packages that need to be cached.
I tried it with a dockerized Postgres instance and a standalone instance. The dockerized version just complains about too many client connections. The standalone version shows different behaviour, but it reveals that database sessions are not being properly closed, so you can end up with far too many open connections.
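You can watch this happen from Postgres itself; counting sessions per state with the standard pg_stat_activity view makes the leak obvious (a growing pile of idle sessions that never goes down):

    -- count open sessions per state; an ever-growing "idle" count
    -- suggests connections that are never closed or returned to a pool
    SELECT state, count(*)
    FROM pg_stat_activity
    GROUP BY state
    ORDER BY count(*) DESC;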
I will try to see whether the Windows version has better performance; otherwise ProGet is not delivering the quality I expect.
-
ProGet is highly optimized for performance, but third-party tools like the NuGet and npm clients will often make hundreds of simultaneous, complex queries that require a lot of back-and-forth communication with a server.
Development workstations are often just as powerful (if not more powerful) than servers, which means even just a few dozen developers can quickly overpower a single server. Increasing the CPU or memory allocation on a web node will rarely help, since the bottleneck is often network-related.
Keep in mind, traffic to ProGet is generally not the kind of fast, static requests made to npmjs.org or nuget.org, especially once you add things like connectors, license validation, security checks, etc. Those public galleries, by the way, require massive server farms to run, and will generally under-perform compared to ProGet anyway.
Ultimately, one node, particularly a container, may simply not be enough for your usage, especially during peak usage times. This is where load balancing and high availability come in.
Anticipating how many web nodes your High Availability instance of ProGet will need is challenging, and depends on a lot of factors like network latency, package usage, etc.
As general guidance, consider these ratios:
- High Performance: 1 Server per 50 Users
- Average Performance: 1 Server per 100 Users
- Acceptable Performance: 1 Server per 200 Users
Ultimately, it’s a balancing act between cost (hardware and licensing) and performance.
-
I understand that there are H.A. options, but I'm running the Docker configuration for a small team (1-5 users, including a single build agent), so it should be more than sufficient. And if I use ProGet as a proxy, I would not expect it to get stuck under a moderate peak load.
The positive thing is that the Windows installation does deliver the expected performance, so the problem is only in the Docker image. Even though my Docker and Windows environments have the same hardware capacity, the performance is not comparable.
So for now I will move to the Windows installation, since I am happy with what ProGet offers, but the performance of the Docker version should really be investigated. It keeps database connections open in Postgres (and that is against a full-blown Postgres, not the dockerized version of it). From the looks of it, the connection pool gets saturated, blocking all other calls to the system.
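A stopgap could be to raise the connection limit on the Postgres side, although that only postpones the saturation. With the official postgres image, extra arguments are passed straight to the server process, so it is a one-liner (container name, password, and value are placeholders):

    # official postgres image passes extra args to the server process;
    # raising max_connections only postpones the pool saturation
    docker run -d --name proget-db -e POSTGRES_PASSWORD=changeme \
        postgres:11 -c max_connections=300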
Maybe adding packages one by one could solve it in the short term, but that is a costly endeavour for larger projects or projects with many packages. And even if I went for a paid version, since the database does not seem to be the actual bottleneck, a high availability setup (with Docker) would not necessarily solve my performance problem; it would merely mask it.
If needed I'm happy to support the investigation.
-
Thanks for the clarification; we have noticed performance issues depending on the underlying host operating system. I don't understand that, to be honest, because I thought it was all virtualized/containerized... but it's hard for us to reproduce, and we suspect some driver-related bugs.
But anyways, we will be moving away from Postgres in ProGet 5.2 in favor of SQL Server for Linux. So, I wonder if you can try it again with that? It will be released in the coming weeks.
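If you want to experiment before then, SQL Server for Linux itself already runs fine as a container; a minimal sketch (container name and password are placeholders):

    # minimal SQL Server for Linux container; name and password are placeholders
    docker run -d --name proget-sql \
        -e ACCEPT_EULA=Y -e SA_PASSWORD='YourStrong!Pass1' \
        -p 1433:1433 \
        mcr.microsoft.com/mssql/server:2017-latest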
-
One of my thoughts was also to move to MSSQL for Linux; I tried it, but noticed the dependency on Postgres.
Will keep an eye out for updates, and test when it is available.
-
Thanks for the info.
We actually identified several problems:
- There seems to be some issue with our cloud provider regarding container isolation (CPU/memory overload on the VM may cause multiple containers to crash)
- There also seem to be persistent connections to the DB that are not properly closed (which may indicate G.J. Admiraal is on the right track here); we're unable to verify whether this is a ProGet (connector) issue or a containerization issue
Our story:
There are about 5 users on our team using ProGet, and they restore packages primarily from the local NuGet cache. The real load comes from the CI runners (as I said in the original question). When a pipeline starts, the load on the server rises quite a lot, and this is when the DB connections start to leak. In the end, the server stops servicing requests. Sometimes we need to restart either the ProGet container or the DB container, and sometimes we need to restart both to restore operation. At times the server runs for weeks without issue, but sometimes only a few restore operations are enough to make it stop working.
We might be able to suppress the issue by configuring our runners to clear their caches less often, thus making fewer requests to ProGet itself.
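For example (just a sketch; the path is made up, and NUGET_PACKAGES is the standard NuGet override for the global packages folder), pointing the runners at a persistent package cache would let repeated restores hit local disk instead of ProGet:

    # point NuGet's global packages folder at a persistent location on the
    # runner so repeated CI restores reuse packages instead of hitting ProGet
    export NUGET_PACKAGES=/var/cache/nuget-packages   # example path
    dotnet restore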
Thanks G.J. Admiraal for your investigation; we'll try the Windows installation or wait for MSSQL for Linux support, and report back with our findings.
-
While waiting for an update, I had to use Postgres in Docker for another piece of software, and that Postgres instance is giving the same kind of performance issues.
So the Postgres Docker image also doesn't seem all that reliable.
-
ProGet 5.2 was released, which uses SQL Server -- so this should make life all kinds of easier.
-
It took me a while to get back to this thread, but here goes:
Today we upgraded to the MSSQL-backed version of ProGet that is currently on Docker Hub (5.2.9.11), and most of it went smoothly.
However, when restoring all the packages, I did get the unresponsive errors again. But this time the server came back, and we could see the log messages. After changing the connection string, things seem to be more stable now when doing a full restore.
The new connection string has this appended to it: "Max Pool Size=200;"
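For anyone who wants to do the same: Max Pool Size is a standard SQL Server connection string setting (the default is 100), so the full string ends up looking roughly like this, with the server name and credentials obviously being placeholders:

    Server=proget-sql;Database=ProGet;User ID=sa;Password=YourStrong!Pass1;Max Pool Size=200;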
Hopefully this version keeps us happier, as the old version needed a reboot every other day.