
Are Docker feeds de-duped with respect to the uploaded images?



  • I'm concerned about the space requirements of using ProGet to host my internal Docker feed - is there any de-duplication, or is each image stored raw?

    Product: ProGet
    Version: 4.6.7



  • Docker images are basically already 'de-duped', since they build on top of each other. If, however, you have a lot of very similar images that are unrelated, then you should consider enabling "Data Deduplication" -- it's a feature in Windows, and can save upwards of 90%.



  • Docker images building on top of each other was exactly what I had in mind - if my CI server is pushing newly built images into ProGet (all based on the same "base" Docker image), will the space used on my ProGet feed reflect only the changes?

    For example, let's say I'm using a base image that is 500 MB in size:

    Build 1 --> docker build . -t feed/app:build-001 --> push to proget
    Build 2 --> docker build . -t feed/app:build-002 --> push to proget
    Build 3 --> docker build . -t feed/app:build-003 --> push to proget

    Assuming each build produces 5 MB of new data, is my ProGet server going to show 1515 MB of used space for the given feed, or will it show 515 MB?
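
The two figures in the question come from two different storage models. The sketch below only reproduces that arithmetic with the numbers from the example (500 MB base, 5 MB per build, 3 builds); the function names are illustrative and say nothing about how ProGet itself accounts for space.

```python
BASE_MB = 500   # size of the shared base image, from the example
DELTA_MB = 5    # new data produced by each build, from the example
BUILDS = 3      # build-001 through build-003


def naive_total(base_mb, delta_mb, builds):
    """Every pushed image is stored in full, base included."""
    return builds * (base_mb + delta_mb)


def shared_total(base_mb, delta_mb, builds):
    """The base is stored once; each build adds only its new data."""
    return base_mb + builds * delta_mb


print(naive_total(BASE_MB, DELTA_MB, BUILDS))   # 1515
print(shared_total(BASE_MB, DELTA_MB, BUILDS))  # 515
```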



  • Well, the base image has a hash, and that gets stored once. If you layer on top of the base image, then the differencing layer has its own hash, which is also stored once. That's basically how Docker works -- it's a repository of these image files, keyed by their hashes. So in your example you'd expect roughly 515 MB, not 1515 MB.
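
The "stored once per hash" idea can be illustrated with a toy content-addressed store. This is a minimal sketch, not ProGet's or Docker's actual implementation: the `BlobStore` class, its methods, and the tiny byte strings standing in for layers are all hypothetical. The only assumption taken from the reply is that each layer is keyed by a hash of its content.

```python
import hashlib


class BlobStore:
    """Toy content-addressed store: one copy is kept per distinct digest."""

    def __init__(self):
        self.blobs = {}  # digest -> layer bytes

    def push(self, layers):
        """Store each layer under its SHA-256 digest; return the digests."""
        digests = []
        for data in layers:
            digest = "sha256:" + hashlib.sha256(data).hexdigest()
            # setdefault keeps the existing copy if this digest is already known
            self.blobs.setdefault(digest, data)
            digests.append(digest)
        return digests

    def used_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


store = BlobStore()
base = b"B" * 500  # stand-in for the shared base layer
for n in range(1, 4):
    app = b"build-%03d" % n  # stand-in for the layer each build adds (9 bytes)
    store.push([base, app])

print(len(store.blobs))    # 4 distinct blobs: one base plus three build layers
print(store.used_bytes())  # 527 = 500 + 3 * 9
```

The sharing only applies when the layer content is byte-for-byte identical; if a build changes the base layers themselves, their hashes change and they are stored as new blobs.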



  • Spectacular - that was exactly what I was hoping.

    Thank you for taking the time to answer!

