Generalize the use of HTTP caching across the infra
Right now, every single service is expected to cache its own HTTP requests, which is a little wasteful and leads to duplication of data when two services access the same URL.
By using an HTTP(S) proxy, we could de-duplicate all of these requests, improve performance, and potentially remove most instances of the pull-through container registries and the URL-rewriting complexity that comes with them. This could be done transparently or through the use of the HTTP_PROXY/HTTPS_PROXY environment variables.
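To illustrate the non-transparent option, here is a minimal sketch of how a service could be pointed at a shared proxy through the environment. The proxy address and the `proxy_environment` helper are assumptions made up for this example, not part of the current infra. Both lower-case and upper-case variable names are set because tools differ in which one they read.

```python
import os
import urllib.request

# Hypothetical address of the shared caching proxy (assumption for this sketch).
PROXY_URL = "http://10.42.0.1:3128"


def proxy_environment(proxy_url, no_proxy=("localhost", "127.0.0.1")):
    """Build the environment variables a service needs to use the proxy."""
    env = {}
    for name, value in (
        ("http_proxy", proxy_url),
        ("https_proxy", proxy_url),
        ("no_proxy", ",".join(no_proxy)),
    ):
        env[name] = value          # lower-case form
        env[name.upper()] = value  # upper-case form
    return env


# Apply the settings to the current process; a service manager or container
# runtime would normally inject these variables instead.
os.environ.update(proxy_environment(PROXY_URL))

# urllib (like most HTTP clients) picks the proxy up from the environment.
proxies = urllib.request.getproxies()
print(proxies["http"], proxies["https"])
```

With this approach, each service only needs these variables set, while the transparent option would instead rely on the gateway redirecting traffic without any per-service configuration.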
Here are the services which would benefit the most from such a proxy:
- Executor: Simpler architecture, making it easier to expose external servers to DUTs
- Gitlab-runner: Sharing of the job artefacts between jobs
- Pull-through registries / gitlab-runner / podman: Caching of the layers
- Others???
This would also make it easier to restrict the list of services that both the gateway and the DUTs have access to, without forcing DUTs to have no internet access at all (although this is still recommended). It would also make it possible to change this list on a per-DUT or per-job basis!
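As a rough sketch of what such a per-DUT restriction could look like on the proxy side, the check below combines a default allowlist with per-DUT additions. All host names, DUT names, and the `is_allowed` helper are hypothetical and only illustrate the idea; a real proxy would express this in its own ACL configuration.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hosts every client may reach (hypothetical values).
DEFAULT_ALLOWED = ["registry.example.com"]

# Extra hosts granted to specific DUTs (hypothetical values).
PER_DUT_ALLOWED = {
    "dut-1": ["*.example.org"],
}


def is_allowed(dut, url):
    """Return True if `dut` may fetch `url` according to the allowlists."""
    host = urlsplit(url).hostname or ""
    patterns = DEFAULT_ALLOWED + PER_DUT_ALLOWED.get(dut, [])
    return any(fnmatchcase(host, pattern) for pattern in patterns)


print(is_allowed("dut-1", "https://files.example.org/firmware.bin"))
print(is_allowed("dut-2", "https://files.example.org/firmware.bin"))
```

Keying the extra entries on a job identifier instead of a DUT name would give the per-job variant.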
Potential solutions: