Dockerfile: optimize layer caching
The main optimization is that the Docker build no longer has to reinstall all of the apt/pip dependencies after each code change. Other benefits that result from layer caching include:
- faster builds (less important for release builds, which I guess use the no-cache option anyway, but important for local testing/building)
- faster deployments (fewer layers need to be downloaded)
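
For reviewers unfamiliar with the pattern, here is a minimal sketch of the ordering this relies on. It is not the actual Dockerfile from this change: the base image, the apt package, `requirements.txt`, and `main.py` are all assumed for illustration.

```dockerfile
# Illustrative sketch only; image, package, and file names are assumptions.
FROM python:3.12-slim

# System packages change rarely, so install them in an early layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Copy only the dependency manifest first. This layer and the pip install
# below stay cached as long as requirements.txt is unchanged.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application code changes most often, so copy it last. A code edit
# invalidates the cache from this line onward only.
COPY . .

CMD ["python", "main.py"]
```

With this ordering, a code-only change should rebuild just the final `COPY` layer, while `docker build --no-cache` still forces a full rebuild when one is wanted.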