MISTAKES TO AVOID IN DOCKER IMAGES, WITH REASONS AND SOLUTIONS

INTRODUCTION

Docker is a leading platform-as-a-service (PaaS) product that makes it easier to create, deploy, and run applications in containers, offering encapsulation, isolation, efficient resource utilization, portability, and reliability. The Docker image is the basic building block, and it needs to be crafted with several aspects in mind: security, data persistence, performance, memory utilization, and so on. In this article, I detail some common mistakes to avoid when creating Docker images.

BEST PRACTICES

  1. Don’t create an image that stores data inside the container

    • Reason: Data written inside the container is lost when the container is removed, for example when it is recreated from a newer image version.
    • Solution: Keep the data in a volume that lives outside the container.
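
    As a minimal sketch of declaring such a volume, assuming a hypothetical application that keeps its state under /app/data (the base image and the path are illustrative, not taken from a specific project):

    ```dockerfile
    # Illustrative base image; any base image works the same way
    FROM alpine:3.19

    # Hypothetical data directory used by the application
    RUN mkdir -p /app/data

    # Mark the directory as a volume so that its contents are kept
    # outside the container's writable layer
    VOLUME /app/data

    # Placeholder command: append a timestamp to a file in the volume
    CMD ["sh", "-c", "date >> /app/data/started.log && cat /app/data/started.log"]
    ```

    Starting the container with a named volume, for example `docker run -v app-data:/app/data <image>`, keeps the data available after the container is removed.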
  2. Don’t run too many services in the same container

    • Reason: Every added service enlarges the attack surface of the container. In some cases it also hurts scalability (e.g., application and database in the same container), makes monitoring harder, and leads to inefficient resource utilization.
    • Solution: Follow the one-process-per-container design pattern. It is better to let a baseline (Linux) image handle add-on services such as cron, syslog, and logrotate.
  3. Don’t use more RUN, COPY, and VOLUME instructions than necessary

    • Reason: Docker builds images in layers, and each RUN and COPY instruction adds one. More layers generally mean a larger image and a longer build. Instructions such as VOLUME only add metadata, but scattering them still makes the Dockerfile harder to follow.
    • Solution: Chain related shell commands into a single RUN instruction and combine COPY instructions where possible.
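
    The grouping can be sketched as follows. The file names app.conf and entrypoint.sh are hypothetical and are assumed to exist in the build context:

    ```dockerfile
    FROM alpine:3.19

    # One COPY instruction for related files instead of one per file
    # (the destination must end with a slash when there are several sources)
    COPY app.conf entrypoint.sh /app/

    # One RUN instruction, and therefore one layer, for related commands
    RUN mkdir -p /app/logs /app/tmp \
     && chmod 750 /app/logs /app/tmp \
     && chmod +x /app/entrypoint.sh
    ```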
  4. Don’t run the package update and the package install in separate RUN instructions

    • Reason: Each RUN instruction is cached as its own layer. If the package index update sits in one instruction and the install in a later one, the build can reuse a stale cached update, so the install runs against an outdated index and the installed packages may not be the latest versions.
    • Solution: Run the update and the install in the same RUN instruction, so that packages are always installed from a freshly updated index.
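
    A minimal sketch for a Debian-based image, assuming curl and ca-certificates are the packages needed (the package choice is only an example):

    ```dockerfile
    FROM ubuntu:22.04

    # Update the index and install in the same instruction, so a cached
    # "apt-get update" layer can never be paired with a newer install step
    RUN apt-get update \
     && apt-get install -y --no-install-recommends curl ca-certificates \
     && rm -rf /var/lib/apt/lists/*
    ```

    Removing /var/lib/apt/lists in the same instruction also keeps the package index out of the resulting layer.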
  5. Don’t use the ADD instruction in a Dockerfile

    • Reason: ADD and COPY are often used interchangeably, but ADD does more than copy: it can also download files from remote URLs and automatically extract local tar archives. This extra behavior makes ADD less predictable.
    • Solution: Use COPY to copy files from source to destination. If files need to be downloaded or extracted, use wget or curl (and tar) within a RUN instruction.
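
    One possible shape of this, as a sketch: app.conf is a hypothetical file in the build context, and the example.com URL is a placeholder that has to be replaced with a real archive location.

    ```dockerfile
    FROM debian:12-slim

    RUN apt-get update \
     && apt-get install -y --no-install-recommends curl ca-certificates \
     && rm -rf /var/lib/apt/lists/*

    # Plain file copy: COPY does exactly this and nothing more
    COPY app.conf /etc/app/app.conf

    # Explicit download and extraction (placeholder URL); the archive is
    # removed in the same instruction so it does not remain in the layer
    RUN mkdir -p /opt/tool \
     && curl -fsSL https://example.com/tool.tar.gz -o /tmp/tool.tar.gz \
     && tar -xzf /tmp/tool.tar.gz -C /opt/tool \
     && rm /tmp/tool.tar.gz
    ```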
  6. Don’t use the latest tag for base images when maintaining a Dockerfile

    • Reason: The base image named in a Dockerfile is identified by a tag. If that tag is ‘latest’, each build picks up whatever changes were most recently published, without the maintainer knowing, and the container may no longer behave as expected.
    • Solution: Pin a specific version number in the tag.
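
    For illustration, with Ubuntu as an example base image:

    ```dockerfile
    # Avoid: the image behind this tag changes over time
    # FROM ubuntu:latest

    # Prefer: a tag that names a specific release
    FROM ubuntu:22.04
    ```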
  7. Don’t try to combine multiple base images in one Dockerfile

    • Reason: Only the last ‘FROM’ instruction in a Dockerfile determines the base of the final image. Earlier FROM instructions start separate build stages whose contents are discarded unless explicitly copied forward, so base images are never merged.
    • Solution: If an application needs the features of several images, build separate Docker images and link the containers to each other (e.g., a web application and a database as separate images, linked together).
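
    Several FROM lines are legal in a multi-stage build, but only the last stage ends up in the image. The sketch below illustrates this, assuming the build context holds a Go module with a main package; the stage name and paths are illustrative:

    ```dockerfile
    # Stage 1: build stage, discarded after the build
    FROM golang:1.22 AS build
    WORKDIR /src
    COPY . .
    RUN CGO_ENABLED=0 go build -o /out/app .

    # Stage 2: the last FROM defines the base of the final image
    FROM alpine:3.19

    # Only what is copied explicitly from the earlier stage is kept
    COPY --from=build /out/app /usr/local/bin/app
    CMD ["app"]
    ```

    The Go toolchain from the first stage is not part of the final image; the two base images are not merged.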
  8. Don’t interact with a declared volume during the build process

    • Reason: A volume declared in the image is only created and mounted at run time, so it is not available as a volume while the image is being built.
    • Solution: Interact with volumes only from running containers.
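
    A sketch of the safe ordering, assuming a hypothetical /data directory: anything the image should ship in that path is written before the VOLUME declaration. Whether later build steps that write to the path are kept or discarded can depend on the builder in use, so it is safer not to rely on them.

    ```dockerfile
    FROM alpine:3.19

    # Seed the directory BEFORE declaring it as a volume
    RUN mkdir -p /data && echo "seed" > /data/seed.txt

    VOLUME /data

    # Avoid build steps like the following after the declaration;
    # their changes to /data may not end up in the image
    # RUN echo "late" > /data/late.txt
    ```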
  9. Don’t use a single-layered image

    • Reason: It is difficult to recreate, manage, and distribute.
    • Solution: First create a base-layer image for the OS, then add a layer that defines the user, and then layers for runtime installation and configuration.
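  10. Don’t store credentials in the image

    • Reason: Credentials baked into an image can be read by anyone who obtains the image, which makes every container started from it more vulnerable.
    • Solution: Pass confidential parameters at run time as environment variables, integrated with a secrets manager such as Vault.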
  11. Don’t run processes as the root user

    • Reason: If an attacker takes control of an application container running as the root user, the whole host on which the container runs becomes vulnerable to attack.
    • Solution: The Dockerfile should create a non-root user and switch to it with the USER instruction, so that the application runs as that user. For containers whose processes must run as root inside the container, you can re-map this user to a less-privileged user on the Docker host by enabling user namespace support in the Docker daemon.
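
    A minimal sketch on an Alpine base, where the user and group name "app" is an arbitrary choice for the example:

    ```dockerfile
    FROM alpine:3.19

    # Create an unprivileged system group and user (BusyBox syntax)
    RUN addgroup -S app && adduser -S -G app app

    WORKDIR /app

    # Copy the build context and hand ownership to the new user
    COPY --chown=app:app . .

    # Every following instruction, and the container process, runs as "app"
    USER app

    # Placeholder command that prints the effective user and group
    CMD ["id"]
    ```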
  12. Don’t rely on IP addresses

    • Reason: The internal IP address of a container can change when the container restarts.
    • Solution: To communicate with another container, use environment variables to pass its hostname or container name and port number.
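
    As a sketch, the image can define defaults that are overridden when the container starts. The variable names DB_HOST and DB_PORT and their values are hypothetical:

    ```dockerfile
    FROM alpine:3.19

    # Defaults for the peer container's name and port (hypothetical values);
    # no IP address appears anywhere in the image
    ENV DB_HOST=db \
        DB_PORT=5432

    # Placeholder command: the shell expands the variables at run time
    CMD ["sh", "-c", "echo \"connecting to ${DB_HOST}:${DB_PORT}\""]
    ```

    At start-up the values can be replaced, for example with `docker run -e DB_HOST=postgres <image>`. On a user-defined Docker network, container names resolve through Docker's built-in DNS.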
  13. Don’t let containers run unmonitored

    • Reason: If containers run unmonitored, a failure could have any number of possible causes, and we are left unable to identify the root cause.
    • Solution: Design container images such that they can be configured and monitored with monitoring tools.
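
    One building block for this is the HEALTHCHECK instruction, which lets Docker and external tools query the container's health status. The sketch below uses Python's built-in HTTP server as a stand-in for a real application; the port and the intervals are arbitrary:

    ```dockerfile
    FROM python:3.12-slim

    # Probe the service every 30 seconds; a failed or timed-out request
    # makes the probe exit with status 1, which counts as a failed check
    HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
      CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/')" || exit 1

    # Stand-in for the real application
    CMD ["python", "-m", "http.server", "8000"]
    ```

    The resulting status (starting, healthy, or unhealthy) shows up in the output of `docker ps` and `docker inspect`.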
  14. Don’t let the container write logs to local files

    • Reason: Log files inside the container disappear with the container, so the chance of losing logs is high.
    • Solution: Redirect logging to STDOUT and integrate with a centralized logging solution.
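
    For an application that can only log to a file, one common workaround is to link the log file to the standard streams, similar to what the official nginx image does for its own logs. The log paths below are hypothetical:

    ```dockerfile
    FROM debian:12-slim

    # Hypothetical log location of the application
    RUN mkdir -p /var/log/app \
     # Anything written to these paths goes to the container's output
     && ln -sf /dev/stdout /var/log/app/app.log \
     && ln -sf /dev/stderr /var/log/app/error.log
    ```

    The output can then be read with `docker logs` or picked up by a logging driver.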
  15. Don’t let the image write its data to the root partition of the host

    • Reason: Data stored in ‘/var/lib/docker’ can fill the root partition as it grows, leaving the host in an unhealthy state. And if the root partition gets corrupted, the Docker images on the host are lost.
    • Solution: Provide a separate partition on the host dedicated to container data.

CONCLUSION

When writing a Dockerfile, the main objective should be to create image layers that contain only the necessary packages and sub-processes. Images that are lightweight, minimize vulnerabilities, use resources efficiently, and keep data persistent will stand out in the Docker world.

About The Author

Avoiding Mistakes in Docker Images, CLOUDCONTROL

Iwin Mathew

Has more than 5 years of experience in managing test environments and in automating and optimizing deployments in development and non-production environments, leveraging configuration management, CI/CD, and various DevOps practices. Has profound knowledge of and experience in the orchestration of applications on Multi-Hybrid-Cloud platforms.