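Docker Installation and Optimization: A Practical Guide for Production
This tutorial follows the official Docker installation documentation. Official site: https://docs.docker.com/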
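I. Installation: choosing the right version
Supported systems: Ubuntu 20.04/22.04, CentOS 7/8, RHEL 8/9, and other mainstream distributions
1. Version strategy
In production, always run the Docker CE Stable release and avoid experimental features.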
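One way to enforce the "stable only" rule in a provisioning script is to reject version strings that carry a pre-release suffix. The sketch below is illustrative only: `is_stable_version` is a hypothetical helper, and it assumes stable releases are written as plain MAJOR.MINOR.PATCH while pre-releases carry a suffix such as `-rc.1`.

```shell
# Hypothetical helper: succeeds only for a plain MAJOR.MINOR.PATCH string.
# Assumption: pre-releases carry a suffix such as -rc.1 or -beta.2.
is_stable_version() {
    case "$1" in
        ''|*[!0-9.]*) return 1 ;;   # empty, or contains something other than digits and dots
        .*|*.|*..*)   return 1 ;;   # leading, trailing, or doubled dot
        *.*.*.*)      return 1 ;;   # more than three components
        *.*.*)        return 0 ;;   # exactly three numeric components
        *)            return 1 ;;   # fewer than three components
    esac
}

# Example with literal version strings
for v in 25.0.3 29.0.0-rc.1; do
    if is_stable_version "$v"; then
        echo "$v: plain release"
    else
        echo "$v: not a plain release, skip in production"
    fi
done
```

On a host with Docker installed, the string to check could come from `docker version --format '{{.Server.Version}}'`.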
2. Installation (Linux)
Add the official Docker repository and install the dependencies.
Ubuntu/Debian:

    # Add Docker's official GPG key
    # (on Debian, replace "ubuntu" with "debian" in both URLs below)
    sudo apt update
    sudo apt install ca-certificates curl
    sudo install -m 0755 -d /etc/apt/keyrings
    sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
    sudo chmod a+r /etc/apt/keyrings/docker.asc

    # Add the repository to the Apt sources
    sudo tee /etc/apt/sources.list.d/docker.sources <<EOF
    Types: deb
    URIs: https://download.docker.com/linux/ubuntu
    Suites: $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}")
    Components: stable
    Signed-By: /etc/apt/keyrings/docker.asc
    EOF

    # Refresh the package index so the new repository is picked up
    sudo apt update

Install Docker
Install the latest version:

    sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

Install a specific version. First list the available versions:

    apt list --all-versions docker-ce
    docker-ce/noble 5:29.2.1-1~ubuntu.24.04~noble <arch>
    docker-ce/noble 5:29.2.0-1~ubuntu.24.04~noble <arch>
    ...

Then pick a version and install it:

    VERSION_STRING=5:29.2.1-1~ubuntu.24.04~noble
    sudo apt install docker-ce=$VERSION_STRING docker-ce-cli=$VERSION_STRING containerd.io docker-buildx-plugin docker-compose-plugin

CentOS/RHEL:
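Configure the Docker repository:

    # On older releases, yum works as well
    # (on CentOS, use https://download.docker.com/linux/centos/docker-ce.repo)
    dnf -y install dnf-plugins-core
    dnf config-manager --add-repo https://download.docker.com/linux/rhel/docker-ce.repo

Install the latest version:

    dnf install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

Install a specific version. First list the versions available in the repository:

    dnf list docker-ce --showduplicates | sort -r
    docker-ce.x86_64    3:29.2.1-1.el9    docker-ce-stable
    docker-ce.x86_64    3:29.2.0-1.el9    docker-ce-stable
    <...>

Then pick the version to install. The version string is the second column of the listing, and it is appended to each package name:

    VERSION_STRING=3:29.2.1-1.el9
    dnf install docker-ce-$VERSION_STRING docker-ce-cli-$VERSION_STRING containerd.io docker-buildx-plugin docker-compose-plugin

II. Core optimization: production must-dos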
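1. Daemon configuration (/etc/docker/daemon.json)

    {
      "exec-opts": ["native.cgroupdriver=systemd"],
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m",
        "max-file": "5"
      },
      "storage-driver": "overlay2",
      "storage-opts": [
        "overlay2.override_kernel_check=true"
      ],
      "live-restore": true,
      "max-concurrent-downloads": 10,
      "max-concurrent-uploads": 5,
      "default-ulimits": {
        "nofile": {
          "Name": "nofile",
          "Hard": 64000,
          "Soft": 64000
        }
      },
      "metrics-addr": "0.0.0.0:9323",
      "experimental": false
    }

Key parameters:
"exec-opts": ["native.cgroupdriver=systemd"]: sets the cgroup driver used for containers to systemd.
"log-driver": "json-file": controls how container stdout/stderr is stored.
log-opts: keeps logs from growing without bound. A log file is rotated once it reaches 100 MB, and at most 5 files are kept per container.
storage-driver: the storage driver.
storage-opts: overlay2.override_kernel_check skips the kernel compatibility check, forcing overlay2 onto kernels that do not officially support it. The option has been deprecated since Docker 19.03 and recent releases may reject it, so drop it on current versions.
live-restore: containers keep running while the Docker daemon restarts.
max-concurrent-downloads: number of image layers pulled in parallel (default 3).
max-concurrent-uploads: number of layers pushed in parallel.
default-ulimits: resource limits; sets the default ulimit for all containers.
metrics-addr: the monitoring endpoint. Binding to 0.0.0.0 exposes it on every interface, so restrict access with a firewall or bind to a specific address.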
Apply the configuration:

    systemctl daemon-reload
    systemctl restart docker

2. Network optimization
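Before settling on a subnet for a custom bridge, it can help to confirm that the candidate range does not overlap a network already in use. The following is a minimal sketch in plain shell arithmetic, IPv4 only. The helper names are hypothetical, and the "existing" ranges in the example are sample values (172.17.0.0/16 is the usual default for docker0).

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Print "first last" (as integers) for a CIDR such as 172.20.0.0/16.
cidr_range() (
    base=$(ip_to_int "${1%/*}")
    bits=${1#*/}
    size=$(( 1 << (32 - bits) ))
    first=$(( base / size * size ))   # align down to the network boundary
    echo "$first $(( first + size - 1 ))"
)

# Succeed when the two CIDR ranges share at least one address.
cidrs_overlap() (
    set -- $(cidr_range "$1") $(cidr_range "$2")
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
)

candidate=172.20.0.0/16
for existing in 172.17.0.0/16 172.16.0.0/12; do
    if cidrs_overlap "$candidate" "$existing"; then
        echo "$candidate overlaps $existing"
    else
        echo "$candidate is clear of $existing"
    fi
done
```

On a real host, the ranges to compare against could be taken from `ip -4 route` and from the subnets of existing Docker networks.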
Use a custom bridge network to avoid IP conflicts:

    # Create a production-grade network
    docker network create \
      --driver bridge \
      --subnet=172.20.0.0/16 \
      --gateway=172.20.0.1 \
      --opt "com.docker.network.bridge.name"="br0" \
      --opt "com.docker.network.bridge.enable_ip_masquerade"="true" \
      --opt "com.docker.network.bridge.enable_icc"="true" \
      prod-network

Kernel parameter tuning (/etc/sysctl.conf):

    # conntrack (core of container networking and NAT)
    net.netfilter.nf_conntrack_max = 1048576
    net.netfilter.nf_conntrack_tcp_timeout_established = 600
    net.netfilter.nf_conntrack_tcp_timeout_time_wait = 30
    net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 30
    net.netfilter.nf_conntrack_tcp_timeout_close_wait = 15

    # Local port range
    net.ipv4.ip_local_port_range = 1024 65535

    # TCP connection tuning
    net.ipv4.tcp_tw_reuse = 1
    net.ipv4.tcp_fin_timeout = 30

    # Queue tuning
    net.core.somaxconn = 4096
    net.core.netdev_max_backlog = 65536

    # Buffer tuning
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216

    # Keepalive (stability of long-lived connections)
    net.ipv4.tcp_keepalive_time = 600
    net.ipv4.tcp_keepalive_intvl = 15
    net.ipv4.tcp_keepalive_probes = 5

    # SYN flood protection
    net.ipv4.tcp_syncookies = 1
    net.ipv4.tcp_max_syn_backlog = 8192

    # Apply the settings with: sudo sysctl -p

3. Storage optimization
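Overlay2 performance tuning:

    # Use a dedicated fast disk (SSD/NVMe)
    sudo mkdir -p /data/docker
    sudo mount /dev/nvme1n1 /data/docker

    # Change the Docker root directory
    sudo systemctl edit docker.service
    # Add the following (the empty ExecStart= clears the packaged command first):
    [Service]
    ExecStart=
    ExecStart=/usr/bin/dockerd --data-root /data/docker -H fd://

    sudo systemctl restart docker

III. Security hardening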
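1. Principle of least privilege

    # Create a dedicated group so users do not need sudo for every command.
    # Note: membership in the docker group is effectively root-equivalent, so grant it sparingly.
    sudo groupadd docker
    sudo usermod -aG docker $USER

    # Restrict container capabilities (at run time)
    docker run -d \
      --cap-drop=ALL \
      --cap-add=NET_BIND_SERVICE \
      --security-opt=no-new-privileges:true \
      --read-only \
      --tmpfs /tmp:noexec,nosuid,size=100m \
      your-image

2. Image security scanning

    # Integrate into CI/CD.
    # The original "docker scan" command has been retired; Docker Scout is its successor.
    docker scout cves your-image:latest

    # Or use Trivy
    trivy image --severity HIGH,CRITICAL your-image:latest

3. Resource limits (preventing resource exhaustion)

    docker run -d \
      --memory="512m" \
      --memory-swap="512m" \
      --cpus="1.0" \
      --pids-limit=100 \
      --restart=unless-stopped \
      your-app

IV. Monitoring and troubleshooting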
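1. Collecting key metrics

    # Query the built-in metrics endpoint (requires metrics-addr in daemon.json)
    curl http://localhost:9323/metrics | grep engine_daemon

    # Recommended metrics:
    # - engine_daemon_container_states_containers{state="running"}
    # - engine_daemon_engine_cpus_cpus
    # - engine_daemon_engine_memory_bytes

2. Log collection

    # docker-compose production configuration example
    logging:
      driver: "fluentd"
      options:
        fluentd-address: localhost:24224
        tag: "docker.{{.Name}}"
        # fluentd-async replaces the deprecated fluentd-async-connect
        fluentd-async: "true"

3. Quick troubleshooting commands

    # Show container resource usage
    docker stats --no-stream

    # Show the event stream
    docker events --since 1h

    # Enter a container for debugging (recommended approach)
    docker exec -it <container> sh -c "export COLUMNS=300; export LINES=200; exec sh"

    # Export the container filesystem
    docker export <container> -o /tmp/debug.tar

V. Offline installation (one-step install script)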
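As written, the script still downloads the Docker static binary package over the network. For a fully offline host, download the package in advance and change the script to use the local file.
Docker offline package download address: https://download.docker.com/linux/static/stable/ (pick the directory that matches your system architecture)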
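The two points above (matching the architecture, and using a pre-downloaded package on an offline host) can be sketched as a pair of small helpers. Both function names are hypothetical. The architecture list covers only the cases where `uname -m` output is assumed to equal the directory name at the download address; other architectures need to be mapped by hand.

```shell
# Hypothetical helper: build the static package URL for a version and architecture.
# Only architectures whose `uname -m` value is assumed to match the
# directory name on the download site are accepted.
docker_static_url() (
    version=$1
    arch=${2:-$(uname -m)}
    case "$arch" in
        x86_64|aarch64|ppc64le|s390x) ;;
        *) echo "unmapped architecture: $arch (pick the directory manually)" >&2
           exit 1 ;;
    esac
    echo "https://download.docker.com/linux/static/stable/${arch}/docker-${version}.tgz"
)

# Hypothetical helper: prefer a tarball already on disk, otherwise print the URL to fetch.
pick_source() (
    local_tgz=$1
    version=$2
    arch=$3
    if [ -f "$local_tgz" ]; then
        echo "$local_tgz"
    else
        docker_static_url "$version" "$arch"
    fi
)

docker_static_url 25.0.3 x86_64
```

In an install script, the result of `pick_source` would decide whether to call `wget` or to unpack the local file directly.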
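Caveats. The official documentation is explicit about binary installs:
❌ No automatic updates
❌ Not suited to production
❌ Security patches are not applied automatically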
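Because a binary install never updates itself, a periodic check of the installed version against a target version is one way to notice when a manual upgrade is due. This is a minimal sketch: `version_lt` is a hypothetical helper that compares plain MAJOR.MINOR.PATCH strings only, and both version values in the example are sample inputs.

```shell
# Hypothetical helper: succeed when version $1 is strictly lower than version $2.
# Assumes both are plain MAJOR.MINOR.PATCH strings; returns 2 otherwise.
version_lt() (
    IFS=.
    set -- $1 $2                      # six fields: a1 a2 a3 b1 b2 b3
    [ $# -eq 6 ] || exit 2
    if [ "$1" -ne "$4" ]; then [ "$1" -lt "$4" ]; exit; fi
    if [ "$2" -ne "$5" ]; then [ "$2" -lt "$5" ]; exit; fi
    [ "$3" -lt "$6" ]
)

installed=25.0.3   # sample value; could come from: docker version --format '{{.Server.Version}}'
target=29.2.1      # sample value chosen by the operator
if version_lt "$installed" "$target"; then
    echo "installed $installed is older than $target: plan a manual upgrade"
fi
```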
Full script:

    #!/usr/bin/env bash
    set -e

    # ========================
    # Configurable parameters
    # ========================
    DOCKER_VERSION="25.0.3"
    ARCH="x86_64"
    DOWNLOAD_URL="https://download.docker.com/linux/static/stable/${ARCH}/docker-${DOCKER_VERSION}.tgz"

    # ========================
    # Install the Docker binaries
    # ========================
    echo ">>> Downloading Docker ${DOCKER_VERSION} ..."
    wget -c "${DOWNLOAD_URL}" -O /tmp/docker.tgz

    echo ">>> Extracting ..."
    tar -xzf /tmp/docker.tgz -C /tmp

    echo ">>> Installing to /usr/bin ..."
    cp /tmp/docker/* /usr/bin/

    # ========================
    # Create required directories
    # ========================
    echo ">>> Creating directories ..."
    mkdir -p /etc/docker
    mkdir -p /var/lib/docker

    # Create the docker group
    if ! getent group docker > /dev/null; then
      groupadd docker
    fi

    # (Optional) let the current user run docker without sudo
    # usermod -aG docker $USER || true

    # ========================
    # Create daemon.json (optional tuning)
    # ========================
    cat > /etc/docker/daemon.json <<EOF
    {
      "exec-opts": ["native.cgroupdriver=systemd"],
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m",
        "max-file": "5"
      },
      "storage-driver": "overlay2",
      "live-restore": true,
      "max-concurrent-downloads": 10,
      "max-concurrent-uploads": 5,
      "default-ulimits": {
        "nofile": {
          "Name": "nofile",
          "Hard": 64000,
          "Soft": 64000
        }
      },
      "metrics-addr": "0.0.0.0:9323",
      "experimental": false
    }
    EOF

    # ========================
    # Create the containerd service
    # ========================
    cat > /etc/systemd/system/containerd.service <<EOF
    [Unit]
    Description=containerd container runtime
    After=network.target

    [Service]
    ExecStart=/usr/bin/containerd
    Restart=always
    Delegate=yes
    KillMode=process
    LimitNOFILE=1048576
    LimitNPROC=infinity
    LimitCORE=infinity

    [Install]
    WantedBy=multi-user.target
    EOF

    # ========================
    # Create docker.service
    # ========================
    # \$MAINPID is escaped so the literal text reaches the unit file
    cat > /etc/systemd/system/docker.service <<EOF
    [Unit]
    Description=Docker Application Container Engine
    Documentation=https://docs.docker.com
    After=network-online.target firewalld.service containerd.service
    Wants=network-online.target
    Requires=containerd.service

    [Service]
    Type=notify
    ExecStart=/usr/bin/dockerd --containerd=/run/containerd/containerd.sock
    ExecReload=/bin/kill -s HUP \$MAINPID
    TimeoutStartSec=0
    Restart=always
    RestartSec=2
    LimitNOFILE=infinity
    LimitNPROC=infinity
    Delegate=yes
    KillMode=process

    [Install]
    WantedBy=multi-user.target
    EOF

    # ========================
    # Start the services
    # ========================
    echo ">>> Reloading systemd ..."
    systemctl daemon-reexec
    systemctl daemon-reload

    echo ">>> Starting containerd ..."
    systemctl enable --now containerd

    echo ">>> Starting docker ..."
    systemctl enable --now docker

    # ========================
    # Verify the installation
    # ========================
    echo ">>> Verifying Docker ..."
    docker version || true
    docker info || true

    echo ">>> Testing hello-world ..."
    docker run hello-world

    echo ">>> Installation complete ✅"

Closing thoughts
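Docker sits at the core of container technology, and how it is configured and tuned directly determines how stable and reliable a production environment is. Keep one principle in mind: in production, stability always comes before features. Container technology also moves fast. If you run into optimization needs for a specific scenario in a real deployment (Kubernetes nodes, CI/CD build nodes, and so on), leave a comment and we can work through a solution together.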
夜雨聆风