Иногда надо узнать на каком именно хосте docker swarm запущен контейнер, зная только его внутренний hostname, например если мы его знаем из системы мониторинга, что-то произошло, а меры надо принимать на хосте, например освободить места на диске.
Я пытался найти способ узнать это по-простому, но так и не нашёл. Ну и пришёл к двум способам это выяснить и скомбинировал всё в одном скрипте:
- втупую делается “ssh docker ps grep” на хостах docker swarm. Работает быстро, по-меньшей мере с тремя нодами, но требует, соответственно, доступ по ssh на все хосты в сварме, под юзером, имеющим доступ к работе с docker
- метод, использующий cli docker swarm на машине-менеджере swarm, но требует три слоя запуска docker cli и занимает прилично времени
Касательно скорости – на трёх нодах первый метод отрабатывает меньше секунды, тогда как второй выполняется около 25 секунд.
Скрипт имеет неплохую инструкцию по эксплуатации и возможности по конфигурации через переменные окружения.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#!/bin/bash | |
# | |
# Default settings block, change here if you need or put other defaults to your .bashrc | |
# | |
DSFC_USER="${DSFC_USER:-root}" # defaults to root | |
DSFC_HOSTS_GREP="${DSFC_HOSTS_GREP:-"swarm-"}" # what to grep in /etc/hosts | |
DSFC_HOSTS_LIST="${DSFC_HOSTS_LIST:-`(cat /etc/hosts | grep ${DSFC_HOSTS_GREP} | cut -f1 | xargs -L1 echo | uniq)`}" # change the default inside backticks to your way to get hosts list | |
DSFC_METHOD_SSH="${DSFC_METHOD_SSH:-1}" # whether to use the ssh and docker ps method | |
DSFC_METHOD_SSH_CONNECT_TIMEOUT="${DSFC_METHOD_SSH_CONNECT_TIMEOUT:-5}" # connection timeout for the ssh method | |
DSFC_METHOD_DSM="${DSFC_METHOD_DSM:-1}" # whether to fallback to the local docker swarm manager capabilities | |
DSCF_METHOD_DSM_SERVICE="${DSCF_METHOD_DSM_SERVICE:-.}" # sets a grep pattern for service name too look the container among - speeds up docker swarm manager method. | |
# | |
# Usage printing block | |
# | |
read -r -d '' usage << EOM | |
Finds container by id on a predefined hosts list, since swarm does not make it easy. Assuming you have ssh access to the swarm nodes list or running on a manager node. | |
Usage: | |
$(basename "$0") [-h|--help] - this message | |
$(basename "$0") CONTAINER_ID_OR_NAME [SERVICE_NAME] - this is basically strings passed to grep on each host. Set SERVICE_NAME if known to speed up search in Docker Swarm Manager node mode. | |
Environment variables: | |
DSFC_HOSTS_LIST=see-explanation | |
list of hosts to connect to for the "ssh docker ps" method. Defaults to all records from "/etc/hosts" having substring defined in DSFC_HOSTS_GREP. | |
DSFC_HOSTS_GREP=swarm- | |
if DSFC_HOSTS_LIST is not set, this variable will be used as the grep pattern to filter hosts from "/etc/hosts" | |
DSFC_USER=root | |
user to connect to the hosts on the list. | |
DSFC_METHOD_SSH_CONNECT_TIMEOUT=5 | |
ssh connect timeout when using "ssh docker ps" method | |
DSFC_METHOD_SSH=1 | |
whether to use the "ssh docker ps" method at all | |
DSFC_METHOD_DSM=1 | |
whether to use the "docker swarm manager node" method at all | |
DSCF_METHOD_DSM_SERVICE=. | |
if defined, SERVICE_NAME is set to this var value if not implicitly set on command line | |
Workflow: | |
Tries the "ssh docker ps" method first, since it's way faster. If the first method failed or disabled, tries "docker swarm manager node" method, that may take a while. | |
EOM | |
if [[ $# -eq 0 ]] ; then | |
echo "$usage" | |
exit 1 | |
fi | |
case "$1" in | |
-h|--help) printf "$usage" | |
exit 0 | |
;; | |
esac | |
if [ ! -z "$2" ]; then | |
DSCF_METHOD_DSM_SERVICE="$2" | |
fi | |
# | |
# Trying using the "ssh and docker" ps method | |
# | |
DSFC_METHOD_SSH_RESULT=-1 # using unique simple numbers | |
if (( DSFC_METHOD_SSH == 1 )) ; then | |
DSFC_METHOD_SSH_RESULT=0 | |
for DSFC_HOST in $DSFC_HOSTS_LIST ; do ssh -o ConnectTimeout=$DSFC_METHOD_SSH_CONNECT_TIMEOUT $DSFC_USER'@'$DSFC_HOST 'docker ps | grep '$1' | grep '$DSCF_METHOD_DSM_SERVICE' |sed "s/.*/$(hostname)/"|sort|uniq' || DSFC_METHOD_SSH_RESULT=$? ; done | |
if [ -z "$DSFC_HOSTS_LIST" ]; then # if no hosts were found, consider error | |
DSFC_METHOD_SSH_RESULT=-3 # using unique simple numbers | |
fi | |
fi | |
# | |
# Trying using the slower "swarm manager method" if allowed and ssh method didn't work | |
# | |
DSFC_METHOD_DSM_RESULT=1 # if not trying - do not interfere with previous result | |
if ((DSFC_METHOD_DSM = 1 && DSFC_METHOD_SSH_RESULT != 0)) ; then | |
RW_DSM="$(docker node ls)" | |
if [[ $? -eq 0 ]] ; then | |
SEARCHFOR="$(docker service list --format '{{.Name}}' | tr '\n' ' ' | grep $DSCF_METHOD_DSM_SERVICE )"; ((for f in $(docker service ps -q $SEARCHFOR); do docker inspect --format '{{.Status.ContainerStatus.ContainerID}} {{.NodeID}}' $f; done) | grep $1 | cut -f2 -d' ' | xargs docker node inspect --format "{{.Description.Hostname}}") | sort | uniq | |
DSFC_METHOD_DSM_RESULT=$? | |
else | |
DSFC_METHOD_DSM_RESULT=1 # if not on a docker swarm node, use previous exit codes | |
fi | |
fi | |
exit $(( DSFC_METHOD_DSM_RESULT * DSFC_METHOD_SSH_RESULT )) # we have simple numbers, so unique exit code, but maybe negative |