
Commit a5aeb20

arun-gupta authored and Mano Marks committed
Monitoring Docker containers (docker#181)
* using ls instead of ps for listing containers
* Completing the migration to 1.13, closing docker#152, also addressing docker#175, reverting docker#172
1 parent 94b80e9 commit a5aeb20

13 files changed, +253 -180 lines changed

developer-tools/java/chapters/appa-common-commands.adoc

+8-8
@@ -17,19 +17,19 @@ Here is the list of commonly used Docker commands:
1717
| Remove all images | `docker image rm $(docker image ls -aq)`
1818
2+^s| Containers
1919
| Run a container | `docker container run`
20-
| List of running containers | `docker container ps`
21-
| List of all containers | `docker container ps -a`
20+
| List of running containers | `docker container ls`
21+
| List of all containers | `docker container ls -a`
2222
| Stop a container | `docker container stop ${CID}`
23-
| Stop all running containers | `docker container stop $(docker container ps -q)`
24-
| List all exited containers with status 1 | `docker container ps -a --filter "exited=1"`
23+
| Stop all running containers | `docker container stop $(docker container ls -q)`
24+
| List all exited containers with status 1 | `docker container ls -a --filter "exited=1"`
2525
| Remove a container | `docker container rm ${CID}`
26-
| Remove container by a regular expression | `docker container ps -a \| grep wildfly \| awk '{print $1}' \| xargs docker container rm -f`
27-
| Remove all exited containers | `docker container rm -f $(docker container ps -a \| grep Exit \| awk '{ print $1 }')`
28-
| Remove all containers | `docker container rm $(docker container ps -aq)`
26+
| Remove container by a regular expression | `docker container ls -a \| grep wildfly \| awk '{print $1}' \| xargs docker container rm -f`
27+
| Remove all exited containers | `docker container rm -f $(docker container ls -a \| grep Exit \| awk '{ print $1 }')`
28+
| Remove all containers | `docker container rm $(docker container ls -aq)`
2929
| Find IP address of the container | `docker container inspect --format '{{ .NetworkSettings.IPAddress }}' ${CID}`
3030
| Attach to a container | `docker container attach ${CID}`
3131
| Open a shell into a container | `docker container exec -it ${CID} bash`
32-
| Get container id for an image by a regular expression | `docker container ps \| grep wildfly \| awk '{print $1}'`
32+
| Get container id for an image by a regular expression | `docker container ls \| grep wildfly \| awk '{print $1}'`
3333
|==================
3434

3535
=== Exit code status

developer-tools/java/chapters/ch10-cli.adoc

-4
This file was deleted.

developer-tools/java/chapters/ch10-ddc.adoc

-4
This file was deleted.

developer-tools/java/chapters/ch10-monitoring.adoc

+244
@@ -2,3 +2,247 @@
22

33
= Monitoring Docker Containers
44

5+
This chapter will cover different ways to monitor a Docker container.
6+
7+
== Docker CLI
8+
9+
The `docker container stats` command displays a live stream of resource usage statistics for one or more containers.
10+
11+
. Start a container: `docker container run --name db -d arungupta/couchbase`
12+
. Check the container stats using `docker container stats db`. It shows the output as:
13+
+
14+
```
15+
CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
16+
db 2.02% 374.9 MiB / 1.952 GiB 18.76% 648 B / 648 B 0 B / 156 kB 156
17+
```
18+
+
19+
The output is continually updated. It shows:
20+
+
21+
.. Container name
22+
.. Percent CPU utilization
23+
.. Total memory usage vs amount available to the container
24+
.. Percent memory utilization
25+
.. Network activity
26+
.. Disk activity
27+
.. Number of processes or threads the container has created (PIDs)
28+
+
29+
. Start another container: `docker container run -d --name web jboss/wildfly`
30+
. Check the stats for two containers using the command `docker container stats db web`. The output is shown:
31+
+
32+
```
33+
CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
34+
db 4.61% 393.1 MiB / 1.952 GiB 19.67% 2.61 kB / 928 B 24.6 kB / 799 kB 219
35+
web 0.18% 266.1 MiB / 1.952 GiB 13.31% 782 B / 718 B 0 B / 4.1 kB 53
36+
```
37+
+
38+
. Stats for all the containers can be displayed using the command `docker container stats`
39+
+
40+
```
41+
CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
42+
88b04695855e 0.03% 265.9 MiB / 1.952 GiB 13.30% 916 B / 788 B 0 B / 4.1 kB 51
43+
109d917d17e2 5.61% 394.9 MiB / 1.952 GiB 19.76% 2.75 kB / 928 B 24.6 kB / 807 kB 219
44+
```
45+
+
46+
Note that the container ID is shown in this case instead of the container's name.
47+
+
48+
. Display only container id and percent CPU utilization using the command `docker container stats --format "{{.Container}}: {{.CPUPerc}}"`:
49+
+
50+
```
51+
88b04695855e: 0.14%
52+
109d917d17e2: 4.83%
53+
```
54+
+
55+
. Format the output in a table. The results should include the container name, percent CPU utilization, and memory usage versus limit. This can be achieved using the command `docker container stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"`:
56+
+
57+
```
58+
NAME CPU % MEM USAGE / LIMIT
59+
web 0.13% 266.1 MiB / 1.952 GiB
60+
db 3.06% 398.9 MiB / 1.952 GiB
61+
```
62+
+
63+
. Display only the first result using the command `docker container stats --no-stream`:
64+
+
65+
```
66+
CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
67+
88b04695855e 0.15% 263.6 MiB / 1.952 GiB 13.19% 1.38 kB / 928 B 0 B / 4.1 kB 51
68+
109d917d17e2 3.19% 398.6 MiB / 1.952 GiB 19.94% 3.21 kB / 998 B 24.6 kB / 1.16 MB 220
69+
```
70+
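The formatted output shown above is easy to post-process. The following is a minimal Python sketch, assuming the `{{.Container}}: {{.CPUPerc}}` format from the earlier step; the helper name `parse_cpu_lines` is hypothetical and the sample text is copied from that step's output rather than read from a live daemon.

```python
def parse_cpu_lines(text):
    """Map container id -> CPU percent from lines like '88b04695855e: 0.14%'."""
    usage = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last ': ' so the id is kept whole.
        container, _, cpu = line.rpartition(": ")
        usage[container] = float(cpu.rstrip("%"))
    return usage


# Sample copied from the --format step above.
sample = """\
88b04695855e: 0.14%
109d917d17e2: 4.83%
"""

usage = parse_cpu_lines(sample)
busiest = max(usage, key=usage.get)
print(busiest, usage[busiest])
```

In practice the text could come from a single snapshot taken with `--no-stream` combined with the same `--format` string.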
71+
== Docker Remote API
72+
73+
The Docker Remote API provides far more detail about the health of a container. It can be invoked using the following format:
74+
75+
```
76+
curl --unix-socket /var/run/docker.sock http://localhost/containers/<name>/stats
77+
```
78+
79+
On Docker for Mac, enabling the remote HTTP API still requires a few extra steps, so this command uses the `--unix-socket` option to invoke the Remote API over the local Docker socket instead.
80+
81+
A specific invocation using `curl --unix-socket /var/run/docker.sock http://localhost/containers/db/stats` will show the output:
82+
83+
```
84+
{"read":"2017-02-28T02:40:29.595511475Z","preread":"0001-01-01T00:00:00Z","pids_stats":{"current":220},"blkio_stats":{"io_service_bytes_recursive":[{"major":254,"minor":0,"op":"Read","value":212992},{"major":254,"minor":0,"op":"Write","value":1339392},{"major":254,"minor":0,"op":"Sync","value":1257472},{"major":254,"minor":0,"op":"Async","value":294912},{"major":254,"minor":0,"op":"Total","value":1552384}],"io_serviced_recursive":[{"major":254,"minor":0,"op":"Read","value":3},{"major":254,"minor":0,"op":"Write","value":249},{"major":254,"minor":0,"op":"Sync","value":230},{"major":254,"minor":0,"op":"Async","value":22},{"major":254,"minor":0,"op":"Total","value":252}],"io_queue_recursive":[],"io_service_time_recursive":[],"io_wait_time_recursive":[],"io_merged_recursive":[],"io_time_recursive":[],"sectors_recursive":[]},"num_procs":0,"storage_stats":{},"cpu_stats":{"cpu_usage":{"total_usage":83724160991,"percpu_usage":[30641144914,10843586791,11798818901,30440610385],"usage_in_kernelmode":12390000000,"usage_in_usermode":15170000000},"system_cpu_usage":132730290000000,"throttling_data":{"periods":0,"throttled_periods":0,"throttled_time":0}},"precpu_stats":{"cpu_usage":{"total_usage":0,"usage_in_kernelmode":0,"usage_in_usermode":0},"throttling_data":{"periods":0,"throttled_periods":0,"throttled_time":0}},"memory_stats":{"usage":419139584,"max_usage":426778624,"stats":{"active_anon":404185088,"active_file":20480,"cache":1589248,"dirty":12288,"hierarchical_memory_limit":9223372036854771712,"hierarchical_memsw_limit":9223372036854771712,"inactive_anon":0,"inactive_file":1568768,"mapped_file":122880,"pgfault":226379,"pgmajfault":2,"pgpgin":202886,"pgpgout":103818,"rss":404193280,"rss_huge":0,"swap":0,"total_active_anon":404185088,"total_active_file":20480,"total_cache":1589248,"total_dirty":12288,"total_inactive_anon":0,"total_inactive_file":1568768,"total_mapped_file":122880,"total_pgfault":226379,"total_pgmajfault":2,"total_pgpgin":202886,"total_pgpgout":103818,"total_rss":404193280,"total_rss_huge":0,"total_swap":0,"total_unevictable":0,"total_writeback":0,"unevictable":0,"writeback":0},"limit":2095898624},"name":"/db","id":"109d917d17e241713341b3d03470444c0144510f1e6de726eb72e1d6786a3e5d","networks":{"eth0":{"rx_bytes":3342,"rx_packets":57,"rx_errors":0,"rx_dropped":0,"tx_bytes":998,"tx_packets":13,"tx_errors":0,"tx_dropped":0}}}
85+
```
86+
87+
As you can see, far more details about the container's health are shown here. These stats are refreshed every second. The continuous refresh of metrics can be terminated using `Ctrl + C`.
88+
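The raw JSON can be reduced to the percentages that `docker container stats` prints. The following is a minimal Python sketch under stated assumptions: the memory figures are taken from the sample response above, while the CPU counters are hypothetical values chosen for illustration, because the first sample in a stream has an empty `precpu_stats`. The CPU calculation (delta of container CPU time over delta of system CPU time, scaled by the number of CPUs) is one common approach and is not guaranteed to match the Docker CLI exactly.

```python
def memory_percent(stats):
    """Memory usage as a percentage of the container's memory limit."""
    mem = stats["memory_stats"]
    return 100.0 * mem["usage"] / mem["limit"]


def cpu_percent(stats):
    """CPU usage between the previous and the current sample, in percent."""
    cpu = stats["cpu_stats"]
    pre = stats["precpu_stats"]
    cpu_delta = cpu["cpu_usage"]["total_usage"] - pre["cpu_usage"]["total_usage"]
    # The first sample may not carry a previous system_cpu_usage value.
    system_delta = cpu["system_cpu_usage"] - pre.get("system_cpu_usage", 0)
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0
    ncpus = len(cpu["cpu_usage"].get("percpu_usage") or []) or 1
    return 100.0 * cpu_delta / system_delta * ncpus


sample = {
    # usage and limit copied from the response shown above
    "memory_stats": {"usage": 419139584, "limit": 2095898624},
    # hypothetical counters, for illustration only
    "cpu_stats": {
        "cpu_usage": {"total_usage": 1050000000, "percpu_usage": [0, 0, 0, 0]},
        "system_cpu_usage": 101000000000,
    },
    "precpu_stats": {
        "cpu_usage": {"total_usage": 1000000000},
        "system_cpu_usage": 100000000000,
    },
}

print(round(memory_percent(sample), 2), round(cpu_percent(sample), 2))
```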
89+
== Docker Events
90+
91+
`docker system events` provides real-time events for the Docker host.
92+
93+
. In one terminal (T1), type `docker system events`. The command shows no output at first and waits for an event to occur. The supported events are listed at https://docs.docker.com/engine/reference/commandline/events/#/extended-description.
94+
. In a new terminal (T2), remove the existing `web` container using `docker container rm -f web`.
95+
. T1 shows the updated list of events as:
96+
+
97+
```
98+
2017-02-27T18:48:30.413053776-08:00 container kill 88b04695855ecf5390e57a6955a25f1ff507f7b066c2cd6397a5773a9e7e683f (build-date=20161214, image=jboss/wildfly, license=GPLv2, name=web, signal=9, vendor=CentOS)
99+
2017-02-27T18:48:30.551760207-08:00 container die 88b04695855ecf5390e57a6955a25f1ff507f7b066c2cd6397a5773a9e7e683f (build-date=20161214, exitCode=137, image=jboss/wildfly, license=GPLv2, name=web, vendor=CentOS)
100+
2017-02-27T18:48:30.954543362-08:00 network disconnect 83f14a590c9b8875cca8d050d47ec1e0dbff6db67180a56571496cadbe579e10 (container=88b04695855ecf5390e57a6955a25f1ff507f7b066c2cd6397a5773a9e7e683f, name=bridge, type=bridge)
101+
2017-02-27T18:48:31.192092236-08:00 container destroy 88b04695855ecf5390e57a6955a25f1ff507f7b066c2cd6397a5773a9e7e683f (build-date=20161214, image=jboss/wildfly, license=GPLv2, name=web, vendor=CentOS)
102+
```
103+
+
104+
The output shows a list of events, one per line. The events shown here are `container kill`, `container die`, `network disconnect` and `container destroy`. The date and timestamp of each event are displayed at the beginning of the line, followed by other event-specific information.
105+
+
106+
. In T2, create a new container: `docker container run -d --name web jboss/wildfly`
107+
. The output in T1 is updated to show:
108+
+
109+
```
110+
2017-02-27T18:49:24.218079500-08:00 container create 3cc3e2bf3c43e278e0e4bd2ea238a829610d0a620ab069010b4881c1bf8e096e (build-date=20161214, image=jboss/wildfly, license=GPLv2, name=web, vendor=CentOS)
111+
2017-02-27T18:49:24.383788816-08:00 network connect 83f14a590c9b8875cca8d050d47ec1e0dbff6db67180a56571496cadbe579e10 (container=3cc3e2bf3c43e278e0e4bd2ea238a829610d0a620ab069010b4881c1bf8e096e, name=bridge, type=bridge)
112+
2017-02-27T18:49:24.930142017-08:00 container start 3cc3e2bf3c43e278e0e4bd2ea238a829610d0a620ab069010b4881c1bf8e096e (build-date=20161214, image=jboss/wildfly, license=GPLv2, name=web, vendor=CentOS)
113+
```
114+
+
115+
The events shown here are `container create`, `network connect`, and `container start`.
116+
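Each event line follows the same layout: timestamp, object type, action, object id, and an optional parenthesized list of attributes. The following is a minimal Python sketch of a parser for that layout; the name `parse_event` is hypothetical, and the sketch assumes single-word actions and attribute values that contain no `, ` sequence, which holds for the output shown above but may not hold for every event.

```python
import re

# timestamp, type, action, id, then optional "(key=value, key=value)"
EVENT_RE = re.compile(
    r"^(?P<time>\S+) (?P<type>\S+) (?P<action>\S+) (?P<id>\S+)"
    r"(?: \((?P<attrs>.*)\))?$"
)


def parse_event(line):
    """Split one `docker system events` line into its fields."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized event line: {line!r}")
    event = match.groupdict()
    attrs = {}
    for pair in (event.pop("attrs") or "").split(", "):
        if pair:
            key, _, value = pair.partition("=")
            attrs[key] = value
    event["attributes"] = attrs
    return event


# The `container die` line from the output shown earlier.
line = (
    "2017-02-27T18:48:30.551760207-08:00 container die "
    "88b04695855ecf5390e57a6955a25f1ff507f7b066c2cd6397a5773a9e7e683f "
    "(build-date=20161214, exitCode=137, image=jboss/wildfly, "
    "license=GPLv2, name=web, vendor=CentOS)"
)
event = parse_event(line)
print(event["type"], event["action"], event["attributes"]["exitCode"])
```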
117+
=== Use filters
118+
119+
The list of events can be restricted by filters specified using the `--filter` or `-f` option. The currently supported filters are:
120+
121+
. container (`container=<name or id>`)
122+
. daemon (`daemon=<name or id>`)
123+
. event (`event=<event action>`)
124+
. image (`image=<tag or id>`)
125+
. label (`label=<key>` or `label=<key>=<value>`)
126+
. network (`network=<name or id>`)
127+
. plugin (`plugin=<name or id>`)
128+
. type (`type=<container or image or volume or network or daemon>`)
129+
. volume (`volume=<name or id>`)
130+
131+
Let's use these filters.
132+
133+
. Show events for a container by name
134+
.. In T1, give the command `docker system events -f container=db`.
135+
.. In T2, remove the `web` container using `docker container rm -f web`.
136+
.. T1 does not show any events because it is only listening for events from the `db` container.
137+
. Show events for a specific event action
138+
.. In T1, give the command `docker system events -f event=create`.
139+
.. In T2, create a container: `docker container run -d --name web2 jboss/wildfly`
140+
.. T1 shows the event for container creation
141+
+
142+
```
143+
2017-02-28T12:55:45.631795937-08:00 container create 4728dab7c27816351423d64e60adf21a0246c1006b1131655d8b66fc82e8b324 (build-date=20161214, image=jboss/wildfly, license=GPLv2, name=dreamy_lamport, vendor=CentOS)
144+
```
145+
+
146+
.. In T2, remove the container: `docker container rm -f web2`
147+
.. T1 does not show any additional events because it is only looking for `create` events
148+
.. More samples are explained at https://docs.docker.com/engine/reference/commandline/events/#/filter-events-by-criteria.
149+
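Several filters can be combined on one command line by repeating the option. The following is a minimal Python sketch that builds such a command from a list of key/value pairs; the helper name `events_command` is hypothetical, and the sketch only assembles the argument list without contacting a Docker daemon.

```python
def events_command(filters):
    """Build a `docker system events` argument list from (key, value) pairs."""
    argv = ["docker", "system", "events"]
    for key, value in filters:
        argv += ["--filter", f"{key}={value}"]
    return argv


# Only `create` events for the `db` container.
argv = events_command([("container", "db"), ("event", "create")])
print(" ".join(argv))
```

The resulting list could be handed to `subprocess.Popen` to read the event stream line by line.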
150+
== Prometheus
151+
152+
Docker 1.13 adds an experimental Prometheus-style endpoint with basic metrics on containers, images and other daemon state. This support is only available in experimental builds.
153+
154+
. For Docker for Mac, click on the Docker icon in the status menu
155+
. Select `Preferences...`, `Daemon`, `Advanced` tab
156+
. Update daemon settings:
157+
+
158+
```
159+
{
160+
"metrics-addr" : "0.0.0.0:1337",
161+
"experimental" : true
162+
}
163+
```
164+
+
165+
. Click on `Apply & Restart` to restart the daemon
166+
+
167+
image::docker-mac-metrics-endpoint.png[]
168+
+
169+
. Show the complete list of metrics using `curl http://localhost:1337/metrics`
170+
. Show the list of engine metrics using `curl http://localhost:1337/metrics | grep engine`
171+
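The `grep engine` step can also be done programmatically. The following is a minimal Python sketch that parses Prometheus text-format output and keeps only metrics with a given name prefix; the function name `parse_metrics` is hypothetical, the sample lines are illustrative rather than captured from a real daemon, and the sketch assumes sample lines carry no trailing timestamp.

```python
def parse_metrics(text, prefix=""):
    """Return {metric name with labels: value} for names starting with prefix."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the # HELP / # TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated field on the line.
        name, _, value = line.rpartition(" ")
        if name.startswith(prefix):
            samples[name] = float(value)
    return samples


# Illustrative sample, not real daemon output.
sample = """\
# HELP engine_daemon_container_states_containers The count of containers in various states
# TYPE engine_daemon_container_states_containers gauge
engine_daemon_container_states_containers{state="running"} 2
engine_daemon_container_states_containers{state="stopped"} 1
go_goroutines 53
"""

engine = parse_metrics(sample, prefix="engine")
for name, value in engine.items():
    print(name, value)
```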
172+
=== Prometheus node scraper
173+
174+
Prometheus collects metrics from monitored targets by scraping metrics HTTP endpoints on these targets. Since Prometheus also exposes data in the same manner about itself, it can also scrape and monitor its own health.
175+
176+
. Create a new directory `prometheus` and change to it
177+
. Create `prometheus.yml`
178+
+
179+
```
180+
# A scrape configuration that scrapes only the Prometheus server
181+
# itself.
182+
scrape_configs:
183+
  # Scrape Prometheus itself every 5 seconds.
184+
  - job_name: 'prometheus'
185+
    scrape_interval: 5s
186+
    static_configs:
187+
      - targets: ['localhost:9090']
188+
```
189+
+
190+
This configuration scrapes metrics from the Prometheus container itself, which will be started on port 9090.
191+
+
192+
. Start Prometheus container:
193+
+
194+
```
195+
docker run \
196+
-d \
197+
--name metrics \
198+
-p 9090:9090 \
199+
-v `pwd`:/etc/prometheus \
200+
prom/prometheus
201+
```
202+
+
203+
. The Prometheus dashboard is available at http://localhost:9090
204+
. Show the list of metrics
205+
+
206+
image::prometheus-metrics.png[]
207+
+
208+
. Choose `http_request_duration_microseconds`
209+
. Switch from `Console` to `Graph`
210+
.. Change the duration from `1h` to `5m`
211+
+
212+
image::prometheus-metrics2.png[]
213+
+
214+
. Stop the container: `docker container rm -f metrics`
215+
216+
== cAdvisor
217+
218+
https://github.com/google/cadvisor[cAdvisor] (Container Advisor) provides resource usage and performance characteristics of running containers.
219+
220+
. Run `cAdvisor`
221+
+
222+
```
223+
docker run \
224+
-d \
225+
--name=cadvisor \
226+
-p 8080:8080 \
227+
--volume=/var/run:/var/run:rw \
228+
--volume=/sys:/sys:ro \
229+
--volume=/var/lib/docker/:/var/lib/docker:ro \
230+
google/cadvisor:latest
231+
```
232+
+
233+
. Dashboard is available at http://localhost:8080
234+
+
235+
image::cadvisor-default-dashboard.png[]
236+
+
237+
. High-level CPU and memory utilization is shown. More details about CPU, memory, network and filesystem usage are shown on the same page. CPU usage looks like this:
238+
+
239+
image::cadvisor-cpu-snapshot.png[]
240+
+
241+
. All Docker containers are listed under the `/docker` sub-container.
242+
+
243+
image::cadvisor-docker-metrics.png[]
244+
+
245+
Click on any of the containers and see more details about the container.
246+
247+
cAdvisor samples once a second and keeps only one minute of historical data. The data generated from https://github.com/google/cadvisor/blob/master/docs/storage/influxdb.md[cAdvisor can be exported to InfluxDB]. Optionally, you may use a Grafana front end to visualize the data as explained in https://www.brianchristner.io/how-to-setup-docker-monitoring/[How to setup Docker monitoring].
248+

developer-tools/java/chapters/ch10-newrelic.adoc

-4
This file was deleted.
