
Using Prometheus
Prometheus is an open-source system monitoring and alerting tool initially developed at SoundCloud.
It collects metric data from servers or applications, stores it as time-series data, and allows visualization or alerting based on this data.
What is a Metric?
A metric is a numerical measurement:
- Page response time
- CPU or memory usage
- API call count
What is a Time Series?
A time series is a sequence of values for a metric, each recorded with a timestamp, so it shows how the metric changes over time:
- Average page response time
- Changes in CPU or memory usage over a day
- API error rates over a week
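As a rough illustration, a time series can be pictured as a list of timestamped samples for one metric. The sketch below is a simplified model only: the `Sample` type and the values are made up for this example and are not how Prometheus stores data internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    timestamp: int  # Unix time in seconds
    value: float    # measured value at that moment


# Hypothetical series: CPU usage (%) sampled once per minute.
cpu_usage = [
    Sample(1700000000, 12.5),
    Sample(1700000060, 40.0),
    Sample(1700000120, 37.5),
]

# A single sample is just one number; the series shows how it changed.
change = cpu_usage[-1].value - cpu_usage[0].value
average = sum(s.value for s in cpu_usage) / len(cpu_usage)
print(change, average)
```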
Docker Container
docker-compose.yml
YAML
services:
  prometheus:
    image: prom/prometheus
    container_name: metric
    volumes:
      - ./prometheus/config/:/etc/prometheus/
      - ./prometheus/data:/prometheus
    ports:
      - 9090:9090
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--web.enable-lifecycle'
      - '--storage.tsdb.retention.time=2d'
      - '--storage.tsdb.retention.size=200MB'
    restart: always
Config File
prometheus.yml
YAML
global:
  scrape_interval: 1m
  evaluation_interval: 1m

scrape_configs:
  - job_name: 'blog api'
    metrics_path: /api/metrics # api metrics path
    scheme: http
    static_configs:
      - targets: [ 'host.docker.internal:8082' ]
        labels:
          instance: 'api'
  # Collect metrics for Prometheus's own status/performance monitoring. Useful for Grafana integration.
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
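Because the container is started with `--web.enable-lifecycle`, Prometheus accepts an HTTP POST to `/-/reload` to re-read prometheus.yml without a restart. The following is a minimal sketch of that call, assuming the server is reachable at `http://localhost:9090` as in the compose file above; the helper names `build_reload_request` and `reload_prometheus` are made up for this example.

```python
import urllib.request


def build_reload_request(base_url: str = "http://localhost:9090") -> urllib.request.Request:
    """Build the POST request that asks Prometheus to re-read its config."""
    # An empty body is enough; the endpoint only cares about the method.
    return urllib.request.Request(f"{base_url}/-/reload", data=b"", method="POST")


def reload_prometheus(base_url: str = "http://localhost:9090") -> int:
    """Send the reload request and return the HTTP status code."""
    with urllib.request.urlopen(build_reload_request(base_url), timeout=5) as resp:
        return resp.status
```

With the container running, calling `reload_prometheus()` after editing prometheus.yml should return 200 if the new configuration was accepted. An invalid configuration makes the request fail, and `urlopen` raises an `HTTPError`.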
Impact of scrape_interval on Memory Usage
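Prometheus keeps recent samples for every active time series in memory, so memory usage grows with both the number of series and how often they are scraped.
For example, with 100 metrics (time series) and a scrape_interval of 15 seconds, each series stores 4 samples per minute, or 400 samples per minute in total.
Increasing the interval to 1 minute stores only 1 sample per series per minute, or 100 in total.
The memory figures below are rough estimates; actual usage depends mainly on how many series are being scraped.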
- 15s: High memory usage (~500MB), high data resolution (sensitive to changes)
- 30s: Medium memory usage (~250–300MB), medium data resolution
- 1m: Low memory usage (~150–200MB), low data resolution (good for trends, may miss short-term changes)
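The sample counts behind these three settings can be worked out with a few lines of arithmetic. This is only a sketch of the calculation, assuming 100 time series as in the example above; it counts samples and does not predict memory usage.

```python
def samples_per_minute(series_count: int, scrape_interval_s: int) -> float:
    """Samples ingested per minute across all series."""
    return series_count * 60 / scrape_interval_s


for interval in (15, 30, 60):
    per_min = samples_per_minute(100, interval)
    per_day = per_min * 60 * 24  # 1440 minutes in a day
    print(f"{interval:>2}s: {per_min:.0f} samples/min, {per_day:.0f} samples/day")
```

For 100 series this gives 400, 200, and 100 samples per minute, so moving from 15s to 1m cuts the number of stored samples to a quarter.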
What is Data Resolution?
Data resolution refers to how frequently time-series data is recorded, indicating the granularity of data collection.
storage.tsdb.retention
These flags control how long collected data is kept and how much disk space it may use.
--storage.tsdb.retention.time
Defines how long Prometheus retains collected data.
- Longer retention increases disk usage.
- It also affects memory usage to some extent, because more on-disk blocks have to be indexed.
- For long-term storage, consider external solutions like Thanos or Cortex.
Default
YAML
--storage.tsdb.retention.time=15d # Retain time-series data for 15 days, then delete automatically
--storage.tsdb.retention.size
Limits the maximum size of time-series data stored on disk.
- Useful on servers with limited disk space or in test environments.
- When used together with retention.time, whichever limit is reached first applies.
- For example, with retention.time=7d and retention.size=200MB, the oldest data is deleted once the stored data exceeds 200MB, even if 7 days haven't passed.
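The interaction between the two limits can be sketched with a small model. This is a simplified illustration, not Prometheus's actual implementation: the `blocks_to_delete` helper and the block names, ages, and sizes are all hypothetical.

```python
from typing import List, Tuple

Block = Tuple[str, float, float]  # (name, age_days, size_mb)


def blocks_to_delete(blocks: List[Block], max_age_days: float, max_size_mb: float) -> List[str]:
    """Return names of blocks removed under either limit, oldest first."""
    ordered = sorted(blocks, key=lambda b: b[1], reverse=True)  # oldest first

    # Time limit: anything older than max_age_days goes.
    deleted = [b[0] for b in ordered if b[1] > max_age_days]
    kept = [b for b in ordered if b[1] <= max_age_days]

    # Size limit: drop the oldest remaining blocks until the total fits.
    total = sum(b[2] for b in kept)
    while kept and total > max_size_mb:
        oldest = kept.pop(0)
        total -= oldest[2]
        deleted.append(oldest[0])
    return deleted


blocks = [("a", 5, 80), ("b", 3, 90), ("c", 1, 70)]
print(blocks_to_delete(blocks, max_age_days=7, max_size_mb=200))
```

Here no block is older than 7 days, but the total of 240MB exceeds the 200MB limit, so the oldest block `a` is removed even though it is only 5 days old.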