Scaling Prometheus Agent Instances


As the number of nodes in a cluster grows, so does the volume of data Prometheus collects, and a single Prometheus agent will eventually run out of capacity. The most common failure is running out of memory as time series cardinality increases. When Prometheus instances start failing for lack of memory, it is time to scale.

Vertical Scaling

Vertical scaling is the simpler approach: give the node that runs Prometheus more memory or CPU. However, this is unlikely to be enough for very large clusters, and you may not want a single pod to occupy such a large share of a node's memory. In either case, consider horizontal scaling.

Horizontal Scaling

Horizontal scaling is enabled by a configuration parameter that lets several Prometheus servers running in agent mode share the work of collecting your data. This design is commonly called sharding.

The number of replicas in the deployed StatefulSet matches the value of the sharding.total_shards_count property. Enabling this option adds extra relabel rules by default, which ensure that each target is scraped by only one Prometheus server. The rules are based on the hashmod of the target's address.

To apply these relabel rules, the agent first computes a hash of the target's address. It then takes that hash modulo the total number of shards. The result identifies the shard that the target belongs to.
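The hash-then-modulo step described above can be sketched in a few lines of Python. This is only an illustration: the function name shard_for and the sample addresses are hypothetical, and the choice of MD5 with the last eight digest bytes is an assumption modeled on Prometheus's hashmod relabel action, so the shard numbers it produces should not be expected to match what the agent computes.

```python
import hashlib


def shard_for(address: str, total_shards: int) -> int:
    """Return the 0-based shard index for a target address (illustrative only)."""
    digest = hashlib.md5(address.encode("utf-8")).digest()
    # Interpret the last 8 bytes of the digest as an unsigned big-endian
    # integer, then reduce it modulo the number of shards.
    return int.from_bytes(digest[8:], "big") % total_shards


# Hypothetical target addresses, spread over 3 shards.
targets = ["10.0.0.1:9100", "10.0.0.2:9100", "10.0.0.3:9100", "10.0.0.4:9100"]
assignment = {target: shard_for(target, 3) for target in targets}

for target, shard in assignment.items():
    print(f"{target} -> shard {shard}")
```

Because the hash depends only on the address, every server reaches the same answer independently, so each target ends up with exactly one owner without any coordination between servers.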

Identifying the Target Scraper

Every metric carries a prometheus_server label, which identifies the Prometheus server that scraped it, based on the shard index. Combining the cluster name label with the prometheus_server label uniquely identifies a Prometheus server instance.

Self-Metrics

Self-metrics should be collected from every Prometheus server, so the job that collects them is exempt from the extra rules that sharding adds. This works because the agent supports a skip_sharding flag in static_target jobs. The flag is already set in the default self-metrics job.
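To make the effect of skip_sharding concrete, here is a rough Python sketch of the decision each server makes. The function names, the sample addresses, and the MD5-based hash are assumptions made for illustration and are not the agent's actual code; only the skip_sharding name comes from the configuration described above.

```python
import hashlib


def shard_for(address: str, total_shards: int) -> int:
    """Assumed hash-mod assignment (illustrative, not the agent's real hash)."""
    digest = hashlib.md5(address.encode("utf-8")).digest()
    return int.from_bytes(digest[8:], "big") % total_shards


def should_scrape(address: str, shard_index: int, total_shards: int,
                  skip_sharding: bool = False) -> bool:
    """Decide whether the server with this shard index scrapes the target."""
    if skip_sharding:
        # Jobs flagged with skip_sharding bypass the hash-mod rules entirely.
        return True
    return shard_for(address, total_shards) == shard_index


total = 3

# A regular target is owned by exactly one shard.
sharded_scrapers = [i for i in range(total)
                    if should_scrape("10.0.0.1:9100", i, total)]

# A self-metrics target (hypothetical address) is scraped by every shard.
self_metrics_scrapers = [i for i in range(total)
                         if should_scrape("localhost:9090", i, total,
                                          skip_sharding=True)]

print(sharded_scrapers, self_metrics_scrapers)
```

In this sketch the flag is checked before any hashing, which is why a flagged job is collected by every server rather than by a single owner.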

Limitations

Scrape jobs added through extra_scrape_configs, which holds raw Prometheus job definitions, are not covered by sharding: their targets are scraped by every Prometheus server regardless of the sharding configuration. Auto-scaling is not currently supported. To increase or decrease the number of shards, change the chart settings, which restarts the Prometheus pods.


There are several options to consider when scaling Prometheus, and running it yourself has real benefits. If your organization is large enough to have dedicated staff who can devote time to the care and feeding of a scalable Prometheus infrastructure, this option is worth investigating.
