feat(host_metrics source): add temperature metrics collector#25607

Open
somaz94 wants to merge 1 commit into vectordotdev:master from somaz94:feat/host-metrics-temperature

somaz94 commented Jun 11, 2026

Summary

Adds a temperature collector to the host_metrics source. When enabled, it reads hardware temperature sensors via sysinfo::Components and emits three gauges, each tagged with the component label of the sensor it was read from:

  • temperature_celsius — current temperature
  • temperature_max_celsius — highest recorded temperature
  • temperature_critical_celsius — critical threshold (only when the sensor reports one)

The collector is opt-in (it is not part of the default collector set). Many environments where Vector runs — containers, virtual machines, most cloud instances — expose no temperature sensors, so enabling it by default would add a per-scrape Components refresh that yields nothing. Users add temperature to collectors to turn it on. Components that do not report a given value are skipped, and hosts without sensors simply produce no metrics.

Closes: #21389

Vector configuration

sources:
  host:
    type: host_metrics
    collectors:
      - temperature

How did you test this PR?

  • Added a generates_temperature_metrics unit test that asserts every emitted metric is a gauge named temperature* and carries the component tag. The test tolerates an empty result so it also passes in sensorless CI environments.
  • Updated the hand-written metric documentation and regenerated the component config docs (generated/host_metrics.cue) for the new temperature collector enum value.
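
The invariant that the unit test is described as checking can be sketched roughly as follows. This is a minimal illustration, not the test in this PR: `Metric`, `Kind`, and `all_valid_temperature_gauges` are hypothetical stand-ins for Vector's own metric types and test helpers.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified metric kind (not Vector's real type).
#[derive(Debug, PartialEq)]
pub enum Kind {
    Gauge,
    Counter,
}

/// Hypothetical, simplified metric record (not Vector's real type).
pub struct Metric {
    pub name: String,
    pub kind: Kind,
    pub tags: BTreeMap<String, String>,
}

/// Returns true when every metric is a gauge named `temperature*` that
/// carries a `component` tag. An empty slice passes, which mirrors a
/// sensorless CI host that emits nothing.
pub fn all_valid_temperature_gauges(metrics: &[Metric]) -> bool {
    metrics.iter().all(|metric| {
        metric.kind == Kind::Gauge
            && metric.name.starts_with("temperature")
            && metric.tags.contains_key("component")
    })
}
```

Because `Iterator::all` is vacuously true on an empty input, the same check covers hosts with and without sensors.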

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Added a changelog fragment under changelog.d/.
  • No. A maintainer will apply the no-changelog label to this PR.

References

github-actions Bot added the domain: sources and domain: external docs labels Jun 11, 2026
github-actions Bot commented Jun 11, 2026

Contributor

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

github-actions Bot added the docs review on hold label Jun 11, 2026
somaz94 commented Jun 11, 2026

Author

I have read the CLA Document and I hereby sign the CLA

@somaz94 somaz94 marked this pull request as ready for review June 12, 2026 01:14
@somaz94 somaz94 requested review from a team as code owners June 12, 2026 01:14

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 34f8b6fec3

impl HostMetrics {
    pub async fn temperature_metrics(&self, output: &mut super::MetricsBuffer) {
        output.name = "temperature";
        let components = Components::new_with_refreshed_list();

[P2] Persist Components before reporting max temperatures

When a Linux sensor does not expose a kernel tempN_highest file, sysinfo::Component::max() is computed by comparing successive refreshes of the same Component. Recreating Components on every temperature_metrics call resets that history, so temperature_max_celsius becomes the current sample on each scrape rather than the highest observed temperature. Keep the Components collection on HostMetrics and refresh it between scrapes, or avoid emitting the computed max when no persistent history is available.
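
The effect described here can be illustrated without sysinfo at all. The sketch below uses a hypothetical `MaxTracker` (not a type from this PR or from sysinfo) that keeps a running maximum per sensor; it is only meant to show why the history has to outlive a single scrape.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for per-sensor state that is kept between scrapes.
#[derive(Default)]
pub struct MaxTracker {
    max_by_sensor: HashMap<String, f32>,
}

impl MaxTracker {
    /// Records one sample and returns the highest value seen so far for
    /// that sensor. `f32::max` ignores a NaN operand when the other
    /// operand is a number, so one failed read does not poison the history.
    pub fn observe(&mut self, sensor: &str, current: f32) -> f32 {
        let max = self
            .max_by_sensor
            .entry(sensor.to_owned())
            .or_insert(current);
        *max = max.max(current);
        *max
    }
}
```

A tracker that lives as long as the source reports 72.5 after samples of 60.0, 72.5, and 55.0, while a tracker created fresh for the last scrape reports 55.0. That second case corresponds to the behavior this comment attributes to rebuilding the collection on every call.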

impl HostMetrics {
    pub async fn temperature_metrics(&self, output: &mut super::MetricsBuffer) {
        output.name = "temperature";
        let components = Components::new_with_refreshed_list();

[P2] Honor SYSFS_ROOT when scraping temperatures

In containerized host-metrics deployments that mount the host sysfs somewhere like /host/sys and set SYSFS_ROOT, the other Linux collectors are redirected through init_roots(), but this sysinfo::Components call reads the process' normal sysfs path instead. Enabling the new collector in that documented setup will scrape the container's /sys and commonly emit no host temperature metrics even though the host sensors are mounted under SYSFS_ROOT.
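
As a rough sketch of the redirection being asked for, the helpers below resolve a sensor directory under a configurable sysfs root. Both `sysfs_root` and `hwmon_dir` are hypothetical names for illustration; they do not reproduce Vector's `init_roots()` and they say nothing about whether sysinfo can be pointed at a custom root.

```rust
use std::path::{Path, PathBuf};

/// Picks the sysfs root: the override when it is set and non-empty,
/// otherwise the conventional `/sys`.
pub fn sysfs_root(override_root: Option<&str>) -> PathBuf {
    match override_root {
        Some(root) if !root.is_empty() => PathBuf::from(root),
        _ => PathBuf::from("/sys"),
    }
}

/// Hypothetical helper: the hwmon directory to scan under the chosen root.
pub fn hwmon_dir(root: &Path) -> PathBuf {
    root.join("class").join("hwmon")
}
```

A caller would pass the environment value, for example `sysfs_root(std::env::var("SYSFS_ROOT").ok().as_deref())`, so that a host sysfs mounted at /host/sys resolves to /host/sys/class/hwmon rather than the container's own /sys.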

Comment on lines +18 to +22
if let Some(temperature) = component.temperature() {
    output.gauge(TEMPERATURE_CELSIUS, temperature as f64, tags());
}
if let Some(max) = component.max() {
    output.gauge(TEMPERATURE_MAX_CELSIUS, max as f64, tags());

[P2] Drop NaN temperature readings

On Linux, sysinfo can return Some(f32::NAN) for temperature and max values when a sensor file exists but the read fails, and these branches emit that value as a normal gauge. In those sensor-error cases, Vector forwards temperature_celsius and temperature_max_celsius samples with NaN values, which downstream metric sinks such as New Relic explicitly reject. Filter these readings with is_finite() before creating metrics.
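
One possible shape for that filter, assuming the readings arrive as `Option<f32>` as in the snippet above. `finite_reading` is a hypothetical helper name, not something in this PR.

```rust
/// Converts an optional sensor reading into a gauge value, dropping NaN
/// and infinite readings so they never reach a sink.
pub fn finite_reading(reading: Option<f32>) -> Option<f64> {
    reading.filter(|value| value.is_finite()).map(f64::from)
}
```

Each `if let Some(..)` branch could then match on `finite_reading(component.temperature())` and the equivalent for the max value, which also replaces the `as f64` cast with a lossless `f64::from` conversion.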

Comment on lines +16 to +17
let label = component.label();
let tags = || metric_tags!(COMPONENT => label);

[P2] Fall back to component IDs for empty labels

On Linux systems where sysinfo falls back from hwmon to /sys/class/thermal (for example Raspberry Pi-style environments), component.label() is empty while component.id() contains the thermal-zone identifier. Using the empty label as the only component tag makes all temperature series share the same tag set when more than one thermal zone is present, so downstream aggregation can collapse distinct sensors; use the ID as a fallback when the label is empty.
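
A minimal sketch of the suggested fallback. `component_tag` is a hypothetical helper, and taking the ID as `Option<&str>` is an assumption about how the value is exposed; check it against the sysinfo version in use.

```rust
/// Chooses the value for the `component` tag: the label when it is
/// non-blank, otherwise the ID when one is available, otherwise `None`
/// so the caller can decide whether to skip the sensor.
pub fn component_tag(label: &str, id: Option<&str>) -> Option<String> {
    let label = label.trim();
    if !label.is_empty() {
        return Some(label.to_owned());
    }
    id.map(str::trim)
        .filter(|id| !id.is_empty())
        .map(str::to_owned)
}
```

With this, two thermal zones that both report an empty label would still produce distinct tag sets as long as their IDs differ.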

@somaz94 somaz94 force-pushed the feat/host-metrics-temperature branch from 34f8b6f to 4c51702 Compare June 12, 2026 05:52
@evazorro evazorro self-assigned this Jun 12, 2026
Labels

  • docs review on hold (The documentation team reviews PRs only after a PR is approved by the COSE team.)
  • domain: external docs (Anything related to Vector's external, public documentation)
  • domain: sources (Anything related to the Vector's sources)

Development

Successfully merging this pull request may close these issues.

could you pls add sensors info in host_metrics

2 participants