Data reliability methodology

Learn how Scaler evaluates data reliability and data quality, how your data reliability score is calculated, and how monitoring methods affect scoring and ESG reporting frameworks including INREV SDDS.

Purpose

This article explains how Scaler evaluates data reliability: how closely reported consumption represents actual energy, water, or waste use over time.

It covers how the Data reliability Score (0–100) is calculated for energy, water, and waste, how monitoring methods affect that score, and how the results map to ESG reporting frameworks such as INREV SDDS.


Data quality vs data reliability

Data quality and data reliability are related but distinct concepts.

  • Data quality refers to whether input data is correct, complete, and valid at the point of entry
  • Data reliability refers to how representative and trustworthy consumption data is over time
    • Evaluated through monitoring method scoring and coverage analysis
    • Helps users understand confidence in reported KPIs
    • This article focuses on reliability methodology

Both work together: quality checks ensure valid inputs, while reliability scores indicate how well those inputs represent actual consumption.


What is data quality?

Data quality describes whether your data is:

  • Complete – all required fields are filled in
  • Accurate – values are correct and properly formatted
  • Consistent – logic and values align across the portfolio
  • Valid – data passes required validation and reporting rules

High data quality ensures that analytics and reports are built on clean, usable inputs.


How Scaler supports data quality

Scaler supports data quality through a combination of structural controls and automated checks:

  • Required fields and controlled dropdowns for standardization
  • Built-in validation rules to flag invalid or illogical entries
  • Automated alerts for:
    • Errors (invalid values)
    • Missing data
    • Warnings (outliers or inconsistencies)
  • Outlier detection to highlight unusual changes in resource use

Together, these mechanisms help ensure data is reliable enough for analytics and reporting from the start.
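The checks above can be pictured with a short sketch. This is a hypothetical illustration only: the field names, severity labels, and the 50% outlier threshold are our assumptions, not Scaler's actual validation logic.

```python
# Illustrative sketch of entry-level quality checks. Field names, severities,
# and thresholds are assumptions for illustration, not Scaler's actual rules.

REQUIRED_FIELDS = ["asset_id", "period", "monitoring_method", "value", "unit"]

def check_entry(entry, previous_value=None):
    """Return a list of (severity, message) findings for one consumption entry."""
    findings = []
    # Completeness: every required field must be present and non-empty
    for field in REQUIRED_FIELDS:
        if entry.get(field) in (None, ""):
            findings.append(("missing", f"required field '{field}' is empty"))
    # Validity: consumption values cannot be negative
    value = entry.get("value")
    if value is not None and value < 0:
        findings.append(("error", "consumption value is negative"))
    # Outlier warning: flag large swings vs. the previous period
    # (an illustrative 50% threshold)
    if value is not None and previous_value:
        change = abs(value - previous_value) / previous_value
        if change > 0.5:
            findings.append(("warning", f"value changed {change:.0%} vs. previous period"))
    return findings

findings = check_entry(
    {"asset_id": "A-001", "period": "2025-01", "monitoring_method": "invoice",
     "value": 3200, "unit": "kWh"},
    previous_value=1500,
)
# One 'warning' finding: |3200 - 1500| / 1500 is roughly a 113% change
```

In practice such checks run at the point of entry, so errors and outliers surface as alerts before the data feeds into analytics.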


What is data reliability?

Data reliability reflects how closely reported consumption represents actual energy, water, or waste use over time. Scaler assesses data reliability based on the Monitoring method used to obtain consumption values, in line with a PCAF-aligned methodology and market practices used by frameworks such as GRESB and INREV.

Scaler calculates a Data reliability Score (0–100) at the asset level for:

  • Energy
  • Water
  • Waste

This can be found at Data Collection Portal β†’ Portfolio β†’ Reports.

The score helps users understand the confidence level of reported KPIs, particularly for benchmarking, trend analysis, and ESG disclosures.


Alongside the asset-level bar graphs reflecting data reliability, Scaler displays an annual portfolio-level trend graph that shows where data coverage or monitoring could be improved.


How data reliability is determined

Data reliability is based on three primary factors:

  • Monitoring method
  • Area coverage – how much of the asset’s floor area is covered
  • Time coverage – how many days in the reporting period have valid data

The overall reliability score is a weighted average of the monitoring method scores, weighted by the share of floor area and time each method covers.


Monitoring method scoring

Different monitoring methods contribute differently to the data reliability score based on how closely the underlying data reflects direct measurement of the asset. All monitoring methods represent data that is manually entered by clients β€” Scaler does not generate or substitute this data on your behalf.

Each monitoring method is listed below with its score, data source type, and classification:

  β€’ Smart meter: 1.0 (Automatically recorded; Measured)
  β€’ Invoice: 0.8 (Directly reported; Measured)
  β€’ Conventional meter: 0.8 (Directly reported; Measured)
  β€’ Standard consumption (cluster average): 0.6 (Aggregated from comparable assets; Measured)
  β€’ Estimation (Number of bins), waste only: 0.2 (Derived from observed/invoiced waste volumes; Measured)
  β€’ Standard consumption (postal code)*: 0.4 (Averaged across geographic area; Modelled)
  β€’ Manual estimate*: 0.2 (Internally calculated; Modelled)

*Based on our mapping to the INREV SDDS reporting framework, these methods correspond to estimated data as defined by INREV. All other methods, including Estimation (Number of bins), correspond to actual data in INREV SDDS outputs.

Note: The final asset score reflects both the monitoring method score and the proportion of area and time covered by each method.
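As a sketch of how these pieces could combine: the method scores below come from the table above, but the exact weighting formula (method score Γ— area share Γ— time share, summed across methods) is our assumption about how area and time coverage enter the calculation, not Scaler's published formula.

```python
# Illustrative reliability-score calculation. Method scores are from the
# published table; the weighting formula itself is an assumption.

METHOD_SCORES = {
    "smart_meter": 1.0,
    "invoice": 0.8,
    "conventional_meter": 0.8,
    "standard_consumption_cluster_average": 0.6,
    "estimation_number_of_bins": 0.2,        # waste only
    "standard_consumption_postal_code": 0.4,
    "manual_estimate": 0.2,
}

def reliability_score(segments):
    """segments: list of (method, area_share, time_share) tuples.
    Uncovered area or time contributes nothing. Returns a 0-100 score."""
    total = sum(METHOD_SCORES[method] * area * time
                for method, area, time in segments)
    return round(total * 100)

# 60% of floor area on smart meters for the full year,
# 40% on invoices for half the year:
score = reliability_score([
    ("smart_meter", 0.6, 1.0),
    ("invoice", 0.4, 0.5),
])
# 1.0*0.6*1.0 + 0.8*0.4*0.5 = 0.76, i.e. a score of 76
```

The example shows why both coverage dimensions matter: a high-scoring method applied to only part of the area, or part of the year, pulls the asset score down proportionally.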


Measured vs modelled consumption data

Measured methods reflect data that is directly recorded, reported, or invoiced at the asset level β€” or derived from aggregated actual meter readings provided by grid operators.

Modelled methods reflect data that has been averaged or calculated without direct measurement of the specific asset. This data is still entered manually by the client β€” the label describes the nature of the underlying data source, not a calculation performed by Scaler.

Important: Scaler has a separate linear extrapolation model that generates estimated consumption figures to fill data gaps. This is entirely distinct from the monitoring methods described here. Scaler-generated estimates currently appear only in analytics and a meter-level data export β€” they are not included in any reports. See Estimated consumption data in Scaler for more information.

Monitoring method name changes

As of February 2026, the dropdown options use general terminology rather than Netherlands-specific terms.

  β€’ Standard consumption (cluster average), previously estimation_(sjv_cluster)
  β€’ Standard consumption (postal code), previously estimation_(sjv_postal_code)
  β€’ Manual estimate, previously estimation_(calculation)
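If you work with older exports that still contain the previous values, a simple lookup can normalise them. This is a hypothetical helper, not part of Scaler; the mapping itself is taken from the table above.

```python
# Hypothetical helper for normalising pre-February-2026 monitoring-method
# names (e.g. from older data exports) to the current general names.
LEGACY_METHOD_NAMES = {
    "estimation_(sjv_cluster)": "Standard consumption (cluster average)",
    "estimation_(sjv_postal_code)": "Standard consumption (postal code)",
    "estimation_(calculation)": "Manual estimate",
}

def normalise_method(name):
    """Map a legacy method name to its current name; pass others through."""
    return LEGACY_METHOD_NAMES.get(name, name)

normalise_method("estimation_(sjv_cluster)")  # "Standard consumption (cluster average)"
```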

Country-specific standardized consumption systems

In some countries, grid operators provide standardized consumption data by aggregating actual meter readings across multiple assets. This data is available in two forms, each with different reliability levels:

Cluster average vs postal code average

  • Cluster average (Standard consumption (cluster average)): Aggregated actual consumption from similar asset types (e.g., office buildings of similar size and characteristics). Because this reflects real meter readings from comparable assets, it is treated as actual/measured consumption with a score of 0.6.
  • Postal code average (Standard consumption (postal code)): Averaged consumption across all building types within a geographic area. This broader averaging is treated as modelled consumption with a score of 0.4 because it includes diverse property types and use patterns.

Why the difference matters

Cluster averages provide more reliable data because they compare similar assets, while postal code averages dilute specificity by including all property types in an area. This distinction affects both data reliability scoring and whether consumption is categorized as actual/measured or modelled/estimated for reporting purposes.

Netherlands: Standaard Jaarverbruik (SJV)

The Netherlands uses Standaard Jaarverbruik (SJV) data provided by grid operators:

  • SJV Cluster data: Corresponds to Standard consumption (cluster average) β€” aggregated from comparable asset types
  • SJV Postal Code data: Corresponds to Standard consumption (postal code) β€” averaged across all buildings in a postal code

Other countries

Similar systems may exist where grid operators provide aggregated consumption data at the asset-type or connection level. Where such systems are methodologically comparable to cluster-based aggregation, clients may use Standard consumption (cluster average) in Scaler, provided this accurately represents how the consumption data was obtained.


Explaining anomalies

Scaler flags unusual consumption trends and allows users to add contextual comments directly in the platform.

This supports transparency around:

  • Building closures
  • Occupancy changes
  • Operational disruptions
  • Other one-off events

Why data quality and reliability matter

Together, data quality validation and data reliability scoring help ensure that:

  • Inputs are correct from the start (via quality checks)
  • KPIs can be trusted over time (via reliability scoring)
  • ESG reporting is defensible and auditable
  • Decision-making for decarbonisation pathways is better informed

Related: To understand how Scaler validates data at the point of input through automated alerts, see Understanding data completion and alerts.
