eTree provides a broad range of Information Technology Services including software development and Web server applications. Development languages include Borland Delphi, Borland C++, Java, Oracle JDeveloper, Microsoft C# .Net, Oracle database, SQL, PL/SQL

 

 

Analytical Toolbox


The Analytical Toolbox is a suite of statistical tools for the retail energy market.

Unlike many statistical problems, much is known about the target population here. Consumers’ meters are traditionally read every 60 days, so the entire target population is measured, albeit asynchronously and only at this relatively coarse time granularity.

The advent of smart meters that allow polling opens the possibility of measuring the entire population on a much closer to real-time basis. Nevertheless, if there is a cost associated with either introducing the meters or the polling itself, it becomes worthwhile to devise optimal sampling strategies, both for selecting which meters to poll and for deciding how often to poll them.

The toolbox has been developed with the above considerations in mind and forms the basis of a suite of statistical functions that can be extended and modified as requirements dictate. Deployment as a dynamic link library (DLL) allows function calls from common Windows programming languages, providing the flexibility to integrate with existing packages.

Random Sampling

The toolbox exports functions to calculate statistical parameters associated with random sampling from both infinite and finite populations. In these cases the Central Limit Theorem applies, which means that for sufficiently large samples the results hold whatever form the underlying population distribution takes:

  • Estimate the population mean (or sum), with a confidence interval, from a random sample of n consumers.

  • Calculate the size of random sample required to estimate the population mean (or sum) to a specified confidence.
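
As an illustration of the two calculations listed above, here is a minimal Python sketch. It is not the toolbox's API: the function names `estimate_mean` and `required_sample_size` and all parameters are hypothetical, and it assumes a normal approximation (reasonable for large samples) with the usual finite population correction factor (1 - n/N).

```python
import math
from statistics import NormalDist, mean, stdev


def estimate_mean(sample, confidence=0.95, population_size=None):
    """Return (mean, half_width) of a normal-approximation confidence interval.

    Hypothetical helper: if population_size is given, the standard error is
    reduced by the finite population correction sqrt(1 - n/N).
    """
    n = len(sample)
    std_error = stdev(sample) / math.sqrt(n)
    if population_size is not None:
        std_error *= math.sqrt(1 - n / population_size)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return mean(sample), z * std_error


def required_sample_size(sigma, margin, confidence=0.95, population_size=None):
    """Sample size needed so the interval half-width is at most `margin`.

    `sigma` is an assumed standard deviation, e.g. taken from earlier reads.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = (z * sigma / margin) ** 2  # infinite-population answer
    if population_size is not None:
        n0 = n0 / (1 + n0 / population_size)  # finite-population adjustment
    return math.ceil(n0)
```

To estimate a population sum rather than a mean, multiply both the mean and the half-width by the population size. For small samples a Student t critical value would be more appropriate than the normal one used here.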

Stratified Sampling

Stratified sampling makes use of prior measurement of the population to choose a “better than random” sample. As an example, consumers might be ranked by their consumption over an earlier time period (day, month, year, etc.) and every nth consumer then selected. This generates a more statistically robust estimate for the population going forward, since the resulting sample will in general reflect the underlying population better than a purely random one.

In practice, stratified sampling means that the population mean (or sum) can be estimated with more accuracy and greater confidence. Conversely, the same accuracy and confidence as a random sample can be achieved with considerably fewer samples.
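
The rank-and-select step described above can be sketched in a few lines of Python. This is only an illustration: the function name `select_every_nth`, the dictionary layout and the `start` offset are assumptions, not part of the toolbox.

```python
def select_every_nth(prior_consumption, step, start=0):
    """Rank consumers by prior consumption and pick every `step`-th one.

    prior_consumption: dict mapping a consumer id to consumption measured
    over an earlier period (hypothetical data layout).
    start: position in the ranking of the first consumer selected.
    """
    ranked = sorted(prior_consumption, key=prior_consumption.get)
    return ranked[start::step]


# Six consumers ranked by earlier consumption, then every third one selected.
prior = {"a": 5, "b": 1, "c": 3, "d": 9, "e": 7, "f": 2}
chosen = select_every_nth(prior, step=3)
```

Because the selection walks evenly through the ranking, low, medium and high consumers are all represented in the sample in proportion.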

The toolbox exports functions to calculate statistical parameters associated with stratified sampling from a finite population:

  • Estimate the population mean (or sum), with a confidence interval, from a stratified sample of n consumers.

  • Calculate the size of stratified sample required to estimate the population mean (or sum) to a specified confidence.
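
For illustration, the textbook stratified estimator of a finite-population mean can be written as below. This is a sketch under stated assumptions, not the toolbox's implementation: it assumes the population has been divided into explicit strata of known size, each sampled at random, and the name `stratified_mean` and its input layout are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def stratified_mean(strata, confidence=0.95):
    """Return (estimate, half_width) for the population mean.

    strata: list of (stratum_population_size, sampled_values) pairs, with at
    least two sampled values per stratum (hypothetical layout).
    """
    total = sum(size for size, _ in strata)
    estimate = 0.0
    var = 0.0
    for size, values in strata:
        weight = size / total  # share of the population in this stratum
        n = len(values)
        estimate += weight * mean(values)
        # Per-stratum variance of the mean, with finite population correction.
        var += weight ** 2 * (1 - n / size) * variance(values) / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate, z * math.sqrt(var)


# Two strata: 100 low users and 300 high users, three meters read in each.
est, half_width = stratified_mean([(100, [1, 2, 3]), (300, [10, 12, 14])])
```

The gain over simple random sampling comes from the variance term: only the spread within each stratum contributes, not the spread between strata.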

Aggregated and interlaced sampling

Any set of meter reads on any given day is effectively a sample of the population. While the sample may be neither random nor stratified, the results are still statistically useful provided their relationship to the rest of the population can be characterised, for example using historical data. Likewise, a different sample of meter reads on a following day carries information about the first sample, covering the days elapsed since those meters were read. Such staggered or interlaced readings may be aggregated in a statistically robust way, using functions exported by the toolbox, to provide the best estimate of the population at any given point in time. Conversely, the polling frequency required to achieve a given level of precision in the population estimate can be determined.
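
The page does not describe how the toolbox aggregates interlaced reads. One standard way to combine several independent estimates of the same quantity is inverse-variance weighting, sketched below in Python. Everything here is an assumption made for illustration: the function names, the idea of inflating the variance of older reads, and the linear `drift_variance_per_day` model.

```python
def aged_variance(base_variance, days_elapsed, drift_variance_per_day):
    """Inflate an estimate's variance to reflect its age (assumed linear model)."""
    return base_variance + days_elapsed * drift_variance_per_day


def combine_estimates(estimates):
    """Inverse-variance weighted combination of independent estimates.

    estimates: list of (value, variance) pairs.
    Returns (combined_value, combined_variance).
    """
    weights = [1.0 / var for _, var in estimates]
    total_weight = sum(weights)
    combined = sum(w * value for w, (value, _) in zip(weights, estimates))
    return combined / total_weight, 1.0 / total_weight


# Today's reads (variance 1) combined with reads taken two days ago, whose
# variance of 1 has been inflated by an assumed drift of 1 per day.
older = (14.0, aged_variance(1.0, days_elapsed=2, drift_variance_per_day=1.0))
value, var = combine_estimates([(10.0, 1.0), older])
```

Older reads receive less weight but still tighten the combined estimate, which is the sense in which staggered readings reinforce one another.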

Advanced Modelling

Probability Density Function

An analysis of collected data would allow the development of a Probability Density Function (PDF) for energy consumption over a certain timescale. A PDF is a mathematical formula (model) that characterises the spread of values in a set of data. The shape of a PDF is similar to that of a histogram of the data, but it gives a better understanding of the underlying statistical behaviour of the population. Once such a model is established, the likely effectiveness of any proposed sampling regime can be better understood and quantified.
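
As a rough illustration of fitting a PDF to consumption data, the sketch below fits a lognormal distribution. The page does not name the distribution family, so the lognormal is purely an assumption chosen because it is simple and suits positive, right-skewed data; the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def fit_lognormal(values):
    """Estimate (mu, sigma) of a lognormal from positive observations."""
    logs = [math.log(v) for v in values]
    return mean(logs), stdev(logs)


def lognormal_pdf(x, mu, sigma):
    """Density of the lognormal distribution at x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2 * math.pi))


def lognormal_mean_std(mu, sigma):
    """Mean and standard deviation implied by the fitted parameters."""
    m = math.exp(mu + sigma ** 2 / 2)
    return m, m * math.sqrt(math.exp(sigma ** 2) - 1)


# Made-up consumption values whose logarithms are 1, 2 and 3.
mu, sigma = fit_lognormal([math.exp(1), math.exp(2), math.exp(3)])
model_mean, model_std = lognormal_mean_std(mu, sigma)
```

The standard deviation implied by the fitted model could then be used as the assumed spread when sizing a sample for a proposed regime.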

Time Series Analysis

An analysis of historical data as a time series would aid predictive analysis of future consumption. Having a model for the PDF helps, since one would expect its form to remain invariant despite significant seasonal fluctuation. This reduces the complexity of the problem to one of modelling how the parameters of the PDF (three or four numbers) change with time. An analysis that integrates historical behaviour with recent measurement in a statistically robust way would provide the best estimate going forward.

Contact eTree™ today to request the Analytical Toolbox or discuss your requirements.


 

   
Copyright © 2008 eTree™. All rights reserved.

www.etree.co.nz