
Combining Estimates

I have many skills in life, but judging the age of people is not one of them!

Let’s pretend that you know, from past experience, that I make age estimation errors that are normally distributed and have a standard deviation of 5.5 years.

A colleague at work has much better age estimating skills. Let’s say that, based on her historic performance, her standard deviation is a more respectable 2.3 years.

A person, who neither of us has met before, appears. I estimate the age of this mystery person to be 54 years old. My colleague estimates her age to be 64.

Is it possible to combine our two (independent) estimates to get a more accurate prediction, and if so, what would be this estimate?

It might be tempting to just accept my colleague’s estimate, since she has the lower standard deviation, but my answer carries information too.

Image: Nell Moralee

Blended Estimate

Let’s look at this generically. Let G1 and G2 be our respective guesses and σ1 and σ2 be our standard deviations.

We’re looking to create a blended estimate that combines some of my answer (and its variance) with some of my colleague’s. Let’s assume we do this with a linear weighting: we’ll take k of my estimate, and (1-k) of my colleague’s.
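
Written out, with G standing for the blended guess (a symbol introduced here purely for convenience), the linear weighting would be:

```latex
G = k\,G_1 + (1-k)\,G_2
```

With k = 0 this is just my colleague’s guess, and with k = 1 it is just mine.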

Because our estimates are independent, the Variances of the two weighted parts of the blended answer simply add. (Variance is the average of the squared differences from the mean; it is the square of the standard deviation.)
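
As a sketch of that step, writing σ² for the Variance of the blended guess (notation assumed here), the two weighted contributions combine as:

```latex
\sigma^2 = k^2\,\sigma_1^2 + (1-k)^2\,\sigma_2^2
```

The weights appear squared because scaling a random quantity by k scales its Variance by k².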

Differentiating the Variance with respect to k, and setting this to zero will find the turning points, and confirming that the second derivative is always positive shows that this corresponds to a minimum value for the Variance.
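
Carrying this out on a blended Variance of the form k²σ1² + (1-k)²σ2² (the form that follows from weighting two independent estimates by k and 1-k) would give:

```latex
\begin{aligned}
\frac{d\sigma^2}{dk} &= 2k\,\sigma_1^2 - 2(1-k)\,\sigma_2^2 = 0 \\
\frac{d^2\sigma^2}{dk^2} &= 2\,\sigma_1^2 + 2\,\sigma_2^2 > 0
\end{aligned}
```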

Rearranging this gives the value for k that gives the lowest Variance.
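
Solving 2kσ1² - 2(1-k)σ2² = 0 for k, the optimal weight on my guess works out as:

```latex
k = \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}
```

Note that my weight grows with my colleague’s Variance, not my own: the noisier the other estimator, the more my guess counts.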

Substituting this back into our guess formula reveals the blended guess with the minimum Variance.
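
With k = σ2² / (σ1² + σ2²), and again writing G for the blended guess and σ² for its Variance (symbols assumed here), the result should read:

```latex
G = \frac{\sigma_2^2\,G_1 + \sigma_1^2\,G_2}{\sigma_1^2 + \sigma_2^2},
\qquad
\sigma^2 = \frac{\sigma_1^2\,\sigma_2^2}{\sigma_1^2 + \sigma_2^2}
```

Each guess is weighted by the other’s Variance (equivalently, by the inverse of its own), and the blended Variance is smaller than either individual Variance.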


Applying the values of G1 = 54 and σ1 = 5.5, and of G2 = 64 and σ2 = 2.3, we get a blended estimate of ≈ 62.5 years old.
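
As a rough check of the arithmetic, here is a minimal Python sketch. The function names blend_estimates and simulate_rms_errors are made up for illustration, and the simulation assumes, as in the setup above, independent, zero-mean, normally distributed errors.

```python
import math
import random


def blend_estimates(g1, sd1, g2, sd2):
    """Return (weight k on g1, blended guess, blended standard deviation)."""
    var1, var2 = sd1 ** 2, sd2 ** 2
    k = var2 / (var1 + var2)                 # weight on the first guess
    blended = k * g1 + (1 - k) * g2          # minimum-variance linear blend
    blended_sd = math.sqrt(var1 * var2 / (var1 + var2))
    return k, blended, blended_sd


def simulate_rms_errors(sd1, sd2, trials=100_000, seed=0):
    """Simulate both estimators' errors; return (RMS of guess 2 alone, RMS of blend)."""
    rng = random.Random(seed)
    k = sd2 ** 2 / (sd1 ** 2 + sd2 ** 2)
    sq_alone = sq_blend = 0.0
    for _ in range(trials):
        e1 = rng.gauss(0.0, sd1)             # my error on this trial
        e2 = rng.gauss(0.0, sd2)             # my colleague's error on this trial
        sq_alone += e2 ** 2
        sq_blend += (k * e1 + (1 - k) * e2) ** 2
    return math.sqrt(sq_alone / trials), math.sqrt(sq_blend / trials)


k, blended, blended_sd = blend_estimates(54, 5.5, 64, 2.3)
print(f"k = {k:.3f}, blended guess = {blended:.1f}, sd = {blended_sd:.2f}")
# prints: k = 0.149, blended guess = 62.5, sd = 2.12

rms_alone, rms_blend = simulate_rms_errors(5.5, 2.3)
print(f"colleague alone: {rms_alone:.2f}, blended: {rms_blend:.2f}")
```

The blended standard deviation of about 2.1 years is a little better than my colleague’s 2.3 on her own, which is the sense in which my noisier guess still adds information.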

Related Articles

A little while ago, I wrote about estimating the number of people running in a marathon. Another famous, similar problem is the estimation, during the Second World War, of the number of German tanks produced by examining the serial numbers of captured and destroyed tanks (also in that link).

Here's an interesting sampling problem about how to estimate the number of unseen errors in a document after it has been reviewed by two independent people.



© 2009-2016 DataGenetics    Privacy Policy