Mathematicians/statisticians needed!
Feb. 20th, 2004 04:34 pm
I've been pondering something to do with my article/thesis, and realising I may have done something slightly wrong.
I am running Poisson statistical models, though that isn't important in terms of the method.
Basically I'm producing a statistical model using certain years of data, and using that model to predict the following year.
e.g.
Produce the model
# tropical cyclones = a + b*temperature (a, b are regression coefficients)
using 1960-1992 data, and want to predict the #tcs in 1993.
Because I'm actually using several more predictors than that, and they are sometimes different orders of magnitude, I standardised the predictor data so that each predictor gets equal weighting.
For each predictor, this was done by subtracting its mean over the 1960-1992 data and dividing by its standard deviation over the same period.
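The standardisation step described above can be sketched in a few lines of Python. This is only an illustration: the temperature values are made up, and the names `standardise`, `train_temps`, `train_mean` and `train_std` are hypothetical, not taken from the actual analysis.

```python
from statistics import mean, stdev


def standardise(values, centre, scale):
    """Subtract centre from every value and divide the result by scale."""
    return [(v - centre) / scale for v in values]


# Made-up temperatures standing in for one predictor over 1960-1992.
train_temps = [26.1, 26.4, 25.9, 26.8, 26.3, 26.0, 26.6]

# Mean and sample standard deviation (n - 1 denominator) of the fitting period.
train_mean = mean(train_temps)
train_std = stdev(train_temps)

# Standardised predictor: mean 0 and standard deviation 1 over the fitting period.
train_z = standardise(train_temps, train_mean, train_std)
```

Repeating this for each predictor puts them all on the same scale, whatever their original units.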
Where I'm confused is how I should then standardise the 1993 predictor data:
Do I subtract the overall mean and divide by the overall std from the 1960-1992 data?
Or do I calculate the mean and std from the 1993 data?
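To make the two options concrete, here is a rough sketch with made-up numbers. It assumes, purely for illustration, that there are several temperature values within 1993 (e.g. a few months); the names `train_temps`, `temps_1993`, `option1` and `option2` are hypothetical.

```python
from statistics import mean, stdev

# Made-up numbers: stand-ins for the 1960-1992 record and for 1993.
train_temps = [26.1, 26.4, 25.9, 26.8, 26.3, 26.0, 26.6]
temps_1993 = [27.0, 27.2, 26.9]  # hypothetical: several values within 1993

train_mean, train_std = mean(train_temps), stdev(train_temps)

# Option 1: reuse the 1960-1992 mean and standard deviation.
option1 = [(t - train_mean) / train_std for t in temps_1993]

# Option 2: use the mean and standard deviation of the 1993 values themselves.
own_mean, own_std = mean(temps_1993), stdev(temps_1993)
option2 = [(t - own_mean) / own_std for t in temps_1993]

print("option 1:", [round(z, 2) for z in option1])
print("option 2:", [round(z, 2) for z in option2])
```

In this sketch, option 1 expresses 1993 in the same units as the data the coefficients a and b were fitted to, so a warm 1993 shows up as positive values. Option 2 averages to zero by construction, however warm 1993 was, and with only a single value for the year its standard deviation cannot be computed at all.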
I'm leaning towards the first way now, whereas I was doing a variation on the second. I don't think it will make a large difference to the final forecasts, just because of the size of the numbers, but I need to get this clear in my head.
Can anyone help?
So wish I'd done more stats than just the first year course. It's been a nightmare having to teach myself.