Hello folks,
I am reviewing the way the fitter treats the MC statistics because I am
afraid part of our current headaches with scans might be due to this.
I still have a doubt and I would appreciate advice on the correct thing to
do.

Currently we fit the Mx shape in data with three MC shapes: the Vub, Vcb
and 'other' components. The model (i.e. the sum of the three MC shapes) is
assumed to have no error, i.e. the MC statistical uncertainty is completely
neglected when fitting for the central value (it is propagated later on).
We might be sensitive to this if the three MC samples have significantly
different statistics.
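To make the difference concrete, here is a minimal sketch of a binned
chi-square with and without the MC term in the denominator. It assumes a
chi-square fit and unweighted MC events (so a bin with N MC events scaled
by s has variance s^2 * N); the actual fit statistic in VubAnalysis may
differ, and the function and argument names are hypothetical.

```python
def chi2(data, templates, scales, include_mc_error=True):
    """Binned chi-square of data against a sum of scaled MC templates.

    data      : list of data counts per bin (must be > 0 in this sketch)
    templates : one list of raw MC counts per component
    scales    : one scaling factor per component
    """
    total = 0.0
    for i, d in enumerate(data):
        # Model prediction in bin i: sum of the scaled MC components.
        model = sum(s * t[i] for s, t in zip(scales, templates))
        # Poisson variance of the data count.
        var = d
        if include_mc_error:
            # Assumption: unweighted MC, so Var(s * N_mc) = s^2 * N_mc.
            var += sum(s * s * t[i] for s, t in zip(scales, templates))
        total += (d - model) ** 2 / var
    return total
```

With the MC term switched on, bins where a low-statistics MC sample
dominates get a smaller weight in the fit, which is where the central
value can move.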

I am now implementing the option of including the error on the model in
the fit. This is already in VubAnalysis, apart from one test which is
still failing, but I am looking into that.

My question to you concerns the correct way of calculating the error on
the Vub yield from MC statistics.
Right now we take the first bin (i.e. Mx < mxcut) of the Vub MC, scale it
by the fitted scaling factor, and call this the 'Vub yield'.
The error on it is computed as the product of the Vub MC content in the
first bin and the uncertainty on the scaling factor. But since the MC
statistics are limited, in my new implementation this scaling factor has
an error that also depends on the Vub MC statistics in the first bin.
This leads to the paradox that if the Vub signal were only in the first
bin and the background were everywhere BUT in the first bin, then the
uncertainty on the Vub yield would be the sum in quadrature of the error
on the data yield and the error from the available Vub MC statistics. This
happens despite the fact that we are not using the Vub MC information at
all, i.e. in this case I believe the MC statistics error should be 0,
while it would come out quite large.
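A small numerical sketch of this limiting case, assuming the scaling
factor is simply s = D1/M1 (D1 = data events and M1 = Vub MC events in the
first bin) with Poisson errors on both; the function names are
hypothetical and this is not the fitter code.

```python
import math

def vub_yield_current(d1, m1):
    """Yield and error as computed now: (MC content) x (scaling factor)."""
    s = d1 / m1
    # Relative error on s from data and MC Poisson fluctuations (assumed).
    sigma_s = s * math.sqrt(1.0 / d1 + 1.0 / m1)
    return s * m1, m1 * sigma_s

def vub_yield_expected(d1):
    """In this limit the yield is just the data count in the first bin."""
    return d1, math.sqrt(d1)

if __name__ == "__main__":
    print(vub_yield_current(400.0, 1600.0))
    print(vub_yield_expected(400.0))
```

For D1 = 400 and M1 = 1600 the yield is 400 in both cases, but the current
prescription gives an error of sqrt(400 + 400^2/1600) = sqrt(500), about
22.4, instead of sqrt(400) = 20.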

Since we introduced the Vub MC component only to account properly for
signal leakage above the Mx cut, I suggest we compute the Vub yield as
   data - Vcb component - other component
with the error propagated accordingly (the error coming from 'data'
would be purely statistical, and the error on the other two pieces would
have to be split between MC statistics and data statistics).
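A minimal sketch of what this subtraction could look like in the first
bin. It assumes unweighted MC, takes the errors on the Vcb and 'other'
scaling factors as coming from data statistics only, and neglects any
correlation between those factors and the first-bin data count; all names
are hypothetical.

```python
import math

def vub_yield_by_subtraction(d1, vcb1, oth1, s_vcb, s_oth,
                             sig_s_vcb, sig_s_oth, cov=0.0):
    """Vub yield in the first bin as data minus scaled backgrounds.

    d1, vcb1, oth1       : first-bin counts in data, Vcb MC, 'other' MC
    s_vcb, s_oth         : fitted scaling factors
    sig_s_vcb, sig_s_oth : their errors from data statistics (assumed)
    cov                  : covariance between the two scaling factors
    Returns (yield, data-statistics error, MC-statistics error).
    """
    n_vub = d1 - s_vcb * vcb1 - s_oth * oth1
    # Data-statistics part: Poisson error on d1 plus the fit errors on
    # the scaling factors, propagated through the MC bin contents.
    var_data = (d1 + (vcb1 * sig_s_vcb) ** 2 + (oth1 * sig_s_oth) ** 2
                + 2.0 * vcb1 * oth1 * cov)
    # MC-statistics part: Var(s * N_mc) = s^2 * N_mc for unweighted MC.
    var_mc = s_vcb ** 2 * vcb1 + s_oth ** 2 * oth1
    return n_vub, math.sqrt(var_data), math.sqrt(var_mc)
```

With no background in the first bin (vcb1 = oth1 = 0) the MC statistics
error is exactly 0 and the total error reduces to sqrt(d1), which is the
behaviour argued for above.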

  does this sound reasonable to you?
	ciao
	ric