I listened to the latest podcast and I applaud the desire to bring some rigor to the testing of rifle accuracy. I have a comment / suggestion about the “group size”, and that is to forget it altogether. Group size is irrelevant; it was just a way to make a manual process of evaluating accuracy tractable for the average human. But since you are using a computer to calculate confidence levels, it's totally unnecessary and undesirable. Simply collect the data on 16 or 25 or 50 individual shots and compute a 1 MOA confidence level, a 1.5 MOA confidence level, and so on. Group size has no meaning, as it is being used as a proxy for barrel heat.
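As a rough sketch of what that per-shot calculation could look like: the Python below takes individual shot coordinates and reports what share landed inside a circle of a given diameter around the mean point of impact, plus an approximate 95% interval on that share. This is only one possible reading of a "1 MOA confidence level"; the function names, the coordinates in MOA, and the choice of a Wilson score interval are all assumptions for illustration, not what the podcast uses.

```python
import math

def hit_fraction(shots, diameter_moa):
    """Count shots inside a circle of the given diameter (in MOA)
    centred on the mean point of impact. Returns (hits, total)."""
    n = len(shots)
    cx = sum(x for x, _ in shots) / n
    cy = sum(y for _, y in shots) / n
    radius = diameter_moa / 2.0
    hits = sum(1 for x, y in shots if math.hypot(x - cx, y - cy) <= radius)
    return hits, n

def wilson_interval(hits, n, z=1.96):
    """Approximate 95% Wilson score interval for the true hit fraction."""
    p = hits / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Hypothetical shot coordinates in MOA (made-up numbers, not real data).
example_shots = [(0.0, 0.0), (0.2, 0.1), (-0.2, -0.1), (1.0, 1.0), (-1.0, -1.0)]
hits, total = hit_fraction(example_shots, diameter_moa=1.0)
low, high = wilson_interval(hits, total)
print(f"{hits}/{total} shots inside 1 MOA, 95% interval {low:.2f} to {high:.2f}")
```

With only a handful of shots the interval comes out very wide, which is the whole argument for recording more individual shots.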
That being said, you need to establish a time interval between each shot to avoid barrel overheating as ideally you want the barrel in a heated but quiescent state. Additionally, you could tape a thermocouple to the barrel and only shoot when the temperature drops below some established reasonable constant.
This topic is fascinating in my opinion. Load development is another area of interesting discussion. Ladder testing and looking for stable nodes is more witchcraft and wishful thinking than science. Most load development is no better than just a random pick. Hornady has an interesting podcast series on this, starting with this one: Ep. 050 - Your Groups Are Too Small | SAMPLE SIZE (YouTube)
It is science, Rhombo. It is as different as using iron sights. Your statistics change if you are reloading ammo with low S.D.s. There are many things that affect a bullet in transit from the chamber. Reduce those effects in every way you can and you reduce the spread in your statistics.
It's only science if you use a statistically significant sample size for each independent variable, which nobody does. Well, almost nobody. You need 35ish samples just to be reasonably accurate.
Yeah, they really are. I'm getting started in NRL22, so I'm going to be working this problem shortly. I have a good scientific background, so I'm excited for the challenge.
If, for example, you shoot a 3 shot group as a test to measure the accuracy of the rifle, then the true accuracy of the rifle could be as much as 70% greater or lower than your 3 shot test. If you shoot a 30 shot group, then your true accuracy will be within 10-15% of that 30 shot group. The larger the number of samples, the more certain you can be of your rifle's accuracy. You can never know exactly; you can only refine your uncertainty.
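If you want to see this effect for yourself, here is a minimal simulation sketch in Python. It assumes shots scatter as an independent circular normal distribution around the point of aim (a simplification), fires many simulated groups from the same imaginary rifle, and reports how much the measured group size (extreme spread) bounces around from group to group. The function names and the sigma value are made up for illustration, and the output is not meant to reproduce the exact percentages quoted above.

```python
import math
import random
import statistics
from itertools import combinations

def extreme_spread(shots):
    """Largest centre-to-centre distance between any two shots in a group."""
    return max(math.dist(a, b) for a, b in combinations(shots, 2))

def simulate_groups(shots_per_group, n_groups=1000, sigma=1.0, seed=1):
    """Fire n_groups simulated groups from one rifle and return their sizes.
    Assumes independent, circular-normal shot dispersion with spread sigma."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_groups):
        shots = [(rng.gauss(0, sigma), rng.gauss(0, sigma))
                 for _ in range(shots_per_group)]
        sizes.append(extreme_spread(shots))
    return sizes

def relative_spread(sizes):
    """Standard deviation of group size divided by its mean."""
    return statistics.stdev(sizes) / statistics.mean(sizes)

for n in (3, 30):
    cv = relative_spread(simulate_groups(n))
    print(f"{n}-shot groups: size varies by roughly {cv:.0%} (1 SD) between groups")
```

The rifle in the simulation never changes, so all of the variation in group size is pure sampling noise. The 3 shot groups should show noticeably more of it than the 30 shot groups.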
The way to say it is that groups with fewer shots have more variance (uncertainty) than groups with more shots.
Accuracy is ever changing! Even if you shoot a larger group of 30 shots or more, no two bullets are ever going to leave the barrel the same. Throat and barrel erosion affect accuracy with every shot. The best barrels burn out within 7000 shots or less, depending upon velocity, pressures, and barrel maintenance. To maintain accuracy within those 7000 or so shots, you will have to adjust for the loss of velocity and higher pressures that will ultimately come with degradation of the barrel. Without the adjustment, the group size will just increase steadily. Impossible to maintain, but fun to chase.
Ahh, now you're getting on to it. I’ve spent a lot of time in my professional career trying to measure changes in complex systems involving people, and it can be frustrating. There always seems to be one more variable to consider. I watched the video Rhombo linked to and it was quite interesting. Anyway, I’ve wondered about sample size (statistical power) for a while with 5 shot groups. To put it another way, with a small sample size the possibility is greater that the outcome was due to chance and not your test variable. As you noted, the gun does change over time with use, so that might need to be considered, but I’m guessing that in 35 or even 100 shots, the effect size will be smaller.
Statistical analysis is reality. If you buy a “better” gun, scope or ammo and it doesn't show in the statistics, then the imaginary part is the word “better”.
I would say at a broad level there are clearly differences in accuracy between production guns. A poor-quality gun will introduce more variation. In other words, bigger groups. Most of us could probably detect that level of difference by shooting the guns enough. Still, that is on a broad level. I think the original post was more about trying to detect the difference statistically using 5 shot groups in something that has a small effect size, say the difference between two good guns or the difference with a 0.2 grain increment in powder. Statistical significance just means that the difference was not likely due to random chance. Detecting statistical significance in those small differences requires a large number of shots.
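To put a rough number on the small effect size problem, here is a simulation sketch in Python. It is not a formal significance test. It just asks how often the genuinely more precise of two rifles actually comes out ahead in a head-to-head comparison, for a given number of shots per rifle. The 10% difference in dispersion, the circular-normal shot model, the use of mean radius as the score, and all the names are assumptions picked for illustration.

```python
import math
import random

def mean_radius(rng, n_shots, sigma):
    """Average distance of each shot from the group's own centre."""
    shots = [(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(n_shots)]
    cx = sum(x for x, _ in shots) / n_shots
    cy = sum(y for _, y in shots) / n_shots
    return sum(math.hypot(x - cx, y - cy) for x, y in shots) / n_shots

def better_rifle_win_rate(n_shots, sigma_good=1.0, sigma_worse=1.1,
                          trials=2000, seed=1):
    """Fraction of simulated comparisons in which the truly better rifle
    (smaller sigma) also shows the smaller measured mean radius."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        good = mean_radius(rng, n_shots, sigma_good)
        worse = mean_radius(rng, n_shots, sigma_worse)
        if good < worse:
            wins += 1
    return wins / trials

for n in (5, 35, 100):
    rate = better_rifle_win_rate(n)
    print(f"{n} shots per rifle: better rifle looks better in {rate:.0%} of tests")
```

A win rate near 50% means the comparison is little better than a coin flip. Under these assumptions the rate should climb as the shot count goes up, which is the same point about needing a large number of shots to pick out a small difference.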