

You can scientifically prove that one code snippet has fewer bugs than another, though, and there are already mountains of evidence of static typing making code significantly less buggy on average.
Do you mean memory safety here? Because yes, for memory safety this is proven. For example, there are reports from Google that widespread use of memory-safe languages for new code reduces the number of bugs.
You can’t scientifically prove that something is “better” than another thing because that’s not a measurable metric.
Then, first: why don't the claims for statically compiled languages come with measurable, objective benefits? If these languages are really significantly better, it should be easy to come up with such measures.
And the second thing: we have at least one large-scale experiment, because Google introduced Go and used it widely within its own company to replace Python.
Now, it is clear that Go programs run faster than Python programs, no question.
But did this lead to productivity increases or better code because Go is a strongly, statically typed language? I have seen no such report, despite Google now having 16 years of experience with it.
(And just for fun: Python itself is memory safe, and concurrency bugs in Python code can't lead to undefined behaviour the way they can in C. Go is neither memory safe nor does it have that level of concurrency safety: if you concurrently modify a hash table from two different threads, this crashes the whole program.)
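A minimal Go sketch of that last point. The Go runtime usually detects unsynchronized map writes and aborts the entire process with `fatal error: concurrent map writes` (a fatal error, not a recoverable panic):

```go
package main

import "sync"

func main() {
	m := make(map[int]int)
	var wg sync.WaitGroup

	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			// Two goroutines write to the same map without any locking.
			// The runtime's lightweight map-race detection usually catches
			// this and kills the process with
			// "fatal error: concurrent map writes".
			for j := 0; j < 1000; j++ {
				m[j] = n
			}
		}(i)
	}
	wg.Wait()
}
```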
I was not referring to slices, or to array size checks at runtime.
The array size can be part of the static type, that is, one that can be checked at compile time. Rust, Pascal, and Ada have such arrays.
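Go has such arrays too, by the way: the length is part of the array's type. A minimal sketch where passing an array of the wrong size is rejected at compile time rather than checked at runtime:

```go
package main

import "fmt"

// The length is part of the array type: [3]int and [4]int are distinct
// types, so the size check happens at compile time, not at runtime.
func sum3(a [3]int) int {
	return a[0] + a[1] + a[2]
}

func main() {
	a := [3]int{1, 2, 3}
	fmt.Println(sum3(a)) // prints 6

	// b := [4]int{1, 2, 3, 4}
	// fmt.Println(sum3(b)) // does not compile:
	// cannot use b (variable of type [4]int) as [3]int value
}
```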
But if static typing is always better, why are such arrays so rarely used?