I guess nobody ever told Bethesda that you can use both collision meshes and per-pixel hit detection... technology that was used in games 15 years ago.
How exactly would you do "per-pixel hit detection" in a 3D game in which the smallest unit of movement is something along the lines of 1 * 10^-63, and where even small enclosed areas probably reach 100 units in all directions, minimum, with the boundaries being all over the place?
The magic word here is "raycasting". A mind-bending, unbelievable method used in every single 3D game since the first Wolfenstein.
Now the best part. It's not like Bethesda didn't use it; it's there. The problem is that VATS and real-time shooting are two completely separate modes, each with its own method for detecting intersections, and the intersections for real time are only tested after the bullet is shot...
A perfect real-world analogy: two morons working together on the same thing, neither knowing what the other one is doing. It's a deliberate example; that's certainly what an average day at Bethesda looks like.
I'm well aware of it; it's one of the two main methods used for collision detection, after all.
First off, we've got raytracing, in which we shoot out a ray to find a point of collision and, well, collide; to add some extra precision, we shoot out more rays to test for hits.
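To make that concrete, here's a minimal sketch of casting a single ray against an axis-aligned bounding box using the classic "slab" method (all names here are hypothetical, not from any particular engine):

```python
# Sketch: intersect a ray with an axis-aligned bounding box (slab method).

def ray_vs_aabb(origin, direction, box_min, box_max):
    """Return the distance t along the ray to the first hit, or None."""
    t_near, t_far = float("-inf"), float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if abs(d) < 1e-12:                 # ray parallel to this slab pair
            if o < lo or o > hi:
                return None                # ...and outside it: no hit possible
            continue
        t1, t2 = (lo - o) / d, (hi - o) / d
        if t1 > t2:
            t1, t2 = t2, t1
        t_near, t_far = max(t_near, t1), min(t_far, t2)
        if t_near > t_far:
            return None                    # slab intervals don't overlap: miss
    return t_near if t_far >= 0 else None  # a hit behind the origin doesn't count

# A bullet fired from the origin along +x at a crate spanning x in [5, 6]:
hit = ray_vs_aabb((0, 0, 0), (1, 0, 0), (5, -1, -1), (6, 1, 1))
# hit == 5.0: the bullet strikes the near face of the crate
```

The "extra rays for precision" part is then just calling this with several offset origins (e.g. across the bullet's cross-section) and taking the nearest hit.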
Secondly, we just move the entity in a step and see if it ends up inside anything; if it does, we typically try progressively smaller steps until we've moved somewhere we're not inside something.
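A toy 1D sketch of that step-and-halve idea (names and the hard-coded wall are made up for illustration): move the entity, and if it penetrates a solid, retry with half the step until it lands in free space.

```python
# Sketch: move an entity by a step; on penetration, halve the step and retry.

def overlaps(pos, solid_min, solid_max):
    """True if pos is strictly inside the solid interval."""
    return solid_min < pos < solid_max

def move_with_backtrack(pos, step, solid=(10.0, 12.0), max_halvings=8):
    for _ in range(max_halvings):
        candidate = pos + step
        if not overlaps(candidate, *solid):
            return candidate       # landed in free space: accept the move
        step *= 0.5                # inside the wall: try half the step
    return pos                     # give up, stay put this frame

# Walking from x=9 toward a wall spanning [10, 12] with a 2-unit step:
print(move_with_backtrack(9.0, 2.0))   # 10.0: halved once, stopped at the face
```

Note this only checks where the entity *ends up*, not the path it took to get there, which is exactly what sets up the tunneling problem discussed below.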
Both of these have their own pros and cons. Raytracing, for example, works great for really small objects (which, using the other method, would move in steps far exceeding their own size), especially ones which are moving really fast, since you can cheaply check a vector (or several) for collision points and update accordingly.
On the other hand, raytracing is really bad for big objects, as the number of rays you'll need to shoot out to check for collisions with adequate precision becomes rather prohibitive; you never know what kind of weird stuff you'll have to test against.
The other method moves things in deltas at a fixed rate per time-slice, and you can increase the number of steps to accommodate smaller objects; this is what's typically used. Basically, the entity is moved, and then the engine checks to see if it'll end up inside anything; if so, it responds in some way. Some engines try progressively smaller steps, others attempt to apply resistance/inertia/momentum to other things.
This is great for big objects, as it allows them to be moved far more cheaply; for small objects it's really bad, and it will always end up in one of two ways: 1) the object moves so fast that it tests on the other side of the obstacle and therefore completely avoids said obstacle, or 2) the movement resolution is made so fine that it becomes prohibitively expensive to do, just like raytracing is with large objects.
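Failure mode 1 (tunneling) is easy to demonstrate with a toy 1D sketch (hypothetical names, made-up numbers): a fast projectile stepped at a fixed rate jumps clean over a thin wall, and catching it purely by raising the step count multiplies the cost.

```python
# Sketch: a fast projectile stepped at fixed resolution past a thin wall.

def stepped_hit(start, velocity, steps, wall=(10.0, 10.2)):
    """Advance the projectile in fixed sub-steps; report whether any
    intermediate position lands inside the wall."""
    pos = start
    for _ in range(steps):
        pos += velocity / steps          # fixed movement resolution
        if wall[0] <= pos <= wall[1]:
            return True                  # only hits if a step lands inside
    return False

# One 50-unit step per frame jumps straight over a 0.2-unit-thick wall:
print(stepped_hit(0.0, 50.0, steps=1))     # False: tunneled through
# Cranking the resolution up catches it, but at 1000x the collision checks:
print(stepped_hit(0.0, 50.0, steps=1000))  # True
```

A single raycast along the 50-unit movement vector would have caught this wall at constant cost, which is why the two methods complement each other rather than compete.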
Both of these can technically be made (much) cheaper and more accurate via the use of collision meshes, but that brings us right back to the original issue: things with bounding boxes far exceeding the object itself.
tl;dr, raytracing is not the silver bullet you make it out to be.