surface: cache frame callback lists again
Caching frame callback lists is the correct behavior: if a surface is locked, e.g. because of subsurface synchronization, clients expect to receive frame done events only after the pending state is actually committed.
If surface locking is going to be used for atomic layout updates by compositors, we would probably need to distinguish between locks required by protocols and locks made by compositors.
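
The caching behavior described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: every name below (toy_surface, toy_state, toy_surface_commit, and so on) is hypothetical and is not the wlroots API, and frame callbacks are reduced to plain integer ids. It shows callbacks travelling with the state they were requested with, so a locked surface does not fire them early.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all names are hypothetical, not the wlroots API. */
#define TOY_MAX_CALLBACKS 8

struct toy_state {
	int callbacks[TOY_MAX_CALLBACKS]; /* ids of frame callbacks attached to this state */
	size_t n_callbacks;
};

struct toy_surface {
	struct toy_state pending; /* accumulated until the client commits */
	struct toy_state cached;  /* committed while the surface was locked */
	struct toy_state current; /* state the compositor actually presents */
	int locks;                /* number of outstanding locks */
};

/* Append every callback in src to dst and leave src empty. */
static void toy_state_move_callbacks(struct toy_state *dst, struct toy_state *src) {
	for (size_t i = 0; i < src->n_callbacks &&
			dst->n_callbacks < TOY_MAX_CALLBACKS; i++) {
		dst->callbacks[dst->n_callbacks++] = src->callbacks[i];
	}
	src->n_callbacks = 0;
}

/* Client requests a frame callback; it joins the pending state. */
static void toy_surface_frame(struct toy_surface *s, int id) {
	if (s->pending.n_callbacks < TOY_MAX_CALLBACKS) {
		s->pending.callbacks[s->pending.n_callbacks++] = id;
	}
}

/* Client commits: a locked surface caches the callback list with the
 * rest of the state instead of exposing it through the current state. */
static void toy_surface_commit(struct toy_surface *s) {
	if (s->locks > 0) {
		toy_state_move_callbacks(&s->cached, &s->pending);
	} else {
		toy_state_move_callbacks(&s->current, &s->pending);
	}
}

/* Dropping the last lock applies the cached state, callbacks included. */
static void toy_surface_unlock(struct toy_surface *s) {
	if (s->locks > 0 && --s->locks == 0) {
		toy_state_move_callbacks(&s->current, &s->cached);
	}
}

/* Fire frame done for the current state; returns how many were sent. */
static size_t toy_surface_send_frame_done(struct toy_surface *s) {
	size_t sent = s->current.n_callbacks;
	s->current.n_callbacks = 0;
	return sent;
}
```

In this model, a commit made while locks is non-zero leaves toy_surface_send_frame_done with nothing to send until toy_surface_unlock drops the last lock. The single lock counter does not distinguish protocol-required locks from compositor-made ones.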