I have been having this problem since my I2 was new (October 2017).
It isn't as bad as you demonstrate, but it is as if the aircraft doesn't fully realise it isn't level (which it isn't when hovering static in any real wind) and the gimbal doesn't apply enough compensation for the fixed angle offset... (OK, it doesn't fully realise anything; hey, it is a dumb robot.)
I deal with it by keeping the roll control active on the screen (too bad I haven't found a way to map it to one of the control sticks) and periodically correcting the "horizon"/gimbal roll error manually...
As a control engineer, I see this as one demonstration of the value in designing the mechanics of a stabilisation platform so that the (manual) axis of interest (yaw) is the last in the control chain. That way, when roll and pitch are stabilised, yaw is decoupled in the control algorithm, eliminating the need for real-time correction (which introduces artefacts at a "not insignificant" timescale in a lot of cases). In the X5S, roll is the second (of 3) "leg" of the actuation chain, with yaw as the initial leg, meaning that the system could have been designed worse.
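To put rough numbers on that coupling, here is a minimal sketch in Python. It is a pure rotation-matrix model that I am assuming for illustration, not DJI's actual control law: the aircraft leans a steady 5 degrees into the wind, the gimbal applies one fixed roll correction, and the pitch leg is left out. The function names and the 5 degree figure are made up for the example.

```python
import math

def rot_x(deg):
    """Rotation about the forward (roll) axis."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_z(deg):
    """Rotation about the vertical (yaw) axis."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def horizon_roll_deg(r):
    """Camera roll about its own forward axis, i.e. the horizon tilt in the image."""
    return math.degrees(math.atan2(r[2][1], r[2][2]))

def yaw_first(tilt, yaw, roll_fix):
    # Chain: aircraft tilt -> yaw motor -> roll motor (yaw is the first leg).
    return matmul(matmul(rot_x(tilt), rot_z(yaw)), rot_x(roll_fix))

def yaw_last(tilt, yaw, roll_fix):
    # Chain: aircraft tilt -> roll motor -> yaw motor (yaw is the last leg).
    return matmul(matmul(rot_x(tilt), rot_x(roll_fix)), rot_z(yaw))

TILT = 5.0  # assumed steady lean into the wind, in degrees

for yaw in (0, 90, 180):
    # The same fixed roll correction (-TILT) is used at every yaw angle.
    first = horizon_roll_deg(yaw_first(TILT, yaw, -TILT))
    last = horizon_roll_deg(yaw_last(TILT, yaw, -TILT))
    print(f"yaw {yaw:3d}: yaw-first error {first:+.1f}, yaw-last error {last:+.1f}")
```

In this model the yaw-first chain is level at 0 degrees of yaw, but the horizon is about 5 degrees off at 90 and about 10 degrees off at 180, so the roll correction has to be recomputed whenever yaw changes (and at 90 degrees the aircraft's lean also shows up as a pitch error). With yaw as the last leg, the one fixed correction keeps the horizon level at every yaw angle.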
Of course, if we were using a state control scheme with all states immediately visible to the controller (not estimated with induced delays), everything would be (as one of my "professors" would often say) trivial.
Another concept is that in the design of such platforms (any remote sensor system), 5 (or 6) degrees of freedom for the gimbal allow the aircraft to autonomously handle roll and pitch correction/stabilisation, while the camera operator (second person, or pilot as the case may be) has the freedom to actively program or directly manipulate 3 degrees of freedom without interference / cross coupling. Then the single yaw control (or concentric yaw rings, one for the craft to stabilise and the second for "framing control") can be number 1 in the camera operator's control chain, followed by camera pitch and roll (Z, Y, X). That allows it to be slaved to a compass heading or a point of interest, or to be manually controlled, without affecting (or being affected by) the operation of the craft. Having 2 sets of gimbal control allows one to be a high-rate, low-amplitude control system (say +/- 30 degrees would work well) and the second to be operated at a human rate... All that is merely to get the framing right; let's hope we got the camera parameters sorted out before hitting that shutter button.
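A rough one-axis sketch of that two-stage split, in Python. This is an idealised static model that I am assuming for illustration (no dynamics, no sensor delay); the function name and the way the travel limit is applied are hypothetical, and only the +/- 30 degree figure comes from the paragraph above.

```python
FAST_LIMIT_DEG = 30.0  # assumed travel of the fast stage ("+/- 30 degrees")

def two_stage_axis(aircraft_tilt_deg, operator_deg):
    """One axis of a hypothetical two-stage gimbal.

    Returns (fast_stage, slow_stage, camera) angles in degrees.
    """
    # Fast stage: cancel the aircraft's tilt, but only within its limited travel.
    fast = max(-FAST_LIMIT_DEG, min(FAST_LIMIT_DEG, -aircraft_tilt_deg))
    # Slow stage: carries the operator's framing command untouched.
    slow = operator_deg
    # Resulting camera angle is the sum of everything along the chain.
    camera = aircraft_tilt_deg + fast + slow
    return fast, slow, camera

print(two_stage_axis(12.0, 20.0))  # tilt within the fast stage's travel
print(two_stage_axis(40.0, 20.0))  # tilt beyond the fast stage's travel
```

While the aircraft's tilt stays inside the fast stage's travel, the camera ends up exactly where the operator asked, whatever the craft is doing. Once the tilt exceeds that travel, the leftover shows up as a framing error, which is the price of keeping the fast stage low-amplitude.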
Time to get building.