[Open problem] Equivalence between two notions of `special case` for bouncers

In the context of bouncers (see the bouncers write-up: here) we define that a formula tape f' is a special case of formula tape f if if f' can be obtained from f by replacing subwords of the form (w) by w^n(w)w^m for some n,m\geq 0 and w\in\Sigma^+ (1).

However, a more general and abstract definition of special case for regular languages is: \mathcal{L}(f') \subseteq \mathcal{L}(f) (2).

Clearly, (1) \Rightarrow (2). We believe that (2) \Rightarrow (1) with the additional assumptions (that come for free with bouncers) that the formula tapes are aligned and contain the same number of repeaters and that |r_i| = |r'_i|, i.e. corresponding repeaters in each formula tape are of same size.

Here is a proof given by @savask for the case of 1 repeater:

Nonetheless, (2) \Rightarrow (1) remains open in the general case (arbitrary number of repeaters) and it would be cool to have a proof!

Here’s a dumb counterexample (implying further assumptions are needed): f = (1)(11), f'=(1)(11)1.
To expand on that: When two adjacent repeaters are powers of a common word, their order does not affect the language. So we can stick a power of the left repeater on the right side, and from the language viewpoint we get a special case, but from the formula viewpoint perhaps not.

If we require nonempty intermediate walls, this starts to look a lot like the conditions under which greedy formula-tape fitting is valid.
It may also be enough to forbid adjacent commuting repeaters. (I’m pretty sure a proof we’ve already done implies that commuting repeaters are those of the form (r^a)(r^b).) In the bouncer context this is justifiable; any formula tape solving a bouncer can be simplified to avoid them.