Since (load, load) is one of the most common instruction pairs on SunSpider and load has one of the smallest instruction bodies, this sounds like a good idea. However, all of my attempts to implement it have led to massive regressions on SunSpider, even when the new opcode is never used. This points to a random GCC problem, so I'll post the code to see if we can fix it together.
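For anyone following along, here is a minimal sketch of the idea on a toy register machine. It is not the JavaScriptCore code or the attached patch. The opcode encoding, the names run and fuseLoads, and the dispatch counter are all assumptions made for illustration, and the toy has no jumps, so it sidesteps the question of a jump targeting the second load of a pair.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical opcode set for a toy register machine (not JavaScriptCore's).
// Assumed encoding:
//   op_load  dst, k
//   op_load2 dst1, k1, dst2, k2
//   op_end
enum Opcode : int { op_load, op_load2, op_end };

// Interprets `code`, copying constants into registers.
// Returns the number of opcode dispatches performed, op_end included.
std::size_t run(const std::vector<int>& code,
                const std::vector<int>& constants,
                std::vector<int>& registers)
{
    std::size_t dispatches = 0;
    std::size_t pc = 0;
    for (;;) {
        assert(pc < code.size());
        ++dispatches;
        switch (code[pc]) {
        case op_load:
            registers[code[pc + 1]] = constants[code[pc + 2]];
            pc += 3;
            break;
        case op_load2:
            // Two loads for the price of one dispatch.
            registers[code[pc + 1]] = constants[code[pc + 2]];
            registers[code[pc + 3]] = constants[code[pc + 4]];
            pc += 5;
            break;
        default: // op_end
            return dispatches;
        }
    }
}

// Peephole pass: rewrites each adjacent (op_load, op_load) pair as op_load2.
// Only valid here because the toy bytecode has no jump targets.
std::vector<int> fuseLoads(const std::vector<int>& code)
{
    std::vector<int> out;
    std::size_t pc = 0;
    while (pc < code.size()) {
        if (code[pc] == op_load && pc + 3 < code.size() && code[pc + 3] == op_load) {
            out.push_back(op_load2);
            out.push_back(code[pc + 1]);
            out.push_back(code[pc + 2]);
            out.push_back(code[pc + 4]);
            out.push_back(code[pc + 5]);
            pc += 6;
        } else if (code[pc] == op_load) {
            out.push_back(op_load);
            out.push_back(code[pc + 1]);
            out.push_back(code[pc + 2]);
            pc += 3;
        } else {
            out.push_back(code[pc]); // op_end has no operands
            pc += 1;
        }
    }
    return out;
}
```

The expected win is one fewer dispatch per fused pair. Whether that shows up in practice depends on how the compiler lays out the interpreter loop, which is the problem described above.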
Created attachment 21990 [details]
Patch that causes regression
Here is a patch that causes a performance regression of about 8% on SunSpider, despite the opcode not even being used. I originally placed the opcode right after op_load, but even moving it near the end of the opcode list didn't fix the regression.
The same thing happens with op_mov2.
Created attachment 22001 [details]
Patch that improves performance
I changed the code to work around the GCC problem and now get a 0.2% progression. Maybe I could do better if I moved the opcode elsewhere?
This optimization was performed by different means in bug 20286.