- 26 Sep, 2011 40 commits
-
-
Zhigang Gong authored
When we need to solid-fill an entire pixmap with a specific color, we do not need to draw it immediately. We can defer the fill until one of the following occasions: 1. The pixmap is used as a source; then we can use a shader instead of a copyarea. 2. The pixmap is used as a target; then we can do the fill just before drawing new pixels onto it. The fill and the drawing have the same target texture, so we save one fbo context switch. For the second case there is room for further optimization: we could fill only the untouched region. With this patch, the rendering time of the cairo trace firefox-planet-gnome decreases from 16 seconds to 14 seconds. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
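The deferral described above can be sketched on the CPU roughly as follows. This is only an illustration under stated assumptions: `struct sketch_pixmap` and every `sketch_*` function are hypothetical names, not glamor's real data structures, and the real patch works on textures and shaders rather than on a pixel array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a pixmap; not glamor's real structure. */
struct sketch_pixmap {
    int width, height;
    uint32_t *pixels;      /* width * height entries */
    bool pending_fill;     /* a whole-pixmap fill has been requested */
    uint32_t fill_color;   /* color of the deferred fill */
    int fill_passes;       /* counts real fills, for illustration only */
};

/* Record the fill instead of drawing it immediately. */
static void sketch_solid_fill(struct sketch_pixmap *p, uint32_t color)
{
    p->pending_fill = true;
    p->fill_color = color;
}

/* Perform the deferred fill, if there is one. */
static void sketch_flush_fill(struct sketch_pixmap *p)
{
    if (!p->pending_fill)
        return;
    for (size_t i = 0; i < (size_t)p->width * (size_t)p->height; i++)
        p->pixels[i] = p->fill_color;
    p->pending_fill = false;
    p->fill_passes++;
}

/* Case 2: the pixmap is a target, so fill just before drawing onto it. */
static void sketch_draw_pixel(struct sketch_pixmap *dst, int x, int y,
                              uint32_t color)
{
    sketch_flush_fill(dst);
    dst->pixels[(size_t)y * dst->width + x] = color;
}

/* Case 1: the pixmap is a source, so the color can be produced directly
 * without ever touching the pixel storage. */
static uint32_t sketch_read_source(const struct sketch_pixmap *src,
                                   int x, int y)
{
    if (src->pending_fill)
        return src->fill_color;
    return src->pixels[(size_t)y * src->width + x];
}
```

Reading a pending-fill pixmap as a source never triggers a fill pass, which loosely corresponds to replacing the copyarea with a shader.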
-
Zhigang Gong authored
Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
We already handle all format checking in pixmap uploading and converting, so we don't need to do it again. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
Now, support dual crtc configuration. Signed-off-by:
Zhigang Gong <zhigang.gong@gmail.com>
-
Zhigang Gong authored
When falling back to the CPU in the polylines procedure, we can download just the required region rather than the whole pixmap. This significantly improves performance when we have to fall back, for example when doing non-solid fills in the game Mines. Signed-off-by:
Zhigang Gong <zhigang.gong@gmail.com>
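One way to find the "required region" for a polyline is a padded bounding box of its points, clamped to the pixmap. The sketch below assumes absolute point coordinates and a caller-supplied padding for the line width; `sketch_point`, `sketch_box` and `sketch_polyline_bounds` are hypothetical names and may not match how the commit computes the region.

```c
/* Hypothetical types for illustration; x2/y2 are exclusive. */
struct sketch_point { int x, y; };
struct sketch_box { int x1, y1, x2, y2; };

/* Bounding box of all points, grown by pad pixels on each side and
 * clamped to a pix_w x pix_h pixmap. Returns an empty box for n <= 0. */
static struct sketch_box
sketch_polyline_bounds(const struct sketch_point *pts, int n, int pad,
                       int pix_w, int pix_h)
{
    struct sketch_box b = {0, 0, 0, 0};

    if (n <= 0)
        return b;

    b.x1 = b.x2 = pts[0].x;
    b.y1 = b.y2 = pts[0].y;
    for (int i = 1; i < n; i++) {
        if (pts[i].x < b.x1) b.x1 = pts[i].x;
        if (pts[i].x > b.x2) b.x2 = pts[i].x;
        if (pts[i].y < b.y1) b.y1 = pts[i].y;
        if (pts[i].y > b.y2) b.y2 = pts[i].y;
    }

    /* Grow by the padding and make the far edges exclusive. */
    b.x1 -= pad;
    b.y1 -= pad;
    b.x2 += pad + 1;
    b.y2 += pad + 1;

    /* Clamp to the pixmap so only valid pixels are downloaded. */
    if (b.x1 < 0) b.x1 = 0;
    if (b.y1 < 0) b.y1 = 0;
    if (b.x2 > pix_w) b.x2 = pix_w;
    if (b.y2 > pix_h) b.y2 = pix_h;
    return b;
}
```

Only the pixels inside the returned box would then need to travel between GPU and CPU for the fallback.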
-
Zhigang Gong authored
This reverts commit eb16fe0b7c8ea27b5cf9122d02e48bf585495228. Because glamor_prepare_access/finish_access currently touch the whole pixmap, not just the requested region, write-only mode does not work correctly. We may need to revisit all fallback cases and convert the image to the right size before doing the prepare/finish processing. Signed-off-by:
Zhigang Gong <zhigang.gong@gmail.com>
-
Zhigang Gong authored
Signed-off-by:
Zhigang Gong <zhigang.gong@gmail.com>
-
Zhigang Gong authored
Because eglTerminate closes the card when the screen is closed, we may need to reopen it the next time a screen resource is created, and thus we also need to reinitialize the drmmode crtc. Otherwise, cursor handling will be broken because it has the wrong fd.
-
Zhigang Gong authored
Some strange web pages have a 20000*1 png picture and actually use only part of it. In that case we convert it to the size actually used rather than its original size, to avoid a later upload failure. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
It returns when the destination pixmap has an fbo but continues when it does not have one. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
If we need only a small part of the source or mask drawable's pixmap, we can convert it to a new, smaller picture before calling the low-level compositing function. Only the smaller picture is then uploaded later. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
glamor_fill is only called from the internal functions glamor_fillspancs and glamor_polyfillrect. Both functions already add the offset to the coords, so the coords are already relative values and we must not add the offset again. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
If the dest pixmap is in texture memory but the source pixmap is not, we need to upload the source pixmap to texture memory. The previous version uploaded the whole source pixmap. This commit preprocesses the source pixmap and reduces it to a smaller temporary pixmap that contains only the required region. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
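Reducing a source to the required region amounts to gathering a sub-rectangle into a tightly packed temporary buffer. The sketch below assumes 32-bit pixels and a source stride equal to its width in pixels; `sketch_extract_region` is a hypothetical helper, not a glamor function.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copy the w x h rectangle at (x, y) out of a src_width-wide image into
 * a newly allocated, tightly packed buffer. The caller frees the result.
 * Returns NULL if the allocation fails. */
static uint32_t *sketch_extract_region(const uint32_t *src, int src_width,
                                       int x, int y, int w, int h)
{
    uint32_t *tmp = malloc((size_t)w * (size_t)h * sizeof *tmp);

    if (tmp == NULL)
        return NULL;

    for (int row = 0; row < h; row++)
        memcpy(tmp + (size_t)row * w,
               src + (size_t)(y + row) * src_width + x,
               (size_t)w * sizeof *tmp);
    return tmp;
}
```

The temporary buffer holds w * h pixels no matter how large the source is, so the later upload cost depends only on the region actually used.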
-
Zhigang Gong authored
In some special cases we want a cpu memory pixmap, for example to gather a block of a large cpu memory pixmap into a small pixmap. Also add deallocation of the pixmap's private data when a pixmap is destroyed. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
Accessing a mapped vbo address is too slow. By using system memory directly, rgb10text/aa10text increases from 980K/1160K to 117K/140K. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
This reduces the time to run cairo-performance-trace with firefox-planet-gnome.trace from 23.5 seconds to 21.5 seconds. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
If the pixmap is write-only, using a pbo mapping does not bring much benefit. Even worse, when software rendering accesses the mapped data range, it is much slower than just using system memory. From the point of view of glamor_prepare_access/glamor_finish_access, we have two options here:
option 1:
1.0 create a pbo
1.1 copy the texture to the pbo
1.2 map the pbo to a va
1.3 access the va directly in software rendering
1.4 bind the pbo as unpack buffer and draw it back to the texture
option 2:
2.0 allocate a block of memory in system memory space
2.1 read the texture memory into the system memory
2.2 access the system memory and do the rendering
2.3 draw the system memory back to the texture
In general, 1.1 plus 1.2 is much faster than 2.1, 1.3 is slower than 2.2, and 1.4 is faster than 2.3. If the access mode is read-only or read-write, option 1 may be faster, but if the access mode is write-only, then most of the time option 2 is much faster. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
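The choice between the pbo path and the system-memory path could be expressed as a small policy function. This is a sketch of the decision only, following the commit's point that system memory suits write-only access; the enum and function names are hypothetical, and skipping the read-back for write-only access is an assumption rather than something the log states.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not glamor's real types. */
enum sketch_access { SKETCH_ACCESS_RO, SKETCH_ACCESS_RW, SKETCH_ACCESS_WO };
enum sketch_staging { SKETCH_STAGING_PBO, SKETCH_STAGING_SYSMEM };

/* Assumption: write-only access never looks at the old texture contents,
 * so the read-back step (1.1 or 2.1) is not needed for it. */
static bool sketch_needs_readback(enum sketch_access mode)
{
    return mode != SKETCH_ACCESS_WO;
}

/* Prefer plain system memory for write-only access, because software
 * rendering into a mapped pbo is the slow step. Otherwise use a pbo
 * when one is available, since its read-back is faster. */
static enum sketch_staging
sketch_choose_staging(enum sketch_access mode, bool pbo_supported)
{
    if (!pbo_supported || mode == SKETCH_ACCESS_WO)
        return SKETCH_STAGING_SYSMEM;
    return SKETCH_STAGING_PBO;
}
```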
-
Zhigang Gong authored
Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
Signed-off-by:
Zhigang Gong <zhigang.gong@gmail.com>
-
Zhigang Gong authored
This is a bug: if we set up blending before doing the dynamic pixmap upload, the upload runs with an incorrect blend environment. Signed-off-by:
Zhigang Gong <zhigang.gong@gmail.com>
-
Zhigang Gong authored
When uploading a pixmap without yInverted set, we must set up an fbo for it to do the y flip. The previous implementation only considered the ax bit. After fixing this problem, we can enable the dynamic uploading feature in the copyarea function when yInverted is not set (from Xephyr). Signed-off-by:
Zhigang Gong <zhigang.gong@gmail.com>
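The y flip exists because X places row 0 at the top of an image while OpenGL places the origin at the bottom. The commit performs the flip on the GPU through an fbo; the CPU sketch below only illustrates what the flip does to the rows, using a hypothetical helper `sketch_flip_rows`.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the order of the rows of a width x height image in place. */
static void sketch_flip_rows(uint32_t *pixels, int width, int height)
{
    for (int top = 0, bottom = height - 1; top < bottom; top++, bottom--) {
        uint32_t *a = pixels + (size_t)top * width;
        uint32_t *b = pixels + (size_t)bottom * width;

        /* Swap the two rows pixel by pixel. */
        for (int x = 0; x < width; x++) {
            uint32_t t = a[x];
            a[x] = b[x];
            b[x] = t;
        }
    }
}
```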
-
Zhigang Gong authored
When called from ephyr, we forgot to initialize it to the correct value, which causes a segfault when running Xephyr. Signed-off-by:
Zhigang Gong <zhigang.gong@gmail.com>
-
Zhigang Gong authored
Change the handling in glamor_change_window_attributes. We don't need to fall back to the cpu for everything at the beginning. Only when there is a real need to change the pixmap's format do we need to do something; otherwise we do nothing here. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
Concentrate the vertex and texture coords processing code in a new file, glamor_utils.h. Change most of the code to macros, which has some performance benefit on slow machines and removes most of the duplicated code for calculating normalized coords. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
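Normalized-coordinate macros of the kind described might look like the sketch below. These are hypothetical macros written for illustration, not the contents of glamor_utils.h: they map pixel positions to the [-1, 1] range used for vertices and the [0, 1] range used for texture coordinates, and they ignore y inversion.

```c
/* Hypothetical macros, not the ones in glamor_utils.h. */

/* Pixel position -> vertex coordinate in [-1, 1]. */
#define SKETCH_VERTEX_X(x, width)  (2.0f * (float)(x) / (float)(width) - 1.0f)
#define SKETCH_VERTEX_Y(y, height) (2.0f * (float)(y) / (float)(height) - 1.0f)

/* Pixel position -> texture coordinate in [0, 1]. */
#define SKETCH_TEXCOORD(v, size)   ((float)(v) / (float)(size))

/* Fill v[0..7] with the four corners of the rectangle (x1,y1)-(x2,y2)
 * inside a w x h target, in the order: (x1,y1) (x2,y1) (x2,y2) (x1,y2). */
#define SKETCH_SET_RECT_VERTICES(v, x1, y1, x2, y2, w, h)  \
    do {                                                   \
        (v)[0] = SKETCH_VERTEX_X(x1, w);                   \
        (v)[1] = SKETCH_VERTEX_Y(y1, h);                   \
        (v)[2] = SKETCH_VERTEX_X(x2, w);                   \
        (v)[3] = (v)[1];                                   \
        (v)[4] = (v)[2];                                   \
        (v)[5] = SKETCH_VERTEX_Y(y2, h);                   \
        (v)[6] = (v)[0];                                   \
        (v)[7] = (v)[5];                                   \
    } while (0)
```

Keeping the arithmetic in one place means every drawing path computes the corners the same way.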
-
Zhigang Gong authored
Major refactoring.
1. Rewrite the pixmap texture uploading and downloading functions. Add some new functions for both prepare/finish access and the new performance feature, dynamic texture uploading, which can download and upload the current image to/from a private texture/fbo. In the uploading or downloading phase, we need to handle two things. The first is the yInverted option. If it is set, we don't need to flip y. If it is not set and the request comes from a dynamic texture upload, we don't need to flip either, provided the current drawing process will flip it later. If the request comes from finish_access, we must flip the y axis. The second is alpha channel handling. If the pixmap's format is something like x8r8g8b8 or x1r5g5b5, which has no alpha channel but does have those extra bits, we need to wire those bits to 1.
2. Add support for almost all the required picture formats. This is not as trivial as it looks. The previous implementation only supported GL_a8,GL_a8r8g8b8,GL_x8r8g8b8; for all other formats we had to fall back to the cpu. The reason we can't simply add the other color formats is the existence of pictures. One drawable pixmap may have one or more container pictures. The drawable pixmap's depth can't be mapped to one specific color format; for example, depth 16 can map to r5g6b5, x1r5g5b5, a1r5g5b5, or even b5g6r5. So we can't get the color format from the depth value alone, and the pixmap does not have a pict_format element. We have to add one to the pixmap private data structure: reroute CreatePicture to glamor_create_picture and store the picture's format in the pixmap's private structure. This is not an ideal solution, because more than one picture may refer to the same pixmap, and then we have trouble. There is an example in glamor_composite_with_shader: the source and mask often share the same pixmap but use different picture formats. Our current solution is to combine the two picture formats into one that does not lose any data, change the source's format to this new format, and upload the pixmap to a texture once. It works. If we fail to find a matching new format, we fall back. There is still a potential problem: if two pictures refer to the same pixmap and one of them is destroyed while the other remains in use, we don't handle that situation currently. To be fixed.
3. Dynamic texture uploading. This is a performance feature. We don't like clients holding pixmap data in shared memory, which we can't accelerate. Even worse, we may need to fall back all the required pixmaps to cpu memory and then process them on the CPU. This feature mitigates that penalty. When the target pixmap has a valid gl fbo attached but the other pixmaps do not, it is more efficient to upload the other pixmaps to the GPU and do the blitting or rendering there than to fall back all the pixmaps to the CPU. With this feature enabled, I saw a significant performance improvement in the game "Mines" :).
4. Debug facility. Modify the debug output mechanism. Add a new macro, glamor_debug_output(_level_, _format_,...), to conditionally output messages according to the environment variable GLAMOR_DEBUG. We have the following levels currently. Exporting GLAMOR_DEBUG as 3 enables all the above messages.
5. Changes in the pixmap private data structure. Add some elements for full color format support and relate it to the pictures, as already described. Also add the following new elements: gl_fbo - indicates whether this pixmap is on the gpu only. gl_tex - indicates whether the tex is valid and originally contains the pixmap's image. Because of the dynamic pixmap uploading feature, a cpu memory pixmap may also have a valid fbo or tex attached to it, so we have to use the above new elements to check its true type.
After this commit, we can pass the rendercheck testing for all the picture formats, and it is much faster than falling back to the cpu when running rendercheck. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
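Wiring the unused bits to 1 means treating a format without alpha as fully opaque. A minimal CPU sketch, assuming the padding bits occupy the top of each pixel as in the x8r8g8b8 and x1r5g5b5 layouts; the function names are hypothetical and the commit itself may do this during upload rather than per pixel on the CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* x8r8g8b8: the top 8 bits are padding; set them so alpha reads as 1.0. */
static uint32_t sketch_opaque_x8r8g8b8(uint32_t pixel)
{
    return pixel | 0xff000000u;
}

/* x1r5g5b5: the top bit is padding; set it so alpha reads as 1. */
static uint16_t sketch_opaque_x1r5g5b5(uint16_t pixel)
{
    return (uint16_t)(pixel | 0x8000u);
}

/* Apply the 32-bit fix-up to a whole buffer of count pixels. */
static void sketch_opaque_buffer_x8r8g8b8(uint32_t *pixels, size_t count)
{
    for (size_t i = 0; i < count; i++)
        pixels[i] = sketch_opaque_x8r8g8b8(pixels[i]);
}
```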
-
Zhigang Gong authored
This commit was borrowed from the uxa driver, contributed by Eric. The commit number is e0066e77e026b0dd0daa0c3765473c7d63aa6753, and its commit log is pasted below: We were clipping each span against the bounds of the clip, throwing out the span early if it was all clipped, and then walked the clip box clipping against each of the cliprects. We would expect spans to typically be clipped against one box, and not thrown out, so we were not saving any work there. For multiple cliprects, we were adding work. Only for many spans clipped entirely out of a complicated clip region would it have saved work, and it clearly didn't save bugs as evidenced by the many fix attempts here. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
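The simplified approach, clipping each span directly against every cliprect with no early test against the overall bounds, can be sketched as below. The types and `sketch_clip_span` are hypothetical; this is an illustration of the idea, not the uxa or glamor code.

```c
/* Hypothetical types for illustration; x2/y2 are exclusive. */
struct sketch_clip_box { int x1, y1, x2, y2; };
struct sketch_span { int x, y, width; };

/* Clip one horizontal span against each cliprect in turn and write the
 * visible pieces to out, which must have room for nbox entries.
 * Returns the number of pieces written. */
static int sketch_clip_span(struct sketch_span span,
                            const struct sketch_clip_box *boxes, int nbox,
                            struct sketch_span *out)
{
    int n = 0;

    for (int i = 0; i < nbox; i++) {
        const struct sketch_clip_box *b = &boxes[i];
        int x1, x2;

        /* The span's row must lie inside the box. */
        if (span.y < b->y1 || span.y >= b->y2)
            continue;

        /* Intersect the horizontal extents. */
        x1 = span.x > b->x1 ? span.x : b->x1;
        x2 = span.x + span.width < b->x2 ? span.x + span.width : b->x2;
        if (x1 >= x2)
            continue;

        out[n].x = x1;
        out[n].y = span.y;
        out[n].width = x2 - x1;
        n++;
    }
    return n;
}
```

A span that misses every box simply produces no output, so no separate bounds check is needed for correctness.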
-
Zhigang Gong authored
It's better to give correct output when we haven't implemented all the code paths. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
The previous implementation just skipped the rendering, which is not good. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
By default, fall back to the frame buffer for now. This commit makes us pass rendercheck's triangles test. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
For 1bpp pixmaps, the software fb gets better performance than a GL surface. The main reason is that an fbo doesn't support a 1bpp texture as internal format, so we have to translate a 1bpp bitmap to an 8-bit alpha format each time, which is very inefficient. Also, the previous implementation is not supported by the latest OpenGL 4.0, where GL_BITMAP was deprecated. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
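The per-upload translation that makes 1bpp expensive looks roughly like the sketch below, which expands every bit into one alpha byte. It assumes MSB-first bit order within each byte and byte strides supplied by the caller; the bit order of a real X bitmap depends on the server configuration, and `sketch_expand_1bpp_to_a8` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand a 1bpp bitmap into an 8-bit alpha image: a set bit becomes
 * 0xff and a clear bit becomes 0x00. Strides are in bytes.
 * Assumption: the leftmost pixel is the most significant bit. */
static void sketch_expand_1bpp_to_a8(const uint8_t *src, int src_stride,
                                     uint8_t *dst, int dst_stride,
                                     int width, int height)
{
    for (int y = 0; y < height; y++) {
        const uint8_t *s = src + (size_t)y * src_stride;
        uint8_t *d = dst + (size_t)y * dst_stride;

        for (int x = 0; x < width; x++) {
            int bit = (s[x >> 3] >> (7 - (x & 7))) & 1;
            d[x] = bit ? 0xff : 0x00;
        }
    }
}
```

The output is eight times the size of the input and has to be regenerated whenever the bitmap changes, which is consistent with the commit's choice of the software fb for this depth.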
-
Zhigang Gong authored
Add a new shader, aswizlle_prog, to wire the alpha to 1 when the image color depth is 24 (xrgb). Then we don't need to fall back to software composite for xrgb sources/masks in the render phase. Also, don't wire the alpha bit to 1 in the render phase. This gives about a 2x performance gain in the firefox-planet case of the cairo performance trace. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-
Zhigang Gong authored
Use a pbo if possible when we load a texture into a temporary tex. The previous direct texture load function was not correct and is removed in this commit. Signed-off-by:
Zhigang Gong <zhigang.gong@linux.intel.com>
-