mirror of
https://github.com/panda3d/panda3d.git
synced 2025-10-05 11:28:17 -04:00
documents moved to CMU wiki: www.panda3d.org
parent
856a53bd5c
commit
ab21c536ad
@@ -1,110 +0,0 @@
HOW TO FIX TRANSPARENCY ISSUES

Usually transparency works as expected in Panda automatically, but
sometimes it just seems to go awry, where a semitransparent object in
the background seems to partially obscure a semitransparent object in
front of it. This is especially likely to happen with large flat
polygon cutouts, or when a transparent object is contained within
another transparent object, or when parts of a transparent object can
be seen behind other parts of the same object.

The fundamental problem is that correct transparency, in the absence
of special hardware support involving extra framebuffer bits, requires
drawing everything in order from farthest away to nearest. This means
sorting each polygon--actually, each pixel, for true correctness--into
back-to-front order before drawing the scene.

It is, of course, impractical to split up every transparent object into
individual pixels or polygons for sorting individually, so Panda sorts
objects at the Geom level, according to the center of the bounding
volume. This works well in the large majority of cases.
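
The center-based ordering described above can be sketched in a few
lines of Python. This is an illustrative model only, not Panda's
implementation: the sort_back_to_front helper, the (name, center)
tuple layout, and the use of straight-line distance from the camera
are all assumptions made for the example.

```python
import math

def sort_back_to_front(camera_pos, geoms):
    """Return geom names ordered farthest-first.

    geoms is a list of (name, bounding_volume_center) pairs; only the
    center of each bounding volume is consulted, never the polygons.
    """
    ordered = sorted(
        geoms,
        key=lambda geom: math.dist(camera_pos, geom[1]),
        reverse=True,  # the farthest center is drawn first
    )
    return [name for name, _center in ordered]

camera = (0.0, -10.0, 0.0)
scene = [
    ("window", (0.0, 0.0, 0.0)),
    ("smoke", (0.0, 20.0, 0.0)),
    ("glass", (0.0, 5.0, 0.0)),
]
draw_order = sort_back_to_front(camera, scene)
print(draw_order)
```

Because only one point per object is compared, the result is only as
good as that point is representative of the whole object.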
|
|
||||||
You run into problems with large flat polygons, though, since these
tend to have parts that are far away from the center of their bounding
volume. The bounding-volume sorting is especially likely to go awry
when you have two or more large flats, one close behind the other, and
you view them from slightly off-axis. (Try drawing a picture of the
two flats as seen from the top, and imagine yourself viewing them from
different directions. Also imagine where the centers of the bounding
volumes are.)
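
The picture suggested above can also be worked through numerically.
The sketch below uses a hypothetical top-down layout (the camera
position and both flats are made-up values) to show a case where the
center of the rear flat is closer to the camera than the center of the
front flat, even though a view ray passes through the front flat
first.

```python
import math

def center(flat):
    """Midpoint of a flat given as ((x0, y0), (x1, y1)) in a top-down view."""
    (x0, y0), (x1, y1) = flat
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)

def crossing_x(camera, target, y):
    """X coordinate where the ray from camera to target crosses depth y."""
    t = (y - camera[1]) / (target[1] - camera[1])
    return camera[0] + t * (target[0] - camera[0])

camera = (20.0, -5.0)                      # off to one side, looking toward +y
front_flat = ((-10.0, 0.0), (10.0, 0.0))   # at depth y = 0
back_flat = ((5.0, 1.0), (25.0, 1.0))      # at depth y = 1, just behind

# Center-based sorting concludes that the back flat is the nearer one.
front_center_dist = math.dist(camera, center(front_flat))
back_center_dist = math.dist(camera, center(back_flat))
sorted_as_nearer = "back" if back_center_dist < front_center_dist else "front"

# Yet a view ray aimed at the point (6, 1) on the back flat first passes
# through the front flat, so the front flat really is in front there.
x_at_front = crossing_x(camera, (6.0, 1.0), 0.0)
ray_hits_front_first = front_flat[0][0] <= x_at_front <= front_flat[1][0]

print(sorted_as_nearer, ray_hits_front_first)
```

With these numbers the rear flat would be drawn last, which is the
wrong order for the region where the two flats overlap on screen.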
|
|
||||||
Now, there are a number of solutions to this sort of problem. No one
solution is right for every situation.

First, the easiest thing to do is to use M_dual transparency. This is
a special transparency mode in which the completely invisible parts of
the object aren't drawn into the Z-buffer at all, so that they don't
have any chance of obscuring things behind them. This only works well
if the flats are typical cutouts, where there is a big solid part
(alpha == 1.0) and a big transparent part (alpha == 0.0), and not a
lot of semitransparent parts (0.0 < alpha < 1.0). It is also a
slightly more expensive rendering mode than the default of M_alpha, so
it's not enabled by default in Panda. But egg-palettize will turn it
on automatically for a particular model if it detects textures that
appear to be cutouts of the appropriate nature, which is another
reason to use egg-palettize if you are not using it already.

If you don't use egg-palettize (you really should, you know), you can
just hand-edit the egg files to put the line:

  <Scalar> alpha { dual }

within the <Texture> reference for the textures in question.
|
|
||||||
A second easy option is to use M_multisample transparency, which
doesn't have any ordering issues at all, but it only looks good on
very high-end cards that have special multisample bits to support
full-screen antialiasing. Also, at present it only looks good on
these high-end cards in OpenGL mode (since our pandadx drivers don't
support M_multisample explicitly right now). But if M_multisample is
not supported by the particular hardware or Panda driver in use, it
automatically falls back to M_binary, which also doesn't have any
ordering issues, but always has jagged edges along the cutout edge.
Like M_dual, above, this approach only works well on texture images
that represent cutouts.

If you use egg-palettize, you can engage M_multisample mode by putting
the keyword "ms" on the line with the texture(s). Without
egg-palettize, hand-edit the egg files to put the line:

  <Scalar> alpha { ms }

within the <Texture> reference for the textures in question.
|
|
||||||
A third easy option is to chop up one or both competing models into
smaller pieces, each of which can be sorted independently by Panda.
For instance, you can split one big polygon into a grid of little
polygons, and the sorting is more likely to be accurate for each piece
(because the center of the bounding volume is closer to the pixels).
You can draw a picture to see how this works. In order to do this
properly, you can't just make it one big mesh of small polygons, since
Panda will make a mesh into a single Geom of tristrips; instead, it
needs to be separate meshes, so that each one will become its own
Geom. Obviously, this is slightly more expensive too, since you are
introducing additional vertices and adding more objects to the sort
list; so you don't want to go too crazy with the smallness of your
polygons.
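
As a rough illustration of why smaller pieces sort better, the sketch
below splits one rectangle into a grid and reports the center of each
cell. The split_into_grid helper and its dictionary layout are
hypothetical; in practice each cell would have to be modeled as its
own separate mesh, as described above.

```python
def split_into_grid(x_min, y_min, x_max, y_max, columns, rows):
    """Split a rectangle into columns * rows cells, each with its own center."""
    cell_w = (x_max - x_min) / columns
    cell_h = (y_max - y_min) / rows
    cells = []
    for row in range(rows):
        for col in range(columns):
            left = x_min + col * cell_w
            bottom = y_min + row * cell_h
            cells.append({
                "bounds": (left, bottom, left + cell_w, bottom + cell_h),
                "center": (left + cell_w / 2.0, bottom + cell_h / 2.0),
            })
    return cells

# One 8 x 4 flat becomes eight 2 x 2 pieces.
cells = split_into_grid(0.0, 0.0, 8.0, 4.0, columns=4, rows=2)
for cell in cells:
    print(cell["center"])
```

No point of a 2 x 2 cell is farther than about 1.41 units from its
center, compared with about 4.47 units for the original 8 x 4 flat, so
each center is a much better stand-in for the pixels it represents.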
|
|
||||||
A fourth option is simply to disable the depth write on your
transparent objects. This is most effective when you are trying to
represent something that is barely visible, like glass or a soap
bubble. Doing this doesn't improve the likelihood of correct sorting,
but it will tend to make the artifacts of an incorrect sorting less
obvious. You can achieve this by using the transparency option
"blend_no_occlude" in an egg file, or by explicitly disabling the
depth write on a loaded model with node_path.set_depth_write(false).
You should be careful only to disable depth write on the transparent
pieces, and not on the opaque parts.

A final option is to make explicit sorting requests to Panda. This is
often the last resort because it is more difficult, and doesn't
generalize well, but it does have the advantage of not adding
additional performance penalties to your scene. It only works well
when the transparent objects can be sorted reliably with respect to
everything else behind them. For instance, clouds in the sky can
reliably be drawn before almost everything else in the scene, except
the sky itself. Similarly, a big flat that is up against an opaque
wall can reliably be drawn after all of the opaque objects, but before
any other transparent object, regardless of where the camera happens
to be placed in the scene. See howto.control_render_order.txt for
more information about explicitly controlling the rendering order.
@@ -1,243 +0,0 @@
MULTITEXTURE OVERVIEW

Modern graphics cards are capable of applying more than one texture
image at once to geometry as they render polygons. This capability is
referred to as multitexture.

The textures are applied in a pipeline fashion, where the output of
each texturing operation is used as the input to the next. A
particular graphics card will have a certain number of texture units
dedicated to this function, which limits the number of textures that
may be pipelined in this way.

To apply a texture in Panda, you must have a Texture object (which you
might have loaded from disk, or extracted from a model) and a
TextureStage object (which you can create on-the-fly). The primary
call to add a texture to the pipeline is:

  nodepath.set_texture(texture_stage, texture);

This adds the indicated texture into the pipeline for all the geometry
at nodepath level and below, associating it with the indicated
TextureStage object.

The purpose of the TextureStage object is to represent a single stage
in the texture pipeline. You can create as many TextureStage objects
as you like; each one can associate a different texture, and each of
those textures will be applied together (within the limits of your
hardware). If you want to change out a particular texture within the
pipeline without disturbing the others, keep a handle to the
TextureStage object that you used for that stage, and issue a
set_texture() call using the same TextureStage object and a different
texture--this replaces the texture that you assigned previously. (You
may do this on the same NodePath, or on a lower NodePath level to
override the texture specified from above.)
|
|
||||||
To undo a set_texture() call for a particular stage or for all stages,
do:

  nodepath.clear_texture(texture_stage)
  nodepath.clear_texture()

Don't confuse this with the calls to actively disable a particular
texture stage or to disable texturing altogether, which are:

  nodepath.set_texture_off(texture_stage)
  nodepath.set_texture_off()

The difference between the two is that set_texture_off() inserts a
command into the scene graph to specifically turn off the texture
associated with the indicated texture stage, while clear_texture()
simply removes the texture stage from this node's list of assigned
textures. Use clear_texture() to undo a previous call to
set_texture() on a given node. You need set_texture_off() more
rarely; you might use this when you want to override a particular
setting from above to turn off just one particular stage of the
pipeline (for instance, you may have a set_texture() applied at the
root of a scene to apply a particular effect to everything in the
scene, but use set_texture_off() on one particular model for which you
don't want that effect applied).

There is also a default TextureStage object that is used for all of
the old single-texture Panda interfaces (like
nodepath.set_texture(texture)). It is also the TextureStage that will
be used to apply Textures onto models (e.g. egg files and/or bam
files) that do not specify the use of multitexturing. This default
TextureStage can be accessed by TextureStage::get_default().
|
|
||||||
There are a number of different blend modes that you may specify for
each texture stage in the pipeline; these are specified with
texture_stage.set_mode(). The mode may be one of:

  TextureStage::M_modulate
    Multiplies the incoming color by the texture color. This
    allows the texture to darken, but not brighten, the incoming
    color.

  TextureStage::M_add
    Adds the incoming color and the texture color. This allows the
    texture to brighten, but not darken, the incoming color, and
    tends to lead to bright, desaturated colors.

  TextureStage::M_decal
    Shows the texture color where the texture's alpha is 1, and the
    incoming color where the texture's alpha is 0. This can be
    used to paint a texture on top of the existing texture.

  TextureStage::M_blend
    Defined for grayscale textures only. You can specify an
    arbitrary color as a separate parameter with
    texture_stage.set_color(), and then the result of M_blend is to
    produce the specified color where the texture is white, and
    the incoming color where the texture is black. This can be
    used to paint arbitrary color stripes or a similar effect over
    an existing texture.

  TextureStage::M_blend_color_scale
    This is identical to M_blend, except that the blend color,
    specified by texture_stage.set_color(), is also modified by the
    color scale applied to the scene graph.

  TextureStage::M_replace
    Completely replaces the incoming color with the texture color;
    probably not terribly useful in a multitexture environment,
    except for the first texture stage.

  TextureStage::M_combine
    This mode supersedes most of the above with a more powerful
    collection of options, including signed add and/or subtract,
    and linear interpolation between two different colors using a
    third parameter. You can specify the input(s) as one or more
    combinations of a specified constant color, or the previous
    texture in the pipeline, or the incoming color. However, very
    old graphics drivers may not support this mode.

    Since combine mode has a number of associated parameters, you
    enable this mode by calling set_combine_rgb() and
    set_combine_alpha() with the appropriate parameters; it's not
    necessary to call set_mode(M_combine). A complete description
    of this mode is not given here.
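
The arithmetic behind the simpler modes can be sketched on plain RGB
tuples with components in the range 0.0 to 1.0. These functions are a
reading of the descriptions above, not Panda code; the function names
are hypothetical, and the linear interpolation used for intermediate
alpha and gray values in decal and blend is an assumption (the text
only describes the end points).

```python
def clamp(value):
    """Keep a color component within the displayable 0.0 to 1.0 range."""
    return max(0.0, min(1.0, value))

def modulate(incoming, texture):
    """Multiply: can only darken the incoming color."""
    return tuple(i * t for i, t in zip(incoming, texture))

def add(incoming, texture):
    """Sum, clamped: can only brighten the incoming color."""
    return tuple(clamp(i + t) for i, t in zip(incoming, texture))

def decal(incoming, texture, texture_alpha):
    """Texture color where alpha is 1, incoming color where alpha is 0."""
    return tuple(i * (1.0 - texture_alpha) + t * texture_alpha
                 for i, t in zip(incoming, texture))

def blend(incoming, gray, color):
    """Constant color where the texture is white (gray 1), incoming where black."""
    return tuple(i * (1.0 - gray) + c * gray for i, c in zip(incoming, color))

def replace(incoming, texture):
    """Ignore the incoming color entirely."""
    return tuple(texture)

incoming = (1.0, 0.5, 0.25)
texture = (0.5, 0.5, 1.0)
print(modulate(incoming, texture))
print(add(incoming, texture))
print(decal(incoming, texture, 1.0))
print(blend(incoming, 0.0, (0.0, 1.0, 0.0)))
```

Chaining several of these calls, with the result of one passed as the
incoming color of the next, mirrors the pipeline behavior described at
the top of this document.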
|
|
||||||
Some of the above modes are very order-dependent. For this reason,
you may use texture_stage.set_sort() to specify the order in which
textures should be applied, using an integer sort parameter. When
Panda collects the textures together for rendering a particular piece
of geometry, it will sort them in order from lowest sort value to
highest sort value. The default sort value is 0. Thus, you can
specify a large positive number to apply a texture on top of existing
textures, or a large negative number to apply it beneath existing
textures.

The egg loader will create texture stages automatically in the
presence of a multitexturing specification in the egg file, and it
will assign to these stages sort values in multiples of 10: the lowest
texture stage will have a sort value of 0, the next 10, the next 20,
and so on.

Since the number of texture units available on the hardware is
limited, and is usually a small number (and some hardware doesn't
support multitexturing at all, so effectively has only one texture
unit), Panda needs some rule for selecting the subset of textures to
render when you have requested more texture stages than are available.
For this Panda relies on the texture_stage.set_priority() value, which
is an integer value that represents the importance of this particular
texture. If the requested textures will not fit on the available
number of texture units, Panda will select the n textures with the
highest priority (and then sort them into order by the set_sort()
parameter). Between two textures with the same priority, Panda will
prefer the one with the lower sort value. The default priority is 0.
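
The selection rule just described (highest priority first, ties broken
by lower sort value, survivors applied in sort order) can be modeled
as follows. This is a sketch of the rule as stated, not Panda's code;
the select_stages helper, the (name, priority, sort) tuple layout, and
the stage names are hypothetical.

```python
def select_stages(stages, max_units):
    """Pick the stages that fit on the hardware and order them for rendering.

    stages is a list of (name, priority, sort) tuples.
    """
    # Highest priority wins; among equal priorities, the lower sort wins.
    by_importance = sorted(stages, key=lambda s: (-s[1], s[2]))
    chosen = by_importance[:max_units]
    # The surviving stages are applied from lowest to highest sort value.
    return [name for name, _priority, _sort in sorted(chosen, key=lambda s: s[2])]

stages = [
    ("base", 0, 0),
    ("detail", 0, 10),
    ("lightmap", 5, 20),
    ("gloss", 0, 30),
]
two_units = select_stages(stages, max_units=2)
four_units = select_stages(stages, max_units=4)
print(two_units)
print(four_units)
```

With only two units, the high-priority lightmap is kept along with the
base texture, which wins the tie among the priority-0 stages because
it has the lowest sort value.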
|
|
||||||
If you need to know the actual limit, you can query your available
number of texture stages from the GraphicsStateGuardian, with the call
gsg->get_max_texture_stages() (e.g. from Python, call
base.win.getGsg().getMaxTextureStages()).


TEXTURE COORDINATES

In many cases, all of the texture stages need to use the same set of
texture coordinates, which is the default behavior. You can also
apply a different texture matrix on some texture stages to apply a
linear transformation to the texture coordinates (for instance, to
position a decal on the surface).

  nodepath.set_tex_offset(texture_stage, u_offset, v_offset);
  nodepath.set_tex_scale(texture_stage, u_scale, v_scale);
  nodepath.set_tex_rotate(texture_stage, degrees);
  nodepath.set_tex_transform(texture_stage, general_transform);

These operations accumulate through nested nodes just like standard
scene graph transforms. In fact, you can get and set relative texture
transforms:

  rel_offset = nodepath.get_tex_offset(other, texture_stage);
  nodepath.set_tex_scale(other, texture_stage, u_scale, v_scale);
  (etc.)
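
To make the effect of a texture matrix concrete, the sketch below
applies a scale, a rotation, and an offset to a single (u, v) pair.
It is a generic 2-D transform written for illustration; the
transform_uv helper is hypothetical, and the order of operations
(scale, then rotate, then offset) and the counterclockwise rotation
direction are assumptions that may not match Panda's conventions.

```python
import math

def transform_uv(uv, offset=(0.0, 0.0), scale=(1.0, 1.0), rotate_degrees=0.0):
    """Apply scale, then rotation about the origin, then offset to (u, v)."""
    u, v = uv[0] * scale[0], uv[1] * scale[1]
    angle = math.radians(rotate_degrees)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    u, v = u * cos_a - v * sin_a, u * sin_a + v * cos_a
    return (u + offset[0], v + offset[1])

# Scale the coordinates by 2 and slide them half a unit along u.
scaled = transform_uv((1.0, 1.0), offset=(0.5, 0.0), scale=(2.0, 2.0))

# Two nested offsets accumulate, much like nested scene graph transforms.
nested = transform_uv(transform_uv((0.0, 0.0), offset=(0.25, 0.0)),
                      offset=(0.25, 0.0))
print(scaled, nested)
```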
|
|
||||||
You may use LerpIntervals to lerp texture matrices indirectly. There
are no interval types that operate directly on a texture matrix, but
you can set up a TexProjectorEffect to bind a node's transform to the
texture matrix:

  nodepath.set_tex_projector(texture_stage, from, to)

Where "from" and "to" are arbitrary NodePaths. The TexProjectorEffect
will measure the relative transform between "from" and "to" each frame
and apply it to the nodepath's texture matrix. Once this is in place,
you may create a LerpPosInterval, or any other Panda construct, to
adjust either the "from" or the "to" NodePath, which will thus
indirectly adjust the texture matrix by the same amount.


Sometimes, a texture stage may need to use a completely different set
of texture coordinates, for instance as provided by the artist who
generated the model. Panda allows a model to store any number of
different sets of texture coordinates on its vertices, each with a
unique name. You can associate any texture stage with any set of
texture coordinates you happen to have available on your model:

  texture_stage.set_texcoord_name(name)


Finally, you may need to generate texture coordinates for a particular
texture stage on the fly. This is particularly useful, for instance,
to apply reflection maps, e.g. sphere maps or cube maps. To enable
this effect, use:

  nodepath.set_tex_gen(texture_stage, mode)

Where mode is one of the enumerated types named by TexGenAttrib::Mode;
at present, this may be any of M_world_position,
M_object_position, M_eye_position, or M_sphere_map. The first three
modes simply apply the X, Y, Z coordinates of the vertex to its U, V
texture coordinates (a texture matrix may then be applied to transform
the generated texture coordinates into the particular U, V coordinate
space that you require). The remaining mode, M_sphere_map, generates
texture coordinates appropriate to a sphere map, based on the position
and normal of each vertex, relative to the camera.
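
For reference, the classic fixed-function OpenGL sphere-map formula
derives (u, v) from the reflection of the eye-to-vertex direction
about the vertex normal. The sketch below implements that textbook
formula; whether M_sphere_map matches it exactly is an assumption, as
is the OpenGL eye-space convention used here (camera looking down -z),
and the sphere_map_uv helper is hypothetical.

```python
import math

def sphere_map_uv(eye_to_vertex, normal):
    """Return (u, v) for unit vectors given in OpenGL-style eye space."""
    dot = sum(e * n for e, n in zip(eye_to_vertex, normal))
    # Reflect the view direction about the normal: r = e - 2 (e . n) n
    rx, ry, rz = (e - 2.0 * dot * n for e, n in zip(eye_to_vertex, normal))
    # m is zero only for r = (0, 0, -1), the singular point of a sphere map.
    m = 2.0 * math.sqrt(rx * rx + ry * ry + (rz + 1.0) ** 2)
    return (rx / m + 0.5, ry / m + 0.5)

# A surface facing the camera head-on samples the center of the map.
center_uv = sphere_map_uv((0.0, 0.0, -1.0), (0.0, 0.0, 1.0))
print(center_uv)
```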
|
|
||||||
|
|
||||||
The texture generation mode and the tex projector mode may be combined
to provide hardware-assisted projective texturing, where a texture is
applied to geometry as if it were projected from a particular point in
space, like a slide projector. This is particularly useful for
applying shadow maps or flashlight effects, for instance. There is a
convenience function on NodePath that automatically makes the three
separate calls needed to enable projective texturing:

  nodepath.project_texture(texture_stage, texture, projector);

Where projector is a NodePath that references a LensNode. The
indicated texture is applied to the geometry at nodepath and below, as
if it were projected from the indicated projector. The lens
properties such as field of view may be adjusted on the fly to adjust
the projection.

(Note that Panda also provides a ProjectionScreen object, which
performs an effect very similar to the project_texture() call, except
that it is performed entirely in the CPU, whereas project_texture()
will offload the work onto the graphics card if the card supports
this. This may or may not result in a performance improvement over
ProjectionScreen, depending on the nature of your scene and your CPU
load versus your graphics card load.)
@@ -1,403 +0,0 @@
QUICK INTRODUCTION

PStats is Panda's built-in performance analysis tool. It can graph
frame rate over time, and can further break down the time spent within
each frame into user-defined subdivisions of the frame (for instance,
app, cull and draw), and thus can be an invaluable tool in identifying
performance bottlenecks. It can also show frame-based data that
reflects any arbitrary quantity other than time intervals, for
instance, texture memory in use or number of vertices drawn.

The performance graphs may be drawn on the same computer that is
running the Panda client, or they may be drawn on another computer on
the same LAN, which is useful for analyzing fullscreen applications.
The remote computer need not be running the same operating system as
the client computer.

To use PStats, you first need to build the PStats server program,
which is part of the Pandatool tree (it's called pstats.exe on
Windows, and gtk-stats on a Unix platform). Start by running the
PStats server program (it runs in the background), and then start your
Direct/Panda client with the following in your Config.prc file:

  want-pstats 1

Or, at runtime, issue the Python command:

  PStatClient.connect()

Or if you're running pview, press shift-S.

Any of the above will contact your running PStats server program,
which will proceed to open a window and start a running graph of your
client's performance. If you are running the server on a different
machine than the client, add the pstats-host variable to your client's
Config.prc file, naming the hostname or IP address of the machine
running the PStats server.

If you are developing Python code, you may be interested in reporting
the relative time spent within each Python task (by subdividing the
total time spent in Python, as reported under "Show Code"). To do
this, add the following lines to your Config.prc file before you start
ShowBase:

  task-timer-verbose 1
  pstats-tasks 1
|
|
||||||
|
|
||||||
THE PSTATS SERVER (The user interface)

The GUI for managing the graphs and drilling down to view more detail
is entirely controlled by the PStats server program. At the time of
this writing, there are two different versions of the PStats server,
one for Unix called gtk-stats and one for Windows called simply
pstats. The interfaces are similar but not identical; the following
paragraphs describe the Windows version.

When you run pstats.exe, it adds a program to the taskbar but does not
immediately open a window. The program name is typically "PStats
5185", showing the default PStats TCP port number of 5185; see "HOW IT
WORKS" below for more details about the TCP communication system. For
the most part you don't need to worry about the port number, as long
as server and client agree (and the port is not already being used by
another application).

Each time a client connects to the PStats server, a new monitor window
is created. This monitor window owns all of the graphs that you
create to view the performance data from that particular connection.
Initially, a strip chart showing the frame time of the main thread is
created by default; you can create additional graphs by selecting from
the Graphs pulldown menu.
|
|
||||||
Time-based Strip Charts

This is the graph type you will use most frequently to examine
performance data. The horizontal axis represents the passage of time;
each frame is represented as a vertical slice on the graph. The
overall height of the colored bands represents the total amount of
time spent on each frame; within the frame, the time is further
divided into the primary subdivisions represented by different color
bands (and labeled on the left). These subdivisions are called
"collectors" in the PStats terminology, since they represent time
collected by different tasks.

Normally, the three primary collectors are App, Cull, and Draw, the
three stages of the graphics pipeline. Atop these three colored
collectors is the label "Frame", which represents any remaining time
spent in the frame that was not specifically allocated to one of the
three child collectors (normally, there should not be significant time
reported here).

The frame time in milliseconds, averaged over the past three seconds,
is drawn above the upper right corner of the graph. The labels on the
guide bars on the right are also shown in milliseconds; if you prefer
to think about a target frame rate rather than an elapsed time in
milliseconds, you may find it useful to select "Hz" from the Units
pulldown menu, which changes the time units accordingly.
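
The two units are reciprocals of one another, so converting between a
frame time and a frame rate is a one-line calculation. The helper
names below are hypothetical and exist only for this example.

```python
def ms_to_hz(frame_ms):
    """Frames per second for a given per-frame time in milliseconds."""
    return 1000.0 / frame_ms

def hz_to_ms(frame_hz):
    """Per-frame time budget in milliseconds for a given frame rate."""
    return 1000.0 / frame_hz

print(ms_to_hz(20.0))   # a 20 ms frame
print(hz_to_ms(60.0))   # the budget for a 60 Hz target
```

A 60 Hz target therefore leaves a little under 17 ms for App, Cull,
and Draw combined.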
|
|
||||||
The running Panda client suggests its target frame rate, as well as
the initial vertical scale of the graph (that is, the height of the
colored bars). You can change the scale freely by clicking within the
graph itself and dragging the mouse up or down as necessary. One of
the horizontal guide bars is drawn in a lighter shade of gray; this
one represents the actual target frame rate suggested by the client.
The other, darker, guide bars are drawn automatically at harmonic
subdivisions of the target frame rate. You can change the target frame
rate with the Config.prc variable pstats-target-frame-rate on the
client.

You can also create any number of user-defined guide bars by dragging
them into the graph from the gray space immediately above or below the
graph. These are drawn in a dashed blue line. It is sometimes useful
to place one of these to mark a performance level so it may be
compared to future values (or to alternate configurations).

The primary collectors labeled on the left might themselves be further
subdivided, if the data is provided by the client. For instance, App
is often divided into Show Code, Animation, and Collisions, where Show
Code is the time spent executing any Python code, Animation is the
time used to compute any animated characters, and Collisions is the
time spent in the collision traverser(s).

To see any of these further breakdowns, double-click on the
corresponding colored label (or on the colored band within the graph
itself). This narrows the focus of the strip chart from the overall
frame to just the selected collector, which has two advantages.
Firstly, it may be easier to observe the behavior of one particular
collector when it is drawn alone (as opposed to being stacked on top
of some other color bars), and the time in the upper-right corner will
now reflect the total time spent within just this collector.
Secondly, if there are further breakdowns to this collector, they will
now be shown as further colored bars. As in the Frame chart, the
topmost label is the name of the parent collector, and any time shown
in this color represents time allocated to the parent collector that
is not accounted for by any of the child collectors.
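
The relationship between a parent collector and its children is simple
subtraction. The sketch below uses made-up timings for the App
collector and a hypothetical unaccounted_time helper to show how the
band drawn in the parent's own color is derived.

```python
def unaccounted_time(parent_ms, child_ms):
    """Time charged to the parent collector but not to any of its children."""
    return parent_ms - sum(child_ms.values())

# Hypothetical per-frame timings, in milliseconds.
app_children = {"Show Code": 6.0, "Animation": 2.5, "Collisions": 1.0}
remainder = unaccounted_time(10.0, app_children)
print(remainder)
```

Here App reports 10 ms in total while its children account for 9.5 ms,
so 0.5 ms would be drawn in App's own color at the top of the chart.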
|
|
||||||
You can further drill down by double-clicking on any of the new
labels; or double-click on the top label, or the white part of the
graph, to return back up to the previous level.

Value-based Strip Charts

There are other strip charts you may create, which show arbitrary
kinds of data per frame other than elapsed time. These can only be
accessed from the Graphs pulldown menu, and include things such as
texture memory in use and vertices drawn. They behave similarly to
the time-based strip charts described above.

Piano Roll Charts

This graph is used less frequently, but when it is needed it is a
valuable tool to reveal exactly how the time is spent within a frame.
The PStats server automatically collects together all the time spent
within each collector and shows it as a single total, but in reality
it may not all have been spent in one continuous block of time.

For instance, when Panda draws each display region in single-threaded
mode, it performs a cull traversal followed by a draw traversal for
each display region. Thus, if your Panda client includes multiple
display regions, it will alternate its time spent culling and drawing
as it processes each of them. The strip chart, however, reports only
the total cull time and draw time spent.
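
The difference between the two views can be shown with a small data
model. The event list below is made up (one frame, two display
regions), and the two helper functions are hypothetical; they only
illustrate how the same timing data yields a single total per
collector for a strip chart and a list of separate spans per collector
for a piano roll.

```python
from collections import defaultdict

# (collector, start_ms, stop_ms) for one frame with two display regions.
frame_events = [
    ("Cull", 0.0, 2.0),
    ("Draw", 2.0, 5.0),
    ("Cull", 5.0, 6.0),
    ("Draw", 6.0, 10.0),
]

def strip_chart_totals(events):
    """Collapse every span into one total per collector."""
    totals = defaultdict(float)
    for name, start, stop in events:
        totals[name] += stop - start
    return dict(totals)

def piano_roll_rows(events):
    """Keep each span separate so the sequence within the frame is visible."""
    rows = defaultdict(list)
    for name, start, stop in events:
        rows[name].append((start, stop))
    return dict(rows)

totals = strip_chart_totals(frame_events)
rows = piano_roll_rows(frame_events)
print(totals)
print(rows)
```

The totals show 3 ms of culling and 7 ms of drawing, while the rows
reveal that each was split into two separate blocks.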
|
|
||||||
|
|
||||||
Sometimes you really need to know the sequence of events in the frame,
|
|
||||||
not just the total time spent in each collector. The piano roll chart
|
|
||||||
shows this kind of data. It is so named because it is similar to the
|
|
||||||
paper music roll for an old-style player piano, with holes punched
|
|
||||||
down the roll for each note that is to be played. The longer the
|
|
||||||
hole, the longer the piano key is held down. (Think of the chart as
|
|
||||||
rotated 90 degrees from an actual piano roll. A player piano roll
|
|
||||||
plays from bottom to top; the piano roll chart reads from left to
|
|
||||||
right.)
|
|
||||||
|
|
||||||
Unlike a strip chart, a piano roll chart does not show trends; the
|
|
||||||
chart shows only the current frame's data. The horizontal axis shows
|
|
||||||
time within the frame, and the individual collectors are stacked up in
|
|
||||||
an arbitrary ordering along the vertical axis.
|
|
||||||
|
|
||||||
The time spent within the frame is drawn from left to right; at any
|
|
||||||
given time, the collector(s) that are active will be drawn with a
|
|
||||||
horizontal bar. You can observe the CPU behavior within a frame by
|
|
||||||
reading the graph from left to right. You may find it useful to
|
|
||||||
select "pause" from the Speed pulldown menu to freeze the graph on
|
|
||||||
just one frame while you read it.
|
|
||||||
|
|
||||||
Note that the piano roll chart shows time spent within the frame on
|
|
||||||
the horizontal axis, instead of the vertical axis, as it is on the
|
|
||||||
strip charts. Thus, the guide bars on the piano roll chart are
|
|
||||||
vertical lines instead of horizontal lines, and they may be dragged in
|
|
||||||
from the left or the right sides (instead of from the top or bottom,
|
|
||||||
as on the strip charts). Apart from this detail, these are the same
|
|
||||||
guide bars that appear on the strip charts.
|
|
||||||
|
|
||||||
The piano roll chart may be created from the Graphs pulldown menu.
|
|
||||||
|
|
||||||
Additional threads

If the Panda client has multiple threads that generate PStats data,
the PStats server can open up graphs for these threads as well. Each
separate thread is considered unrelated to the main thread, and may
have the same or an independent frame rate. Each separate thread will
be given its own pulldown menu to create graphs associated with that
thread; these auxiliary thread menus will appear on the menu bar
following the Graphs menu. At the time of this writing, support for
multiple threads within the PStats graph is largely theoretical and
untested.
|
|
||||||
|
|
||||||
|
|
||||||
HOW TO DEFINE YOUR OWN COLLECTORS

The PStats client code is designed to be generic enough to allow users
to define their own collectors to time any arbitrary blocks of code
(or record additional non-time-based data), from either the C++ or the
Python level.

The general idea is to create a PStatCollector for each separate block
of code you wish to time. The name which is passed to the
PStatCollector constructor is a unique identifier: all collectors that
share the same name are deemed to be the same collector.

Furthermore, the collector's name can be used to define the
hierarchical relationship of each collector with other existing
collectors. To do this, prefix the collector's name with the name of
its parent(s), followed by a colon separator. For instance,
PStatCollector("Draw:Flip") defines a collector named "Flip", which is
a child of the "Draw" collector, defined elsewhere.

You can also define a collector as a child of another collector by
giving the parent collector explicitly followed by the name of the
child collector alone, which is handy for dynamically-defined
collectors. For instance, PStatCollector(draw, "Flip") defines the
same collector named above, assuming that draw is the result of the
PStatCollector("Draw") constructor.
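The naming rules above can be illustrated with a small standalone
model. This is only a sketch: ToyCollector and get_collector are
hypothetical names invented for the illustration, not part of Panda,
and the registry merely imitates the documented behavior that equal
names refer to the same collector and that a colon separates a parent
from its child.

```python
class ToyCollector:
    """Stand-in for a collector; this is not the real PStatCollector."""

    def __init__(self, full_name, parent):
        self.full_name = full_name
        # The short name is whatever follows the last colon.
        self.name = full_name.rpartition(":")[2]
        self.parent = parent


_registry = {}  # full name -> ToyCollector, so equal names share one object


def get_collector(*args):
    """Accept ("Draw:Flip") or (parent, "Flip"), mirroring the two forms."""
    if len(args) == 2:
        parent, name = args
        full_name = parent.full_name + ":" + name
    else:
        (full_name,) = args
    if full_name not in _registry:
        # Everything before the last colon names the parent, if any.
        parent_name = full_name.rpartition(":")[0]
        parent = get_collector(parent_name) if parent_name else None
        _registry[full_name] = ToyCollector(full_name, parent)
    return _registry[full_name]


draw = get_collector("Draw")
flip_by_name = get_collector("Draw:Flip")
flip_by_parent = get_collector(draw, "Flip")
```

In this model both spellings return the same object, whose parent is
the "Draw" collector.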
|
|
||||||
|
|
||||||
Note that, because of an unfortunate limitation with the interrogate
parser, statically-defined PStatCollector objects can't be parsed by
interrogate. (In general, interrogate can't parse C++ objects that
are constructed with parameters at the outermost scoping level.) As a
workaround, we usually protect these declarations from interrogate by
using the syntax #ifndef CPPPARSER .. #endif.

Once you have a collector, simply bracket the region of code you wish
to time with collector.start() and collector.stop(). It is important
to ensure that each call to start() is matched by exactly one call to
stop(). If you are programming in C++, it is highly recommended that
you use the PStatTimer class to make these calls automatically, which
guarantees the correct pairing; the PStatTimer's constructor calls
start() and its destructor calls stop(), so you may simply define a
PStatTimer object at the beginning of the block of code you wish to
time. If you are programming in Python, you must call start() and
stop() explicitly.
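In Python, one way to get the same guaranteed pairing that PStatTimer
provides in C++ is to wrap the two calls in a context manager. The
following is a minimal sketch, not something Panda supplies: timed is
a hypothetical helper, and FakeCollector is a test double that only
records calls. The helper assumes nothing about the collector beyond
the start() and stop() methods described above.

```python
from contextlib import contextmanager


@contextmanager
def timed(collector):
    """Call collector.start() on entry and collector.stop() on exit."""
    collector.start()
    try:
        yield collector
    finally:
        # Runs even if the timed block raises, so the pair always matches.
        collector.stop()


class FakeCollector:
    """Test double that records calls; stands in for a real collector."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")

    def stop(self):
        self.calls.append("stop")


fake = FakeCollector()
try:
    with timed(fake):
        raise RuntimeError("simulated failure inside the timed block")
except RuntimeError:
    pass
```

Because stop() sits in a finally clause, an exception or an early
return inside the with block cannot leave the collector started.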
|
|
||||||
|
|
||||||
When you call start() and there was another collector already started,
that previous collector is paused until you call the matching stop()
(at which time the previous collector is resumed). That is, time is
accumulated only towards the collector indicated by the innermost
start() .. stop() pair.

Time accumulated towards any collector is also counted towards that
collector's parent, as defined in the collector's constructor
(described above).

It is important to understand the difference between collectors nested
implicitly by runtime start/stop invocations, and the static hierarchy
implicit in the collector definition. Time is accumulated in parent
collectors according to the statically-defined parents of the
innermost active collector only, without regard to the runtime stack
of paused collectors.

For example, suppose you are in the middle of processing the "Draw"
task and have therefore called start() on the "Draw" collector. While
in the middle of processing this block of code, you call a function
that has its own collector called "Cull:Sort". As soon as you start
the new collector, you have paused the "Draw" collector and are now
accumulating time in the "Cull:Sort" collector. Once this new
collector stops, you will automatically return to accumulating time in
the "Draw" collector. The time spent within the nested "Cull:Sort"
collector will be counted towards the "Cull" total time, not the
"Draw" total time.
|
|
||||||
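The "Draw" and "Cull:Sort" example can be traced with a toy model of
the accounting rule. This is a sketch of the rule as described in the
text, not the PStats implementation: ToyClock is a hypothetical class,
timestamps are passed in by hand instead of being read from a clock,
and collectors are identified by their full colon-separated names.

```python
from collections import defaultdict


class ToyClock:
    """Accumulates time per collector from explicit start/stop timestamps."""

    def __init__(self):
        self.stack = []  # collectors started but not yet stopped
        self.totals = defaultdict(float)
        self.last_time = None

    def _charge(self, now):
        # Charge the elapsed interval to the innermost active collector
        # and to each of its statically-defined parents ("A:B" -> "A").
        # Paused collectors further down the stack receive nothing.
        if self.stack:
            elapsed = now - self.last_time
            name = self.stack[-1]
            while name:
                self.totals[name] += elapsed
                name = name.rpartition(":")[0]
        self.last_time = now

    def start(self, name, now):
        self._charge(now)
        self.stack.append(name)

    def stop(self, name, now):
        self._charge(now)
        popped = self.stack.pop()
        if popped != name:
            raise ValueError("start/stop calls must pair up")


clock = ToyClock()
clock.start("Draw", 0.0)
clock.start("Cull:Sort", 2.0)  # pauses "Draw"
clock.stop("Cull:Sort", 5.0)   # resumes "Draw"
clock.stop("Draw", 6.0)
```

In this trace the three time units spent inside "Cull:Sort" are
credited to "Cull:Sort" and to its static parent "Cull", while "Draw"
receives only the intervals before and after the nested call.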
|
|
||||||
Color and Other Optional Collector Properties

If you do not specify a color for a particular collector, it will be
assigned a random color at runtime. At present, the only way to
specify a color is to modify
panda/src/pstatclient/pStatProperties.cxx, and add a line to the table
for your new collector(s). You can also define additional properties
here such as a suggested initial scale for the graph and, for
non-time-based collectors, a unit name and/or scale factor. The order
in which these collectors are listed in this table is also relevant;
they will appear in the same order on the graphs. The first column
should be set to 1 for your new collectors unless you wish them to be
disabled by default. You must recompile the client (but not the
server) to reflect changes to this table.
|
|
||||||
|
|
||||||
|
|
||||||
HOW IT WORKS (What's actually happening)

The PStats code is divided into two main parts: the client code and
the server code.

The PStats Client

The client code is in panda/src/pstatclient, and is available to run
in every Panda client unless it is compiled out. (It will be compiled
out if OPTIMIZE is set to level 4, unless DO_PSTATS is also explicitly
set to non-empty. It will also be compiled out if NSPR is not
available, since both client and server depend on the NSPR library to
exchange data, even when running the server on the same machine as the
client.)

The client code is designed for minimal runtime overhead when it is
compiled in but not enabled (that is, when the client is not in
contact with a PStats server), as well as when it is enabled (when the
client is in contact with a PStats server). It is also designed for
zero runtime overhead when it is compiled out.

There is one global PStatClient class object, which manages all of the
communications on the client side. Each PStatCollector is simply an
index into an array stored within the PStatClient object, although the
interface is intended to hide this detail from the programmer.

Initially, before the PStatClient has established a connection, calls
to start() and stop() simply return immediately.

When you call PStatClient.connect(), the client attempts to contact
the PStatServer via a TCP connection to the hostname and port named in
the pstats-host and pstats-port Config.prc variables, respectively.
(The default hostname and port are localhost and 5185.) You can also
pass in a specific hostname and/or port to the connect() call. Upon
successful connection and handshake with the server, the PStatClient
sends a list of the available collectors, along with their names,
colors, and hierarchical relationships, on the TCP channel.

Once connected, each call to start() and stop() adds a collector
number and timestamp to an array maintained by the PStatClient. At
the end of each frame, the PStatClient packs this array into a
datagram for shipping to the server. Each start() and stop() event
requires 6 bytes; if the resulting datagram will fit within a UDP
packet (1K bytes, or about 84 start/stop pairs), it is sent via UDP;
otherwise, it is sent on the TCP channel. (Some fraction of the
packets that are eligible for UDP, from 0% to 100%, may be sent via
TCP instead; you can specify this with the pstats-tcp-ratio Config.prc
variable.)
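The size arithmetic behind the UDP/TCP decision can be written out as
a short sketch. The 6-byte event size and the 1K limit come from the
text; choose_transport and its header_bytes parameter are assumptions
made for the illustration, since the text does not state how much
per-datagram overhead there is, and the pstats-tcp-ratio override is
left out.

```python
BYTES_PER_EVENT = 6      # from the text: each start() or stop() event
UDP_LIMIT_BYTES = 1024   # from the text: 1K bytes per UDP packet


def choose_transport(num_events, header_bytes=0):
    """Return "udp" if the frame's events fit in one UDP packet, else "tcp".

    header_bytes is an assumed per-datagram overhead; the text does not
    state its size, so it defaults to 0 here.
    """
    size = header_bytes + num_events * BYTES_PER_EVENT
    return "udp" if size <= UDP_LIMIT_BYTES else "tcp"


small_frame = choose_transport(2 * 84)    # 84 start/stop pairs, 1008 bytes
large_frame = choose_transport(2 * 200)   # 200 pairs, 2400 bytes
```

With 84 pairs the events alone take 1008 bytes, which leaves a few
bytes of the 1K budget unused and is consistent with the "about 84"
figure quoted above.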
|
|
||||||
|
|
||||||
Also, to prevent flooding the network and/or overwhelming the PStats
server, only so many frames of data will be sent per second. This
parameter is controlled by the pstats-max-rate Config.prc variable and
is set to 30 by default. (If the packets are larger than 1K, the max
transmission rate is also automatically reduced further in
proportion.) If the frame rate is higher than this limit, some frames
will simply not be transmitted. The server is designed to cope with
missing frames and will assume missing frames are similar to their
neighbors.
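The effect of a maximum send rate can be sketched with a simple
throttle. This is an illustration of the idea only: FrameThrottle is a
hypothetical class, the real client may implement the limit
differently, and the further reduction for packets larger than 1K is
not modeled.

```python
class FrameThrottle:
    """Lets through at most max_rate frames per second."""

    def __init__(self, max_rate=30.0):
        self.min_interval = 1.0 / max_rate
        self.last_sent = None

    def should_send(self, now):
        # Drop the frame if the previous one went out too recently.
        if self.last_sent is not None and now - self.last_sent < self.min_interval:
            return False
        self.last_sent = now
        return True


throttle = FrameThrottle(max_rate=30.0)
# A client running at 100 frames per second for one second.
sent = [throttle.should_send(frame / 100.0) for frame in range(100)]
```

Frames that arrive sooner than 1/30 of a second after the last
transmitted frame are skipped, so a fast client sends only a subset of
its frames and never exceeds the configured rate.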
|
|
||||||
|
|
||||||
The server does all the work of analyzing the data after that. The
client's next job is simply to clear its array and prepare itself for
the next frame.
|
|
||||||
|
|
||||||
|
|
||||||
The PStats Server

The generic server code is in pandatool/src/pstatserver, and the
GUI-specific server code is in pandatool/src/gtk-stats and
pandatool/src/win-stats, for Unix and Windows, respectively. (There
is also an OS-independent text-stats subdirectory, which builds a
trivial PStats server that presents a scrolling-text interface. This
is mainly useful as a proof of technology rather than as a usable
tool.)

The GUI-specific code is the part that manages the interaction with
the user via the creation of windows and the handling of mouse input,
etc.; most of the real work of interpreting the data is done in the
generic code in the pstatserver directory.

The PStatServer owns all of the connections, and interfaces with the
NSPR library to communicate with the clients. It listens on the
specified port for new connections, using the pstats-port Config.prc
variable to determine the port number (this is the same variable that
specifies the port to the client). Usually you can leave this at its
default value of 5185, but there may be some cases in which that port
is already in use on a particular machine (for instance, maybe someone
else is running another PStats server on another display of the same
machine).

Once a connection is received, the server creates a PStatMonitor
object (this class is specialized for each of the different GUI
variants) that handles all the data for this particular connection.
In the case of the Windows pstats.exe program, each new monitor
instance is represented by a new toplevel window. Multiple monitors
can be active at once.

The work of digesting the data from the client is performed by the
PStatView class, which analyzes the pattern of start and stop
timestamps, along with the relationship data of the various
collectors, and boils it down into a list of the amount of time spent
in each collector per frame.

Finally, a PStatStripChart or PStatPianoRoll class object defines the
actual graph output of colored lines and bars; the generic versions of
these include virtual functions to do the actual drawing (the GUI
specializations of these redefine these methods to make the
appropriate calls).