Mirror of https://github.com/panda3d/panda3d.git
Synced 2025-10-04 10:54:24 -04:00

Commit ab21c536ad (parent 856a53bd5c): documents moved to CMU wiki: www.panda3d.org
@ -1,110 +0,0 @@
HOW TO FIX TRANSPARENCY ISSUES

Usually transparency works as expected in Panda automatically, but
sometimes it just seems to go awry, where a semitransparent object in
the background seems to partially obscure a semitransparent object in
front of it. This is especially likely to happen with large flat
polygon cutouts, or when a transparent object is contained within
another transparent object, or when parts of a transparent object can
be seen behind other parts of the same object.

The fundamental problem is that correct transparency, in the absence
of special hardware support involving extra framebuffer bits, requires
drawing everything in order from farthest away to nearest. This means
sorting each polygon--actually, each pixel, for true correctness--into
back-to-front order before drawing the scene.

It is, of course, impossible to split up every transparent object into
individual pixels or polygons for sorting individually, so Panda sorts
objects at the Geom level, according to the center of the bounding
volume. This works well 95% of the time.

You run into problems with large flat polygons, though, since these
tend to have parts that are far away from the center of their bounding
volume. The bounding-volume sorting is especially likely to go awry
when you have two or more large flats, one close behind the other, and
you view them from slightly off-axis. (Try drawing a picture of the
two flats as seen from the top, and imagine yourself viewing them from
different directions. Also imagine where the centers of the bounding
volumes are.)
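
The picture suggested above can also be sketched in a few lines of
Python. This is only a toy model, not Panda's sorting code: the flats
are line segments in a top-down view, the names and coordinates are
made up for illustration, and the sort key is assumed to be the
straight-line distance from the camera to each flat's center.

```python
import math

def center(flat):
    """Midpoint of a flat given as ((x0, y0), (x1, y1)) in a top-down view."""
    (x0, y0), (x1, y1) = flat
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)

def draw_order(camera, flats):
    """Names of the flats, farthest center first (back-to-front by center)."""
    return sorted(
        flats,
        key=lambda name: math.dist(camera, center(flats[name])),
        reverse=True,
    )

# Two wide flats, one close behind the other; the camera sits near y = 0.
flats = {
    "front": ((-10.0, 5.0), (10.0, 5.0)),  # center at (0, 5)
    "back": ((4.0, 6.0), (24.0, 6.0)),     # center at (14, 6)
}

centered = draw_order((7.0, 0.0), flats)   # camera in front of the overlap
shifted = draw_order((12.0, 0.0), flats)   # camera moved off to one side
```

From the first camera position the order is back, then front, which
is correct. From the second position the center of the back flat is
closer to the camera than the center of the front flat, so the order
flips to front, then back, even though the front flat still covers
part of the back flat from that viewpoint.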

Now, there are a number of solutions to this sort of problem. No one
solution is right for every situation.

First, the easiest thing to do is to use M_dual transparency. This is
a special transparency mode in which the completely invisible parts of
the object aren't drawn into the Z-buffer at all, so that they don't
have any chance of obscuring things behind them. This only works well
if the flats are typical cutouts, where there is a big solid part
(alpha == 1.0) and a big transparent part (alpha == 0.0), and not a
lot of semitransparent parts (0.0 < alpha < 1.0). It is also a
slightly more expensive rendering mode than the default of M_alpha, so
it's not enabled by default in Panda. But egg-palettize will turn it
on automatically for a particular model if it detects textures that
appear to be cutouts of the appropriate nature, which is another
reason to use egg-palettize if you are not using it already.

If you don't use egg-palettize (you really should, you know), you can
just hand-edit the egg files to put the line:

  <Scalar> alpha { dual }

within the <Texture> reference for the textures in question.

A second easy option is to use M_multisample transparency, which
doesn't have any ordering issues at all, but it only looks good on
very high-end cards that have special multisample bits to support
full-screen antialiasing. Also, at present it only looks good on
these high-end cards in OpenGL mode (since our pandadx drivers don't
support M_multisample explicitly right now). But if M_multisample is
not supported by a particular hardware or Panda driver, it
automatically falls back to M_binary, which also doesn't have any
ordering issues, but it always has jagged edges along the cutout edge.
This only works well on texture images that represent cutouts, like
M_dual, above.

If you use egg-palettize, you can engage M_multisample mode by putting
the keyword "ms" on the line with the texture(s). Without
egg-palettize, hand-edit the egg files to put the line:

  <Scalar> alpha { ms }

within the <Texture> reference for the textures in question.

A third easy option is to chop up one or both competing models into
smaller pieces, each of which can be sorted independently by Panda.
For instance, you can split one big polygon into a grid of little
polygons, and the sorting is more likely to be accurate for each piece
(because the center of the bounding volume is closer to the pixels).
You can draw a picture to see how this works. In order to do this
properly, you can't just make it one big mesh of small polygons, since
Panda will make a mesh into a single Geom of tristrips; instead, it
needs to be separate meshes, so that each one will become its own
Geom. Obviously, this is slightly more expensive too, since you are
introducing additional vertices and adding more objects to the sort
list; so you don't want to go too crazy with the smallness of your
polygons.
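
To see why smaller pieces help, here is a rough Python sketch. It is
a toy top-down model rather than anything Panda does internally: two
parallel flats are split into equal pieces, each piece is sorted by
the straight-line distance from the camera to its own center (an
assumed metric), and the function counts how many view rays through
both flats see their two pieces drawn in the wrong order. All names
and coordinates are hypothetical.

```python
import math

CAMERA = (12.0, 0.0)
FRONT = (-10.0, 10.0, 5.0)  # x_min, x_max, y of the nearer flat
BACK = (4.0, 24.0, 6.0)     # x_min, x_max, y of the farther flat

def piece_center(flat, pieces, x):
    """Center of the piece containing x after splitting flat evenly."""
    x_min, x_max, y = flat
    width = (x_max - x_min) / pieces
    index = min(int((x - x_min) / width), pieces - 1)
    return (x_min + (index + 0.5) * width, y)

def wrong_fraction(pieces, samples=200):
    """Fraction of rays through both flats that are sorted incorrectly."""
    wrong = total = 0
    # Scale factor from the back flat's plane to the front flat's plane.
    t = (FRONT[2] - CAMERA[1]) / (BACK[2] - CAMERA[1])
    for i in range(samples):
        # Aim a ray at a sample point on the back flat.
        bx = BACK[0] + (BACK[1] - BACK[0]) * (i + 0.5) / samples
        # Find where that ray crosses the front flat's plane.
        fx = CAMERA[0] + (bx - CAMERA[0]) * t
        if not (FRONT[0] <= fx <= FRONT[1]):
            continue  # the ray misses the front flat, so nothing to get wrong
        total += 1
        d_front = math.dist(CAMERA, piece_center(FRONT, pieces, fx))
        d_back = math.dist(CAMERA, piece_center(BACK, pieces, bx))
        # Correct back-to-front order needs the back piece to sort as farther.
        if d_back <= d_front:
            wrong += 1
    return wrong / total
```

With this particular geometry, wrong_fraction(1) reports that every
overlapping ray is sorted incorrectly, while wrong_fraction(4) leaves
only the rays near a piece boundary wrong. The errors shrink but do
not vanish, which matches the "more likely to be accurate" wording
above.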

A fourth option is simply to disable the depth write on your
transparent objects. This is most effective when you are trying to
represent something that is barely visible, like glass or a soap
bubble. Doing this doesn't improve the likelihood of correct sorting,
but it will tend to make the artifacts of an incorrect sorting less
obvious. You can achieve this by using the transparency option
"blend_no_occlude" in an egg file, or by explicitly disabling the
depth write on a loaded model with node_path.set_depth_write(false).
You should be careful only to disable depth write on the transparent
pieces, and not on the opaque parts.
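
The effect of depth write on a mis-sorted pair can be illustrated
with a single-pixel model in Python. This is a simplified sketch, not
Panda's renderer: colors are single gray levels, blending is ordinary
alpha blending, and the fragment values are invented for the example.

```python
def composite(fragments, background, depth_write):
    """Blend fragments over a background gray level, in the order given.

    Each fragment is (depth, gray, alpha); a smaller depth is nearer.
    """
    color = background
    nearest = float("inf")  # depth buffer value for this one pixel
    for depth, gray, alpha in fragments:
        if depth >= nearest:
            continue  # depth test fails, so the fragment is discarded
        color = gray * alpha + color * (1.0 - alpha)
        if depth_write:
            nearest = depth
    return color

near = (1.0, 1.0, 0.25)  # nearer pane: white, 25% opaque
far = (2.0, 0.0, 0.25)   # farther pane: black, 25% opaque

correct = composite([far, near], 0.5, depth_write=True)
mis_sorted_write = composite([near, far], 0.5, depth_write=True)
mis_sorted_no_write = composite([near, far], 0.5, depth_write=False)
```

Here the correct result is 0.53125. When the panes are drawn in the
wrong order with depth write on, the far pane is rejected entirely
and the pixel comes out as 0.625. With depth write off, the far pane
still contributes and the pixel comes out as 0.46875, which is still
wrong but closer to the correct value.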

A final option is to make explicit sorting requests to Panda. This is
often the last resort because it is more difficult, and doesn't
generalize well, but it does have the advantage of not adding
additional performance penalties to your scene. It only works well
when the transparent objects can be sorted reliably with respect to
everything else behind them. For instance, clouds in the sky can
reliably be drawn before almost everything else in the scene, except
the sky itself. Similarly, a big flat that is up against an opaque
wall can reliably be drawn after all of the opaque objects, but before
any other transparent object, regardless of where the camera happens
to be placed in the scene. See howto.control_render_order.txt for
more information about explicitly controlling the rendering order.
@ -1,243 +0,0 @@
MULTITEXTURE OVERVIEW

Modern graphics cards are capable of applying more than one texture
image at once to geometry as they render polygons. This capability is
referred to as multitexture.

The textures are applied in a pipeline fashion, where the output of
each texturing operation is used as the input to the next. A
particular graphics card will have a certain number of texture units
dedicated to this function, which limits the number of textures that
may be pipelined in this way.

To apply a texture in Panda, you must have a Texture object (which you
might have loaded from disk, or extracted from a model) and a
TextureStage object (which you can create on-the-fly). The primary
call to add a texture to the pipeline is:

  nodepath.set_texture(texture_stage, texture);

This adds the indicated texture into the pipeline for all the geometry
at nodepath level and below, associating it with the indicated
TextureStage object.

The purpose of the TextureStage object is to represent a single stage
in the texture pipeline. You can create as many TextureStage objects
as you like; each one can associate a different texture, and each of
those textures will be applied together (within the limits of your
hardware). If you want to change out a particular texture within the
pipeline without disturbing the others, keep a handle to the
TextureStage object that you used for that stage, and issue a
set_texture() call using the same TextureStage object and a different
texture--this replaces the texture that you assigned previously. (You
may do this on the same NodePath, or on a lower NodePath level to
override the texture specified from above.)

To undo a set_texture() call for a particular stage or for all stages,
do:

  nodepath.clear_texture(texture_stage)
  nodepath.clear_texture()

Don't confuse this with the calls to actively disable a particular
texture stage or to disable texturing altogether, which are:

  nodepath.set_texture_off(texture_stage)
  nodepath.set_texture_off()

The difference between the two is that set_texture_off() inserts a
command into the scene graph to specifically turn off the texture
associated with the indicated texture stage, while clear_texture()
simply removes the texture stage from this node's list of assigned
textures. Use clear_texture() to undo a previous call to
set_texture() on a given node. You need set_texture_off() more
rarely; you might use this when you want to override a particular
setting from above to turn off just one particular stage of the
pipeline (for instance, you may have a set_texture() applied at the
root of a scene to apply a particular effect to everything in the
scene, but use set_texture_off() on one particular model for which you
don't want that effect applied).

There is also a default TextureStage object that is used for all of
the old single-texture Panda interfaces (like
nodepath.set_texture(texture)). It is also the TextureStage that will
be used to apply Textures onto models (e.g. egg files and/or bam
files) that do not specify the use of multitexturing. This default
TextureStage can be accessed by TextureStage::get_default().

There are a number of different blend modes that you may specify for
each texture stage in the pipeline; these are specified with
texture_stage.set_mode(). The mode may be one of:

  TextureStage::M_modulate
    Multiplies the incoming color by the texture color. This
    allows the texture to darken, but not brighten, the incoming
    color.

  TextureStage::M_add
    Adds the incoming color and the texture color. This allows the
    texture to brighten, but not darken, the incoming color, and
    tends to lead to bright, desaturated colors.

  TextureStage::M_decal
    Shows the texture color where the texture is alpha = 1, and the
    incoming color where the texture is alpha = 0. This can be
    used to paint a texture on top of the existing texture.

  TextureStage::M_blend
    Defined for grayscale textures only. You can specify an
    arbitrary color as a separate parameter with
    texture_stage.set_color(), and then the result of M_blend is to
    produce the specified color where the texture is white, and
    the incoming color where the texture is black. This can be
    used to paint arbitrary color stripes or a similar effect over
    an existing texture.

  TextureStage::M_blend_color_scale
    This is identical to M_blend, except that the blend color,
    specified by texture_stage.set_color(), is also modified by the
    color scale applied to the scene graph.

  TextureStage::M_replace
    Completely replaces the incoming color with the texture color;
    probably not terribly useful in a multitexture environment,
    except for the first texture stage.

  TextureStage::M_combine
    This mode supersedes most of the above with a more powerful
    collection of options, including signed add and/or subtract,
    and linear interpolation between two different colors using a
    third parameter. You can specify the input(s) as one or more
    combinations of a specified constant color, or the previous
    texture in the pipeline, or the incoming color. However, very
    old graphics drivers may not support this mode.

    Since combine mode has a number of associated parameters, you
    enable this mode by calling set_combine_rgb() and
    set_combine_alpha() with the appropriate parameters; it's not
    necessary to call set_mode(M_combine). A complete description
    of this mode is not given here.
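
The arithmetic behind the simpler modes can be sketched in Python as
follows. The text above only describes the end points of each mode,
so the in-between formulas here are an assumption: they follow the
conventional fixed-function texture blending equations, operate on
RGB only, and use hypothetical function names rather than anything
from Panda.

```python
def clamp(value):
    """Keep a color channel within the displayable range [0, 1]."""
    return max(0.0, min(1.0, value))

def modulate(incoming, texture):
    """Multiply channel by channel; the result can only get darker."""
    return tuple(i * t for i, t in zip(incoming, texture))

def add(incoming, texture):
    """Add channel by channel; the result can only get brighter."""
    return tuple(clamp(i + t) for i, t in zip(incoming, texture))

def decal(incoming, texture, texture_alpha):
    """Texture color where alpha is 1, incoming color where alpha is 0."""
    return tuple(
        t * texture_alpha + i * (1.0 - texture_alpha)
        for i, t in zip(incoming, texture)
    )

def blend(incoming, gray, blend_color):
    """Blend color where the grayscale texture is white (1.0),
    incoming color where it is black (0.0)."""
    return tuple(c * gray + i * (1.0 - gray) for i, c in zip(incoming, blend_color))

def replace(incoming, texture):
    """Ignore the incoming color entirely."""
    return tuple(texture)

orange = (1.0, 0.5, 0.0)
half_gray = (0.5, 0.5, 0.5)
darkened = modulate(orange, half_gray)           # (0.5, 0.25, 0.0)
brightened = add((0.75, 0.5, 0.0), (0.5, 0.25, 0.0))  # red channel clamps at 1.0
```

Chaining these functions, with the output of one passed as the
incoming color of the next, mirrors the pipeline described at the top
of this document.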

Some of the above modes are very order-dependent. For this reason,
you may use texture_stage.set_sort() to specify the order in which
textures should be applied, using an integer sort parameter. When
Panda collects the textures together for rendering a particular piece
of geometry, it will sort them in order from lowest sort value to
highest sort value. The default sort value is 0. Thus, you can
specify a large positive number to apply a texture on top of existing
textures, or a large negative number to apply it beneath existing
textures.

The egg loader will create texture stages automatically in the
presence of a multitexturing specification in the egg file, and it
will assign to these stages sort values in multiples of 10: the lowest
texture stage will have a sort value of 0, the next 10, the next 20,
and so on.

Since the number of texture units available on the hardware is
limited, and is usually a small number (and some hardware doesn't
support multitexturing at all, so effectively has only one texture
unit), Panda needs some rule for selecting the subset of textures to
render when you have requested more texture stages than are available.
For this Panda relies on the texture_stage.set_priority() value, which
is an integer value that represents the importance of this particular
texture. If the requested textures will not fit on the available
number of texture units, Panda will select the n textures with the
highest priority (and then sort them into order by the set_sort()
parameter). Between two textures with the same priority, Panda will
prefer the one with the lower sort value. The default priority is 0.
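
The selection rule just described can be written out as a short
Python sketch. It models only the rule as stated in the paragraph
above; the Stage record, the function name, and the stage names are
hypothetical and are not part of Panda's interface.

```python
from collections import namedtuple

# A stand-in for the two TextureStage settings that matter here.
Stage = namedtuple("Stage", "name sort priority")

def select_stages(stages, max_units):
    """Pick the stages that fit on the hardware, in application order."""
    # Highest priority first; between equal priorities, lower sort wins.
    by_importance = sorted(stages, key=lambda s: (-s.priority, s.sort))
    chosen = by_importance[:max_units]
    # The surviving stages are applied from lowest to highest sort value.
    return sorted(chosen, key=lambda s: s.sort)

requested = [
    Stage("base", sort=0, priority=0),
    Stage("detail", sort=10, priority=0),
    Stage("lightmap", sort=20, priority=5),
    Stage("gloss", sort=30, priority=0),
]

two_units = [s.name for s in select_stages(requested, 2)]
three_units = [s.name for s in select_stages(requested, 3)]
```

With two texture units, the high-priority lightmap is kept along with
the base stage, which wins the tie among the priority-0 stages
because it has the lowest sort value. The result is applied in the
order base, lightmap.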

If you need to know the actual limit, you can query your available
number of texture stages from the GraphicsStateGuardian, with the call
gsg->get_max_texture_stages() (e.g. from Python, call
base.win.getGsg().getMaxTextureStages()).


TEXTURE COORDINATES

In many cases, all of the texture stages need to use the same set of
texture coordinates, which is the default behavior. You can also
apply a different texture matrix on some texture stages to apply a
linear transformation to the texture coordinates (for instance, to
position a decal on the surface).

  nodepath.set_tex_offset(texture_stage, u_offset, v_offset);
  nodepath.set_tex_scale(texture_stage, u_scale, v_scale);
  nodepath.set_tex_rotate(texture_stage, degrees);
  nodepath.set_tex_transform(texture_stage, general_transform);

These operations accumulate through nested nodes just like standard
scene graph transforms. In fact, you can get and set relative texture
transforms:

  rel_offset = nodepath.get_tex_offset(other, texture_stage);
  nodepath.set_tex_scale(other, texture_stage, u_scale, v_scale);
  (etc.)

You may use LerpIntervals to lerp texture matrices, but only
indirectly: there are no interval types that operate directly on a
texture matrix. Instead, you can set up a TexProjectorEffect to bind
a node's transform to the texture matrix:

  nodepath.set_tex_projector(texture_stage, from, to)

Where "from" and "to" are arbitrary NodePaths. The TexProjectorEffect
will measure the relative transform between "from" and "to" each frame
and apply it to the nodepath's texture matrix. Once this is in place,
you may create a LerpPosInterval, or any other Panda construct, to
adjust either the "from" or the "to" NodePath, which will thus
indirectly adjust the texture matrix by the same amount.


Sometimes, a texture stage may need to use a completely different set
of texture coordinates, for instance as provided by the artist who
generated the model. Panda allows a model to store any number of
different sets of texture coordinates on its vertices, each with a
unique name. You can associate any texture stage with any set of
texture coordinates you happen to have available on your model:

  texture_stage.set_texcoord_name(name)


Finally, you may need to generate texture coordinates for a particular
texture stage on the fly. This is particularly useful, for instance,
to apply reflection maps, e.g. sphere maps or cube maps. To enable
this effect, use:

  nodepath.set_tex_gen(texture_stage, mode)

Where mode is one of the enumerated types named by TexGenAttrib::Mode;
at present, this may be any of M_world_position,
M_object_position, M_eye_position, or M_sphere_map. The first three
modes simply apply the X, Y, Z coordinates of the vertex to its U, V
texture coordinates (a texture matrix may then be applied to transform
the generated texture coordinates into the particular U, V coordinate
space that you require). The remaining mode generates texture
coordinates appropriate to a reflection map of the corresponding type,
based on the position and normal of each vertex, relative to the
camera.


The texture generation mode and the tex projector mode may be combined
to provide hardware-assisted projective texturing, where a texture is
applied to geometry as if it were projected from a particular point in
space, like a slide projector. This is particularly useful for
applying shadow maps or flashlight effects, for instance. There is a
convenience function on NodePath that automatically makes the three
separate calls needed to enable projective texturing:

  nodepath.project_texture(texture_stage, texture, projector);

Where projector is a NodePath that references a LensNode. The
indicated texture is applied to the geometry at nodepath and below, as
if it were projected from the indicated projector. The lens
properties such as field of view may be adjusted on the fly to adjust
the projection.

(Note that Panda also provides a ProjectionScreen object, which
performs an effect very similar to the project_texture() call, except
that it is performed entirely in the CPU, whereas project_texture()
will offload the work onto the graphics card if the card supports
this. This may or may not result in a performance improvement over
ProjectionScreen, depending on the nature of your scene and your CPU
load versus your graphics card load.)
@ -1,403 +0,0 @@
QUICK INTRODUCTION

PStats is Panda's built-in performance analysis tool. It can graph
frame rate over time, and can further graph the work spent within each
frame into user-defined subdivisions of the frame (for instance, app,
cull and draw), and thus can be an invaluable tool in identifying
performance bottlenecks. It can also show frame-based data that
reflects any arbitrary quantity other than time intervals, for
instance, texture memory in use or number of vertices drawn.

The performance graphs may be drawn on the same computer that is
running the Panda client, or they may be drawn on another computer on
the same LAN, which is useful for analyzing fullscreen applications.
The remote computer need not be running the same operating system as
the client computer.

To use PStats, you first need to build the PStats server program,
which is part of the Pandatool tree (it's called pstats.exe on
Windows, and gtk-stats on a Unix platform). Start by running the
PStats server program (it runs in the background), and then start your
Direct/Panda client with the following in your Config.prc file:

  want-pstats 1

Or, at runtime, issue the Python command:

  PStatClient.connect()

Or if you're running pview, press shift-S.

Any of the above will contact your running PStats server program,
which will proceed to open a window and start a running graph of your
client's performance. If you are running the server on a different
machine than the client, add the pstats-host variable to your client's
Config.prc file, naming the hostname or IP address of the machine
running the PStats server.

If you are developing Python code, you may be interested in reporting
the relative time spent within each Python task (by subdividing the
total time spent in Python, as reported under "Show Code"). To do
this, add the following lines to your Config.prc file before you start
ShowBase:

  task-timer-verbose 1
  pstats-tasks 1


THE PSTATS SERVER (The user interface)

The GUI for managing the graphs and drilling down to view more detail
is entirely controlled by the PStats server program. At the time of
this writing, there are two different versions of the PStats server,
one for Unix called gtk-stats and one for Windows called simply
pstats. The interfaces are similar but not identical; the following
paragraphs describe the Windows version.

When you run pstats.exe, it adds a program to the taskbar but does not
immediately open a window. The program name is typically "PStats
5185", showing the default PStats TCP port number of 5185; see "HOW IT
WORKS" below for more details about the TCP communication system. For
the most part you don't need to worry about the port number, as long
as server and client agree (and the port is not already being used by
another application).

Each time a client connects to the PStats server, a new monitor window
is created. This monitor window owns all of the graphs that you
create to view the performance data from that particular connection.
Initially, a strip chart showing the frame time of the main thread is
created by default; you can create additional graphs by selecting from
the Graphs pulldown menu.

Time-based Strip Charts

This is the graph type you will use most frequently to examine
performance data. The horizontal axis represents the passage of time;
each frame is represented as a vertical slice on the graph. The
overall height of the colored bands represents the total amount of
time spent on each frame; within the frame, the time is further
divided into the primary subdivisions represented by different color
bands (and labeled on the left). These subdivisions are called
"collectors" in the PStats terminology, since they represent time
collected by different tasks.

Normally, the three primary collectors are App, Cull, and Draw, the
three stages of the graphics pipeline. Atop these three colored
collectors is the label "Frame", which represents any remaining time
spent in the frame that was not specifically allocated to one of the
three child collectors (normally, there should not be significant time
reported here).

The frame time in milliseconds, averaged over the past three seconds,
is drawn above the upper right corner of the graph. The labels on the
guide bars on the right are also shown in milliseconds; if you prefer
to think about a target frame rate rather than an elapsed time in
milliseconds, you may find it useful to select "Hz" from the Units
pulldown menu, which changes the time units accordingly.
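
The relationship between the two units is a simple reciprocal. As a
small worked example in Python (the function names are hypothetical
and are not part of PStats):

```python
def ms_to_hz(frame_ms):
    """Frame rate in Hz for a given frame time in milliseconds."""
    return 1000.0 / frame_ms

def hz_to_ms(rate_hz):
    """Frame time in milliseconds for a given frame rate in Hz."""
    return 1000.0 / rate_hz

budget_ms = hz_to_ms(50.0)   # time available per frame at 50 Hz
rate_hz = ms_to_hz(20.0)     # frame rate when each frame takes 20 ms
```

A frame that takes 20 ms corresponds to 50 Hz, so any frame whose
colored bands rise above the 20 ms guide bar is running below 50 Hz.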

The running Panda client suggests its target frame rate, as well as
the initial vertical scale of the graph (that is, the height of the
colored bars). You can change the scale freely by clicking within the
graph itself and dragging the mouse up or down as necessary. One of
the horizontal guide bars is drawn in a lighter shade of gray; this
one represents the actual target frame rate suggested by the client.
The other, darker, guide bars are drawn automatically at harmonic
subdivisions of the target frame rate. You can change the target frame
rate with the Config.prc variable pstats-target-frame-rate on the
client.

You can also create any number of user-defined guide bars by dragging
them into the graph from the gray space immediately above or below the
graph. These are drawn in a dashed blue line. It is sometimes useful
to place one of these to mark a performance level so it may be
compared to future values (or to alternate configurations).

The primary collectors labeled on the left might themselves be further
subdivided, if the data is provided by the client. For instance, App
is often divided into Show Code, Animation, and Collisions, where Show
Code is the time spent executing any Python code, Animation is the
time used to compute any animated characters, and Collisions is the
time spent in the collision traverser(s).

To see any of these further breakdowns, double-click on the
corresponding colored label (or on the colored band within the graph
itself). This narrows the focus of the strip chart from the overall
frame to just the selected collector, which has two advantages.
Firstly, it may be easier to observe the behavior of one particular
collector when it is drawn alone (as opposed to being stacked on top
of some other color bars), and the time in the upper-right corner will
now reflect the total time spent within just this collector.
Secondly, if there are further breakdowns to this collector, they will
now be shown as further colored bars. As in the Frame chart, the
topmost label is the name of the parent collector, and any time shown
in this color represents time allocated to the parent collector that
is not accounted for by any of the child collectors.

You can further drill down by double-clicking on any of the new
labels; or double-click on the top label, or the white part of the
graph, to return to the previous level.

Value-based Strip Charts

There are other strip charts you may create, which show arbitrary
kinds of data per frame other than elapsed time. These can only be
accessed from the Graphs pulldown menu, and include things such as
texture memory in use and vertices drawn. They behave similarly to
the time-based strip charts described above.

Piano Roll Charts

This graph is used less frequently, but when it is needed it is a
valuable tool to reveal exactly how the time is spent within a frame.
The PStats server automatically collects together all the time spent
within each collector and shows it as a single total, but in reality
it may not all have been spent in one continuous block of time.

For instance, when Panda draws each display region in single-threaded
mode, it performs a cull traversal followed by a draw traversal for
each display region. Thus, if your Panda client includes multiple
display regions, it will alternate its time spent culling and drawing
as it processes each of them. The strip chart, however, reports only
the total cull time and draw time spent.

Sometimes you really need to know the sequence of events in the frame,
not just the total time spent in each collector. The piano roll chart
shows this kind of data. It is so named because it is similar to the
paper music roll for an old-style player piano, with holes punched
down the roll for each note that is to be played. The longer the
hole, the longer the piano key is held down. (Think of the chart as
rotated 90 degrees from an actual piano roll. A player piano roll
plays from bottom to top; the piano roll chart reads from left to
right.)

Unlike a strip chart, a piano roll chart does not show trends; the
chart shows only the current frame's data. The horizontal axis shows
time within the frame, and the individual collectors are stacked up in
an arbitrary ordering along the vertical axis.
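
The difference between the two views can be illustrated with a few
lines of Python. This is only a sketch of the idea, not the PStats
data format: the timings are invented, and the function names are
hypothetical.

```python
from collections import defaultdict

# (collector, start_ms, end_ms) for one frame with two display regions.
events = [
    ("Cull", 0.0, 2.0),
    ("Draw", 2.0, 5.0),
    ("Cull", 5.0, 6.0),
    ("Draw", 6.0, 10.0),
]

def strip_chart_totals(events):
    """One number per collector: the view a strip chart gives of a frame."""
    totals = defaultdict(float)
    for name, start, end in events:
        totals[name] += end - start
    return dict(totals)

def piano_roll_rows(events):
    """One row per collector holding every interval in which it was active."""
    rows = defaultdict(list)
    for name, start, end in events:
        rows[name].append((start, end))
    return dict(rows)

totals = strip_chart_totals(events)
rows = piano_roll_rows(events)
```

The totals say only that Cull took 3 ms and Draw took 7 ms. The rows
keep the individual intervals, which show that culling and drawing
alternated within the frame.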

The time spent within the frame is drawn from left to right; at any
given time, the collector(s) that are active will be drawn with a
horizontal bar. You can observe the CPU behavior within a frame by
reading the graph from left to right. You may find it useful to
select "pause" from the Speed pulldown menu to freeze the graph on
just one frame while you read it.

Note that the piano roll chart shows time spent within the frame on
the horizontal axis, instead of the vertical axis, as it is on the
strip charts. Thus, the guide bars on the piano roll chart are
vertical lines instead of horizontal lines, and they may be dragged in
from the left or the right sides (instead of from the top or bottom,
as on the strip charts). Apart from this detail, these are the same
guide bars that appear on the strip charts.

The piano roll chart may be created from the Graphs pulldown menu.

Additional threads

If the Panda client has multiple threads that generate PStats data,
the PStats server can open up graphs for these threads as well. Each
separate thread is considered unrelated to the main thread, and may
have the same or an independent frame rate. Each separate thread will
be given its own pulldown menu to create graphs associated with that
thread; these auxiliary thread menus will appear on the menu bar
following the Graphs menu. At the time of this writing, support for
multiple threads within the PStats graph is largely theoretical and
untested.


HOW TO DEFINE YOUR OWN COLLECTORS

The PStats client code is designed to be generic enough to allow users
to define their own collectors to time any arbitrary blocks of code
(or record additional non-time-based data), from either the C++ or the
Python level.

The general idea is to create a PStatCollector for each separate block
of code you wish to time. The name which is passed to the
PStatCollector constructor is a unique identifier: all collectors that
share the same name are deemed to be the same collector.

Furthermore, the collector's name can be used to define the
hierarchical relationship of each collector with other existing
collectors. To do this, prefix the collector's name with the name of
its parent(s), followed by a colon separator. For instance,
PStatCollector("Draw:Flip") defines a collector named "Flip", which is
a child of the "Draw" collector, defined elsewhere.
|
||||
|
||||
You can also define a collector as a child of another collector by
|
||||
giving the parent collector explicitly followed by the name of the
|
||||
child collector alone, which is handy for dynamically-defined
|
||||
collectors. For instance, PStatCollector(draw, "Flip") defines the
|
||||
same collector named above, assuming that draw is the result of the
|
||||
PStatCollector("Draw") constructor.
|
||||
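The two naming forms described above can be modeled in a few lines of
Python. This is only an illustrative sketch, not Panda's
implementation: the CollectorRegistry class and its get() method are
hypothetical names invented for this example. It shows that a
colon-separated name and an explicit parent resolve to the same
collector, and that parents are created on demand.

```python
class CollectorRegistry:
    """Toy model: maps colon-separated collector names to array indices."""

    def __init__(self):
        self.names = []      # index -> full name, e.g. "Draw:Flip"
        self.parents = []    # index -> parent index, or None for a root
        self._by_name = {}   # full name -> index

    def get(self, name, parent=None):
        """Return the index for `name`, creating it (and any parents) once.

        `parent` is an optional index returned by an earlier get() call,
        mirroring the PStatCollector(draw, "Flip") form.
        """
        if parent is not None:
            name = self.names[parent] + ":" + name
        if name in self._by_name:
            return self._by_name[name]      # same name, same collector
        head, sep, _ = name.rpartition(":")
        parent_index = self.get(head) if sep else None
        index = len(self.names)
        self.names.append(name)
        self.parents.append(parent_index)
        self._by_name[name] = index
        return index


registry = CollectorRegistry()
draw = registry.get("Draw")
flip_by_name = registry.get("Draw:Flip")
flip_by_parent = registry.get("Flip", parent=draw)
```

Both flip_by_name and flip_by_parent refer to the same entry, whose
recorded parent is the "Draw" entry.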
|
||||
Note that, because of an unfortunate limitation with the interrogate
parser, statically-defined PStatCollector objects can't be parsed by
interrogate. (In general, interrogate can't parse C++ objects that
are constructed with parameters at the outermost scoping level.) As a
workaround, we usually protect these declarations from interrogate by
using the syntax #ifndef CPPPARSER .. #endif.

Once you have a collector, simply bracket the region of code you wish
to time with collector.start() and collector.stop(). It is important
to ensure that each call to start() is matched by exactly one call to
stop(). If you are programming in C++, it is highly recommended that
you use the PStatTimer class to make these calls automatically, which
guarantees the correct pairing; the PStatTimer's constructor calls
start() and its destructor calls stop(), so you may simply define a
PStatTimer object at the beginning of the block of code you wish to
time. If you are programming in Python, you must call start() and
stop() explicitly.
|
||||
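In Python, one way to get a pairing guarantee similar to the one
PStatTimer gives in C++ is a small context manager. The sketch below
is not part of Panda: timed() is a hypothetical helper, and
FakeCollector is a stand-in used only so the example runs without
Panda installed. It assumes nothing about a real collector beyond the
start() and stop() methods described above.

```python
from contextlib import contextmanager


@contextmanager
def timed(collector):
    """Call collector.start() on entry and collector.stop() on exit."""
    collector.start()
    try:
        yield collector
    finally:
        collector.stop()    # runs even if the timed block raises


class FakeCollector:
    """Stand-in for a PStatCollector so the sketch runs without Panda."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")

    def stop(self):
        self.calls.append("stop")


collector = FakeCollector()
try:
    with timed(collector):
        raise RuntimeError("simulated failure inside the timed block")
except RuntimeError:
    pass
```

Because stop() sits in a finally clause, every start() is matched by
exactly one stop() even when the timed code raises an exception.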
|
||||
When you call start() while another collector is already running,
that previous collector is paused until you call the matching stop()
(at which time the previous collector is resumed). That is, time is
accumulated only towards the collector indicated by the innermost
start() .. stop() pair.

Time accumulated towards any collector is also counted towards that
collector's parent, as defined in the collector's constructor
(described above).

It is important to understand the difference between collectors nested
implicitly by runtime start/stop invocations, and the static hierarchy
implicit in the collector definition. Time is accumulated in parent
collectors according to the statically-defined parents of the
innermost active collector only, without regard to the runtime stack
of paused collectors.

For example, suppose you are in the middle of processing the "Draw"
task and have therefore called start() on the "Draw" collector. While
in the middle of processing this block of code, you call a function
that has its own collector called "Cull:Sort". As soon as you start
the new collector, you have paused the "Draw" collector and are now
accumulating time in the "Cull:Sort" collector. Once this new
collector stops, you will automatically return to accumulating time in
the "Draw" collector. The time spent within the nested "Cull:Sort"
collector will be counted towards the "Cull" total time, not the
"Draw" total time.
|
||||
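The "Draw" and "Cull:Sort" example can be traced with a toy model of
the accounting rules. This is a sketch under stated assumptions, not
Panda's code: ToyClient is a hypothetical class, and timestamps are
passed in explicitly (a real collector would read a clock) so that the
numbers are easy to follow.

```python
class ToyClient:
    """Toy model of how start()/stop() attribute time to collectors."""

    def __init__(self):
        self.totals = {}     # collector name -> accumulated time
        self._stack = []     # runtime stack of started collectors
        self._last = None    # timestamp of the most recent start/stop

    def _charge(self, now):
        # Credit the elapsed time to the innermost active collector and
        # to its statically-defined parents (taken from its name only).
        if self._stack:
            elapsed = now - self._last
            parts = self._stack[-1].split(":")
            for depth in range(1, len(parts) + 1):
                key = ":".join(parts[:depth])
                self.totals[key] = self.totals.get(key, 0) + elapsed
        self._last = now

    def start(self, name, now):
        self._charge(now)            # pauses whatever was running
        self._stack.append(name)

    def stop(self, name, now):
        assert self._stack and self._stack[-1] == name, "unbalanced stop()"
        self._charge(now)
        self._stack.pop()            # resumes the previous collector


client = ToyClient()
client.start("Draw", 0)
client.start("Cull:Sort", 2)     # "Draw" is paused here
client.stop("Cull:Sort", 5)      # "Draw" resumes
client.stop("Draw", 6)
```

In this trace the three time units spent in "Cull:Sort" are credited
to "Cull:Sort" and to its static parent "Cull". "Draw" receives only
the three units during which it was the innermost collector, even
though it was on the runtime stack the whole time.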
|
||||
Color and Other Optional Collector Properties

If you do not specify a color for a particular collector, it will be
assigned a random color at runtime. At present, the only way to
specify a color is to modify
panda/src/pstatclient/pStatProperties.cxx, and add a line to the table
for your new collector(s). You can also define additional properties
here such as a suggested initial scale for the graph and, for
non-time-based collectors, a unit name and/or scale factor. The order
in which these collectors are listed in this table is also relevant;
they will appear in the same order on the graphs. The first column
should be set to 1 for your new collectors unless you wish them to be
disabled by default. You must recompile the client (but not the
server) to reflect changes to this table.


HOW IT WORKS (What's actually happening)

The PStats code is divided into two main parts: the client code and
the server code.

The PStats Client

The client code is in panda/src/pstatclient, and is available to run
in every Panda client unless it is compiled out. (It will be compiled
out if OPTIMIZE is set to level 4, unless DO_PSTATS is also explicitly
set to non-empty. It will also be compiled out if NSPR is not
available, since both client and server depend on the NSPR library to
exchange data, even when running the server on the same machine as the
client.)

The client code is designed for minimal runtime overhead when it is
compiled in but not enabled (that is, when the client is not in
contact with a PStats server), as well as when it is enabled (when the
client is in contact with a PStats server). It is also designed for
zero runtime overhead when it is compiled out.

There is one global PStatClient class object, which manages all of the
communications on the client side. Each PStatCollector is simply an
index into an array stored within the PStatClient object, although the
interface is intended to hide this detail from the programmer.

Initially, before the PStatClient has established a connection, calls
to start() and stop() simply return immediately.

When you call PStatClient.connect(), the client attempts to contact
the PStatServer via a TCP connection to the hostname and port named in
the pstats-host and pstats-port Config.prc variables, respectively.
(The default hostname and port are localhost and 5185.) You can also
pass in a specific hostname and/or port to the connect() call. Upon
successful connection and handshake with the server, the PStatClient
sends a list of the available collectors, along with their names,
colors, and hierarchical relationships, on the TCP channel.
|
||||
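For reference, the defaults named above correspond to a Config.prc
fragment like the following. Only the two variables and values
mentioned in the text are shown; you would normally change them only
when the server runs on another machine or listens on another port.

```
pstats-host localhost
pstats-port 5185
```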
|
||||
Once connected, each call to start() and stop() adds a collector
number and timestamp to an array maintained by the PStatClient. At
the end of each frame, the PStatClient packs this array into a
datagram for shipping to the server. Each start() and stop() event
requires 6 bytes; if the resulting datagram will fit within a UDP
packet (1K bytes, or about 84 start/stop pairs), it is sent via UDP;
otherwise, it is sent on the TCP channel. (Some fraction of the
packets that are eligible for UDP, from 0% to 100%, may be sent via
TCP instead; you can specify this with the pstats-tcp-ratio Config.prc
variable.)
|
||||
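The size arithmetic and the UDP/TCP decision can be sketched as
follows. This is a simplified model rather than the client's actual
logic: frame_bytes() and choose_transport() are hypothetical names,
the per-datagram overhead is not given in the text and is left as a
parameter defaulting to zero, and the random choice implied by
pstats-tcp-ratio is replaced by an explicit roll argument so that the
result is repeatable.

```python
EVENT_BYTES = 6          # per start() or stop() event, from the text
UDP_LIMIT_BYTES = 1024   # the "1K bytes" limit, from the text


def frame_bytes(num_events, header_bytes=0):
    """Size of one frame's datagram; header_bytes is an assumed overhead."""
    return header_bytes + num_events * EVENT_BYTES


def choose_transport(num_events, tcp_ratio=0.0, roll=0.0, header_bytes=0):
    """Return "udp" or "tcp" for a frame with num_events events.

    tcp_ratio models pstats-tcp-ratio (0.0 to 1.0); roll is a number in
    [0, 1) that the caller would normally draw at random.
    """
    if frame_bytes(num_events, header_bytes) > UDP_LIMIT_BYTES:
        return "tcp"                 # too large for a UDP packet
    if roll < tcp_ratio:
        return "tcp"                 # UDP-eligible, diverted to TCP
    return "udp"


# Upper bound on start/stop pairs per UDP packet with zero overhead.
max_pairs = UDP_LIMIT_BYTES // (2 * EVENT_BYTES)
```

With zero overhead the bound works out to 85 pairs (12 bytes each);
the figure of about 84 quoted above is consistent with a small amount
of per-datagram overhead.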
|
||||
Also, to prevent flooding the network and/or overwhelming the PStats
server, only so many frames of data will be sent per second. This
parameter is controlled by the pstats-max-rate Config.prc variable and
is set to 30 by default. (If the packets are larger than 1K, the max
transmission rate is also automatically reduced further in
proportion.) If the frame rate is higher than this limit, some frames
will simply not be transmitted. The server is designed to cope with
missing frames and will assume missing frames are similar to their
neighbors.
|
||||
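One plausible way to model this throttling is a minimum gap between
transmitted frames. The text does not describe the client's actual
algorithm, so everything here is an assumption made for illustration:
the FrameRateLimiter class is hypothetical, and "reduced in
proportion" is interpreted as stretching the minimum gap by the ratio
of the frame size to 1K.

```python
class FrameRateLimiter:
    """Toy model of capping how many frames per second are transmitted."""

    def __init__(self, max_rate=30, packet_limit=1024):
        self.max_rate = max_rate            # frames per second
        self.packet_limit = packet_limit    # bytes
        self._last_sent_us = None

    def should_send(self, now_us, size_bytes):
        """Decide whether the frame stamped now_us (microseconds) is sent."""
        if self._last_sent_us is not None:
            elapsed_us = now_us - self._last_sent_us
            # Required gap is (1 / max_rate) seconds, stretched in
            # proportion for frames larger than packet_limit.  Integer
            # arithmetic avoids floating-point rounding at the boundary.
            needed = 1_000_000 * max(size_bytes, self.packet_limit)
            if elapsed_us * self.max_rate * self.packet_limit < needed:
                return False             # too soon: drop this frame
        self._last_sent_us = now_us
        return True


limiter = FrameRateLimiter(max_rate=30)
# One second of frames at 100 frames per second, each 600 bytes.
sent = sum(limiter.should_send(i * 10_000, 600) for i in range(100))
```

In this run every fourth frame is transmitted, so 25 of the 100 frames
are sent, which stays under the limit of 30 per second. The dropped
frames are the ones the server is expected to fill in from their
neighbors.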
|
||||
The server does all the work of analyzing the data after that. The
client's next job is simply to clear its array and prepare itself for
the next frame.


The PStats Server

The generic server code is in pandatool/src/pstatserver, and the
GUI-specific server code is in pandatool/src/gtk-stats and
pandatool/src/win-stats, for Unix and Windows, respectively. (There
is also an OS-independent text-stats subdirectory, which builds a
trivial PStats server that presents a scrolling-text interface. This
is mainly useful as a proof of technology rather than as a usable
tool.)

The GUI-specific code is the part that manages the interaction with
the user via the creation of windows and the handling of mouse input,
etc.; most of the real work of interpreting the data is done in the
generic code in the pstatserver directory.

The PStatServer owns all of the connections, and interfaces with the
NSPR library to communicate with the clients. It listens on the
specified port for new connections, using the pstats-port Config.prc
variable to determine the port number (this is the same variable that
specifies the port to the client). Usually you can leave this at its
default value of 5185, but there may be some cases in which that port
is already in use on a particular machine (for instance, maybe someone
else is running another PStats server on another display of the same
machine).

Once a connection is received, the server creates a PStatMonitor object
(this class is specialized for each of the different GUI variants) that
handles all the data for this particular connection. In the case of
the Windows pstats.exe program, each new monitor instance is
represented by a new toplevel window. Multiple monitors can be
active at once.

The work of digesting the data from the client is performed by the
PStatView class, which analyzes the pattern of start and stop
timestamps, along with the relationship data of the various
collectors, and boils it down into a list of the amount of time spent
in each collector per frame.

Finally, a PStatStripChart or PStatPianoRoll class object defines the
actual graph output of colored lines and bars; the generic versions of
these include virtual functions to do the actual drawing (the GUI
specializations of these redefine these methods to make the
appropriate calls).