documents moved to CMU wiki: www.panda3d.org

This commit is contained in:
David Rose 2005-05-02 16:55:03 +00:00
parent 856a53bd5c
commit ab21c536ad
3 changed files with 0 additions and 756 deletions


@ -1,110 +0,0 @@
HOW TO FIX TRANSPARENCY ISSUES
Usually transparency works as expected in Panda automatically, but
sometimes it just seems to go awry, where a semitransparent object in
the background seems to partially obscure a semitransparent object in
front of it. This is especially likely to happen with large flat
polygon cutouts, or when a transparent object is contained within
another transparent object, or when parts of a transparent object can
be seen behind other parts of the same object.
The fundamental problem is that correct transparency, in the absence
of special hardware support involving extra framebuffer bits, requires
drawing everything in order from farthest away to nearest. This means
sorting each polygon--actually, each pixel, for true correctness--into
back-to-front order before drawing the scene.
It is, of course, impossible to split up every transparent object into
individual pixels or polygons for sorting individually, so Panda sorts
objects at the Geom level, according to the center of the bounding
volume. This works well 95% of the time.
You run into problems with large flat polygons, though, since these
tend to have parts that are far away from the center of their bounding
volume. The bounding-volume sorting is especially likely to go awry
when you have two or more large flats, one close behind the other, and
you view them from slightly off-axis. (Try drawing a picture of the
two flats as seen from the top, and imagine yourself viewing them from
different directions. Also imagine where the centers of the bounding
volumes are.)
Now, there are a number of solutions to this sort of problem. No one
solution is right for every situation.
First, the easiest thing to do is to use M_dual transparency. This is
a special transparency mode in which the completely invisible parts of
the object aren't drawn into the Z-buffer at all, so that they don't
have any chance of obscuring things behind them. This only works well
if the flats are typical cutouts, where there is a big solid part
(alpha == 1.0) and a big transparent part (alpha == 0.0), and not a
lot of semitransparent parts (0.0 < alpha < 1.0). It is also a
slightly more expensive rendering mode than the default of M_alpha, so
it's not enabled by default in Panda. But egg-palettize will turn it
on automatically for a particular model if it detects textures that
appear to be cutouts of the appropriate nature, which is another
reason to use egg-palettize if you are not using it already.
If you don't use egg-palettize (you really should, you know), you can
just hand-edit the egg files to put the line:
<Scalar> alpha { dual }
within the <Texture> reference for the textures in question.
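You can also select this mode at runtime from Python, without editing
the egg file. The following is only a minimal sketch, assuming ShowBase
has been started and "smiley.egg" stands in for your cutout model (the
import path may differ depending on your Panda version):
from pandac.PandaModules import TransparencyAttrib
model = loader.loadModel("smiley.egg")
model.reparentTo(render)
# Blend the visible texels, but keep the fully transparent texels out
# of the Z-buffer entirely:
model.setTransparency(TransparencyAttrib.MDual)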
A second easy option is to use M_multisample transparency, which
doesn't have any ordering issues at all, but it only looks good on
very high-end cards that have special multisample bits to support
full-screen antialiasing. Also, at present it only looks good on
these high-end cards in OpenGL mode (since our pandadx drivers don't
support M_multisample explicitly right now). But if M_multisample is
not supported by the particular hardware or Panda driver, it
automatically falls back to M_binary, which also doesn't have any
ordering issues, but it always has jaggy edges along the cutout edge.
This only works well on texture images that represent cutouts, like
M_dual, above.
If you use egg-palettize, you can engage M_multisample mode by putting
the keyword "ms" on the line with the texture(s). Without
egg-palettize, hand-edit the egg files to put the line:
<Scalar> alpha { ms }
within the <Texture> reference for the textures in question.
A third easy option is to chop up one or both competing models into
smaller pieces, each of which can be sorted independently by Panda.
For instance, you can split one big polygon into a grid of little
polygons, and the sorting is more likely to be accurate for each piece
(because the center of the bounding volume is closer to the pixels).
You can draw a picture to see how this works. In order to do this
properly, you can't just make it one big mesh of small polygons, since
Panda will make a mesh into a single Geom of tristrips; instead, it
needs to be separate meshes, so that each one will become its own
Geom. Obviously, this is slightly more expensive too, since you are
introducing additional vertices and adding more objects to the sort
list; so you don't want to go too crazy with the smallness of your
polygons.
A fourth option is simply to disable the depth write on your
transparent objects. This is most effective when you are trying to
represent something that is barely visible, like glass or a soap
bubble. Doing this doesn't improve the likelihood of correct sorting,
but it will tend to make the artifacts of an incorrect sorting less
obvious. You can achieve this by using the transparency option
"blend_no_occlude" in an egg file, or by explicitly disabling the
depth write on a loaded model with node_path.set_depth_write(false).
You should be careful only to disable depth write on the transparent
pieces, and not on the opaque parts.
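For example, a piece of glass within a larger model might be handled
like this (a sketch; the model and the node name "windshield" are
hypothetical, and TransparencyAttrib is imported as shown earlier):
car = loader.loadModel("car.egg")
glass = car.find("**/windshield")
glass.setTransparency(TransparencyAttrib.MAlpha)
# Blend the glass, but don't let it write to the depth buffer:
glass.setDepthWrite(False)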
A final option is to make explicit sorting requests to Panda. This is
often the last resort because it is more difficult, and doesn't
generalize well, but it does have the advantage of not adding
additional performance penalties to your scene. It only works well
when the transparent objects can be sorted reliably with respect to
everything else behind them. For instance, clouds in the sky can
reliably be drawn before almost everything else in the scene, except
the sky itself. Similarly, a big flat that is up against an opaque
wall can reliably be drawn after all of the opaque objects, but before
any other transparent object, regardless of where the camera happens
to be placed in the scene. See howto.control_render_order.txt for
more information about explicitly controlling the rendering order.
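One way to make such an explicit request from Python is to assign the
transparent objects to the "fixed" cull bin with hand-chosen sort
values, so they are drawn after the opaque geometry and in a known
order relative to each other. This is only a sketch; the NodePaths are
hypothetical:
clouds.setBin("fixed", 10)
wallDecal.setBin("fixed", 20)
# Within the "fixed" bin, lower sort values are drawn first, so the
# clouds are guaranteed to be drawn before the wall decal.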


@ -1,243 +0,0 @@
MULTITEXTURE OVERVIEW
Modern graphics cards are capable of applying more than one texture
image at once to geometry as they render polygons. This capability is
referred to as multitexture.
The textures are applied in a pipeline fashion, where the output of
each texturing operation is used as the input to the next. A
particular graphics card will have a certain number of texture units
dedicated to this function, which limits the number of textures that
may be pipelined in this way.
To apply a texture in Panda, you must have a Texture object (which you
might have loaded from disk, or extracted from a model) and a
TextureStage object (which you can create on-the-fly). The primary
call to add a texture to the pipeline is:
nodepath.set_texture(texture_stage, texture);
This adds the indicated texture into the pipeline for all the geometry
at nodepath level and below, associating it with the indicated
TextureStage object.
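From Python, a minimal two-stage setup might look like the following
sketch (the model and texture filenames are placeholders):
from pandac.PandaModules import TextureStage
room = loader.loadModel("room.egg")
brickTex = loader.loadTexture("brick.jpg")
grimeTex = loader.loadTexture("grime.png")
grimeStage = TextureStage("grime")
room.setTexture(TextureStage.getDefault(), brickTex)
# The second stage is applied on top of the default stage:
room.setTexture(grimeStage, grimeTex)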
The purpose of the TextureStage object is to represent a single stage
in the texture pipeline. You can create as many TextureStage objects
as you like; each one can associate a different texture, and all of
those textures will be applied together (within the limits of your
hardware). If you want to change out a particular texture within the
pipeline without disturbing the others, keep a handle to the
TextureStage object that you used for that stage, and issue a
set_texture() call using the same TextureStage object and a different
texture--this replaces the texture that you assigned previously. (You
may do this on the same NodePath, or on a lower NodePath level to
override the texture specified from above.)
To undo a set_texture() call for a particular stage or for all stages,
do:
nodepath.clear_texture(texture_stage)
nodepath.clear_texture()
Don't confuse this with the calls to actively disable a particular
texture stage or to disable texturing altogether, which are:
nodepath.set_texture_off(texture_stage)
nodepath.set_texture_off()
The difference between the two is that set_texture_off() inserts a
command into the scene graph to specifically turn off the texture
associated with the indicated texture stage, while clear_texture()
simply removes the texture stage from this node's list of assigned
textures. Use clear_texture() to undo a previous call to
set_texture() on a given node. You need set_texture_off() more
rarely; you might use this when you want to override a particular
setting from above to turn off just one particular stage of the
pipeline (for instance, you may have a set_texture() applied at the
root of a scene to apply a particular effect to everything in the
scene, but use set_texture_off() on one particular model for which you
don't want that effect applied).
There is also a default TextureStage object that is used for all of
the old single-texture Panda interfaces (like
nodepath.set_texture(texture)). It is also the TextureStage that will
be used to apply Textures onto models (e.g. egg files and/or bam
files) that do not specify the use of multitexturing. This default
TextureStage can be accessed by TextureStage::get_default().
There are a number of different blend modes that you may specify for
each texture stage in the pipeline; these are specified with
texture_stage.set_mode(). The mode may be one of:
TextureStage::M_modulate
Multiplies the incoming color by the texture color. This
allows the texture to darken, but not brighten, the incoming
color.
TextureStage::M_add
Adds the incoming color and the texture color. This allows the
texture to brighten, but not darken, the incoming color, and
tends to lead to bright, desaturated colors.
TextureStage::M_decal
Shows the texture color where the texture is alpha = 1, and the
incoming color where the texture is alpha = 0. This can be
used to paint a texture on top of the existing texture.
TextureStage::M_blend
Defined for grayscale textures only. You can specify an
arbitrary color as a separate parameter with
texture_stage.set_color(), and then the result of M_blend is to
produce the specified color where the texture is white, and
the incoming color where the texture is black. This can be
used to paint arbitrary color stripes or a similar effect over
an existing texture.
TextureStage::M_blend_color_scale
This is identical to M_blend, except that the blend color,
specified by texture_stage.set_color(), is also modified by the
color scale applied to the scene graph.
TextureStage::M_replace
Completely replaces the incoming color with the texture color;
probably not terribly useful in a multitexture environment,
except for the first texture stage.
TextureStage::M_combine
This mode supersedes most of the above with a more powerful
collection of options, including signed add and/or subtract,
and linear interpolation between two different colors using a
third parameter. You can specify the input(s) as one or more
combinations of a specified constant color, or the previous
texture in the pipeline, or the incoming color. However, very
old graphics drivers may not support this mode.
Since combine mode has a number of associated parameters, you
enable this mode by calling set_combine_rgb() and
set_combine_alpha() with the appropriate parameters; it's not
necessary to call set_mode(M_combine). A complete description
of this mode is not given here.
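As a sketch, here is how a couple of these modes might be selected from
Python on the hypothetical grime stage created earlier; M_add brightens
the underlying color, while M_blend tints a grayscale texture with an
arbitrary color:
from pandac.PandaModules import VBase4
grimeStage.setMode(TextureStage.MAdd)
# Or, for a grayscale stripe texture, tint it blue instead:
grimeStage.setMode(TextureStage.MBlend)
grimeStage.setColor(VBase4(0.2, 0.2, 1.0, 1.0))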
Some of the above modes are very order-dependent. For this reason,
you may use texture_stage.set_sort() to specify the order in which
textures should be applied, using an integer sort parameter. When
Panda collects the textures together for rendering a particular piece
of geometry, it will sort them in order from lowest sort value to
highest sort value. The default sort value is 0. Thus, you can
specify a large positive number to apply a texture on top of existing
textures, or a large negative number to apply it beneath existing
textures.
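For example (a sketch, continuing with the hypothetical grime stage):
grimeStage.setSort(10)
# Stages with lower sort values (such as the default stage, at 0) are
# applied first; this stage is applied on top of them.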
The egg loader will create texture stages automatically in the
presence of a multitexturing specification in the egg file, and it
will assign to these stages sort values in multiples of 10: the lowest
texture stage will have a sort value of 0, the next 10, the next 20,
and so on.
Since the number of texture units available on the hardware is
limited, and is usually a small number (and some hardware doesn't
support multitexturing at all, so effectively has only one texture
unit), Panda needs some rule for selecting the subset of textures to
render when you have requested more texture stages than are available.
For this Panda relies on the texture_stage.set_priority() value, which
is an integer value that represents the importance of this particular
texture. If the requested textures will not fit on the available
number of texture units, Panda will select the n textures with the
highest priority (and then sort them into order by the set_sort()
parameter). Between two textures with the same priority, Panda will
prefer the one with the lower sort value. The default priority is 0.
If you need to know the actual limit, you can query your available
number of texture stages from the GraphicsStateGuardian, with the call
gsg->get_max_texture_stages() (e.g. from Python, call
base.win.getGsg().getMaxTextureStages()).
TEXTURE COORDINATES
In many cases, all of the texture stages need to use the same set of
texture coordinates, which is the default behavior. You can also
apply a different texture matrix on some texture stages to apply a
linear transformation to the texture coordinates (for instance, to
position a decal on the surface).
nodepath.set_tex_offset(texture_stage, u_offset, v_offset);
nodepath.set_tex_scale(texture_stage, u_scale, v_scale);
nodepath.set_tex_rotate(texture_stage, degrees);
nodepath.set_tex_transform(texture_stage, general_transform);
These operations accumulate through nested nodes just like standard
scene graph transforms. In fact, you can get and set relative texture
transforms:
rel_offset = nodepath.get_tex_offset(other, texture_stage);
nodepath.set_tex_scale(other, texture_stage, u_scale, v_scale);
(etc.)
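A concrete sketch, again using the hypothetical room model and grime
stage:
room.setTexOffset(grimeStage, 0.25, 0.0)
room.setTexScale(grimeStage, 2.0, 2.0)
room.setTexRotate(grimeStage, 45)
# Shift the grime a quarter tile in U, repeat it twice in U and V, and
# rotate its texture coordinates by 45 degrees.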
You may create LerpIntervals to lerp texture matrices. There are no
interval types that operate directly on a texture matrix, but you can
set up a TexProjectorEffect to bind a node's transform to the texture
matrix:
nodepath.set_tex_projector(texture_stage, from, to)
Where "from" and "to" are arbitrary NodePaths. The TexProjectorEffect
will measure the relative transform between "from" and "to" each frame
and apply it to the nodepath's texture matrix. Once this is in place,
you may create a LerpPosInterval, or any other Panda construct, to
adjust either the "from" or the "to" NodePath, which will thus
indirectly adjust the texture matrix by the same amount.
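For instance, to scroll a texture continuously, you might bind the
texture matrix to a pair of dummy nodes and lerp one of them. The
following is only a sketch; the node names are arbitrary:
from pandac.PandaModules import Point3
from direct.interval.IntervalGlobal import LerpPosInterval
anchor = render.attachNewNode("uvAnchor")
target = render.attachNewNode("uvTarget")
room.setTexProjector(grimeStage, anchor, target)
# Moving "target" relative to "anchor" translates the texture matrix
# by the same amount, scrolling the grime texture across the room:
scroll = LerpPosInterval(target, 4.0, Point3(1, 0, 0))
scroll.loop()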
Sometimes, a texture stage may need to use a completely different set
of texture coordinates, for instance as provided by the artist who
generated the model. Panda allows a model to store any number of
different sets of texture coordinates on its vertices, each with a
unique name. You can associate any texture stage with any set of
texture coordinates you happen to have available on your model:
texture_stage.set_texcoord_name(name)
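For example, if the artist exported a second set of texture coordinates
named "lightmap" (a hypothetical name) on the model's vertices:
grimeStage.setTexcoordName("lightmap")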
Finally, you may need to generate texture coordinates for a particular
texture stage on the fly. This is particularly useful, for instance,
to apply reflection maps, e.g. sphere maps or cube maps. To enable
this effect, use:
nodepath.set_tex_gen(texture_stage, mode)
Where mode is one of the enumerated types named by TexGenAttrib::Mode;
at present, this may be any of M_world_position,
M_object_position, M_eye_position, or M_sphere_map. The first three
modes simply apply the X, Y, Z coordinates of the vertex to its U, V
texture coordinates (a texture matrix may then be applied to transform
the generated texture coordinates into the particular U, V coordinate
space that you require). The remaining modes generate texture
coordinates appropriate to a reflection map of the corresponding type,
based on the position and normal of each vertex, relative to the
camera.
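For example, to add a sphere-mapped reflection as an extra texture
stage (a sketch; the environment map and stage name are placeholders):
from pandac.PandaModules import TexGenAttrib
envMap = loader.loadTexture("silver_env.jpg")
shinyStage = TextureStage("reflection")
shinyStage.setMode(TextureStage.MAdd)
room.setTexture(shinyStage, envMap)
# Generate texture coordinates from each vertex's eye-space normal:
room.setTexGen(shinyStage, TexGenAttrib.MSphereMap)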
The texture generation mode and the tex projector mode may be combined
to provide hardware-assisted projective texturing, where a texture is
applied to geometry as if it were projected from a particular point in
space, like a slide projector. This is particularly useful for
applying shadow maps or flashlight effects, for instance. There is a
convenience function on NodePath that automatically makes the three
separate calls needed to enable projective texturing:
nodepath.project_texture(texture_stage, texture, projector);
Where projector is a NodePath that references a LensNode. The
indicated texture is applied to the geometry at nodepath and below, as
if it were projected from the indicated projector. The lens
properties such as field of view may be adjusted on the fly to adjust
the projection.
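A sketch of this convenience call, projecting a hypothetical flashlight
texture from a LensNode placed in the scene:
from pandac.PandaModules import LensNode, PerspectiveLens
projector = render.attachNewNode(LensNode("flashlightLens"))
projector.node().setLens(PerspectiveLens())
projector.setPos(0, -10, 5)
projector.lookAt(room)
flashlightTex = loader.loadTexture("flashlight.png")
room.projectTexture(TextureStage("flashlight"), flashlightTex, projector)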
(Note that Panda also provides a ProjectionScreen object, which
performs an effect very similar to the project_texture() call, except
that it is performed entirely in the CPU, whereas project_texture()
will offload the work onto the graphics card if the card supports
this. This may or may not result in a performance improvement over
ProjectionScreen, depending on the nature of your scene and your CPU
load versus your graphics card load.)


@ -1,403 +0,0 @@
QUICK INTRODUCTION
PStats is Panda's built-in performance analysis tool. It can graph
frame rate over time, and can further graph the work spent within each
frame into user-defined subdivisions of the frame (for instance, app,
cull and draw), and thus can be an invaluable tool in identifying
performance bottlenecks. It can also show frame-based data that
reflects any arbitrary quantity other than time intervals, for
instance, texture memory in use or number of vertices drawn.
The performance graphs may be drawn on the same computer that is
running the Panda client, or they may be drawn on another computer on
the same LAN, which is useful for analyzing fullscreen applications.
The remote computer need not be running the same operating system as
the client computer.
To use PStats, you first need to build the PStats server program,
which is part of the Pandatool tree (it's called pstats.exe on
Windows, and gtk-stats on a Unix platform). Start by running the
PStats server program (it runs in the background), and then start your
Direct/Panda client with the following in your Config.prc file:
want-pstats 1
Or, at runtime, issue the Python command:
PStatClient.connect()
Or if you're running pview, press shift-S.
Any of the above will contact your running PStats server program,
which will proceed to open a window and start a running graph of your
client's performance. If you are running the server on a different
machine than the client, add the pstats-host variable to your client's
Config.prc file, naming the hostname or IP address of the machine
running the PStats server.
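The whole procedure can also be driven from Python, as in this sketch,
which assumes the server is already running on a machine named
"statshost" (substitute your own hostname, or omit the variable
entirely to connect to localhost):
from pandac.PandaModules import loadPrcFileData, PStatClient
loadPrcFileData("", "pstats-host statshost")
PStatClient.connect()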
If you are developing Python code, you may be interested in reporting
the relative time spent within each Python task (by subdividing the
total time spent in Python, as reported under "Show Code"). To do
this, add the following lines to your Config.prc file before you start
ShowBase:
task-timer-verbose 1
pstats-tasks 1
THE PSTATS SERVER (The user interface)
The GUI for managing the graphs and drilling down to view more detail
is entirely controlled by the PStats server program. At the time of
this writing, there are two different versions of the PStats server,
one for Unix called gtk-stats and one for Windows called simply
pstats. The interfaces are similar but not identical; the following
paragraphs describe the Windows version.
When you run pstats.exe, it adds a program to the taskbar but does not
immediately open a window. The program name is typically "PStats
5185", showing the default PStats TCP port number of 5185; see "HOW IT
WORKS" below for more details about the TCP communication system. For
the most part you don't need to worry about the port number, as long
as server and client agree (and the port is not already being used by
another application).
Each time a client connects to the PStats server, a new monitor window
is created. This monitor window owns all of the graphs that you
create to view the performance data from that particular connection.
Initially, a strip chart showing the frame time of the main thread is
created by default; you can create additional graphs by selecting from
the Graphs pulldown menu.
Time-based Strip Charts
This is the graph type you will use most frequently to examine
performance data. The horizontal axis represents the passage of time;
each frame is represented as a vertical slice on the graph. The
overall height of the colored bands represents the total amount of
time spent on each frame; within the frame, the time is further
divided into the primary subdivisions represented by different color
bands (and labeled on the left). These subdivisions are called
"collectors" in the PStats terminology, since they represent time
collected by different tasks.
Normally, the three primary collectors are App, Cull, and Draw, the
three stages of the graphics pipeline. Atop these three colored
collectors is the label "Frame", which represents any remaining time
spent in the frame that was not specifically allocated to one of the
three child collectors (normally, there should not be significant time
reported here).
The frame time in milliseconds, averaged over the past three seconds,
is drawn above the upper right corner of the graph. The labels on the
guide bars on the right are also shown in milliseconds; if you prefer
to think about a target frame rate rather than an elapsed time in
milliseconds, you may find it useful to select "Hz" from the Units
pulldown menu, which changes the time units accordingly.
The running Panda client suggests its target frame rate, as well as
the initial vertical scale of the graph (that is, the height of the
colored bars). You can change the scale freely by clicking within the
graph itself and dragging the mouse up or down as necessary. One of
the horizontal guide bars is drawn in a lighter shade of gray; this
one represents the actual target frame rate suggested by the client.
The other, darker, guide bars are drawn automatically at harmonic
subdivisions of the target frame rate. You can change the target frame
rate with the Config.prc variable pstats-target-frame-rate on the
client.
You can also create any number of user-defined guide bars by dragging
them into the graph from the gray space immediately above or below the
graph. These are drawn in a dashed blue line. It is sometimes useful
to place one of these to mark a performance level so it may be
compared to future values (or to alternate configurations).
The primary collectors labeled on the left might themselves be further
subdivided, if the data is provided by the client. For instance, App
is often divided into Show Code, Animation, and Collisions, where Show
Code is the time spent executing any Python code, Animation is the
time used to compute any animated characters, and Collisions is the
time spent in the collision traverser(s).
To see any of these further breakdowns, double-click on the
corresponding colored label (or on the colored band within the graph
itself). This narrows the focus of the strip chart from the overall
frame to just the selected collector, which has two advantages.
Firstly, it may be easier to observe the behavior of one particular
collector when it is drawn alone (as opposed to being stacked on top
of some other color bars), and the time in the upper-right corner will
now reflect the total time spent within just this collector.
Secondly, if there are further breakdowns to this collector, they will
now be shown as further colored bars. As in the Frame chart, the
topmost label is the name of the parent collector, and any time shown
in this color represents time allocated to the parent collector that
is not accounted for by any of the child collectors.
You can further drill down by double-clicking on any of the new
labels; or double-click on the top label, or the white part of the
graph, to return back up to the previous level.
Value-based Strip Charts
There are other strip charts you may create, which show arbitrary
kinds of data per frame other than elapsed time. These can only be
accessed from the Graphs pulldown menu, and include things such as
texture memory in use and vertices drawn. They behave similarly to
the time-based strip charts described above.
Piano Roll Charts
This graph is used less frequently, but when it is needed it is a
valuable tool to reveal exactly how the time is spent within a frame.
The PStats server automatically collects together all the time spent
within each collector and shows it as a single total, but in reality
it may not all have been spent in one continuous block of time.
For instance, when Panda draws each display region in single-threaded
mode, it performs a cull traversal followed by a draw traversal for
each display region. Thus, if your Panda client includes multiple
display regions, it will alternate its time spent culling and drawing
as it processes each of them. The strip chart, however, reports only
the total cull time and draw time spent.
Sometimes you really need to know the sequence of events in the frame,
not just the total time spent in each collector. The piano roll chart
shows this kind of data. It is so named because it is similar to the
paper music roll for an old-style player piano, with holes punched
down the roll for each note that is to be played. The longer the
hole, the longer the piano key is held down. (Think of the chart as
rotated 90 degrees from an actual piano roll. A player piano roll
plays from bottom to top; the piano roll chart reads from left to
right.)
Unlike a strip chart, a piano roll chart does not show trends; the
chart shows only the current frame's data. The horizontal axis shows
time within the frame, and the individual collectors are stacked up in
an arbitrary ordering along the vertical axis.
The time spent within the frame is drawn from left to right; at any
given time, the collector(s) that are active will be drawn with a
horizontal bar. You can observe the CPU behavior within a frame by
reading the graph from left to right. You may find it useful to
select "pause" from the Speed pulldown menu to freeze the graph on
just one frame while you read it.
Note that the piano roll chart shows time spent within the frame on
the horizontal axis, instead of the vertical axis, as it is on the
strip charts. Thus, the guide bars on the piano roll chart are
vertical lines instead of horizontal lines, and they may be dragged in
from the left or the right sides (instead of from the top or bottom,
as on the strip charts). Apart from this detail, these are the same
guide bars that appear on the strip charts.
The piano roll chart may be created from the Graphs pulldown menu.
Additional threads
If the panda client has multiple threads that generate PStats data,
the PStats server can open up graphs for these threads as well. Each
separate thread is considered unrelated to the main thread, and may
have the same or an independent frame rate. Each separate thread will
be given its own pulldown menu to create graphs associated with that
thread; these auxiliary thread menus will appear on the menu bar
following the Graphs menu. At the time of this writing, support for
multiple threads within the PStats graph is largely theoretical and
untested.
HOW TO DEFINE YOUR OWN COLLECTORS
The PStats client code is designed to be generic enough to allow users
to define their own collectors to time any arbitrary blocks of code
(or record additional non-time-based data), from either the C++ or the
Python level.
The general idea is to create a PStatCollector for each separate block
of code you wish to time. The name which is passed to the
PStatCollector constructor is a unique identifier: all collectors that
share the same name are deemed to be the same collector.
Furthermore, the collector's name can be used to define the
hierarchical relationship of each collector with other existing
collectors. To do this, prefix the collector's name with the name of
its parent(s), followed by a colon separator. For instance,
PStatCollector("Draw:Flip") defines a collector named "Flip", which is
a child of the "Draw" collector, defined elsewhere.
You can also define a collector as a child of another collector by
giving the parent collector explicitly followed by the name of the
child collector alone, which is handy for dynamically-defined
collectors. For instance, PStatCollector(draw, "Flip") defines the
same collector named above, assuming that draw is the result of the
PStatCollector("Draw") constructor.
Note that, because of an unfortunate limitation with the interrogate
parser, statically-defined PStatCollector objects can't be parsed by
interrogate. (In general, interrogate can't parse C++ objects that
are constructed with parameters at the outermost scoping level.) As a
workaround, we usually protect these declarations from interrogate by
using the syntax #ifndef CPPPARSER .. #endif.
Once you have a collector, simply bracket the region of code you wish
to time with collector.start() and collector.stop(). It is important
to ensure that each call to start() is matched by exactly one call to
stop(). If you are programming in C++, it is highly recommended that
you use the PStatTimer class to make these calls automatically, which
guarantees the correct pairing; the PStatTimer's constructor calls
start() and its destructor calls stop(), so you may simply define a
PStatTimer object at the beginning of the block of code you wish to
time. If you are programming in Python, you must call start() and
stop() explicitly.
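A minimal Python sketch of a custom collector timing part of a task
(the collector name and the updateMyEffect() function are hypothetical,
and taskMgr is the global task manager created by ShowBase):
from pandac.PandaModules import PStatCollector
from direct.task import Task
effectCollector = PStatCollector("App:Show code:My effect")
def myEffectTask(task):
    effectCollector.start()
    updateMyEffect()
    effectCollector.stop()
    return Task.cont
taskMgr.add(myEffectTask, "myEffectTask")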
When you call start() and there was another collector already started,
that previous collector is paused until you call the matching stop()
(at which time the previous collector is resumed). That is, time is
accumulated only towards the collector indicated by the innermost
start() .. stop() pair.
Time accumulated towards any collector is also counted towards that
collector's parent, as defined in the collector's constructor
(described above).
It is important to understand the difference between collectors nested
implicitly by runtime start/stop invocations, and the static hierarchy
implicit in the collector definition. Time is accumulated in parent
collectors according to the statically-defined parents of the
innermost active collector only, without regard to the runtime stack
of paused collectors.
For example, suppose you are in the middle of processing the "Draw"
task and have therefore called start() on the "Draw" collector. While
in the middle of processing this block of code, you call a function
that has its own collector called "Cull:Sort". As soon as you start
the new collector, you have paused the "Draw" collector and are now
accumulating time in the "Cull:Sort" collector. Once this new
collector stops, you will automatically return to accumulating time in
the "Draw" collector. The time spent within the nested "Cull:Sort"
collector will be counted towards the "Cull" total time, not the
"Draw" total time.
Color and Other Optional Collector Properties
If you do not specify a color for a particular collector, it will be
assigned a random color at runtime. At present, the only way to
specify a color is to modify
panda/src/pstatclient/pStatProperties.cxx, and add a line to the table
for your new collector(s). You can also define additional properties
here such as a suggested initial scale for the graph and, for
non-time-based collectors, a unit name and/or scale factor. The order
in which these collectors are listed in this table is also relevant;
they will appear in the same order on the graphs. The first column
should be set to 1 for your new collectors unless you wish them to be
disabled by default. You must recompile the client (but not the
server) to reflect changes to this table.
HOW IT WORKS (What's actually happening)
The PStats code is divided into two main parts: the client code and
the server code.
The PStats Client
The client code is in panda/src/pstatclient, and is available to run
in every Panda client unless it is compiled out. (It will be compiled
out if OPTIMIZE is set to level 4, unless DO_PSTATS is also explicitly
set to non-empty. It will also be compiled out if NSPR is not
available, since both client and server depend on the NSPR library to
exchange data, even when running the server on the same machine as the
client.)
The client code is designed for minimal runtime overhead when it is
compiled in but not enabled (that is, when the client is not in
contact with a PStats server), as well as when it is enabled (when the
client is in contact with a PStats server). It is also designed for
zero runtime overhead when it is compiled out.
There is one global PStatClient class object, which manages all of the
communications on the client side. Each PStatCollector is simply an
index into an array stored within the PStatClient object, although the
interface is intended to hide this detail from the programmer.
Initially, before the PStatClient has established a connection, calls
to start() and stop() simply return immediately.
When you call PStatClient.connect(), the client attempts to contact
the PStatServer via a TCP connection to the hostname and port named in
the pstats-host and pstats-port Config.prc variables, respectively.
(The default hostname and port are localhost and 5185.) You can also
pass in a specific hostname and/or port to the connect() call. Upon
successful connection and handshake with the server, the PStatClient
sends a list of the available collectors, along with their names,
colors, and hierarchical relationships, on the TCP channel.
Once connected, each call to start() and stop() adds a collector
number and timestamp to an array maintained by the PStatClient. At
the end of each frame, the PStatClient boils this array into a
datagram for shipping to the server. Each start() and stop() event
requires 6 bytes; if the resulting datagram will fit within a UDP
packet (1K bytes, or about 84 start/stop pairs), it is sent via UDP;
otherwise, it is sent on the TCP channel. (Some fraction of the
packets that are eligible for UDP, from 0% to 100%, may be sent via
TCP instead; you can specify this with the pstats-tcp-ratio Config.prc
variable.)
Also, to prevent flooding the network and/or overwhelming the PStats
server, only so many frames of data will be sent per second. This
parameter is controlled by the pstats-max-rate Config.prc variable and
is set to 30 by default. (If the packets are larger than 1K, the max
transmission rate is also automatically reduced further in
proportion.) If the frame rate is higher than this limit, some frames
will simply not be transmitted. The server is designed to cope with
missing frames and will assume missing frames are similar to their
neighbors.
The server does all the work of analyzing the data after that. The
client's next job is simply to clear its array and prepare itself for
the next frame.
The PStats Server
The generic server code is in pandatool/src/pstatserver, and the
GUI-specific server code is in pandatool/src/gtk-stats and
pandatool/src/win-stats, for Unix and Windows, respectively. (There
is also an OS-independent text-stats subdirectory, which builds a
trivial PStats server that presents a scrolling-text interface. This
is mainly useful as a proof of technology rather than as a usable
tool.)
The GUI-specific code is the part that manages the interaction with
the user via the creation of windows and the handling of mouse input,
etc.; most of the real work of interpreting the data is done in the
generic code in the pstatserver directory.
The PStatServer owns all of the connections, and interfaces with the
NSPR library to communicate with the clients. It listens on the
specified port for new connections, using the pstats-port Config.prc
variable to determine the port number (this is the same variable that
specifies the port to the client). Usually you can leave this at its
default value of 5185, but there may be some cases in which that port
is already in use on a particular machine (for instance, maybe someone
else is running another PStats server on another display of the same
machine).
Once a connection is received, it creates a PStatMonitor class (this
class is specialized for each of the different GUI variants) that
handles all the data for this particular connection. In the case of
the Windows pstats.exe program, each new monitor instance is
represented by a new toplevel window. Multiple monitors can be
active at once.
The work of digesting the data from the client is performed by the
PStatView class, which analyzes the pattern of start and stop
timestamps, along with the relationship data of the various
collectors, and boils it down into a list of the amount of time spent
in each collector per frame.
Finally, a PStatStripChart or PStatPianoRoll class object defines the
actual graph output of colored lines and bars; the generic versions of
these include virtual functions to do the actual drawing (the GUI
specializations redefine these methods to make the appropriate drawing
calls).