25331 Commits

Author SHA1 Message Date
Disyer
a08e42d015 makepanda: Create 7-zip debug symbol archives by default, if available
7-zip archives will only be created if 7-zip is available during the build phase. When 7-zip is unavailable, ZIP archives will be created as a fallback.

Benchmarks:

- Default ZIP compression: ~23.5 seconds, 162 MB
- 7-zip compression: ~7.5 seconds, 108 MB
- 7-zip compression, --lzma set: ~44 seconds, 88 MB
- 7-zip compression, solid archive: ~5 minutes, 83 MB (not implemented)

Closes #1261
2022-02-24 11:55:06 +01:00
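The fallback described in this commit message could look roughly like the sketch below. It is an illustration only, not the makepanda code: `pick_archiver` and `archive_symbols` are hypothetical names, and it assumes the 7-zip executable is on the PATH as `7z` or `7za`.

```python
import os
import shutil
import subprocess
import zipfile


def pick_archiver():
    """Return the path of a 7-zip executable, or None if none is on PATH."""
    for name in ("7z", "7za"):  # assumed executable names
        path = shutil.which(name)
        if path:
            return path
    return None


def archive_symbols(files, basename, sevenzip=None):
    """Archive the given files and return the path of the archive written.

    Uses 7-zip when an executable path is passed in, otherwise falls back
    to a regular ZIP archive from the standard library.
    """
    if sevenzip:
        target = basename + ".7z"
        # "a" adds files to an archive; "-y" answers yes to all prompts.
        subprocess.check_call([sevenzip, "a", "-y", target] + list(files))
        return target

    target = basename + ".zip"
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            zf.write(path, os.path.basename(path))
    return target
```

A caller would pass the result of `pick_archiver()` as `sevenzip`, so the choice of format is made once, at build time.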
rdb
4df8c86590 tests: Add unit test for PandaNode prev_transform tracking mechanism 2022-02-24 11:43:44 +01:00
rdb
5695d1a719 tests: Add separate unit test for AsyncFuture.wait() with timeout
Since it's using a different implementation than the no-timeout version now
2022-02-24 11:43:11 +01:00
rdb
6bc22d1822 pgraph: Rewrite inefficient prev_transform tracking mechanism
The previous system was causing a lot of lock contention when transforms are modified in the Cull thread.

The new implementation doesn't use a linked list or lock at all, but a simple atomically incrementing integer that indicates that the prev transforms have changed.  set_transform() reads this and backs up the prev transform the first time a transform is modified after reset_all_prev_transforms() is called.
2022-02-24 11:42:04 +01:00
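The counter-based scheme described above can be sketched as follows. This is a single-threaded Python illustration, not the C++ implementation: the class `Node`, its attribute names, and the lazy `get_prev_transform()` are assumptions made for the example, and a plain integer stands in for the atomic counter.

```python
class Node:
    # Shared counter, bumped by reset_all_prev_transforms(). Stands in for
    # the atomically incrementing integer mentioned in the commit message.
    reset_seq = 0

    def __init__(self, transform):
        self.transform = transform
        self.prev_transform = transform
        # Value of the counter when prev_transform was last backed up.
        self._backup_seq = type(self).reset_seq

    @classmethod
    def reset_all_prev_transforms(cls):
        # No per-node work and no list of dirty nodes: just bump the counter.
        cls.reset_seq += 1

    def set_transform(self, transform):
        seq = type(self).reset_seq
        if self._backup_seq != seq:
            # First modification since the last reset: back up the old value.
            self.prev_transform = self.transform
            self._backup_seq = seq
        self.transform = transform

    def get_prev_transform(self):
        # A node untouched since the last reset has prev == current.
        if self._backup_seq != type(self).reset_seq:
            return self.transform
        return self.prev_transform
```

The point of the design is that a reset costs one increment regardless of how many nodes changed, and each node only does extra work on its first modification per reset.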
rdb
c356285212 dtoolbase: Compilation fix for broken STLs without atomic::value_type 2022-02-24 11:41:54 +01:00
rdb
ba4173b32c pgraph: Add constexpr to CacheStats constructor 2022-02-23 23:20:53 +01:00
rdb
c2d088f232 pipeline: Don't use Sleep(1) to yield on Windows
Use Sleep(0) instead.  Sleep(0) is not guaranteed to yield, which is a problem, but Sleep(1) can easily take up to 16 ms, which is really unacceptable except in a very low-priority thread.  But really, you shouldn't be relying on force_yield() for anything except with the SIMPLE_THREADS model.

There is also SwitchToThread(), but in fact it is even weaker than Sleep(0).
2022-02-23 23:20:53 +01:00
rdb
cb8563acac event: Update AsyncFuture to use new atomics implementation
Uses explicit barriers, and the non-timeout version of wait() is now significantly more efficient because it uses the new futexes if available
2022-02-23 23:20:49 +01:00
rdb
70c49a6416 pipeline: Add Thread::relax() for more efficient busy waiting
Equivalent to cpu_relax() or the pause instruction on x86
2022-02-23 23:20:30 +01:00
rdb
c3ce8164bc dtoolbase: Add atomic wait and notify operations from C++20
Adds patomic_signed_lock_free, patomic_unsigned_lock_free, and patomic_flag with wait/notify methods modelled after C++20.  Implemented using futexes, falling back to a mutex+condition variable hash table if not supported.  (Currently the hash table has a fixed size of 64, which we could increase if necessary, but we really shouldn't even have a fraction of that number of simultaneously sleeping threads...)

Other atomic types are unaffected at the moment, in part because futexes are really restricted to 32-bit ints on Linux anyway
2022-02-23 23:20:26 +01:00
rdb
5196719f29 device: Fix XInput compile error compiling for newer Windows versions 2022-02-23 21:46:15 +01:00
rdb
fd033e66f1 pstats: Add support for profiling thread context switches
Disabled by default, enable with `pstats-thread-profiling true` in Config.prc
2022-02-22 18:02:39 +01:00
rdb
b4d51c24e9 event: Remove FunctionAsyncTask
To create a task from a lambda, it is more efficient to use the new AsyncTaskManager::add() short-hand which creates an AsyncTask subclass in-place.
2022-02-22 18:02:34 +01:00
rdb
7baeaf3809 event: New C++ AsyncTaskManager::add() no longer uses std::function
std::function has unnecessary overhead, better to just create an AsyncTask subclass in-place storing the closure

This obsoletes FunctionAsyncTask, it will be removed in a future commit
2022-02-22 17:02:42 +01:00
rdb
72c891c0df display: Fix issues with PStats GPU timing:
- Leaking queries by never reusing / releasing them
- Clock synchronization was way off when driver waited on GPU during sync point
2022-02-22 17:00:47 +01:00
rdb
284ffe9e83 pstats: Fix status bar when collector has level data on multiple threads
Status bar now shows total across all threads, and double-clicking it opens strip charts for all the threads that have data for it
2022-02-22 16:52:55 +01:00
rdb
759115fbc7 pstats: Fix crash when frame has only level data and no time data 2022-02-22 15:25:04 +01:00
rdb
a33fcab8da tests: Switch from deprecated ConditionVarFull to ConditionVar 2022-02-22 15:22:10 +01:00
rdb
8b5fc7d835 stdpy: Switch from deprecated ConditionVarFull to ConditionVar 2022-02-22 15:21:32 +01:00
rdb
0a3733ccb9 pstats: GPU timing improvements; use same frame numbering everywhere
Timer queries are significantly more efficient, are synchronized to CPU time, and the synchronized frame numbering makes it possible to correlate stuff in the Timeline view
2022-02-20 17:33:40 +01:00
rdb
65ee79158f showbase: Start recording right away when opening PStats connection
Don't wait until the next frame, since waiting makes it harder to diagnose long load times in the new Timeline view
2022-02-20 16:54:15 +01:00
rdb
739ad1ebd6 pstats: Fix strip chart scale glitches on Windows when switching collector 2022-02-20 16:50:54 +01:00
rdb
65cd882cb2 display: PStats collector reorganisation
Remove *:do_frame (which adds another stack frame with very little value), remove unused App:Delete collector, merge Flip Begin/End collectors
2022-02-20 16:23:14 +01:00
rdb
161ac4c2f7 pstats: Another major update for PStats server UI, including:
- New powerful scrolling Timeline view for seeing all time events across all threads
- Redo flame graph to use stack-based nesting rather than the standard collector nesting
- Rewrite flame graph drawing to not use labels
- Status bar appears in main window showing top-level level collectors; double-clicking them brings up their chart and right-clicking them shows their children
- Context menus are added when right-clicking labels and charts
- Tooltips now appear when mouse hovers over collector area in a chart
- Strip chart windows now automatically determine the appropriate scale better
- Graph menus redone to allow opening flame chart anywhere as well as strip chart
- Instead of just ms everywhere, also use s / us / ns where appropriate
- Don't disable smoothing right away on mouse down on strip chart, only after dragging
- Windows: The MDI child windows are quite ugly and overlap with the status bar, so instead they are now top-level windows, but some code is added to make them spawn inside and move with the parent window, and minimize to its corner.  I can back this out if people prefer the old behavior despite the ugly decoration
- Windows: Label text shows ellipsis when cut off
- Windows: Graph windows no longer have icons
- Windows: Graph windows no longer spawn perfectly on top of each other, rather cascading
- GTK: Render at high resolution when GDK_SCALE is not 1
- GTK: Graph windows are forced to be floating in tiling WMs
- GTK: Flame chart window no longer has useless dividing bar
- GTK: Use more efficient cairo surface types
2022-02-18 18:19:11 +01:00
rdb
c0c5eeb27e display: Don't start/stop collectors for empty window list 2022-02-18 17:24:43 +01:00
rdb
d7bbcfb0b7 pstats: Some collector reorganisation:
- "App:Show code:General" is gone, it was causing too much trouble
- Replace odd "Client::GuiObjects" with "Nodes:GUI"
- Regroup "Dirty PipelineCyclers" underneath "PipelineCyclers"
2022-02-17 12:48:36 +01:00
rdb
93b7ebffaa pstats: PStatClient.connect() should wait for UDP connection to be established
This makes the behavior of PStats more predictable, reducing missed frames at the beginning
2022-02-17 12:48:36 +01:00
rdb
cf9574b412 pstats: Add convenience method for ticking current thread only 2022-02-17 12:48:36 +01:00
rdb
aea2d6ef45 display: Release lock before notifying render thread in GraphicsEngine
Otherwise the render thread will wake up only to be blocked by the mutex right away.
2022-02-17 12:48:36 +01:00
rdb
07586c82e6 workflow: Update GitHub CI builder to Windows 2019/2022 2022-02-17 12:48:33 +01:00
Paul m. p. P
833ad89eba py_panda: Fix compilation issue with Python 3.11 2022-02-07 19:33:19 +01:00
rdb
25a468ba12 Merge branch 'release/1.10.x' 2022-02-07 19:31:04 +01:00
rdb
355cd5b4cd pstats: Remove unused field from PStatClient::InternalThread 2022-02-07 17:03:58 +01:00
rdb
77b0d2d6a7 pstats: Switch from AtomicAdjust to C++11-style atomics 2022-02-07 17:02:30 +01:00
rdb
b401884f1c makepanda: Support building with OpenEXR 3.0 or 3.1 on Windows 2022-02-07 11:12:10 +01:00
rdb
3c142a61ab makepanda: Properly detect keyboard interrupts on Windows 2022-02-07 11:10:32 +01:00
rdb
e27162df0b Merge branch 'release/1.10.x' 2022-02-06 15:32:24 +01:00
rdb
287b0d5a74 mathutil: Add proper __repr__ for LPlane class
Fixes #1248
2022-02-06 15:29:25 +01:00
rdb
a37dfa727e makepanda: Support building with mimalloc on Windows, experimentally
Partial backport of 07545bc9e318d1799ceabe8838d04d7ad9297a45 for Windows, requires building with `--override USE_MEMORY_MIMALLOC=1 --override USE_DELETED_CHAIN=UNDEF` for optimum effect
2022-02-06 15:29:25 +01:00
rdb
be2e07637f gtk-stats: Fix mouse motion detected outside strip chart graph area
Cherry-picked from 3a38543f65670b2d754838c5b08a556df1485a01
2022-02-06 15:29:25 +01:00
Disyer
bc6502a8fe makepanda: Record cache timestamps as integers rather than floats
We don't need the extra precision, in fact it is detrimental to restoring build caches in a cross-platform way.

This commit will invalidate all current build caches.

Cherry-picked from 2a904f398592ce7effedc4f12720be0cef9b6cc9 (see #1260)
2022-02-06 15:29:25 +01:00
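The idea behind this change can be illustrated with a short sketch. It is not the makepanda code: `to_cache_timestamp` and `is_up_to_date` are hypothetical helpers, and the example assumes the cross-platform problem comes from sub-second timestamp parts not surviving when a cache is restored elsewhere.

```python
def to_cache_timestamp(mtime):
    """Truncate a modification time (e.g. from os.path.getmtime) to whole seconds."""
    return int(mtime)


def is_up_to_date(cached_timestamp, current_mtime):
    """Compare a cached integer timestamp with the file's current mtime."""
    return cached_timestamp == to_cache_timestamp(current_mtime)


# A cache entry recorded where sub-second precision is available...
recorded = to_cache_timestamp(1644100165.734)
# ...still matches after a restore that kept only whole seconds.
restored_mtime = 1644100165.0
print(is_up_to_date(recorded, restored_mtime))
```

Comparing the raw floats in the same situation would report the file as changed and trigger an unnecessary rebuild.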
rdb
a12359275f makepanda: Support building with OpenSSL 1.1.1 on Windows 2022-02-06 15:29:21 +01:00
rdb
94570f20aa pgraph: Remove need for grabbing lock in RenderState destructor 2022-02-05 22:25:51 +01:00
rdb
4e925a839a makepanda: Support building with mimalloc on non-Windows
For experimentation only - it's disabled by default unless you also specify --override USE_MEMORY_MIMALLOC=1 (I did not see a discernable benefit over glibc, but more experimentation is warranted, especially with older glibc versions)
2022-02-05 22:25:44 +01:00
rdb
5bb616dca7 pstatserver: Fix compilation error with STDFLOAT_DOUBLE=1
Regression in 7da70cf9399e2703ac9cacbb6977edeb173de159

Fixes #1259
2022-02-05 22:25:36 +01:00
Disyer
2a904f3985 makepanda: Record cache timestamps as integers rather than floats
We don't need the extra precision, in fact it is detrimental to restoring build caches in a cross-platform way.

This commit will invalidate all current build caches.
2022-02-05 23:16:59 +02:00
rdb
f30e87e7d1 CMake: Add FindGTK3.cmake file 2022-02-04 23:51:07 +01:00
rdb
07545bc9e3 dtoolbase: Use mimalloc on Windows, disable USE_DELETED_CHAIN
Windows' malloc has awful performance.  mimalloc is orders of magnitude faster, even faster than DeletedBufferChain.  Therefore, only enable USE_DELETED_CHAIN on Windows when building without mimalloc.

On Linux, mimalloc doesn't appear to be measurably faster than glibc's own allocator.  Both are marginally faster than DeletedBufferChain, though, and substantially faster in the multi-threaded case, so USE_DELETED_CHAIN is disabled there in all cases.
2022-02-04 23:50:57 +01:00
rdb
46a1ad3544 pipeline: Improve performance of Thread::get_current_thread() substantially
Speedup is realised by using thread-local variables.  Note that on Windows we can't inline get_current_thread, but it's still faster this way than calling TlsGetValue.

In theory the cache line alignment should help avoid false sharing but I have not profiled that extensively.
2022-02-04 23:49:39 +01:00
rdb
39d69f13de dtoolbase: Change DeletedBufferChain to use new C++11-style atomics 2022-02-04 20:52:31 +01:00