Now GLRasteizationContexts require having an active GLContext. This will
allow preserving GLContexts and possibly framebuffers between
rasterization sessions, improving GL Rasterization performance.
Linux Before:
+ Painting Per Tile 4.5559 4.3392 1.6920 18.5548 74
Painting 170.1554 151.8353 0.0008 350.1093 28
Linux After:
+ Painting Per Tile 3.8726 3.1299 1.5848 12.6732 62
Painting 13.5480 10.8947 0.0029 39.1198 23
To actually make the multiprocess communication work, we'll need to
reroute the task creation to the pipeline or the compositor. But this
works as a first step.
profile: Make the time and memory profilers run over IPC.
Uses a couple of extra threads to work around the lack of cross-process
boxed trait objects.
r? @nnethercote
<!-- Reviewable:start -->
[<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/servo/servo/6629)
<!-- Reviewable:end -->
We currently store LayerBuffers, because previously NativeSurfaces did
not record their own size. Now we can store NativeSurfaces directly,
which saves a bit of space in the surface cache and allows us to create
LayerBuffers only in the PaintTask.
This also means that instead of sending cached LayerBuffers, the
compositor can just send cached NativeSurfaces to the PaintTask.
By doing this on either side of the call to the relevant tasks' start()
method, we don't need to store the mem::ProfilerChan or the reporter
name in the task itself.
This commit introduces the `serde` dependency, which we will use to
serialize messages going between processes in multiprocess Servo.
This also adds a new debugging flag, `-Z print-display-list-json`,
allowing the output of display list serialization to be visualized.
This will be useful for our experiments with alternate rasterizers.
GLRasterizationContext is now responsible for doing GPU rasterization.
It can coexist with its target NativeSurface, so we don't have to
continually recreate NativeSurfaces when doing GPU rasterization.
Now that NativeDisplay can be shared between the compositor and the
paint task, we can move the LayerBuffer cache to the compositor. This
allows surfaces to be potentially reused between different paint tasks
and will eventually allow OpenGL contexts to be preserved between
instances of GL rasterization.
* Add parser support for 3d transforms.
* Change ComputedMatrix to a representation that suits interpolation.
* Switch stacking contexts to use 4x4 matrices.
The transforms themselves are still converted to 2d and handled by azure for now, but this is a small standalone part that can be landed now to make it easier to review.
<!-- Reviewable:start -->
[<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/servo/servo/6214)
<!-- Reviewable:end -->
* Add parser support for 3d transforms.
* Change ComputedMatrix to a representation that suits interpolation.
* Switch stacking contexts to use 4x4 matrices.
The transforms themselves are still converted to 2d and handled by azure for now, but this is a small standalone part that can be landed now to make it easier to review.
Example output from the memory profiler:
```
| 1.04 MiB -- url(http://en.wikipedia.org/wiki/Main_Page)
| 0.26 MiB -- display-list
| 0.78 MiB -- paint-task # new output line
| 0.78 MiB -- buffer-map # new output line
```
The buffer maps aren't huge, but they're worth measuring, and it's good
to get the memory profiler plumbing into PaintTask.
The basic idea is it's safe to output an image for reftest by testing:
- That the compositor doesn't have any animations active.
- That the compositor is not waiting on any outstanding paint messages to arrive.
- That the script tasks are "idle" and therefore won't cause reflow.
- This currently means page loaded, onload fired, reftest-wait not active, first reflow triggered.
- It could easily be expanded to handle pending timers etc.
- That the "epoch" that the layout tasks have last laid out after script went idle, is reflected by the compositor in all visible layers for that pipeline.
A rebuild after touching components/profile/mem.rs now takes 48 seconds (and
only rebuilds `profile` and `servo`) which is much lower than it used to be.
In comparison, a rebuild after touching components/profile_traits/mem.rs takes
294 seconds and rebuilds many more crates.
This change also removes some unnecessary crate dependencies in `net` and
`net_traits`.
* Simpler image cache API for clients to use.
* Significantly fewer threads.
* One thread for image cache task (multiplexes commands, decoder threads and async resource requests).
* 4 threads for decoder worker tasks.
* Removed ReflowEvent hacks in script and layout tasks.
* Image elements pass a Trusted<T> to image cache, which is used to dirty nodes via script task. Previous use of Untrusted addresses was unsafe.
* Image requests such as background-image on layout / paint threads trigger repaint only rather than full reflow.
* Add reflow batching for when multiple images load quickly.
* Reduces the number of paints loading wikipedia from ~95 to ~35.
* Reasonably simple to add proper prefetch support in a follow up PR.
* Async loaded images always construct Image fragments now, instead of generic.
* Image fragments support the image not being present.
* Simpler implementation of synchronous image loading for reftests.
* Removed image holder.
* image.onload support.
* image NaturalWidth and NaturalHeight support.
* Updated WPT expectations.
Transition events are not yet supported, and the only animatable
properties are `top`, `right`, `bottom`, and `left`. However, all other
features of transitions are supported. There are no automated tests at
present because I'm not sure how best to test it, but three manual tests
are included.
- Most of util::memory has been moved into profile::mem, though the
`SizeOf` trait and related things remain in util::memory. The
`SystemMemoryReporter` code is now in a submodule
profile::mem::system_reporter.
- util::time has been moved entirely into profile::time.
Previously this used the number of layout threads to allocate the
threadpool. This also makes the member name consistent with the rest of
the structure.