http://www.anandtech.com/show/11740/hot-chips-microsoft-xbox-one-x-scorpio-engine-live-blog-930am-pt-430pm-utc
From the Hot Chips conference in Cupertino

- Conservative occlusion query. Perhaps Conservative Rasterization.
- OOO (Out Of Order) Rasterization, similar to Rasterizer Ordered Views. Can boost transparency/alpha/blend ROPS performance and reduce the need for shader workarounds.
For reference

GCN 5 = Vega.
- OOO (Out Of Order) Rasterization similar to Rasterizer Ordered Views
https://msdn.microsoft.com/en-us/library/windows/desktop/dn914601(v=vs.85).aspx
Rasterizer Ordered Views
This enables Order Independent Transparency (OIT) algorithms to work, which give much better rendering results when multiple transparent objects are in line with each other in a view.
- Conservative occlusion query. Perhaps Conservative Rasterization.
https://msdn.microsoft.com/en-us/library/windows/desktop/dn914594(v=vs.85).aspx
Conservative Rasterization
Conservative rasterization is useful in a number of situations, including for certainty in collision detection, occlusion culling, and visibility detection.
From http://www.anandtech.com/show/11740/hot-chips-microsoft-xbox-one-x-scorpio-engine-live-blog-930am-pt-430pm-utc#post0821123606
12:36PM EDT - 8x 256KB render caches
12:37PM EDT - 2MB L2 cache with bypass and index buffer access
12:38PM EDT - out of order rasterization, 1MB parameter cache, delta color compression, depth compression, compressed texture access
Each of the X1X GPU's Render Back Ends (RBE) has a 256 KB cache and there are 8 of them, hence a 2 MB render cache.
X1X's GPU has 2 MB L2 cache, 1 MB parameter cache and 2MB render cache. That's 5 MB of cache.
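The cache arithmetic above can be checked with a minimal sketch. The constant and function names are made up for illustration; the sizes are the ones quoted from the live blog.

```python
# Cache sizes in KB, taken from the Hot Chips live-blog figures quoted above.
# All names here are hypothetical, chosen only for this example.
RBE_COUNT = 8
RBE_CACHE_KB = 256
L2_CACHE_KB = 2 * 1024
PARAMETER_CACHE_KB = 1 * 1024


def render_cache_kb(rbe_count: int, per_rbe_kb: int) -> int:
    """Total render cache: one cache per Render Back End."""
    return rbe_count * per_rbe_kb


def total_cache_kb() -> int:
    """Render cache + L2 cache + parameter cache."""
    return (
        render_cache_kb(RBE_COUNT, RBE_CACHE_KB)
        + L2_CACHE_KB
        + PARAMETER_CACHE_KB
    )


print(render_cache_kb(RBE_COUNT, RBE_CACHE_KB) / 1024)  # 2.0 (MB)
print(total_cache_kb() / 1024)  # 5.0 (MB)
```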
https://www.amd.com/Documents/GCN_Architecture_whitepaper.pdf
The old GCN's render back end (RBE) cache. Page 13 of 18.
Once the pixels fragments in a tile have been shaded, they flow to the Render Back-Ends (RBEs). The RBEs apply depth, stencil and alpha tests to determine whether pixel fragments are visible in the final frame. The visible pixels fragments are then sampled for coverage and color to construct the final output pixels. The RBEs in GCN can access up to 8 color samples (i.e. 8x MSAA) from the 16KB color caches and 16 coverage samples (i.e. for up to 16x EQAA) from the 4KB depth caches per pixel. The color samples are blended using weights determined by the coverage samples to generate a final anti-aliased pixel color. The results are written out to the frame buffer, through the memory controllers
GCN version 1.0's RBE cache size is just 20 KB. 8x RBE = 160 KB render cache (for 7970)
For comparison, the aging RBE/ROPS of AMD's R9-290X/R9-390X: https://www.slideshare.net/DevCentralAMD/gs4106-the-amd-gcn-architecture-a-crash-course-by-layla-mah

16 RBEs, each containing 4 ROPS. Each RBE has 24 bytes of cache.
24 bytes x 16 = 384 bytes.
Each X1X RBE has 256 KB of cache. 8x RBE = 2048 KB (or 2 MB) render cache. The X1X can hold more rendering data on the chip than the Radeon HD 7970.
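To put a number on that comparison, here is a rough sketch using the per-RBE figures quoted earlier (16 KB color + 4 KB depth cache for GCN 1.0, 256 KB for X1X). The variable names are illustrative only, and the ratio covers render cache alone, not L2 or other on-chip storage.

```python
# Per-RBE cache sizes in KB, as quoted from the GCN whitepaper and the
# Hot Chips live blog. Names are hypothetical, for illustration only.
GCN1_COLOR_CACHE_KB = 16
GCN1_DEPTH_CACHE_KB = 4
GCN1_RBE_COUNT = 8  # Radeon HD 7970

X1X_RBE_CACHE_KB = 256
X1X_RBE_COUNT = 8

hd7970_render_cache_kb = (GCN1_COLOR_CACHE_KB + GCN1_DEPTH_CACHE_KB) * GCN1_RBE_COUNT
x1x_render_cache_kb = X1X_RBE_CACHE_KB * X1X_RBE_COUNT
ratio = x1x_render_cache_kb / hd7970_render_cache_kb

print(hd7970_render_cache_kb)  # 160
print(x1x_render_cache_kb)     # 2048
print(f"{ratio:.1f}x")         # 12.8x
```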
http://www.eurogamer.net/articles/digitalfoundry-2017-project-scorpio-tech-revealed
"We quadrupled the GPU L2 cache size, again for targeting the 4K performance."
The X1X GPU's 2 MB L2 cache is used for rendering in addition to the 2 MB render cache.
For comparison, Vega 56 has 4 MB L2 cache which is directly accessible by RBE/ROPS and TMUs.
http://www.anandtech.com/show/10446/the-amd-radeon-rx-480-preview
RX-480 has 5.7 billion transistors.
https://www.allaboutcircuits.com/news/microsofts-scorpio-system-on-chip/
The X1X SoC (Scorpio Engine) has 7 billion transistors. That count covers the whole chip, CPU cores included, not just the GPU.