m3dude1's forum posts

Avatar image for m3dude1
#1 Posted by m3dude1 (2336 posts) -

the power gap between consoles and high-end PCs will narrow considerably going forward compared to previous generations. performance increases are smaller and take much longer than ever on both the CPU and GPU front. we are also nearly at the end of silicon shrinkage. whether or not this translates to consoles taking a bigger % of the market, i don't know

#2 Posted by m3dude1 (2336 posts) -

decima engine

#3 Posted by m3dude1 (2336 posts) -

@ronvalencia: it's not that simple. many devs have discussed over the years that register pressure is much more of a problem on GCN than on NVIDIA hardware when it comes to getting high levels of utilization. lots of them post on Beyond3D
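A rough sketch of why register pressure caps utilization. The figures here are assumptions taken from commonly cited GCN numbers, not from the post above: 256 VGPRs per lane per SIMD, at most 10 wavefronts per SIMD, and VGPRs allocated in blocks of 4. The function name `waves_per_simd` is made up for illustration.

```python
# Rough occupancy estimate for a single GCN SIMD.
# ASSUMED figures (not from the forum post): 256 VGPRs per lane per SIMD,
# a hard cap of 10 wavefronts per SIMD, allocation granularity of 4 VGPRs.
VGPR_BUDGET = 256
MAX_WAVES = 10
ALLOC_GRANULARITY = 4


def waves_per_simd(vgprs_per_thread: int) -> int:
    """Return how many wavefronts fit on one SIMD for a given VGPR count."""
    if vgprs_per_thread <= 0:
        raise ValueError("a shader needs at least one VGPR")
    # Round the request up to the allocation granularity.
    allocated = -(-vgprs_per_thread // ALLOC_GRANULARITY) * ALLOC_GRANULARITY
    if allocated > VGPR_BUDGET:
        raise ValueError("shader exceeds the per-SIMD VGPR budget")
    # Fewer registers per thread means more wavefronts can be resident,
    # up to the hardware cap.
    return min(MAX_WAVES, VGPR_BUDGET // allocated)


if __name__ == "__main__":
    for vgprs in (24, 48, 84, 128):
        print(f"{vgprs:>3} VGPRs -> {waves_per_simd(vgprs)} waves per SIMD")
```

Under those assumptions a shader that needs 128 VGPRs leaves only 2 wavefronts resident to hide memory latency, versus 10 for a shader that fits in 24.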

#4 Edited by m3dude1 (2336 posts) -

@ronvalencia: i wouldn't use those numbers as anything other than a theoretical best case. Also, i meant register pressure, not cache. I had a brain fart. The reduced register pressure is probably the biggest benefit of RPM (rapid packed math) on AMD cards. Register bottlenecks are huge on AMD
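To make the register point concrete, here is a minimal CPU-side sketch of the packing idea behind packed fp16: two half-precision values share one 32-bit word, so a pair of values takes one register slot instead of two. This only models the storage layout with Python's `struct` module. The helper names are hypothetical, and whether a given compiler actually packs values this way on real hardware is an assumption.

```python
import struct
from typing import Tuple


def pack_half2(lo: float, hi: float) -> int:
    """Pack two fp16 values into one 32-bit word (lo in the low 16 bits)."""
    return struct.unpack("<I", struct.pack("<ee", lo, hi))[0]


def unpack_half2(word: int) -> Tuple[float, float]:
    """Split a 32-bit word back into its two fp16 values."""
    return struct.unpack("<ee", struct.pack("<I", word))


def add_half2(a: int, b: int) -> int:
    """Model one packed add: both fp16 lanes are added in a single call."""
    a_lo, a_hi = unpack_half2(a)
    b_lo, b_hi = unpack_half2(b)
    # Results are rounded back to fp16 when repacked.
    return pack_half2(a_lo + b_lo, a_hi + b_hi)


if __name__ == "__main__":
    x = pack_half2(1.0, 2.0)
    y = pack_half2(0.5, 0.25)
    print(hex(x), unpack_half2(add_half2(x, y)))
```

Four fp16 operands fit in the two 32-bit words `x` and `y` here, where the same values in fp32 would need four.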

#5 Posted by m3dude1 (2336 posts) -

@ronvalencia: no, it doesn't. the biggest bandwidth-related benefit of fp16 comes down to cache, which delta color compression doesn't affect at all AFAIK

#6 Posted by m3dude1 (2336 posts) -

@ronvalencia said:
@m3dude1 said:
@ronvalencia said:
@m3dude1 said:

@ronvalencia: source that the RE2 remake uses rapid packed math? also, the 2080 Ti is 31% faster than the 1080 Ti there. that's completely standard for the majority of games.

https://www.overclock3d.net/reviews/software/resident_evil_2_remake_pc_performance_review/13

Capcom has fundamentally changed Resident Evil 2, creating what the game would have been if it were created today, not what the original would look like with enhanced visuals, forging a game that will surpass the original for many. On PC we also get to see the game push beyond the other versions of the remake on a technological level, supporting advanced HBAO+ ambient occlusion, AMD's Rapid Packed Math acceleration tech, FP16 compute and other graphical settings that can push past all of the game's console version.

The DirectML API enables Rapid Packed Math and machine learning instruction set hardware access to be unified across multiple GPU vendors.

Turing's CUDA cores have full Rapid Packed Math support in addition to the Tensor matrix math cores. AMD has merged its machine learning instruction set into GCN's CUs.

Both NVIDIA and AMD are following Microsoft's DirectX12 evolution road maps.

The RTX 2080 Ti's tensor cores and rapid packed math feature set are enabled, which overlaps with workstation GPU cards; a similar argument applies to the VII vs the MI50/MI60.

DirectML doesn't exist yet outside of Microsoft R&D. RPM on AMD cards is used via their shader intrinsics, so i'm doubtful it's enabled on NVIDIA GPUs, especially considering the benchmarks. AFAIK the only game to expose fp16 on NVIDIA Turing GPUs is Wolfenstein 2, which is part of the abnormally large performance increase on Turing. there's currently no API-standard way to use RPM on both AMD and NVIDIA. it has to be done through each IHV's specific instructions outside of standard API calls

1. According to Microsoft, DirectML will use NVIDIA's Tensor hardware.

2. DirectML's metacommands expose hardware-specific optimizations. Effectively, Microsoft is building another "Xbox" on Windows PC with vendor-neutral API hardware access.

3. DirectML to perform better than hand-written compute shaders! Shader Model 6 has a short life.

4. DirectML is coming with the next major Windows 10 update.

5. DirectML has been confirmed to run on Radeon VII.

Reference

1,2,3,4, http://on-demand.gputechconf.com/siggraph/2018/video/sig1814-2-adrian-tsai-gpu-inferencing-directml-and-directx-12.html

5, https://www.guru3d.com/news-story/amd-could-do-dlss-alternative-with-radeon-vii-though-directml-api.html and https://wccftech.com/amd-radeon-vii-excellent-result-directml/

AMD: Radeon VII Has Excellent Results with DirectML; We Could Try a GPGPU Approach for Something NVIDIA DLSS-like

AMD is already working on the DirectML SDK for Radeon VII, which is outside Microsoft's R&D.

Again, AMD and NVIDIA are following Microsoft's DirectX12 evolution road map.

Turing has a heavy TFLOPS bias relative to its raster power, hence Turing more closely resembles AMD GCN than Pascal, in that portions of the GPU aren't fully used. Multi-engine allows better use of both GPUs. Over time, Turing will probably age better than some of the previous NVIDIA cards, unless NVIDIA reduces support as it did with Kepler.

RE2 still won't make use of fp16 on NVIDIA GPUs without Capcom modifying the code. when DML finally releases, it won't retrofit fp16 into existing games where it was used through AMD-specific extensions

#7 Edited by m3dude1 (2336 posts) -

@ronvalencia said:
@m3dude1 said:

@ronvalencia: source that the RE2 remake uses rapid packed math? also, the 2080 Ti is 31% faster than the 1080 Ti there. that's completely standard for the majority of games.

https://www.overclock3d.net/reviews/software/resident_evil_2_remake_pc_performance_review/13

Capcom has fundamentally changed Resident Evil 2, creating what the game would have been if it were created today, not what the original would look like with enhanced visuals, forging a game that will surpass the original for many. On PC we also get to see the game push beyond the other versions of the remake on a technological level, supporting advanced HBAO+ ambient occlusion, AMD's Rapid Packed Math acceleration tech, FP16 compute and other graphical settings that can push past all of the game's console version.

The DirectML API enables Rapid Packed Math and machine learning instruction set hardware access to be unified across multiple GPU vendors.

Turing's CUDA cores have full Rapid Packed Math support in addition to the Tensor matrix math cores. AMD has merged its machine learning instruction set into GCN's CUs.

Both NVIDIA and AMD are following Microsoft's DirectX12 evolution road maps.

The RTX 2080 Ti's tensor cores and rapid packed math feature set are enabled, which overlaps with workstation GPU cards; a similar argument applies to the VII vs the MI50/MI60.

DirectML doesn't exist yet outside of Microsoft R&D. RPM on AMD cards is used via their shader intrinsics, so i'm doubtful it's enabled on NVIDIA GPUs, especially considering the benchmarks. AFAIK the only game to expose fp16 on NVIDIA Turing GPUs is Wolfenstein 2, which is part of the abnormally large performance increase on Turing. there's currently no API-standard way to use RPM on both AMD and NVIDIA. it has to be done through each IHV's specific instructions outside of standard API calls

#8 Edited by m3dude1 (2336 posts) -

@ronvalencia: source that the RE2 remake uses rapid packed math? also, the 2080 Ti is 31% faster than the 1080 Ti there. that's completely standard for the majority of games.

#9 Posted by m3dude1 (2336 posts) -

game still looks very good, but not as good as the reveal footage. i played the recent alpha or whatever it was

#10 Edited by m3dude1 (2336 posts) -

it's not surprising PC gamers took last place. outside of the minuscule number of people who play as their job, they are just the absolute worst droolers imaginable