Commit Graph

58 Commits

Author SHA1 Message Date
microsoftv 5fe7b4b8ae resource tracking for buffer atomic 2024-08-17 13:58:25 -04:00
microsoftv e006f3a6af clang 2024-08-17 01:46:56 -04:00
microsoftv 9d7e97a6dc failed attempt to fix 2024-08-17 00:29:39 -04:00
microsoftv 386e6f9e5b Merge remote-tracking branch 'upstream/main' 2024-08-16 18:45:59 -04:00
microsoftv 3471051ede attempt fix shader comp 2024-08-16 18:45:55 -04:00
Dzmitry Dubrova dcb057dd7f
misc changes, part ?/? (#441)
* gui: add option to boot a game by choosing elf file

* core: some small implementations

* fs: implement open func

* add some validations

* spirv: add image format

* video_core: add eR16Uint to formats
2024-08-16 20:16:15 +03:00
TheTurtle 1d1c88ad31
control_flow_graph: Initial divergence handling (#434)
* control_flow_graph: Initial divergence handling

* cfg: Handle additional case

* spirv: Handle tgid enable bits

* clang format

* spirv: Use proper format

* translator: Add more instructions
2024-08-16 20:05:37 +03:00
microsoftv ba1ebeb854 rebase to upstream 2024-08-14 19:48:51 -04:00
microsoftv 9de11cd6bb Shared Atomics 2024-08-14 19:02:25 -04:00
psucien 9adc638220
shader_recompiler: basic implementation of `BUFFER_STORE_FORMAT_` (#431)
* shader_recompiler: basic implementation of buffer store w\ fmt conversion

* added `Format16` dfmt
2024-08-15 00:15:07 +02:00
microsoftv 89e564a400 Merge remote-tracking branch 'upstream/main' 2024-08-14 17:24:16 -04:00
Dzmitry Dubrova 6f4e1a47b9
core: misc changes (#430)
* core: misc changes

* video_core: add some formats for detiling

* clang format
2024-08-14 20:37:05 +02:00
TheTurtle d332a5e611
spirv: Simplify shared memory handling (#427)
* spirv: Simplify shared memory handling

* spirv: Ignore clip plane

* spirv: Fix image offsets

* ir_pass: Implement shared memory lowering pass

* NVIDIA doesn't like using shared mem in fragment shader and softlocks driver

* spirv: Add log for ignoring pos1
2024-08-14 19:01:17 +03:00
microsoftv fcf4b20f30 clang 2024-08-14 02:28:15 -04:00
microsoftv d4dab94acd Clang Format & UNREACHABLE_MSG 2024-08-14 02:26:05 -04:00
microsoftv 1ff5c65586 BUFFER_ATOMIC | DS_MINMAX_U32
- Emission of BufferAtomicU32
- Addition of Buffer opcodes to IR
- Translator for BUFFER_ATOMIC Opcode
- Translators for DS_MAXMIN_U32 Opcodes
2024-08-14 01:58:58 -04:00
TheTurtle 1fb0da9b89
video_core: Crucial buffer cache fixes + proper GPU clears (#414)
* translator: Use templates for stronger type guarantees

* spirv: Define buffer offsets upfront

* Saves a lot of shader instructions

* buffer_cache: Use dynamic vertex input when available

* Fixes issues when games like dark souls rebind vertex buffers with different stride

* externals: Update boost

* spirv: Use runtime array for ssbos

* ssbos can be large and typically their size will vary, especially in generic copy/clear cs shaders

* fs: Lock when doing case insensitive search

* Dark Souls does fs lookups from different threads

* texture_cache: More precise invalidation from compute

* Fixes unrelated render targets being cleared

* texture_cache: Use hashes for protect gpu modified images from reupload

* translator: Treat V_CNDMASK as float

* Sometimes it can have input modifiers. Worst this will cause is some extra calls to uintBitsToFloat and opposite. But most often this is used as float anyway

* translator: Small optimization for V_SAD_U32

* Fix review

* clang format
2024-08-13 09:21:48 +03:00
Vinicius Rangel dfcfd62d4f
spirv: fix image sample lod/clamp/offset translation (#402)
* spirv: fix image sample lod/clamp translation

* spirv: fix image sample offsets

* fix ImageSample opcodes & offset emission
2024-08-13 09:12:38 +03:00
psucien 3d0fdf11f0
Build stabilization (#413)
* shader_recompiler: fix for float convert and debug asserts

* libraries: kernel: correct return code on invalid semaphore

* amdgpu: additional case for cb extents retrieval heuristic

* removed redundant check in assert

* amdgpu: fix for linear tiling mode detection fin color buffers

* texture_cache: fix for unexpected scheduler flushes by detiler

* renderer_vulkan: missing depth barrier

* texture_cache: missed slices in rt view; + detiler format
2024-08-12 17:23:01 +03:00
IndecisiveTurtle 3fd2abdd5b vk_graphics_pipeline: Fix regression 2024-08-08 17:01:03 +03:00
TheTurtle 381ba8c7a5
video_core: Implement guest buffer manager (#373)
* video_core: Introduce buffer cache

* video_core: Use multi level page table for caches

* renderer_vulkan: Remove unused stream buffer

* fix build

* oops forgot optimize off
2024-08-08 15:02:10 +03:00
TheTurtle 159be2c7f4
video_core: Minor fixes (#366)
* data_share: Fix DS instruction

* vk_graphics_pipeline: Fix unnecessary invalidate

* spirv: Remove subgroup id

* vector_alu: Simplify mbcnt pattern

* shader_recompiler: More instructions

* clang format

* kernel: Fix cond memory leak and reduce spam

* liverpool: Print error on exception

* build fix
2024-08-05 13:45:28 +03:00
TheTurtle a7c9bfa5c5
shader_recompiler: Small instruction parsing refactor/bugfixes (#340)
* translator: Implemtn f32 to f16 convert

* shader_recompiler: Add bit instructions

* shader_recompiler: More data share instructions

* shader_recompiler: Remove exec contexts, fix S_MOV_B64

* shader_recompiler: Split instruction parsing into categories

* shader_recompiler: Better BFS search

* shader_recompiler: Constant propagation pass for cmp_class_f32

* shader_recompiler: Partial readfirstlane implementation

* shader_recompiler: Stub readlane/writelane only for non-compute

* hack: Fix swizzle on RDR

* Will properly fix this when merging this

* clang format

* address_space: Bump user area size to full

* shader_recompiler: V_INTERP_MOV_F32

* Should work the same as spirv will emit flat decoration on demand

* kernel: Add MAP_OP_MAP_FLEXIBLE

* image_view: Attempt to apply storage swizzle on format

* vk_scheduler: Barrier attachments on renderpass end

* clang format

* liverpool: cs state backup

* shader_recompiler: More instructions and formats

* vector_alu: Proper V_MBCNT_U32_B32

* shader_recompiler: Port some dark souls things

* file_system: Implement sceKernelRename

* more formats

* clang format

* resource_tracking_pass: Back to assert

* translate: Tracedata

* kernel: Remove tracy lock

* Solves random crashes in Dark Souls

* code: Review comments
2024-07-30 23:32:40 +02:00
Vinicius Rangel 680192a0c4
64 bits OP, impl V_ADDC_U32 & V_MAD_U64_U32 (#310)
* impl V_ADDC_U32 & V_MAD_U64_U32

* shader recompiler: add 64 bits version to get register / GetSrc

* fix V_ADDC_U32 carry

* shader recompiler: removed automatic conversion to force_flt in GetSRc

* shader recompiler: auto cast between u32 and u64 during ssa pass

* shader recompiler: fix SetVectorReg64 & standardize switches-case

* shader translate: fix overflow detection in V_ADD_I32

use vcc lo instead of vcc thread bit

* shader recompiler: more 64-bit work

- removed bit_size parameter from Get[Scalar/Vector]Register
- add BitwiseOr64
- add SetDst64 as a replacement for SetScalarReg64 & SetVectorReg64
- add GetSrc64 for 64-bit value

* shader recompiler: add V_MAD_U64_U32 vcc output

- add V_MAD_U64_U32 vcc output
- ILessThan for 64-bits

* shader recompiler: removed unnecessary changes & missing consts

* shader_recompiler: Add s64 type in constant propagation
2024-07-27 17:23:59 +03:00
TheTurtle a2cd1669b6
memory: Cleanups and refactors (#324)
* memory: Various fixes

* Added (Partial) sceKernelBatchMap/sceKernelBatchMap2

* memory: Rename and implement batch unmap

* memory: Remove uneeded assert

* memory: Commonize free search routine

* memory: Contains check is inclusive

* memory: Address some alignment issues

* clang format

---------

Co-authored-by: raziel1000 <ckraziel@gmail.com>
2024-07-25 23:01:12 +03:00
psucien 64459f1a76
Surface management rework (1/3) (#307)
* amdgpu: proper CB and DB sizes calculation; minor refactoring

* texture_cache: separate file for image_info

* texture_cache: image guest address moved into image info

* texture_cache: surface size calculation

* shader_recompiler: fixed sin/cos

Thanks to red_pring and gandalfthewhite0173

* initial preparations for subresources upload

* review comments
2024-07-20 12:51:21 +03:00
TheTurtle bfe3322977
spirv: Address some regressions in buffer loads (#304)
* spirv: Use correct index

* spirv: Fix indices during buffer load

* clang-format fix

* spirv: Index can be const

---------

Co-authored-by: georgemoralis <giorgosmrls@gmail.com>
2024-07-19 19:36:07 +03:00
Vladislav Mikhalin d0d7ef06e8
Fixed buffer_store_* regression (#302) 2024-07-18 21:04:12 +03:00
georgemoralis 439c0be9a6 clang format fix 2024-07-17 17:57:54 +03:00
IndecisiveTurtle cd009cfec6 shader_recompiler: Normal gathers 2024-07-17 16:49:45 +03:00
Vladislav Mikhalin f9e96793cc
Implemented load_buffer_format_* conversions (#295)
* Implemented load_buffer_format_* conversions

* clang-format insists on ugly things
2024-07-16 15:03:07 +03:00
georgemoralis c49afb4c17
Merge pull request #287 from polybiusproxy/dev
gnmdriver: Implement shader functions
2024-07-15 07:47:33 +03:00
psucien ed37fb32a7 review comments applied 2024-07-14 23:25:41 +02:00
psucien f041276b04 recompiler: added support for discard on export with masked EXEC 2024-07-13 14:57:01 +02:00
psucien 1b94f07a6a recompiler: proper VS inputs initialization 2024-07-13 01:00:24 +02:00
Stolas 2620919f0b
Added Legacy Min/Max ops (#266)
* Forwarding V_MAX_LEGACY_F32 to V_MAX3_F32. Fixes Translation error in Geometry Wars 3.

* Forwarded to correct op

* Implemented Legacy Max/Min using NMax/NMin

* Added extra argument to Min/Max op codes

* Removed extra translator functions, replaced with bool

* Formatting
2024-07-08 12:24:12 +03:00
psucien 6dbb842bec renderer: a bit more formats to support 2024-07-07 14:34:36 +02:00
psucien cfbe8b9e6d renderer: added support for instance step rates 2024-07-06 18:03:43 +02:00
TheTurtle 38080b60af
shader_recompiler: Check usage before enabling capabilities (#245)
* vk_instance: Better feature check

* shader_recompiler: Make most features optional

* vk_instance: Bump extension vector size

* resource_tracking_pass: Perform BFS for sharp tracking

* The Witness triggered this
2024-07-06 02:42:16 +03:00
TheTurtle 6ceab6dfac
shader_recompiler: Implement most integer image atomics, workgroup barriers and shared memory load/store (#231)
* shader_recompiler: Add LDEXP

* shader_recompiler: Add most image integer atomic ops

* shader_recompiler: Implement shared memory load/store

* shader_recompiler: More image atomics

* externals: Update sirit

* clang format

* cmake: Add missing files

* shader_recompiler: Fix some atomic bugs

* shader_recompiler: Vs outputs

* shader_recompiler: Shared mem has side-effects, fix format component order

* shader_recompiler: Inline constant buffer impl

* video_core: Fix regressions

* Work

* Fixup a few things
2024-07-05 00:15:44 +03:00
IndecisiveTurtle a603bc7d88 shader_recompiler: More instructions 2024-07-01 22:42:45 +03:00
IndecisiveTurtle 20e83b4d53 clang format 2024-07-01 13:56:14 +03:00
IndecisiveTurtle b4d24d8737 renderer_vulkan: Prefer depth stencil read-only layout when possible
* Persona reads a depth attachment while it is being attached with writes disabled. Now this works without spamming vk validation errors
2024-07-01 13:56:14 +03:00
IndecisiveTurtle 22b930ba5e video_core: Track renderpass scopes properly 2024-07-01 13:56:14 +03:00
IndecisiveTurtle 5da79d4798 spirv: Add fragdepth and implement image query 2024-07-01 13:56:14 +03:00
IndecisiveTurtle 8850c2f4be shader_recompiler: Add more instructions 2024-06-22 18:09:03 +03:00
psucien be67fdc9c9 shader_recompiler: correct format for SSBO store op 2024-06-16 21:21:19 +02:00
psucien 54f8616d6a shader_recompiler: added MUL_HI VOP2 (896) 2024-06-16 20:39:53 +02:00
TheTurtle 7b1a317b09
video_core: Preliminary storage image support and more (#188)
* vk_rasterizer: Clear depth buffer when DB_RENDER_CONTROL says so

* video_core: Preliminary storage image support, more opcodes

* renderer_vulkan: a fix for vertex buffers merging

* renderer_vulkan: a heuristic for blend override when alpha out is masked

---------

Co-authored-by: psucien <bad_cast@protonmail.com>
2024-06-10 22:35:14 +03:00
TheTurtle 998d046210
video_core: Add depth buffer support and fix some bugs (#172)
* memory: Avoid crash when alignment is zero

* Also remove unused file

* shader_recompiler: Add more instructions

* Also fix some minor issues with a few existing instructions

* control_flow: Don't emit discard for null exports

* renderer_vulkan: Add depth buffer support

* liverpool: Fix wrong color buffer number type and viewport zscale

* Also add some more formats
2024-06-07 16:26:43 +03:00