SlideShare una empresa de Scribd logo
1 de 73
Siggraph 2016
Tristan Lorach Manager of Developer Technology Group, NVIDIA US
7/25/2016
VULKAN AND NVIDIA:
THE ESSENTIALS
2
ANALOGY ON GRAPHIC APIS
(getting ready for my 7 years old son’s questions on my job…)
Lego Kit Derby KitCar Toy
3
Analogy
Different Valid Approaches
Fixed-function OpenGL Modern AZDO OpenGL with
Programmable Shaders
Vulkan
(adult supervision required!)(booring…) (cool... Messes-up the bedroom)
4
WHAT IS VULKAN ?
…It’s a modern API
• Designed and maintained by Khronos Group
• Designed for high performance on rendering and compute
• [Extremely] low level : no more “baby-sitting” from our driver
• Manage yourself memory, resource updates; batching; scheduling…
• [Extremely] verbose : Lots of structures to fill with parameters
• close to DX12 design…
• Opposite of OpenGL: Multi-threading friendly : Vulkan will especially shine if
multi-threading used
• But still generic enough to work on many HW vendors & platforms
5
Beneficial Vulkan Scenarios
Is your graphics
work CPU bound?
Can your graphics
creation be parallelized?
start
yes
Vulkan
friendly
yes
Your graphics
platform is fixed
You’ll
do whatever
it takes to squeeze out
Max perf.
You put
a premium on
avoiding
hitches
You can
manage your
graphics resource
allocations
yes
yes
yes
yes
6
Beneficial Vulkan Scenarios
Is your graphics
work CPU bound?
Can your graphics
creation be parallelized?
start
yes
Vulkan
friendly
yes
Your graphics
platform is fixed
You’ll
do whatever
it takes to squeeze out
Max perf.
You put
a premium on
avoiding
hitches
You can
manage your
graphics resource
allocations
yes
yes
yes
yes
Tired with OpenGL
(state-machine)
or even D3D ?
Want to learn new stuff ?
Spend lots of time coding ?
No sleep ?
Kinda…
(it’s a Yes)
Alright…
(Yes)
7
Unlikely to Benefit
Scenarios to Reconsider Coding to Vulkan
1. Need for compatibility to pre-Vulkan platforms
2. Heavily GPU-bound application
3. Heavily CPU-bound application due to non-graphics work
4. Single-threaded application, unlikely to change
5. App can target middle-ware engine, avoiding 3D graphics API dependencies
• Consider using an engine targeting Vulkan, instead of dealing with Vulkan
yourself
Good News in any case: NVIDIA OpenGL driver is great and will always be there !
8
memory GPU
BIG PICTURE –OPENGL CASE
Vertex Puller (IA)
Vertex Shader
TCS (Tessellation)
TES (Tessellation)
Tessellator
Geometry Shader
Transform Feedback
Rasterization
Fragment Shader
Per-Fragment Ops
Framebuffer
Tr. Feedback buffer
Uniform Block
Texture Fetch
Image Load/Store
Atomic Counter
Shader Storage
Element buffer (EBO)
Draw Indirect Buffer
Vertex Buffer (VBO)
Front-End
(decoder)
OpenGL
Driver
Application
FBO resources
(Textures / RB)
OpenGL
Commands
Cmd bundles
Push-Buffer
(FIFO)
cmds
OpenGL
resources
Resources
Graphics pipeline
States
Dependencies
Heap
.
.
9
Application
memory GPU
BIG PICTURE – VULKAN
Vertex Puller (IA)
Vertex Shader
TCS (Tessellation)
TES (Tessellation)
Tessellator
Geometry Shader
Transform Feedback
Rasterization
Fragment Shader
Per-Fragment Ops
Framebuffer
Tr. Feedback buffer
Uniform Block
Texture Fetch
Image Load/Store
Atomic Counter
Shader Storage
Element buffer (EBO)
Draw Indirect Buffer
Vertex Buffer (VBO)
Front-End
(decoder)
OpenGL
Driver
FBO resources
(Textures / RB)
Cmd bundles
Push-Buffer
(FIFO)
cmds
Minimal memory
management
Fewer translation,
Validation checks
And internal mgt
Resources
Pipeline
States
Cmd-buffers / queues
Dependencies
Heap
Render
Passes
Descriptor
Sets
10
Instance
VULKAN COMPONENTS
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
11
Instance
Device
VULKAN COMPONENTS
Device(s)
Queue(s)
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
12
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
VULKAN OBJECTS: DEVICE
Instance
Device(s)
Instance ~~ OpenGL Context
Instance-Layers
• Intercepting API calls for misc. purposes
• Many layers available (api-dump;
core/std/parms validation; screenshot…)
Instance-Specific Extensions
• KHR_Surface (for Swap-chains)
• EXT_debug_report
• …
Exposes some Devices…
13
HOW DOES IT LOOK ?
Instance creation
Layer properties
Extension properties
14
HOW DOES IT LOOK ?
Instance creation
etc
Get Instance-Extension’s functions
15
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
Can have many …
VULKAN OBJECTS: DEVICE
Device(s)
VkPhysicalDevice
• Capabilities
• Memory Management
• Queues
• Objects
• Buffers
• Images
• Sync Primitives
16
NVIDIA’S VULKAN CAPABILITIES
Properties listed from Physical Device
NVIDIA is almost full featured
Top to bottom: from GeForce, Quadro down to Tegra
Check http://vulkan.gpuinfo.org/listreports.php
17
NVIDIA’S VULKAN CAPABILITIESGeForce GTX 980
Tegra X1 & K1
18
HOW DOES IT LOOK ?
Device creation – enumerate physical devices; gather properties
20
HOW DOES IT LOOK ?
Device creation – Create the device (!)
21
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Queue
VULKAN COMPONENTS
22
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
QUEUES
Command queue was hidden in OpenGL Context… now explitly declared
Multiple threads can submit work to a queue (or queues)!
Queues accept GPU work via CommandBuffer submissions
few operations available around Queues:, “submit work” and “wait for idle”
Queue submissions can include sync primitives for the queue to:
Wait upon before processing the submitted work
Signal when the work in this submission is completed
Queue “families” can accept different types of work, e.g.
NVIDIA exposes 2 families: 1+16 Queues
16 for all available types of work
1 for transfer operations only (Copy Engine)
Queue
23
HOW DOES IT LOOK ?
Queue(s)
24
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
…
2ndary Command-buffer
…
…
Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Cmd.Buffer Pool
VULKAN COMPONENTS
25
SYNCHRONIZATION
events and barriers
used to synchronize work within a command
buffer or sequence of command buffers
submitted to a single queue
semaphores
used to synchronize work across queues or
across coarse-grained submissions to a
single queue
fences
used to synchronize work between the
device and the host.
Device
Queue
Cmd-buffer
Host
barrier
event
Queue
Cmd-buffer
Queue
Cmd-buffer
event
Semaphores
Fences
26
COMMAND-BUFFERS
Vulkan Rendering  Command-Buffers
Close to what GPU will get at Front-End (FIFO)
Minor translation & optimization from the Driver prior to
sending to the GPU
Each can be created either for one shot or for multiple
frames/submissions
Cannot Cmd-Buffers from GPU (command-lists can): API
calls to vkCmd…() between Begin & End
Multi-threading friendly (!)
Primary Cmd-Buffer can call many many
2ndary Cmd-Buffers
2ndary Command-
buffer
…
2ndary Command-
buffer
…
Primary Cmd-buffer
…
2ndary Cmd-buffer
…
…
Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Cmd.Buffer Pool
27
Thread 1
(Busy)
Update Work
Feed Cmd
Buffers
cmd. Buffer Pool
Create 2dary Cmd Buffer
Give out Cmd Buffers
COMMAND-BUFFERS AND MULTI-THREADING
Main thread
(Busy)
Game Work
Thread Coordination
Swapping
Collect
Submit to Q
cmd. Buffer Pool
Create 1ary Cmd Buffer
1ary Cmd calls 2dary ones
Thread 3
(Busy)
Update Work
Feed Cmd
Buffers
cmd. Buffer Pool
Create 2dary Cmd Buffer
Give out Cmd Buffers
Thread 4
(Busy)
Update Work
Feed Cmd Buffers
cmd. Buffer Pool
Create 2dary Cmd Buffer
Give out Cmd Buffers
Thread 2
(Busy)
Update Work
Feed Cmd Buffers
cmd. Buffer Pool
Create 2dary Cmd Buffer
Give out Cmd Buffers
Command Buffer Pool local to the thread
To prevent conflicts in concurrent access
28
Must not recycle a CommandBuffer for rewriting until it is no longer in flight (In flight
== GPU still consuming it on its side)
But we can’t flush the queue each frame: would break parallelism !
VkFences can be provided with a queue submission to test when a command buffer is
ready to be recycled
COMMAND BUFFER THREAD SAFETY
App Submissions to the Queue
GPU Consumes Queue
Fence A
CommandBuffer CommandBuffer CommandBuffer
Fence B
CommandBuffer CommandBuffer
Fence A Signaled to App
Rewrite command buffer
CommandBuffer
29
Frame NFrame N-1
Threads can have more than 1 Command Pool
Ring-buffer: One Command-Pool per Frame
when the frame is no longer in flight (Using Fences):
simply reset the whole Pool
Frame N-2
THREADS AND COMMAND POOLS
Thread 1 CommandPool
Command
Buffer
Command
Buffer
CommandPool
Command
Buffer
Command
Buffer
CommandPool
Command
Buffer
Command
Buffer
Thread 2 CommandPool
Command
Buffer
Command
Buffer
CommandPool
Command
Buffer
Command
Buffer
CommandPool
Command
Buffer
Command
Buffer
30
HOW DOES IT LOOK ?
Command-Buffers
Note: helpers (structs
wrappers) to use
Constructors & functors for
compact data declaration
See NVK.h/cpp in
https://github.com/nvpro-
samples
31
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
VULKAN COMPONENTS
Graphics pipeline
32
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
Descriptor-set LayoutDescriptor-set Layout
GRAPHICS PIPELINE
Snapshot of all States
Including Shaders
Pre-compiled & Immutable
Ideally: done at Initialization time
Ok at render-time *if* using the
Pipeline-Cache
Prevents validation overhead during
rendering loop
Some Render-states can be
excluded from it: they become
“Dynamic” States
Graphics pipeline
Pipeline-layout
Descriptor-set Layout
Shader Stage
Vertex Input
Tess. State
Viewport State
Rasterizations State
Multi-Sample State
Depth & Stencil State
Color Blend State
Optional Dynamic States
Viewport Scissor
Blend const Stencil Ref
Depth Bounds Depth Bias
Shader Module
Shader Stage
Rasterizations State
• depthClipEnable
• rasterizerDiscardEnable
• fillMode
• cullMode
• frontFace
• depthBiasEnable
• depthBias
• depthBiasClamp
• slopeScaledDepthBias
• lineWidth
Pipeline
cache
33
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
Descriptor-set LayoutDescriptor-set Layout
GRAPHICS PIPELINE
Graphics Pipeline must be
consistent with shaders
No “introspection”, so everything
known & prepared in advance
Vertex Input:
tells how Attributes: Locations are
attached to which Vertex Buffer at
which offset
Pipeline Layout:
Tells how to map Sets and Bindings
for the shaders at each stage (Vtx,
Fragment, Geom…)
Graphics pipeline
Pipeline-layout
Descriptor-set Layout
Shader Stage
Vertex Input
Shader Module
GLSL Code
layout(std140, set= 0 , binding= 0) uniform A { ... };
layout(std140, set= 0 , binding= 1) uniform B { … };
layout(std140, set= 1 , binding= 2) uniform C { … };
…
layout(location=0) in vec3 pos;
layout(location=1) in vec3 N;
…
void main() { …
Spir-Vcompiled
Descriptor-Set(s)
Descriptor-Sets for
…
34
HOW DOES IT LOOK ?
Graphics pipeline - setup
35
HOW DOES IT LOOK ?
Graphics pipeline creation (compact notation)
36
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
Buffer
Buffer
VULKAN COMPONENTS
37
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
BUFFERS
Highly Heterogenous. Most often used for:
Index/Vertex Buffers
Uniform Buffers (Matrices, material parameters…)
Vulkan Object: Must be bound to some Device Memory
Can be CPU accessible memory (mappable)
Can be CPU cached
Can be GPU accessible only: need a “Staging Buffer” to write into it
But most Efficient
(More on Device Memory later…)
38
COMMAND-BUFFERS: UPDATE/PUSH CONSTANTS
2 more ways to update constants/uniforms for Shaders from
the Command-Buffer
Update-Buffer: prior to Render-Pass: can target any Buffer bound
by Descriptor Sets
Push-Constants: targets a dedicated section in GLSL/SpirV
New values appended “in-band”: in the Command-Buffer
Efficient; but good for small amount of values
Primary Cmd-buffer
…
Begin Render-Pass
Draw…
vkCmdPushConstants
vkCmdUpdateBuffer()
layout(push_constant) uniform objectBuffer {
mat4 matrixObject;
vec4 diffuse;
} object;
layout(set=0 , binding = 2 ) uniform MyBuffer {
mat4 mW;
…
39
HOW DOES IT LOOK ?
Buffers – copy data in a buffer
• Create Buffer
• Create staging buffer
• Bind objects to some memory
• Command buffer for copy
• Enqueue command buffer
• Note: could be put with other
graphic commands for better
efficiency
40
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
Image
Image View
Image View
VULKAN COMPONENTS
41
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
IMAGES AND IMAGEVIEW
Images represent all kind of ‘pixel-like’ arrays
Textures: Color or Depth-Stencil
Render targets : Color and Depth-Stencil
Even Compute data
Shader Load/Store (imgLoadStore)
ImageView required to expose Images properly when specific format required
For Shaders
For Framebuffers
42
HOW DOES IT LOOK ?
Way more complex than OpenGL !
• Load image
• Create an Image
(1D/2D/3D/Cube…)
• Create an Image-View
• Aggregate layers/mipmap layers
info (offsets, sizes) in a
structure (VkBufferImageCopy)
• Aggregate layers & mipmap data
to contiguous memory
• Create staging buffer + bind
memory + copy data in it
• Use command-buffer to copy to
the image: layers and mipmaps
• Layout transition of image
for copy
• vkCmdCopyBufferToImage
• Layout transition of image
for use by shader
• Enqueue command buffer and
execute
Simple texture creation
43
HOW DOES IT LOOK ?
Simple texture creation
44
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
VULKAN COMPONENTS
Descriptor-Set
45
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
DESCRIPTOR-SET
Each DescriptorSet holds references to some resources
Descriptor-Set-Layout defines how resources must be put
together in a DescriptorSet
Command buffers can then efficiently bind any or them
They must match what shaders of each stage expect !
Descriptor-Set
Image
Memory
Image View
Buffer
Sampler
Heap
DescriptorSet Pool
Descriptor-set Layout
GLSL Code
layout(std140, set= 0 , binding= 0) uniform A { ... };
layout(std140, set= 0 , binding= 1) uniform B { … };
layout(std140, set= 1 , binding= 2) uniform C { … };
layout(set=0, binding=3) uniform sampler2D tex;
…
void main() { …
46
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
DESCRIPTOR-SET
Descriptor-set Layout A
Uniform Buffer for Vtx
Storage Buffer for frag.
Image View for frag
Descriptor-set Layout C
Uniform Buffer for Vtx
Descriptor-set Layout B
Image View for frag.
Sampler for frag. shd
Dset A1
Buffer L
Image View M
Buffer K
Dset B1
Image View Q
Sampler R
Dset C3
Buffer S
Command-buffer
…
Bind Graphics-pipeline
Bind Descriptor-Set(s)
Dset A2
Buffer O
Image View P
Buffer N
?
Memory
Heap
Image M
Buffer K
Image P
Image Q
Buffer L
Buffer N
Buffer O
Buffer S
Buffer T
Dset C2
Buffer T
Dset C1
Buffer S
?
Defined for
Layout
A+B+C
47
HOW DOES IT LOOK ?
Descriptor Set - initialization
48
HOW DOES IT LOOK ?
Descriptor Set – using it
49
DESCRIPTOR SETS AND MULTI-THREADING
Thread 1
Update Work
Update
resources
in Desc. Sets
Descriptor Pool
Allocate Desc. Sets
… more Vulkan work …
Main thread
Game Work
Thread Coordination
Swapping
synchronize
Descriptor Pool
Allocate Desc. Sets
Update resources
in Desc. Sets
… more Vulkan work …
Thread 2
Update Work
Update resources
in Desc. Sets
Descriptor Pool
Allocate Desc. Sets
… more Vulkan work …
Thread 1
Update Work
Update
resources
in Desc. Sets
Descriptor Pool
Allocate Desc. Sets
… more Vulkan work …
Thread 2
Update Work
Update resources
in Desc. Sets
Descriptor Pool
Allocate Desc. Sets
… more Vulkan work …
! Descriptor Pool local to the thread !
cmd. Buffer Pool
Create 2dary Cmd Buffer
50
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
Framebuffer Render-Pass
VULKAN COMPONENTS
51
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
VULKAN COMPONENTS
Framebuffer Render-Pass
Image
Image ViewImage ViewImage ViewImageView #0
ImageImageImage #0
Attachment Desc.
• Image #0: fmt…
• Image #1: fmt…
• Image #2: fmt…
• Image #3: fmt…
Must match
Sub-Pass #0 Desc.
• Color: Image #0
• DST: Image #1
• MSAA resolve:Image #2
Sub-Pass #1 Desc.
• Color0: Image #3
• Color1: Image #4
• DST: Image #1
• MSAA resolve:Image #5
• Framebuffer
• Simpler than OpenGL
• “Bag” or “Repository” of resource views
• No role defined for the resources
• Render-Pass
• Really defines the role of Framebuffer resources
• Can have more than 1 Sub-Pass
• Each Sub-Passes defines which Framebuffer resource to use
• invented for Tilers Arch
Sub-Pass #N Desc.
…
Default
Sub-Pass
Can use many if compatibles
52
HOW DOES IT LOOK ?
Renderpass Creation (‘compacted’ notation for Vulkan structures)
53
HOW DOES IT LOOK ?
Framebuffer Creation and use
54
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
Memory (Vid)
Heap 2
Memory (Sys)
Heap 1
VULKAN COMPONENTS
55
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
MEMORY VULKAN OBJECTS
Vulkan Objects referring to buffer(s) of data need binding to memory
Vertex/Index Buffers; Uniform Buffers; Images/Textures…
Vulkan Device exposes various Memory Heaps - Example:
heap 0: size:12,288Mb (Video Memory of my K6000)
heap 1: size:17,911Mb (System Memory of my PC)
And various Memory Types from these Heaps. Example:
Mem.Type Heap Flags
0 1 (sys.mem) -
1 0 (Video) DEVICE_LOCAL
2 1 (sys.mem) HOST_VISIBLE | HOST_COHERENT
3 1 (sys.mem) HOST_VISIBLE | HOST_COHERENT | HOST_CACHED
Tegra: Adds one more:
HOST_VISIBLE “NON-Coherent”
Memory (Vid)
Heap 2
Memory (Sys)
Heap 1
56
Device
Queue
Command-buffer
…
Graphics pipeline
Descriptor-Set
Framebuffer Render-Pass
Image
Memory
2ndary Command-buffer
…
…
Buffer
Image View
Buffer
Sampler
Image View Begin Render-Pass
Bind Graphics-pipeline
Bind Vertex/Idx Buffer(s)
Bind Descriptor-Set(s)
Draw…
End Render-Pass
Execute Commands
Update Buffer
Set misc. dynamic states
Barrier synchronization
Heap
Cmd.Buffer Pool
DescriptorSet Pool
Command-buffer
MEMORY VULKAN OBJECTS
Image (Tex)
Image View
Buffer
(Uniforms)
Image View
Buffer (Vertices)
Memory (Vid)
Heap 2
Memory (Sys)
Heap 1
matrices
Vtx Buf.
V1.xyzw
V2.xyzw…
Image (RT)
rgba
rgb
Vtx
Buf.
57
RESOURCE MANAGEMENT
Allocation and Sub allocation
HEAP supporting A,B HEAP supporting B
Allocation Type A Allocation Type B
Image
...
Cube Image Buffer
Allocate memory type from heap
Query Vulkan Object about size,
alignment & type requirements
Assign memory subregion to a resource
(allows aliasing)
BufferView BufferView
Create resource views on subranges of
a buffer or image (array slices...)
Buffer
58
RESOURCE MANAGEMENT
#HappyGPU
Memory Allocation
Buffer
UniformVertexIndex
Memory Allocation
Buffer
Index Vertex
Buffer
Uniform
Buffer
Better...
Buffer
Index Vertex
Buffer
Uniform
Buffer
Not. So. Good.
Bind same buffer with
Offsets > 0
1 buffer can have
many types of data
59
HOW DOES IT LOOK ?
Binding Memory to Objects – Simple use-case
Mem properties previously gathered at init time
60
SHADERS
Vulkan uses SPIR-V passed directly to the driver
Can be compiled from GLSL Via glslang or LunarG’s glslangValidator; Google ShaderC
theoretically other languages could be compiled to Spir-V…
Libraries available to compile GLSL to Spir-V from the application
NVIDIA allows to compile GLSL directly
glslangGLSL
Some other
language
NVIDIA VK_NV_glsl_shader: Vulkan reads GLSL directly
61
SHADERS
Multiple entry points can be defined in a single Spir-V
shader-module
Prevents redundant code: shader module used by
many Graphics-Pipelines
Specialization Constants: early setup of constants for
shaders in given Graphics-Pipeline
Allows sharing snippets of code : easier to share
common shader code
Module
vtx_main()
frag_pastic(float t)
Frag_metal()lighting()
rotate(…)
Graphics pipeline A
Vtx_main()
frag_pastic(1.1)
Graphics pipeline B
Vtx_main()
frag_metal()
Graphics pipeline C
Vtx_main()
frag_wood()
frag_wood()
Graphics pipeline A
Vtx_main()
frag_pastic(3.2)
62
VULKAN WINDOW SYSTEM INTEGRATION (WSI)
WSI manages the ownership of images via a swap chain
One image is presented while the other is rendered to
WSI is a Vulkan Extension
WSIDisplay
Swap Chain
(images)
Application
Submit image to WSI
The display owns the image
Acquire image from WSI
The application owns the image
63
NVIDIA OPENGL VULKAN INTEROP
Alternative to WSI: GL_NV_draw_vulkan_image
Create an OpenGL Context and all the usual things
Create Vulkan Device
Rendering Loop involves both OpenGL and Vulkan
Blit the Vulkan image to OpenGL backbuffer: glDrawVkImageNV
Extra care on synchronization (Semaphores)
Bonus: Mix OpenGL rendering (UI overlay…) with Vulkan
Allows smooth transition in projects
OpenGL
Vulkan
64
HOW DOES IT LOOK ?
GL_NV_draw_vulkan_image
Initialize…
65
HOW DOES IT LOOK ?
GL_NV_draw_vulkan_image
Submit commands to Queue with semaphores
Backbuffer Blit
66
PRE-REQUISITES TO WORK WITH VULKAN
Lunar-G (http://lunarg.com/ )
Vulkan Loader (+Source code)
Tools: Spir-V compiler for GLSL code and other libraries
Layers: intermediate code invoked by Vulkan API functions to
help debug
Vulkan Includes
Drivers:
GeForce Experience
https://developer.nvidia.com/vulkan-driver
NVIDIA resources: https://developer.nvidia.com/Vulkan
67
RECAP’ ON NVIDIA-SPECIFIC FEATURES
Compatible GPUs for Vulkan: Kepler and Higher; Shield Tablet; Shield Android TV
VK_NV_glsl_shader : GLSL can be directly sent to Vulkan
VK_NV_dedicated_allocation : more efficient memory usage
GL_NV_draw_vulkan_image can replace WSI
16 Queues. All available for all kind of use; 1 Queue for Copy-Engine only
3 frames (max) in flight with WSI
All Host memories are “Coherent” (except one for Tegra)
Layout transitions don’t exist in our HW (VK_IMAGE_LAYOUT_GENERAL)
Linear-Tiling only for 2D non-mipmapped textures… please avoid (bad performance)
Shaders never need re-compilation due to states in Graphics-pipeline
68
NSIGHT FOR VULKAN
69
RECAP’ ON VULKAN PHILOSOPHY
Validate as much as possible up-front (DescriptorSets; Pipelines…)
The driver doesn’t waste time on figuring-out how to set things-up
Reuse existing patterns of Graphics-Pipelines: cached pipelines
Know your application: Taylor Vulkan design according to it
Know your memory usage: You are in charge of optimal sub-allocations
Explicit multi-threading for graphics: Application’s responsibility
Explicit Resource updates: Either through [non]Coherent buffers; or Queue-Based
DMA transfers
70
FEW WORDS ON VKCPP PROJECT
C++11 to the rescue
• Open-Source Project of a C++11 overlay for Vulkan: became Khronos-offcial (!)
• Simplify Vulkan usage by
• reducing risk of errors, i.e. type safety, automatic initialization of sType, …
• Reduce #lines of written code, i.e. constructors, initalizer lists for arrays, …
• Add utility functions for common tasks (suballocators, resource tracking, …)
http://on-demand.gputechconf.com/gtc/2016/events/vulkanday/Vulkan_C++.pdf
https://developer.nvidia.com/open-source-vulkan-c-api
https://www.khronos.org/news/permalink/khronos-introduces-vulkan-hpp-open-
source-vulkan-c-api
71
VKCPP PROJECT
Two C++ based layers
Autogenerated ‚low-level‘ layer using vulkan.xml
• Type safety
• Syntactic sugar
• Lightweight layer; Keeps you closer to the real Vulkan
Hand-coded ‚high level‘ layer
• Reduce code complexity
• Exception safety, resource lifetime tracking, …
• Closer dependency with VkCpp internal implementations
72
NATIVE VULKAN VS. VKCPP CODE
Native Vulkan: ~750 lines vkCPP: ~200 lines
73
REFERENCES
Vulkan info from NVIDIA:
https://developer.nvidia.com/Vulkan
https://developer.nvidia.com/vulkan-graphics-api-here
Samples + Source code in OpenGL and Vulkan:
https://github.com/nvpro-samples
Other:
https://gameworks.nvidia.com
https://developer.nvidia.com/designworks
http://vulkan.gpuinfo.org/listreports.php
https://developer.nvidia.com/open-source-vulkan-c-api
THANK YOU
HTTPS://DEVELOPER.NVIDIA.COM/DESIGNWORKS
TLORACH@NVIDIA.COM

Más contenido relacionado

La actualidad más candente

Approaching zero driver overhead
Approaching zero driver overheadApproaching zero driver overhead
Approaching zero driver overhead
Cass Everitt
 
A Bit More Deferred Cry Engine3
A Bit More Deferred   Cry Engine3A Bit More Deferred   Cry Engine3
A Bit More Deferred Cry Engine3
guest11b095
 
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)
Johan Andersson
 
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Johan Andersson
 

La actualidad más candente (20)

Vulkan introduction
Vulkan introductionVulkan introduction
Vulkan introduction
 
Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016
 
Volumetric Lighting for Many Lights in Lords of the Fallen
Volumetric Lighting for Many Lights in Lords of the FallenVolumetric Lighting for Many Lights in Lords of the Fallen
Volumetric Lighting for Many Lights in Lords of the Fallen
 
Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)
 
Approaching zero driver overhead
Approaching zero driver overheadApproaching zero driver overhead
Approaching zero driver overhead
 
A Bit More Deferred Cry Engine3
A Bit More Deferred   Cry Engine3A Bit More Deferred   Cry Engine3
A Bit More Deferred Cry Engine3
 
vkFX: Effect(ive) approach for Vulkan API
vkFX: Effect(ive) approach for Vulkan APIvkFX: Effect(ive) approach for Vulkan API
vkFX: Effect(ive) approach for Vulkan API
 
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)
 
HDR Theory and practicce (JP)
HDR Theory and practicce (JP)HDR Theory and practicce (JP)
HDR Theory and practicce (JP)
 
OpenGL 3.2 and More
OpenGL 3.2 and MoreOpenGL 3.2 and More
OpenGL 3.2 and More
 
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
 
OpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering TechniquesOpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering Techniques
 
Siggraph 2011: Occlusion culling in Alan Wake
Siggraph 2011: Occlusion culling in Alan WakeSiggraph 2011: Occlusion culling in Alan Wake
Siggraph 2011: Occlusion culling in Alan Wake
 
Introduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevIntroduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan Nevraev
 
Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666
 
Screen Space Reflections in The Surge
Screen Space Reflections in The SurgeScreen Space Reflections in The Surge
Screen Space Reflections in The Surge
 
FrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in FrostbiteFrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in Frostbite
 
Future Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsFuture Directions for Compute-for-Graphics
Future Directions for Compute-for-Graphics
 
NVIDIA OpenGL 4.6 in 2017
NVIDIA OpenGL 4.6 in 2017NVIDIA OpenGL 4.6 in 2017
NVIDIA OpenGL 4.6 in 2017
 
DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3
 

Similar a Siggraph 2016 - Vulkan and nvidia : the essentials

Commandlistsiggraphasia2014 141204005310-conversion-gate02
Commandlistsiggraphasia2014 141204005310-conversion-gate02Commandlistsiggraphasia2014 141204005310-conversion-gate02
Commandlistsiggraphasia2014 141204005310-conversion-gate02
RubnCuesta2
 
SDVIs and In-Situ Visualization on TACC's Stampede
SDVIs and In-Situ Visualization on TACC's StampedeSDVIs and In-Situ Visualization on TACC's Stampede
SDVIs and In-Situ Visualization on TACC's Stampede
Intel® Software
 

Similar a Siggraph 2016 - Vulkan and nvidia : the essentials (20)

Migrating from OpenGL to Vulkan
Migrating from OpenGL to VulkanMigrating from OpenGL to Vulkan
Migrating from OpenGL to Vulkan
 
Introduction to Software Defined Visualization (SDVis)
Introduction to Software Defined Visualization (SDVis)Introduction to Software Defined Visualization (SDVis)
Introduction to Software Defined Visualization (SDVis)
 
Commandlistsiggraphasia2014 141204005310-conversion-gate02
Commandlistsiggraphasia2014 141204005310-conversion-gate02Commandlistsiggraphasia2014 141204005310-conversion-gate02
Commandlistsiggraphasia2014 141204005310-conversion-gate02
 
SDVIs and In-Situ Visualization on TACC's Stampede
SDVIs and In-Situ Visualization on TACC's StampedeSDVIs and In-Situ Visualization on TACC's Stampede
SDVIs and In-Situ Visualization on TACC's Stampede
 
FortranCon2020: Highly Parallel Fortran and OpenACC Directives
FortranCon2020: Highly Parallel Fortran and OpenACC DirectivesFortranCon2020: Highly Parallel Fortran and OpenACC Directives
FortranCon2020: Highly Parallel Fortran and OpenACC Directives
 
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
 
Dragon flow neutron lightning talk
Dragon flow neutron lightning talkDragon flow neutron lightning talk
Dragon flow neutron lightning talk
 
Khronos Munich 2018 - Halcyon and Vulkan
Khronos Munich 2018 - Halcyon and VulkanKhronos Munich 2018 - Halcyon and Vulkan
Khronos Munich 2018 - Halcyon and Vulkan
 
Extending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with KubernetesExtending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with Kubernetes
 
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio [Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
 
Mantle for Developers
Mantle for DevelopersMantle for Developers
Mantle for Developers
 
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
 
seven-ways-to-run-flink-on-aws.pdf
seven-ways-to-run-flink-on-aws.pdfseven-ways-to-run-flink-on-aws.pdf
seven-ways-to-run-flink-on-aws.pdf
 
Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011
 
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」
 
NFV Orchestration for Optimal Performance
NFV Orchestration for Optimal PerformanceNFV Orchestration for Optimal Performance
NFV Orchestration for Optimal Performance
 
Terraform
TerraformTerraform
Terraform
 
Embedded Recipes 2019 - Making embedded graphics less special
Embedded Recipes 2019 - Making embedded graphics less specialEmbedded Recipes 2019 - Making embedded graphics less special
Embedded Recipes 2019 - Making embedded graphics less special
 
Profiling & Testing with Spark
Profiling & Testing with SparkProfiling & Testing with Spark
Profiling & Testing with Spark
 
Docker 101
Docker 101 Docker 101
Docker 101
 

Último

VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
dharasingh5698
 
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 

Último (20)

Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
Intro To Electric Vehicles PDF Notes.pdf
Intro To Electric Vehicles PDF Notes.pdfIntro To Electric Vehicles PDF Notes.pdf
Intro To Electric Vehicles PDF Notes.pdf
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdf
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
 

Siggraph 2016 - Vulkan and nvidia : the essentials

  • 1. Siggraph 2016 Tristan Lorach Manager of Developer Technology Group, NVIDIA US 7/25/2016 VULKAN AND NVIDIA: THE ESSENTIALS
  • 2. 2 ANALOGY ON GRAPHIC APIS (getting ready for my 7 years old son’s questions on my job…) Lego Kit Derby KitCar Toy
  • 3. 3 Analogy Different Valid Approaches Fixed-function OpenGL Modern AZDO OpenGL with Programmable Shaders Vulkan (adult supervision required!)(booring…) (cool... Messes-up the bedroom)
  • 4. 4 WHAT IS VULKAN ? …It’s a modern API • Designed and maintained by Khronos Group • Designed for high performance on rendering and compute • [Extremely] low level : no more “baby-sitting” from our driver • Manage yourself memory, resource updates; batching; scheduling… • [Extremely] verbose : Lots of structures to fill with parameters • close to DX12 design… • Opposite of OpenGL: Multi-threading friendly : Vulkan will especially shine if multi-threading used • But still generic enough to work on many HW vendors & platforms
  • 5. 5 Beneficial Vulkan Scenarios Is your graphics work CPU bound? Can your graphics creation be parallelized? start yes Vulkan friendly yes Your graphics platform is fixed You’ll do whatever it takes to squeeze out Max perf. You put a premium on avoiding hitches You can manage your graphics resource allocations yes yes yes yes
  • 6. 6 Beneficial Vulkan Scenarios Is your graphics work CPU bound? Can your graphics creation be parallelized? start yes Vulkan friendly yes Your graphics platform is fixed You’ll do whatever it takes to squeeze out Max perf. You put a premium on avoiding hitches You can manage your graphics resource allocations yes yes yes yes Tired with OpenGL (state-machine) or even D3D ? Want to learn new stuff ? Spend lots of time coding ? No sleep ? Kinda… (it’s a Yes) Alright… (Yes)
  • 7. 7 Unlikely to Benefit Scenarios to Reconsider Coding to Vulkan 1. Need for compatibility to pre-Vulkan platforms 2. Heavily GPU-bound application 3. Heavily CPU-bound application due to non-graphics work 4. Single-threaded application, unlikely to change 5. App can target middle-ware engine, avoiding 3D graphics API dependencies • Consider using an engine targeting Vulkan, instead of dealing with Vulkan yourself Good News in any case: NVIDIA OpenGL driver is great and will always be there !
  • 8. 8 memory GPU BIG PICTURE –OPENGL CASE Vertex Puller (IA) Vertex Shader TCS (Tessellation) TES (Tessellation) Tessellator Geometry Shader Transform Feedback Rasterization Fragment Shader Per-Fragment Ops Framebuffer Tr. Feedback buffer Uniform Block Texture Fetch Image Load/Store Atomic Counter Shader Storage Element buffer (EBO) Draw Indirect Buffer Vertex Buffer (VBO) Front-End (decoder) OpenGL Driver Application FBO resources (Textures / RB) OpenGL Commands Cmd bundles Push-Buffer (FIFO) cmds OpenGL resources Resources Graphics pipeline States Dependencies Heap . .
  • 9. 9 Application memory GPU BIG PICTURE – VULKAN Vertex Puller (IA) Vertex Shader TCS (Tessellation) TES (Tessellation) Tessellator Geometry Shader Transform Feedback Rasterization Fragment Shader Per-Fragment Ops Framebuffer Tr. Feedback buffer Uniform Block Texture Fetch Image Load/Store Atomic Counter Shader Storage Element buffer (EBO) Draw Indirect Buffer Vertex Buffer (VBO) Front-End (decoder) OpenGL Driver FBO resources (Textures / RB) Cmd bundles Push-Buffer (FIFO) cmds Minimal memory management Fewer translation, Validation checks And internal mgt Resources Pipeline States Cmd-buffers / queues Dependencies Heap Render Passes Descriptor Sets
  • 10. 10 Instance VULKAN COMPONENTS Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool
  • 11. 11 Instance Device VULKAN COMPONENTS Device(s) Queue(s) Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer
  • 12. 12 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer VULKAN OBJECTS: DEVICE Instance Device(s) Instance ~~ OpenGL Context Instance-Layers • Intercepting API calls for misc. purposes • Many layers available (api-dump; core/std/parms validation; screenshot…) Instance-Specific Extensions • KHR_Surface (for Swap-chains) • EXT_debug_report • … Exposes some Devices…
  • 13. 13 HOW DOES IT LOOK ? Instance creation Layer properties Extension properties
  • 14. 14 HOW DOES IT LOOK ? Instance creation etc Get Instance-Extension’s functions
  • 15. 15 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer Can have many … VULKAN OBJECTS: DEVICE Device(s) VkPhysicalDevice • Capabilities • Memory Management • Queues • Objects • Buffers • Images • Sync Primitives
  • 16. 16 NVIDIA’S VULKAN CAPABILITIES Properties listed from Physical Device NVIDIA is almost full featured Top to bottom: from GeForce, Quadro down to Tegra Check http://vulkan.gpuinfo.org/listreports.php
  • 18. 18 HOW DOES IT LOOK ? Device creation – enumerate physical devices; gather properties
  • 19. 20 HOW DOES IT LOOK ? Device creation – Create the device (!)
  • 20. 21 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Queue VULKAN COMPONENTS
  • 21. 22 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer QUEUES Command queue was hidden in OpenGL Context… now explitly declared Multiple threads can submit work to a queue (or queues)! Queues accept GPU work via CommandBuffer submissions few operations available around Queues:, “submit work” and “wait for idle” Queue submissions can include sync primitives for the queue to: Wait upon before processing the submitted work Signal when the work in this submission is completed Queue “families” can accept different types of work, e.g. NVIDIA exposes 2 families: 1+16 Queues 16 for all available types of work 1 for transfer operations only (Copy Engine) Queue
  • 22. 23 HOW DOES IT LOOK ? Queue(s)
  • 23. 24 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer … 2ndary Command-buffer … … Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Cmd.Buffer Pool VULKAN COMPONENTS
  • 24. 25 SYNCHRONIZATION events and barriers used to synchronize work within a command buffer or sequence of command buffers submitted to a single queue semaphores used to synchronize work across queues or across coarse-grained submissions to a single queue fences used to synchronize work between the device and the host. Device Queue Cmd-buffer Host barrier event Queue Cmd-buffer Queue Cmd-buffer event Semaphores Fences
  • 25. 26 COMMAND-BUFFERS Vulkan Rendering  Command-Buffers Close to what GPU will get at Front-End (FIFO) Minor translation & optimization from the Driver prior to sending to the GPU Each can be created either for one shot or for multiple frames/submissions Cannot Cmd-Buffers from GPU (command-lists can): API calls to vkCmd…() between Begin & End Multi-threading friendly (!) Primary Cmd-Buffer can call many many 2ndary Cmd-Buffers 2ndary Command- buffer … 2ndary Command- buffer … Primary Cmd-buffer … 2ndary Cmd-buffer … … Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Cmd.Buffer Pool
  • 26. 27 Thread 1 (Busy) Update Work Feed Cmd Buffers cmd. Buffer Pool Create 2dary Cmd Buffer Give out Cmd Buffers COMMAND-BUFFERS AND MULTI-THREADING Main thread (Busy) Game Work Thread Coordination Swapping Collect Submit to Q cmd. Buffer Pool Create 1ary Cmd Buffer 1ary Cmd calls 2dary ones Thread 3 (Busy) Update Work Feed Cmd Buffers cmd. Buffer Pool Create 2dary Cmd Buffer Give out Cmd Buffers Thread 4 (Busy) Update Work Feed Cmd Buffers cmd. Buffer Pool Create 2dary Cmd Buffer Give out Cmd Buffers Thread 2 (Busy) Update Work Feed Cmd Buffers cmd. Buffer Pool Create 2dary Cmd Buffer Give out Cmd Buffers Command Buffer Pool local to the thread To prevent conflicts in concurrent access
  • 27. 28 Must not recycle a CommandBuffer for rewriting until it is no longer in flight (In flight == GPU still consuming it on its side) But we can’t flush the queue each frame: would break parallelism ! VkFences can be provided with a queue submission to test when a command buffer is ready to be recycled COMMAND BUFFER THREAD SAFETY App Submissions to the Queue GPU Consumes Queue Fence A CommandBuffer CommandBuffer CommandBuffer Fence B CommandBuffer CommandBuffer Fence A Signaled to App Rewrite command buffer CommandBuffer
  • 28. 29 Frame NFrame N-1 Threads can have more than 1 Command Pool Ring-buffer: One Command-Pool per Frame when the frame is no longer in flight (Using Fences): simply reset the whole Pool Frame N-2 THREADS AND COMMAND POOLS Thread 1 CommandPool Command Buffer Command Buffer CommandPool Command Buffer Command Buffer CommandPool Command Buffer Command Buffer Thread 2 CommandPool Command Buffer Command Buffer CommandPool Command Buffer Command Buffer CommandPool Command Buffer Command Buffer
  • 29. 30 HOW DOES IT LOOK ? Command-Buffers Note: helpers (structs wrappers) to use Constructors & functors for compact data declaration See NVK.h/cpp in https://github.com/nvpro- samples
  • 30. 31 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer VULKAN COMPONENTS Graphics pipeline
  • 31. 32 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer Descriptor-set LayoutDescriptor-set Layout GRAPHICS PIPELINE Snapshot of all States Including Shaders Pre-compiled & Immutable Ideally: done at Initialization time Ok at render-time *if* using the Pipeline-Cache Prevents validation overhead during rendering loop Some Render-states can be excluded from it: they become “Dynamic” States Graphics pipeline Pipeline-layout Descriptor-set Layout Shader Stage Vertex Input Tess. State Viewport State Rasterizations State Multi-Sample State Depth & Stencil State Color Blend State Optional Dynamic States Viewport Scissor Blend const Stencil Ref Depth Bounds Depth Bias Shader Module Shader Stage Rasterizations State • depthClipEnable • rasterizerDiscardEnable • fillMode • cullMode • frontFace • depthBiasEnable • depthBias • depthBiasClamp • slopeScaledDepthBias • lineWidth Pipeline cache
  • 32. 33 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer Descriptor-set LayoutDescriptor-set Layout GRAPHICS PIPELINE Graphics Pipeline must be consistent with shaders No “introspection”, so everything known & prepared in advance Vertex Input: tells how Attributes: Locations are attached to which Vertex Buffer at which offset Pipeline Layout: Tells how to map Sets and Bindings for the shaders at each stage (Vtx, Fragment, Geom…) Graphics pipeline Pipeline-layout Descriptor-set Layout Shader Stage Vertex Input Shader Module GLSL Code layout(std140, set= 0 , binding= 0) uniform A { ... }; layout(std140, set= 0 , binding= 1) uniform B { … }; layout(std140, set= 1 , binding= 2) uniform C { … }; … layout(location=0) in vec3 pos; layout(location=1) in vec3 N; … void main() { … Spir-Vcompiled Descriptor-Set(s) Descriptor-Sets for …
  • 33. 34 HOW DOES IT LOOK ? Graphics pipeline - setup
  • 34. 35 HOW DOES IT LOOK ? Graphics pipeline creation (compact notation)
  • 35. 36 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer Buffer Buffer VULKAN COMPONENTS
  • 36. 37 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer BUFFERS Highly Heterogenous. Most often used for: Index/Vertex Buffers Uniform Buffers (Matrices, material parameters…) Vulkan Object: Must be bound to some Device Memory Can be CPU accessible memory (mappable) Can be CPU cached Can be GPU accessible only: need a “Staging Buffer” to write into it But most Efficient (More on Device Memory later…)
  • 37. 38 COMMAND-BUFFERS: UPDATE/PUSH CONSTANTS 2 more ways to update constants/uniforms for Shaders from the Command-Buffer Update-Buffer: prior to Render-Pass: can target any Buffer bound by Descriptor Sets Push-Constants: targets a dedicated section in GLSL/SpirV New values appended “in-band”: in the Command-Buffer Efficient; but good for small amount of values Primary Cmd-buffer … Begin Render-Pass Draw… vkCmdPushConstants vkCmdUpdateBuffer() layout(push_constant) uniform objectBuffer { mat4 matrixObject; vec4 diffuse; } object; layout(set=0 , binding = 2 ) uniform MyBuffer { mat4 mW; …
  • 38. 39 HOW DOES IT LOOK ? Buffers – copy data in a buffer • Create Buffer • Create staging buffer • Bind objects to some memory • Command buffer for copy • Enqueue command buffer • Note: could be put with other graphic commands for better efficiency
  • 39. 40 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer Image Image View Image View VULKAN COMPONENTS
  • 40. 41 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer IMAGES AND IMAGEVIEW Images represent all kind of ‘pixel-like’ arrays Textures: Color or Depth-Stencil Render targets : Color and Depth-Stencil Even Compute data Shader Load/Store (imgLoadStore) ImageView required to expose Images properly when specific format required For Shaders For Framebuffers
  • 41. 42 HOW DOES IT LOOK ? Way more complex than OpenGL ! • Load image • Create an Image (1D/2D/3D/Cube…) • Create an Image-View • Aggregate layers/mipmap layers info (offsets, sizes) in a structure (VkBufferImageCopy) • Aggregate layers & mipmap data to contiguous memory • Create staging buffer + bind memory + copy data in it • Use command-buffer to copy to the image: layers and mipmaps • Layout transition of image for copy • vkCmdCopyBufferToImage • Layout transition of image for use by shader • Enqueue command buffer and execute Simple texture creation
  • 42. 43 HOW DOES IT LOOK ? Simple texture creation
  • 43. 44 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer VULKAN COMPONENTS Descriptor-Set
  • 44. 45 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer DESCRIPTOR-SET Each DescriptorSet holds references to some resources Descriptor-Set-Layout defines how resources must be put together in a DescriptorSet Command buffers can then efficiently bind any or them They must match what shaders of each stage expect ! Descriptor-Set Image Memory Image View Buffer Sampler Heap DescriptorSet Pool Descriptor-set Layout GLSL Code layout(std140, set= 0 , binding= 0) uniform A { ... }; layout(std140, set= 0 , binding= 1) uniform B { … }; layout(std140, set= 1 , binding= 2) uniform C { … }; layout(set=0, binding=3) uniform sampler2D tex; … void main() { …
  • 45. 46 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer DESCRIPTOR-SET Descriptor-set Layout A Uniform Buffer for Vtx Storage Buffer for frag. Image View for frag Descriptor-set Layout C Uniform Buffer for Vtx Descriptor-set Layout B Image View for frag. Sampler for frag. shd Dset A1 Buffer L Image View M Buffer K Dset B1 Image View Q Sampler R Dset C3 Buffer S Command-buffer … Bind Graphics-pipeline Bind Descriptor-Set(s) Dset A2 Buffer O Image View P Buffer N ? Memory Heap Image M Buffer K Image P Image Q Buffer L Buffer N Buffer O Buffer S Buffer T Dset C2 Buffer T Dset C1 Buffer S ? Defined for Layout A+B+C
  • 46. 47 HOW DOES IT LOOK ? Descriptor Set - initialization
  • 47. 48 HOW DOES IT LOOK ? Descriptor Set – using it
  • 48. 49 DESCRIPTOR SETS AND MULTI-THREADING Thread 1 Update Work Update resources in Desc. Sets Descriptor Pool Allocate Desc. Sets … more Vulkan work … Main thread Game Work Thread Coordination Swapping synchronize Descriptor Pool Allocate Desc. Sets Update resources in Desc. Sets … more Vulkan work … Thread 2 Update Work Update resources in Desc. Sets Descriptor Pool Allocate Desc. Sets … more Vulkan work … Thread 1 Update Work Update resources in Desc. Sets Descriptor Pool Allocate Desc. Sets … more Vulkan work … Thread 2 Update Work Update resources in Desc. Sets Descriptor Pool Allocate Desc. Sets … more Vulkan work … ! Descriptor Pool local to the thread ! cmd. Buffer Pool Create 2dary Cmd Buffer
  • 49. 50 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer Framebuffer Render-Pass VULKAN COMPONENTS
  • 50. 51 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer VULKAN COMPONENTS Framebuffer Render-Pass Image Image ViewImage ViewImage ViewImageView #0 ImageImageImage #0 Attachment Desc. • Image #0: fmt… • Image #1: fmt… • Image #2: fmt… • Image #3: fmt… Must match Sub-Pass #0 Desc. • Color: Image #0 • DST: Image #1 • MSAA resolve:Image #2 Sub-Pass #1 Desc. • Color0: Image #3 • Color1: Image #4 • DST: Image #1 • MSAA resolve:Image #5 • Framebuffer • Simpler than OpenGL • “Bag” or “Repository” of resource views • No role defined for the resources • Render-Pass • Really defines the role of Framebuffer resources • Can have more than 1 Sub-Pass • Each Sub-Passes defines which Framebuffer resource to use • invented for Tilers Arch Sub-Pass #N Desc. … Default Sub-Pass Can use many if compatibles
  • 51. 52 HOW DOES IT LOOK ? Renderpass Creation (‘compacted’ notation for Vulkan structures)
  • 52. 53 HOW DOES IT LOOK ? Framebuffer Creation and use
  • 53. 54 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer Memory (Vid) Heap 2 Memory (Sys) Heap 1 VULKAN COMPONENTS
  • 54. 55 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer MEMORY VULKAN OBJECTS Vulkan Objects referring to buffer(s) of data need binding to memory Vertex/Index Buffers; Uniform Buffers; Images/Textures… Vulkan Device exposes various Memory Heaps - Example: heap 0: size:12,288Mb (Video Memory of my K6000) heap 1: size:17,911Mb (System Memory of my PC) And various Memory Types from these Heaps. Example: Mem.Type Heap Flags 0 1 (sys.mem) - 1 0 (Video) DEVICE_LOCAL 2 1 (sys.mem) HOST_VISIBLE | HOST_COHERENT 3 1 (sys.mem) HOST_VISIBLE | HOST_COHERENT | HOST_CACHED Tegra: Adds one more: HOST_VISIBLE “NON-Coherent” Memory (Vid) Heap 2 Memory (Sys) Heap 1
  • 55. 56 Device Queue Command-buffer … Graphics pipeline Descriptor-Set Framebuffer Render-Pass Image Memory 2ndary Command-buffer … … Buffer Image View Buffer Sampler Image View Begin Render-Pass Bind Graphics-pipeline Bind Vertex/Idx Buffer(s) Bind Descriptor-Set(s) Draw… End Render-Pass Execute Commands Update Buffer Set misc. dynamic states Barrier synchronization Heap Cmd.Buffer Pool DescriptorSet Pool Command-buffer MEMORY VULKAN OBJECTS Image (Tex) Image View Buffer (Uniforms) Image View Buffer (Vertices) Memory (Vid) Heap 2 Memory (Sys) Heap 1 matrices Vtx Buf. V1.xyzw V2.xyzw… Image (RT) rgba rgb Vtx Buf.
  • 56. 57 RESOURCE MANAGEMENT Allocation and Sub allocation HEAP supporting A,B HEAP supporting B Allocation Type A Allocation Type B Image ... Cube Image Buffer Allocate memory type from heap Query Vulkan Object about size, alignment & type requirements Assign memory subregion to a resource (allows aliasing) BufferView BufferView Create resource views on subranges of a buffer or image (array slices...) Buffer
  • 57. 58 RESOURCE MANAGEMENT #HappyGPU Memory Allocation Buffer UniformVertexIndex Memory Allocation Buffer Index Vertex Buffer Uniform Buffer Better... Buffer Index Vertex Buffer Uniform Buffer Not. So. Good. Bind same buffer with Offsets > 0 1 buffer can have many types of data
  • 58. 59 HOW DOES IT LOOK ? Binding Memory to Objects – Simple use-case Mem properties previously gathered at init time
  • 59. 60 SHADERS Vulkan uses SPIR-V passed directly to the driver Can be compiled from GLSL Via glslang or LunarG’s glslangValidator; Google ShaderC theoretically other languages could be compiled to Spir-V… Libraries available to compile GLSL to Spir-V from the application NVIDIA allows to compile GLSL directly glslangGLSL Some other language NVIDIA VK_NV_glsl_shader: Vulkan reads GLSL directly
  • 60. 61 SHADERS Multiple entry points can be defined in a single Spir-V shader-module Prevents redundant code: shader module used by many Graphics-Pipelines Specialization Constants: early setup of constants for shaders in given Graphics-Pipeline Allows sharing snippets of code : easier to share common shader code Module vtx_main() frag_pastic(float t) Frag_metal()lighting() rotate(…) Graphics pipeline A Vtx_main() frag_pastic(1.1) Graphics pipeline B Vtx_main() frag_metal() Graphics pipeline C Vtx_main() frag_wood() frag_wood() Graphics pipeline A Vtx_main() frag_pastic(3.2)
  • 61. 62 VULKAN WINDOW SYSTEM INTEGRATION (WSI) WSI manages the ownership of images via a swap chain One image is presented while the other is rendered to WSI is a Vulkan Extension WSIDisplay Swap Chain (images) Application Submit image to WSI The display owns the image Acquire image from WSI The application owns the image
  • 62. 63 NVIDIA OPENGL VULKAN INTEROP Alternative to WSI: GL_NV_draw_vulkan_image Create an OpenGL Context and all the usual things Create Vulkan Device Rendering Loop involves both OpenGL and Vulkan Blit the Vulkan image to OpenGL backbuffer: glDrawVkImageNV Extra care on synchronization (Semaphores) Bonus: Mix OpenGL rendering (UI overlay…) with Vulkan Allows smooth transition in projects OpenGL Vulkan
  • 63. 64 HOW DOES IT LOOK ? GL_NV_draw_vulkan_image Initialize…
  • 64. 65 HOW DOES IT LOOK ? GL_NV_draw_vulkan_image Submit commands to Queue with semaphores Backbuffer Blit
  • 65. 66 PRE-REQUISITES TO WORK WITH VULKAN Lunar-G (http://lunarg.com/ ) Vulkan Loader (+Source code) Tools: Spir-V compiler for GLSL code and other libraries Layers: intermediate code invoked by Vulkan API functions to help debug Vulkan Includes Drivers: GeForce Experience https://developer.nvidia.com/vulkan-driver NVIDIA resources: https://developer.nvidia.com/Vulkan
  • 66. 67 RECAP’ ON NVIDIA-SPECIFIC FEATURES Compatible GPUs for Vulkan: Kepler and Higher; Shield Tablet; Shield Android TV VK_NV_glsl_shader : GLSL can be directly sent to Vulkan VK_NV_dedicated_allocation : more efficient memory usage GL_NV_draw_vulkan_image can replace WSI 16 Queues. All available for all kind of use; 1 Queue for Copy-Engine only 3 frames (max) in flight with WSI All Host memories are “Coherent” (except one for Tegra) Layout transitions don’t exist in our HW (VK_IMAGE_LAYOUT_GENERAL) Linear-Tiling only for 2D non-mipmapped textures… please avoid (bad performance) Shaders never need re-compilation due to states in Graphics-pipeline
  • 68. 69 RECAP’ ON VULKAN PHILOSOPHY Validate as much as possible up-front (DescriptorSets; Pipelines…) The driver doesn’t waste time on figuring-out how to set things-up Reuse existing patterns of Graphics-Pipelines: cached pipelines Know your application: Taylor Vulkan design according to it Know your memory usage: You are in charge of optimal sub-allocations Explicit multi-threading for graphics: Application’s responsibility Explicit Resource updates: Either through [non]Coherent buffers; or Queue-Based DMA transfers
  • 69. 70 FEW WORDS ON VKCPP PROJECT C++11 to the rescue • Open-Source Project of a C++11 overlay for Vulkan: became Khronos-offcial (!) • Simplify Vulkan usage by • reducing risk of errors, i.e. type safety, automatic initialization of sType, … • Reduce #lines of written code, i.e. constructors, initalizer lists for arrays, … • Add utility functions for common tasks (suballocators, resource tracking, …) http://on-demand.gputechconf.com/gtc/2016/events/vulkanday/Vulkan_C++.pdf https://developer.nvidia.com/open-source-vulkan-c-api https://www.khronos.org/news/permalink/khronos-introduces-vulkan-hpp-open- source-vulkan-c-api
  • 70. 71 VKCPP PROJECT Two C++ based layers Autogenerated ‚low-level‘ layer using vulkan.xml • Type safety • Syntactic sugar • Lightweight layer; Keeps you closer to the real Vulkan Hand-coded ‚high level‘ layer • Reduce code complexity • Exception safety, resource lifetime tracking, … • Closer dependency with VkCpp internal implementations
  • 71. 72 NATIVE VULKAN VS. VKCPP CODE Native Vulkan: ~750 lines vkCPP: ~200 lines
  • 72. 73 REFERENCES Vulkan info from NVIDIA: https://developer.nvidia.com/Vulkan https://developer.nvidia.com/vulkan-graphics-api-here Samples + Source code in OpenGL and Vulkan: https://github.com/nvpro-samples Other: https://gameworks.nvidia.com https://developer.nvidia.com/designworks http://vulkan.gpuinfo.org/listreports.php https://developer.nvidia.com/open-source-vulkan-c-api

Notas del editor

  1. Some applications might be unable to anticipate every single combination of states/shaders in an exhaustive set of graphics-pipelines. The Pipeline cache is here for that: allows more flexibility in recycling existing pipelines that would match the required combination. If not, the application would create it and add it to this cache.
  2. The WSI extension allows you to coordinate rendering operations with display. NVIDIA hardware allows for 2 frames in flight at any one time, so by leveraging what WSI refers to as a swap chain, Vulkan can be presenting 1 image whilst you’re rendering to another. So a swap chain is really just an array of images with the notion of ‘ownership’ attached to them. WSI allows you to acquire a free swap chain image that you can then use for rendering and to present a swap chain image that WSI will then display. When you ask WSI for an image it will wait until one becomes available and then give it to you. Once WSI has given you that image, your application owns it until you give it back to WSI. Giving an image back to WSI is essentially presenting it. So when you present an image, you give ownership of that image back to WSI. WSI then owns that image until you acquire it again.