// 01 — render showcase
Rendering Systems
Three rendering pipelines are implemented: Forward+ for dynamic multi-light scenes,
Deferred for decoupling geometry from lighting, and Deferred PBR for
physically-based materials. All three share the same Shadow Atlas system for multi-light shadow rendering and the
same velocity pass (used by the motion blur post-process); the two deferred pipelines also share the G-Buffer.
Forward+
Deferred
Deferred PBR — metallic / roughness / ao
In the PBR pipeline, the G-Buffer stores albedo, normal, metallic, roughness and AO. The lighting pass
reconstructs world position from depth and evaluates the Cook-Torrance BRDF per light.
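As a rough illustration of the per-light evaluation described above, here is a minimal C++ sketch of the specular Cook-Torrance term. The text only names the BRDF, so the specific terms used here (GGX distribution, Schlick-GGX geometry, Schlick Fresnel with a scalar F0) are assumptions based on the most common PBR formulation, and all function names are hypothetical rather than taken from the engine's shaders.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 normalize(Vec3 v) {
    float len = std::sqrt(dot(v, v));
    assert(len > 0.0f);
    return v * (1.0f / len);
}

constexpr float kPi = 3.14159265358979f;

// GGX / Trowbridge-Reitz normal distribution term D (assumed choice).
inline float DistributionGGX(float NdotH, float roughness) {
    float a = roughness * roughness;
    float a2 = a * a;
    float d = NdotH * NdotH * (a2 - 1.0f) + 1.0f;
    return a2 / (kPi * d * d);
}

// Schlick-GGX geometry term for one direction, with the direct-lighting k.
inline float GeometrySchlickGGX(float NdotX, float roughness) {
    float r = roughness + 1.0f;
    float k = (r * r) / 8.0f;
    return NdotX / (NdotX * (1.0f - k) + k);
}

// Schlick Fresnel approximation; a scalar F0 keeps the sketch short.
inline float FresnelSchlick(float cosTheta, float f0) {
    return f0 + (1.0f - f0) * std::pow(1.0f - cosTheta, 5.0f);
}

// Specular term D*G*F / (4 (N.L)(N.V)) for unit vectors N, V, L.
inline float CookTorranceSpecular(Vec3 N, Vec3 V, Vec3 L, float roughness, float f0) {
    float NdotL = std::max(dot(N, L), 0.0f);
    float NdotV = std::max(dot(N, V), 0.0f);
    if (NdotL <= 0.0f || NdotV <= 0.0f) return 0.0f;  // light or viewer below the surface
    Vec3 H = normalize(V + L);                         // half vector
    float NdotH = std::max(dot(N, H), 0.0f);
    float VdotH = std::max(dot(V, H), 0.0f);
    float D = DistributionGGX(NdotH, roughness);
    float G = GeometrySchlickGGX(NdotV, roughness) * GeometrySchlickGGX(NdotL, roughness);
    float F = FresnelSchlick(VdotH, f0);
    return (D * G * F) / (4.0f * NdotL * NdotV);
}
```

In a real lighting pass the result would be combined with a diffuse term, multiplied by the light radiance and N·L, and F0 would be an RGB value derived from albedo and metallic.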
G-Buffer Passes
The deferred pipeline separates geometry from lighting into discrete, inspectable passes. Visualized with
RenderDoc.
Depth Pass
Color Pass
Shadow Atlas
Lighting Pass
Post-Process Pipeline
Three post-process effects chained together over a single scene target. Each effect consumes the previous one's
output texture, forming a MotionBlur → DoF → Bloom stack. Motion blur uses a dedicated velocity
pass that encodes per-pixel screen-space velocity from current and previous view-projection matrices.
Scene Target → Velocity Pass → Motion Blur → Depth of Field → Bloom → Blit to Swapchain
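The velocity pass described above can be sketched on the CPU as follows. This is a simplified C++ illustration, assuming a static world position and camera motion only; the row-major `Mat4` alias and the names `Transform`, `ToNdc` and `ScreenSpaceVelocity` are hypothetical and not the engine's own.

```cpp
#include <array>
#include <cassert>

struct Vec2 { float x, y; };
struct Vec4 { float x, y, z, w; };
using Mat4 = std::array<std::array<float, 4>, 4>;  // row-major: m[row][col]

inline Mat4 Identity() {
    Mat4 m{};  // zero-initialized
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}

// Matrix * column vector.
inline Vec4 Transform(const Mat4& m, Vec4 v) {
    return {
        m[0][0] * v.x + m[0][1] * v.y + m[0][2] * v.z + m[0][3] * v.w,
        m[1][0] * v.x + m[1][1] * v.y + m[1][2] * v.z + m[1][3] * v.w,
        m[2][0] * v.x + m[2][1] * v.y + m[2][2] * v.z + m[2][3] * v.w,
        m[3][0] * v.x + m[3][1] * v.y + m[3][2] * v.z + m[3][3] * v.w,
    };
}

// Perspective divide: clip space -> normalized device coordinates.
inline Vec2 ToNdc(Vec4 clip) {
    assert(clip.w != 0.0f);
    return {clip.x / clip.w, clip.y / clip.w};
}

// Velocity in UV units: NDC spans [-1, 1] while UV spans [0, 1], hence the 0.5 scale.
inline Vec2 ScreenSpaceVelocity(const Mat4& currViewProj, const Mat4& prevViewProj,
                                Vec4 worldPos) {
    Vec2 curr = ToNdc(Transform(currViewProj, worldPos));
    Vec2 prev = ToNdc(Transform(prevViewProj, worldPos));
    return {(curr.x - prev.x) * 0.5f, (curr.y - prev.y) * 0.5f};
}
```

A blur shader would typically sample the scene color several times along this vector; handling moving objects would additionally require each object's previous model matrix.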
Motion Blur
Depth of Field
Bloom
All three effects combined
// 02 — additional visual systems
SkyBox
A cubemap-based skybox rendered as the last pass in the frame, after all geometry and lights, to avoid
overdraw. The six faces are loaded at engine startup and bound as a cube sampler in the descriptor set. In the
PBR pipeline, the cubemap is also sampled for image-based lighting (IBL) — visible in the
specular reflections on the metal spheres below.
SkyBox + PBR cubemap reflections
std::string facesCubeMap[6] = {
    "../data/Skybox/right.jpg", "../data/Skybox/left.jpg",
    "../data/Skybox/top.jpg",   "../data/Skybox/bottom.jpg",
    "../data/Skybox/front.jpg", "../data/Skybox/back.jpg"
};
HTML html{ 1200, 1000, "Demo", facesCubeMap };
// SkyBox is initialized inside VulkanWrapper and bound globally
// PBR system reads the cubemap sampler for IBL specular contribution
Terrain
A textured terrain whose heights come from simplex noise or a heightmap. The system generates the mesh geometry
at load time and submits it through the standard render path, so it benefits from the same PBR materials and
shadow atlas as any other mesh in the scene.
Terrain
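A minimal C++ sketch of load-time terrain mesh generation, assuming a regular grid with two triangles per cell. The height source is passed in as a callback so it can stand for either simplex noise or a heightmap lookup; `TerrainVertex`, `TerrainMesh` and `BuildTerrainGrid` are hypothetical names, not the engine's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

struct TerrainVertex { float x, y, z; float u, v; };

struct TerrainMesh {
    std::vector<TerrainVertex> vertices;
    std::vector<uint32_t> indices;
};

// Builds a cols x rows vertex grid; heightAt(c, r) supplies the height of each vertex.
TerrainMesh BuildTerrainGrid(uint32_t cols, uint32_t rows, float cellSize,
                             const std::function<float(uint32_t, uint32_t)>& heightAt) {
    assert(cols >= 2 && rows >= 2);
    TerrainMesh mesh;
    mesh.vertices.reserve(static_cast<std::size_t>(cols) * rows);
    for (uint32_t r = 0; r < rows; ++r) {
        for (uint32_t c = 0; c < cols; ++c) {
            mesh.vertices.push_back({c * cellSize, heightAt(c, r), r * cellSize,
                                     float(c) / float(cols - 1),
                                     float(r) / float(rows - 1)});
        }
    }
    // Two triangles per grid cell.
    mesh.indices.reserve(static_cast<std::size_t>(cols - 1) * (rows - 1) * 6);
    for (uint32_t r = 0; r + 1 < rows; ++r) {
        for (uint32_t c = 0; c + 1 < cols; ++c) {
            uint32_t i0 = r * cols + c;  // current row, left
            uint32_t i1 = i0 + 1;        // current row, right
            uint32_t i2 = i0 + cols;     // next row, left
            uint32_t i3 = i2 + 1;        // next row, right
            mesh.indices.insert(mesh.indices.end(), {i0, i2, i1, i1, i2, i3});
        }
    }
    return mesh;
}
```

Normals are omitted here; they could be derived from neighboring heights before the mesh is uploaded.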
Particle System
CPU-side particle simulation (position, velocity, lifetime, color) with a dedicated
ParticleRenderSystem that submits billboarded quads to the GPU each frame. The simulation logic was
written by my teammate; my contribution was the Vulkan rendering side — descriptor sets, vertex
buffers, the render pass integration and synchronization with the main frame.
Particle System — Vulkan port
Note — Particle simulation logic authored by teammate. Vulkan integration
(ParticleRenderSystem, descriptor layout, frame sync) by me.
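To illustrate what "billboarded quads" means on the rendering side, here is a small C++ sketch that builds the four corners of a camera-facing quad from the camera's right and up vectors. It is an assumption about one common way to do this on the CPU; `BillboardCorners` is a hypothetical helper, and the real ParticleRenderSystem may expand quads differently (for example in the vertex shader).

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }

// Returns corners in the order: bottom-left, bottom-right, top-right, top-left.
// camRight and camUp are the camera's unit right/up axes in world space.
inline std::array<Vec3, 4> BillboardCorners(Vec3 center, Vec3 camRight, Vec3 camUp,
                                            float halfSize) {
    assert(halfSize > 0.0f);
    Vec3 r = camRight * halfSize;
    Vec3 u = camUp * halfSize;
    return {center - r - u, center + r - u, center + r + u, center - r + u};
}
```

With this corner order, the two triangles of each particle can be indexed as (0, 1, 2) and (0, 2, 3).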
// 02 — procedural content
Procedural Generation Systems
Two procedural systems built as production-style tools: an L-System based plant and tree generator with five
species, and a Wave Function Collapse tilemap generator. Both are integrated directly into the engine as
standalone demos.
L-System Plant Generator
Symbol alphabet (F, X, +, −, [, ], &, ^, \) that expands iteratively via rewriting rules. Five distinct
plant types — stochastic, spiral, symmetric, and standard variants. Randomness is injected at the
rule-expansion stage, producing different geometry on every run.
Foliage / Forest Generator
Combines the L-System generator with Simplex Noise for biome-aware placement. A threshold parameter
controls forest density. Each tree receives a random rotation and iteration count, giving organic variation
across the landscape.
L-Systems are a long-established technique for procedural vegetation, the same problem space addressed by
tools such as SpeedTree and Houdini.
Wave Function Collapse
Constraint-based tilemap generator. Each tile type defines which neighbors are valid, and the algorithm
resolves the grid by collapsing the cells with the lowest entropy first. Every adjacency in the finished grid
satisfies those rules, so water never sits next to dirt without sand in between.
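The collapse-and-propagate loop can be sketched in C++ as below. This is a simplified illustration, not the engine's generator: it assumes three tiles (water, sand, dirt), 4-neighbor adjacency, uniform tile weights (so "entropy" reduces to the number of remaining options), and hypothetical names throughout. With this particular rule set sand is compatible with everything, so no cell can run out of options and no backtracking is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <random>
#include <vector>

enum Tile : uint8_t { kWater = 1, kSand = 2, kDirt = 4, kAll = 7 };

// Union of the tiles allowed next to any tile still possible in `mask`.
inline uint8_t AllowedNeighbors(uint8_t mask) {
    uint8_t out = 0;
    if (mask & kWater) out |= kWater | kSand;
    if (mask & kSand)  out |= kAll;
    if (mask & kDirt)  out |= kSand | kDirt;
    return out;
}

inline int CountOptions(uint8_t mask) {
    return (mask & 1) + ((mask >> 1) & 1) + ((mask >> 2) & 1);
}

// Returns one tile bit per cell, row-major.
std::vector<uint8_t> CollapseGrid(int width, int height, uint32_t seed) {
    assert(width > 0 && height > 0);
    std::vector<uint8_t> cells(static_cast<std::size_t>(width) * height, kAll);
    std::mt19937 rng(seed);
    for (;;) {
        // 1. Pick the undecided cell with the fewest remaining options.
        int best = -1, bestCount = 4;
        for (int i = 0; i < static_cast<int>(cells.size()); ++i) {
            int n = CountOptions(cells[i]);
            if (n > 1 && n < bestCount) { best = i; bestCount = n; }
        }
        if (best < 0) break;  // every cell is collapsed

        // 2. Collapse it to one of its remaining options at random.
        std::vector<uint8_t> options;
        for (uint8_t t : {kWater, kSand, kDirt})
            if (cells[best] & t) options.push_back(t);
        cells[best] = options[rng() % options.size()];

        // 3. Propagate the new constraint to neighbors until nothing changes.
        std::vector<int> stack{best};
        const int dx[4] = {1, -1, 0, 0};
        const int dy[4] = {0, 0, 1, -1};
        while (!stack.empty()) {
            int cur = stack.back();
            stack.pop_back();
            int cx = cur % width, cy = cur / width;
            for (int d = 0; d < 4; ++d) {
                int nx = cx + dx[d], ny = cy + dy[d];
                if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                int ni = ny * width + nx;
                uint8_t narrowed = cells[ni] & AllowedNeighbors(cells[cur]);
                if (narrowed != cells[ni]) {
                    cells[ni] = narrowed;
                    stack.push_back(ni);
                }
            }
        }
    }
    return cells;
}
```

Rule sets that can produce contradictions (a cell left with zero options) would need a restart or backtracking step on top of this loop.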
WFC is the algorithm behind many modern level-generation
pipelines.
// 03 — pipeline deep dive
Engine Architecture
The HTML class is the single public interface the user sees. Internally it owns and orchestrates all subsystems
through unique_ptr. No subsystem knows about the others — HTML coordinates them.
Frame Timeline
Every frame follows a fixed sequence. Input and logic run first, then rendering, then ImGui overlay, then
physics step. The separation is explicit — no rendering during physics, no physics during rendering.
Shadow Atlas
Instead of one shadow map per light (which requires N render passes and N textures), all shadow maps are packed
into a single atlas texture. The stepLight loop distributes slots: the directional light gets 1 slot, each point
light needs 6 (one per cubemap face), and each spot light needs 1.
// shadow atlas - slot distribution
int numberOfLights = 1;                                 // Directional
numberOfLights += (int)pointLights.size() * 6;          // 6 faces per point light
numberOfLights += (int)spotLights.size();               // 1 per spot light
int maxPerBatch = GetMaxNumberLightsShadowAtlas();
for (int i = 0; i < numberOfLights; i += maxPerBatch) {
    InitRenderProcess(frameInfo, i, dirPosReference);   // -> ShadowMap + GBuffer for this batch
    RenderObjects(frameInfo, i, dirPosReference);       // -> Lighting pass
    if (i + maxPerBatch < numberOfLights)
        vkWrapper_.EndOffScreenRenderPass(frameInfo);   // Close pass, start next batch
}
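The slot counting above pairs naturally with a mapping from slot index to a region of the atlas texture. The following C++ sketch shows one possible mapping, assuming a square atlas divided into equally sized tiles; `AtlasRect`, `SlotToRect`, `CountShadowSlots` and their parameters are hypothetical and not the engine's Shadow Atlas API.

```cpp
#include <cassert>
#include <cstdint>

struct AtlasRect { uint32_t x, y, width, height; };

// Total slots needed: 1 directional + 6 per point light + 1 per spot light.
inline uint32_t CountShadowSlots(uint32_t pointLights, uint32_t spotLights) {
    return 1u + pointLights * 6u + spotLights;
}

// Maps a slot index to its tile in a square atlas of tilesPerRow x tilesPerRow tiles.
inline AtlasRect SlotToRect(uint32_t slot, uint32_t atlasSize, uint32_t tilesPerRow) {
    assert(tilesPerRow > 0 && atlasSize % tilesPerRow == 0);
    assert(slot < tilesPerRow * tilesPerRow);
    uint32_t tile = atlasSize / tilesPerRow;  // tile edge length in texels
    return {(slot % tilesPerRow) * tile, (slot / tilesPerRow) * tile, tile, tile};
}
```

The returned rectangle could be used as the viewport and scissor when rendering a light's depth, and as a UV offset and scale when sampling the atlas in the lighting pass.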
G-Buffer Layout
The Deferred pipeline writes geometry data to a set of render targets in a first pass, then the lighting pass
reads from them. Because geometry is rasterized once regardless of the light count, adding lights costs only
extra per-pixel shading work instead of extra geometry passes.
// G-Buffer Attachments - Deferred PBR
Culling System
Before any draw call, objects outside the camera frustum are skipped entirely. A bounding sphere is generated
at model load time. Each frame, the sphere is tested against the 6 frustum planes. On Android, distance-based
culling is used instead — simpler and cheaper on mobile GPUs.
// For each entity with a MeshComponent:
const BoundingSphere& sphere = mesh.boundingSphere;  // center + radius, generated at load
bool visible = true;
for (const Plane& plane : frustum.planes) {          // 6 planes: near, far, left, right, top, bottom
    float dist = dot(plane.normal, sphere.center) + plane.d;  // signed distance to the plane
    if (dist < -sphere.radius) {                     // fully behind this plane
        visible = false;                             // -> CULL
        break;
    }
}
if (visible) { /* submit to draw call list */ }
// Android fallback:
if (distance(camera.pos, sphere.center) > maxDistance) { /* CULL */ }
Frustum culling is one of the most effective CPU-side optimizations in the engine. With 1000+ entities, skipping
invisible objects before they reach the GPU removes a significant fraction of draw calls at no GPU cost.
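A self-contained C++ version of the sphere-versus-frustum test, together with the distance-based fallback. It assumes plane normals point toward the inside of the frustum, which matches the sign convention of the excerpt above; `MakeBoxFrustum` is a hypothetical helper that builds an axis-aligned box purely so the test can be exercised without a camera.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Plane { Vec3 normal; float d; };  // points p with dot(normal, p) + d >= 0 are inside
struct BoundingSphere { Vec3 center; float radius; };
using Frustum = std::array<Plane, 6>;

// True unless the sphere lies entirely behind at least one plane.
inline bool IsSphereVisible(const Frustum& frustum, const BoundingSphere& sphere) {
    for (const Plane& plane : frustum) {
        float dist = dot(plane.normal, sphere.center) + plane.d;
        if (dist < -sphere.radius) return false;
    }
    return true;
}

// Distance-based fallback; compares squared distances to avoid a square root.
inline bool IsWithinDistance(Vec3 cameraPos, const BoundingSphere& sphere, float maxDistance) {
    assert(maxDistance >= 0.0f);
    Vec3 d{sphere.center.x - cameraPos.x, sphere.center.y - cameraPos.y,
           sphere.center.z - cameraPos.z};
    return dot(d, d) <= maxDistance * maxDistance;
}

// Demo helper: a cube [-h, h]^3 expressed as six inward-facing planes.
inline Frustum MakeBoxFrustum(float h) {
    Frustum f;
    f[0] = {{ 1.0f,  0.0f,  0.0f}, h};
    f[1] = {{-1.0f,  0.0f,  0.0f}, h};
    f[2] = {{ 0.0f,  1.0f,  0.0f}, h};
    f[3] = {{ 0.0f, -1.0f,  0.0f}, h};
    f[4] = {{ 0.0f,  0.0f,  1.0f}, h};
    f[5] = {{ 0.0f,  0.0f, -1.0f}, h};
    return f;
}
```

The test is conservative: a sphere near a frustum corner can pass all six plane checks while being just outside, which costs an occasional extra draw call but never hides a visible object.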
Image Barriers & Synchronization
Vulkan performs almost no implicit synchronization: image layout transitions and dependencies between pipeline
stages must be declared explicitly. The post-process chain requires a sequence of barriers to hand each image off
correctly between passes.
// After geometry pass: make the scene target's color writes visible to the post-process shader.
// oldLayout == newLayout here, so this barrier adds a memory dependency without changing the layout.
VkImageMemoryBarrier barrier{};
barrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
barrier.oldLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
barrier.newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
barrier.srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT; // wait until color writes finish
barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;            // safe to read in fragment shader
vkCmdPipelineBarrier(cmd,
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, // source stage
    VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,         // destination stage
    ...);
// BlitTexture: transition post-process output -> scene target
// src: SHADER_READ_ONLY -> TRANSFER_SRC
// dst: SHADER_READ_ONLY -> TRANSFER_DST
// -> vkCmdBlitImage
// -> final barrier: TRANSFER_DST -> SHADER_READ_ONLY
Post-Process Chain
Each effect writes to its own output texture. viewbase and sampler are redirected
after each effect, so the chain requires no intermediate copies — the output of one pass is the input of the
next.
// MotionBlur reads velocity pass output
motionBlurSystem_->Apply(frameInfo.commandBuffer, viewbase, sampler,
                         velTex->getImageView(), velTex->getSampler(), ...);
viewbase = mbOutput->getImageView(); // redirect for next pass
// DoF reads MB output + depth buffer
dofSystem_->Apply(frameInfo.commandBuffer, viewbase, sampler,
                  vkWrapper_.GetOffScreenDepthView(), ...);
viewbase = dofOut->getImageView();
// Bloom reads DoF output
bloomSystem_->Apply(frameInfo.commandBuffer, viewbase, sampler, ...);
ECS Architecture
The engine is built around an ECS with contiguous component storage plus a lookup map with O(1) average access.
A Job System handles async model and texture loading. The render layer is fully abstracted behind API-specific
backends (Vulkan / Android / OpenGL / Web).
// Contiguous storage for cache-friendly iteration
std::vector<std::pair<size_t, T>> entityList_;
// O(1) lookup by entity ID
std::unordered_map<size_t, size_t> entityIdToIndexMap_;
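// Usage: direct access without full scan (find() avoids inserting an entry for unknown IDs)
auto it = entityIdToIndexMap_.find(entityID);
T* comp = (it != entityIdToIndexMap_.end()) ? &entityList_[it->second].second : nullptr;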
ECS design: contiguous vector<pair<size_t,T>> for cache-friendly bulk iteration +
unordered_map for O(1) per-entity lookup. Migrating from classic OOP gave an immediate, measurable
performance jump.
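A compact C++ sketch of this storage layout, using the two member names from the excerpt above. The `ComponentStorage` class itself, its method names and the swap-and-pop removal strategy are assumptions added for illustration; the engine's actual component list may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

template <typename T>
class ComponentStorage {
public:
    // Appends a component for a new entity and records its index.
    T& Add(std::size_t entityId, T component) {
        assert(entityIdToIndexMap_.find(entityId) == entityIdToIndexMap_.end());
        entityIdToIndexMap_[entityId] = entityList_.size();
        entityList_.emplace_back(entityId, std::move(component));
        return entityList_.back().second;
    }

    // O(1) average lookup; returns nullptr for unknown entities.
    T* Get(std::size_t entityId) {
        auto it = entityIdToIndexMap_.find(entityId);
        return it == entityIdToIndexMap_.end() ? nullptr : &entityList_[it->second].second;
    }

    // Swap-and-pop: moves the last element into the hole so storage stays contiguous.
    bool Remove(std::size_t entityId) {
        auto it = entityIdToIndexMap_.find(entityId);
        if (it == entityIdToIndexMap_.end()) return false;
        std::size_t index = it->second;
        std::size_t last = entityList_.size() - 1;
        if (index != last) {
            entityList_[index] = std::move(entityList_[last]);
            entityIdToIndexMap_[entityList_[index].first] = index;  // fix moved entry
        }
        entityList_.pop_back();
        entityIdToIndexMap_.erase(entityId);
        return true;
    }

    std::size_t Size() const { return entityList_.size(); }

    // Contiguous range for cache-friendly bulk iteration by systems.
    std::vector<std::pair<std::size_t, T>>& Entries() { return entityList_; }

private:
    std::vector<std::pair<std::size_t, T>> entityList_;
    std::unordered_map<std::size_t, std::size_t> entityIdToIndexMap_;
};
```

Pointers returned by `Add` and `Get` are invalidated when the vector grows or an element is removed, so systems would normally hold entity IDs rather than component pointers across frames.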
Plant Generator
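case PlantGeneratorSystem::kStochasticTree:
    result = { 'X' };
    for (size_t i = 0; i < iterations; i++) {
        newVector.clear();                       // start each iteration from an empty buffer
        for (char node : result) {
            if (node == 'X') {
                int r = rand() % 100;            // pick a rewriting rule at random
                if (r < 40)      AppendString(newVector, "FF[+X][-X]X");   // 40%
                else if (r < 70) AppendString(newVector, "F[+X]F[-X]+X");  // 30%
                else if (r < 90) AppendString(newVector, "F[\\X][/X]FX");  // 20%
                else             AppendString(newVector, "FX");            // 10%
            } else {
                newVector.push_back(node);       // assumed: other symbols are copied unchanged
            }
        }
        result = newVector;                      // assumed: expanded string feeds the next iteration
    }
    break;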
// 04 — challenges & solutions
Problems I Solved
⚠ Problem
Multi-light shadow rendering
✓ Solution
Shadow Atlas class packs multiple shadow maps into a single texture. The stepLight parameter
routes each light to its atlas slot, with correct synchronization between passes.
⚠ Problem
Motion blur required per-pixel velocity
✓ Solution
Dedicated velocity pass encodes screen-space velocity from current and previous view-projection matrices.
Result is passed as a separate texture into the blur shader.
⚠ Problem
Frustum Culling
✓ Solution
Bounding sphere generated at model load time, tested against the frustum planes each frame. Objects outside
the frustum are skipped entirely before draw calls.
⚠ Problem
PhysX integration
✓ Solution
Compiled PhysX from source and built a wrapper around the scene, actors, and step functions. A base Pawn
Controller abstracts different movement types (character, vehicle, custom).
⚠ Problem
Android port
✓ Solution
Vulkan's portability made the Android Studio + CMake port manageable. It required adjusting the surface creation
guards and adapting the app lifecycle to Android's native C++ support.