
// technical art breakdown


Hyper Technical Magic Leprechaun (HTML)

Vulkan-OpenGL Graphics Engine

2025 - 2026 · Vulkan · C++ · Team of 2 · 8 months
PBR · Deferred Rendering · Forward+ · Post-Process Pipeline · Procedural Generation
C++ · Vulkan · OpenGL · PhysX 5 · GLSL

This project was built to understand rendering at the GPU level — not to use an engine, but to build one from scratch. The goal was to know exactly what happens below the surface of tools like UE5 or Unity.

The project was split between Vulkan and OpenGL. I was responsible for the full Vulkan side: ECS, PhysX, ImGui, Android port, and all rendering systems.

// source_code // ask for it

Rendering Systems

Three rendering pipelines are implemented: Forward+ for dynamic multi-light scenes, Deferred for decoupling geometry from lighting, and Deferred PBR for physically-based materials. All three share the Shadow Atlas system for multi-light shadows and the velocity pass that feeds the motion blur post-process; the two deferred pipelines also share the G-Buffer.

Figures: Forward+ · Deferred · Deferred PBR (metallic / roughness / AO)
PBR implemented: G-Buffer stores albedo, normal, metallic, roughness and AO. The lighting pass reconstructs world position from depth and evaluates the Cook-Torrance BRDF per light.
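
To make the per-light evaluation concrete, here is a minimal scalar sketch of the Cook-Torrance specular term on the CPU. It assumes the common GGX distribution, Smith/Schlick-GGX geometry term and Schlick Fresnel approximation; the engine's actual GLSL is not shown in this article, so every function name below is illustrative only.

```cpp
#include <algorithm>
#include <cmath>

// Scalar Cook-Torrance specular terms. Inputs are clamped dot products in [0, 1].
// All names here are hypothetical; the engine's shader code may differ.
constexpr float kPi = 3.14159265358979f;

// GGX (Trowbridge-Reitz) normal distribution function.
float DistributionGGX(float NdotH, float roughness) {
  float a = roughness * roughness;
  float a2 = a * a;
  float d = NdotH * NdotH * (a2 - 1.0f) + 1.0f;
  return a2 / (kPi * d * d);
}

// Schlick-GGX geometry term for one direction (k remapped for direct lighting).
float GeometrySchlickGGX(float NdotX, float roughness) {
  float r = roughness + 1.0f;
  float k = (r * r) / 8.0f;
  return NdotX / (NdotX * (1.0f - k) + k);
}

// Smith geometry term: shadowing (light) times masking (view).
float GeometrySmith(float NdotV, float NdotL, float roughness) {
  return GeometrySchlickGGX(NdotV, roughness) * GeometrySchlickGGX(NdotL, roughness);
}

// Schlick's Fresnel approximation for a scalar base reflectance f0.
float FresnelSchlick(float VdotH, float f0) {
  return f0 + (1.0f - f0) * std::pow(1.0f - VdotH, 5.0f);
}

// Specular BRDF value D * G * F / (4 * NdotL * NdotV), with a guarded denominator.
float CookTorranceSpecular(float NdotL, float NdotV, float NdotH, float VdotH,
                           float roughness, float f0) {
  float D = DistributionGGX(NdotH, roughness);
  float G = GeometrySmith(NdotV, NdotL, roughness);
  float F = FresnelSchlick(VdotH, f0);
  float denom = std::max(4.0f * NdotL * NdotV, 1e-4f);
  return D * G * F / denom;
}
```

In a lighting pass this value would be multiplied by the light radiance and NdotL, then added to the diffuse contribution for each light.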

G-Buffer Passes

The deferred pipeline separates geometry from lighting into discrete, inspectable passes. Visualized with RenderDoc.

Figures: Depth Pass · Color Pass · Shadow Atlas · Lighting Pass

Post-Process Pipeline

Three post-process effects chained together over a single scene target. Each effect consumes the previous one's output texture, forming a MotionBlur → DoF → Bloom stack. Motion blur uses a dedicated velocity pass that encodes per-pixel screen-space velocity from current and previous view-projection matrices.

Scene Target → Velocity Pass → Motion Blur → Depth of Field → Bloom → Blit to Swapchain

Figures: Motion Blur · Depth of Field · Bloom · All three effects combined
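
The velocity pass described above can be sketched on the CPU as follows. This is a simplified illustration, not the engine's shader: it assumes a static world-space point (only camera motion), column-major 4x4 matrices, and a velocity expressed in UV units. The `Mat4`, `Vec4` and function names are hypothetical.

```cpp
#include <array>

struct Vec2 { float x, y; };
struct Vec4 { float x, y, z, w; };
using Mat4 = std::array<float, 16>;  // column-major: element (row, col) is m[col * 4 + row]

// Matrix * column vector.
Vec4 Mul(const Mat4& m, const Vec4& v) {
  return {
    m[0] * v.x + m[4] * v.y + m[8]  * v.z + m[12] * v.w,
    m[1] * v.x + m[5] * v.y + m[9]  * v.z + m[13] * v.w,
    m[2] * v.x + m[6] * v.y + m[10] * v.z + m[14] * v.w,
    m[3] * v.x + m[7] * v.y + m[11] * v.z + m[15] * v.w };
}

// Projects a world-space position to normalized device coordinates (xy only).
Vec2 ToNdc(const Mat4& viewProj, const Vec4& worldPos) {
  Vec4 clip = Mul(viewProj, worldPos);
  return { clip.x / clip.w, clip.y / clip.w };
}

// Screen-space velocity of a point between the previous and the current frame.
// NDC spans [-1, 1] while UV spans [0, 1], hence the 0.5 scale.
Vec2 ScreenVelocity(const Mat4& currViewProj, const Mat4& prevViewProj, const Vec4& worldPos) {
  Vec2 curr = ToNdc(currViewProj, worldPos);
  Vec2 prev = ToNdc(prevViewProj, worldPos);
  return { (curr.x - prev.x) * 0.5f, (curr.y - prev.y) * 0.5f };
}
```

A blur shader would then sample the scene target several times along this vector. Moving objects would additionally need their previous model matrix, which this sketch leaves out.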

SkyBox

A cubemap-based skybox rendered as the last pass in the frame, after all geometry and lights, to avoid overdraw. The six faces are loaded at engine startup and bound as a cube sampler in the descriptor set. In the PBR pipeline, the cubemap is also sampled for image-based lighting (IBL) — visible in the specular reflections on the metal spheres below.

Figure: SkyBox + PBR cubemap reflections
C++ — main setup
std::string facesCubeMap[6] = {
  "../data/Skybox/right.jpg",  "../data/Skybox/left.jpg",
  "../data/Skybox/top.jpg",    "../data/Skybox/bottom.jpg",
  "../data/Skybox/front.jpg",  "../data/Skybox/back.jpg"
};
HTML html{ 1200, 1000, "Demo", facesCubeMap };
// SkyBox is initialized inside VulkanWrapper and bound globally
// PBR system reads the cubemap sampler for IBL specular contribution

Terrain

Terrain driven by simplex noise and heightmaps, rendered with textures. The system generates mesh geometry at load time and submits it through the standard render path, so it benefits from the same PBR materials and shadow atlas as any other mesh in the scene.

Figure: Terrain
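
As a rough sketch of load-time terrain mesh generation, the code below builds a regular grid of vertices and two triangles per cell. `SampleHeight` is a placeholder standing in for the simplex noise or heightmap lookup, and the struct and function names are assumptions for illustration, not the engine's API.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

struct TerrainVertex { float x, y, z; };

struct TerrainMesh {
  std::vector<TerrainVertex> vertices;
  std::vector<std::uint32_t> indices;
};

// Placeholder height function; a real build would sample simplex noise or a heightmap here.
float SampleHeight(float x, float z) {
  return std::sin(x * 0.3f) * std::cos(z * 0.3f);
}

// Builds a (cols x rows)-cell grid: (cols + 1) * (rows + 1) vertices, 6 indices per cell.
TerrainMesh BuildTerrain(std::uint32_t cols, std::uint32_t rows, float cellSize, float heightScale) {
  TerrainMesh mesh;
  mesh.vertices.reserve((cols + 1) * (rows + 1));
  for (std::uint32_t r = 0; r <= rows; ++r) {
    for (std::uint32_t c = 0; c <= cols; ++c) {
      float x = static_cast<float>(c) * cellSize;
      float z = static_cast<float>(r) * cellSize;
      mesh.vertices.push_back({x, SampleHeight(x, z) * heightScale, z});
    }
  }
  mesh.indices.reserve(cols * rows * 6);
  for (std::uint32_t r = 0; r < rows; ++r) {
    for (std::uint32_t c = 0; c < cols; ++c) {
      std::uint32_t i0 = r * (cols + 1) + c;  // top-left corner of the cell
      std::uint32_t i1 = i0 + 1;              // top-right
      std::uint32_t i2 = i0 + (cols + 1);     // bottom-left
      std::uint32_t i3 = i2 + 1;              // bottom-right
      mesh.indices.insert(mesh.indices.end(), {i0, i2, i1, i1, i2, i3});
    }
  }
  return mesh;
}
```

The resulting vertex and index arrays can be uploaded like any other mesh, which is what lets the terrain reuse the standard materials and shadow path.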

Particle System

CPU-side particle simulation (position, velocity, lifetime, color) with a dedicated ParticleRenderSystem that submits billboarded quads to the GPU each frame. The simulation logic was written by my teammate; my contribution was the Vulkan rendering side — descriptor sets, vertex buffers, the render pass integration and synchronization with the main frame.

Figure: Particle System (Vulkan port)
Note — Particle simulation logic authored by teammate. Vulkan integration (ParticleRenderSystem, descriptor layout, frame sync) by me.

Procedural Generation Systems

Two procedural systems built as production-style tools: an L-System based plant and tree generator with five species, and a Wave Function Collapse tilemap generator. Both are integrated directly into the engine as standalone demos.

L-System Plant Generator

The generator uses a symbol alphabet (F, X, +, −, [, ], &, ^, \, /) that is expanded iteratively via rewriting rules. There are five distinct plant types, including stochastic, spiral, symmetric, and standard variants. Randomness is injected at the rule-expansion stage, producing different geometry on every run.
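
The rewriting step itself is small. Below is a minimal deterministic sketch: every symbol with a rule is replaced by that rule's right-hand side, and every other symbol is copied through. `ExpandLSystem` and `Rules` are hypothetical names, not the engine's.

```cpp
#include <string>
#include <unordered_map>
#include <utility>

using Rules = std::unordered_map<char, std::string>;

// Applies the rewriting rules to every symbol, `iterations` times.
// Symbols without a rule (brackets, rotations) are copied unchanged.
std::string ExpandLSystem(const std::string& axiom, const Rules& rules, int iterations) {
  std::string current = axiom;
  for (int i = 0; i < iterations; ++i) {
    std::string next;
    for (char symbol : current) {
      auto it = rules.find(symbol);
      if (it != rules.end()) {
        next += it->second;
      } else {
        next += symbol;
      }
    }
    current = std::move(next);
  }
  return current;
}
```

For example, with the illustrative rules `X → F[+X][-X]` and `F → FF`, the axiom `X` becomes `F[+X][-X]` after one iteration and `FF[+F[+X][-X]][-F[+X][-X]]` after two. A turtle-style interpreter then turns `F` into branch segments and `[` / `]` into push/pop of the current transform.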

L-System plants

Foliage / Forest Generator

Combines the L-System generator with Simplex Noise for biome-aware placement. A threshold parameter controls forest density. Each tree receives a random rotation and iteration count, giving organic variation across the landscape.
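
A hedged sketch of threshold-based placement is shown below. The hash-based `PlaceholderNoise` only stands in for simplex noise (it is not spatially smooth), and the iteration range of 3 to 5 as well as all names are assumptions for illustration.

```cpp
#include <cstdint>
#include <random>
#include <vector>

struct TreeInstance {
  float x, z;
  float rotationRadians;
  int iterations;  // L-System iteration count for this tree
};

// Cheap integer hash mapped to [0, 1). Placeholder for a real simplex noise sample.
float PlaceholderNoise(int x, int z) {
  std::uint32_t h = static_cast<std::uint32_t>(x) * 374761393u +
                    static_cast<std::uint32_t>(z) * 668265263u;
  h = (h ^ (h >> 13)) * 1274126177u;
  h ^= h >> 16;
  return static_cast<float>(h & 0xFFFFFFu) / 16777216.0f;
}

// Places a tree in every grid cell whose noise value exceeds the threshold.
// A lower threshold gives a denser forest.
std::vector<TreeInstance> PlaceForest(int width, int depth, float threshold, std::uint32_t seed) {
  std::mt19937 rng(seed);
  std::uniform_real_distribution<float> rotation(0.0f, 6.2831853f);
  std::uniform_int_distribution<int> iterations(3, 5);  // assumed range
  std::vector<TreeInstance> trees;
  for (int z = 0; z < depth; ++z) {
    for (int x = 0; x < width; ++x) {
      if (PlaceholderNoise(x, z) > threshold) {
        trees.push_back({static_cast<float>(x), static_cast<float>(z),
                         rotation(rng), iterations(rng)});
      }
    }
  }
  return trees;
}
```

With smooth noise in place of the hash, neighbouring cells get similar values, so trees cluster into patches instead of being scattered uniformly.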

Procedural forest
L-Systems are a classic foundation for procedural vegetation; Houdini, for example, ships a dedicated L-System node.

Wave Function Collapse

Constraint-based tilemap generator. Each tile type defines which neighbors are valid, and the algorithm resolves the grid by collapsing the cells with the lowest entropy first and propagating the resulting constraints to their neighbors. A completed grid always satisfies the adjacency rules: no water next to dirt without sand in between.
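
The collapse-and-propagate loop can be sketched with three tiles and a bitmask per cell. This is a simplified illustration under stated assumptions: the number of remaining options is used as a stand-in for entropy, tiles are unweighted, and a contradiction simply returns an empty result. The tile set and all names are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

enum Tile : int { kWater = 0, kSand = 1, kDirt = 2, kTileCount = 3 };

// kAllowed[t] is the bitmask of tiles that may sit next to tile t (bit i = tile i).
constexpr std::array<std::uint8_t, kTileCount> kAllowed = {
  0b011,  // water: water, sand
  0b111,  // sand:  water, sand, dirt
  0b110,  // dirt:  sand, dirt
};

int PopCount(std::uint8_t mask) {
  int n = 0;
  for (; mask != 0; mask >>= 1) n += mask & 1;
  return n;
}

// Returns one tile per cell (row-major), or an empty vector on contradiction.
std::vector<int> RunWfc(int width, int height, std::uint32_t seed) {
  std::mt19937 rng(seed);
  const std::size_t count = static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
  std::vector<std::uint8_t> cells(count, std::uint8_t{0b111});

  // Removes options from neighbors that no option of the changed cell permits.
  auto propagate = [&](int start) -> bool {
    std::vector<int> stack{start};
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    while (!stack.empty()) {
      int cur = stack.back();
      stack.pop_back();
      std::uint8_t permitted = 0;
      for (int t = 0; t < kTileCount; ++t)
        if (cells[cur] & (1u << t)) permitted |= kAllowed[t];
      for (int d = 0; d < 4; ++d) {
        int nx = cur % width + dx[d];
        int ny = cur / width + dy[d];
        if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
        int n = ny * width + nx;
        std::uint8_t reduced = cells[n] & permitted;
        if (reduced == cells[n]) continue;  // nothing changed
        if (reduced == 0) return false;     // contradiction
        cells[n] = reduced;
        stack.push_back(n);
      }
    }
    return true;
  };

  for (;;) {
    // Pick the undecided cell with the fewest remaining options.
    int best = -1;
    int bestCount = kTileCount + 1;
    for (int i = 0; i < width * height; ++i) {
      int c = PopCount(cells[i]);
      if (c > 1 && c < bestCount) { best = i; bestCount = c; }
    }
    if (best < 0) break;  // every cell is collapsed
    std::vector<int> options;
    for (int t = 0; t < kTileCount; ++t)
      if (cells[best] & (1u << t)) options.push_back(t);
    std::uniform_int_distribution<std::size_t> pick(0, options.size() - 1);
    cells[best] = static_cast<std::uint8_t>(1u << options[pick(rng)]);
    if (!propagate(best)) return {};
  }

  std::vector<int> result(count, 0);
  for (std::size_t i = 0; i < count; ++i)
    for (int t = 0; t < kTileCount; ++t)
      if (cells[i] == (1u << t)) result[i] = t;
  return result;
}
```

A production generator would typically weight tiles, use Shannon entropy for the selection, and retry or backtrack when a contradiction occurs.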

WFC step 1

WFC step 2
WFC is used in a number of modern procedural level-generation pipelines.

Engine Architecture

The HTML class is the single public interface the user sees. Internally it owns and orchestrates all subsystems through unique_ptr. No subsystem knows about the others — HTML coordinates them.

Figure: Engine Architecture

Frame Timeline

Every frame follows a fixed sequence. Input and logic run first, then rendering, then ImGui overlay, then physics step. The separation is explicit — no rendering during physics, no physics during rendering.

Figure: Frame Timeline (frame execution order)
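
The ownership model and the fixed frame order can be illustrated with a small facade. This is only a sketch: `Subsystem`, `NamedSubsystem` and `EngineFacade` are hypothetical stand-ins for the real HTML class and its subsystems, which are not shown in this article.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical subsystem interface. Subsystems do not reference each other.
class Subsystem {
 public:
  virtual ~Subsystem() = default;
  virtual void Tick(std::vector<std::string>& log) = 0;
};

// Stand-in subsystem that only records that it ran.
class NamedSubsystem : public Subsystem {
 public:
  explicit NamedSubsystem(std::string name) : name_(std::move(name)) {}
  void Tick(std::vector<std::string>& log) override { log.push_back(name_); }

 private:
  std::string name_;
};

// The facade owns every subsystem through unique_ptr and is the only place
// that knows the order in which they run.
class EngineFacade {
 public:
  EngineFacade()
      : input_(std::make_unique<NamedSubsystem>("input")),
        logic_(std::make_unique<NamedSubsystem>("logic")),
        renderer_(std::make_unique<NamedSubsystem>("render")),
        imgui_(std::make_unique<NamedSubsystem>("imgui")),
        physics_(std::make_unique<NamedSubsystem>("physics")) {}

  // One frame: input, logic, rendering, ImGui overlay, physics step.
  std::vector<std::string> RunFrame() {
    std::vector<std::string> log;
    input_->Tick(log);
    logic_->Tick(log);
    renderer_->Tick(log);
    imgui_->Tick(log);
    physics_->Tick(log);
    return log;
  }

 private:
  std::unique_ptr<Subsystem> input_;
  std::unique_ptr<Subsystem> logic_;
  std::unique_ptr<Subsystem> renderer_;
  std::unique_ptr<Subsystem> imgui_;
  std::unique_ptr<Subsystem> physics_;
};
```

Keeping the sequence in one function makes the frame order easy to audit and change, since no subsystem schedules another.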

Shadow Atlas

Instead of one shadow map texture per light (N textures and N separate render passes), all shadow maps are packed into a single atlas texture, with each light rendering into its own slot. The stepLight loop distributes slots: directional light gets 1 slot, each point light needs 6 faces (cubemap), each spot light needs 1.

Figure: Shadow Atlas Pass (slot distribution)
C++ — renderManager.cpp · light loop
int numberOfLights = 1;                                       // Directional
numberOfLights += (int)pointLights.size() * 6;                // 6 faces per point light
numberOfLights += (int)spotLights.size();                     // 1 per spot light

int maxPerBatch = GetMaxNumberLightsShadowAtlas();

for (int i = 0; i < numberOfLights; i += maxPerBatch) {
  InitRenderProcess(frameInfo, i, dirPosReference);           // -> ShadowMap + GBuffer for this batch
  RenderObjects(frameInfo, i, dirPosReference);               // -> Lighting pass
  if (i + maxPerBatch < numberOfLights)
    vkWrapper_.EndOffScreenRenderPass(frameInfo);             // Close pass, start next batch
}
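
To show how a slot index could map to a region of the atlas, here is a small sketch that assumes a uniform grid of square tiles. The article does not describe the atlas layout, so the grid, the tile sizes and all function names are assumptions.

```cpp
#include <cstdint>

struct AtlasRect { std::uint32_t x, y, size; };

// Slot count for a light set, mirroring the counting in the loop above:
// 1 directional + 6 per point light + 1 per spot light.
std::uint32_t CountShadowSlots(std::uint32_t pointLights, std::uint32_t spotLights) {
  return 1u + pointLights * 6u + spotLights;
}

// Number of batches needed when the atlas holds at most maxPerBatch slots.
std::uint32_t CountBatches(std::uint32_t slots, std::uint32_t maxPerBatch) {
  return (slots + maxPerBatch - 1u) / maxPerBatch;
}

// Maps a slot index within a batch to its tile, filling the atlas row by row.
AtlasRect SlotToRect(std::uint32_t slot, std::uint32_t atlasSize, std::uint32_t tileSize) {
  std::uint32_t tilesPerRow = atlasSize / tileSize;
  return { (slot % tilesPerRow) * tileSize, (slot / tilesPerRow) * tileSize, tileSize };
}
```

For instance, two point lights and three spot lights need 16 slots, which fit exactly in a 4096 atlas split into 1024 tiles. The rectangle would be used as the viewport and scissor when rendering that light's depth, and as the UV offset and scale when sampling it.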

G-Buffer Layout

The Deferred pipeline writes geometry data to a set of render targets in a first pass, then the lighting pass reads from them. Because geometry is rasterized only once regardless of the number of lights, adding lights increases the cost of the lighting pass only, not the number of geometry passes.

Figure: G-Buffer Attachments (Deferred PBR)

Culling System

Before any draw call, objects outside the camera frustum are skipped entirely. A bounding sphere is generated at model load time. Each frame, the sphere is tested against the 6 frustum planes. On Android, distance-based culling is used instead — simpler and cheaper on mobile GPUs.

pseudocode - frustum sphere test
// For each entity with a MeshComponent:
BoundingSphere sphere = mesh.boundingSphere;         // center + radius, generated at load

for (Plane plane : frustum.planes) {                 // 6 planes: near, far, left, right, top, bottom
  float dist = dot(plane.normal, sphere.center) + plane.d;
  if (dist < -sphere.radius) -> CULL (behind this plane)
}
-> VISIBLE: submit to draw call list

// Android fallback:
if (distance(camera.pos, sphere.center) > maxDistance) -> CULL
Frustum culling is among the most impactful CPU-side optimizations. With 1000+ entities, skipping invisible objects before they reach the GPU removes a significant fraction of draw calls at no GPU cost.
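
The pseudocode above translates to a few lines of C++. The sketch below assumes unit-length plane normals pointing into the frustum, so a positive signed distance means "inside". The type and function names are illustrative, not the engine's.

```cpp
#include <array>

struct Vec3 { float x, y, z; };
struct Plane { Vec3 normal; float d; };  // dot(normal, p) + d >= 0 means p is on the inner side
struct BoundingSphere { Vec3 center; float radius; };
using Frustum = std::array<Plane, 6>;    // near, far, left, right, top, bottom

float Dot(const Vec3& a, const Vec3& b) {
  return a.x * b.x + a.y * b.y + a.z * b.z;
}

// A sphere is culled as soon as it lies completely behind any single plane.
bool IsVisible(const Frustum& frustum, const BoundingSphere& sphere) {
  for (const Plane& plane : frustum) {
    float dist = Dot(plane.normal, sphere.center) + plane.d;
    if (dist < -sphere.radius) return false;
  }
  return true;
}

// Distance-based fallback, comparing squared lengths to avoid a square root.
bool IsWithinDistance(const Vec3& cameraPos, const BoundingSphere& sphere, float maxDistance) {
  Vec3 delta = { sphere.center.x - cameraPos.x,
                 sphere.center.y - cameraPos.y,
                 sphere.center.z - cameraPos.z };
  return Dot(delta, delta) <= maxDistance * maxDistance;
}
```

The test is conservative: a sphere near a frustum corner can pass all six plane checks while still being outside, which costs a wasted draw call but never hides a visible object.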

Image Barriers & Synchronization

Vulkan performs almost no implicit synchronization: image layout transitions and dependencies between pipeline stages must be declared explicitly. The post-process chain requires a sequence of barriers to hand each image off correctly between passes.

C++ — renderManager.cpp · scene target barrier
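// After geometry pass: make the scene target's color writes visible to the post-process shader
VkImageMemoryBarrier barrier{};
// oldLayout == newLayout, so this barrier performs no layout change. That is only valid if the
// render pass already left the image in SHADER_READ_ONLY_OPTIMAL (e.g. via finalLayout); otherwise
// oldLayout should be the layout the pass rendered in, typically COLOR_ATTACHMENT_OPTIMAL.
barrier.oldLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
barrier.newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
barrier.srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;   // wait until color writes finish
barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;              // safe to read in fragment shader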
vkCmdPipelineBarrier(cmd,
  VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,                // source stage
  VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,                        // destination stage
  ...);

// BlitTexture: transition post-process output -> scene target
// src: SHADER_READ_ONLY -> TRANSFER_SRC
// dst: SHADER_READ_ONLY -> TRANSFER_DST
// -> vkCmdBlitImage
// -> final barrier: TRANSFER_DST -> SHADER_READ_ONLY

Post-Process Chain

Each effect writes to its own output texture. viewbase and sampler are redirected after each effect, so the chain requires no intermediate copies — the output of one pass is the input of the next.

C++ — renderManager.cpp · Post-Process Chain
// MotionBlur reads velocity pass output
motionBlurSystem_->Apply(frameInfo.commandBuffer, viewbase, sampler,
  velTex->getImageView(), velTex->getSampler(), ...);
viewbase = mbOutput->getImageView();  // redirect for next pass

// DoF reads MB output + depth buffer
dofSystem_->Apply(frameInfo.commandBuffer, viewbase, sampler,
  vkWrapper_.GetOffScreenDepthView(), ...);
viewbase = dofOut->getImageView();

// Bloom reads DoF output
bloomSystem_->Apply(frameInfo.commandBuffer, viewbase, sampler, ...);

ECS Architecture

The engine is built around an ECS with a contiguous component storage + O(1) lookup map. A Job System handles async model and texture loading. The render layer is fully abstracted behind API-specific backends (Vulkan / Android / OpenGL / Web).

C++ — ECS.hpp
// Contiguous storage for cache-friendly iteration
std::vector<std::pair<size_t, T>> entityList_;

// O(1) lookup by entity ID
std::unordered_map<size_t, size_t> entityIdToIndexMap_;

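// Usage: direct access without a full scan.
// find() is used instead of operator[], which would insert a default entry for an unknown ID.
auto it = entityIdToIndexMap_.find(entityID);
T* comp = (it != entityIdToIndexMap_.end()) ? &entityList_[it->second].second : nullptr;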
ECS design: contiguous vector<pair<size_t,T>> for cache-friendly bulk iteration + unordered_map for O(1) per-entity lookup. Migrating from classic OOP gave an immediate, measurable performance jump.
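
For a fuller picture, here is a self-contained sketch of a component store built on the same two containers, including removal by swap-and-pop so the vector stays dense. `ComponentStore` and its method names are hypothetical; the engine's ECS.hpp may organise this differently.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

template <typename T>
class ComponentStore {
 public:
  // Adds a component for the entity, or overwrites the existing one.
  T& Add(std::size_t entityId, T component) {
    auto it = idToIndex_.find(entityId);
    if (it != idToIndex_.end()) {
      entries_[it->second].second = std::move(component);
      return entries_[it->second].second;
    }
    idToIndex_[entityId] = entries_.size();
    entries_.emplace_back(entityId, std::move(component));
    return entries_.back().second;
  }

  // Average O(1) lookup; returns nullptr when the entity has no component.
  T* Get(std::size_t entityId) {
    auto it = idToIndex_.find(entityId);
    return it != idToIndex_.end() ? &entries_[it->second].second : nullptr;
  }

  // Swap-and-pop: the last entry fills the hole so storage stays contiguous.
  bool Remove(std::size_t entityId) {
    auto it = idToIndex_.find(entityId);
    if (it == idToIndex_.end()) return false;
    std::size_t index = it->second;
    std::size_t last = entries_.size() - 1;
    if (index != last) {
      entries_[index] = std::move(entries_[last]);
      idToIndex_[entries_[index].first] = index;
    }
    entries_.pop_back();
    idToIndex_.erase(entityId);
    return true;
  }

  std::size_t Size() const { return entries_.size(); }

  // Dense view for cache-friendly bulk iteration.
  const std::vector<std::pair<std::size_t, T>>& Entries() const { return entries_; }

 private:
  std::vector<std::pair<std::size_t, T>> entries_;
  std::unordered_map<std::size_t, std::size_t> idToIndex_;
};
```

Pointers returned by `Get` or `Add` are invalidated by later insertions or removals, so systems should hold entity IDs rather than component pointers across frames.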

Plant Generator

C++ — PlantGeneratorSystem.cpp · Stochastic Tree
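case PlantGeneratorSystem::kStochasticTree:
  result = { 'X' };
  for (size_t i = 0; i < iterations; i++) {
    newVector.clear();                                   // start each iteration from an empty buffer
    for (char node : result) {
      if (node == 'X') {
        int r = rand() % 100;                            // weighted rule choice: 40 / 30 / 20 / 10 %
        if      (r < 40) AppendString(newVector, "FF[+X][-X]X");
        else if (r < 70) AppendString(newVector, "F[+X]F[-X]+X");
        else if (r < 90) AppendString(newVector, "F[\\X][/X]FX");
        else              AppendString(newVector, "FX");
      } else {
        newVector.push_back(node);                       // symbols without a rule are copied through
      }
    }
    result = newVector;                                  // the expanded string feeds the next iteration
  }
  break;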

Problems I Solved

⚠ Problem

Multi-light shadow rendering

✓ Solution

Shadow Atlas class packs multiple shadow maps into a single texture. The stepLight parameter routes each light to its atlas slot, with correct synchronization between passes.

⚠ Problem

Motion blur required per-pixel velocity

✓ Solution

Dedicated velocity pass encodes screen-space velocity from current and previous view-projection matrices. Result is passed as a separate texture into the blur shader.

⚠ Problem

Frustum Culling

✓ Solution

Bounding sphere generated at model load time, tested against the frustum planes each frame. Objects outside the frustum are skipped entirely before draw calls.

⚠ Problem

PhysX integration

✓ Solution

Compiled PhysX from source and built a wrapper around the scene, actors, and step functions. A base Pawn Controller abstracts different movement types (character, vehicle, custom).

⚠ Problem

Android port

✓ Solution

Vulkan's portability made the Android Studio + CMake port manageable. The port required adjusting surface creation guards and adapting the app lifecycle to Android's native C++ support.

By The Numbers

60+ FPS
1000+ Entities
3 Render Pipelines
3 Post FX
8 months Development
Figure: Performance metrics

Platforms

Vulkan · Desktop
Android · Mobile

What I Learned