Segmented Video Compression

Overview

This project explores a simple idea: not all parts of a video frame are equally important. In many scenes, the viewer’s attention is drawn to moving subjects while the background remains relatively static.

In this system, video frames are analyzed to detect motion and separate foreground regions from the background. These segmented regions can then be treated differently during compression, preserving more detail in moving areas while applying stronger compression to static regions.
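As a concrete illustration, the motion-based part of this idea can be sketched as block-wise frame differencing: each macroblock is marked foreground if it changed enough between consecutive frames. The function name, block size, and threshold below are illustrative choices, not the project's actual parameters:

```python
import numpy as np

def motion_mask(prev_frame, curr_frame, block_size=16, threshold=12.0):
    """Classify each block as foreground (moving) or background (static)
    by comparing its mean absolute frame difference against a threshold."""
    diff = np.abs(curr_frame.astype(np.float64) - prev_frame.astype(np.float64))
    rows = diff.shape[0] // block_size
    cols = diff.shape[1] // block_size
    mask = np.zeros((rows, cols), dtype=bool)
    for r in range(rows):
        for c in range(cols):
            block = diff[r * block_size:(r + 1) * block_size,
                         c * block_size:(c + 1) * block_size]
            mask[r, c] = block.mean() > threshold
    return mask
```

The resulting block-level mask is what a region-aware encoder would consult when choosing how aggressively to compress each block.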

The project demonstrates the full pipeline from raw frame processing to synchronized video playback, along with visualizations that show how motion segmentation influences the compression process.

Demo

The demo video below shows the final reconstructed video along with visualizations of the segmentation process used to identify moving regions in each frame.

The first portion of the video shows the reconstructed playback with synchronized audio.
Later in the video, segmentation visualizations illustrate how the system identifies moving regions and foreground blocks during processing.

Source Code

The implementation for this project, including the compression pipeline and visualization scripts, is available on GitHub.

View the repository →

System Pipeline

The compression system processes raw video frames, identifies moving regions, and applies transform-based compression before reconstructing the final video.

Foreground Segmentation with Detectron2

The segmentation step was extended using Detectron2, Facebook AI Research’s framework for object detection and instance segmentation. Rather than relying solely on motion differencing, Detectron2 identifies semantic objects in the scene, making it possible to isolate foreground subjects like the tennis player more reliably.

The pipeline combines classical motion analysis with deep-learning segmentation to better determine which regions deserve higher visual quality during compression.
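A minimal sketch of that combination step, assuming the motion stage yields a block-level boolean mask and the segmentation model yields a pixel-level object mask (the function names and the 50% coverage threshold are illustrative, not the project's exact rule):

```python
import numpy as np

def combine_masks(motion_mask, semantic_mask, block_size=16):
    """Merge a block-level motion mask with a pixel-level semantic mask
    (e.g. from an instance-segmentation model) into one block-level
    foreground map: a block is foreground if it moved OR mostly overlaps
    a detected object."""
    rows, cols = motion_mask.shape
    combined = motion_mask.copy()
    for r in range(rows):
        for c in range(cols):
            patch = semantic_mask[r * block_size:(r + 1) * block_size,
                                  c * block_size:(c + 1) * block_size]
            # mark the block if at least half its pixels belong to an object
            if patch.mean() > 0.5:
                combined[r, c] = True
    return combined
```

Taking the union of the two cues errs on the side of quality: a region is only treated as background when both the motion and semantic signals agree it is static and object-free.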

Segmentation Visualization

The video below shows the segmentation stage in action. Detected foreground regions are highlighted to illustrate how object-level masks guide compression decisions across the frame.

Technical Highlights

  • Built a full video compression pipeline over raw RGB frames with synchronized audio playback.
  • Used Three-Step Search motion estimation to classify macroblocks as foreground or background per frame.
  • Applied 2D DCT with separate quantization parameters for foreground and background regions, then reconstructed frames via IDCT.
  • Extended segmentation with Detectron2 and dense optical flow to produce semantically aware foreground masks.
  • Built a four-panel visualization showing original frames, Detectron2 segmentation, optical flow, and final foreground block classification side by side.
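The Three-Step Search mentioned above can be sketched as follows. This is the generic textbook version of the algorithm with an illustrative block size and step schedule, not the project's exact implementation:

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return np.abs(a.astype(np.int64) - b.astype(np.int64)).sum()

def three_step_search(ref, frame, top, left, block=16, step=4):
    """Estimate the motion vector for the block at (top, left) in `frame`
    by searching `ref` with the classic Three-Step Search pattern:
    evaluate 9 candidates around the current center, move to the best,
    and halve the step until it reaches 1."""
    target = frame[top:top + block, left:left + block]
    cy, cx = 0, 0  # current motion-vector estimate (search center)
    best = sad(ref[top:top + block, left:left + block], target)
    while step >= 1:
        move = (0, 0)
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                y, x = top + cy + dy, left + cx + dx
                if (y < 0 or x < 0 or
                        y + block > ref.shape[0] or x + block > ref.shape[1]):
                    continue  # candidate window falls outside the frame
                cost = sad(ref[y:y + block, x:x + block], target)
                if cost < best:
                    best, move = cost, (dy, dx)
        cy += move[0]
        cx += move[1]
        step //= 2
    return cy, cx, best
```

A block whose best-match SAD stays high (or whose motion vector is large) would then be classified as foreground, while a near-zero-cost, near-zero-motion block would be treated as background.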
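Likewise, the region-dependent DCT step can be sketched with an orthonormal 8×8 DCT and power-of-two step sizes standing in for the quantization parameters; the project's actual quantizer and QP mapping may differ:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (rows are frequencies)."""
    k = np.arange(n)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses the smaller normalization
    return m

def compress_block(block, qp):
    """2D DCT -> uniform quantization with step 2**qp -> dequantize -> IDCT.
    A larger qp means a coarser quantizer and more loss."""
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T          # forward 2D DCT
    step = 2 ** qp
    quantized = np.round(coeffs / step)
    return d.T @ (quantized * step) @ d  # dequantize and inverse 2D DCT
```

Foreground blocks would be encoded with a small `qp` (fine quantization, near-lossless) and background blocks with a large `qp` (coarse quantization, fewer bits), e.g. `compress_block(blk, qp=1)` versus `compress_block(blk, qp=6)`.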

Team & Technical Scope

Pranav Rathod
Video compression pipeline, playback system, and visualizations
James Kasaba
Segmentation pipeline and integration
Ruiqi Zhang
Detectron2 integration and segmentation experimentation