Video Scrubbing

May 8, 2025

While watching a YouTube video, you can hover over the seek bar to preview the frame at that position. The preview works even at the 10-minute mark when only 30 seconds of the video has buffered. So if the video has not loaded up to the 10-minute mark, how is the player able to display an image from that position? Interesting, isn't it?

This is made possible by a feature called video scrubbing, which requires preprocessing the uploaded video. When a video is uploaded (say, 10 minutes long), we can take snapshots at regular intervals (say, every 3 seconds). A 10-minute video therefore yields 10*60/3 = 200 snapshots. We can generate these snapshots using ffmpeg:

ffmpeg -i <input_video_path> -vf fps=1/3,scale=160:90 -q:v 3 <output_dir>/thumb-%04d.jpg

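This generates 160x90-pixel images from the input video at 3-second intervals and writes them to the output directory. The -q:v 3 option sets the JPEG quality (for this option, lower values mean higher quality). The thumbnails are named following the pattern thumb-0001.jpg, thumb-0002.jpg, and so on.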
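If the thumbnails are generated from a backend job rather than by hand, the same command can be wrapped in a small Python helper. This is only a sketch: the function names build_thumbnail_command and generate_thumbnails and their default values are made up for illustration, and it assumes ffmpeg is installed and available on the PATH.

```python
import subprocess
from pathlib import Path
from typing import List


def build_thumbnail_command(
    input_video: Path,
    output_dir: Path,
    interval_seconds: int = 3,
    width: int = 160,
    height: int = 90,
    jpeg_quality: int = 3,
) -> List[str]:
    """Build the ffmpeg argument list for the thumbnail command shown above."""
    return [
        "ffmpeg",
        "-i", str(input_video),
        # One frame every `interval_seconds`, scaled to width x height.
        "-vf", f"fps=1/{interval_seconds},scale={width}:{height}",
        # JPEG quality scale: lower values mean better quality.
        "-q:v", str(jpeg_quality),
        str(output_dir / "thumb-%04d.jpg"),
    ]


def generate_thumbnails(input_video: Path, output_dir: Path) -> List[Path]:
    """Run ffmpeg and return the generated thumbnails in timeline order."""
    output_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_thumbnail_command(input_video, output_dir), check=True)
    # Zero-padded file names sort in the same order as their timestamps.
    return sorted(output_dir.glob("thumb-*.jpg"))
```

Passing the arguments as a list (instead of a single shell string) avoids quoting problems when the video path contains spaces.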

Loading 200 images would require 200 requests from the client, which is very inefficient, especially since each image is small (a few kB). If we can combine these images in some way, we can reduce the number of requests from 200 to 1.

There is a way: we can create a sprite sheet from the generated images. A sprite sheet is a single image that packs many small images into a grid. To record which image is stored at which position, a metadata.json file should also be created. With this metadata.json, the video player can easily display the appropriate image when the user hovers over a specific point in the video's timeline.
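To make the grid idea concrete, here is a small sketch of the layout arithmetic. The helper names sprite_sheet_layout and cell_position are hypothetical, and the choice of 10 columns in the example is an assumption, not something fixed by the approach.

```python
import math
from typing import Tuple


def sprite_sheet_layout(
    count: int, columns: int, thumb_width: int, thumb_height: int
) -> Tuple[int, int, int]:
    """Return (rows, sheet_width, sheet_height) for `count` thumbnails."""
    # The last row may be only partially filled, hence the ceiling.
    rows = math.ceil(count / columns)
    return rows, columns * thumb_width, rows * thumb_height


def cell_position(
    index: int, columns: int, thumb_width: int, thumb_height: int
) -> Tuple[int, int]:
    """Return the top-left (x, y) pixel of thumbnail `index` in the sheet."""
    # Thumbnails fill the grid left to right, then top to bottom.
    row, col = divmod(index, columns)
    return col * thumb_width, row * thumb_height


# 200 thumbnails of 160x90 in 10 columns -> 20 rows, a 1600x1800 sheet.
print(sprite_sheet_layout(200, 10, 160, 90))  # (20, 1600, 1800)
# The 11th thumbnail (index 10) starts the second row.
print(cell_position(10, 10, 160, 90))  # (0, 90)
```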

We can use the Pillow library to generate the sprite sheet as below:

import math
from pathlib import Path
from typing import Dict, List, Tuple


def create_sprite_sheet(
        thumbnail_paths: List[Path],
        output_path: Path,
        columns: int,
        thumbnail_width: int,
        thumbnail_height: int,
        interval_seconds: int
) -> Tuple[Path, List[Dict]]:
    """
    Create a sprite sheet from individual thumbnails.
    
    Args:
        thumbnail_paths (List[Path]): List of paths to thumbnail images
        output_path (Path): Path to save the sprite sheet
        columns (int): Number of thumbnails per row
        thumbnail_width (int): Width of each thumbnail in pixels
        thumbnail_height (int): Height of each thumbnail in pixels
        interval_seconds (int): Interval between thumbnails in seconds
    
    Returns:
        Tuple[Path, List[Dict]]: Path to the sprite sheet and list of frame metadata
    """
    try:
        from PIL import Image
    except ImportError:
        raise ImportError("Pillow is required to create sprite sheets. Install with 'pip install Pillow'")

    print(f"Creating sprite sheet with {len(thumbnail_paths)} thumbnails")

    # Calculate dimensions
    rows = math.ceil(len(thumbnail_paths) / columns)
    sprite_width = columns * thumbnail_width
    sprite_height = rows * thumbnail_height

    # Create a new image for the sprite sheet
    sprite_sheet = Image.new('RGB', (sprite_width, sprite_height))

    # Prepare metadata for frames
    frames_metadata = []

    # Add each thumbnail to the sprite sheet
    for i, thumb_path in enumerate(thumbnail_paths):
        # Calculate position
        row = i // columns
        col = i % columns
        x = col * thumbnail_width
        y = row * thumbnail_height

        # Open thumbnail and paste into sprite sheet
        with Image.open(thumb_path) as thumb:
            sprite_sheet.paste(thumb, (x, y))

        # Add metadata for this frame
        frames_metadata.append({
            "timestamp": i * interval_seconds,
            "position": {
                "x": x,
                "y": y
            },
            "index": i
        })

    # Save sprite sheet
    sprite_sheet.save(output_path, quality=85, optimize=True)
    print(f"Sprite sheet created at {output_path}")

    return output_path, frames_metadata

Sample sprite sheet image:

The JSON metadata file looks like:

  "frames": [
    {
      "timestamp": 0,
      "position": {
        "x": 0,
        "y": 0
      },
      "index": 0
    },
    {
      "timestamp": 3,
      "position": {
        "x": 160,
        "y": 0
      },
      "index": 1
    },
    ...
  ]
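The frame list returned by create_sprite_sheet can be serialized into this shape with the standard json module. The sketch below assumes the file contains only a top-level "frames" key, since that is the only part shown above; the helper names metadata_to_json and write_metadata are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, List


def metadata_to_json(frames: List[Dict]) -> str:
    """Serialize frame metadata under a top-level "frames" key."""
    # Assumption: "frames" is the only key; add more fields here if needed.
    return json.dumps({"frames": frames}, indent=2)


def write_metadata(frames: List[Dict], metadata_path: Path) -> None:
    """Write the metadata JSON next to the sprite sheet."""
    metadata_path.write_text(metadata_to_json(frames), encoding="utf-8")
```

For example, write_metadata(frames_metadata, Path("metadata.json")) would store the second value returned by create_sprite_sheet.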

Once the sprite sheet and JSON metadata are created, the video player can use them to display the appropriate image when the user hovers over the seek bar. This is how video scrubbing works.
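The lookup on the player side can be sketched as follows. A real player would do this in the browser, but the logic is the same: find the last frame whose timestamp is at or before the hover time, then crop that region out of the sprite sheet. The function names frame_for_time and crop_box are hypothetical, and the sketch assumes the frames are sorted by timestamp.

```python
import bisect
from typing import Dict, List, Tuple


def frame_for_time(frames: List[Dict], hover_seconds: float) -> Dict:
    """Return the frame whose timestamp is the latest one <= hover_seconds."""
    if not frames:
        raise ValueError("metadata contains no frames")
    # Assumes `frames` is sorted by "timestamp" in ascending order.
    timestamps = [frame["timestamp"] for frame in frames]
    index = bisect.bisect_right(timestamps, hover_seconds) - 1
    # Clamp hover times before the first snapshot to the first frame.
    return frames[max(index, 0)]


def crop_box(
    frame: Dict, thumb_width: int, thumb_height: int
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the frame inside the sprite sheet."""
    x = frame["position"]["x"]
    y = frame["position"]["y"]
    return x, y, x + thumb_width, y + thumb_height
```

Hover times past the last snapshot fall back to the final frame, so the preview never goes blank at the end of the seek bar.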