: The model uses a memory mechanism to track objects even if they are temporarily occluded (hidden) or exit and re-enter the scene.

: SAM 2 is roughly 6x faster than the original SAM on image segmentation benchmarks.

: It serves as a single drop-in model for both image and video segmentation, replacing separate tools for each task.

The following feature details the capabilities and technical specifications of this model as of April 2026.

Core Capabilities of SAM 2

: Users can supply points, bounding boxes, or masks as prompts to identify objects. For video, a prompt on a single frame creates a "masklet" (a sequence of per-frame masks for one object) that follows the object through the entire clip.
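
The prompt-to-masklet flow can be illustrated with a toy stand-in. The sketch below is not the SAM 2 API: `Prompt`, `resolve_label`, and `build_masklet` are hypothetical names, and the "video" is assumed to be a list of small grids whose pixels already carry integer object labels, so no real segmentation or learned tracking takes place. It only shows how a point, box, or mask prompt on one frame can select an object, and how the result is one mask per frame, empty wherever the object is hidden.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Frame = List[List[int]]  # toy frame: each pixel holds an object label (0 = background)
Mask = List[List[int]]   # binary mask with the same shape as a frame


@dataclass
class Prompt:
    """One user prompt on one frame (hypothetical structure, not the SAM 2 API)."""
    frame_idx: int
    point: Optional[Tuple[int, int]] = None          # (x, y) foreground click
    box: Optional[Tuple[int, int, int, int]] = None  # (x0, y0, x1, y1), inclusive
    mask: Optional[Mask] = None                      # rough binary mask


def resolve_label(frames: List[Frame], prompt: Prompt) -> int:
    """Return the object label that the prompt selects in its frame."""
    frame = frames[prompt.frame_idx]
    if prompt.point is not None:
        x, y = prompt.point
        pixels = [frame[y][x]]
    elif prompt.box is not None:
        x0, y0, x1, y1 = prompt.box
        pixels = [frame[y][x] for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)]
    elif prompt.mask is not None:
        pixels = [v for row, mrow in zip(frame, prompt.mask)
                  for v, m in zip(row, mrow) if m]
    else:
        raise ValueError("prompt needs a point, box, or mask")
    foreground = [p for p in pixels if p != 0]
    if not foreground:
        raise ValueError("prompt does not cover any object")
    # Most common non-background label inside the prompted region.
    return max(set(foreground), key=foreground.count)


def build_masklet(frames: List[Frame], prompt: Prompt) -> Dict[int, Mask]:
    """Return one binary mask per frame for the prompted object."""
    label = resolve_label(frames, prompt)
    return {
        i: [[1 if v == label else 0 for v in row] for row in frame]
        for i, frame in enumerate(frames)
    }


# Toy clip: object 7 is visible, then hidden, then reappears elsewhere.
demo_frames = [
    [[0, 7, 7], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 7, 7]],
]
demo_masklet = build_masklet(demo_frames, Prompt(frame_idx=0, point=(1, 0)))
```

In this toy clip, `demo_masklet[1]` is an all-zero mask because the object is absent from that frame, and `demo_masklet[2]` picks it up again at its new position. The real model has to infer that correspondence from image content and its memory of earlier frames; here it is given for free by the label values.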