Where to look at the movies : Analyzing visual attention to understand
movie editing
- URL: http://arxiv.org/abs/2102.13378v1
- Date: Fri, 26 Feb 2021 09:54:58 GMT
- Title: Where to look at the movies : Analyzing visual attention to understand
movie editing
- Authors: Alexandre Bruckert, Marc Christie, Olivier Le Meur
- Abstract summary: We propose a new eye-tracking database, containing gaze pattern information on movie sequences.
We show how state-of-the-art computational saliency techniques behave on this dataset.
- Score: 75.16856363008128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the process of making a movie, directors constantly care about where the
spectator will look on the screen. Shot composition, framing, camera movements
or editing are tools commonly used to direct attention. In order to provide a
quantitative analysis of the relationship between those tools and gaze
patterns, we propose a new eye-tracking database, containing gaze pattern
information on movie sequences, as well as editing annotations, and we show how
state-of-the-art computational saliency techniques behave on this dataset. In
this work, we expose strong links between movie editing and spectators'
scanpaths, and open several leads on how knowledge of editing information
could improve human visual attention modeling for cinematic content. The
dataset generated and analysed during the current study is available at
https://github.com/abruckert/eye_tracking_filmmaking
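
As a concrete illustration of the kind of saliency evaluation the abstract refers to (not code from the paper or the released repository), the sketch below builds a binary fixation map from gaze points, blurs it into a continuous ground-truth saliency map, and scores a predicted saliency map with the Normalized Scanpath Saliency (NSS) metric, a standard measure for computational saliency models. The frame size, the 35-pixel blur, and the random prediction are assumptions for illustration only; the actual data layout is documented in the linked repository.

```python
# Illustrative sketch (assumed data layout, not the dataset's actual API):
# turn fixation points for one frame into a fixation map, blur it into a
# ground-truth saliency map, and score a model prediction with NSS.
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_map(fixations, height, width):
    """Binary map with a 1 at every (x, y) fixation location."""
    fmap = np.zeros((height, width), dtype=np.float32)
    for x, y in fixations:
        if 0 <= y < height and 0 <= x < width:
            fmap[int(y), int(x)] = 1.0
    return fmap

def nss(saliency_pred, fix_map):
    """Normalized Scanpath Saliency: mean standardized prediction at fixated pixels."""
    pred = (saliency_pred - saliency_pred.mean()) / (saliency_pred.std() + 1e-8)
    return float(pred[fix_map > 0].mean())

# Hypothetical example: 10 observers' fixations on a 1080p frame.
rng = np.random.default_rng(0)
fixations = rng.integers(0, [1920, 1080], size=(10, 2))
fmap = fixation_map(fixations, height=1080, width=1920)

# Continuous ground-truth map: Gaussian blur of the fixation map
# (sigma of roughly one degree of visual angle, here an assumed 35 px).
gt_saliency = gaussian_filter(fmap, sigma=35)

# A model's predicted saliency map would be scored like this:
prediction = gaussian_filter(rng.random((1080, 1920)).astype(np.float32), sigma=35)
print(f"NSS = {nss(prediction, fmap):.3f}")
```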
Related papers
- Temporally Consistent Object Editing in Videos using Extended Attention [9.605596668263173]
We propose a method to edit videos using a pre-trained inpainting image diffusion model.
We ensure that the edited information will be consistent across all the video frames.
arXiv Detail & Related papers (2024-06-01T02:31:16Z)
- MagicStick: Controllable Video Editing via Control Handle Transformations [49.29608051543133]
MagicStick is a controllable video editing method that edits video properties by applying transformations to extracted internal control signals.
We present experiments on numerous examples within our unified framework.
We also compare with shape-aware text-based editing and handcrafted motion video generation, demonstrating better temporal consistency and editing capability than previous works.
arXiv Detail & Related papers (2023-12-05T17:58:06Z)
- FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing [65.60744699017202]
We introduce optical flow into the attention module of the diffusion model's U-Net to address the inconsistency issue in text-to-video editing.
Our method, FLATTEN, enforces that patches on the same flow path across different frames attend to each other in the attention module.
Results on existing text-to-video editing benchmarks show that our method achieves new state-of-the-art performance.
arXiv Detail & Related papers (2023-10-09T17:59:53Z)
- MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning [54.73173491543553]
MoviePuzzle is a novel challenge that targets visual narrative reasoning and holistic movie understanding.
To tackle this quandary, we put forth the MoviePuzzle task, which amplifies the temporal feature learning and structure learning of video models.
Our approach outperforms existing state-of-the-art methods on the MoviePuzzle benchmark.
arXiv Detail & Related papers (2023-06-04T03:51:54Z)
- The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing [90.59584961661345]
This work introduces the Anatomy of Video Editing, a dataset and benchmark to foster research in AI-assisted video editing.
Our benchmark suite focuses on video editing tasks beyond visual effects, such as automatic footage organization and assisted video assembly.
To enable research on these fronts, we annotate more than 1.5M tags with concepts relevant to cinematography, from 196,176 shots sampled from movie scenes.
arXiv Detail & Related papers (2022-07-20T10:53:48Z)
- Film Trailer Generation via Task Decomposition [65.16768855902268]
We model movies as graphs, where nodes are shots and edges denote semantic relations between them.
We learn these relations using joint contrastive training which leverages privileged textual information from screenplays.
An unsupervised algorithm then traverses the graph and generates trailers that human judges prefer to ones generated by competitive supervised approaches (a minimal sketch of this shot-graph idea follows this list).
arXiv Detail & Related papers (2021-11-16T20:50:52Z)
- GAZED - Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings [6.980491499722598]
We present GAZED (eye GAZe-guided EDiting) for videos captured by a solitary, static, wide-angle, high-resolution camera.
Eye-gaze has been effectively employed in computational applications as a cue to capture interesting scene content.
arXiv Detail & Related papers (2020-10-22T17:27:03Z)
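
The shot-graph formulation in the trailer-generation entry above can be made concrete with a small sketch: shots become nodes, weighted edges stand in for the learned semantic relations, and a greedy traversal picks a sequence of shots under a duration budget. This is an illustration of the general idea only; the node attributes, edge weights, and selection rule below are invented placeholders, not the paper's method.

```python
# Minimal sketch of the shot-graph idea from "Film Trailer Generation via Task
# Decomposition": shots are nodes, weighted edges stand in for learned semantic
# relations, and a greedy walk selects trailer shots under a duration budget.
# Edge weights and the selection rule are illustrative placeholders.
from dataclasses import dataclass, field

@dataclass
class Shot:
    shot_id: int
    duration: float                                             # seconds
    neighbors: dict[int, float] = field(default_factory=dict)   # shot_id -> relation weight

def add_relation(shots: dict[int, Shot], a: int, b: int, weight: float) -> None:
    """Add a symmetric semantic relation between two shots."""
    shots[a].neighbors[b] = weight
    shots[b].neighbors[a] = weight

def greedy_trailer(shots: dict[int, Shot], start: int, budget: float) -> list[int]:
    """Walk the graph from `start`, always following the strongest unused relation,
    until no neighbor fits within the remaining duration budget."""
    path, used = [start], {start}
    total = shots[start].duration
    current = start
    while True:
        candidates = [(w, nid) for nid, w in shots[current].neighbors.items()
                      if nid not in used and total + shots[nid].duration <= budget]
        if not candidates:
            return path
        _, nxt = max(candidates)
        path.append(nxt)
        used.add(nxt)
        total += shots[nxt].duration
        current = nxt

# Toy example: five shots with hand-picked relation weights.
shots = {i: Shot(shot_id=i, duration=3.0 + i) for i in range(5)}
add_relation(shots, 0, 1, 0.9)
add_relation(shots, 1, 2, 0.7)
add_relation(shots, 1, 3, 0.4)
add_relation(shots, 2, 4, 0.8)
print(greedy_trailer(shots, start=0, budget=15.0))   # -> [0, 1, 2]
```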
This list is automatically generated from the titles and abstracts of the papers in this site.