I did something similar a few years back. I find the Hilbert curve pattern much more interesting.
https://github.com/akash-akya/hilbert-montage
Not an original idea; it was inspired by someone else's project at the time.
Interesting: at first blush, it looks like it's clustered based on image similarity rather than time?
But that must be wrong, the README.md mentions it visualizing time.
I'd love to understand a bit more.
The README.md punts to Wikipedia on Hilbert curves, which is classic Wikipedia: it makes sense if you understand it already :) and to a 20-minute video on Hilbert curves, which I find hard to commit to, assuming it's unlikely to touch on movie visualization via Hilbert curves.
It's definitely hard, and not your responsibility, to explain the scientific concept.
But I'd love to hear your understanding of what makes the visualization more interesting.
A Hilbert curve is a mapping between 1D and 2D space that attempts to preserve locality. Two points that are close in 2D space tend to map to two points that are close in 1D space and vice versa.
Tl;dr: a fancy way to map 1D time to a 2D image.
If you imagine a movie as a line along the time axis with each frame as a pixel, there are multiple ways to create a 2D image.
A bar graph is a simple approach, but it is essentially still one-dimensional: we are only using the x axis.
A zig-zag pattern is another approach, where you scan top to bottom, left to right. But in this case the relative distances between close frames aren't fully preserved: two distant frames might appear together, or close frames might end up far apart, which leads to odd-looking artifacts.
A Hilbert curve is a pattern that maps the 1D line onto the full 2D space such that the relative distance between any two points (frames) on the line is somewhat preserved. That's why each scene appears as a clump/blob.
Here it is hard to see the movie's progression from start to end, but all frames from a scene always stay close together, which is what I was aiming for. I find it interesting that the visual aspect (color/scene) is easy to see here but the temporal aspect isn't.
I was excited about the whole 1D-to-2D mapping aspect at the time, which led to this toy.
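For the curious, here's a minimal Python sketch of the standard index-to-coordinate mapping (the iterative d2xy routine from the Wikipedia article the README links to; not the repo's actual code):

```python
# Map a 1D index d to (x, y) on an n-by-n grid (n a power of two),
# following the Hilbert curve. Consecutive indices land on adjacent pixels.
def d2xy(n, d):
    x = y = 0
    s = 1
    t = d
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate/flip the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

# e.g. place 4096 per-frame average colors on a 64x64 image:
# for d, color in enumerate(frame_colors):
#     img[d2xy(64, d)] = color
```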
Another good example of Hilbert curves as visualization is this online bin-file analyzer:
https://binvis.io/#/view/examples/elf-Linux-ARMv7-ls.bin
Nice project and thanks for all your work maintaining vix!
These barcode-type things always remind me of Cinema Redux - films distilled down into single images, an early memorable use of Processing.
https://brendandawes.com/projects/cinemaredux
This is really cool. I'd love to see this done for the movie "Lola rennt" (Run Lola Run), which is studied for its color symbolism throughout the film.
Here you are: https://images2.imgbox.com/23/1f/hDNP21e9_o.png
This is using the linked Python code - thanks for sharing, OP! - I didn't look into the details, but it takes an incredibly long time to save .png files for the source frames despite only selecting 6,400 of them.
I think the choice of Hero is based on a similar idea to this, too.
Reminds me of the time I applied this technique to basically a webcam stream of the northern night sky. You could immediately see if there were northern lights that night (and when) without having to scrub through the footage. I bet there are other use cases that haven't been explored yet.
This post brings back memories. I remember watching this trilogy in an old movie theater that still showed film, and it was an amazing experience.
"Blue," "White," and "Red" is a trilogy of French films made by Polish-born filmmaker Krzysztof Kieślowski. Each movie follows its namesake color's palette.
Modern digital movies are way too sharp, in a bad way.
This is quite nice. Not sure what the meaning is of a circle versus, say, a linear strip, but it’s very effective for showing the dominant colors over time. I’d love to see this for many movies across time; my understanding is most are color graded green/yellow now, and it’d be nice to see that evolution visually.
Probably because it kinda looks like eye colors, hence the name "iris" and the shape
Iris gotta look like iris
What comes first, the name or the design?
I don't know if this is the "original" (it's a common idea), but here's some "movie barcode" Python code from 9 years ago: https://github.com/timbennett/movie-barcodes
This tumblr was posting these from 2011 to 2018 - moviebarcode.tumblr.com
I think it's something people keep rediscovering. It's a pretty fun programming problem that lets you explore lots of different domains at the same time (video processing, color theory, different coordinate systems for visualizing things) and you get a tangible "cool" piece of art at the end of your effort.
I built one of these back in the day. Part of the fun was seeing how fast I could make the pipeline. Once I realized that FFmpeg could read arbitrary byte ranges directly from S3, I went full ham on throwing machines at the problem. I could crunch through a 4-hour movie in a few seconds by distributing the scene extraction over an army of Lambdas (while staying in the free tier!). Ditto for color extraction and presentation. Lots of fun was had.
I have a CLI tool I maintain that finds visually similar images.
As a fun experiment several years ago I extracted all the frames of Skyfall and all the frames of the first Harry Potter movie.
I then reconstructed Harry Potter frame by frame using the corresponding frame from Skyfall that was most visually similar.
The end result was far more indecipherable than I'd ever expected. The much darker color palette of Harry Potter led to the final result largely using frames from a single dark scene in Skyfall, with single frames often being used over and over. It was pretty disappointing given it took hours and hours to process.
Thinking about it now, there's probably a way to compensate for this. Some sort of overall palette compensation.
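One plausible form of palette compensation (a sketch assuming scikit-image, not something the original experiment used): histogram-match each candidate source frame toward the target movie's overall color distribution before scoring similarity.

```python
# Per-channel histogram matching as crude palette compensation.
# `reference` could be a mosaic of target-movie frames.
import numpy as np
from skimage.exposure import match_histograms

def compensate(source_frame: np.ndarray, reference: np.ndarray) -> np.ndarray:
    # Pull the source frame's color distribution toward the reference's,
    # so a dark target no longer matches only the darkest source scene.
    return match_histograms(source_frame, reference, channel_axis=-1)
```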
I imagine the best case outcome would look something like Jack Gallant's 2011 work on visual mind reading, where they trained a model on brain activity watching hundreds of hours of YouTube and then attempted to reconstruct a view of video not in the training set by correlating the new brain activity with frames from the input... Sorry I'm not explaining very clearly but there's a YouTube video of the result naturally :)
https://youtu.be/nsjDnYxJ0bo
Learned about this from Mary Lou Jepsen's 2013 TED talk, what a throwback
https://www.ted.com/talks/mary_lou_jepsen_could_future_devic...
That's an interesting idea. I wonder how well the film iris/barcodes could be used to figure out which movies make the best 'palette' to recreate a given scene.
I spent some time earlier this year on creating mosaics of movie posters using other posters as tiles: https://joshmosier.com/posts/movie-posters/full-res.jpg (warning: 20mb file) Using this on each frame of a scene gave some good results with a fine enough grid even with no repeating tiles: https://youtu.be/GVHPi-FrDY4
If you had a much bigger corpus, and used some semantics-aware similarity metrics (think embeddings), you could maybe end up with something actually coherent
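For instance (a sketch assuming the sentence-transformers package and its CLIP checkpoint; the filenames are hypothetical):

```python
# Rank candidate source frames by semantic similarity to a target frame
# using CLIP image embeddings (one of many possible embedding models).
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")

target = model.encode(Image.open("hp_frame_0042.png"))  # hypothetical paths
candidates = model.encode([Image.open(p) for p in
                           ("sf_0001.png", "sf_0002.png", "sf_0003.png")])

best = int(util.cos_sim(target, candidates).argmax())   # most similar frame
```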
if you take this idea to the limit you'll basically generate a visual embedding equivalent to tvtropes.org
I'm interested in this. What's your similarity measure? I was going to write a frame comparison tool to make seamlessly looping video.
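Not OP's actual measure (no idea what the tool uses), but a common cheap one is a difference hash; a small Hamming distance between hashes means visually similar frames. A sketch:

```python
# dHash: shrink, grayscale, compare each pixel to its horizontal neighbor.
from PIL import Image

def dhash(path, size=8):
    img = Image.open(path).convert("L").resize((size + 1, size))
    px = list(img.getdata())
    bits = 0
    for row in range(size):
        for col in range(size):
            i = row * (size + 1) + col
            bits = (bits << 1) | (px[i] > px[i + 1])
    return bits

def hamming(a, b):
    return bin(a ^ b).count("1")  # 0 = identical, small = visually close
```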
Here's how to do it with ffmpeg and ImageMagick:
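(The exact commands didn't survive in this excerpt; judging from the replies below, they had roughly this shape: extract frames with ffmpeg, then tile them with ImageMagick.)

```sh
# Guessed reconstruction, not the original commands: dump frames, then tile.
ffmpeg -i movie.mkv frames/%06d.png
magick montage 'frames/*.png' -tile 64x -geometry +0+0 montage.png
# On ImageMagick 6 there is no `magick` wrapper; invoke `montage` directly.
# Quoting the glob lets ImageMagick expand it internally, dodging the
# shell's "Argument list too long" limit mentioned below.
```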
Probably want a `-vf fps=1/n` in there (where n = number of seconds) to trim the film down to a manageable number of frames.
(you'd ideally do some "clever" processing to combine a number of frames into a single colour strip but that's obviously more faff than just a simple `ffmpeg` call...)
You could also use something like https://github.com/Breakthrough/PySceneDetect to first split the video into camera shots, and then grab a single (or average) frame per shot, leading to a cleaner result.
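A minimal sketch with PySceneDetect's quickstart API (default threshold; hypothetical movie.mkv):

```python
# Detect shot boundaries; you'd then grab one representative frame per shot.
from scenedetect import detect, ContentDetector

scenes = detect("movie.mkv", ContentDetector())
for i, (start, end) in enumerate(scenes):
    print(f"shot {i}: {start.get_timecode()} -> {end.get_timecode()}")
```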
I believe ffmpeg also has some scene detection stuff[0] but I've never had occasion to use it.
[0] e.g. https://stackoverflow.com/questions/35675529/using-ffmpeg-ho...
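For reference, the select filter exposes a per-frame scene-change score; a hedged example (0.4 is just a commonly suggested threshold) that dumps one image per detected cut:

```sh
# Keep only frames whose scene-change score exceeds 0.4.
ffmpeg -i movie.mkv -vf "select='gt(scene,0.4)'" -vsync vfr shot_%04d.png
```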
Does it work for you on a film? 'magick' isn't a command on my install. I can run montage but get
>bash: /usr/bin/montage: Argument list too long
with 114394 files for a 1h20m film.
You'll need to install ImageMagick: https://imagemagick.org/script/download.php
I have ImageMagick installed, but it has no 'magick' command; you run montage etc. directly. The problem is globbing that huge number of files, so I'm wondering if you've tested the commands.
>montage
>Version: ImageMagick 6.9.11-60 Q16 x86_64 2021-01-25
Ah. Good point. I only tried on a shorter video. Might have to concatenate batches of frames to get around that (much less elegant).
I wonder if someone could use this to demonstrate how much darker films seem to have gotten.
That and actors that mumble their words...
I'm not sure the actors are mumbling their words. I think the issue stems from the way sound is decoded on people's devices. Movies used to be mixed down to stereo, and TVs have no problem reproducing that. Now that it's possible to stream, say, 7.2, the TV will tell the service to send something like that, but then it does a poor job of outputting the center vocal channel through its two rear $3 speakers so you can hear it. The same audio track decoded with a nice cinema amp and center speakers will generally be really clear. (Also, some TVs have a setting to boost the vocals to try and fix this.)
It's the same with the dark movies. They seem to be color graded and encoded for HDR screens, and most HDR screens are really fake HDR, so the movies just come out super dark. They look a lot less horrible on a really great high-end TV in a totally dark room.
Great article on the subject: https://www.slashfilm.com/673162/heres-why-movie-dialogue-ha...
What a fantastic and in-depth article, thank you. Seems to cover every single issue with the sound!