66 comments

  • socalgal2 10 hours ago

    Cool but ... this also sounds like hoarding behavior. I've saved so many things over the years only to throw them away years later and realize that saving them in the first place was a waste of time.

    In the 90s my friend's mom would video tape AMC movies. She had 300+ tapes. Maybe she had a few rare ones but now all those movies are available on demand either legally or illegally and in much better quality. Another friend kept all of his 1980s computer magazines (Byte, etc...) and moved these extremely heavy boxes through 30+ years of moves. I doubt he ever opened a single magazine since the moment he saved them. Then they all appeared on The Archive and he finally got rid of them.

    To be clear, I have a few youtube videos saved on my local storage. I'm just thinking that saving every video I watch reminds me of the things I've personally over-saved.

    Actually that reminds me. I met up with the magazine saving friend recently which is when I verified that he finally got rid of his stash. It made me think about things I'm still saving that, if I reflect on it, I know I will never actually look at. For example I have a box of about eight 3.5 inch floppy disks from my Amiga days. The odds that I'm going to get an Amiga or download an Amiga emu and get a drive to read those are close enough to zero that I should throw them away. Similarly I have a book of CD-ROMs of backed up data from the 90s. There's a close to 0% chance that I'm ever going to bother to look at their contents.

    • toomuchtodo 7 hours ago

      PSA: if you have a collection or other artifacts for ingest by IA, I’ll cover reasonable shipping costs to get them there. Above a certain size, they’ll handle logistics of packing and shipping for ingest.

      https://help.archive.org/help/how-do-i-make-a-physical-donat...

      Tools to make this easy exist if you already have digital versions.

      https://github.com/jjjake/internetarchive

      And don’t forget to send a few dollars if and when you can.

      https://archive.org/donate

      (no affiliation, I just like the public good)

    • mananaysiempre 9 hours ago

      > Another friend kept all of his 1980s computer magazines (Byte, etc...) and moved these extremely heavy boxes through 30+ years of moves.

      I don’t think IA has all early issues of the Microsoft Systems Journal (later MSDN Magazine), among others. So this can be useful. (Also, what kind of person do you think put the magazines up on IA in the first place?..)

      • asdefghyk 9 hours ago

        Lots of magazines never made it to the archives and have been lost.

    • SilverElfin 8 hours ago

      I would like to be able to search old videos I’ve seen sometimes. Like to find that one recipe I saw or to pull out that one fact I thought I heard. Or sometimes just to listen to a song that later got made private or deleted outright. When YouTube deletes a video it doesn’t even leave the title in your playlist so it can be frustrating to try and find the same thing again.

    • jh00ker 9 hours ago

      > I have a book of CD-ROMs of backed up data

      >There's a close to 0% chance that I'm never going to bother look at their contents.

      More likely scenario, your children, grandchildren or other family members go through your shit after you pass away and discover stuff about you that perhaps you never wanted to share.

      This is something I think about a lot because I don't have a "digital legacy plan."

      • socalgal2 8 hours ago

        > More likely scenario, your children, grandchildren or other family members go through your shit after you pass away

        I think that's not really likely. I'm pretty sure if you poll you'll find that few children care about their parents' "stuff". You can find plenty of people who've lost parents, found that they didn't have any interest in going through their parents' stuff, and from that realized their own children would feel the same about theirs.

        Most children aren't going to dig through anything more than a physical photo album, and when they do, the only pictures that are relevant to them are those with people they know. The rest only have meaning to the dead parent. They aren't going to dig through hard drives or CDs unless they are searching for financial documents so they can finish up their parent's financial affairs.

        > discover stuff about you that perhaps you never wanted to share

        I do worry about that. I just tell myself I'll be dead so it doesn't really matter.

        • Larrikin 4 hours ago

          Where are you drawing this conclusion from?

          Nobody in my family was waiting for one of my parents to die and it actually happened rather suddenly although he was retirement age. There was a very rapid effort to ensure we discovered as many passwords as possible, bought a family NAS, and backed up his entire computer starting with the Lightroom video and pictures. We later went through all of the family photos and folders he hadn't put in there.

          To this day it's constantly running with an off site back up to my NAS. There are some photos of cousins we didn't really know, but he owned the best digital cameras of every era since their invention so it's a huge documentation of life. It would have been a family tragedy to lose that.

      • globalise83 7 hours ago

        On your deathbed, you say: "My only regret is forgetting where I saved my Bitcoin keys".

      • npteljes 8 hours ago

        Store your archive encrypted, and then later you can decide if you share the password or not :)

    • danieldk 8 hours ago

      I am the exact opposite and sell or throw away pretty much everything that I don't use. I find that doing so not only clutters the house less, but also gives you less to worry about.

      My general rule is - if I didn't use it for a year, I don't need it. There are obviously some exceptions like a fire extinguisher (which I hope to never use) and digitized photos, which only go through a careful selection.

      I think the thing I kept the longest was a Libranet Linux 3.0 CD set because I worked for Libra Computer Systems for a while and this was the release that I helped build. A few years ago I threw it away, I think after I saw someone had uploaded it to archive.org. When I'm 60 and want to install it again for good old times' sake, I can.

      tl;dr: if you don't use something for a year, you probably don't need it.

      • zie 3 hours ago

        > fire extinguisher (which I hope to never use)

        These expire, so make sure you check yours is still good!

        Otherwise I agree with and do basically the same thing. I also make exceptions for most tools and emotional connection items.

      • pessimizer 7 hours ago

        I don't get around to using plenty of things for the first time a year after I've purchased them. That policy in my life would be a nightmare of constantly rebuying stuff, or failing to rebuy stuff that is now gone forever.

        Almost everything that has become indispensable in my life took years to integrate into my life to any significant degree.

        "Need" is a weasel word. You don't need anything.

        • seb1204 3 hours ago

          I would like to hear more about this. So you buy, say, a new air fryer, a new monitor, a new mobile phone or a new shirt and it takes you 12 months to first use it? Or is it more like you buy 3 SD cards in bulk but might not need all 3 right now? Do you live in a remote area where online shopping delivery is not available? Is it just a habit? Honestly curious.

        • redserk 6 hours ago

          I’d own a lot less stuff if there were more opportunities to rent infrequently used items.

          As it stands, I have a workshop and electronics bench with many tools that will go unused for years but are critical when I need them and too expensive to buy and throw away.

      • stirfish 4 hours ago

        I will be buried with my box of miscellaneous cables.

    • graemep 8 hours ago

      Physical media had a much higher cost, in terms of both the cost of the media and the space it uses, so you can hoard a lot more digitally.

      Maybe something a bit more selective than this though!

      • lukebechtel 8 hours ago

        Yes!

        Hoarding is bad when it's costly, due to space, time, or money.

        Digital media hoarding is thus not bad at all!

        • socalgal2 7 hours ago

          You have to define "cost". I have a "server" with 3 external drives connected. One is "media" and 2 are for backups. I have a drawer with 11 external hard drives which I haven't used in years that used to be my "media" and backup drives. Each of those represents money (buying the drive) and time (copying stuff from old 1TB drives to 2TB drives to 4TB drives), etc.

          So there is a cost to digital media hoarding.

          I wanted to save the videos I'd captured from my car's cameras but there's ~250GB every 3-4 months or so, which means more money. Plus, if I wanted them actually available to access I'd need a way to plug more drives live into my server, so more $$$$, and I'd need to back them up for when the drives fail, so more $$$$.

          So yeah, there is a cost to digital media hoarding.

          • mlyle 24 minutes ago

            The storage is cheap. The cost is keeping it online and the opportunity cost of gathering, maintaining, and preserving collections.

    • nemomarx 10 hours ago

      Getting them on a public shared archive is probably a good outcome though. There was that lady who taped hundreds of hours of daytime TV and archiving that has some interesting historical uses?

      But a personal copy I'm not sure has much point yeah.

      • toomuchtodo 3 hours ago

        > Marion Marguerite Stokes (née Butler; November 25, 1929 – December 14, 2012) was an American access television producer, businesswoman, investor, civil rights demonstrator, activist, librarian, and archivist, especially known for hoarding and archiving hundreds of thousands of hours of television news footage spanning 35 years [70,000 VHS tapes], from 1977 until her death in 2012, at which time she had been operating nine properties and three storage units. According to the Los Angeles Review of Books review of the 2019 documentary film Recorder, Stokes's massive project of recording the 24-hour news cycle "makes a compelling case for the significance of guerrilla archiving."

        https://en.wikipedia.org/wiki/Marion_Stokes

        https://archive.org/details/marionstokesvideo

        https://recorderfilm.com/

    • bombcar 9 hours ago

      Digital hoarding takes nearly no practical space.

      And there’s a number of YouTube videos I wish I could still access.

    • fcpguru 10 hours ago

      oh that's not why I want them local. I want to open them in final cut pro and edit them and use parts in other videos. I delete the data folder at the end of each day.

    • pessimizer 7 hours ago

      > Then they all appeared on The Archive and he finally got rid of them.

      Sometimes you're the person who is uploading them to public archives. Because everybody else threw them all away, and you saved them until the technology made archiving practical enough.

      I've been replacing all of my physical media for years, but the reason I can do that now is because other people scan/rip and archive/share the stuff. You also have unique stuff that you may not even know is unique. When you find something in your house that you can't find online, scan it and you're paying everybody back for all of the scanning they did for you.

      With the CD-ROMs, you should just glide through them one by one and check if you can find the stuff online. If you can, throw them in the trash. If you can't, copy their contents to a folder, and throw them in the trash. Go through the folder over the next hour or next 20 years (however long it takes to get around to it) and take the things you can't find online that you think somebody might want, and get those things to that somebody (uploading to archive.org is always a good place to start.)

      edit: I know for a fact that for a lot of people, uploading somewhere on the internet is their standard pre-deletion ritual.

    • attila-lendvai 6 hours ago

      hoarding, or maybe just anti-censorship measure.

    • ThrowawayTestr 9 hours ago

      Hard drives are cheap and compact. The real issue is archiving with no organization or indexing.

    • hkon 9 hours ago

      With enough space available hoarding is just thinking ahead.

  • erinnh 10 hours ago

    I've been using Tubearchivist with the extension for this.

    https://github.com/tubearchivist/browser-extension

    I really like the WebUI of Tubearchivist itself.

    • fcpguru 10 hours ago

      The main feature I want is to just browse YouTube like normal in Firefox, like I always do, and completely forget starchive is running. Then later in the day I'm pleasantly surprised that any video I want to clip is already downloaded and ready. I never know which one I'll want to download and I don't want to have to click any button.

  • mikae1 11 hours ago

    > Videos are saved to the ./data/ directory and converted to MOV format using ffmpeg with hardware acceleration

    Transcoded (ouch) or just remuxed to a mov container? Have to investigate.

    • pixelpoet 7 minutes ago

      Yeah I was onboard until the re-encoding part: yt-dlp maintains the exact bits, why on earth would someone want to waste encoding time just to trash the quality?

      On top of that... seriously, of all the formats one could choose, MOV?! Might as well choose DivX or RealVideo.

    • atahanacar 10 hours ago
    • fcpguru 10 hours ago

      The video has to be re-encoded because Apple QuickTime doesn't like the YouTube video format, but the audio can just be copied. With hardware acceleration my Mac's fan never spins up, so it runs in the background and I just forget about it.
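
      To make the transcode-versus-remux distinction in this sub-thread concrete, here is a minimal Go sketch that only builds the ffmpeg argument list for each case. It is not the project's actual code: the function name `convertArgs`, the file paths, and the choice of `h264_videotoolbox` (ffmpeg's VideoToolbox H.264 encoder on macOS) are assumptions, and copying the audio stream only works when the source audio codec is one the MOV container accepts.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// convertArgs builds ffmpeg arguments that write <outDir>/<name>.mov.
// With reencodeVideo set, the video stream is re-encoded by a hardware
// encoder and the audio stream is copied as-is. Without it, every stream
// is copied (a pure remux, no quality loss).
// The encoder name is an assumption; the thread does not say which one is used.
func convertArgs(input, outDir string, reencodeVideo bool) []string {
	base := strings.TrimSuffix(filepath.Base(input), filepath.Ext(input))
	output := filepath.Join(outDir, base+".mov")

	args := []string{"-i", input}
	if reencodeVideo {
		args = append(args, "-c:v", "h264_videotoolbox", "-c:a", "copy")
	} else {
		args = append(args, "-c", "copy")
	}
	return append(args, output)
}

func main() {
	// Hypothetical input path, using a video ID from the listing in this thread.
	in := "downloads/CHbawkGc_os.webm"
	fmt.Println("transcode: ffmpeg " + strings.Join(convertArgs(in, "data", true), " "))
	fmt.Println("remux:     ffmpeg " + strings.Join(convertArgs(in, "data", false), " "))
}
```

      The resulting slice could be passed to `exec.Command("ffmpeg", args...)`. Whether a remux alone would satisfy QuickTime depends on the source codecs, which is the point mikae1 and pixelpoet are raising.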

  • Szpadel 9 hours ago

    I created something similar in concept but with a different goal. I wanted to be able to watch videos with SponsorBlock on an iPad, ideally using Plex.

    I found a self-hosted solution like this but I was very dissatisfied with how it worked.

    On the other hand, I wanted to check out the loco.rs framework, so I decided to implement my own solution.

    Basically you are able to add channels/playlists on the many, many platforms that yt-dlp supports, you can select what should be cut out using SponsorBlock, and you choose how many days you want to keep videos (videos older than that are automatically deleted).

    if you are interested, you can check it out: https://github.com/Szpadel/LocalTube
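
    The two behaviours described here (cutting SponsorBlock segments and expiring old videos) can be sketched in Go as follows. This is not LocalTube's implementation; the function names and the Unix-timestamp interface are made up for illustration. The only real interface relied on is yt-dlp's `--sponsorblock-remove` option, which takes a comma-separated category list.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// sponsorBlockArgs builds yt-dlp arguments that cut the given SponsorBlock
// categories (for example "sponsor" or "selfpromo") out of the download.
// With no categories, the video is downloaded untouched.
func sponsorBlockArgs(url string, categories []string) []string {
	args := []string{}
	if len(categories) > 0 {
		args = append(args, "--sponsorblock-remove", strings.Join(categories, ","))
	}
	return append(args, url)
}

// isExpired reports whether a file last modified at modUnix is strictly
// older than keepDays days as of nowUnix (both in Unix seconds).
func isExpired(modUnix, nowUnix int64, keepDays int) bool {
	const secondsPerDay = 24 * 60 * 60
	return nowUnix-modUnix > int64(keepDays)*secondsPerDay
}

func main() {
	args := sponsorBlockArgs(
		"https://www.youtube.com/watch?v=CHbawkGc_os",
		[]string{"sponsor", "selfpromo"},
	)
	fmt.Println("yt-dlp " + strings.Join(args, " "))

	// A file from ten days ago falls outside a seven-day retention window.
	now := time.Now()
	tenDaysAgo := now.AddDate(0, 0, -10)
	fmt.Println("expired:", isExpired(tenDaysAgo.Unix(), now.Unix(), 7))
}
```

    A cleanup pass would call `isExpired` with each file's modification time and delete the files for which it returns true.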

  • ProofHouse 6 hours ago

    I don’t really get the purpose of this broadly, because doesn’t YouTube keep videos online unless the creator took them down which is probably not the case 95% of the time? That said for a niche or a high likelihood of a video being removed, or if you really want to be 100% certain it makes sense, but would I be accurate in that statement or am I missing something?

    • fcpguru 6 hours ago

      I'm not trying to save them forever. I just want them local so I can take clips from them for other videos. I use them as source input to final cut pro.

    • add-sub-mul-div 6 hours ago

      My version of this downloads the files to my Plex filesystem so I can watch them on my TV without going through a Youtube app. Also the sponsorblock segments are cut out of my local version after download.

      I go even further and schedule TV "channels" that rotate through the local videos using ErsatzTV.

  • frou_dh 5 hours ago

    For YouTube videos I feel are worth archiving, I just add them to playlists on my channel, then periodically download my entire channel using a single yt-dlp command (it can keep track of what's already been downloaded).
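
    The "keep track of what's already been downloaded" part most likely refers to yt-dlp's `--download-archive FILE` option, though the comment does not give the exact command. A hedged Go sketch: `syncArgs` builds such a command line, and `parseArchive` reads the archive contents assuming one `<extractor> <id>` pair per line. The playlist URL, file name, and both function names are placeholders.

```go
package main

import (
	"fmt"
	"strings"
)

// syncArgs builds yt-dlp arguments for an incremental download: IDs already
// listed in archiveFile are skipped, and newly fetched ones are appended to it.
func syncArgs(url, archiveFile string) []string {
	return []string{
		"--download-archive", archiveFile,
		"-o", "%(id)s.%(ext)s", // name each file after its video ID
		url,
	}
}

// parseArchive returns the set of video IDs recorded in an archive file.
// It assumes each non-empty line looks like "youtube CHbawkGc_os" and
// silently ignores lines that do not have exactly two fields.
func parseArchive(contents string) map[string]bool {
	seen := make(map[string]bool)
	for _, line := range strings.Split(contents, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 2 {
			seen[fields[1]] = true
		}
	}
	return seen
}

func main() {
	// Placeholder playlist URL and archive file name.
	args := syncArgs("https://www.youtube.com/playlist?list=PLACEHOLDER", "archive.txt")
	fmt.Println("yt-dlp " + strings.Join(args, " "))

	seen := parseArchive("youtube CHbawkGc_os\nyoutube 2PMzaym-StM\n")
	fmt.Println(seen["CHbawkGc_os"], seen["lqR7VV8ftys"])
}
```

    Running the same command periodically (from cron, say) then only fetches videos added to the playlists since the previous run.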

  • jz10 6 hours ago

    I gave Claude access to supadata YT transcription and obsidian MCP to convert them to "permanent note" format and it's helped tone down my YT addiction a lot

  • computegabe 11 hours ago

    Interesting. I was looking into creating an extension that manually manipulates and intercepts the vnd.yt-ump [1] requests, then use webcodecs to process everything in the browser.

    [1]: https://github.com/gsuberland/UMP_Format/blob/main/UMP_Forma...

  • myself248 7 hours ago

    Oh, this is huge and important. The number of things I watch that're just gone when I go back to look again!

    Youtube is an archive like a grocery store is a food archive. [1]

    If it was worth watching in the first place, it's worth saving. Reducing the friction of doing so is going to help a lot of people.

    [1] I'm getting this quote wrong; what's the actual quote and attribution?

    • john01dav 2 hours ago

      There's a specific removed video that I want back (the whole channel seems to have been bought by someone who wanted pre-existing subscribers or something, then everything on it was nuked and replaced with content that doesn't interest me). I tried emailing the channel's new email address to ask for it, and got no response. Do you know of any practical way to try to get it back?

    • fcpguru 6 hours ago

      ha, I'm not sure who said Youtube is a video archive like a grocery store is a food archive but that's excellent.

  • syntaxing 8 hours ago

    Whoa! I asked about something like this 2 years ago but never got to making anything [1]! Super exciting something like this exists!

    [1] https://news.ycombinator.com/item?id=37885584

  • amelius 8 hours ago

    It would be nice if the extension wrote them to some shared repository. That way, the videos could be preserved for humanity without Google having a say in it.

    Added benefit: every video would have to be archived only once.

    • Alive-in-2025 8 hours ago

      But then companies could sue to wipe out the centralized repo. So to be safe, you'd copy things to the central repo and also have a local copy. ;-)

      Next, you try to centralize all the private copies so only one person has to keep theirs. The solution is to end copyright for things over x years in age. Instead, in the US we keep pushing back the date.

      • amelius 7 hours ago

        Depends where the central server is. Nobody is wiping annas archive, for example.

  • ivanjermakov 9 hours ago

    I'm achieving this with a single yt-dlp script reading url from a clipboard.

    • fcpguru 6 hours ago

      oh but there's still the thought of having to press copy. My favorite thing about this is I just forget I even have it running and browse youtube like normal. Then later anything I've watched that day is already downloaded.

      • ivanjermakov 4 hours ago

        Sounds like a waste of network and computer resources to me - copying url means the video is worth the effort.

  • WithinReason 11 hours ago

    Now add DHT so clients can download videos from each other as a torrent and you solved global video distribution.

    • rwmj 9 hours ago

      That's basically PeerTube?

      • WithinReason 9 hours ago

        PeerTube doesn't have all of youtube's videos on it

  • untech 7 hours ago

    See also ArchiveBox, which supports YT saving as well, but can save other content too

    https://github.com/ArchiveBox/ArchiveBox

  • globular-toast 7 hours ago

    I've had this idea myself so cool to see it implemented.

    What I'd really like is a kind of universal web caching backend. So everything I access goes through a cache and I have the option of viewing from cache if something goes offline or changes. I could also mark things as "favourite" so they don't ever expire from the cache. Does such a thing exist?

    • fcpguru 6 hours ago

      trying to just grab from the actual browser cache is very hard for video. If you look at the complexity of yt-dlp you'll see why that's so much easier than trying to grab various formats from cache.

  • busymom0 7 hours ago

    My archiving app called HEAP can be configured using a simple AppleScript and yt-dlp to do this too. And since it's a native macOS app instead of a browser extension, it works with all browsers:

    https://apps.apple.com/ca/app/heap-website-full-page-image/i...

  • fcpguru 12 hours ago

    ~/os/starchive (main)[56daf7] $ ls -lh data

    total 3207312

    -rw-r--r-- 1 aa staff 525M Aug 2 09:11 2PMzaym-StM.mov

    -rw-r--r-- 1 aa staff 362M Aug 2 09:10 CHbawkGc_os.mov

    -rw-r--r-- 1 aa staff 658M Aug 2 09:11 lqR7VV8ftys.mov

    ~/os/starchive (main)[56daf7] $ ./starachive

    Server starting on port 3009...

    JSON received: map[videoId:CHbawkGc_os]

    Added video CHbawkGc_os to queue. Queue length: 1

    Processing video CHbawkGc_os. Remaining in queue: 0
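
    The log above suggests a small local server that receives a JSON body containing a `videoId` and pushes it onto a work queue. Below is a rough Go sketch of that flow, reconstructed only from the log lines; it is not the project's source. The handler, the 11-character ID check, the queue capacity, and all function names are assumptions, and `main` drives the handler in-process with `httptest` instead of opening port 3009 so that the example terminates.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"regexp"
	"strings"
)

// videoIDPattern matches IDs shaped like those in the listing above.
// Treating this as the only valid shape is an assumption.
var videoIDPattern = regexp.MustCompile(`^[A-Za-z0-9_-]{11}$`)

// enqueueRequest mirrors the logged payload map[videoId:...].
type enqueueRequest struct {
	VideoID string `json:"videoId"`
}

// parseVideoID decodes the JSON body and validates the ID before it can
// reach a shell command or a file path.
func parseVideoID(body []byte) (string, error) {
	var req enqueueRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return "", fmt.Errorf("invalid JSON: %w", err)
	}
	if !videoIDPattern.MatchString(req.VideoID) {
		return "", fmt.Errorf("invalid videoId %q", req.VideoID)
	}
	return req.VideoID, nil
}

// enqueue adds id without blocking and reports false when the queue is full.
func enqueue(queue chan string, id string) bool {
	select {
	case queue <- id:
		return true
	default:
		return false
	}
}

// newHandler accepts POSTed JSON and queues the video ID it contains.
func newHandler(queue chan string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "POST only", http.StatusMethodNotAllowed)
			return
		}
		body, err := io.ReadAll(io.LimitReader(r.Body, 1<<10))
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		id, err := parseVideoID(body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		if !enqueue(queue, id) {
			http.Error(w, "queue full", http.StatusServiceUnavailable)
			return
		}
		fmt.Printf("Added video %s to queue. Queue length: %d\n", id, len(queue))
		w.WriteHeader(http.StatusAccepted)
	}
}

// drain processes every queued ID in order and returns how many succeeded.
// process stands in for the download and conversion step.
func drain(queue chan string, process func(id string) error) int {
	done := 0
	for {
		select {
		case id := <-queue:
			fmt.Printf("Processing video %s. Remaining in queue: %d\n", id, len(queue))
			if err := process(id); err != nil {
				fmt.Printf("video %s failed: %v\n", id, err)
				continue
			}
			done++
		default:
			return done
		}
	}
}

func main() {
	queue := make(chan string, 100)
	handler := newHandler(queue)

	// Simulate the browser extension's request without opening a port.
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader(`{"videoId":"CHbawkGc_os"}`))
	rec := httptest.NewRecorder()
	handler(rec, req)
	fmt.Println("status:", rec.Code)

	drain(queue, func(id string) error { return nil })
}
```

    A long-running version would call `http.ListenAndServe("127.0.0.1:3009", handler)` and replace `drain` with a goroutine that blocks on the channel, so downloads run one at a time in the background while browsing continues.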