I have mixed feelings about this. On the one hand, I agree: text is infinitely versatile, indexable, durable, etc. But, after discovering Bret Victor's work[1], and thinking about how I learned piano, I've also started to see a lot of the limitations of text. When I learned piano, I always had a live feedback loop: play a note, and hear how it sounds, and every week I had a teacher coach me. This is a completely different way to learn a skill, and something that doesn't work well with text.
Bret Victor's point is: why isn't this also the approach we use for other topics, like engineering? There are many people who do not have a strong symbolic intuition, and so being able to tap into their (and our) other intuitions is a very powerful tool to increase the efficiency of communication. More and more, I have found myself in this alternate philosophy of education and knowledge transmission. There are certainly limits, and text isn't going anywhere, but I think there's still a lot more to discover and try.
[1] https://dynamicland.org/2014/The_Humane_Representation_of_Th...
I think the downside, at least near-term, or maybe challenge would be the better word, is that anything richer than text requires a lot more engineering to make it useful. B♭ is text. Most of the applications on your computer, including but not limited to your browser, know how to render B♭ and C♯, and your brain does the rest.
Bret Victor's work involves a ton of really challenging heavy lifting. You walk away from a Bret Victor presentation inspired, but also intimidated by the work put in, and the work required to do anything similar. When you separate his ideas from the work he puts in to perfect the implementation and presentation, the ideas by themselves don't seem to do much.
Which doesn't mean they're bad ideas, but it might mean that anybody hoping to get the most out of them should understand the investment that is required to bring them to fruition, and people with less to invest should stick with other approaches.
> You walk away from a Bret Victor presentation inspired, but also intimidated by the work put in, and the work required to do anything similar. When you separate his ideas from the work he puts in to perfect the implementation and presentation, the ideas by themselves don't seem to do much.
Amen to that. Even Dynamicland has some major issues with GC pauses and performance.
I do try to put my money where my mouth is, so I've been contributing a lot to folk computer[1], but yeah, there's still a ton of open questions, and it's not as easy as he sometimes makes it look.
[1] https://folk.computer/
> B♭ is text.
Yes, but musical notation is far superior to text for conveying the information needed to play a song.
For complex music, sure, but if I'm looking up a folk tune on, say, thesession.org, I personally think a plain-text format like ABC notation is easier to sight-read (since for some instruments, namely the fiddle and mandolin, I mainly learn songs by ear and am rather slow and unpracticed at reading standard notation).
I don't understand, musical notation is text though so how can it be superior to itself?
I think they mean staff notation, not a textual notation like "B♭".
Although, one could make the argument that staff notation is itself a form of text, albeit one with a different notation than a single stream of Unicode symbols. Certainly, without musical notation, a lot of music is lost (although, one can argue that musical notation is not able to adequately preserve some aspects of musical performance which is part of why when European composers tried to adopt jazz idioms into their compositions in the early twentieth century working from sheet music, they missed the whole concept of swing which is essential to jazz).
> one could make the argument that staff notation is itself a form of text, albeit one with a different notation than a single stream of Unicode symbols
Mostly this is straightforwardly correct. Notes on a staff are a textual representation of music.
There are some features of musical notation that aren't usually part of linguistic writing:
- Musical notation is always done in tabular form - things that happen at the same time are vertically aligned. This is not unknown in writing, though it requires an unusual context.
- Relatedly, sometimes musical notation does the equivalent of modifying the value of a global variable - a new key signature or a dynamic notation ("pianissimo") takes effect everywhere and remains in effect until something else displaces it. In writing, I guess quotation marks have similar behavior.
- Musical notation sometimes relates two things that may be arbitrarily far apart from each other. (Consider a slur.) This is difficult to do in a 1-D stream of symbols.
> although, one can argue that musical notation is not able to adequately preserve some aspects of musical performance
Nothing new there; that's equally true of writing in relation to speech.
Yes. And I create and manage the musical notation for over 100 songs in text, specifically Lilypond.
If we accepted the validity of this argument, then literally everything that can be represented by a computer can be referred to as text.
It renders the term "text" effectively meaningless.
To be fair, in Lilypond's case, it is an ASCII interface that renders to sheet music (kind of like openSCAD).
Working in any science should also make this argument clearer. Data as text is hard to read and communicate. Even explanations of results. But graphs? Those are worth a thousand words. They communicate so much so fast. There's also a lot of skill to doing this accurately and well, just as one can say about writing. A whole subfield of computer graphics is dedicated to data visualization because it's so useful. Including things like colors. Things people often ignore because it feels so natural and obvious but actually isn't.
I think it's naïve to claim there's a singular best method to communicate. Text is great, especially since it is asynchronous. But even the OP works off of bad assumptions about verbal language being natural and not being taught. And there's a simple fact: when near another person, we strongly prefer to speak rather than write. When we can mix modes, we like to. There's an art to all this, and I think wanting a singular mode is more a desire for simplicity than a desire to be optimal.
It is true that graphs communicate very well. But they do come from text... And in the end we need to be able to describe what we see in them in text.
No, you do not need to, and will not generally be able to, describe everything that a graph conveys in text. Graphs can give you an intuitive understanding of the data that text would not be able to, simply by virtue of using other parts of the brain and requiring less short term memory. If a graph can be replaced with 5 pages of text, that doesn't mean that you get the same information from both - you're likely much more able to keep one image in your short term memory than 5 pages of text.
But they are multiple different "views" into data, and I would posit that a textual view of data is no different than a graphical view, no? If you import data from a parquet file, you go straight from numbers to graphs, so I disagree that it comes from text. Both graphs and text come from information. Circles on surveys, Arduino temperature readings, counter clickers when doing surveys. Those are not just text.
Take a problem like untangling a pile of cords. Writing out how to do that in text would be a drag, and reading those directions probably wouldn't be helpful either. But a kid can learn how to untangle just by observation.
Physical intuition is an enormous part of our intelligence, and is hard to convey in text: you could read millions of words about how to ride a bike, and you would learn nothing compared to spending a few hours trying it out and falling over until it clicks.
I mean, this very discussion is a case study in the supremacy of text. I skimmed the OP's blog post in thirty seconds and absorbed his key ideas. Your link is to a 54 minute video on an interesting topic which I unfortunately don't have time to watch. While I have no doubt that there are interesting ideas in it, video's inferior to text for communicating ideas efficiently, so most people reading this thread will never learn those ideas.
Text is certainly not the best at all things and I especially get the idea that in pedagogy you might want other things in a feedback loop. The strength of text however is its versatility, especially in an age where text transformers are going through a renaissance. I think 90%+ of the time you want to default to text, use text as your source of truth, and then other mediums can be brought into play (perhaps as things you transform your text into) as the circumstances warrant.
I came back here after the video (btw he speaks very deliberately, so watching it at 1.5x or 2x while digesting the message is fine).
I'd compare its message to a "warning!" sign. It's there to make you stop and think about our computing space; after that, it's up to you to act or not on how you perceive it.
That's totally wishy-washy, so it might not resonate, but after that I went to check more of what dynamicland is doing and sure enough they're doing things that are completely outside of the usual paradigm.
A more recent video explaining the concept in a more practical and down to earth framing: https://youtu.be/PixPSNRDNMU
(here again, reading the transcript won't nearly convey the point. Highly recommend watching it, even sped up if needed)
Actually, you might want to check the video again, it has sections and a full transcript on the right side, precisely to make skimming easy!
> video's inferior to text for communicating ideas efficiently
Depends on the topic tbh. For example, YouTube has had an absolute explosion of car repair videos, precisely because video format works so well for visual operations. But yes, text is currently the best way to skim/revisit material. That's one reason I find Bret's website so intriguing, since he tries to introduce those navigation affordances into a video medium.
> The strength of text however is its versatility, especially in an age where text transformers are going through a renaissance. I think 90%+ of the time you want to default to text, use text as your source of truth, and then other mediums can be brought into play (perhaps as things you transform your text into) as the circumstances warrant.
Agree, though not because of text's intrinsic ability, but because its ecosystem stretches thousands of years. It's certainly the most pragmatic choice of 2025. But, I want to see just how far other mediums can go, and I think there's a lot of untapped potential!
The fidelity and encoding strength of the "idea" you got the gist of from skimming might be less than the "idea" you receive when you spend the time to watch the 54 minute video.
Thank you so much for introducing me to this talk. Changed my way of thinking.
I've also become something of a text maximalist. It is the natural meeting point in human-machine communication. The optimal balance of efficiency, flexibility and transparency.
You can store everything as a string: base64 for binary, JSON for data, HTML for layout, CSS for styling, SQL for queries... Nothing gets closer to the mythical silver bullet that developers have been chasing since the birth of the industry.
The holy grail of programming has been staring us in the face for decades, and yet we still keep inventing new data structures and complex tools to transfer data... All to save like 30% bandwidth; an advantage which is almost fully cancelled out once you GZIP the base64 string, which most HTTP servers do automatically anyway.
Same story with ProtoBuf. All this complexity is added to make everything binary. For what goal? Did anyone ever ask this question? To save 20% bandwidth, which, again is an advantage lost after GZIP... For the negligible added CPU cost of deserialization, you completely lose human readability.
In this industry, there are tools and abstractions which are not given the respect they deserve and the humble string is definitely one of them.
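(Not part of the comment above, but a quick sketch you can run to sanity-check the gzip claim. The payload is random bytes standing in for an already-compressed blob, so exact ratios will vary with real data; all names here are standard library only.)

    import base64, gzip, json, os

    payload = os.urandom(64 * 1024)  # stand-in for an arbitrary binary blob

    raw_gz = gzip.compress(payload)  # binary straight through gzip
    b64_json = json.dumps({"data": base64.b64encode(payload).decode("ascii")}).encode("utf-8")
    b64_gz = gzip.compress(b64_json)  # base64-in-JSON, then gzip

    # base64 alone costs ~33% extra; gzip's Huffman stage claws most of that back here
    print(len(payload), len(raw_gz), len(b64_json), len(b64_gz))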
As someone whose daily job is to move protobuf messages around, I don't think protobuf is a good example to support your point :-)
AFAICT, the binary format of a protobuf message is strictly there to provide a strong forward/backward compatibility guarantee. If it's not for that, the text proto format and even the JSON format are both versatile, and commonly used as configuration languages (i.e. when humans need to interact with the file).
You can also provide this with JSON and API versioning. Also with JSON, you can add new fields to requests and responses, it's only deleting fields which breaks compatibility.
I've moved away from DOCish or PDF for storage to text (usually markdown) with Makefiles to build with Typst or whatever. Grep works, git likes it, and I can easily extract it to other formats.
My old 1995 MS thesis was written in Lotus Word Pro and the last I looked, there was nothing to read it. (I could try Wine, perhaps. Or I could quickly OCR it from paper.) Anyway, I wish it were plain text!
The value of protobuf is not to save a few bytes on the wire. First, it requires a schema which is immensely valuable for large teams, and second, it helps prevent issues with binary skew when your services aren't all deployed at the same millisecond.
The text based side of protobuf is not base64 or json. We'd be looking at either CSV or length delimited fields.
Many large scale systems are in the same camp as you, with text files flowing around their batch processors like crazy, but there's absolutely no flexibility or transparency.
JSON and/or base64 are more targeted at either low-volume or high-latency systems. Once you hit a scale where optimizing a few bits directly saves a significant amount of money, self-labeled fields are just out of the question.
Base64 and JSON take a lot of CPU to decode; this is where Protobuf shines (for example). Bandwidth is one thing, but the most expensive resources are RAM and CPU, and it makes sense to optimize for them by using "binary" protocols.
For example, when you gzip a Base64-encoded picture, you end up 1) encoding it in base64 (which takes a *lot* of CPU) and then 2) compressing it again (even though the JPEG is already compressed).
I think what it boils down to is scale; if you are running a small shop and performance is not critical, sure, do everything in HTTP/1.1 if that makes you more productive.
But when numbers start mattering, designing binary protocols from scratch can save a lot of $ in my experience.
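(A rough, machine-dependent sketch of the cost being described. Random bytes stand in for an already-compressed JPEG, so the second step has nothing left to squeeze; the absolute timings are illustrative only.)

    import base64, gzip, os, time

    blob = os.urandom(5 * 1024 * 1024)  # pretend this is a ~5 MB, already-compressed JPEG

    t0 = time.perf_counter()
    b64 = base64.b64encode(blob)        # step 1: base64 expansion (+33%), pure CPU
    t1 = time.perf_counter()
    gz = gzip.compress(b64)             # step 2: gzip over data that won't compress
    t2 = time.perf_counter()

    print(f"base64 {t1 - t0:.3f}s, gzip {t2 - t1:.3f}s, "
          f"{len(blob)} -> {len(gz)} bytes")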
Maybe for some kind of multiplayer game which has massive bandwidth and CPU usage requirements and has to be supported by paper-thin advertising profit margins... When tiny performance improvements can mean the difference between profitable and unprofitable, then it might make sense to optimize like this... But for the vast majority of software, the cost of serializing JSON is negligible and not worth thinking about.
For example, I've seen a lot of companies obsess over minor stuff like shaving a few bucks off their JSON serialization or using a C binding of some library to squeeze every drop of efficiency out of those technologies... While at the same time letting their software maintenance costs blow out of control... Or paying astronomical cloud compute bills when they could have self-hosted for 1/20th of the price...
Also, the word scale is overused. What is discussed here is performance optimization, not scalability. Scalability doesn't care for fixed overhead costs. Scalability is about growth in costs as usage increases and there is no difference in scalability if you use ProtoBuf or JSON.
The expression that comes to mind is "Penny-wise, pound-foolish." This effect is absolutely out of control in this industry.
If you deploy on phones, CPU and memory are a major problem. Pick a median Android phone and lots of websites consistently fail to deliver a good experience on it, and it's very common to see them bottlenecked on CPU. JSON is massively inefficient; it's foolish to think it won't have any effect.
I marvel at the constraint and freedom of the string.
Just go full Tcl, where instead of shunning stringly typed data structures, the only data structure available is a string :)
Shipping base64 in JSON instead of a multipart POST is very bad for stream-processing. In theory one could stream-process JSON and base64... but only the JSON keys that come before the payload would be available at the point where you need to make decisions about what to do with the data.
Still, at least it's an option to put base64 inline inside the JSON. With binary, this is not an option and you must send it separately in all cases, even for small binary...
You can still stream the base64 separately and reference it inside the JSON somehow like an attachment. The base64 string is much more versatile.
> Still, at least it's an option to put base64 inline inside the JSON. With binary, this is not an option and you must send it separately in all cases, even for small binary...
There's nothing special about "text" or binary here. You can absolutely put binary inside other binary; you use a symbol that doesn't appear inside the binary, much like you do for text.
You use a divider, the way the " character is for JSON, plus a prearranged way to keep that symbol from appearing inside the inner binary (the same approach that works for text works here).
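(The escaping approach described above works; the other common trick is a length prefix, which avoids escaping entirely. A minimal sketch, with function names of my own invention:)

    import struct

    def frame(payload: bytes) -> bytes:
        # 4-byte big-endian length, then the raw payload: no divider, no escaping,
        # because the reader knows exactly how many bytes belong to the inner blob.
        return struct.pack(">I", len(payload)) + payload

    def unframe(buf: bytes):
        (n,) = struct.unpack_from(">I", buf)
        return buf[4:4 + n], buf[4 + n:]  # (inner payload, remaining outer bytes)

    inner = bytes(range(256))                  # contains every possible byte value
    outer = b"HDR" + frame(inner) + b"TRAILER"
    payload, rest = unframe(outer[3:])
    assert payload == inner and rest == b"TRAILER"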
What do you think a zip file is? They're not storing compressed binary data as text, I can tell you that.
This reminds me that I just learned the other day that .a files are unix archives, which have a textual representation (and if all the bundled files are textual, there's no binary information in the bundle). I thought .a was just for static libraries for the longest time, and had no idea that it was actually an old archive format.
Even with binary, you can store a binary inline inside of another one if it is a structured format with a "raw binary data" type, such as DER. (In my opinion, DER is better in other ways too, and (with my nonstandard key/value list type added) it is a superset of the data model of JSON.)
Using base64 means that you must encode and decode it, but binary data directly means that is unnecessary. (This is true whether or not it is compressed (and/or encrypted); if it is compressed then you must decompress it, but that is independent of whether or not you must decode base64.)
I don't get why using a binary protocol doesn't allow handling strings. What's the limitation?
https://futuretextpublishing.com/ --> books vol 1-5
And as for the original article: there is no "text [system]" (or there is, in the same sense that there are "number [systems]": they are made up). "Text," like the very thing you are reading, is a 2D drawing. There are no character glyphs of any kind (Latin, logograms, etc.) defined by the universe*; they are human-invented and stored/interpreted at the human collective level. Computers don't know anything about text, only "numbers" of some bit width, and with those numbers a system must be created that can map some number representation to some drawing by some method (e.g. a bitmap). Also, there is a lot of difference between formal/executable languages and natural human languages. Anyway, it's not about some text format/encoding, it's about the human/computer defined/interpreted non-linguistic meaning behind it (Wittgenstein).
* DNA/RNA can be one such "universal character glyph/string", as the "textual" information is physically constructed and interpreted.
Text is just bytes, and bytes are just text. I assume this is talking about human readable ASCII specifically.
I think the obsession with text comes down to two factors: conflating binary data with closed standards, and poor tooling support. Text implies a baseline level of acceptable mediocrity for both. Consider a CSV file with millions of base64-encoded columns and no column labels. That would really not be any friendlier than a binary file with an openly documented format and a suitable editing tool, e.g. sqlite.
Maybe a lack of fundamental technical skills is another culprit, but binary files really aren't that scary.
> Text is just bytes, and bytes are just text. I assume this is talking about human readable ASCII specifically.
Text is human readable writing (not necessarily ASCII). It is most certainly not just any old bytes the way you are saying.
I agree, but binary is exactly the same. You use a different tool to view it, and maybe you don't have that tool, and that's the problem. But it's a matter of having a way to interpret the data; trivially, base64-encoding readable text gives you text, and if you can't decode it, it's as meaningless as binary you can't decode.
It makes more sense to consider readability or comprehensibility of data in an output format; text makes sense for many kinds of data, but given a graph, I'd rather view it as a graph than as a readable text version.
And if you have a way to losslessly transform data between an efficient binary form, readable text, or some kind of image (or other format), that's the best of all.
And it's funny to think about how many different incompatible text standards there were for the first 30ish years of computers. Each vendor had their own encoding, and it took until UTF-8 to even agree on text (let alone the legacy of UTF-16). If it took that long to agree on text, I have a bad feeling it'll take even longer to agree on anything else.
I suppose open standards have slowly been winning with Opus and AV1, but there are still so many forms of interaction that have proprietary or custom interfaces. It seems like anything that has a stable standard has to be at least 20 years old, lol.
And machine readable. You can parse a CSV file more or less easily, but try the same with some forgotten software-specific binary.
Text is bytes accompanied by a major constraint on which sequences of bytes are permitted (a useful compression into principal axes that emerged over thousands of years of language evolution), along with a natural connection to human semantics that is due to universal adoption of the standard (allowing correlations to be modelled).
Text is like a complexity funnel (analogous to a tokenizer) that everyone shares. Its utility is derived from its compression and its standardization.
If everyone used binary data with their own custom interpretation schema, it might work better for that narrow vertical, but it would not have the same utility for LLMs.
> Text is the oldest and most stable communication technology
Minor nit: complex language (i.e. Zipf’s law) is the oldest and most stable communication technology.
Before text, we had oral story telling. It allowed us to communicate one generation’s knowledge to the next, and so on.
Arguably this is present elsewhere in the animal kingdom (orcas, elephants, etc.), but human language proves to be the most complex.
Side note: one of my favorite examples is from the Gunditjmara (a group of Aboriginal Australians) who recall a volcanic eruption from 30k+ years ago [0].
Written language (i.e. text) is unique, in that it allows information to pass across multiple generations, without a man-in-the-middle telephone-like game of storytelling.
But both are similar, text requires you to read, in your own voice, the thoughts of another. Storytelling requires you to hear a story, and then communicate it to others.
In either case, the person is required to retell the knowledge, either as an internal monologue or as an external broadcast.
This also leads to the unreasonable effectiveness of LLMs. The models are good because they have thousands of years of humans trying to capture every idea as text. Engineering, math, news, literature, and even art/craftsmanship. You name it, we wrote it down.
Our image models got good when we started making shared image and text embedding spaces. A picture is worth 1000 words, but 1000 words about millions of images are what allowed us to teach computers to see.
Is doing dozens of rounds of back and forth to explain what we actually want, while the model burns an inordinate amount of processing power at each turn, a model of efficiency or effectiveness?
It might be convenient and allow for exploration, the cost might be worth it in some cases, but I wouldn't call it "effective".
Reread Story of Your Life just now, and all it made me want to do is learn Heptapod B and their semagram style of written communication.
Reading “Mathematica - A Secret World of Intuition and Curiosity” as well, and a part stuck out in a section called The Language Trap. The example the author gives is a recipe for banana bread: if you're familiar with bananas, it's obvious that you need to peel them before mashing. But if you haven't seen a banana, you'd have no clue what to do. Should a recipe say to peel the banana, or can that be left unsaid? Questions like these are clearly coming up more with AI and context, but it's the same for humans. He ends that section saying most people prefer a video for cooking rather than a recipe.
Other quote from him:
“The language trap is the belief that naming things is enough to make them exist, and we can dispense with the effort of really imagining them.”
gnabgib points out that this same article has been posted for comment here three other times since it was written. That said, afaict no one has commented any of these times on what I'm about to say, so hopefully this will be new.
I'm a linguist, and I've worked in endangered languages and in minority languages (many of which will some day become endangered, in the sense of not having native speakers). The advantage of plain text (Unicode) formats for documenting such languages (as opposed to binary formats like Word used to be, or databases, or even PDFs) is that text formats are the only thing that will stand the test of time. The article by Steven Bird and Gary Simons "Seven Dimensions of Portability for Language Documentation and Description" was the seminal paper on this topic, published in 2002. I've given later conference talks on the topic, pointing out that we can still read grammars of Greek and Latin (and Sanskrit) written thousands of years ago. And while the group I led published our grammars in paper form via PDF, we wrote and archived them as XML documents, which (along with JSON) are probably as reproducible a structured format as you can get. I'm hoping that 2000 years from now, someone will find these documents both readable and valuable.
There is of course no replacement for some binary format when it comes to audio.
(By "binary" format I mean file formats that are not sequential and readily interpretable, whereas text files are interpretable once you know the encoding.)
This is all true, but I think you're too focused on your area. Finding musical notes that we can interpret correctly from an ancient civilization, would that be "text" or "binary"? I think it's a false choice.
Similarly, cave paintings express the painting someone intended to make better than a textual description of it.
Purely anecdotal, but I hoard a lot of personal documents (shopping receipts, confirmation emails, scans etc.) and for stuff I saved only 10 years ago, the toughest to reopen are the pure text files.
You rightly mention Unicode, as before it there was a jungle of formats. I have some files in UTF-16, some in SJIS, a ton in EUC, others already in UTF-8, and many don't have a BOM. I could try each encoding and see what works for each of the files (except on mobile... it's just a PITA to deal with that on mobile).
But in comparison there's a set of file I never had issues opening now and then: PDFs and jpegs. All the files that my scanner produced are still readable absolutely everywhere. Even with slight bitrot they're readable, and with the current OCR processes I could probably put it all back in text if ever needed.
If I had to archive more stuff now and can afford the space, I'd go for an image format without hesitation.
PS: I'm surprised you don't mention the Unicode character limitations for minority languages or academic use. There will still be characters that either can't be represented, or don't have an exact 1 to 1 match between the code point and the representation.
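(Regarding the try-each-encoding chore a few paragraphs up: a throwaway sketch of the brute-force approach, with a candidate list guessed from the formats mentioned. A successful decode isn't proof of the right guess; UTF-16 in particular will happily decode a lot of garbage, and libraries like chardet or charset-normalizer do this more carefully.)

    from pathlib import Path

    CANDIDATES = ["utf-8-sig", "utf-8", "euc_jp", "shift_jis", "utf-16"]

    def sniff(path):
        data = Path(path).read_bytes()
        for enc in CANDIDATES:
            try:
                return enc, data.decode(enc)  # first encoding that decodes cleanly
            except UnicodeDecodeError:
                continue
        return None, None

    # enc, text = sniff("old_notes.txt")
    # print(enc)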
Much as I love text for communication, it's worth knowing that "28% of US adults scored at or below Level 1, 29% at Level 2, and 44% at Level 3 or above" - Literacy in the United States: https://en.wikipedia.org/wiki/Literacy_in_the_United_States
Anything below 3 is considered "partially illiterate".
I've been thinking about this a lot recently, as someone who cares about technical communication and making technical topics accessible to more people.
Maybe wannabe educators like myself should spend more time making content for TikTok or YouTube!
The inverse of this is the wisdom that pearls should not be cast before swine. If you want to increase literacy rates, it's unclear to me how engaging people on an illiterate medium will improve things.
Technical topics demand a technical treatment, not 30-second junk food bites of video infotainment that then imbue the ignorant audiences with the semblance or false feeling of understanding, when they actually possess none. This is why we have so many fucking idiots dilating everywhere on topics they haven't a clue on - they probably saw a fucking YouTube video and now consider themselves in possession of a graduate degree in the subject.
Rather than try to widely distribute and disseminate knowledge, it would be far more prescient to capitalize on what will soon be a massive information asymmetry and widening intellectual inequality between the reads and the read-nots, accelerated by the production of machine generated, misinformative slop at scale.
Saying that a 20x20 image of a Twitter logo is 4000 bytes is just so wrong.
The image is of a monochrome logo with anti-aliased edges. Due to being a simple filled geometric shape, it could compress well with RLE, ZIP compression, or even predictors. It could even be represented as vector drawing commands (LineTo, CurveTo, etc...).
In a 1-bit-per-pixel format, a 20x20 image ends up as 400 bits (50 bytes).
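(Back-of-the-envelope check of those numbers; the all-zero "image" below is just a placeholder for a simple silhouette.)

    pixels = [0] * (20 * 20)          # 400 pixels, 1 bit each

    # Pack 8 pixels per byte: 400 / 8 = 50 bytes before any compression.
    packed = bytearray()
    for i in range(0, len(pixels), 8):
        byte = 0
        for bit, p in enumerate(pixels[i:i + 8]):
            byte |= (p & 1) << (7 - bit)
        packed.append(byte)
    print(len(packed))                # 50

    # Naive run-length encoding on top: [value, run-length] pairs, runs capped at 255.
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p and runs[-1][1] < 255:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    print(len(runs) * 2)              # a handful of bytes for a simple filled shape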
This is sort of the premise of all of us electronics-as-code startups. We think that a text-based medium for the representation of circuits is a necessity for AI to be able to create electronics. You can't skip this step and generate schematic images or something. You have to have a human-readable (which also means AI-compatible) text medium. Another confusion: KiCad files are represented in text, so shouldn't AI be able to generate them? No- AI has similar levels of spatial understanding to a human reading these text files. You can't have a ton of XY coordinates or other non-human-friendly components of the text files. Everything will be text-based and human-readable, at least at the first layer of AI-generation for serious applications
I wonder if some day there will be a video codec that is essentially a standard distribution of a very precise and extremely fast text-to-video model (like SmartTurboDiffusion-2027 or something). Because surely there are limits to text, but even the example you gave does not seem to me to be beyond the reach of a text description, given a certain level of precision and capability in the model. And we now have faster than realtime text to video.
It would be impossible to change the model. It would be like a codec, like H.264 but with 1-2GB of fixed data attached to that code name. Changing the model is like going to H.265. Different codec.
For a computer, text is a binary format like anything else. We have decades of tooling built on handling linear streams of text where we sometimes encode higher dimensional structures in it.
But I can't help feel that we try to jam everything into that format because that's what's already ubiquitous. Reminds me of how every hobby OS is a copy of some Unix/Posix system.
If we had a more general structured format would we say the opposite?
It's easy to be a text maximalist now we're in the LLM era, but I disagree that ideas are a separate, nonphysical realm that cannot otherwise be described.
https://lucent.substack.com/p/one-map-hypothesis
Nope. Text and media (visual and audio) are not comparable. Text is a vehicle and the other sensory content is the payload. A vehicle is different from a payload. A vehicle can not represent a payload. When you are describing a scene or a sound using text, you are using text as a vehicle to send the sensory data to someone, in a crude form. Stories recreate the sensory data and feelings.
Human sensory system has an evolved processing ability for visual and audio content. A story can give different sensory data and feelings to different receivers. It is a low-fidelity transmission.
Try telling someone how an old folk song sounded or how some exotic fruit tasted, or how some wild flower smelled, or how some surreal game scene looked, using only text.
People make fun of it, but I think the fact that Unixey stuff can use tools that have existed since the 70's [1] can be attributed to the fact that they're text based. Every OS has its own philosophy on how to do GUI stuff and as such GUI programs have to do a lot of bullshit to migrate, but every OS can handle text in one form or another.
When I first started using Linux I used to make fun of people who were stuck on the command line, but now pretty much everything I do is a command line program (using NeoVim and tmux).
[1] Yes, obviously with updates but the point more or less still stands.
And when everything is a text file you have (optimally) a human-readable single source of truth on things... Very important when things get complicated and layered. In GUI stuff your only option is often to start anew, make the same movements as the first time, and hope you end up with what you want.
Honestly text is pretty good for conveying all of those things, though you'd also need to supplement it with practice in all but the emotional impact of war bit.
I don’t see the relevance to the topic. I could preface your list with something like “The monkey wrench is not the best tool for the following situations:”. It’s kinda vacuously true in a meaningless way but without expansion adds nothing to a discussion about the relative merits of monkey wrenches versus other similar tools like pliers or vice grips.
I agree with all of these except the emotional impact of war where though slower a novel or memoir might work best. Think "All Quiet on the Western Front." At the same time we do want images of the war and time for grounding.
I was going to disagree, along the lines of the people bringing up Bret Victor or other modes of communication and learning, but I have long accepted that the written word has been one of the largest boons for learning in human history, so I guess I agree. Still, it'll be an interesting and worthwhile challenge to make a better medium with modern technology.
I just recently intentionally made the decision to keep the equation input in FuzzyGraph (https://fuzzygraph.com) plain text (instead of something like stylized latex like Desmos has) in order to make it easy to copy and paste equations.
This is one of the core reasons I've been focused on building small tools for myself using Emacs and the shell (currently ksh on OpenBSD). HTML and the Web are good, but only in their basic form. A lot of stuff fancies itself being applications and magazines, and it is very much unusable.
This is one of those irritating articles where one agrees with the gist, but there are serious flaws in the support.
There are societies, even now, that don't have text. Yes, they represent a tiny fraction of 1% of the global population, but they do exist. And the beauty of text is that this level of nuance can be conveyed, a simplistic, inaccurate, broad brush approach is not needed.
Nor is text the oldest form of communication. Having recently started exploring the cave art record, the text informs me that cave art is at least an upper-middle-single-digit multiple of the age of text.
Yes, a picture paints a thousand words, which can then be interpreted a thousand ways.
Text has the ability to convey precise, accurate, objective information; it does not, as this article demonstrates, necessarily do so.
With LLMs, the text format should be more popular than ever, yet we still see people pushing binary protocols like ProtoBuf for a measly 20% bandwidth advantage which is lost after GZIPing the equivalent JSON... Or a 30% CPU advantage on the serialization aspect which becomes like a 1% advantage once you consider the cost of deserialization in the context of everything else that's going on in the system which uses far more CPU.
It's almost like some people think human-readability, transparency and maintainability are negatives!
The older I get, the more I appreciate texts (any).
Videos, podcasts... I have them transcribed because even though I like listening to music, podcasts are best written for speed of comprehension... (at least for me, I don't know about others).
Audio is horrible (for me) for information transfer - reading (90% of the time) is where it's at
Not sure why that is either - because I look at people extolling the virtues of podcasts, saying that they are able to multi task (eg. driving, walking, eat dinner), and still hear the message - which leaves me aghast
To paraphrase the overused 'ol Sapir-Whorf, if all you think about is information that can be best represented as text, all your examples will be ones text wins at.
Podcasts are fine for entertainment, great for tuning out people or the traffic. I don’t expect to absorb information quickly, but try reading anything serious on the train when some guy is non-stop on his phone using his outside voice.
I had a 53 minute (each way) commute on the train, and I found it perfect for reading papers or learning skills - I was always amazed that the background noise would disappear and I could get lost in the text
Another fascinating property of text (as compared to video), it's less temporal-sensitive. It means that it's much easier to skim through and skip sections, kind of like teleporting through time it took to write such text.
Is it noteworthy that arguments against text by HN commenters are made using text
Reminds me of when HN thread comments about articles pertaining to the negative aspects of web advertising refer to the publisher's, e.g. a newspaper website's, use of web advertising, e.g., ad auctions, trackers, etc., as a point of significance
Would arguments against text be more convincing if made using something other than text
Is it appropriate to use text to make an argument against text. If yes, then why
I was surprised to see something was in text today, until I remembered knowing it at some point - the .har format. Looking at simonw's Claude-generated script [1] to investigate AI agent sent emails [2] by extracting .har archives, I saw that it uses base64 for binary and JSON strings for text.
It might be a good bet to bet on text, but it feels inefficient a lot of the time, especially in cases like this where all sorts of files are stored in JSON documents.
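(For the curious, a small sketch of pulling bodies back out of a .har file. This reflects my reading of the HAR format, where response bodies sit under log.entries[].response.content with an optional "encoding": "base64" flag; it is not a claim about simonw's script.)

    import base64, json

    def har_bodies(path):
        with open(path, encoding="utf-8") as f:
            har = json.load(f)
        for entry in har["log"]["entries"]:
            content = entry["response"]["content"]
            text = content.get("text")
            if text is None:
                continue
            if content.get("encoding") == "base64":
                yield entry["request"]["url"], base64.b64decode(text)  # binary body
            else:
                yield entry["request"]["url"], text.encode("utf-8")    # textual body

    # for url, body in har_bodies("session.har"):
    #     print(len(body), url)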
I have mixed feelings about this. On the one hand, I agree: text is infinitely versatile, indexable, durable, etc. But, after discovering Bret Victor's work[1], and thinking about how I learned piano, I've also started to see a lot of the limitations of text. When I learned piano, I always had a live feedback loop: play a note, and hear how it sounds, and every week I had a teacher coach me. This is a completely different way to learn a skill, and something that doesn't work well with text.
Bret Victor's point is why is this not also the approach we use for other topics, like engineering? There are many people who do not have a strong symbolic intuition, and so being able to tap into their (and our) other intuitions is a very powerful tool to increase efficiency of communication. More and more, I have found myself in this alternate philosophy of education and knowledge transmission. There are certainly limits—and text isn't going anywhere, but I think there's still a lot more to discover and try.
[1] https://dynamicland.org/2014/The_Humane_Representation_of_Th...
I think the downside, at least near-term, or maybe challenge would be the better word, is that anything richer than text requires a lot more engineering to make it useful. B♭ is text. Most of the applications on your computer, including but not limited to your browser, know how to render B♭ and C♯, and your brain does the rest.
Bret Victor's work involves a ton of really challenging heavy lifting. You walk away from a Bret Victor presentation inspired, but also intimidated by the work put in, and the work required to do anything similar. When you separate his ideas from the work he puts in to perfect the implementation and presentation, the ideas by themselves don't seem to do much.
Which doesn't mean they're bad ideas, but it might mean that anybody hoping to get the most out of them should understand the investment that is required to bring them to fruition, and people with less to invest should stick with other approaches.
> You walk away from a Bret Victor presentation inspired, but also intimidated by the work put in, and the work required to do anything similar. When you separate his ideas from the work he puts in to perfect the implementation and presentation, the ideas by themselves don't seem to do much.
Amen to that. Even dynamic land has some major issues with GC pauses and performance issues.
I do try to put my money where my mouth is, so I've been contributing a lot to folk computer[1], but yeah, there's still a ton of open questions, and it's not as easy as he sometimes makes it look.
[1] https://folk.computer/
> B♭ is text.
Yes, but musical notation is far superior to text for conveying the information needed to play a song.
For complex music, sure, but if I'm looking up a folk tune on, say, thesession.org, I personally think a plain-text format like ABC notation is easier to sight-read (since for some instruments, namely the fiddle and mandolin, I mainly learn songs by ear and am rather slow and unpracticed at reading standard notation).
I don't understand, musical notation is text though so how can it be superior to itself?
I think they mean staff notation, not a textual notation like "B♭".
Although, one could make the argument that staff notation is itself a form of text, albeit one with a different notation than a single stream of Unicode symbols. Certainly, without musical notation, a lot of music is lost (although, one can argue that musical notation is not able to adequately preserve some aspects of musical performance which is part of why when European composers tried to adopt jazz idioms into their compositions in the early twentieth century working from sheet music, they missed the whole concept of swing which is essential to jazz).
> one could make the argument that staff notation is itself a form of text, albeit one with a different notation than a single stream of Unicode symbols
Mostly this is straightforwardly correct. Notes on a staff are a textual representation of music.
There are some features of musical notation that aren't usually part of linguistic writing:
- Musical notation is always done in tabular form - things that happen at the same time are vertically aligned. This is not unknown in writing, though it requires an unusual context.
- Relatedly, sometimes musical notation does the equivalent of modifying the value of a global variable - a new key signature or a dynamic notation ("pianissimo") takes effect everywhere and remains in effect until something else displaces it. In writing, I guess quotation marks have similar behavior.
- Musical notation sometimes relates two things that may be arbitrarily far apart from each other. (Consider a slur.) This is difficult to do in a 1-D stream of symbols.
> although, one can argue that musical notation is not able to adequately preserve some aspects of musical performance
Nothing new there; that's equally true of writing in relation to speech.
Yes. And I create and manage the musical notation for over 100 songs in text, specifically Lilypond.
If we accepted the validity of this argument, then literally everything that can be represented by a computer can be referred to as text.
It renders the term "text" effectively meaningless.
To be fair, in Lilypond's case, it is an ASCII interface that renders to sheet music (kind of like openSCAD).
Working in any science should also make this argument clearer. Data as text is hard to read and communicate. Even explanations of results. But graphs? Those are worth a thousand words. They communicate so much so fast. There's also a lot of skill to doing this accurately and well, just as one can say about writing. A whole subfield of computer graphics is dedicated to data visualization because it's so useful. Including things like colors. Things people often ignore because it feels so natural and obvious but actually isn't.
I think it's naïve to claim there's a singular best method to communicate. Text is great, especially since it is asynchronous. But even the OP works off of bad assumptions that are made about verbal language being natural and not being taught. But there's a simple fact, when near another person we strongly prefer to speak than write. And when we can mix modes we like to. There's an art to all this and I think wanting to have a singular mode is more a desire of simplicity than a desire to be optimal
It is true that graphs communicate very well. But they do come from text... And in the end we need to be able to describe what we see in them in text.
No, you do not need to, and will not generally be able to, describe everything that a graph conveys in text. Graphs can give you an intuitive understanding of the data that text would not be able to, simply by virtue of using other parts of the brain and requiring less short term memory. If a graph can be replaced with 5 pages of text, that doesn't mean that you get the same information from both - you're likely much more able to keep one image in your short term memory than 5 pages of text.
But they are multiple different "views" into data, and I would posit that a textual view of data is no different than a graphical view, no? If you import data from a parquet file, you go straight from numbers to graphs, so I disagree that it comes from text. Both graphs and text come from information. Circles on surveys, Arduino temperature readings, counter clickers when doing surveys. Those are not just text.
Take a problem like untangling a pile of cords. Writing out how to do that in text would be a drag, and reading those directions probably wouldn't be helpful either. But a kid can learn how to untangle just by observation.
Physical intuition is an enormous part of our intelligence, and is hard to convey in text: you could read millions of words about how to ride a bike, and you would learn nothing compared to spending a few hours trying it out and falling over until it clicks.
I mean, this very discussion is a case study in the supremacy of text. I skimmed the OP's blog post in thirty seconds and absorbed his key ideas. Your link is to a 54 minute video on an interesting topic which I unfortunately don't have time to watch. While I have no doubt that there are interesting ideas in it, video's inferior to text for communicating ideas efficiently, so most people reading this thread will never learn those ideas.
Text is certainly not the best at all things and I especially get the idea that in pedagogy you might want other things in a feedback loop. The strength of text however is its versatility, especially in an age where text transformers are going through a renaissance. I think 90%+ of the time you want to default to text, use text as your source of truth, and then other mediums can be brought into play (perhaps as things you transform your text into) as the circumstances warrant.
I came back here after the video (btw he speak very deliberately, watching it at 1.5 or 2x while digesting the message is fine)
I'd compare it's message to a "warning !" sign. It's there to make you stop and think about our computing space, after that it's up to you to act or not on how you perceive it.
That's totally wishy-washy, so it might not resonate, but after that I went to check more of what dynamicland is doing and sure enough they're doing things that are completely outside of the usual paradigm.
A more recent video explaining the concept in a more practical and down to earth framing: https://youtu.be/PixPSNRDNMU
(here again, reading the transcript won't nearly convey the point. Highly recommend watching it, even sped up if needed)
Actually, you might want to check the video again, it has sections and a full transcript on the right side, precisely to make skimming easy!
> video's inferior to text for communicating ideas efficiently
Depends on the topic tbh. For example, YouTube has had an absolute explosion of car repair videos, precisely because video format works so well for visual operations. But yes, text is currently the best way to skim/revisit material. That's one reason I find Bret's website so intriguing, since he tries to introduce those navigation affordances into a video medium.
> The strength of text however is its versatility, especially in an age where text transformers are going through a renaissance. I think 90%+ of the time you want to default to text, use text as your source of truth, and then other mediums can be brought into play (perhaps as things you transform your text into) as the circumstances warrant.
Agree, though not because of text's intrinsic ability, but because its ecosystem stretches thousands of years. It's certainly the most pragmatic choice of 2025. But, I want to see just how far other mediums can go, and I think there's a lot of untapped potential!
The fidelity and encoding strength of the "idea" you got the gist of from skimming might be less than the "idea" you receive when you spend the time to watch the 54 minute video
Thank you so much for introducing me to this talk. Changed my way of thinking.
I've also become something of a text maximalist. It is the natural meeting point in human-machine communication. The optimal balance of efficiency, flexibility and transparency.
You can store everything as a string; base64 for binary, JSON for data, HTML for layout, CSS for styling, SQL for queries... Nothing gets closer to the mythical silver-bullet that developers have been chasing since the birth of the industry.
The holy grail of programming has been staring us in the face for decades and yet we still keep inventing new data structures and complex tools to transfer data... All to save like 30% bandwidth; an advantage which is almost fully cancelled out anyway after you GZIP the base64 string which most HTTP servers do automatically anyway.
Same story with ProtoBuf. All this complexity is added to make everything binary. For what goal? Did anyone ever ask this question? To save 20% bandwidth, which, again is an advantage lost after GZIP... For the negligible added CPU cost of deserialization, you completely lose human readability.
In this industry, there are tools and abstractions which are not given the respect they deserve and the humble string is definitely one of them.
As someone who's daily job is to move protobuf messages around, I don't think protobuf is a good example to support your point :-)
AFAIKT, binary format of a protobuf message is strictly to provide a strong forward/backward compatibility guarantee. If it's not for that, the text proto format and even the jaon format are both versatile, and commonly used as configuration language (i.e. when humans need to interact with the file).
You can also provide this with JSON and API versioning. Also with JSON, you can add new fields to requests and responses, it's only deleting fields which breaks compatibility.
I've moved away from DOCish or PDF for storage to text (usually markdown) with Makefiles to build with Typst or whatever. Grep works, git likes it, and I can easily extract it to other formats.
My old 1995 MS thesis was written in Lotus Word Pro and the last I looked, there was nothing to read it. (I could try Wine, perhaps. Or I could quickly OCR it from paper.) Anyway, I wish it were plain text!
The value of protobuf is not to save a few bytes on the wire. First, it requires a schema which is immensely valuable for large teams, and second, it helps prevent issues with binary skew when your services aren't all deployed at the same millisecond.
The text based side of protobuf is not base64 or json. We'd be looking at either CSV or length delimited fields.
Many large scale systems are on the same camp as you as their text files flow around their batch processors like crazy, but there's absolutely no flexibility or transparency.
Json and or base64 are more targeted as either low volume or high latency systems. Once you hit a scale where optimizing a few bits straight saves a significant amount of money, self labeled fields are just out of question.
Base64 and JSON takes a lot of CPU to decode; this is where Protobuf shines (for example). Bandwidth is one thing, but the most expensive resources are RAM and CPU, and it makes sense to optimize for them by using "binary" protocols.
For example, when you gzip a Base64-encoded picture, you end up 1. encoding it in base64 (takes a *lot* of CPU) and then, compressing it (again! jpeg is already compressed).
I think what it boils down to is scale; if you are running a small shop and performance is not critical, sure, do everything in HTTP/1.1 if that makes you more productive. But when numbers start mattering, designing binary protocols from scratch can save a lot of $ in my experience.
Maybe for some kind of multiplayer game which has massive bandwidth and CPU usage requirements and has to be supported by paper-thin advertising profit margins... When tiny performance improvements can mean the difference between profitable and unprofitable, then it might make sense to optimize but this... But for the vast majority of software, the cost of serializing JSON is negligible and not worth thinking about.
For example, I've seen a lot of companies obsess over minor stuff like shaving a few bucks off their JSON serialization or using a C binding of some library to squeeze every drop of efficiency out of those technologies... While at the same time letting their software maintenance costs blow out of control... Or paying astronomical cloud compute bills when they could have self-hosted for 1/20th of the price...
Also, the word scale is overused. What is discussed here is performance optimization, not scalability. Scalability doesn't care for fixed overhead costs. Scalability is about growth in costs as usage increases and there is no difference in scalability if you use ProtoBuf or JSON.
The expression that comes to mind is "Penny-wise, pound-foolish." This effect is absolutely out of control in this industry.
If you deploy on phones, CPU and memory is a major problem. Pick a median Android and lots of websites consisently fail to deliver good experience on it and it's very common to see them bottlenecked on CPU. JSON is massively innefficient, it's foolish think it won't have any effect.
I marvel at the constraint and freedom of the string.
Just go full Tcl, where instead of shunning stringly typed data structures, the only data structure available is a string :)
shipping base64 in json instead of a multipart POST is very bad for stream-processing. In theory one could stream-process JSON and base64... but only the json keys prior would be available at the point where you need to make decisions about what to do with the data.
Still, at least it's an option to put base64 inline inside the JSON. With binary, this is not an option and must send it separately in all cases, even small binary...
You can still stream the base64 separately and reference it inside the JSON somehow like an attachment. The base64 string is much more versatile.
> Still, at least it's an option to put base64 inline inside the JSON. With binary, this is not an option and must send it separately in all cases, even small binary...
There's nothing special about "text" or binary here. You can absolutely put binary inside other binary; you use a symbol that doesn't appear inside the binary, much like you do for text.
You use a divider, like " is for json, and a prearranged way to avoid that symbol from appearing inside the inner binary (the same approach that works for text works here).
What do you think a zip file is? They're not storing compressed binary data as text, I can tell you that.
This reminds me that I just learned the other day that .a files are unix archives, which have a textual representation (and if all the bundled files are textual, there's no binary information in the bundle). I thought .a was just for static libraries for the longest time, and had no idea that it was actually an old archive format.
Even with binary, you can store a binary inline inside of another one if it is a structured format with a "raw binary data" type, such as DER. (In my opinion, DER is better in other ways too, and (with my nonstandard key/value list type added) it is a superset of the data model of JSON.)
Using base64 means that you must encode and decode it, but binary data directly means that is unnecessary. (This is true whether or not it is compressed (and/or encrypted); if it is compressed then you must decompress it, but that is independent of whether or not you must decode base64.)
I don't get why using a binary protocol doesn't allow handling strings. What's the limitation ?
https://futuretextpublishing.com/ --> books vol 1-5
And what comes to original article, there is no "text [systems]" (or there is, like there are "number [systems]", just made up). "Text" like this very thing you are reading is 2D drawing. There are no character glyphs of any kind (latin, logograms etc.) defined by universe*, they are human invented and stored/interpreted at human collective level. Computers don't know anything about text, only "numbers" of some bit width, and with those numbers a system must be created that can map some number representation to some drawing in some method (e.g. with bitmap). Also there is a lot of difference between formal/executable and natural human languages. Anyways, it's not a about some text format/encoding, it's the human/computer defined/interpreted non-linguistical meaning behind it (Wittgenstein).
* DNA/RNA can be one such "universal character glyph/string", as the "textual" information is physically constructed and interpreted.
Text is just bytes, and bytes are just text. I assume this is talking about human readable ASCII specifically.
I think the obsession with text comes down to two factors: conflating binary data with closed standards, and poor tooling support. Text implies a baseline level of acceptable mediocrity for both. Consider a CSV file with millions of base64-encoded columns and no column labels. That would really not be any friendlier than a binary file with an openly documented format and a suitable editing tool, e.g. sqlite.
Maybe a lack of fundamental technical skills is another culprit, but binary files really aren't that scary.
> Text is just bytes, and bytes are just text. I assume this is talking about human readable ASCII specifically.
Text is human readable writing (not necessarily ASCII). It is most certainly not just any old bytes the way you are saying.
I agree, but binary is exactly the same. You use a different tool to view it, and maybe you don't have that tool, and that's the problem. But it's a matter of having a way to interpret the data; trivially base64 encoding readable text gives you text, and if you can't decode it, it's as meaningless as binary you can't decode.
It makes more sense to consider readability or comprehensibility of data in an output format; text makes sense for many kinds of data, but given a graph, I'd rather view it as a graph than as a readable text version.
And if you have a way to losslessly transform data between an efficient binary form, readable text, or some kind of image (or other format), that's the best of all.
And it's funny to think about how many different incompatible text standards there were for the first 30ish years of computers. Each vendor had their own encoding, and it took until UTF-8 to even agree on text (let alone the legacy of UTF-16). If it took that long to agree on text, I have a bad feeling it'll take even longer to agree on anything else.
I suppose open standards have slowly been winning with opus and AV1, but there's still so many forms of interactions that have proprietary or custom interfaces. It seems like anything that has a stable standard has to be at least 20 years old, lol.
And machine readable. You can parse a CSV file more or less easily, but try the same with some forgotten software-specific binary format.
Text is bytes that's accompanied with a major constraint on which sequences of bytes are permitted (a useful compression into principal axes that emerged over thousands of years of language evolution), along with a natural connection to human semantics that is due to universal adoption of the standard (allowing correlations to be modelled).
Text is like a complexity funnel (analogous to a tokenizer) that everyone shares. Its utility is derived from its compression and its standardization.
If everyone used binary data with their own custom interpretation schema, it might work better for that narrow vertical, but it would not have the same utility for LLMs.
> Text is the oldest and most stable communication technology
Minor nit: complex language (i.e. Zipf’s law) is the oldest and most stable communication technology.
Before text, we had oral storytelling. It allowed us to communicate one generation's knowledge to the next, and so on.
Arguably this is present elsewhere in the animal kingdom (orcas, elephants, etc.), but human language proves to be the most complex.
Side note: one of my favorite examples is from the Gunditjmara (a group of Aboriginal Australians) who recall a volcanic eruption from 30k+ years ago [0].
Written language (i.e. text) is unique, in that it allows information to pass across multiple generations, without a man-in-the-middle telephone-like game of storytelling.
But both are similar: text requires you to read, in your own voice, the thoughts of another. Storytelling requires you to hear a story, and then communicate it to others.
In either case, the person is required to retell the knowledge, either as an internal monologue or as an external broadcast.
Always bet on language.
[0]https://en.wikipedia.org/wiki/Budj_Bim
This also leads to the unreasonable effectiveness of LLMs. The models are good because they have thousands of years of humans trying to capture every idea as text. Engineering, math, news, literature, and even art/craftsmanship. You name it, we wrote it down.
Our image models got good when we started making shared image and text embedding spaces. A picture is worth 1000 words, but 1000 words about millions of images are what allowed us to teach computers to see.
> effectiveness of LLMs
Is doing dozens of rounds of back-and-forth to explain what we actually want, while the model burns through an inordinate amount of processing power at each turn, a model of efficiency or effectiveness?
It might be convenient and allow for exploration, and the cost might be worth it in some cases, but I wouldn't call it "effective".
In many ways LLMs bring the drawbacks of spoken communication back to text.
(2014) Popular in:
2021 (570 points, 339 comments) https://news.ycombinator.com/item?id=26164001
2015 (156 points, 69 comments) https://news.ycombinator.com/item?id=10284202
2014 (355 points, 196 comments) https://news.ycombinator.com/item?id=8451271
Reread Story of Your Life just now, and all it made me want to do is learn Heptapod B and their semagram style of written communication.
Reading “Mathematica: A Secret World of Intuition and Curiosity” as well, and a part stuck out in a section called The Language Trap. The example the author gives is a recipe for banana bread: if you're familiar with bananas, it's obvious that you need to peel them before mashing. But if you haven't seen a banana, you'd have no clue what to do. Does a recipe need to say to peel the banana, or can that go unsaid? Questions like these are clearly coming up more with AI and context, but it's the same for humans. He ends that section saying most people prefer a video for cooking rather than a recipe.
Other quote from him:
“The language trap is the belief that naming things is enough to make them exist, and we can dispense with the effort of really imagining them.”
gnabgib points out that this same article has been posted for comment here three other times since it was written. That said, afaict no one has commented any of these times on what I'm about to say, so hopefully this will be new.
I'm a linguist, and I've worked in endangered languages and in minority languages (many of which will some day become endangered, in the sense of not having native speakers). The advantage of plain text (Unicode) formats for documenting such languages (as opposed to binary formats like Word used to be, or databases, or even PDFs) is that text formats are the only thing that will stand the test of time. The article by Steven Bird and Gary Simons "Seven Dimensions of Portability for Language Documentation and Description" was the seminal paper on this topic, published in 2002. I've given later conference talks on the topic, pointing out that we can still read grammars of Greek and Latin (and Sanskrit) written thousands of years ago. And while the group I led published our grammars in paper form via PDF, we wrote and archived them as XML documents, which (along with JSON) are probably as reproducible a structured format as you can get. I'm hoping that 2000 years from now, someone will find these documents both readable and valuable.
There is of course no replacement for some binary format when it comes to audio.
(By "binary" format I mean file formats that are not sequential and readily interpretable, whereas text files are interpretable once you know the encoding.)
This is all true, but I think you're too focused on your area. Finding musical notes that we can interpret correctly from an ancient civilization, would that be "text" or "binary"? I think it's a false choice.
Similarly, cave paintings express the painting someone intended to make better than a textual description of it.
Purely anecdotal, but I hoard a lot of personal documents (shopping receipts, confirmation emails, scans etc.) and for stuff I saved only 10 years ago, the toughest to reopen are the pure text files.
You rightly mention Unicode, as before that there was a jungle of formats. I have some in UTF-16, some in SJIS, a ton in EUC, other were already utf-8, many don't have a BOM. I could try each encoding and see what works for each of the files (except on mobile...it's just a PITA to deal with that on mobile).
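For what it's worth, the brute-force "try each encoding" version is only a few lines on a desktop. The candidate list below is just my guess at the encodings mentioned, and the output still needs a human eyeball, since legacy codecs will happily "decode" almost anything:

    CANDIDATES = ["utf-8", "utf-16", "shift_jis", "euc_jp", "cp1252"]

    def sniff(path):
        """Return (encoding, text) for the first candidate that decodes cleanly."""
        raw = open(path, "rb").read()
        for enc in CANDIDATES:
            try:
                return enc, raw.decode(enc)
            except UnicodeDecodeError:
                continue
        return None

Libraries like charset-normalizer or chardet do a statistical version of the same guessing game.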
But in comparison there's a set of files I never had issues opening, then or now: PDFs and JPEGs. All the files that my scanner produced are still readable absolutely everywhere. Even with slight bitrot they're readable, and with the current OCR processes I could probably put it all back into text if ever needed.
If I had to archive more stuff now and can afford the space, I'd go for an image format without hesitation.
PS: I'm surprised you don't mention the Unicode character limitations for minority languages or academic use. There will still be characters that either can't be represented, or don't have an exact 1 to 1 match between the code point and the representation.
Much as I love text for communication, it's worth knowing that "28% of US adults scored at or below Level 1, 29% at Level 2, and 44% at Level 3 or above" - Literacy in the United States: https://en.wikipedia.org/wiki/Literacy_in_the_United_States
Anything below 3 is considered "partially illiterate".
I've been thinking about this a lot recently, as someone who cares about technical communication and making technical topics accessible to more people.
Maybe wannabe educators like myself should spend more time making content for TikTok or YouTube!
The inverse of this is the wisdom that pearls should not be cast before swine. If you want to increase literacy rates, it's unclear to me how engaging people on an illiterate medium will improve things.
Technical topics demand a technical treatment, not 30-second junk food bites of video infotainment that then imbue the ignorant audiences with the semblance or false feeling of understanding, when they actually possess none. This is why we have so many fucking idiots dilating everywhere on topics they haven't a clue on - they probably saw a fucking YouTube video and now consider themselves in possession of a graduate degree in the subject.
Rather than try to widely distribute and disseminate knowledge, it would be far more prescient to capitalize on what will soon be a massive information asymmetry and widening intellectual inequality between the reads and the read-nots, accelerated by the production of machine generated, misinformative slop at scale.
Saying that a 20x20 image of a Twitter logo is 4000 bytes is just so wrong.
The image is of a monochrome logo with anti-aliased edges. Due to being a simple filled geometric shape, it could compress well with RLE, ZIP compression, or even predictors. It could even be represented as vector drawing commands (LineTo, CurveTo, etc...).
In a 1-bit-per-pixel format, a 20x20 image ends up as 400 bits (50 bytes).
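A quick back-of-the-envelope check (the "logo" below is just a filled disc standing in for the real shape):

    # Pack a 20x20 monochrome mask at 1 bit per pixel: 400 bits = 50 bytes,
    # before RLE, deflate, or a vector representation is even attempted.
    W = H = 20
    mask = [1 if (x - 9.5) ** 2 + (y - 9.5) ** 2 < 81 else 0
            for y in range(H) for x in range(W)]

    packed = bytearray()
    for i in range(0, len(mask), 8):
        byte = 0
        for bit in mask[i:i + 8]:
            byte = (byte << 1) | bit
        packed.append(byte)

    print(len(packed))    # 50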
This is sort of the premise of all of us electronics-as-code startups. We think that a text-based medium for the representation of circuits is a necessity for AI to be able to create electronics. You can't skip this step and generate schematic images or something. You have to have a human-readable (which also means AI-compatible) text medium. Another confusion: KiCad files are represented as text, so shouldn't AI be able to generate them? No: AI has similar levels of spatial understanding to a human reading those text files. You can't have a ton of XY coordinates or other non-human-friendly components in the text files. Everything will be text-based and human-readable, at least at the first layer of AI generation for serious applications.
Related: https://sive.rs/plaintext
I agree 99%.
The 1% where something else is better?
Youtube videos that show you how to access hidden fasteners on things you want to take apart.
Not that I can't get absolutely anything open, but sometimes it's nice to be able to do so with minimal damage.
I wonder if some day there will be a video codec that is essentially a standard distribution of a very precise and extremely fast text-to-video model (like SmartTurboDiffusion-2027 or something). Because surely there are limits to text, but even the example you gave does not seem to me to be beyond the reach of a text description, given a certain level of precision and capability in the model. And we now have faster than realtime text to video.
This sounds incredibly precarious and prone to breaking when you update to a new model.
It would be impossible to change the model. It would be like a codec, like H.264 but with 1-2 GB of fixed data attached to that codec name. Changing the model is like going to H.265. Different codec.
For a computer, text is a binary format like anything else. We have decades of tooling built on handling linear streams of text where we sometimes encode higher dimensional structures in it.
But I can't help feel that we try to jam everything into that format because that's what's already ubiquitous. Reminds me of how every hobby OS is a copy of some Unix/Posix system.
If we had a more general structured format, would we say the opposite?
It's easy to be a text maximalist now we're in the LLM era, but I disagree that ideas are a separate, nonphysical realm that cannot otherwise be described. https://lucent.substack.com/p/one-map-hypothesis
Nope. Text and media (visual and audio) are not comparable. Text is a vehicle and the other sensory content is the payload. A vehicle is different from a payload; a vehicle cannot represent a payload. When you are describing a scene or a sound using text, you are using text as a vehicle to send the sensory data to someone in a crude form. Stories recreate the sensory data and feelings.
The human sensory system has an evolved processing ability for visual and audio content. A story can give different sensory data and feelings to different receivers. It is a low-fidelity transmission.
Try telling someone how an old folk song sounded or how some exotic fruit tasted, or how some wild flower smelled, or how some surreal game scene looked, using only text.
Post from the creator of Rust, 11 years ago. Highly relevant to today.
I agree. As a simple exercise, look at all the software tools that are GUI-only. They become a large walled garden that can't be penetrated by LLMs.
Tools that are mostly text or have text interfaces? Greatly improved by LLMs.
So all of those rich multimedia and their players/editors really need to add text representations.
People make fun of it, but I think the fact that Unixey stuff can use tools that have existed since the 70's [1] can be attributed to the fact that they're text based. Every OS has its own philosophy on how to do GUI stuff and as such GUI programs have to do a lot of bullshit to migrate, but every OS can handle text in one form or another.
When I first started using Linux I used to make fun of people who were stuck on the command line, but now pretty much everything I do is a command line program (using NeoVim and tmux).
[1] Yes, obviously with updates but the point more or less still stands.
And when everything is a text file you have (optimally) a human-readable single source of truth on things... Very important when things get complicated and layered. In GUI stuff your only option is often to start anew, make the same movements as the first time, and hope you end up with what you want.
Text is not the best medium for the following situations:
- I want to learn how to climb rock walls
- I want to learn how to throw a baseball
- I want to learn how to do public speaking
- I want to learn how to play piano
- I want to make a fire in the woods
- I want to understand the emotional impact of war
- I want to be involved in my child's life
Honestly text is pretty good for conveying all of those things, though you'd also need to supplement it with practice in all but the emotional impact of war bit.
I don’t see the relevance to the topic. I could preface your list with something like “The monkey wrench is not the best tool for the following situations:”. It’s kinda vacuously true in a meaningless way but without expansion adds nothing to a discussion about the relative merits of monkey wrenches versus other similar tools like pliers or vice grips.
I agree with all of these except the emotional impact of war, where, though slower, a novel or memoir might work best. Think "All Quiet on the Western Front." At the same time we do want images of the war and the time, for grounding.
Why did you create an account just to post that?
In text format no less
I was going to disagree, along the lines of the people bringing up Bret Victor or other modes of communication and learning, but I have long accepted that the written word has been one of the largest boons for learning in human history, so I guess I agree. Still, it'll be an interesting and worthwhile challenge to make a better medium with modern technology.
Given all the replies here that are within the last 10-30 mins, I guess I am the only one getting "403 Forbidden"?
I guess that’s text. Text wins every time.
I disagree. If your goal involves the cooperation of others, you should always bet on lazy.
Text will win, unless there is a lower effort option. The lower effort option does not need to be better, just easier.
Text can be surprisingly immersive and rich, often surpassing the most complex VR experiences.
It is amazing what we can do with a few strings of symbols, thanks to the fact that we all learn to decode them almost for free.
The oldest and most important technology indeed.
I recently made the deliberate decision to keep the equation input in FuzzyGraph (https://fuzzygraph.com) plain text (instead of something like the stylized LaTeX that Desmos has) in order to make it easy to copy and paste equations.
This is one of the core reasons I've been focused on building small tools for myself using Emacs and the shell (currently ksh on OpenBSD). HTML and the Web are good, but only in their basic form. A lot of stuff fancies itself as applications and magazines, and it is very much unusable.
This is one of those irritating articles where one agrees with the gist, but there are serious flaws in the support. There are societies, even now, that don't have text. Yes, they represent a tiny fraction of 1% of the global population, but they do exist. And the beauty of text is that this level of nuance can be conveyed; a simplistic, inaccurate, broad-brush approach is not needed. Nor is text the oldest form of communication. I have recently started exploring the cave art record, and the text informs me that it is at least an upper-middle single-digit multiple of the age of text. Yes, a picture paints a thousand words, which can then be interpreted a thousand ways. Text has the ability to convey precise, accurate, objective information; it does not, as this article demonstrates, necessarily do so.
With LLMs, the text format should be more popular than ever, yet we still see people pushing binary protocols like ProtoBuf for a measly 20% bandwidth advantage which is lost after GZIPing the equivalent JSON... Or a 30% CPU advantage on the serialization aspect which becomes like a 1% advantage once you consider the cost of deserialization in the context of everything else that's going on in the system which uses far more CPU.
It's almost like some people think human-readability, transparency and maintainability are negatives!
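Purely as an illustration (stdlib only, with a hand-rolled fixed layout standing in for a real schema-based format, so take the exact ratios with a grain of salt), the interesting comparison is gzipped JSON against the raw binary layout:

    import gzip, json, struct

    records = [{"id": i, "temp": 20.0 + i % 7, "ok": i % 3 == 0} for i in range(1000)]

    as_json = json.dumps(records).encode()
    as_gzip = gzip.compress(as_json)
    # 13 bytes per record: uint32 id, float64 temp, bool ok
    as_binary = b"".join(struct.pack("<Id?", r["id"], r["temp"], r["ok"]) for r in records)

    print(len(as_json), len(as_gzip), len(as_binary), len(gzip.compress(as_binary)))

Real payloads and real schema compilers will of course shift the numbers, which is exactly why it's worth measuring before giving up readability.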
What are your thoughts on https://github.com/fastserial/lite3?
The older I get, the more I appreciate texts (any).
Videos, podcasts... I have them transcribed because, even though I like listening to music, podcasts are best read as text for speed of comprehension... (at least for me, I don't know about others).
Audio is horrible (for me) for information transfer - reading (90% of the time) is where it's at
Not sure why that is either - because I look at people extolling the virtues of podcasts, saying that they are able to multitask (e.g. driving, walking, eating dinner) and still hear the message - which leaves me aghast.
Brittany Spears - Hit Me Baby One More Time.mp3
To paraphrase the overused ol' Sapir-Whorf: if all you think about is information that can best be represented as text, all your examples will be ones text wins at.
Not sure, text wins hands down at sharing the ideas of one person, with many, across space and time.
I can read the thoughts of a philosopher who lived on literally the other side of the world, several thousand years ago.
I'm not aware of, but would love to know of, any other medium capable of that.
Podcasts are fine for entertainment, great for tuning out people or the traffic. I don’t expect to absorb information quickly, but try reading anything serious on the train when some guy is non-stop on his phone using his outside voice.
Ha! I used to
I had a 53 minute (each way) commute on the train, and I found it perfect for reading papers or learning skills - I was always amazed that the background noise would disappear and I could get lost in the text
Best study time ever.
> But text wins by a mile.
white on dark grey with phosphor green around? not really.
This. My thesis should be more text-to-text instead of image-to-text.
Another fascinating property of text (as compared to video) is that it's less bound to time: it's much easier to skim through and skip sections, kind of like teleporting through the time it took to write the text.
there is a surprising number of images used in that post.
Is it noteworthy that arguments against text by HN commenters are made using text
Reminds me of when HN thread comments about articles pertaining to the negative aspects of web advertising refer to the publisher's, e.g. a newspaper website's, use of web advertising, e.g., ad auctions, trackers, etc., as a point of significance
Would arguments against text be more convincing if made using something other than text
Is it appropriate to use text to make an argument against text. If yes, then why
The last 2 paragraphs were quite poetic.
PS: 2014
I agree about text being absolute.
I TOTALLY disagree on the terminal being the best way.
Even the text tablet shown uses the 2D surface to its full ability - we need to strive to bring that along as well.
I was surprised to see something in text today, until I remembered having known it at some point - the .har format. Looking at simonw's Claude-generated script [1] to investigate AI-agent-sent emails [2] by extracting .har archives, I saw that it uses base64 for binary and JSON strings for text.
It might be a good bet to bet on text, but it feels inefficient a lot of the time, especially in cases like this where all sorts of files are stored in JSON documents.
1: https://gist.github.com/simonw/007c628ceb84d0da0795b57af7b74...
2: https://simonwillison.net/2025/Dec/26/slop-acts-of-kindness/
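For reference, pulling those bodies back out is straightforward precisely because HAR is just JSON: text bodies live in entry["response"]["content"]["text"], and binary ones carry "encoding": "base64" next to it. A rough sketch:

    import base64, json

    def bodies(har_path):
        """Yield (url, bytes) for every response body captured in a HAR file."""
        with open(har_path, encoding="utf-8") as f:
            har = json.load(f)
        for entry in har["log"]["entries"]:
            content = entry["response"]["content"]
            text = content.get("text")
            if text is None:
                continue
            if content.get("encoding") == "base64":
                yield entry["request"]["url"], base64.b64decode(text)
            else:
                yield entry["request"]["url"], text.encode("utf-8")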