I'm not the biggest fan of the embed block syntax, for two reasons.
1. Having `%\\\%` escape to `%\\%` is confusing. Assuming you want to embed kson inside kson it would look like this:
```
%kson: level 1
%%
```
Note that for each level of nesting, the delimiters inside the block have to increase, instead of the ones outside. In a long embedded block you probably copied from elsewhere, you have to check every escaped delimiter to make sure it has the correct number of backslashes and update it. It would have made more sense to have a varying number of % open the block and a matching number close it, e.g. `%%%kson: foo` is closed by `%%%`.
2. It's impossible to represent strings where every line has at least some leading indent, because they "always strip their minimum indent".
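To make the second point concrete, here is a minimal Python sketch. It assumes minimum-indent stripping behaves like Python's `textwrap.dedent`; that is an analogy for illustration, not KSON's actual implementation, and `strip_min_indent` is a hypothetical name.

```python
import textwrap


def strip_min_indent(block: str) -> str:
    """Remove the whitespace prefix shared by every non-blank line.

    Stand-in for "always strip the minimum indent"; this is an
    assumption about the behavior, not KSON's real code.
    """
    return textwrap.dedent(block)


# The content we actually want: every line keeps at least two leading spaces.
wanted = "  key: value\n    nested: true\n"

stripped = strip_min_indent(wanted)
print(repr(stripped))
```

Under that assumption, the two spaces shared by both lines are always removed, so the intended string cannot survive a round trip through the block.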
I think both these issues are solved by using Zig's style of raw string literals. An embed block could start with %tag, and each following line that starts with % continues the block. No closing delimiter or delimiter escapes needed.
```
%kson: level 1
```
Granted, this also kind of looks ugly with percent signs. But the upside is you could paste in each sub level, and a text editor could insert the symbols for you before each line, just like they already do when highlighting text and commenting it out.
The embed block seems interesting. Can someone give a concrete example of its purpose in terms of real-world usage?
I understand I'll get syntax highlighting and multi-line formatting, which seems great for devex, but what's stopping me from putting code in a single string and parsing it?
Does it offer some benefits while parsing it? Thanks!
Thanks! KSON is very focused on facilitating a better interface for a user managing config data, so one concrete example of the embed block's purpose is to help anywhere folks are directly editing important/complex multi-line strings in their config (think SQL in dbt files, or bash snippets in GitHub Actions, etc).
By making the content type of the embed a first-class citizen of the syntax, together with clear boundaries for the embedded content, editors and IDEs can hook in all their great tooling to the embed blocks. So, you nailed it: being great for devex is exactly the idea.
I met Daniel (creator of KSON) a while ago and have helped out with a few things, working towards this first beta release. I don't want to hijack the comments section, but there's an article I just finished writing that might give a bit of context on why KSON exists. It's called "Configuration files are user interfaces" (see https://ochagavia.nl/blog/configuration-files-are-user-inter...). Hope it helps folks understand where KSON is coming from!
(Update: my article eventually made it to the frontpage and stayed there for a good time, so there's interesting discussion there that people here might want to check out. See https://news.ycombinator.com/item?id=45291858.)
So the point is that you can use the nth competing standard, and transpile it to whatever the other systems need?
What about manipulating the data?
Does it support embedded comments, like TOML does? What happens if you round-trip that through JSON? What about the other superset features?
What does KSON stand for?
There's a lot said about JSON and YAML on the main page, but why is this better than TOML?
Hey zahiman, thanks for all the Qs. Trying to address them:
KSON (stands for: KSON Structured Object Notation) is designed not as an nth standard, but rather as an idea of how to play nice with all n-1 existing standards. Since our comments (we do support comments!) are preserved in formatting and transpilation (provided the underlying language supports comments), you can use KSON as the way you edit whatever underlying config file your system wants (currently YAML and JSON are supported, with a clear path to adding other targets like TOML <https://github.com/kson-org/kson/issues/202>).
Why is it better than TOML? We don't insist that everyone agree it is! But here are two ways to contrast them:
- We prefer how KSON's recursive structure keeps nesting consistent. You can see for instance in the grammar that a Kson list value is a list of Kson values: https://github.com/kson-org/kson/blob/857d585ef26d9f73e080b5...
- For use cases where folks are hand-editing non-trivial config, especially if they are using embedded code blocks of any kind: if they long for more help from their tooling, KSON makes that tooling possible.
So incredibly excited for this to come out! What an improvement in the quality of life for developers! My hunch is that this will also come in handy when sending data into LLMs: being able to reduce input tokens without losing fidelity is becoming an ever more important concern.
Nice to see this out there! Been playing with the token efficiency, looks like around 10% savings vs regular JSON/YAML. Not massive but it adds up. The embedded code thing is cool. Tired of configs that need weird templating or scripts all over the place just to do basic logic. Will check this out on some projects.
Excited for the GA of KSON! We've been exploring the alpha to help us build a DSL for a data viz tool and it's been really cool.
I've had similar thoughts. Any interest in sharing notes? Generally I've been thinking there should be a standard declarative format for data vis. This should be inspired by prior art like ggplot, Grammar of Graphics, Vega, VegaLite, etc.
From there it could be compiled to declarative or procedural targets (e.g. to generate VegaLite specs, ECharts specs or Observable Plot configs in JS).
How have you been thinking about this problem space?
For a moment I got confused that this was the new Kubernetes KYAML [0]... Yet another *SON, ewww!
[0]: https://medium.com/@simardeep.oberoi/kyaml-kubernetes-answer...