Aside from the Knowledge Graph buzzword, isn't this exactly the same idea as Tim Berners-Lee and the Semantic Web back in 2001?
- web of resources, not pages
- ontology-based schema
- RDF-based encoding
- URI (IRI) resource identifiers
- automated agents and reasoning (DL) support
Considering the ensuing reception and general avoidance of the semantic web outside academic papers, I guess no one wants to talk about it.
And there are related standards like HATEOAS for stateful behaviour, right?
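For anyone who hasn't run into HATEOAS: the server's response advertises which state transitions are currently available, and the client just follows links instead of hard-coding workflows. A minimal sketch of the idea (the resource and link names here are invented, not from any spec or from the paper):

```typescript
// A hypermedia-style response: state lives in the links the server offers.
interface Link {
  rel: string;    // what the transition means
  href: string;   // where to go
  method: string; // how to invoke it
}

interface OrderResource {
  id: string;
  status: "open" | "paid" | "shipped";
  _links: Link[];
}

const order: OrderResource = {
  id: "42",
  status: "open",
  _links: [
    { rel: "self",   href: "/orders/42",         method: "GET" },
    { rel: "pay",    href: "/orders/42/payment", method: "POST" },
    { rel: "cancel", href: "/orders/42",         method: "DELETE" },
    // Once the order is paid, the server stops advertising "pay" and starts
    // advertising something else; the client never hard-codes that state machine.
  ],
};
```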
But this isn't about a new way of presenting information to rival SPARQL or something.
This is a technical report about a guy who wrote a slightly advanced crawler bot that discovers the behaviour of a modern Web application and can maybe be used to generate automated tests.
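To make the "crawler that discovers behaviour" part concrete, here's roughly the shape of such a bot. I'm assuming a headless-browser driver like Playwright, which may well not be what the paper uses, and I'm only following links; the real thing presumably also exercises buttons, forms, and JS-driven widgets:

```typescript
// Sketch: drive a real browser, follow same-origin links breadth-first,
// and record the resulting state graph as (from, action, to) edges.
import { chromium } from "playwright";

type Edge = { from: string; label: string; to: string };

async function crawl(startUrl: string, maxPages = 20): Promise<Edge[]> {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  const origin = new URL(startUrl).origin;
  const seen = new Set<string>([startUrl]);
  const queue = [startUrl];
  const edges: Edge[] = [];

  while (queue.length > 0 && seen.size <= maxPages) {
    const url = queue.shift()!;
    await page.goto(url, { waitUntil: "networkidle" });

    // Candidate "actions" from this state -- here, just the visible links.
    const hrefs = await page.$$eval("a[href]", els =>
      els.map(e => (e as HTMLAnchorElement).href)
    );

    for (const href of hrefs) {
      if (!href.startsWith(origin)) continue;
      edges.push({ from: url, label: "follow link", to: href });
      if (!seen.has(href)) {
        seen.add(href);
        queue.push(href);
      }
    }
  }

  await browser.close();
  return edges;
}
```

Each recorded edge is already a candidate test: load `from`, perform the action, assert you end up at `to`.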
It has a lot in common with that post about a new YC company using LLMs to generate integrations for existing undocumented web applications.
The paper seems kind of dismissive:
> Knowledge graph-based systems, such as Squirrel [8], have been proposed to crawl the semantic web and represent data in structured formats. However, these approaches are typically limited to RDF-based web data and do not capture the dynamic, user-driven interactions seen in modern web applications.
How is this dismissive? It's a fairly straightforward statement of fact. The only way I could possibly read it as dismissive is if you interpreted it like, "These approaches are typically limited to [old, boring ass,] RDF-based web data and do not capture the [rich, luscious] dynamic, user driven interactions seen in [beautiful] modern web applications"
"Dismissive" in the sense of "let's dismiss it for the remainder of this paper, because it does not apply here".
Isn't this just going back to the '90s web, before all the JavaScript and interactivity craziness? We can argue all day about how bad JS is from a development perspective, but I definitely like the interactivity it's brought to the web.
> but I definitely like the interactivity it's brought to the web.
I don't!
EDIT: to be earnest, I sincerely wish the web as it stands had chosen some serious distinction between "documents", "documents-with-scripting-but-no-web-requests", and "interactive applications with remote state". As it stands the last option trivially passes for the first two options.
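As an aside, you can roughly approximate those three tiers with Content-Security-Policy headers today; the tier names and the mapping are my own framing, not anything the platform defines:

```typescript
// Approximate policy per tier (illustrative, not a standard taxonomy).
const tiers: Record<string, string> = {
  // "document": no scripting at all
  document: "default-src 'self'; script-src 'none'",

  // "document with scripting but no web requests": scripts may run,
  // but connect-src 'none' blocks fetch/XHR/WebSocket from the page
  scriptedDocument: "default-src 'self'; script-src 'self'; connect-src 'none'",

  // "interactive application with remote state": same-origin everything
  application: "default-src 'self'",
};
```

The problem, of course, is that nothing forces a site to declare which tier it actually is.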
JS can do some cool things. Sometimes it's necessary. Sometimes it's the most expedient way.
On the other hand, we've exported enough complexity and code payload size to the front end, and seen bandwidth rise enough, that some full page reloads can now beat the responsiveness/interactivity of many front ends. But if one doesn't want to do that, there are techniques that let JS act primarily in the controller role while the back end takes care of model/view concerns, exchanging HTML as an adequate data serialization and display/UI description format.
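In practice that controller role can be as small as "intercept the click, fetch an HTML fragment, swap it in" (the same move libraries like htmx build on). A sketch, with made-up attribute names:

```typescript
// JS as controller only: the server renders HTML, the client just swaps it in.
document.addEventListener("click", async (event) => {
  const link = (event.target as Element).closest("a[data-fragment]");
  if (!link) return;
  event.preventDefault();

  // Ask the back end for a rendered fragment (server owns model and view).
  const response = await fetch(link.getAttribute("href")!, {
    headers: { Accept: "text/html" },
  });
  const html = await response.text();

  // data-fragment holds a selector for where the fragment should land.
  const target = document.querySelector(link.getAttribute("data-fragment")!);
  if (target) target.innerHTML = html;
});
```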
And, just as importantly, as a hypertext and serialization format, HTML provides an opportunity for machine discovery and description of available resources. The consumers we're most used to are search engines, but we're obviously in an era where we're discovering what else is possible.
Possibly ML models can also work with JS messes, but since those are less legible to people, there's less likely to be a good corpus correlating accurate descriptions with code in the wild. Legibility of what you feed the client has benefits beyond executability, something that people who grew up with the early web were apparently prepared to understand.
This begs for the reverse: turning state graphs back into UI.
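Even a toy version makes the point. If the crawler's output is a graph of states and labelled transitions (my assumed shape, not necessarily the paper's), each state renders straight back into navigable markup:

```typescript
// Toy renderer: one discovered state becomes a heading plus a list of links,
// one per outgoing transition. All type and field names are invented.
type Transition = { label: string; target: string };
type State = { name: string; url: string; transitions: Transition[] };

function renderState(state: State, graph: Map<string, State>): string {
  const links = state.transitions
    .map(t => {
      const target = graph.get(t.target);
      return `<li><a href="${target?.url ?? "#"}">${t.label}</a></li>`;
    })
    .join("\n");
  return `<h1>${state.name}</h1>\n<ul>\n${links}\n</ul>`;
}
```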