Partial auto-routing has always worked quite well.
But 99.9% of people forget to set the routing constraints/rules, so the results turn out very bad. Even professionals forget this.
Lots of things also cannot be auto-routed because you have to work around details. High-frequency DC-DC converters usually fall into that category, for example.
As a beginner, these are not tools for you, since you cannot judge whether they are creating extremely bad designs or something that will work. Stick very close to the reference designs from datasheets and there is a good chance they will work just fine.
The rule of thumb I always heard was "don't use it, it's shit!" I never validated it myself, and now it's gone from KiCad!
Would love auto-routing if it was good; routing is tedious and, on cramped PCBs, can be frustrating and make your designs high-inertia to change.
Incidentally, team 6+ layers! Makes routing easier.
Even "naked" auto-routers without configuration are great to check if there's any solution for routing by hand.
One could build a dataset out of GitHub data by analyzing KiCad files. They are likely to be "completed" projects, so one would have to rip out all the traces and start from that. The parts are also already placed (in a way that makes routing feasible for a human), and placement is a large part of routing. So one could have a second task setting where the parts also have to be placed, and then routed.
Such a dataset would likely represent simple to medium cases, as open designs are usually at the lower end of the complexity range compared to industry. It would also be hard to automatically infer important constraints such as differential pair matching. That would most likely require manual annotation. But if there indeed are no open datasets, then I think this would be a worthwhile contribution to the field.
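For what it's worth, the "rip out all the traces" step could be sketched roughly like this. It assumes the board is a KiCad `.kicad_pcb` S-expression file in which routed copper appears as `(segment ...)`, `(via ...)` and `(arc ...)` items directly under the root. `strip_routing` and `ROUTING_TOKENS` are made-up names for illustration, and zones and fills are left untouched.

```python
# Assumed token names of routed items in a .kicad_pcb file (hypothetical list).
ROUTING_TOKENS = {"segment", "via", "arc"}

def strip_routing(text: str) -> str:
    """Return the board text with routed items directly under the root removed."""
    out = []            # characters we keep
    depth = 0           # current parenthesis nesting depth
    skip_depth = None   # depth at which the item being dropped was opened
    in_string = False   # inside a double-quoted string?
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            if ch == "\\":                      # escaped character: take two chars
                if skip_depth is None:
                    out.append(text[i:i + 2])
                i += 2
                continue
            if ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "(":
            depth += 1
            if skip_depth is None and depth == 2:
                # Read the token name that follows the opening parenthesis.
                j = i + 1
                while j < len(text) and not text[j].isspace() and text[j] not in "()":
                    j += 1
                if text[i + 1:j] in ROUTING_TOKENS:
                    skip_depth = depth
        elif ch == ")":
            if skip_depth is not None and depth == skip_depth:
                # Closing parenthesis of the dropped item: stop skipping.
                depth -= 1
                skip_depth = None
                i += 1
                continue
            depth -= 1
        if skip_depth is None:
            out.append(ch)
        i += 1
    return "".join(out)
```

The depth check is there so that graphics inside footprints (for example `fp_arc`) are never touched; only direct children of the root expression are dropped.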
If OP wants to approach this using machine learning instead of a deterministic algorithm, wouldn't this be exactly what they need?
Use the completed traces and part locations (complete with human post-adjustments and all) as labels, and the bare connectivity graph + "constraints" in some form as inputs.
Of course, as with all machine learning projects, the interface is deceptively simple but gives you no information about how well the system can work, or whether it can work at all...
What kinds of PCBs are you thinking of auto-routing?
If you are talking about a typical hobbyist board with very low-frequency switching, the problem is relatively simple, particularly if you allow boards with more than two layers. Even there you can create massive problems if you do power distribution wrong.
On the other hand, if you are trying to do something with high frequencies, the problem will be much more difficult. For example, many high-speed analog-to-digital converters use low-voltage differential signaling (LVDS) to move data around. Such signals are very sensitive to bad routing, since the two traces of a pair must be kept very near each other (to avoid creating an inductive loop sensitive to external signals) and must have the same length (to retain common-mode rejection). Similarly, RF amplifiers will often oscillate badly or have much lower bandwidth if laid out incorrectly.
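To make the "same length" requirement concrete, here is a minimal sketch of a length-matching check for one differential pair. It assumes each trace is given as a polyline of (x, y) points in millimetres; the function names and the tolerance value are made up for illustration and are not taken from any EDA tool. It only checks length skew, not the spacing between the two traces.

```python
import math

def trace_length(points):
    """Total length of a polyline given as a list of (x, y) points in mm."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def pair_skew(p_points, n_points):
    """Absolute length difference between the P and N traces of a pair."""
    return abs(trace_length(p_points) - trace_length(n_points))

def is_matched(p_points, n_points, tol_mm=0.5):
    """True if the pair's length skew is within the (assumed) tolerance."""
    return pair_skew(p_points, n_points) <= tol_mm

# Example pair: the N trace cuts a corner and ends up shorter than the P trace.
p_trace = [(0, 0), (10, 0), (10, 5)]
n_trace = [(0, 1), (9, 1), (9, 5)]
print(pair_skew(p_trace, n_trace), is_matched(p_trace, n_trace))
```

A real constraint would come from the interface's timing budget; the 0.5 mm default here is just a placeholder.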
Routing by itself is a solved problem; there is no need for AI here. But! AI could be used to look at the part's datasheet and identify what kind of signal each pin carries. A typical status LED or power-good signal does not care about routing, while power and high-frequency signals are very sensitive to it.
For one, routing is not a solved problem. It's an unsolvable problem with a lot of okay solutions.
Second, you're describing netclasses. Every EDA package has this feature. You have to click one extra box when setting up your symbols.
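As a rough illustration of what net classes amount to, here is a sketch that maps net names to a class and looks up routing parameters for it. The patterns, class names and numeric values are all invented for the example; real EDA packages have their own rule formats.

```python
from fnmatch import fnmatchcase

# Hypothetical name patterns mapped to a net class; the first match wins.
NETCLASS_RULES = [
    ("*_P", "diff_pair"),
    ("*_N", "diff_pair"),
    ("VCC*", "power"),
    ("GND*", "power"),
]

# Illustrative routing parameters per class (values are placeholders).
NETCLASS_PARAMS = {
    "default":   {"width_mm": 0.20, "clearance_mm": 0.20},
    "power":     {"width_mm": 0.60, "clearance_mm": 0.25},
    "diff_pair": {"width_mm": 0.15, "clearance_mm": 0.20},
}

def classify(net_name: str) -> str:
    """Return the net class for a net name, falling back to 'default'."""
    for pattern, netclass in NETCLASS_RULES:
        if fnmatchcase(net_name, pattern):
            return netclass
    return "default"

def routing_params(net_name: str) -> dict:
    """Look up the routing parameters that apply to a net."""
    return NETCLASS_PARAMS[classify(net_name)]

for name in ("LVDS0_P", "GND", "LED_STATUS"):
    print(name, classify(name), routing_params(name))
```

Name-based rules like this only work if the schematic follows a naming convention, which is part of why the constraint setup tends to get skipped.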
That doesn't seem hard for the PCB designer to simply specify when designing. Although, I don't know how to design PCBs so I may well be missing something.
It’s not hard, but nobody has time for it. Curating footprints and symbols is a very time-consuming task.
Couldn't you simply do it with a few clicks when creating the signal?
Too difficult and annoying. We should burn several MW to save designers a couple of clicks.
Routing is kind of connected with board design: when you route stuff, you also move components around to make it easier.
An LLM would need to lay out and route the whole board on its own. We would probably need a diffusion or agentic solution where the board can be simulated in an RL loop.
As others have said, autorouting is a trivial problem under the assumption that you have infinite PCB layers. Trying to squeeze a design into a finite number of layers might be NP-hard, though.
What's not solved is setting up constraints on signals.
Now, if you still want to do this anyway, I'd suggest building an autorouter, automatically generating custom parts with both standard and generated footprints, placing them randomly on a PCB with a random number of layers, and then autorouting the board with a conventional algorithm. There is your dataset.
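A toy version of that pipeline, heavily simplified: a single layer, two-pin nets, a square grid, and a Lee-style breadth-first maze router standing in for the "conventional algorithm". Everything here (`lee_route`, `make_sample`, the grid size) is an assumption for illustration, not a real autorouter.

```python
import random
from collections import deque

def lee_route(grid, start, goal):
    """Breadth-first (Lee-style) shortest path on a 4-connected grid.
    grid[r][c] is True where a cell is blocked. Returns a list of cells or None."""
    rows, cols = len(grid), len(grid[0])
    prev = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:       # walk back to the start
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and nxt not in prev and (not grid[nr][nc] or nxt == goal):
                prev[nxt] = cell
                queue.append(nxt)
    return None

def make_sample(size=12, n_nets=4, seed=0):
    """Place random two-pin nets and route them one after another."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(size) for c in range(size)]
    pins = rng.sample(cells, 2 * n_nets)
    nets = [(pins[2 * i], pins[2 * i + 1]) for i in range(n_nets)]
    grid = [[False] * size for _ in range(size)]
    for r, c in pins:
        grid[r][c] = True                 # pins block all other nets
    routes = []
    for start, goal in nets:
        path = lee_route(grid, start, goal)
        routes.append(path)               # None means this net could not be routed
        for r, c in path or []:
            grid[r][c] = True             # routed copper blocks later nets
    return {"size": size, "nets": nets, "routes": routes}

sample = make_sample(seed=1)
print(sum(p is not None for p in sample["routes"]), "of", len(sample["nets"]), "nets routed")
```

Note that a dataset built this way can only teach a model to imitate the conventional router, including its net-ordering artifacts, so it says little about the constraint problem mentioned above.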
Have you tried pre-LLM solutions, like TopoR?