I am returning to this model in my classes: pen-and-paper quizzes, no digital devices. I also give seven equally weighted quizzes to lower the stakes of each one individually. I have reduced project/programming weight from 60-80% of the grade to 50% because it is no longer possible to tell whether the students actually did the work.
Mostly. 50% for the midterm and final, plus 10 of the 50 project points are for individual contributions, to account for varying interest in and contribution to the project work.
But the problem is, students need to learn to do the easy things themselves before they can do the hard things with LLMs.
If you ask them to build a web browser when they can't do a hello world on their own, it's going to be a disaster. LLMs are like dumb juniors that you command, but students are less skilled than dumb juniors when they start programming classes.
Why should my 5 year old learn anything if he can just ask chatGPT?
Using chatGPT as a professional is different from using it for homework. Homework and school teach you many things, not only the subject. You discover how you learn, what your interests are, etc.
ChatGPT can assist with learning too, but it SHOULD NOT be doing any of the work for the student. It is okay to ask "can you explain big O", then follow up with further questions. However, "give me a method to reverse a string" will only hurt.
Depends. Does the college want to graduate computer scientists or LLM operators?
More importantly, does the student want to be a computer scientist or an LLM operator? If they think the future belongs to LLM operators (not a bet I'd recommend), college might not even be the right path for them versus a trade school / bootcamp.
Do you think children should still be expected to be able to do arithmetic by hand?
I think the answer maybe comes down to figuring out exactly what the goal of school is. Are you trying to educate people or train them? There is for sure a lot of overlap, but I think there's a pretty clear distinction and I definitely favor the education side. On the job, a person with a solid education should be able to use whatever language or framework they need with very little training required.
At the university, you are supposed to learn the foundational knowledge. If you let the LLM do the work, you are simply not learning. There are no shortcuts.
And learning how to use LLMs is pathetically easy. Really.
Open book exams are not a new thing and I've often had them for STEM disciplines (maths and biology). Depending on the subject, you will often fail those unless you had a good prior understanding of the material.
If you can pass an exam just by googling something, it means you're just testing rote memorization, and maybe a better design is needed where synthesis and critical-thinking skills are evaluated more actively.
I make a point of only using references that are either available for free online or through our university’s library subscriptions. These are all electronic. My open book exam became an open computer exam when I realized students were printing hundreds of pages just for a 3-hour exam. This semester I’m switching to no-computer, bring your own printed cheat-sheet for the exam.
I had a Continuous and Discrete Systems class that allowed open everything during exams. You could google whatever you wanted but the exam was so lengthy that if you had to google something, you really did not have much time to do it and would definitely not have enough time to do it a second time. I would load up a PDF of the chapters and lectures I needed and my homeworks for that unit with everything properly labeled. It was much faster looking for a similar problem you already did in the homework than trying to find the answer online.
Offer to make everyone espresso and macchiato with your GPU cooling module. They won't be able to hear the fan over the grinder and pump and milk foamer!
Except that the physical book isn't the way people lookup facts these days.
The purpose of the open-book test is not to have to know all the facts (formulas) but to prove you know how to find them and how to apply them. (Finding is part of it: the more time you spend looking, the less time you have to use what you find, so there is an optimisation problem of which things to remember and which to look up.)
In modern times you wouldn't look those up in a book, so other research techniques are required to deal with real life (which advanced certifications should prove).
Going to university isn't how people learn these days, so there is already a real-world disconnect, fundamentally. But that's okay as it isn't intended to be a reflection of the real world.
Observation? Children show clear signs of learning before they even make it through their first year out of the womb. Man, most people don't even consider university as an option until they are around 17-18 years of age, after they have already learned the vast majority of the things they will learn in life.
Data? Only 7-8% of the population have a university degree. Obviously you could learn in university without graduating, and unfortunately participation data is much harder to come by, but there is no evidence to suggest that the non-completion rate is anywhere near high enough to think that even a majority of the population have set foot in a university, even if just for one day. If we go as far as to assume a 50% dropout rate, that is still no more than 16% of the population. Little more than rounding error.
Nothing? It's a random comment on the internet. It is not necessarily based on anything. Fundamentally, comments are only ever written for the enjoyment of writing. One trying to derive anything more from it has a misunderstanding of the world around them. I suppose you have a point that, for those who struggle to see the obvious, a university education would teach the critical thinking necessary to recognize the same. But, the fact that we are here echoes that university isn't how people learn these days.
> citing
Citing...? Like, as in quoting a passage? I can find no reason why I would want to repeat what someone else has written about. Whatever gives you enjoyment, but that seems like a pointless waste of time. It is already right there. You must be trying to say something else by this? I, unfortunately, am not in tune with your pet definition.
> This approach still works, why do something else?
One issue is that the time provided to mark each piece of work continues to decrease. Sometimes you are only getting 15 minutes for 20 pages, and management believe that you can mark back-to-back from 9-5 with a half hour lunch. The only thing keeping people sane is the students that fail to submit, or submit something obviously sub-par. So where possible, even for designing exams, you try to limit text altogether. Multiple choice, drawing lines, a basic diagram, a calculation, etc.
Some students have terrible handwriting. I wouldn't be against the use of a dumb terminal in an exam room/hall. Maybe in the background it could be syncing the text and backing it up.
> Unless you're specifically testing a student's ability to Google, they don't need access to it.
I've been the person testing students, and I don't always remember everything. Sometimes it is good enough for the students to demonstrate that they understand the topic enough to know where to find the correct information based on a good intuition.
Your blue book is being graded by a stressed out and very underpaid grad student with many better things to do. They're looking for keywords to count up, that's it. The PI gave them the list of keywords, the rubric. Any flourishes, turns of phrase, novel takes, those don't matter to your grader at 11 pm after the 20th blue book that night.
Yeah sure, that's not your school, but that is the reality of ~50% of US undergrads.
Very effective multiple choice tests can be given that require work to be done before selecting an answer, so they can be machine graded. Not ideal in every case, but a very high-quality test can be made multiple choice for hard-science subjects.
But again, the test creator matters a lot here too. Making such an exam is quite the labor, especially as many/most PIs have other, better things to do. Their incentives are grant money, then papers, then in a distant 3rd their grad students, and finally undergrad teaching. Many departments are explicit on this. To spend their limited time on a good undergrad multiple choice exam is not in the PIs' best interest.
Which is why, in this case of a good Scantron exam, they're likely to just farm it out to Claude. Cheap, easy, fast, good enough. A winner in all dimensions.
Also, as an aside to the above, an AI with OCR for your blue book would likely be the best realistic grader too. Needs less coffee after all
> Very effective multiple choice tests can be given, that require work to be done before selecting an answer, so it can be machine graded.
As someone who has been part of the production of quite a few high stakes MC tests, I agree with this.
That said, a professor would need to work with a professional test developer to make a MC that is consistently good, valid, and reliable.
Some universities have test dev folks as support, but many/most/all of them are not particularly good at developing high quality MC tests imho.
So, for anyone in a spot to do this, start test dev very early, ideally create an item bank that is constantly growing and being refined, and ideally have some problem types that can be varied from year-to-year with heuristics for keys and distractors that will allow for items to be iterated on over the years while still maintaining their validity. Also, consider removing outliers from the scoring pool, but also make sure to tell students to focus on answering all questions rather than spinning their wheels on one so that naturally persistent examinees are less likely to be punished by poor item writing.
Pros and cons. Multiple choice can be frustrating for students because it's all or nothing. Spend 10+ minutes on a question, make a small calculation error, and end up with a zero. It's not a great format for a lot of questions.
They're also susceptible to old-school cheating - sharing answers. When I was in college, multiple choice exams were almost extinct because students would form groups and collect/share answers over the years.
You can solve that but it's a combinatorial explosion.
A long time ago, when I handed out exams, I used to program each question into a generator that produced both not-entirely-identical questions for each student (typically, only the numeric values changed) and the matching answers for whoever was in charge of assessing.
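For anyone curious what that looks like in practice, here is a minimal sketch of such a parameterized generator, assuming Python; the question template, numeric ranges, and student list are invented for illustration, not taken from the comment above.

    import hashlib
    import random

    def rng_for(student_id: str, paper_seed: str = "exam-2025") -> random.Random:
        # Deterministic per-student RNG so the same sheet can be regenerated later.
        digest = hashlib.sha256(f"{paper_seed}:{student_id}".encode()).hexdigest()
        return random.Random(int(digest, 16))

    def question_and_answer(student_id: str) -> tuple[str, str]:
        # Same structure for everyone; only the numeric values vary per student.
        rng = rng_for(student_id)
        r = rng.randint(2, 9)    # resistance in ohms (illustrative values)
        v = rng.randint(10, 40)  # supply voltage in volts
        question = (f"A {r} ohm resistor is connected to a {v} V supply. "
                    "What current flows through it?")
        answer = f"I = V / R = {v} / {r} = {v / r:.2f} A"
        return question, answer

    if __name__ == "__main__":
        # One line per student: the sheet text and the grader's answer key.
        for student in ["alice", "bob", "carol"]:
            q, a = question_and_answer(student)
            print(f"{student}\t{q}\t{a}")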
This is what my differential equations exams were like almost 20 years ago. Honestly, as a student I considered them brutal (10 questions, no partial credit available at all) even though I'd always been good at math. I scraped by but I think something like 30% of students had to retake the class.
Now that I haven't been a student in a long time and (maybe crucially?) that I am friends with professors and in a relationship with one, I get it. I don't think it would be appropriate for a higher level course, but for a weed-out class where there's one Prof and maybe 2 TAs for every 80-100 students it makes sense.
For large classes or test questions used over multiple years, you need to take care that the answers are not shared. It means having large question banks which will be slowly collected. A good question can take a while to design, and it can be leaked very easily.
Stanford started doing 15 minute exams with ~12 questions to combat LLM use. OTOH I got a final project feedback from them that was clearly done by an LLM :shrug:
> I got a final project feedback from them that was clearly done by an LLM
I've heard of this and have been offered "pre-prepared written feedback banks" for questions, but I write all of my feedback from scratch every time. I don't think students should have their work marked by an LLM or feedback given via an LLM.
An LLM could have a place in modern marking, though. A student submits a piece of work and you may have some high level questions:
1. Is this the work of an LLM?
2. Is this work replicated elsewhere?
3. Is there evidence of poor writing in this work?
4. Are there examples where the project is inconsistent or nonsensical?
And then the LLM could point to areas of interest for the marker to check. This wouldn't be to replace a full read, but would be the equivalent of passing a report to a colleague and saying "is there anything you think I missed here?".
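As a sketch of what that screening pass could look like (my own illustration, not the commenter's tooling): it assumes the `openai` Python client and an API key in the environment, the model name and prompt wording are placeholders, and the output is only a list of passages for the human marker to re-read, never a grade.

    # Sketch of an LLM "second pair of eyes" for a human marker. Assumes the
    # `openai` package (v1 client) and OPENAI_API_KEY set in the environment.
    # The model name and prompt are placeholders; nothing here assigns a grade.
    from openai import OpenAI

    SCREENING_PROMPT = """You assist a human exam marker. For the submission below,
    quote any passages that look like: (1) LLM-generated text, (2) text replicated
    from elsewhere, (3) notably poor writing, (4) internal inconsistencies or
    nonsensical claims. Do not grade; only point the marker at places to re-read."""

    def screen_submission(text: str) -> str:
        client = OpenAI()
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[
                {"role": "system", "content": SCREENING_PROMPT},
                {"role": "user", "content": text},
            ],
        )
        return resp.choices[0].message.content

The marker would still read the full submission; this just plays the role of the colleague being asked "is there anything you think I missed here?".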
> Then they should have points deducted for that. Effective communication of answers is part of any exam.
Agreed. Then let me type my answers out like any reasonable person would do.
For reference…
For my last written blue book exam (in grad school) in the 90s, the professor insisted on blue books and handwriting.
I asked if I could type my answers or hand write my answers in the blue books and later type them out for her (with the blue book being the original source).
I told her point blank that my “clean” handwriting was produced at about a third of the speed that I can type, and that my legible chicken scratch was at about 80% of my typing rate. I hadn’t handwritten anything longer than a short note in over 5 years. She insisted that she could read any handwriting, and she wasn’t tech savvy enough to monitor any potential cheating in real time (which I think was accurate and fair).
I ended up writing my last sentence as the time ran out. I got an A+ on the exam and a comment about one of my answers being one of the best and most original that she had read. She also said that I would be allowed to type out my handwritten blue book tests if I took her other class.
All of this is to say that I would have been egregiously misgraded if “clean handwriting” had been a requirement. There is absolutely no reason to put this burden on people, especially as handwriting has become even less relevant since that exam I took in the 90s.
I was in university around the same time. While there I saw a concerted effort to push online courses. Professors would survey students, fishing for interest. It was unpopular. To me the motivation seemed clear: charge the same or more for tuition, but reduce opex. Maybe even admit more students and just have them be remote. It watered down the value of the degree while working towards a worse product. Why would a nonprofit public university be working on maximizing profit?
Universities aren’t profit maximizing. They are admin maximizing. Admin are always looking to expand admins budget. Professors, classrooms, facilities all divert money away from admin and they don’t want to pay it unless they have to.
> Why would a nonprofit public university be working on maximizing profit?
Because 'nonprofit' is only in reference to the legal entity, not the profit-seeking people working there? There is still great incentive to increase profitability.
You're thinking of not-for-profit. Non-profits do not seek increased profitability in the same way since it's expected (mandated?) they don't have any.
So they can educate more students? Many university classes are lecture only with 200+ students in the class and no direct interactions with profs. Those courses might as well be online.
One potential answer is that this tests more heavily for the ability to memorise, as opposed to understanding. My last exams were over ten years ago and I was always good at them because I have a good medium-term memory for names and numbers. But it's not clearly useful to test for this, as most names and numbers can just be looked up.
When I was studying at university there was a rumour that one of the dons had scraped through their fourth-year exams despite barely attending lectures, because he had a photographic memory and just so happened to leaf through a book containing a required proof, the night before the exam. That gave him enough points despite not necessarily understanding what he was writing.
Obviously very few students have that sort of memory, but it's not necessarily fair to give advantage to those like me who can simply remember things more easily.
Have you ever seen a programmer who really understands C going to stackoverflow every time they have to use fopen()? Memorization is part of understanding. You cannot understand something without it being readily available in your head.
Right, and a lot of them probably got that understanding by going to stackoverflow every time they needed to use fopen() until they eventually didn’t need to anymore.
In the book days, I sometimes got to where I knew exactly where on a page I would find my answer without remembering what that answer was. Nowadays I remember the search query I used to find an answer without remembering what that answer was.
I wrote a long answer, but I realised that even advanced C users are unlikely to have memorised every possible value of errno and what they all mean when fopen errors. There's just no point as you can easily look it up. You can understand that there is a maximum allowable number of opened files without remembering what exact value errno will have in this case.
Yes, I have. I do it too, even some basic functions, I would look up on SO.
You really just need to know that there's a way to open files in C.
I don't think you can reach any sort of scale of breadth or depth if you try to memorize things. Programmers have to glue together a million things, it's just not realistic for them to know all the details of all of them.
It's unfortunate for the guy who has memorized all of K&R, but we have tools now to bring us these details based on some keywords, and we should use them.
I dunno, I went to a high school reunion last year, and a dude seemed to know people's phone numbers from 30 years ago.
If he could remember that sort of thing, I can believe there are people who can remember the steps of a proof, which is a much less random thing that you can feel your way around, given a few cues from memory.
Plus, realistically, how closely does an examiner read a proof? They have a stack of dozens of almost the same thing, I bet they get pretty tired of it and use a heuristic.
I think many people who grew up before cell phones remember phone numbers from the past. I just thought about it and can list the phone numbers of 3 houses that were on my childhood street in the early 2000s + another 5 that were friends in the area. I remember at least a handful of cell phone numbers from the mid to late 2000s as friends started to get those; some of them are still current. On the other hand, I don't know the number of anyone I've met in the last 15 years besides my wife, and haven't tried to.
When I was in university, in my program, the most common format was that you were allowed to bring in a single page of notes (which you prepared ahead of time based on your understanding of what topics were likely to come up). That seemed to work fine for everyone.
My students then often ask me to do the same, to permit them to bring one page of notes as he does.
Then I would say: just assume you're writing the exam with him and work on your one-pager of notes, optimize your notes by copying and re-writing them a few times. Now, the only difference between my exam and his exam is that the night before, you memorize your one-pager (if you re-wrote it a few times you should be able to recreate it purely from memory from that practice alone).
I believe having had all material in your memory at the same time, at least once for a short while, gives students higher self-confidence; they may forget stuff again, but they hopefully remember the feeling of mastering it.
I teach at MSc level. My students are scattered around the country and world. This makes hand-written exams tricky. Luckily, the nature of the questions they are asked to solve in the essay I give them following their coursework is such that chatbots produce appallingly bad submissions.
In my case, I set some course-work, where they have to log in to a Linux server in the university and process a load of data, get the results, and then write the essay about the process. Because the LLM hasn't been able to log in and see the data or indeed the results, it doesn't have a clue what it's meant to talk about.
for most of the low hanging fruit it's as easy as copy-pasting the question into multiple LLMs and logging the output
do it again from a different IP or two.
there will be some pretty obvious patterns in responses. the smart kids will do minor prompt engineering "explain like you're Peter Griffin from Family Guy" or whatever, but even then there will be some core similarities.
or follow the example of someone here and post a question with hidden characters that will show up differently when copy-pasted.
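A minimal sketch of the comparison step described above, assuming Python and that you have already pasted the exam question into a few chatbots and saved each answer as a text file; `difflib` is standard library, and the similarity threshold is an arbitrary starting point, not a calibrated value.

    # Compare a student submission against saved chatbot answers to the same
    # question. Surface-level similarity only; a human still makes the call.
    import difflib
    from pathlib import Path

    def similarity(a: str, b: str) -> float:
        # Token-level ratio in [0, 1]; crude, but enough to spot near-copies.
        return difflib.SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()

    def flag_submission(submission: str, reference_dir: str, threshold: float = 0.6):
        # `threshold` is arbitrary; tune it on known-clean and known-copied samples.
        hits = []
        for ref in Path(reference_dir).glob("*.txt"):
            score = similarity(submission, ref.read_text())
            if score >= threshold:
                hits.append((ref.name, round(score, 2)))
        return sorted(hits, key=lambda hit: -hit[1])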
I don't know what you majored in. But when I was a CS major, maybe 50% of my grade came from projects. We wrote a compiler from scratch, wrote something that resembled a SQL engine from scratch, and wrote sizeable portions of an operating system. In my sophomore year we spent at least 20 hours a week on various projects.
We could use any resource we could find as long as we didn't submit anything we didn't write ourselves. That meant stackoverflow and online documentation.
There is no way you can test a student's ability to implement a large, complex system with thousands of lines of code in a three hour exam. There is just no way. I am not against closed book paper exams, I just wish the people touting them as the solution can be more realistic about what they can and cannot do.
I had some take-home exams in Physics where you could use the internet, books, anything except other people (but that was honor-code based). Those were some of the hardest exams I ever took in my life. Pages and pages of mathematical derivations. An LLM, given how good they are at constructing mathematics, would actually have handled those pretty well.
People really struggle to go back once a technology has been adopted. I think for the most part, people cannot really evaluate whether or not the technology is a net positive; the adoption is more social than it is rational, and so it'd be like asking people to change their social values or behaviors.
It was the same when I graduated 6 years ago. We had projects to test our ability to use tools and such, and I guess in that context LLMs might be a concern. But exams were pencil and paper only.
Something that I think many students, indeed many people, struggle with is the question "why should I know anything?"
For most of us--myself included--once you graduate from college, the answer is: "enough to not get fired". This is far less than most curriculums ask you to know, and every year, "enough to not get fired" is a lower and lower bar. With LLMs, it's practically on the floor for 90% of full-time jobs.
That is why I propose exactly the opposite regimen from this course, although I admire the writer's free thinking. Return to tradition, with a twist. Closed-book exams, no note sheets, all handwritten. Add a verbal examination, even though it massively increases examination time. No homework assignments, which encourage "completionist mindset", where the turning-in of the assignment feels more real than understanding the assignment. Publish problem sets thousands of problems large with worked-out-solutions to remove the incentive to cheat.
"Memorization is a prerequisite for creativity" -- paraphrase of an HN comment about a fondly remembered physics professor who made the students memorize every equation in the class. In the age of the LLM, I suspect this is triply true.
> once you graduate from college, the answer is: "enough to not get fired"
I thought the point was to continue in the same vein and contribute to the sum total of all human knowledge. I suppose this is why people criticize colleges as having lost their core principles and simply responded to market forces to produce the types of graduates that corporate America currently wants.
> "enough to not get fired" is a lower and lower bar.
Usually people get fired for their actions and not their knowledge or lack thereof. It may be that David Graeber's core thesis was correct: most jobs are actually "bullshit jobs," and in the era of the Internet, they don't actually require any formal education to perform.
I agree with both of your assertions. Most jobs are indeed bullshit jobs in the age of abundance, and while the "point" of knowledge and wisdom is, in a grander sense, to continue in the same vein and contribute to the sum total of all human knowledge (I prefer the slightly less abstract phrase "build and inhabit a greater civilization"), there's very little about the current education system or the economic modality of the modern West that incentivizes that goal.
> Closed-book exams, no note sheets, all handwritten. Add a verbal examination
You are describing how school worked for me (in Italy, but much of Europe is the same I think?) from middle school through university. The idea of graded homework has always struck me as incredibly weird.
> In the age of the LLM, I suspect this is triply true.
They do change what is worth learning though? I completely agree that "oh no the grades" is a ridiculous reaction, but adapting curricula is not an insane idea.
Something often left out is the dependence on LLMs. Students today assume LLMs will always be available, at a price they (or their companies) can afford.
What happens if LLMs suddenly change their cost to 1000 USD per user per month? What if it is 1000 USD per request? Will new students and new professionals still be able to complete their jobs?
I swear teachers said something extremely similar about calculators when I was in grade school. "What are you going to do when you don't have access to a calculator? You won't always have one with you!"
Calculators have never been more accessible/available. (And yet I personally still do most basic calculations in my head)
So I agree students should learn to do this stuff without LLMs, but not because the LLMs are going to get less accessible. There's another better reason I'm just not sure how to articulate it yet. Something to do with integrated information and how thinking works.
Calculators are widely available for a low cost. The logic behind most calculators can be consistently duplicated across a variety of manufacturers, thereby lowering the cost of producing them for the masses.
LLMs are not consistent. For example, having a new company make a functional duplicate of ChatGPT is nearly impossible.
Furthermore, the cost of LLMs can change at any time for any reason. Access can be changed by new government regulations, and private organizations can choose to suspend or revoke access to their LLM due to changes in local laws.
All of this makes dependence on an LLM a risk for any professional. The only way these risks would be mitigated is by an open source, freely available LLM that produces consistent results and that students can learn how to use.
The comparison with calculators overlooks several key developments.
LLMs are becoming increasingly efficient. Through techniques such as distillation, quantization, and optimized architectures, it is already possible to run capable models offline, including on personal computers and even smartphones. This trend reduces reliance on constant access to centralized providers and enables local, self-contained usage.
Rather than avoiding LLMs, the rational response is to build local, portable, and open alternatives in parallel. The natural trajectory of LLMs points toward smaller, more efficient, and locally executable models, mirroring the path that calculators themselves once followed.
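To make the "locally executable" point above concrete, here is a sketch of running a small instruction-tuned model offline with the Hugging Face `transformers` library; the model name is just one example of a small model, and the prompt is arbitrary.

    # Sketch: a small distilled model running entirely on local hardware.
    # Requires the `transformers` package and enough RAM/VRAM for the chosen model;
    # the model name below is an example and can be swapped for any small model.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="Qwen/Qwen2.5-0.5B-Instruct",  # example of a small local model
    )

    out = generator(
        "Explain big-O notation in two sentences.",
        max_new_tokens=120,
    )
    print(out[0]["generated_text"])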
My intuition is that the costs involved to train and run LLMs will keep dropping. They will become more and more accessible, so long as our economies keep chugging along.
I could be wrong, time will tell. I just wouldn't base my argument for why students should learn to think for themselves on accessibility of LLMs. I think there's something far more fundamental and important, I just don't know how to say it yet.
The question is no longer "How do we educate people?" but "What are work and competence even for?"
The culture has moved from competence to performance. Where universities used to be a gateway to a middle class life, now they're a source of debt. And social performances of all kind are far more valuable than the ability to work competently.
Competence used to be central, now it's more and more peripheral. AI mirrors and amplifies that.
I completely agree with you. Do you have any ideas about what might stem this tide on a grander scale? I live in the country and will homeschool my kids--I think the risk of under-socialization is worth the reward of competency-based education and the higher likelihood of my own principles taking hold--but I would vastly prefer to send them to a normal school with other kids, albeit one in a superior society to that which we currently inhabit.
> Do you have any ideas about what might stem this tide on a grander scale?
The best way to move from the working class to the middle class these days is the military with a federal government job after retirement (even with what the current admin is doing). That said, a person doing this needs to realize that they will need to unlearn and learn a lot of social habits and learn some new ones.
The bonus is that higher ed will be free, and ambitious folks can ladder up into officer roles, which can be even more of a social climb.
> I think the risk of under-socialization is worth the reward of competency-based education and the higher likelihood of my own principles taking hold
I think you are very wrong on this point.
A highly-socialized person with the minimum viable amount of competency will go much farther in life than a highly-competent person with limited social skills.
If your kids are in a good school system, there will be a culture of competence in the students and their families.
> but I would vastly prefer to send them to a normal school with other kids, albeit one in a superior society to that which we currently inhabit
You just need to find the right pocket of people.
I personally recommend good Montessori schools over home schooling for K-8. It doesn’t work for everyone, but it works well when it’s a good fit. The community around the school is usually fairly healthy as well.
For 9-12, a high-quality private school, a magnet school, a combo high school / JC, or an independent study high school (often with home school “classes”) are all good options for curious and ambitious students, imho.
I had an electrodynamics professor say that there was no reason to memorize the equations, you would never remember them anyway; the goal was to understand how the relationships were formed in the first place. Then you would understand what relationships each equation represents. That, I think, is the basis for this statement. Memorization of the equations gives you a basis to understand the relationships. So I guess the hope is that that is enough. I would argue it isn't enough, since physics isn't really about math or equations; it's about the structure and dynamics of how systems evolve over time. And equations give one representation of the evolution of those systems. But it's not the only representation.
This is all very well if the goal was to sift the wheat from the chaff - but modern western education is about passing as many fee paying students as possible, preferably with a passably enjoyable experience for the institutional kudos.
I think that really depends on the country. I went to an engineering school where only 15% of applicants out of high school were admitted, and of those who were admitted only around 75% graduated.
Western education passing as many fee-paying students as possible seems to be very much a UK/US phenomenon, but doesn't seem to be the case in European countries where the best schools are public and fees are very low (in France, private engineering schools rank lower).
I wonder if education will bifurcate back out as a result of AI. Small, bespoke institutions which insist on knowledge and difficult tests. And degree factories. It seems like students want the degree factory experience with the prestige of an elite institution. But - obviously - that can’t last long. Colleges and universities should decide what they are and commit accordingly.
I think the UK has been heading this way for a while -- before AI. It's not been the size of the institutions that has changed, but the "elite" universities tend to give students more individual attention. A number of them (not just Oxford and Cambridge) have tutorial systems where a lot of learning is done in a small group (usually two or three students). They have always done this.
At the other extreme are universities offering low quality courses that are definitely degree factories. They tend to have a strong vocational focus but nonetheless they are not effective in improving employability. In the last few decades we have expanded the university system and there are far more of these.
There is no clear cutoff and a lot of variation in between, so it's not a bifurcation, but the quality vs factory difference is there.
On the other side, in western systems funded by taxes, the incentive is still to give out as many degrees as possible, as schools get funding based on degrees produced.
Mostly done to get more degree holders, who are seen as "more productive". Or at least higher paid...
> Add a verbal examination, even though it massively increases examination time. No homework assignments, which encourage "completionist mindset"
To the horror of anyone struggling with anxiety, ADHD, or any other source of memory-recall issues under examination pressure. This further optimizes everything for students who can memorize and recall information on the spot under artificial pressure, and who don't suffer from any of the problems I mentioned.
In grade school you could put me on the spot and I would blank on questions about subjects that I understood rather well and that I could answer 5 minutes before the exam and 5 minutes after the exam, but not during the exam. The best way for me to display my understanding and knowledge is through project assignments where that knowledge is put to practical use, or worked "homework" examples that you want to remove.
Do you have any ideas for accommodating people who process information differently and find it easier to demonstrate their knowledge and understanding in different ways?
Maybe those people just won't get as good grades, and that's acceptable. It is strange that the educational system determined it wasn't acceptable. If I go to a university and try to walk onto the NCAA Division 1 Basketball team, it's fine for them to tell me that I am too short, too slow, too weak, can't shoot, or that my performance anxiety means I mess up every game and I am off the team. If I try to go for Art but my art is bad, I am rejected. If I try to go for Music but my performance anxiety messes up my performances, then I am rejected.
Why ought there to be an exception for academics? Do you want your lawyer or surgeon to have performance anxiety? This seems like a perfectly acceptable thing to filter on.
Everything involves performing and actually proving what you know. If this is such an issue, then it's something you need to fix. I have never actually met anyone who has this “performance anxiety” where they are so brilliant but do poorly on tests because of it. I think it's a myth used to attack the rigor of academics. Knowledge workers eventually have to go into court, or perform surgery, or do the taxes, or give the presentation, or have the high-pressure meeting. If anxiety is truly debilitating to a person in all of these situations, they'll be doomed, so filter them out.
But suppose you think strictly in utilitarian terms: what effort should I invest for what $$$ return. I have two things to say to you:
First: what a meaningless life you're living.
Second: you realize that if you don't learn anything because you have LLMs, and I learn everything because it's interesting, when you and I are competing, I'll have LLMs as well...? We'll be using the same tools, but I'll be able to reason and you won't.
I think the people who struggle with the question "Why should I know anything?" aren't going to learn anything anyway. You need curiosity to learn, or at least to learn a lot and well, and if you have curiosity you're not asking why you should learn anything.
To play devil's advocate: In the future, "knowing things" might not really be a prerequisite for living a decent life. If you could just instantly look anything up that you need to know, then why would you need to know anything? I don't think it's a ridiculous question. As long as I can maintain basic literacy and an ability to form questions for an LLM, why really do I need knowledge? Maybe I don't find any intrinsic "life meaning" from knowledge. Maybe I don't care if it's interesting. Pragmatically, why should I be educated?
Why would I need to be able to lift 100kg? I'm never going to need to lift 100kg, and if I do need to, I'd just find a tool that will do it. My life isn't any less rich because I can't lift 100kg, and I can maintain basic body health without being able to lift weight from the ground.
Exactly. In the long term, I would argue that "interest" is always a bigger determining factor of professional success than innate "capability" in a field. An interested person can grow their competence over time, whereas a disinterested, yet capable person will mostly remain at a fixed level of competence.
Honestly, I feel like I have to know more and more these days, as the AIs have unlocked significantly more domains that I can impact. Everyone is contributing to every part of the stack in the tech world all of a sudden, and "I am not an expert on that piece of the system" is no longer a reasonable position.
This is in tech now, where we're the first adopters, but soon it will come to other fields.
To your broader question
> Something that I think many students, indeed many people, struggle with is the question "why should I know anything?"
You should know things because these AIs are wrong all the time, because if you want any control in your life you need to be able to make an educated guess at what is true and what isn't.
As to how to teach students. I think we're in an age of experimentation here. I like the idea of letting students use all tools available for the job. But I also agree that if you do give exams and hw, you better make them hand written/oral only.
Overall, I think education needs to focus more on building portfolios for students, and less on giving them grades.
> and "I am not an expert on that piece of the system" no longer is a reasonable position
Gosh that sounds horrifying. I am not an expert on that piece of system, no I do not want to take responsibility for whatever the LLMs have produced for that piece of system, I am not an expert and cannot verify it.
You didn't answer why the student should memorize anything, except the hand-waving "Memorization is a prerequisite for creativity".
Students had very good reason to question the education system when they were asked to memorize things that were safe to forget once they graduated from school. And when most functional adults admitted they forgot what they had learned in school. It was an issue before LLM, and triply so now.
By the way, I now 100% agree with "Memorization is a prerequisite for creativity." However, if you asked me to try to convince my 16-year-old self, I would throw my hands up.
I completely agree with you, and now that I am far away from being a student (and at the time, I vehemently hated any system that demanded memorization), I regretfully say that sometimes you just have to force young people to do things they don't want to do, for their own good.
But "enough to not get fired" is not an answer to a question "why should I know anything?". To be honest, it's not clear if the rest of your post tries to answer the initial question of why you should know anything or the implied question of how much should I really know.
The answer to "why should I know anything" is a value judgement that, if advertised in my top-level post, provides a great deal more rhetorical surface to disagree with or criticize. My main point is that regardless of why anyone wants to know anything, in the age of AI, if you want to produce students who actually know things, I recommend dropping the tech and returning to a more rigorous, in-person curriculum with a foundation of memorization.
Here, though, is my answer: an excellent long-term goal for any band of humans is to create, inhabit, and enjoy the greatest civilization possible, and the more each individual human knows about their reality, the easier it is to do that.
What I like about the approach in the article is that it confronts the "why should I know this?" question directly. By making students accountable for reasoning (even when tools are available), it exposes the difference between having access to information and having a mental model.
This is like the Indian education system and presumably other Asian ones. Homework counts for very little towards your grade. 90% of your grade comes from the midterms and the finals. All hand written, no notes, no calculators.
That’s a terrible indictment of society if true. People are so far from self-realization, so estranged from their natural curiosity, that there is no motivation to learn anything beyond what will get you fed and housed. How can anyone be okay with that? Because even most chronically alienated people have had glimpses of self-actualization, of curiosity, of intrinsic motivation; most have had times when they were inspired to use the intellectual and bodily gifts that nature has endowed them with.
But the response to that will be further beatings until morale improves.
What about technology professionals? From my biased reading of this site alone: both further beatings and pain relievers in the form of even more dulling and pacifying technology. Followed by misanthropic, thought-terminating cliches: well, people are inherently dumb/unmotivated/unworthy, so the topic is not really worth our genuine attention; furthermore, now with LLMs, we are seeing just how easy it is to mimic these lumps of meat; in fact they can act both better and more pathetically than human meat bags, just have to adjust the prompts...
People who aren't fed and employed generally struggle to be self actualized, right? First you need to work for your supper, then you can focus on learning for its own sake.
As more jobs started requiring degrees, the motivation has to change. If people can get food and housing to a comfortable extent without a degree again, then the type of person getting a degree will change again too.
If you let them, they'll alienate you until you have no free time and no space for rest or hobbies or learning. Labour movements had to work hard to prevent the 60 hour workweek, but we're creeping back away from 40, right?
I'm a university professor, and the number of students who seem to need an LLM as a crutch is growing exponentially.
We are still in a place where the oldest students did their first year completely without LLMs. But younger students have used LLMs throughout their studies, and I fear that in the future, we will see full generations of students completely incapable of working without LLM assistance.
Reading the article, it seemed to me that both the professor and the students were interested in the material being taught and therefore actively wanted to learn it, so using an LLM isn't the best tactic.
My feeling is that for many/most students, getting a great understanding of the course material isn't the primary goal, passing the course so they can get a good job is the primary goal. For this group using LLMs makes a lot of sense.
I know when I was a student doing a course I was not particularly interested in because my parents/school told me that was the right thing to do, if LLMs had been around, I absolutely would have used them :).
I think the professor here presented them with a "special" case which can not be generalized outside of the exam context.
If you're presented with the choice of "Don't use AI" and "Use AI, but live with the consequences" (consequences like mistakes being judged harsher when using AI than when not using AI), I do not think chatbots will be a desirable choice if you've properly prepared for the exam.
One helper here is fear. You can be failed for formal errors at university, which meant we were scared shitless of making them and paid close attention.
If people know "at university you can't use LLM, you are forced to think by yourself" they will adjust, albeit by trial of fire.
I think there's an argument that growing up in an educational system unable to teach you how not to rely on LLMs would, for all intents and purposes, permanently nerf you compared to more fortunate peers. Critical thought is a skill we continue to practice until the very end.
It will be very interesting to see what will happen when LLMs start charging users for their true cost. With many people priced out how would they cope?
It's not that expensive unless you run millions of tokens through an agent. For use cases where you actually read all the input and output by yourself (i.e. an actual conversation), it is insanely cheap.
Yeah in my last job, unsupervised dataset-scale transformations amounted to 97% of all spending. We were using gemini 2.5 flash in batch/prefill-caching mode in Vertex, and always the latest/brightest for ChatGPT-like conversations.
May happen, but I suspect not in the way implied by that question.
Hardware is still improving, though not as fast as it used to; it's very plausible that even the current largest open weights models will run on affordable PCs and laptops in 5 years, and high-end smartphones in 7.
I don't know how big the SOTA close-weights models are, that may come later.
But: to the extent that a model that runs on your phone can do your job, your employer will ask "why are we paying you so much?" and now you can't afford the phone.
Even if the SOTA is always running ahead of local models, Claude Code could cost 1500 times as much and still have the average American business asking "So why did we hire a junior? You say the juniors learn when we train them, I don't care, let some other company do that and we only hire mid-tier and up now."
(The threshold is less than 1500 elsewhere; I just happened to have recently seen the average US pay for junior-grade software developers, $85k*, which makes Claude Code roughly 350x cheaper, plus my own observation that its output is not only junior quality but also produced much faster than a junior's.)
* but also note while looking for a citation the search results made claims varying from $55k to $97.7k
Very few people fall behind at the moment due to lack of access to information. People in poor countries largely have access to the internet now. It doesn’t magically make people educated and economically prosperous.
You are arguing the converse. Access to information doesn't make people educated, but lack of access definitely puts people at a big disadvantage. Chatbots are not just information; they are tools, and using them needs training because they hallucinate.
Please, you don’t need to counter-narrative everything. Maybe talk about what the professor did here and why students didn’t trust the output in an exam context in this particular subject.
> Second, I learned that cheating, however lightly, is now considered a major crime. It might result in the student being banned from any university in the country for three years. Discussing exam with someone who has yet to pass it might be considered cheating. Students have very strict rules on their Discord.
This has also something to do with it. Hard to make very accurate conclusions.
> When one would finish her exam, she would come back to the room and tell all the remaining students what questions she had and how she solved them. We never considered that "cheating" and, as a professor, I always design my exams hoping that the good one (who usually choose to pass the exam early) will help the remaining crowd.
You are an outlier. When I was in school any outside assistance was tantamount to cheating and, unlike an actual crime, it was on the student to prove they were not cheating. Just the suspicion was enough to get you put in front of an honor board.
It was also pervasive. I would say 40% of international students were cheaters. When some were caught they fell back on cultural norms as their defense. The university never balked because those students, or their institutions, paid tuition in cash.
International students in graduate programs at US institutions are basically buying a degree, from what I've seen. The professors know they cheat and they don't really care. The students are paying a lot of money and they will get what they paid for.
> The professors know they cheat and they don't really care.
To throw another anecdote in the bucket, I know at least one professor who does not tolerate cheating from any of his students, regardless of cultural or national background, or how they're paying for their education.
I've seen, on multiple occasions, the professor's recommendations get overruled by the dean or university administration. If the school wants them there, they stay.
Andés Hess (RIP) gave an examination in 2006, in his Organic Chemistry course... which ended up with 35% of the class being reported to Vanderbilt's Honor Council.
He brilliantly tested students using open-ended, single-sentence questions (with half of the page blank to show your work)... which tested foundational topics and oozed with partial-credit opportunities. You then had an option to submit "test corrections" to explain why you should gain more points for your efforts (typically considered, when reasonable).
----
His first exam of the semester, there was a multi-step question which resulted in a single 1cm x 1cm box — worth 20% of the entire exam's scoring — for you to indicate whether that particular Grignard reaction resulted in a single-, double-, or triple- bond.
The majority of the class answered (incorrectly) that it would be a double-bond, by writing a `=` into the blank box. In fact, that reaction resulted in a triple-bond `≡`
35% of the class ended up just adding the third parallel line (i.e. changing what they had originally answered) when handing in their test corrections. Dr. Hess had made photocopies of all the penciled exams... and reported all the cheaters.
----
I answered it correctly, originally, so was never tempted to fib a similar mistake — but this definitely opened my eyes in reinforcement of not cheating. I eventually got into medical school, and most of that 35% of branded "cheaters" did not. Ultimately I never became a physician, but remember the temptations to cheat like everybody else did. I am happier/poorer because...
>40% of international students were cheaters. When some were caught they fell back on cultural norms as their defense. The university never balked because those students, or their institutions, paid tuition in cash.
Twenty years ago, at Vanderbilt, this would have been an understatement, particularly among non-citizen Asians.
I remember in organic chemistry an instructor attempted to re-give the same examination ("because ya'll did so terrible") and it was struck down by a dean as not allowable simply because the Honor Code was to be invoked that nobody/groups would share answers (yeah sure okay).
The minority following the Honor Code ended up getting into lesser graduate schools (e.g. myself) — because most courses didn't curve and VU didn't give out A+ as a grade. I have specifically not mentioned the specific country which cheated most-blatantly... but everybody from back then knew/knows.
> 3. I allow students to discuss among themselves [during an exam] if it is on topic.
Makes me wonder if they should also get a diploma together then, saying "may not have the tested knowledge if not accompanied by $other_student"
I know of some companies that support hiring people as a team (either all or none get hired and they're meant to then work together well), so it wouldn't necessarily be a problem if they wish to be a team like that
The main strategy is collaboration. If you are smart enough to:
1. Identify your problem
2. Ask someone about it
3. Get an answer which improves your understanding
Then you are doing pretty good by all standards
Another trick I sometimes use: I take one student who has a hard time articulating a concept, and a second student who doesn't understand that concept. I say to student 1: "You have 20 minutes to teach student 2 the concept. When I come back, you will be graded according to his answers."
(I, of course, don't grade only that. But it forces both of them to make an extra effort, student 2 not wanting to be the cause of student 1's demise.)
Quite a thoughtful way to adapt exams to the wave of new tools for students and learn along the way.
I wish other universities adapted so quickly too (and had such a mindful attitude to students, e.g. trying to understand them, being upfront with expectations, learning from students, etc.).
The majority of professors are stressed and treat students as idiots... at least that was the case a decade ago!
OP here: The majority of professors became professors because they were very good at passing standard exams (and, TBH, some are not good at anything else).
I'm different because I was a bad student. I only managed to get my diploma with the minimal grade, always rebelling against everything. But some good people at my university thought that Open Source was really important and that they needed someone with a good career in that field. I was that person (and I'm really thankful they offered me that position).
> The majority of professors became professors because they were very good at passing standard exams (and, TBH, some are not good at anything else).
Is this a French thing? In the US we don't have standardized exams to become a college professor. Instead, we need to do original research and publish.
> I realized that my students are so afraid of cheating that they mostly don’t collaborate before their exams! At least not as much as what we were doing.
This is radically different from the world that's been described to me. Even 20 years ago cheating was endemic and I've only heard of it getting worse.
I teach at MSc level, and over the last couple of years about 15% of my students have cheated. Really obviously cheated, like two students submitting 100% byte-identical answers.
Wow. I would love to have had this teacher :-) Everything about this setup seems so thoughtful. Giving students both agency / freedom of choice and responsibility. And if they choose more power (llms), they automatically have more responsibility (having to explain the mistakes of llms). And this:
> It took me 20 years after university to learn what I know today about computers. And I’ve only one reason to be there in front of you: be sure you are faster than me. Be sure that you do it better and deeper than I did. If you don’t manage to outsmart me, I will have failed.
What a wonderful teacher! I wish all teachers were like him.
Regarding the collaboration before the exam, it's really strange. In our generation, asking or exchanging questions was perfectly normal. I got an almost perfect score in physics thanks to that. I guess the elegant solution was still in me, but I might not have been able to come up with it in such a stressful situation. 'Almost' because the professor deducted one point from my score for being absent too often :)
However, oral exams in Europe are quite different from those at US universities. In an oral exam, the professor can interact with the student to see if they truly understand the subject, regardless of the written text. Allowing a chatbot during a written exam today would be defying the very purpose of the exam.
Very interesting write-up. I would be curious to know more about what an Open Source Strategies course entails; as far as I can remember, I never had anything like that on offer at my university.
A similar Outlook scandal happened in Norway. I believe both UiO and NTNU, the biggest in humanities and engineering respectively, used to have great internal email services. Huge protests from competent personnel were ignored. Now it's all Outlook. Microsoft are excessively good at convincing non-technical stakeholders that their own staff are idiots, and that Microsoft is the solution.
I have corrected exams and graded assignments as an external party before (legal requirement). The biggest problem with LLMs I see is that the weak students copy-paste commands with unnecessary command line switches. But they would have done the same with stack overflow.
Some also say they use an LLM to help improve their writing, but that's where the learning is, so why? I think it's anxiety about failing; they don't seem to understand that I won't fail them as long as their incoherent text proves they understood what they were doing.
Having graduated and knowing how things ought to look, taking exams is much less scary now, because I'm confident I would only be failed for being incompetent, not for not writing properly. Not all students have that privilege; they gain it over time.
It does help that computer science assignments and papers are pretty damn standard in form.
Only 2 students actually used an LLM in his exam, one well and one poorly, so I'm not sure there is much you can draw from this experience.
In my experience LLMs can significantly speed up the process of solving exam questions. They can surface relevant material I don't know about, they remember how other similar problems are solved a lot better than I can, and they can check for mistakes in my answer. Yes, when you get into very niche areas they start to fail (and often in a misleading way), but if you run through practice papers at all you can tell this and either avoid using the LLM or do some fine-tuning on past papers.
> Second, I learned that cheating, however lightly, is now considered a major crime. It might result in the student being banned from any university in the country for three years. Discussing exam with someone who has yet to pass it might be considered cheating. Students have very strict rules on their Discord.
> I was completely flabbergasted because, to me, discussing "What questions did you have?" was always part of the collaboration between students.
I suspect that lots of intelligent and diligent students hate our new world of AI because they probably find it more likely now that they could be accused of and disciplined for something they didn't do.
He mostly describes a process where the exam itself, or rather testing the knowledge of the student, is not so important.
I don't think all exams can work like that. In some cases you simply have to test someone's knowledge of a specific topic, and knowing facts is a very, very easy way to test this. I would agree that focusing purely on facts is overrated these days, but I would still argue it is not a useless metric. So when the author describes "bring your own exam questions", it really means that the exam itself is not so relevant, which is fine - but saying that university exams are now useless in the age of auto-solving chatbots is simply wrong. It just means that this particular exam is not important; that in itself does not automatically mean that ALL exams or exam styles are useless. It also depends on what you test. Take math problems: yes, chatbots can solve them, but can the student solve the same problems without a chatbot? How about practical skills? OK, 3D printing will dominate, but the ability to craft something with your own hands is still a skill that may be useful, at least to some extent.
I feel the whole discussion about chatbots gets dumbed down a lot. Skills have not become irrelevant just because chatbots exist.
Interesting write-up! I've wondered how university exams can be run effectively nowadays. I took my degree in CS almost 20 years ago, and as a user of LLMs I can't really see how any of my old exams would work today if students were allowed LLMs.
I graduated 15 years ago, and I think the exams in my degree were actually the most LLM-proof part of the student assessment. They were no-aid written exams with pencils and paper, whereas the assignments were online-submitted code only that an LLM could easily write.
This comes down not to the "smartness" of LLMs but to the reality that we don't even want anything novel in these exercises or exams. The same areas are repeated many times, so naturally there is a lot of this in the training data.
This is one area where LLMs really should excel. That doesn't mean students shouldn't also learn the material and be able to solve the same problems themselves, which is the real dilemma for the school system...
In my experience, reading a solution and even understanding it doesn't go very far in teaching you how to do something. I can look at calculus solutions all day, but only when I actually try to solve them myself do I run into all kinds of roadblocks, which is where the real learning happens.
You're right, but learning can take place when you need it. There is no real advantage to learning something ahead of time. The bottleneck is having awareness of what is out there to learn. You can't learn about what you don't know exists. Looking at calculus solutions all day should give a sense of what calculus can be used for, so that it is in your back pocket when the time you need it comes.
Well, at least it used to be the bottleneck. Nowadays you can just ask an LLM. For all their faults, they are really good at letting you know what tools exist out there in the world, surfacing more than you could ever come to know about even if all you did was read about what exists all day, every day.
I believe to count as an expert on something you need to have a ready compendium of knowledge ready to go. It becomes very hard to tackle problems or gain deep insights if you don't already have the kind of knowledge that people who have thought deeply about a particular space carry with them. Maybe when we have supremely reliable LLMs that can replace humans we might not need that, but we're not there yet.
> I believe to count as an expert on something you need to have a ready compendium of knowledge ready to go.
You are certainly headed in the right direction, but not quite. To be seen as an expert in the eyes of others you need to have had a vision for something and to have successfully executed on it. If the vision was dependent on calculus, then you will have reached a point where you had to learn something about calculus, of course...
But that's different to having a taskmaster tell you to learn calculus for no apparent reason. Even if you follow through and build up a huge wealth of knowledge from it, you would still not be deemed an expert by others. You're no different than an encyclopedia, which isn't an expert either. It is being able to see things others can't, and the ability to act upon it, that makes an expert.
Learning taking place when you need it isn't the same as never.
> Maybe when we have supremely reliable LLMs that can replace humans we might not need that, but we're not there yet.
Frankly, even PageRank already replaced humans for this. But LLMs are even better at it. Humans just perform that poorly. Like I said before, even someone doing nothing in life but looking for what exists in the world could not take in as much as databases that have indexed every written thing.
Calculus might not be the best analogy for my point since it's pretty fundamental. When I think of an expert I think of an accomplished mathematician or a chemist, someone that can build on existing knowledge to provide new breakthroughs. You can ask an LLM for a particular formulation but you cannot make wide spanning connections to come up with something novel until you have a good understanding of a given space. Not all progress is a series of iterative problems and tasks that need to be solved; in fact, a lot of breakthroughs come from making disparate connections.
> When I think of an expert I think of an accomplished mathematician or a chemist, someone that can build on existing knowledge to provide new breakthroughs.
I think we're on the same page here. Experts have both vision and execution. Someone who has simply learned a bunch of things, or a lot about one thing, is not what we consider an expert.
> You can ask an LLM for a particular formulation but you cannot make wide spanning connections to come up with something novel until you have a good understanding of a given space.
I don't get where you are trying to go with this. Using an LLM (or PageRank, for that matter) to search for tools that have been created or discovered, and that are necessary to execute your vision, seems to have nothing to do with what you are trying to say. Nobody would ask an LLM to do what you are suggesting, if I am understanding you correctly. LLMs are most definitely not good at that. That is AGI territory.
I'm back in school part time for a bachelor's, and recently had a class with a professor who really understood how to incorporate LLMs into the class.
Our written assignments were a lot of "have an LLM generate a business proposal, then annotate it yourself."
The final exam was a 30-minute meeting where we just talked as peers, kind of like a cultural job interview. Sure, there's lots of potential for bias there, but I think it's better than just blindly passing students who use LLMs for the final exam.
This professor's capacity for empathy and compassion for his students is off the charts. You can tell he puts a lot of thought and effort into helping his students learn.
Bravo.
Also, the take on AI is a stark contrast to much of what we've seen from other educators.
I find it odd that the professor requires the students to demonstrate why the output of the LLM is correct, while not having the same requirement for a Google search. Even pre-AI, using Google required critical thinking skills because the content was driven by SEO. If someone could make money by giving you misinformation, it was pushed up in the rankings.
I took a university class in my last year that had the same "choice". Any time a professor starts talking about "AI" it tends to be worrisome, in my experience.
If anything, this reinforces the idea that chatbots don't fundamentally change education... they just amplify whatever incentives and structures already exist
I am not so sure about that. I think the difference this time is that AI usage is fairly mainstream. People are using it for all kinds of things, including their studies. Using it to quickly get done with school work is a no-brainer (pun intended).
If what you are trying to test is whether the student will be able to give good answers to those and similar questions in the future, then it's fine if they use AI to answer exam questions. Because in the future, when they are confronted with these and similar questions, they will have access to AI at least as good as what they have today.
And if you want to test something other than whether the student can provide an answer to your question, then why ask that question? Give a task that shows what you care about.
90% weight to in-person exams without technology. 10% to quizzes or homework. You can't trust anything done outside of the classroom to accurately show competency. Problem solved.
As a professor who had to run Subversion for students (a bit before Git, et al), it's a nightmare to put the infrastructure together, keep it reliable under spiky loads (there is always a crush at the deadline), be customer support for students who manage to do something weird or lose their password, etc. You wind up spending a non-trivial amount of time being sysadmin for the class on top of your teaching duties. Being able to say "Put it on GitHub" short circuits all of that. It sucks, but it makes life a huge amount easier for the professor.
From the students' point of view, sure, it sucks that nobody mentioned that Git could be used independently (or jj or Mercurial or ...). However, GitHub is going to be better than what 99.9% of all professors will put together or be able to use. Sure, you can use Git by itself, but then it needs to go somewhere that the professor can look at it, get submitted to automated testing, etc. That's not a trivial step. My students were happy that I had the "Best Homework Submission System" (said about Subversion, of all things ...) because everybody else used the dumbass university enterprise thing that was completely useless (not going to mention its name because it deserves to die in the blazing fires of the lowest circle of Hell). However, it wasn't straightforward for me to put that together. And the probability of getting a professor with my motivation and skill is pretty low.
Agree about the possibility of an infra nightmare, especially in the "SVN era" -- but in 2026, it's pretty straightforward to run a GitLab instance (takes about an hour to set up, most of which is DNS and TLS stuff, ime) for a course and set up actions, or to use other submission infra like CMU Autolab. I do this.
Agree with your comment about probability, motivation, and skill.
> I was completely flabbergasted because, to me, discussing "What questions did you have?" was always part of the collaboration between students
When I was a student, professors maintained a public archive of past exams. The reason was obvious: next time the questions would be different, and memorizing past answers wouldn't help you if you didn't understand the core ideas being taught. Then I took part in an exchange program, went to some shit-tier uni, and realized that collaboration was explicitly forbidden because professors would usually ask questions along the lines of "what was on slide 54". My favorite part was when a professor said "I can't publish the slides online because they're stolen from another professor, but you can buy them in the faculty's shop".
My uni maintained a giant presence on Facebook - we'd share a lot of information, and the most popular group was "easy courses" for students who wanted to graduate but couldn't afford a difficult elective course.
The exchange uni had none of that. Literally no community, no collaboration, nothing. It's astonishing.
BTW regarding the stream of consciousness - I distinctly remember taking an exam and doing my best to force my brain to think about the exam questions, rather than porn I had been watching the previous day.
> Mistakes made by chatbots will be considered more important than honest human mistakes, resulting in the loss of more points.
>I thought this was fair. You can use chatbots, but you will be held accountable for it.
So you're actually held more accountable for the output? I'd be interested in how many students would choose to use LLMs if their faults weren't penalized more.
I thought this part especially was quite ingenious.
If you have this great resource available to you (an LLM) you better show that you read and checked its output. If there's something in the LLM output you do not understand or check to be true, you better remove it.
If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote this. If there's something flawed in an LLM answer, the likelihood that you do not have any justification except for "the LLM said so" is quite high and should thus be penalized higher.
One shows a misunderstanding, the other doesn't necessarily show any understanding at all.
>If you have this great resource available to you (an LLM) you better show that you read and checked its output. If there's something in the LLM output you do not understand or check to be true, you better remove it.
You could say the same about what people find on the web, yet LLMs are penalized more than web search.
>If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote this. If there's something flawed in an LLM answer, the likelihood that you do not have any justification except for "the LLM said so" is quite high and should thus be penalized higher.
Swap "LLMs" for "websites" and you could say the exact same thing.
The author has this in their conclusions:
>One clear conclusion is that the vast majority of students do not trust chatbots. If they are explicitly made accountable for what a chatbot says, they immediately choose not to use it at all.
This is not true. What is true is that if students are held more accountable for their use of LLMs than for their use of websites, they prefer using websites. What does "more" mean here? We have no idea; the author doesn't say. It could be that an error from a website or your own mind is -1 point and an error from an LLM is -2, so LLMs have to make half as many mistakes as websites and your own mind. It could be -1 and -1.25. It could be -1 and -10.
The author even says themselves:
>In retrospect, my instructions were probably too harsh and discouraged some students from using chatbots.
But they don't note the bias they introduced against LLMs with their grading scheme.
Rare here: well written and insightful; I would take this course. I'm curious about why he penalized chatbot mistakes more. At first glance it sounds like it just discourages their use, but the whole setup indicates a genuine desire to let it be a possibility. In my mind the rule should be "same penalty, and extra super cookies for catching chatbot mistakes".
I wrote this before to another comment like yours:
I thought this part of penalizing mistakes made with the help of LLMs more was quite ingenious.
If you have this great resource available to you (an LLM) you better show that you read and checked its output. If there's something in the LLM output you do not understand or check to be true, you better remove it.
If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote this. If there's something flawed in an LLM answer, the likelihood that you do not have any justification except for "the LLM said so" is quite high and should thus be penalized higher.
One shows a misunderstanding, the other doesn't necessarily show any understanding at all.
Here is my guess:
Usually marks are given for partially correct answers, partly to be less punishing toward human error, whether caused by stress or other factors; there's a good chance the student understood the topic. If instead they are using a chatbot but didn't catch the mistake themselves, it's an indication of less understanding and is marked accordingly.
> The third chatbot-using student had a very complex setup where he would use one LLM, then ask another unrelated LLM for confirmation. He had walls of text that were barely readable. When glancing at his screen, I immediately spotted a mistake (a chatbot explaining that "Sepia Search is a compass for the whole Fediverse"). I asked if he understood the problem with that specific sentence. He did not. Then I asked him questions for which I had seen the solution printed in his LLM output. He could not answer even though he had the answer on his screen.
Is it possible, and this is an interesting one to me, that this is the smartest kid in the class? I think maybe.
That guy who is playing with the latest tech, forcing it to do the job (badly), and couldn't care less about university or the course he's on. There's a time and a place where that guy is the one you want working for you. Maybe he's not the number 1 student, but I think there should be some room for this to be the Chaotic Neutral pick.
I don't understand.
10 years ago, we wrote exams by hand with whatever we understood (in our heads.)
No colleagues, no laptops, no internet, no LLMs.
This approach still works, why do something else? Unless you're specifically testing a student's ability to Google, they don't need access to it.
I am returning to this model in my classes: pen in paper quizzes, no digital devices. I also do seven equally weighted quizzes to deescalate them individually. I have reduced project/programming weight from 60-80% of my grade to 50% because it is not possible to tell if the students actually did the work.
I am also doing the same. 50% for project work and 50% for individual work, including paper and pen exams with no digital devices allowed.
The days of take home exams and coding lab assignments are gone...
is "individual work" the pen and paper?
Mostly. 50% for midterm and final, plus 10% of the 50% project work is individual contributions to the project to account for varying interest/contributions to the project work.
For a project I'm not so sure banning LLMs is actually the right approach.
Industry is full of people trying to use them to become more productive.
Why wouldn't you let students use the same tools?
Seems like you need to make the projects much harder.
But the problem is, students need to learn to do the easy things themselves before they can do the hard things with LLMs.
If you ask them to build a web browser when they can't do a hello world on their own, it's going to be a disaster. LLMs are like dumb juniors that you command, but students are less skilled than dumb juniors when they start programming classes..
Why should my 5 year old learn anything if he can just ask chatGPT?
Using chatGPT as a professional is different than using it for homework. Homework and school teaches you many things, not only the subject. You discover how you learn, what your interests are, etc.
ChatGPT can assist with learning also but SHOULD NOT be doing any of the work for the student. It is okay to ask "can you explain big O", then answer follow up questions. However, "give me method to reverse a string" will only hurt.
Depends. Does the college want to graduate computer scientists or LLM operators?
More importantly, does the student want to be a computer scientist or a LLM operator? If they think the future belongs to LLM operators (not a bet I'd recommend), college might not even be the right path for them versus a trade school / bootcamp.
It's like learning to factor polynomials even though a computer algebra system on a graphing calculator can do that.
Do you think children should still be expected to be able to do arithmetic by hand?
I think the answer maybe comes down to figuring out exactly what the goal of school is. Are you trying to educate people or train them? There is for sure a lot of overlap, but I think there's a pretty clear distinction and I definitely favor the education side. On the job, a person with a solid education should be able to use whatever language or framework they need with very little training required.
Don’t compare LLMs to calculators. Only one of those is deterministic.
We're trying to evaluate the student not the LLM. You need to tease apart their contributions. Isn't this obvious?
At the university, you are supposed to learn the foundational knowledge. If you let the LLM do the work, you are simply not learning. There are no shortcuts.
And learning how to use LLMs is pathetically easy. Really.
> Industry is full of people trying to use them to become more productive
The goal of learning is not to be as productive as possible. You need to learn the material so you can check and fix the output of an LLM.
> Why wouldn't you let students use the same tools?
Students should use those tools, after they've learned how to do it without them.
> Seems like you need to make the projects much harder.
The goal of a project should be to learn the material with hands on experience.
I will always be grateful to my school teachers that forced me to learn arithmetic and not rely on the calculators we all carry in our pockets.
Open book exams are not a new thing and I've often had them for STEM disciplines (maths and biology). Depending on the subject, you will often fail those unless you had a good prior understanding of the material.
If you can pass an exam just by googling something, it means you're just testing rote-memorization rather, and maybe a better design is needed where synthesis and critical thinking skills are evaluated more actively.
Open book, sure. But you don't even need a computer for that.
I make a point of only using references that are either available for free online or through our university’s library subscriptions. These are all electronic. My open book exam became an open computer exam when I realized students were printing hundreds of pages just for a 3-hour exam. This semester I’m switching to no-computer, bring your own printed cheat-sheet for the exam.
And even if you are allowed to use a computer, you cannot use the internet (and it should not be hard to prevent that).
I had a Continuous and Discrete Systems class that allowed open everything during exams. You could google whatever you wanted but the exam was so lengthy that if you had to google something, you really did not have much time to do it and would definitely not have enough time to do it a second time. I would load up a PDF of the chapters and lectures I needed and my homeworks for that unit with everything properly labeled. It was much faster looking for a similar problem you already did in the homework than trying to find the answer online.
Local LLMs
Be sure to bring an extra power strip for all your plugs and adaptors.
https://www.tomshardware.com/pc-components/gpus/tiny-corp-su...
My laptop runs gpt-oss 120B with none of that. Don't know how long though. I suspect a couple of hours continuous.
Which laptop?
ROG Flow Z13 with maxxed out RAM.
Nice laptop. I love my current laptop in general, but it is lagging in performance.
how is anyone going to be able to take a test with all of the noise from that fan as it cranks through tokens?
You can make it as slow as you want. At half TDP it is silent.
Offer to make everyone espresso and macchiato with your GPU cooling module. They won't be able to hear the fan over the grinder and pump and milk foamer!
Except that a physical book isn't the way people look up facts these days.
The purpose of an open book test is not having to know all the facts (formulas) but proving you know how to find them and how to apply them. (Finding is part of it: the more you have to look up, the less time you have left to use it, so there is an optimization problem of which things to remember and which to look up.)
In modern times you wouldn't look those up in a book, so other research techniques are required to deal with real life (which advanced certifications should test).
Going to university isn't how people learn these days, so there is already a real-world disconnect, fundamentally. But that's okay as it isn't intended to be a reflection of the real world.
> Going to university isn't how people learn these days
That's a surprising statement that doesn't ring true to me; what are you basing that off of / citing?
> what are you basing that off of
Observation? Children show clear signs of learning before they even make it through their first year out of the womb. Man, most people don't even consider university as an option until they are around 17-18 years of age, after they have already learned the vast majority of the things they will learn in life.
Data? Only 7-8% of the population have a university degree. Obviously you could learn in university without graduating, and unfortunately participation data is much harder to come by, but there is no evidence to suggest that the non-completion rate is anywhere near high enough to think that even a majority of the population have set foot in a university, even if only for one day. If we go as far as to assume a 50% dropout rate, that is still no more than 16% of the population. Little more than a rounding error.
Nothing? It's a random comment on the internet. It is not necessarily based on anything. Fundamentally, comments are only ever written for the enjoyment of writing. One trying to derive anything more from it has a misunderstanding of the world around them. I suppose you have a point that, for those who struggle to see the obvious, a university education would teach the critical thinking necessary to recognize the same. But, the fact that we are here echoes that university isn't how people learn these days.
> citing
Citing...? Like, as in quoting a passage? I can find no reason why I would want to repeat what someone else has written about. Whatever gives you enjoyment, but that seems like a pointless waste of time. It is already right there. You must be trying to say something else by this? I, unfortunately, am not in tune with your pet definition.
> This approach still works, why do something else?
One issue is that the time provided to mark each piece of work continues to decrease. Sometimes you are only getting 15 minutes for 20 pages, and management believe that you can mark back-to-back from 9-5 with a half hour lunch. The only thing keeping people sane is the students that fail to submit, or submit something obviously sub-par. So where possible, even for designing exams, you try to limit text altogether. Multiple choice, drawing lines, a basic diagram, a calculation, etc.
Some students have terrible handwriting. I wouldn't be against the use of a dumb terminal in an exam room/hall. Maybe in the background it could be syncing the text and backing it up.
> Unless you're specifically testing a student's ability to Google, they don't need access to it.
I've been the person testing students, and I don't always remember everything. Sometimes it is good enough for the students to demonstrate that they understand the topic enough to know where to find the correct information based on a good intuition.
I want to echo this.
Your blue book is being graded by a stressed out and very underpaid grad student with many better things to do. They're looking for keywords to count up, that's it. The PI gave them the list of keywords, the rubric. Any flourishes, turns of phrase, novel takes, those don't matter to your grader at 11 pm after the 20th blue book that night.
Yeah sure, that's not your school, but that is the reality of ~50% of US undergrads.
Very effective multiple choice tests can be given that require work to be done before selecting an answer, so they can be machine graded. Not ideal in every case, but a very high-quality test can be made multiple choice for hard science subjects.
True! Good point!
But again, the test creator matters a lot here too. Making such an exam is quite the labor, especially as many/most PIs have other, better things to do. Their incentives are grant money, then papers, then in a distant third their grad students, and finally undergrad teaching. Many departments are explicit about this. Spending the limited time on a good undergrad multiple choice exam is not in the PI's best interest.
Which is why, in this case of a good Scantron exam, they're likely to just farm it out to Claude. Cheap, easy, fast, good enough. A winner in all dimensions.
Also, as an aside to the above, an AI with OCR for your blue book would likely be the best realistic grader too. Needs less coffee, after all.
> Very effective multiple choice tests can be given that require work to be done before selecting an answer, so they can be machine graded.
As someone who has been part of the production of quite a few high stakes MC tests, I agree with this.
That said, a professor would need to work with a professional test developer to make an MC test that is consistently good, valid, and reliable.
Some universities have test dev folks as support, but many/most/all of them are not particularly good at developing high quality MC tests imho.
So, for anyone in a spot to do this, start test dev very early, ideally create an item bank that is constantly growing and being refined, and ideally have some problem types that can be varied from year-to-year with heuristics for keys and distractors that will allow for items to be iterated on over the years while still maintaining their validity. Also, consider removing outliers from the scoring pool, but also make sure to tell students to focus on answering all questions rather than spinning their wheels on one so that naturally persistent examinees are less likely to be punished by poor item writing.
Pros and cons. Multiple choice can be frustrating for students because it's all or nothing: spend 10+ minutes on a question, make a small calculation error, and end up with a zero. It's not a great format for a lot of questions.
They're also susceptible to old-school cheating - sharing answers. When I was in college, multiple choice exams were almost extinct because students would form groups and collect/share answers over the years.
You can solve that but it's a combinatorial explosion.
A long time ago, when I handed out exams, I used to program each exam question into a generator that produced not-entirely-identical questions for each student (typically, only the numeric values changed), along with the matching answers for whoever was in charge of assessing.
That was a bit time-consuming, of course.
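For anyone curious, here is a minimal sketch in Python of that kind of generator (not the commenter's actual tool); the roster, the question template, and the use of the student name as the random seed are assumptions for illustration:

    import random

    STUDENTS = ["alice", "bob", "carol"]  # hypothetical roster

    def make_variant(student_id: str):
        # Seed with the student ID so the exact same variant (and its answer)
        # can be regenerated later for whoever is assessing.
        rng = random.Random(student_id)
        a = rng.randint(2, 9)
        b = rng.randint(10, 99)
        question = f"Compute the remainder of {b}^{a} when divided by 7."
        answer = pow(b, a, 7)  # matching answer for the grader
        return question, answer

    for sid in STUDENTS:
        q, ans = make_variant(sid)
        print(f"{sid}: {q}  [answer key: {ans}]")

Seeding per student also keeps the combinatorial explosion manageable: nothing needs to be stored, since every variant and its answer key can be re-derived on demand.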
This is what my differential equations exams were like almost 20 years ago. Honestly, as a student I considered them brutal (10 questions, no partial credit available at all) even though I'd always been good at math. I scraped by but I think something like 30% of students had to retake the class.
Now that I haven't been a student in a long time and (maybe crucially?) that I am friends with professors and in a relationship with one, I get it. I don't think it would be appropriate for a higher level course, but for a weed-out class where there's one Prof and maybe 2 TAs for every 80-100 students it makes sense.
For large classes or test questions used over multiple years, you need to take care that the answers are not shared. It means having large question banks which will be slowly collected. A good question can take a while to design, and it can be leaked very easily.
Scantron and a #2 pencil.
Stanford started doing 15-minute exams with ~12 questions to combat LLM use. OTOH I got final project feedback from them that was clearly written by an LLM :shrug:
> I got a final project feedback from them that was clearly done by an LLM
I've heard of this and have been offered "pre-prepared written feedback banks" for questions, but I write all of my feedback from scratch every time. I don't think students should have their work marked by an LLM or feedback given via an LLM.
An LLM could have a place in modern marking, though. A student submits a piece of work and you may have some high level questions:
1. Is this the work of an LLM?
2. Is this work replicated elsewhere?
3. Is there evidence of poor writing in this work?
4. Are there examples where the project is inconsistent or nonsensical?
And then the LLM could point to areas of interest for the marker to check. This wouldn't be to replace a full read, but would be the equivalent of passing a report to a colleague and saying "is there anything you think I missed here?".
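A minimal sketch of what such a screening pass might look like, assuming the OpenAI Python SDK; the model name, the prompt wording, and the exact output format are placeholders of mine, not anything the commenter specified:

    # An LLM-assisted screening pass for a marker, not a grader.
    # Assumes the OpenAI Python SDK with an API key in the environment;
    # the model name and prompts are illustrative placeholders.
    from openai import OpenAI

    SCREENING_QUESTIONS = [
        "Does any of this read as if it were written by an LLM? Point to passages.",
        "Is any of this work likely replicated from elsewhere?",
        "Where is the writing notably poor?",
        "Where is the project inconsistent or nonsensical?",
    ]

    def screen_submission(text: str) -> str:
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        prompt = (
            "You are helping a human marker decide where to look more closely.\n"
            "Answer each question with pointers to passages, not a grade.\n\n"
            + "\n".join(f"- {q}" for q in SCREENING_QUESTIONS)
            + "\n\nSubmission:\n" + text
        )
        response = client.chat.completions.create(
            model="gpt-4o",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

The marker still reads the full submission; the output is only a list of areas to re-check, in the spirit of "is there anything you think I missed here?".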
> Some students have terrible handwriting.
Then they should have points deducted for that. Effective communication of answers is part of any exam.
> Then they should have points deducted for that. Effective communication of answers is part of any exam.
Agreed. Then let me type my answers out like any reasonable person would do.
For reference…
For my last written blue book exam (in grad school) in the 90s, the professor insisted on blue books and handwriting.
I asked if I could type my answers or hand write my answers in the blue books and later type them out for her (with the blue book being the original source).
I told her point blank that my “clean” handwriting was produced at about a third of the speed that I can type, and that my legible chicken scratch was at about 80% of my typing rate. I hadn’t handwritten anything longer than a short note in over 5 years. She insisted that she could read any handwriting, and she wasn’t tech savvy enough to monitor any potential cheating in real time (which I think was accurate and fair).
I ended up writing my last sentence as the time ran out. I got an A+ on the exam and a comment about one of my answers being one of the best and most original that she had read. She also said that I would be allowed to type out my handwritten blue book tests if I took her other class.
All of this is to say that I would have been egregiously misgraded if “clean handwriting” had been a requirement. There is absolutely no reason to put this burden on people, especially as handwriting has become even less relevant since that exam I took in the 90s.
I personally don't believe that terrible handwriting should have any hold over a computer science student.
Doctors (medicine) get away with it.
> Then they should have points deducted for that. Effective communication of answers is part of any exam.
...even when it's for a medical reason?
I was in university around the same time. While there, I saw a concerted effort to push online courses. Professors would survey students, fishing for interest. It was unpopular. To me the motivation seemed clear: charge the same or more for tuition, but reduce opex. Maybe even admit more students and just have them be remote. It watered down the value of the degree while working towards a worse product. Why would a nonprofit public university be working on maximizing profit?
Online courses also increase admin overhead.
Universities aren't profit maximizing; they are admin maximizing. Admin are always looking to expand the admin budget. Professors, classrooms, and facilities all divert money away from admin, and they don't want to pay for those unless they have to.
This also applies to hospitals in the USA.
> Why would a nonprofit public university be working on maximizing profit?
Because 'nonprofit' is only in reference to the legal entity, not the profit-seeking people working there? There is still great incentive to increase profitability.
You're thinking of not-for-profit. Non-profits do not seek increased profitability in the same way since it's expected (mandated?) they don't have any.
I'm not thinking of either. The profit-seeking people looking to increase their profitability spoken of are neither non-profits nor not-for-profits.
So they can educate more students? Many university classes are lecture-only with 200+ students in the class and no direct interaction with profs. Those courses might as well be online.
One potential answer is that this tests more heavily for the ability to memorise, as opposed to understanding. My last exams were over ten years ago and I was always good at them because I have a good medium-term memory for names and numbers. But it's not clearly useful to test for this, as most names and numbers can just be looked up.
When I was studying at university there was a rumour that one of the dons had scraped through their fourth-year exams despite barely attending lectures, because he had a photographic memory and just so happened to leaf through a book containing a required proof, the night before the exam. That gave him enough points despite not necessarily understanding what he was writing.
Obviously very few students have that sort of memory, but it's not necessarily fair to give advantage to those like me who can simply remember things more easily.
Have you ever seen a programmer who really understands C going to Stack Overflow every time they have to use fopen()? Memorization is part of understanding. You cannot understand something without it being readily available in your head.
Right, and a lot of them probably got that understanding by going to stackoverflow every time they needed to use fopen() until they eventually didn’t need to anymore.
In the book days, I sometimes got to where I knew exactly where on a page I would find my answer without remembering what that answer was. Nowadays I remember the search query I used to find an answer without remembering what that answer was.
I wrote a long answer, but I realised that even advanced C users are unlikely to have memorised every possible value of errno and what they all mean when fopen fails. There's just no point, as you can easily look it up. You can understand that there is a maximum allowable number of open files without remembering what exact value errno will have in that case.
Yes, I have. I do it too; even some basic functions I would look up on SO.
You really just need to know that there's a way to open files in C.
I don't think you can reach any sort of scale of breadth or depth if you try to memorize things. Programmers have to glue together a million things, it's just not realistic for them to know all the details of all of them.
It's unfortunate for the guy who has memorized all of K&R, but we have tools now to bring us these details based on some keywords, and we should use them.
I still look up PHP builtins often because they're so inconsistent. What comes first, the needle or the haystack?
> because he had a photographic memory and just so happened to leaf through a book containing a required proof
It makes for good rumours and TV show plots, but this sort of "photographic memory" has never been shown to actually exist.
Huh, TIL [0]. Thanks. There are people who can perform extraordinary memory feats, but they're very rare and/or self-trained.
[0] https://skeptoid.com/episodes/542
I dunno, I went to a high school reunion last year, and a dude seemed to know people's phone numbers from 30 years ago.
If he could remember that sort of thing, I can believe there are people who can remember the steps of a proof, which is a much less random thing that you can feel your way around, given a few cues from memory.
Plus, realistically, how closely does an examiner read a proof? They have a stack of dozens of almost the same thing, I bet they get pretty tired of it and use a heuristic.
I think many people who grew up before cell phones remember phone numbers from the past. I just thought about it and can list the phone numbers of 3 houses that were on my childhood street in the early 2000s + another 5 that were friends in the area. I remember at least a handful of cell phone numbers from the mid to late 2000s as friends started to get those; some of them are still current. On the other hand, I don't know the number of anyone I've met in the last 15 years besides my wife, and haven't tried to.
>His photographic memory manifested itself early — he would amuse his parents’ friends by instantly memorizing pages of phone books on command.
https://medium.com/young-spurs/the-unsung-genius-of-john-von...
When I was in university, in my program, the most common format was that you were allowed to bring in a single page of notes (which you prepared ahead of time based on your understanding of what topics were likely to come up). That seemed to work fine for everyone.
I have a colleague who does that.
My students then often ask me to do the same, to permit them to bring one page of notes as he does.
Then I would say: just assume you're writing the exam with him and prepare your one-pager of notes; optimize your notes by copying and re-writing them a few times. Now, the only difference between my exam and his exam is that the night before, you memorize your one-pager (if you re-wrote it a few times, you should be able to recreate it from memory from that practice alone).
I believe having had all material in your memory at the same time, at least once for a short while, gives students higher self-confidence; they may forget stuff again, but they hopefully remember the feeling of mastering it.
I teach at MSc level. My students are scattered around the country and the world, which makes hand-written exams tricky. Luckily, the nature of the questions they are asked to solve in the essay I give them following their coursework is such that chatbots produce appallingly bad submissions.
This is great. Do you have advice on making questions that LLMs are bad at answering?
In my case, I set some course-work, where they have to log in to a Linux server in the university and process a load of data, get the results, and then write the essay about the process. Because the LLM hasn't been able to log in and see the data or indeed the results, it doesn't have a clue what it's meant to talk about.
They don't work very well with large numbers. Try asking Claude to find the prime factors of 83521.
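For what it's worth, that particular answer is trivial to check without an LLM; 83521 is 17^4, and a few lines of trial division confirm it:

    # Deterministic check of the kind of arithmetic an LLM may fumble:
    # trial-division factorization of 83521 (which is 17**4).
    def prime_factors(n: int) -> list[int]:
        factors = []
        d = 2
        while d * d <= n:
            while n % d == 0:
                factors.append(d)
                n //= d
            d += 1
        if n > 1:
            factors.append(n)
        return factors

    print(prime_factors(83521))  # [17, 17, 17, 17]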
For most of the low-hanging fruit it's as easy as copy-pasting the question into multiple LLMs and logging the output.
Do it again from a different IP or two.
There will be some pretty obvious patterns in the responses. The smart kids will do minor prompt engineering ("explain like you're Peter Griffin from Family Guy" or whatever), but even then there will be some core similarities; a rough sketch of that comparison follows below.
Or follow the example of someone here and post a question with hidden characters that will show up differently when copy-pasted.
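Purely as an illustration of the comparison idea, here is a sketch that flags submissions resembling answers previously logged from LLMs for the same question; the similarity threshold and the sample strings are arbitrary assumptions, not recommended values:

    # Rough heuristic, not a rigorous detector: compare each submission
    # against answers logged from several LLMs for the same question.
    from difflib import SequenceMatcher

    logged_llm_answers = [
        "Sepia Search is a compass for the whole Fediverse, indexing videos...",
    ]

    submissions = {
        "student_42": "Sepia Search is a compass for the whole fediverse, indexing videos",
        "student_17": "PeerTube instances can be queried through an external search index.",
    }

    def similarity(a: str, b: str) -> float:
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    for student, text in submissions.items():
        score = max(similarity(text, ans) for ans in logged_llm_answers)
        if score > 0.8:  # arbitrary cutoff
            print(f"{student}: suspiciously close to a logged LLM answer ({score:.2f})")

Anything this flags would of course still need a human look; near-identical phrasing is a hint, not proof.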
I don't know what you majored in, but when I was a CS major maybe 50% of my grade came from projects. We wrote a compiler from scratch, wrote something that resembled a SQL engine from scratch, and wrote sizeable portions of an operating system. In my sophomore year we spent at least 20 hours a week on various projects.
We could use any resource we could find as long as we didn't submit anything we didn't write ourselves. This meant Stack Overflow and online documentation.
There is no way you can test a student's ability to implement a large, complex system with thousands of lines of code in a three hour exam. There is just no way. I am not against closed book paper exams, I just wish the people touting them as the solution can be more realistic about what they can and cannot do.
I had some take-home exams in physics where you could use the internet, books, anything except other people (though that was honor-code based). Those were some of the hardest exams I ever took in my life: pages and pages of mathematical derivations. An LLM, given how good they can be at constructing mathematics, would actually have handled those pretty well.
> 10 years ago, we wrote exams by hand with whatever we understood (in our heads.)
You did, but the best exam I had was open book bring anything. 25 and some change years ago even.
I've also had another professor do the "you can bring one A4 sheet with whatever notes you want to make on it."
People really struggle to go back once a technology has been adopted. I think for the most part, people cannot really evaluate whether or not the technology is a net positive; the adoption is more social than it is rational, and so it'd be like asking people to change their social values or behaviors.
Grading the students. Usually for bigger classes, universities (at least mine) don't provide adequate support for grading the tests.
We had to write C code on paper. It was horrible.
It's not like writing prose, and there is no syntax highlighting and no compiler errors.
I don't know if it's the reason, but some students do need a computer for medical reasons.
It was the same when I graduated 6 years ago. We had projects to test our ability to use tools and such, and I guess in that context LLMs might be a concern. But exams were pencil and paper only.
I think the key difference is what you're trying to measure
I go to school right now, and most classes actually enforce paper and pencil tests despite how annoying it is to grade and code on.
We had open notebook group exams back then too.
Optics.
Something that I think many students, indeed many people, struggle with is the question "why should I know anything?"
For most of us--myself included--once you graduate from college, the answer is: "enough to not get fired". This is far less than most curriculums ask you to know, and every year, "enough to not get fired" is a lower and lower bar. With LLMs, it's practically on the floor for 90% of full-time jobs.
That is why I propose exactly the opposite regimen from this course, although I admire the writer's free thinking. Return to tradition, with a twist. Closed-book exams, no note sheets, all handwritten. Add a verbal examination, even though it massively increases examination time. No homework assignments, which encourage "completionist mindset", where the turning-in of the assignment feels more real than understanding the assignment. Publish problem sets thousands of problems large, with worked-out solutions, to remove the incentive to cheat.
"Memorization is a prerequisite for creativity" -- paraphrase of an HN comment about a fondly remembered physics professor who made the students memorize every equation in the class. In the age of the LLM, I suspect this is triply true.
> once you graduate from college, the answer is: "enough to not get fired"
I thought the point was to continue in the same vein and contribute to the sum total of all human knowledge. I suppose this is why people criticize colleges as having lost their core principles and simply responded to market forces to produce the types of graduates that corporate America currently wants.
> "enough to not get fired" is a lower and lower bar.
Usually people get fired for their actions and not their knowledge or lack thereof. It may be that David Graeber's core thesis was correct: most jobs are actually "bullshit jobs," and in the era of the Internet, they don't actually require any formal education to perform.
I agree with both of your assertions. Most jobs are indeed bullshit jobs in the age of abundance, and while the "point" of knowledge and wisdom is, in a grander sense, to continue in the same vein and contribute to the sum total of all human knowledge (I prefer the slightly less abstract phrase "build and inhabit a greater civilization"), there's very little about the current education system or the economic modality of the modern West that incentivizes that goal.
> Closed-book exams, no note sheets, all handwritten. Add a verbal examination
You are describing how school worked for me (in Italy, but much of Europe is the same I think?) from middle school through university. The idea of graded homework has always struck me as incredibly weird.
> In the age of the LLM, I suspect this is triply true.
They do change what is worth learning though? I completely agree that "oh no the grades" is a ridiculous reaction, but adapting curricula is not an insane idea.
Something often left out is the dependence on LLMs. Students today assume LLMs will always be available, at a price they (or their companies) can afford.
What happens if LLMs suddenly change their cost to 1000 USD per user per month? What if it is 1000 USD per request? Will new students and new professionals still be able to do their jobs?
I swear teachers said something extremely similar about calculators when I was in grade school. "What are you going to do when you don't have access to a calculator? You won't always have one with you!"
Calculators have never been more accessible/available. (And yet I personally still do most basic calculations in my head)
So I agree students should learn to do this stuff without LLMs, but not because the LLMs are going to get less accessible. There's another better reason I'm just not sure how to articulate it yet. Something to do with integrated information and how thinking works.
Calculators are widely available at low cost. The logic behind most calculators can be consistently duplicated across a variety of manufacturers, thereby lowering the cost of producing them for the masses.
LLMs are not consistent. For example, having a new company make a functional duplicate of ChatGPT is nearly impossible.
Furthermore, the cost of LLMs can change at any time for any reason. Access can be changed by new government regulations, and private organizations can choose to suspend or revoke access to their LLM due to changes in local laws.
All of this makes dependence on an LLM a risk for any professional. The only way these risks would be mitigated is by an open-source, freely available LLM that produces consistent results and that students can learn how to use.
The comparison with calculators overlooks several key developments.
LLMs are becoming increasingly efficient. Through techniques such as distillation, quantization, and optimized architectures, it is already possible to run capable models offline, including on personal computers and even smartphones. This trend reduces reliance on constant access to centralized providers and enables local, self-contained usage.
Rather than avoiding LLMs, the rational response is to build local, portable, and open alternatives in parallel. The natural trajectory of LLMs points toward smaller, more efficient, and locally executable models, mirroring the path that calculators themselves once followed.
My intuition is that the costs involved to train and run LLMs will keep dropping. They will become more and more accessible, so long as our economies keep chugging along.
I could be wrong, time will tell. I just wouldn't base my argument for why students should learn to think for themselves on accessibility of LLMs. I think there's something far more fundamental and important, I just don't know how to say it yet.
Bang on! This is definitely coming and very few talk about it.
The question is no longer "How do we educate people?" but "What are work and competence even for?"
The culture has moved from competence to performance. Where universities used to be a gateway to a middle class life, now they're a source of debt. And social performances of all kind are far more valuable than the ability to work competently.
Competence used to be central, now it's more and more peripheral. AI mirrors and amplifies that.
I completely agree with you. Do you have any ideas about what might stem this tide on a grander scale? I live in the country and will homeschool my kids--I think the risk of under-socialization is worth the reward of competency-based education and the higher likelihood of my own principles taking hold--but I would vastly prefer to send them to a normal school with other kids, albeit one in a superior society to that which we currently inhabit.
> Do you have any ideas about what might stem this tide on a grander scale?
The best way to move from the working class to the middle class these days is the military with a federal government job after retirement (even with what the current admin is doing). That said, a person doing this needs to realize that they will need to unlearn and learn a lot of social habits and learn some new ones.
The bonus is that higher ed will be free, and ambitious folks can ladder up into officer roles, which can be even more of a social climb.
> I think the risk of under-socialization is worth the reward of competency-based education and the higher likelihood of my own principles taking hold
I think you are very wrong on this point.
A highly-socialized person with the minimum viable amount of competency will go much farther in life than a highly-competent person with limited social skills.
If your kids are in a good school system, there will be a culture of competence in the students and their families.
> but I would vastly prefer to send them to a normal school with other kids, albeit one in a superior society to that which we currently inhabit
You just need to find the right pocket of people.
I personally recommend good Montessori schools over home schooling for K-8. It doesn’t work for everyone, but it works well when it’s a good fit. The community around the school is usually fairly healthy as well.
For 9-12, a high-quality private school, a magnet school, a combo high school / JC, or an independent study high school (often with home school “classes”) are all good options for curious and ambitious students, imho.
I had an electrodynamics professor say that there was no reason to memorize the equations; you would never remember them anyway, and the goal was to understand how the relationships were formed in the first place. Then you would understand what relationship each equation represents. That, I think, is the basis for this statement: memorization of the equations gives you a basis to understand the relationships, so I guess the hope is that that is enough. I would argue it isn't enough, since physics isn't really about math or equations; it's about the structure and dynamics of how systems evolve over time. Equations give one representation of the evolution of those systems, but not the only one.
This is all very well if the goal were to sift the wheat from the chaff - but modern Western education is about passing as many fee-paying students as possible, preferably with a passably enjoyable experience, for the institutional kudos.
I think that really depends on the country. I went to an engineering school where only 15% of applicants out of high school were admitted, and of those who were admitted only around 75% graduated.
Western education passing as many fee-paying students as possible seems to be very much a UK/US phenomenon, but it doesn't seem to be the case in European countries where the best schools are public and fees are very low (in France, private engineering schools rank lower).
I wonder if education will bifurcate back out as a result of AI. Small, bespoke institutions which insist on knowledge and difficult tests. And degree factories. It seems like students want the degree factory experience with the prestige of an elite institution. But - obviously - that can’t last long. Colleges and universities should decide what they are and commit accordingly.
I think the UK has been heading this way for a while -- since before AI. It's not the size of the institutions that has changed, but the "elite" universities tend to give students more individual attention. A number of them (not just Oxford and Cambridge) have tutorial systems where a lot of learning is done in a small group (usually two or three students). They have always done this.
At the other extreme are universities offering low quality courses that are definitely degree factories. They tend to have a strong vocational focus but nonetheless they are not effective in improving employability. In the last few decades we have expanded the university system and there are far more of these.
There is no clear cutoff and a lot of variation in between, so it's not a bifurcation, but the quality-vs-factory difference is there.
On the other side, in Western systems funded by taxes, the incentive is still to give out as many degrees as possible, since schools get funding based on the number of degrees they produce.
This is mostly done to get more degree holders, who are seen as "more productive". Or at least higher paid...
> Add a verbal examination, even though it massively increases examination time. No homework assignments, which encourage "completionist mindset"
To the horror of anyone struggling with anxiety, ADHD, or any other source of memory-recall issues under examination pressure. This further optimizes everything for students who can memorize and recall information on the spot under artificial pressure, and who don't suffer from any of the problems I mentioned.
In grade school you could put me on the spot and I would blank on questions about subjects that I understood rather well and that I could answer 5 minutes before the exam and 5 minutes after the exam, but not during the exam. The best way for me to display my understanding and knowledge is through project assignments where that knowledge is put to practical use, or worked "homework" examples that you want to remove.
Do you have any ideas for accommodating people who process information differently and find it easier to demonstrate their knowledge and understanding in different ways?
Maybe those people just won't get as good grades, and that's acceptable. It is strange that the educational system determined it wasn't acceptable. If I go to a university and try to walk onto the NCAA Division 1 basketball team, it's fine for them to tell me that I am too short, too slow, too weak, can't shoot, or that my performance anxiety means I mess up every game, and I am off the team. If I try to go for art but my art is bad, I am rejected. If I try to go for music but my performance anxiety messes up my performances, then I am rejected.
Why ought there to be an exception for academics? Do you want your lawyer or surgeon to have performance anxiety? This seems like a perfectly acceptable thing to filter out on.
Why would performance anxiety be disqualifying for knowledge workers?
Everything involves performing and actually proving what you know. If this is such an issue, then it's something you need to fix. I have never actually met anyone who has this “performance anxiety” where they are so brilliant but do poorly on tests because of it. I think it's a myth used to attack the rigor of academics. Knowledge workers eventually have to go into court, or perform surgery, or do the taxes, or give the presentation, or have the high-pressure meeting. If anxiety is truly debilitating to the person in all of these situations, they'll be doomed, so filter them out.
> why should I know anything
The obvious answer is "Because it's interesting."
But suppose you think strictly in utilitarian terms: what effort should I invest for what $$$ return. I have two things to say to you:
First: what a meaningless life you're living.
Second: you realize that if you don't learn anything because you have LLMs, and I learn everything because it's interesting, when you and I are competing, I'll have LLMs as well...? We'll be using the same tools, but I'll be able to reason and you won't.
I think the people who struggle with the question "Why should I know anything?" aren't going to learn anything anyway. You need curiosity to learn, or at least to learn a lot and well, and if you have curiosity you're not asking why you should learn anything.
To play devil's advocate: In the future, "knowing things" might not really be a prerequisite for living a decent life. If you could just instantly look up anything you need to know, then why would you need to know anything? I don't think it's a ridiculous question. As long as I can maintain basic literacy and an ability to form questions for an LLM, why really do I need knowledge? Maybe I don't find any intrinsic "life meaning" from knowledge. Maybe I don't care if it's interesting. Pragmatically, why should I be educated?
Why would I need to be able to lift 100kg? I'm never going to need to lift 100kg, and if I do need to, I'd just find a tool that will do it. My life isn't any less rich because I can't lift 100kg, and I can maintain basic body health without being able to lift weight from the ground.
Exactly. In the long term, I would argue that "interest" is always a bigger determining factor of professional success than innate "capability" in a field. An interested person can grow their competence over time, whereas a disinterested, yet capable person will mostly remain at a fixed level of competence.
Honestly, I feel like I have to know more and more these days, as the ais have unlocked significantly more domains that I can impact. Everyone is contributing to every part of the stack in the tech world all of a sudden, and "I am not an expert on that piece of the system" no longer is a reasonable position.
This is in tech now, since we're the first adopters, but soon it will come to other fields.
To your broader question
> Something that I think many students, indeed many people, struggle with is the question "why should I know anything?"
You should know things because these AIs are wrong all the time, because if you want any control in your life you need to be able to make an educated guess at what is true and what isn't.
As to how to teach students: I think we're in an age of experimentation here. I like the idea of letting students use all tools available for the job. But I also agree that if you do give exams and homework, you'd better make them handwritten/oral only.
Overall, I think education needs to focus more on building portfolios for students, and less on giving them grades.
> and "I am not an expert on that piece of the system" no longer is a reasonable position
Gosh, that sounds horrifying. I am not an expert on that piece of the system; no, I do not want to take responsibility for whatever the LLMs have produced for that piece of the system, because I am not an expert and cannot verify it.
You didn't answer why the student should memorize anything, except the hand-waving "Memorization is a prerequisite for creativity".
Students had very good reason to question the education system when they were asked to memorize things that were safe to forget once they graduated from school, and when most functional adults admitted they had forgotten what they learned in school. It was an issue before LLMs, and triply so now.
By the way, I now 100% agree with "Memorization is a prerequisite for creativity." However, if you asked me to convince the 16-year-old me, I would throw my hands up.
I completely agree with you, and now that I am far away from being a student (and at the time, I vehemently hated any system that demanded memorization), I regretfully say that sometimes you just have to force young people to do things they don't want to do, for their own good.
But "enough to not get fired" is not an answer to a question "why should I know anything?". To be honest, it's not clear if the rest of your post tries to answer the initial question of why you should know anything or the implied question of how much should I really know.
The answer to "why should I know anything" is a value judgement that, if advertised in my top-level post, provides a great deal more rhetorical surface to disagree with or criticize. My main point is that regardless of why anyone wants to know anything, in the age of AI, if you want to produce students who actually know things, I recommend dropping the tech and returning to a more rigorous, in-person curriculum with a foundation of memorization.
Here, though, is my answer: an excellent long-term goal for any band of humans is to create, inhabit, and enjoy the greatest civilization possible, and the more each individual human knows about their reality, the easier it is to do that.
You had me until, "no homework assignments". I am a lazy dev man. I like programming so I don't have to repeat tasks.
I would not survive without homework. I needed that extra push in school. Otherwise, I would have been doing something else.
I was lazy. I had perfect scores on all of the exams in one class, but never did the homework, so even though I knew the material, I got a "C".
F homework.
What I like about the approach in the article is that it confronts the "why should I know this?" question directly. By making students accountable for reasoning (even when tools are available), it exposes the difference between having access to information and having a mental model.
This is like the Indian education system and presumably other Asian ones. Homework counts for very little towards your grade. 90% of your grade comes from the midterms and the finals. All hand written, no notes, no calculators.
That’s a terrible indictment of society if true. People are so far from self-realization, so estranged from their natural curiosity, that there is no motivation to learn anything beyond what will get you fed and housed. How can anyone be okay with that? Because even most chronically alienated people have had glimpses of self-actualization, of curiosity, of intrinsic motivation; most have had times when they were inspired to use the intellectual and bodily gifts that nature has endowed them with.
But the response to that will be further beatings until morale improves.
What about technology professionals? From my biased reading of this site alone: both further beatings and pain relievers in the form of even more dulling and pacifying technology. Followed by misanthropic, thought-terminating clichés: well, people are inherently dumb/unmotivated/unworthy so the topic is not really worth our genuine attention; furthermore, now with LLMs, we are seeing just how easy it is to mimic these lumps of meat—in fact they can act both better and more pathetically than human meat bags, you just have to adjust the prompts...
People who aren't fed and employed generally struggle to be self actualized, right? First you need to work for your supper, then you can focus on learning for its own sake.
As more jobs started requiring degrees, the motivation had to change. If people can again get food and housing to a comfortable extent without a degree, then the type of person getting a degree will change again too.
If you let them, they'll alienate you until you have no free time and no space for rest or hobbies or learning. Labour movements had to work hard to prevent the 60 hour workweek, but we're creeping back away from 40, right?
I know about Materialism.
> Most Students Don’t Want to Use Chatbots
I think this is changing rapidly.
I'm a university professor, and the number of students who seem to be in need of LLMs as a crutch is growing really exponentially.
We are still in a place where the oldest students did their first year completely without LLMs. But younger students have used LLMs throughout their studies, and I fear that in the future, we will see full generations of students completely incapable of working without LLM assistance.
Reading the article, it seemed to me that both the professor and the students were interested in the material being taught and therefore actively wanted to learn it, so using an LLM isn't the best tactic.
My feeling is that for many/most students, getting a great understanding of the course material isn't the primary goal, passing the course so they can get a good job is the primary goal. For this group using LLMs makes a lot of sense.
I know when I was a student doing a course I was not particularly interested in because my parents/school told me that was the right thing to do, if LLMs had been around, I absolutely would have used them :).
I think the professor here presented them with a "special" case which cannot be generalized outside of the exam context.
If you're presented with the choice of "Don't use AI" and "Use AI, but live with the consequences" (consequences like mistakes being judged harsher when using AI than when not using AI), I do not think chatbots will be a desirable choice if you've properly prepared for the exam.
One helper here is fear. You can be failed for formal errors at university, which means we were scared shitless of making them and paid close attention.
If people know "at university you can't use LLMs, you are forced to think for yourself", they will adjust, albeit through trial by fire.
I think there's an argument that growing up in an educational system unable to teach you how not to rely on LLMs would, for all intents and purposes, permanently nerf you compared to more fortunate peers. Critical thought is a skill we continue to practice until the very end.
It will be very interesting to see what happens when LLMs start charging users their true cost. With many people priced out, how would they cope?
It's not that expensive unless you run millions of tokens through an agent. For use cases where you actually read all the input and output by yourself (i.e. an actual conversation), it is insanely cheap.
Yeah, in my last job, unsupervised dataset-scale transformations amounted to 97% of all spending. We were using Gemini 2.5 Flash in batch/prefill-caching mode in Vertex, and always the latest/brightest for ChatGPT-like conversations.
May happen, but I suspect not in the way implied by that question.
Hardware is still improving, though not as fast as it used to; it's very plausible that even the current largest open weights models will run on affordable PCs and laptops in 5 years, and high-end smartphones in 7.
I don't know how big the SOTA closed-weights models are; that may come later.
But: to the extent that a model that runs on your phone can do your job, your employer will ask "why are we paying you so much?" and now you can't afford the phone.
Even if the SOTA is always running ahead of local models, Claude Code could cost 1500 times as much and still have the average American business asking "So why did we hire a junior? You say the juniors learn when we train them, I don't care, let some other company do that and we only hire mid-tier and up now."
(The threshold is less than 1500 elsewhere; I just happened to have recently seen the average US pay for junior-grade software developers, $85k*, which makes Claude Code roughly 350x cheaper, plus my own observation that its output is not only of junior quality but also arrives much faster than a junior's.)
* But also note that while looking for a citation, the search results made claims varying from $55k to $97.7k.
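For what it's worth, the ~350x figure above is roughly what you get by comparing the cited $85k salary against a $20/month subscription; the monthly price is my own assumption, not something stated in the comment:

```python
# Back-of-envelope check of the ~350x claim above. The $20/month figure is an
# assumption (a typical entry-level subscription tier), not from the comment.
junior_salary_per_year = 85_000   # cited average US junior developer pay
tool_cost_per_year = 20 * 12      # assumed $20/month subscription

ratio = junior_salary_per_year / tool_cost_per_year
print(f"junior costs ~{ratio:.0f}x the tool per year")  # -> ~354x
```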
They would fall behind in the world just like people from developing and poor countries do today.
Very few people fall behind at the moment due to lack of access to information. People in poor countries largely have access to the internet now. It doesn’t magically make people educated and economically prosperous.
You are arguing the converse. Access to information doesn't make people educated, but lack of access definitely puts people at a big disadvantage. Chatbots are not just information; they are tools, and using them takes training because they hallucinate.
What do you think the "true cost" is?
Google destroyed search and replaced it with that dippy LLM box.
Are you sure student desire is the driving force here?
> ... is growing really exponentially.
Or geometrically?
Please, you don’t need to counter-narrative everything. Maybe talk about what the professor did here and why students didn’t trust the output in an exam context in this particular subject.
> Second, I learned that cheating, however lightly, is now considered a major crime. It might result in the student being banned from any university in the country for three years. Discussing exam with someone who has yet to pass it might be considered cheating. Students have very strict rules on their Discord.
This also has something to do with it. It's hard to draw very accurate conclusions.
> When one would finish her exam, she would come back to the room and tell all the remaining students what questions she had and how she solved them. We never considered that "cheating" and, as a professor, I always design my exams hoping that the good ones (who usually choose to pass the exam early) will help the remaining crowd.
You are an outlier. When I was in school any outside assistance was tantamount to cheating and, unlike an actual crime, it was on the student to prove they were not cheating. Just the suspicion was enough to get you put in front of an honor board.
It was also pervasive. I would say 40% of international students were cheaters. When some were caught they fell back on cultural norms as their defense. The university never balked because those students, or their institutions, paid tuition in cash.
International students in graduate programs at US institutions are basically buying a degree, from what I've seen. The professors know they cheat and they don't really care. The students are paying a lot of money and they will get what they paid for.
> The professors know they cheat and they don't really care.
To throw another anecdote in the bucket, I know at least one professor who does not tolerate cheating from any of his students, regardless of cultural or national background, or how they're paying for their education.
I've seen, on multiple occasions, the professor's recommendations get overruled by the dean or university administration. If the school wants them there, they stay.
Andés Hess (RIP) gave an examination, 2006, in his Organic Chemistry course... which ended up with 35% of the class being reported to Vanderbilt's Honor Council.
He brilliantly tested students using open-ended, single-sentence questions (with half of the page blank to show your work)... which tested foundational topics and oozed with partial-credit opportunities. You then had an option to submit "test corrections" to explain why you should gain more points for your efforts (typically considered, when reasonable).
----
His first exam of the semester, there was a multi-step question which resulted in a single 1cm x 1cm box — worth 20% of the entire exam's scoring — for you to indicate whether that particular Grignard reaction resulted in a single-, double-, or triple- bond.
The majority of the class answered (incorrectly) that it would be a double-bond, by writing a `=` into the blank box. In fact, that reaction resulted in a triple-bond `≡`
35% of the class ended up just adding the third parallel line (i.e. changing what they had originally answered) when handing in their test corrections. Dr. Hess had made photocopies of all the penciled exams... and reported all the cheaters.
----
I answered it correctly, originally, so was never tempted to fib a similar mistake — but this definitely opened my eyes in reinforcement of not cheating. I eventually got into medical school, and most of that 35% of branded "cheaters" did not. Ultimately I never became a physician, but remember the temptations to cheat like everybody else did. I am happier/poorer because...
>40% of international students were cheaters. When some were caught they fell back on cultural norms as their defense. The university never balked because those students, or their institutions, paid tuition in cash.
Twenty years ago, at Vanderbilt, this would have been an understatement — particularly among non-citizen Asians.
I remember in organic chemistry an instructor attempted to re-give the same examination ("because y'all did so terrible") and it was struck down by a dean as not allowable, simply because the Honor Code was to be invoked that nobody and no groups would share answers (yeah, sure, okay).
The minority following the Honor Code ended up getting into lesser graduate schools (e.g. myself) — because most courses didn't curve and VU didn't give out A+ as a grade. I have specifically not mentioned the specific country which cheated most-blatantly... but everybody from back then knew/knows.
> 3. I allow students to discuss among themselves [during an exam] if it is on topic.
Makes me wonder if they should also get a diploma together then, saying "may not have the tested knowledge if not accompanied by $other_student"
I know of some companies that support hiring people as a team (either all or none get hired and they're meant to then work together well), so it wouldn't necessarily be a problem if they wish to be a team like that
OP here: I teach Open Source Strategies.
The main strategy is collaboration. If you are smart enough to:
1. Identify your problem
2. Ask someone about it
3. Get an answer which improves your understanding
Then you are doing pretty good by all standards
Another trick I sometimes use: I take one student who has a hard time articulating a concept. I take a second student who doesn't understand that concept. I say to student 1: "You have 20 minutes to teach student 2 the concept. When I come back, you will be graded according to his answers."
(I, of course, don't grade only that. But it forces both of them to make an extra effort, student 2 not willing to be the cause of student 1's demise.)
> student 2 not willing to be the cause of student 1's demise
I would very much not count on that.
Yeah, curious too about some more rules, e.g. both parties have to contribute to the discussion (:
Ha ha, fair enough - but he does mention there's a culture of isolation and cut-throat competition at the school, so maybe it's just a reaction to that.
I think we should send all diplomas to OpenAI and end higher education.
Less educated people are easier to steer via TikTok feeds anyway.
Quite a thoughtful way to adapt exams to the wave of new tools for students, and to learn along the way.
I wish other universities adapted so quickly too (and had such a mindful attitude toward students, e.g. trying to understand them, being upfront with expectations, learning from students, etc.).
The majority of professors are stressed and treat students as idiots... at least that was the case a decade ago!
OP here: The majority of professors became professors because they were very good at passing standard exams (and, TBH, some are not good at anything else).
I'm different because I was a bad student. I only managed to get my diploma with the minimal grade, always rebelling against everything. But some good people at my university thought that Open Source was really important and that they needed someone with a good career in that field. I was that person (and I'm really thankful they offered me that position).
> The majority of professors became professors because they were very good at passing standard exams (and, TBH, some are not good at anything else).
Is this a French thing? In the US we don't have standardized exams to become a college professor. Instead, we need to do original research and publish.
> I realized that my students are so afraid of cheating that they mostly don’t collaborate before their exams! At least not as much as what we were doing.
This is radically different from the world that's been described to me. Even 20 years ago cheating was endemic and I've only heard of it getting worse.
I teach at MSc level, and over the last couple of years about 15% of my students have cheated. Really obviously cheated, like two students submitting 100% byte-identical answers.
But this class is not very representative of what the same students are presumably doing for other classes.
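As an aside, byte-identical copies are the one form of cheating that is trivial to flag mechanically. Here is a minimal sketch of that check, purely my own illustration (it assumes all submissions sit as files in one directory), not the grading workflow the commenter actually uses:

```python
# A sketch of flagging byte-identical submissions: hash every submission file
# and group files that share the same digest.
import hashlib
from collections import defaultdict
from pathlib import Path

def find_identical_submissions(directory: str) -> dict[str, list[str]]:
    groups: dict[str, list[str]] = defaultdict(list)
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path.name)
    # Keep only digests shared by more than one submission.
    return {d: names for d, names in groups.items() if len(names) > 1}

if __name__ == "__main__":
    # "submissions" is a hypothetical directory name; adjust as needed.
    for digest, names in find_identical_submissions("submissions").items():
        print(digest[:12], names)
```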
Wow. I would love to have had this teacher :-) Everything about this setup seems so thoughtful. Giving students both agency / freedom of choice and responsibility. And if they choose more power (llms), they automatically have more responsibility (having to explain the mistakes of llms). And this:
> It took me 20 years after university to learn what I know today about computers. And I’ve only one reason to be there in front of you: be sure you are faster than me. Be sure that you do it better and deeper than I did. If you don’t manage to outsmart me, I will have failed.
What a wonderful article, and what a wonderful way of engaging with students and adapting to the new tech. I wish all professors were like you.
What a wonderful teacher! I wish all teachers were like him.
Regarding the collaboration before the exam, it's really strange. In our generation, asking or exchanging questions was perfectly normal. I got an almost perfect score in physics thanks to that. I guess the elegant solution was still in me, but I might not have been able to come up with it in such a stressful situation. 'Almost' because the professor deducted one point from my score for being absent too often :)
However, oral exams in Europe are quite different from those at US universities. In an oral exam, the professor can interact with the student to see if they truly understand the subject, regardless of the written text. Allowing a chatbot during a written exam today would be defying the very purpose of the exam.
Very interesting write up, would be curious to know more about what an Open Source Strategies course entails, as far as I can remember I never had anything like that on offer at my university.
https://uclouvain.be/cours-2025-linfo2401
https://uclouvain.be/cours-2025-linfo2402
A similar Outlook scandal happened in Norway. I believe both UiO and NTNU, the biggest in humanities and engineering respectively, used to have great internal email services. Huge protests from competent personnel were ignored. Now it's all Outlook. Microsoft is exceedingly good at convincing non-technical stakeholders that their own staff are idiots, and that Microsoft is the solution.
I have corrected exams and graded assignments as an external party before (legal requirement). The biggest problem with LLMs I see is that the weak students copy-paste commands with unnecessary command line switches. But they would have done the same with stack overflow.
Some also say they use an LLM to help improve their writing, but that's where the learning is, so why????? I think it's anxiety about failing; they don't seem to understand that I won't fail them as long as their incoherent text proves they understood what they were doing.
Having graduated and knowing how things ought to look, taking exams is so much less scary now, because I'm confident I will only be failed for being incompetent, not for not writing properly. Not all students have that privilege; they gain it over time.
It does help that computer science assignments and papers are pretty damn standard in form.
Only 2 students actually used an LLM in his exam, one well and one poorly, so I'm not sure there is much you can draw from this experience.
In my experience LLMs can significantly speed up the process of solving exam questions. They can surface relevant material I don't know about, they can remember how other similar problems are solved a lot better than I can and they can check for any mistakes in my answer. Yes when you get into very niche areas they start to fail (and often in a misleading way) but if you run through practise papers at all you can tell this and either avoid using the LLM or do some fine tuning on past papers.
> Second, I learned that cheating, however lightly, is now considered a major crime. It might result in the student being banned from any university in the country for three years. Discussing exam with someone who has yet to pass it might be considered cheating. Students have very strict rules on their Discord.
> I was completely flabbergasted because, to me, discussing "What questions did you have?" was always part of the collaboration between students.
I suspect that lots of intelligent and diligent students hate our new world of AI because they probably find it more likely now that they could be accused of and disciplined for something they didn't do.
He mostly describes a process where the exam itself, or rather testing the knowledge of a student, is not so important.
I think not all exams can work like that. In some cases you just have to test one's knowledge of a specific topic, and knowing facts is a very, very easy way to test this. I would agree that focusing purely on facts is overrated these days, but I would still argue it is not a useless metric. So, when the author describes "bring your own exam questions", it really means that the exam itself is not so relevant, which is fine - but saying that university exams are now useless in the age of auto-solving chatbots is simply wrong. It just means that this particular exam is not important; that in itself does not automatically mean that ALL exams or exam styles are useless. Also, it depends on what you test. For instance, solving math questions - yes, chatbots can solve this, but can a student solve the same without needing a chatbot? How about practical skills? OK, 3D printing will dominate, but the ability to craft something with your own hands is still a skill that may be useful, at least to some extent.
I feel that the whole discussion about chatbots dumbs down a lot. Skills have not become irrelevant just because chatbots exist.
Interesting write-up! I've thought about how university exams can be done effectively nowadays. I took my degree in CS almost 20 years ago, and being a user of LLMs, I can't really see how any of my old exams would work today if students were allowed LLMs.
I graduated 15 years ago, and I think the exams in my degree were actually the most LLM-proof part of the student assessment. They were no-aid written exams with pencils and paper, whereas the assignments were online-submitted code only that an LLM could easily write.
Spoiler: they don't.
CS exercises that we can expect an average student to solve are trivially solved by LLMs. Even smaller local models.
This comes down not to the "smartness" of LLMs, but to the reality that we do not even want anything novel in these exercises or exams. And the same areas are repeated multiple times, so naturally there is a lot of this in the training data.
This is one area where LLMs really should excel. And that doesn't mean students shouldn't also learn it and be able to solve the same problems themselves. Which is a real dilemma for the school system...
Except sometimes they aren't solved by LLMs, but appear to be. A CS student should be able to look at the output and tell the difference.
The problem is when students just blindly copy and paste from the chatbot and submit it as their own answer without even reading it.
They should be encouraged to read and review the LLM output so they can critically understand it and take ownership of it.
They should be encouraged to not turn in casual plagiarism as their own work.
I believe there is a mechanism for this already.
In my experience, reading a solution and even understanding it doesn’t go very far in teaching you how to do something. I can look at calculus solutions all day but only when I actually try to solve them myself do I run into all kinds of roadblocks which is where the real learning happens.
You're right, but learning can take place when you need it. There is no real advantage to learning something ahead of time. The bottleneck is having awareness of what is out there to learn. You can't learn about what you don't know exists. Looking at calculus solutions all day should give a sense of what calculus can be used for, so that it is in your back pocket when the time comes that you need it.
Well, at least it used to be the bottleneck. Nowadays you can just ask an LLM. For all their faults, they are really good at letting you know what tools exist out there in the world, surfacing more than you could ever come to know about even if all you did was read about what exists all day, every day.
I believe that to count as an expert on something you need to have a compendium of knowledge ready to go. It becomes very hard to tackle problems or gain deep insights if you aren't already someone knowledgeable who has thought deeply about a particular space. Maybe when we have supremely reliable LLMs that can replace humans we might not but we're not there yet.
> I believe that to count as an expert on something you need to have a compendium of knowledge ready to go.
You are certainly headed in the right direction, but not quite. To be seen as an expert in the eyes of others you need to have had a vision for something and to have successfully executed on it. If the vision was dependent on calculus, then you will have reached a point where you had to learn something about calculus, of course...
But that's different from having a taskmaster tell you to learn calculus for no apparent reason. Even if you followed through and built up a huge wealth of knowledge from it, you would still not be deemed an expert by others. You're no different than an encyclopedia, which isn't an expert either. It is being able to see things others can't, and the ability to act upon it, that makes an expert.
Learning taking place when you need it isn't the same as never.
> Maybe when we have supremely reliable LLMs that can replace humans we might not but we’re not there yet.
Frankly, even Page Rank already replaced humans for this. But LLMs are even better at it. Humans are just that poorly performing. Like I said before, even someone doing nothing in life but looking for what exists in the world could not take in as much as databases that have indexed every written thing.
Calculus might not be the best analogy for my point since it’s pretty fundamental. When I think of an expert I think of an accomplished mathematician or a chemist, someone that can build on existing knowledge to provide new breakthroughs. You can ask an LLM for a particular formulation but you cannot make wide spanning connections to come up with something novel until you have a good understanding of a given space. Not all progress is a series of iterative problems and tasks that need to be solved. In fact for a lot of breakthroughs it’s making disparate connections.
> When I think of an expert I think of an accomplished mathematician or a chemist, someone that can build on existing knowledge to provide new breakthroughs.
I think we're on the same page here. Experts have both vision and execution. Someone who has simply learned a bunch of things, or a lot about one thing, is not what we consider an expert.
> You can ask an LLM for a particular formulation but you cannot make wide spanning connections to come up with something novel until you have a good understanding of a given space.
I don't get where you are trying to go with this. Using an LLM (or Page Rank for that matter) to search for tools that have been created/discovered necessary to fulfill execution of your vision seems to have nothing to do with what you are trying to say. Nobody would ask an LLM to do what you are suggesting, if I am understanding you correctly. LLMs are most definitely not good at that. That is AGI territory.
I'm back in school part-time for a bachelor's, and I recently had a class where the professor really understood how to integrate LLMs into the class.
Our written assignments were a lot of "have an LLM generate a business proposal, then annotate it yourself".
The final exam was a 30-minute meeting where we just talked as peers, kind of like a cultural job interview. Sure, there's lots of potential for bias there, but I think it's better than just blindly passing students who used LLMs for the final exam.
This professor's capacity for empathy and compassion for his students is off the charts. You can tell he puts a lot of thought and effort into helping his students learn.
Bravo.
Also, the take on AI is a stark contrast to much of what we've seen by other educators.
I find it odd that the professor requires students to demonstrate why the output of the LLM is correct, while not having the same requirement for a Google search. Even pre-AI, using Google required critical thinking skills because the content was driven by SEO. If someone could make money by giving you misinformation, it was pushed up in the rankings.
I remember my first programming class, where the exam required students to solve problems in Pascal on paper. I see no problems in this approach.
I still think pen and paper is king among students.
I wish we could take our exams this way. It seems like a very interesting approach :)
I went to a university class in my last year that had the same "choice". Any time a professor starts talking about "AI" tends to be worrisome, in my experience.
Why was it worrisome?
If anything, this reinforces the idea that chatbots don't fundamentally change education... they just amplify whatever incentives and structures already exist
I am not so sure about that. I think the difference this time is that AI usage is fairly mainstream. People are using it for all kinds of things, including studies. Using it to quickly get done with school/study-work, is a no-brainer (pun intended).
Louvain-Li-Nux forever!
Paper and pencil.
If what you are trying to test is whether the student will be able to give good answers to those and similar questions in the future, then it's fine if they use AI to answer exam questions. Because in the future, when they are confronted with these and similar questions, they will have access to at least as good AI as they do have today.
And if you want to test something else than whether the student can provide answer to your question, then why do you ask this question? Give a task that shows what you care about.
90% weight to in-person exams without technology. 10% to quizzes or homework. You can't trust anything done outside of the classroom to accurately show competency. Problem solved.
> We were imposed GitHub for so many exercises!
I'm sympathetic to both sides here.
As a professor who had to run Subversion for students (a bit before Git, et al), it's a nightmare to put the infrastructure together, keep it reliable under spiky loads (there is always a crush at the deadline), be customer support for students who manage to do something weird or lose their password, etc. You wind up spending a non-trivial amount of time being sysadmin for the class on top of your teaching duties. Being able to say "Put it on GitHub" short circuits all of that. It sucks, but it makes life a huge amount easier for the professor.
From the students' point of view, sure, it sucks that nobody mentioned that Git could be used independently (or jj or Mercurial or ...). However, GitHub is going to be better than what 99.9% of all professors will put together or be able to use. Sure, you can use Git by itself, but then it needs to go somewhere that the professor can look at it, get submitted to automated testing, etc. That's not a trivial step. My students were happy that I had the "Best Homework Submission System" (said about Subversion of all things ...) because everybody else used the dumbass university enterprise thing that was completely useless (not going to mention its name because it deserves to die in the blazing fires of the lowest circle of Hell). However, it wasn't straightforward for me to put that together. And the probability of getting a professor with my motivation and skill is pretty low.
Agree about the possibility of an infra nightmare, especially in the "SVN era" -- but in 2026, it's pretty straightforward to run a GitLab instance for a course (takes about an hour to set up, most of which is DNS and TLS stuff, in my experience) and set up CI actions, or to use other submission infra like CMU Autolab. I do this.
Agree with your comment about probability, motivation, and skill.
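To make the "submitted to automated testing" step concrete, here is a minimal sketch of the kind of script this usually boils down to. It is purely illustrative: the repository URLs are hypothetical, and the use of pytest as the test runner is my assumption, not either commenter's actual setup:

```python
# Clone each student repository and run its test suite, reporting PASS/FAIL.
import subprocess
from pathlib import Path

REPOS = [
    "https://gitlab.example.edu/course/student1-hw1.git",  # hypothetical URLs
    "https://gitlab.example.edu/course/student2-hw1.git",
]

def grade(repo_url: str, workdir: Path) -> bool:
    name = repo_url.rstrip("/").split("/")[-1].removesuffix(".git")
    target = workdir / name
    # Shallow-clone the student's repository.
    subprocess.run(["git", "clone", "--depth=1", repo_url, str(target)], check=True)
    # Run the submission's tests; a zero exit code counts as passing.
    result = subprocess.run(["python", "-m", "pytest", "-q"], cwd=target)
    return result.returncode == 0

if __name__ == "__main__":
    work = Path("submissions")
    work.mkdir(exist_ok=True)
    for url in REPOS:
        print(url, "PASS" if grade(url, work) else "FAIL")
```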
"Marking Exam Done by A.I." (Sixty Symbols)
https://www.youtube.com/watch?v=JcQPAZP7-sE
LLM reasoning models are very good at searching well-documented problems. =3
> I was completely flabbergasted because, to me, discussing "What questions did you have?" was always part of the collaboration between students
When I was a student, professors maintained a public archive of past exams. The reason was obvious: next time the questions would be different, and memorizing past answers wouldn't help you if you didn't understand the core ideas being taught. Then I took part in an exchange program, went to some shit-tier uni, and realized that collaboration was explicitly forbidden because professors would usually ask questions along the lines of "what was on slide 54". My favorite part was when a professor said "I can't publish the slides online because they're stolen from another professor but you can buy them in the faculty's shop".
My uni maintained a giant presence on Facebook - we'd share a lot of information, and the most popular group was "easy courses" for students who wanted to graduate but couldn't afford a difficult elective course.
The exchange uni had none of that. Literally no community, no collaboration, nothing. It's astonishing.
BTW regarding the stream of consciousness - I distinctly remember taking an exam and doing my best to force my brain to think about the exam questions, rather than porn I had been watching the previous day.
> Mistakes made by chatbots will be considered more important than honest human mistakes, resulting in the loss of more points.
>I thought this was fair. You can use chatbots, but you will be held accountable for it.
So you're held more accountable for the output actually? I'd be interested in how many students would choose to use LLMs if faults weren't penalized more.
I thought this part especially was quite ingenious.
If you have this great resource available to you (an LLM) you better show that you read and checked its output. If there's something in the LLM output you do not understand or check to be true, you better remove it.
If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote this. If there's something flawed in an LLM, the likelihood that you do not have any justification except for "the LLM said so" is quite high and should thus be penalized higher.
One shows a misunderstanding, the other doesn't necessarily show any understanding at all.
>If you have this great resource available to you (an LLM) you better show that you read and checked its output. If there's something in the LLM output you do not understand or check to be true, you better remove it.
You could say the same about what people find on the web, yet LLMs are penalized more than web search.
>If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote this. If there's something flawed in an LLM, the likelihood that you do not have any justification except for "the LLM said so" is quite high and should thus be penalized higher.
Swap "LLMs" for "websites" and you could say the exact same thing.
The author has this in their conclusions:
>One clear conclusion is that the vast majority of students do not trust chatbots. If they are explicitly made accountable for what a chatbot says, they immediately choose not to use it at all.
This is not true. What is true is that if students are held more accountable for their use of LLMs than for their use of websites, they prefer using websites. How much "more"? We have no idea; the author doesn't say. It could be that an error from a website or your own mind is -1 point and an error from an LLM is -2, so LLMs have to make half as many mistakes as websites and your mind. It could be -1 and -1.25. It could be -1 and -10.
The author even says themselves:
>In retrospect, my instructions were probably too harsh and discouraged some students from using chatbots.
But they don't note the bias they introduced against LLMs with their grading scheme.
Rare here: well written and insightful; I would take this course. I'm curious about why he penalized chatbot mistakes more. At first glance it sounds like just discouraging their use, but the whole setup indicates a genuine desire to let it be a possibility. In my mind the rule should be "same penalty, and extra super cookies for catching chatbot mistakes".
I wrote this before to another comment like yours:
I thought this part of penalizing mistakes made with the help of LLMs more was quite ingenious.
If you have this great resource available to you (an LLM) you better show that you read and checked its output. If there's something in the LLM output you do not understand or check to be true, you better remove it.
If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote this. If there's something flawed in an LLM answer, the likelihood that you do not have any justification except for "the LLM said so" is quite high and should thus be penalized higher.
One shows a misunderstanding, the other doesn't necessarily show any understanding at all.
Here is my guess: Usually marks are given for partially correct answers, partly to be less punishing of human error, whether caused by stress or other factors; there's a good chance the student understood the topic. If instead they are using a chatbot but didn't catch the mistake themselves, it's an indication of less understanding and is marked accordingly.
> The third chatbot-using student had a very complex setup where he would use one LLM, then ask another unrelated LLM for confirmation. He had walls of text that were barely readable. When glancing at his screen, I immediately spotted a mistake (a chatbot explaining that "Sepia Search is a compass for the whole Fediverse"). I asked if he understood the problem with that specific sentence. He did not. Then I asked him questions for which I had seen the solution printed in his LLM output. He could not answer even though he had the answer on his screen.
Is it possible, and this is an interesting one to me, that this is the smartest kid in the class? I think maybe.
That guy who is playing with the latest tech, forcing it to do the job (badly), and couldn't care less about university or the course he's on. There's a time and a place where that guy is the one you want working for you. Maybe he's not the number 1 student, but I think there should be some room for this to be the Chaotic Neutral pick.
> Is it possible, and this is an interesting one to me, that this is the smartest kid in the class? I think maybe.
He might just as well be the dumbest guy in the class. Playing with tech is not proof of being smart in itself.