Software Engineering in the Age of AI

AI-assisted coding is here, and it’s improving fast. However, software engineering still lives or dies by verification and validation discipline. Modern generative AI is fundamentally probabilistic, while many real systems demand behavior we can justify as provable (or at least defensible with strong objective evidence). The path forward is not “replace engineers,” but “upgrade the process”: iterate, verify continuously, and use AI agents across the lifecycle (requirements, architecture, contracts, code, tests, proof, and DevOps) with a human engineer accountable for quality.

I’ve spent most of my career in environments where software failures are expensive at best and dangerous at worst: life-critical medical devices, safety-critical aerospace, and large-scale banking platforms. From that perspective, listening to Bertrand Meyer’s ACM talk is a useful reset.


“A.I. support for software engineering is here to stay… C-class development can essentially be automated… For the focus on A and B, AI is not really ready for a full software engineering process.”

– Bertrand Meyer (ACM, 2026)

The Thesis

“…a process with a programmer at the center… Hallucinations are going to persist. So unless you have a human who is actually more competent than the tools, it’s not going to work.”

– Bertrand Meyer (ACM, 2026)

Meyer opens by intentionally “spoiling” his destination: the future is not just AI writing code, but AI supporting the whole software lifecycle, with the software engineer acting as an orchestrator. From my vantage point, that’s the right level of conversation. In every high-accountability program I’ve led, quality has come less from heroics in implementation and more from a controlled lifecycle where requirements, design, verification, and evidence stay connected.

He explicitly keeps the human engineer at the center. For practitioners and leaders, this matters: if we treat AI as an autonomous producer, we inherit its failure modes. If we treat it as a set of accelerators inside a disciplined process, we can get speed and confidence.

TL;DR – Jump to conclusions.

“I’m going to argue for the need to use AI for more than just coding… We need agents… an orchestration of agents that are going to help us do requirements… architecture… contracting… coding… test generation… testing… proof… deployment.”

– Bertrand Meyer (ACM, 2026)

The spark: “vibe coding” and why “mostly works” is a problem

To ground the discussion in the current zeitgeist, Meyer references Andrej Karpathy’s “vibe coding” tweet, which advocates an ultra-fast, intuition-driven workflow: “accept all,” don’t read diffs, copy/paste error messages, iterate until it works. Meyer doesn’t spend the whole talk critiquing “vibe coding,” but he uses it as a foil: it highlights the mismatch between speed and assurance.

“Vibe coding” is a useful label precisely because it makes the trade-off explicit: you can buy speed by spending less on review, verification, and evidence. In low-consequence contexts that may be fine; in high-consequence contexts it’s a debt instrument with a high interest rate.

The phrase “mostly works” becomes a micro-summary of the verification problem. “Mostly” may be acceptable for prototypes and demos, but it is not a tolerable posture for many real systems, especially when security vulnerabilities, reliability failures, and safety hazards are possible outcomes. The more AI accelerates generation, the more the system needs disciplined ways to detect, prevent, and contain defects.

“I really like this word mostly which from a software engineering perspective is a little bit dubious.”

– Bertrand Meyer (ACM, 2026)

Career anxiety framed as a technical question: skill‑leveling (L) vs skill‑enhancing (E)

Meyer acknowledges the audience’s anxiety about what AI means for the profession, but he quickly reframes the discussion into a technical lens: some breakthroughs are skill-leveling (they reduce differences between experts and novices), while others are skill-enhancing (they amplify the advantage of top performers). This is more useful than generic “AI will replace jobs” discourse, because it forces the reader to ask: Which tasks are being commoditized? Which tasks become more valuable?

Slide graphic contrasting “skill-leveling” vs “skill-enhancing” technologies
Skill‑leveling (L) vs skill‑enhancing (E)

He illustrates leveling with tools that provide a shared capability baseline (e.g., GPS navigation), and enhancing with tools that scale distribution and impact (e.g., the printing press, which made great writing scale globally). The relevance to AI-assisted engineering is that code generation may reduce the “entry barrier” for some categories of development, while simultaneously raising the premium on those who can define problems well, reason about tradeoffs, and enforce quality.

For leaders, the takeaway is straightforward: AI may compress effort in implementation, but it increases the strategic value of strong engineering management: clear requirements, explicit risk decisions, verification strategies, and accountability.

Slide suggesting a forecast where plain English could replace programming
“English replaces programming” claims

On one side we have the forecast of English simply replacing programming and computer science as a whole.

Slide showing a nuanced view of AI’s impact, indicating continued demand for top talent
Reality is nuanced: talent still matters

But the reality is full of nuance: even Nvidia is still hunting for top talent, and the risks and actual failures that come from using new AI agents without process have surfaced prominently.

Slide illustrating risks and failures when AI agents are used without sufficient process
Agentic speed without process increases risk

Intellectual discipline: avoid anecdotes, use state‑of‑the‑art, separate hiccups from essentials

Before making claims about AI’s limits or risks, Meyer insists on epistemic hygiene. He argues that negative conclusions drawn from old models or free tiers are weak evidence, because the technology is moving quickly and has improved recently. He also distinguishes between early hiccups of a new technology (which often disappear) and essential limitations (which persist because they are structural).

Slide urging evaluation based on state-of-the-art evidence rather than anecdotes or outdated tools
Avoid anecdotes; use state‑of‑the‑art evidence

This matters to software engineering: engineering decisions should not be based on isolated demos or one-off failures or successes. Instead, the evaluation should ask: are we using the best tools available, what do measured studies show, and which classes of failure modes are inherent to the approach?

In regulated environments, we’d simply call this “objective evidence.” The same mindset applies even when the product is not regulated: treat claims (positive or negative) as hypotheses that need disciplined validation.


“Just state what you need”: why requirements are the hard part

Slide showing that requirements need structure and logic, not only natural language
Specs need structure, not only English

Meyer targets a recurring claim: “We won’t need programming; we’ll just tell the computer what we want in English.” He argues that the critical word is “just.” The act of specifying what you want (unambiguously, completely enough, and in a way that survives change) is a large fraction of software engineering difficulty. Requirements engineering requires abstraction, structure, and logic; it is not eliminated by swapping Java for English.

Slide emphasizing that “just state what you need” hides the difficulty of specifying requirements
Requirements are the hard part

This is where verification and AI intersect sharply. If the requirement is ambiguous, tests and proofs can only validate the wrong thing faster. In other words: AI can accelerate output, but it can also accelerate misinterpretation unless requirements are treated as first-class engineering artifacts.
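
A small illustration of how an English requirement can hide a decision: “round to the nearest integer” does not say what to do with ties. The sketch below is my own example, not from the talk, and the function names are hypothetical. Both functions are defensible readings of the same sentence, and a test suite written against either one would pass while validating a behavior the stakeholder may not have wanted.

```python
import math


def round_half_even(x: float) -> int:
    """One reading of 'round to the nearest integer': ties go to the even neighbor.

    This is the behavior of Python's built-in round().
    """
    return round(x)


def round_half_up(x: float) -> int:
    """Another reading of the same sentence: ties always go up."""
    return math.floor(x + 0.5)


samples = [0.5, 1.5, 2.5, 2.4, 2.6]

# Inputs where the two readings of the requirement disagree.
disagreements = [x for x in samples if round_half_even(x) != round_half_up(x)]
print(disagreements)  # [0.5, 2.5]
```

The code is trivial; the point is that the choice between the two behaviors is a requirements decision, and no amount of generation speed makes it for you.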

In practice, this is where I see leaders either win or lose with GenAI adoption: teams optimize for “producing more code,” while under-investing in the hard work of defining what “correct” means and how we’ll know.


Déjà vu: why “no more programmers” has been promised before (and what actually happened)

“So for those of us who have been around for a while there’s a certain impression of déjà vu… we have heard some of this before…”

– Bertrand Meyer (ACM, 2026)

Meyer notes that “end of programming” narratives are not new. He lists prior waves (COBOL, 4GLs, 5th generation computing, component-based development, model-driven development, low-code/no-code) that each promised to remove the need for programmers. The pattern, he argues, is that abstraction levels rise and some skills become less central, but the underlying need for engineering remains.

Slide listing prior waves claiming “no more programmers”
Déjà vu: “no more programmers needed” cycles

He then adds an important nuance for the AI era: unlike compilers (which translate high-level code to machine code largely invisibly), AI-generated code frequently still demands inspection. Even if the AI is “doing the work,” we often cannot safely “push a button and forget about the result.”

The key taxonomy: Casual vs Business vs Acute software (ABC) and where AI truly fits today

Meyer’s ABC framing is one of the most practical parts of the talk. He distinguishes:

  • Casual software: scripts, websites, experiments, MVPs—where failure is inconvenient but not catastrophic.
  • Business software: enterprise systems that are important and costly, though not necessarily life-critical.
  • Acute software: mission/safety-critical systems where failure can have severe consequences (transport, aerospace, medical, defense).

Chart categorizing software as Casual, Business, or Acute with increasing consequences and rigor
ABC taxonomy:

  • Acute: Systems requiring immediate, precise intervention and high reliability. The “Science” of the craft. Acute software involves life-critical or mission-critical applications where failure translates directly to human injury, death, or catastrophic financial loss.
  • Business: Value-driven engineering focusing on scalability, professional tools, and long-term ROI.
  • Casual: The rise of low-code/no-code environments. Balancing approachability with engineering rigor.

Source: https://bertrandmeyer.com/2013/03/25/the-abc-of-software-engineering

This taxonomy clarifies the otherwise polarized debate. AI “push button, forget it” workflows can work well for casual development and can be genuinely transformative, enabling non-programmers to produce demos and prototypes. But as you move into business and especially acute domains, “mostly works” becomes unacceptable; verification and validation rigor must rise.

This is the lens I recommend for practitioners and leaders: decide explicitly which parts of your stack are “casual,” which are “business,” and which are “acute,” then align your AI usage and verification investment accordingly.
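
One way to make that alignment explicit is to write it down as a policy that tooling can check. The sketch below is purely illustrative: the class names, the three gates, and the values in the table are my assumptions, not something Meyer prescribes, and a real organization would set them from its own risk analysis.

```python
from dataclasses import dataclass
from enum import Enum


class SoftwareClass(Enum):
    ACUTE = "A"
    BUSINESS = "B"
    CASUAL = "C"


@dataclass(frozen=True)
class VerificationPolicy:
    human_review_required: bool
    automated_tests_required: bool
    formal_verification_expected: bool


# Hypothetical policy table: rigor rises from Casual to Acute.
POLICY = {
    SoftwareClass.CASUAL: VerificationPolicy(False, False, False),
    SoftwareClass.BUSINESS: VerificationPolicy(True, True, False),
    SoftwareClass.ACUTE: VerificationPolicy(True, True, True),
}


def may_merge(cls: SoftwareClass, reviewed: bool, tests_passed: bool,
              proofs_discharged: bool) -> bool:
    """Return True only if every gate required for this class has been met."""
    policy = POLICY[cls]
    return (
        (reviewed or not policy.human_review_required)
        and (tests_passed or not policy.automated_tests_required)
        and (proofs_discharged or not policy.formal_verification_expected)
    )


# AI-generated change with review and tests but no proof:
print(may_merge(SoftwareClass.BUSINESS, True, True, False))  # True
print(may_merge(SoftwareClass.ACUTE, True, True, False))     # False
```

The value of a table like this is less in the code than in forcing the classification conversation to happen before the AI tooling is switched on.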


The “probable vs provable” clash: why statistics-based generation collides with engineering rigor

Slide titled “Stochastic Foundations” showing “P(x) — Probability over Provenance” and the language-model objective “L = Σ log P(wᵢ | w<ᵢ)”, noting that LMs predict next tokens from context rather than semantic truth.
Probability over provenance: language models optimize next-token prediction.

“Hallucinations… are not accidental… they’re not a bug. They’re a feature… [modern AI is] probability and statistics based.”

“Assume… modules… correct with probability 99.9%… [with] 5,000 modules… probability of less than 1%.”

– Bertrand Meyer (ACM, 2026)

This is the conceptual heart of the talk. Meyer argues that modern AI systems (LLMs especially) are built on probability and statistics: they generate the most likely answer, not a guaranteed-correct one. “Hallucination,” in this view, is not a temporary bug; it is a structural consequence of probabilistic generation (even if rates improve).

Slide contrasting “probable” outputs from probabilistic AI with the need for “provable” assurance; notes compounding risk across many modules
Probable vs provable: scaling risk

He reinforces the point with a compositional argument (attributed to Dijkstra): even if individual components are “almost correct,” large systems multiply small error probabilities into unacceptable overall risk. In software, we don’t ship single functions; we ship systems composed of thousands of pieces. This is why “probable” is not enough: engineering needs methods that can establish “provable” properties, at least for critical behaviors.
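
The arithmetic behind the quoted figures is easy to reproduce. The sketch below assumes that module failures are independent, which is a simplification I am adding for illustration; the function name is hypothetical.

```python
def system_correctness_probability(p_module: float, n_modules: int) -> float:
    """Probability that all n modules are correct, assuming independent failures."""
    return p_module ** n_modules


# Figures from the quote: 5,000 modules, each correct with probability 99.9%.
p_system = system_correctness_probability(0.999, 5000)
print(f"{p_system:.4f}")  # 0.0067, i.e. below 1%
```

Under that assumption, a system built from 5,000 “99.9% correct” modules is entirely correct well under 1% of the time, which matches the figure in the quote.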

My pragmatic translation: treat LLM output as a powerful draft generator, not as an authority. The authority is the verification strategy and the evidence it produces.


Verification as the “right side of the V”: use AI on both creation and checking

“This duality is fundamental and if we are going to use AI we should use AI on both sides of this… V.”

– Bertrand Meyer (ACM, 2026)

Meyer highlights a fundamental duality: creating software vs verifying it. He references the V-model, not as an endorsement of rigid waterfall, but as a reminder that verification activities must match production activities. If AI increases the speed and volume of generated artifacts, verification must become even more systematic.

He also notes that reality is fractal/iterative: a “V within a V,” repeated at different granularities. This aligns with modern engineering: short cycles, continuous integration, regression suites, and iterative specification refinement.

For leaders, the operational point is that AI doesn’t remove the need for the right side of the V; it increases the throughput pressure on it. If verification capacity doesn’t scale, risk accumulates invisibly until it becomes incident-driven work.

“Specification, write code, verify? No, this is far too rigid. It’s not the way things work in practice.”

– Bertrand Meyer (ACM, 2026)

Why formal methods struggled and how iterative contracts can make them workable now

Meyer argues that one reason formal methods have struggled to gain broader adoption is process mismatch: many formal approaches implicitly assume stable requirements and a “specify first, then implement, then prove” pipeline. Real projects don’t behave that way; requirements change, designs evolve, and proofs (like debugging) are iterative.

“…design by contract… specifications… interspersed with the code so that you can work iteratively…”

– Bertrand Meyer (ACM, 2026)

His proposed remedy is an iterative approach where specification and implementation co-evolve, with contracts placed near code (design by contract). Verification becomes a daily practice: write code and contracts, try to verify, fail, diagnose, then revise either the implementation or the specification.
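
To make “contracts placed near code” concrete, here is a minimal sketch in Python. It is an approximation under stated assumptions: Python has no built-in contract syntax, so plain assertions stand in for preconditions, invariants, and postconditions, and they are checked at run time rather than proved statically as a verifier would do. The function is my own example, not one from the talk.

```python
def integer_sqrt(n: int) -> int:
    """Return the largest integer r such that r * r <= n."""
    # Precondition: the caller's obligation.
    assert n >= 0, "precondition violated: n must be non-negative"

    r = 0
    while (r + 1) * (r + 1) <= n:
        # Loop invariant: r never overshoots the true root.
        assert r * r <= n
        r += 1

    # Postcondition: the implementation's obligation.
    assert r * r <= n < (r + 1) * (r + 1), "postcondition violated"
    return r


print(integer_sqrt(15))  # 3
print(integer_sqrt(16))  # 4
```

The iterative loop Meyer describes shows up even here: if the postcondition fails, either the loop is wrong or the specification is, and deciding which one to change is the engineering work.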

“Getting the proof to work is an iterative process… a form of debugging… debugging without execution.”

– Bertrand Meyer (ACM, 2026)

This is one of the places where I see a real opportunity for AI: not to “replace the prover,” but to reduce the friction in writing and maintaining specs, contracts, and tests, so more teams can adopt higher assurance practices without an all-or-nothing process overhaul.


The multi‑agent future: requirements → architecture → contracts → code → tests → proof → DevOps (human‑orchestrated)

“People right now are thinking very much of coding agents… but… coding is just one small part of the puzzle.”

– Bertrand Meyer (ACM, 2026)

Meyer returns to his opening teaser: if we are serious about correctness in an AI-accelerated world, we must use AI across the lifecycle, not only for code. That implies specialized agents for distinct responsibilities:

  • Requirements support (including glossary generation and constraint discovery)
  • Architecture support (component boundaries, interfaces, tradeoffs)
  • Contract/spec agent (from code to contracts and from contracts to code)
  • Coding agent
  • Test-generation agent (a “big unsolved issue,” labor-intensive today)
  • Testing agent (running, triaging, analyzing failures)
  • Proof/verification agent
  • DevOps/deployment agent

Multi-agent system with a software engineer at the center.

The essential governance concept is the human-in-the-center: hallucinations persist, tools are imperfect, and responsibility remains with the engineer. Therefore, orchestration requires competence and oversight.

In other words: if we build agentic workflows, we should treat each agent like a junior contributor—fast, helpful, sometimes wrong—whose output must be reviewed and verified in proportion to risk.
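
A minimal sketch of what that looks like as a workflow: each stage produces an artifact, and a review gate decides whether anything downstream may build on it. Everything here is hypothetical, including the `Artifact` type, the `run_pipeline` function, and the stub agents, which are simple string functions standing in for real tools. The reviewer callback represents the accountable human engineer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Artifact:
    stage: str
    content: str
    approved: bool = False


Agent = Callable[[str], str]           # takes upstream content, returns a draft
Reviewer = Callable[[Artifact], bool]  # the accountable human's decision


def run_pipeline(stages: List[Tuple[str, Agent]], request: str,
                 review: Reviewer) -> List[Artifact]:
    """Run stages in order; stop at the first artifact the reviewer rejects."""
    artifacts: List[Artifact] = []
    upstream = request
    for name, agent in stages:
        artifact = Artifact(stage=name, content=agent(upstream))
        artifact.approved = review(artifact)
        artifacts.append(artifact)
        if not artifact.approved:
            break  # nothing downstream builds on unapproved output
        upstream = artifact.content
    return artifacts


# Stub agents for illustration only.
stages = [
    ("requirements", lambda text: text + " | requirements drafted"),
    ("contracts", lambda text: text + " | contracts drafted"),
    ("code", lambda text: text + " | code drafted"),
    ("tests", lambda text: text + " | tests drafted"),
]


def reviewer(artifact: Artifact) -> bool:
    # Stand-in for a human decision: reject the code stage in this run.
    return artifact.stage != "code"


result = run_pipeline(stages, "export patient report as PDF", reviewer)
print([(a.stage, a.approved) for a in result])
# [('requirements', True), ('contracts', True), ('code', False)]
```

The design choice worth noting is the early stop: a rejected artifact halts the chain, so later agents never amplify an upstream mistake.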


Conclusion

What changes for careers, outsourcing, and quality

“Quality is going to be key because… the faster we generate the software the faster we also generate bugs…”

– Bertrand Meyer (ACM, 2026)

Meyer closes by answering the L vs E question. For casual development, AI is leveling: more people can build working artifacts quickly, reducing the value of low-skill outsourcing and routine coding tasks. For business and acute development, however, AI is enhancing: it amplifies the advantage of engineers who combine software engineering fundamentals with AI literacy and verification discipline.

He emphasizes personal responsibility: even if you “accept all,” you own the outcomes. And he warns that acceleration increases the bug creation rate unless quality control keeps pace, turning verification into the strategic differentiator.

For my network, I’ll summarize it this way: GenAI adoption without a coherent verification strategy is just accelerating uncertainty. GenAI adoption with a verification strategy is an opportunity to deliver faster and raise quality, especially in the systems we can’t afford to get wrong.

Summary – by Bertrand Meyer


  • AI support for Software Engineering is here to stay
  • “C” class development can be essentially automated

For A and B software:

  • AI is not yet ready for a full SE process
  • There is a culture clash between the creativity of AI and the rigor of Software Engineering
  • Hallucinations remain a major problem
  • The only sure way is formal verification or, alternatively, well-planned V&V
  • We need a clear process
  • Real software development is iterative; stay Agile
  • We are only at the beginning

Strategies for Individuals


  • The irruption of AI into SE is both:
    • Type-L for C-class developments (Casual)
    • Type-E for the rest (Business and Acute)
  • Unclear prospects for low-skill outsourcing
    • AI is going to be cheaper than the cheapest programmers
  • Excellent prospects for companies to reclaim ownership (less outsourcing, more ownership of culture)
  • Every software engineer must master Modern-AI
  • You are still responsible for the code!
  • Quality is key, particularly correctness, robustness, extendibility, reusability, “maintainability”
  • Lessons of software engineering still apply (More than ever)

“Quicksort is still quick sort; design patterns are still design patterns; correctness rules are object-oriented programming abstraction; everything still remains there’s no magic which makes these matters and these issues disappear.”

– Bertrand Meyer (ACM, 2026)

Bertrand Meyer

Professor at ETH Zurich and CTO of Eiffel Software and Recognyze AI

Bertrand Meyer is Professor of Software Engineering and Provost at the Schaffhausen Institute of Technology in Switzerland and CTO of Eiffel Software (based in Santa Barbara). He is the author of several well-known books on software topics, particularly object technology, programming languages, software verification, and agile methods. He is a recipient of the ACM Software System Award and the IEEE Harlan Mills prize and an ACM Fellow. His previous ACM TechTalks were devoted to Design by Contract, agile methods, and concurrent programming.
