AI Slop in AWS Practice Exams: Why It Happens and How to Spot a Provider Who Validates

Two separate Reddit threads this week called out factual errors in paid AWS practice exams. One flagged Stephane Maarek. One pointed at Frank Kane. In both cases the explanation contradicted itself within the same paragraph - the kind of confident-but-wrong output that has become the fingerprint of AI-generated content shipped without a validation step. Honest answer up front: the problem is not that these providers are using AI. The problem is that they are using AI without a validation pipeline behind it. Anyone scaling cert-prep content in 2026 (and at this point, that is nearly everyone) faces the same failure mode. The differentiator is what happens after the model finishes generating.

ReadRoost Team

Study & certification team

May 3, 2026 · 7 min read

What this week's Reddit threads actually showed

The Stephane Maarek thread (45 upvotes, 15 comments) showed a question where the answer key marked one option as correct and the explanation paragraph then argued for a different option. The Frank Kane thread (11 upvotes, 4 comments) showed the same failure pattern: an answer that contradicts its own explanation. In the TutorialsDojo discussion thread (16 comments), multiple candidates corroborate that all three major paid providers have shipped questions like this in the last six months.

The community word for it is AI slop. The technical name is unverified LLM output. Both names point to the same underlying issue: a large language model generated a question and an explanation, the output went through no second pass, and the contradictions that LLMs reliably produce when they hallucinate were left in the corpus.

If you are a candidate paying $15-30 for a practice exam set and drilling these questions for hours, you are not just wasting money. You are actively training yourself on wrong information. That is worse than not drilling at all.

Why "just don't use AI" is the wrong answer

The intuitive reaction is to find a provider that does not use AI. There are two problems with that.

First, you cannot reliably tell. The cert-prep industry quietly transitioned to AI-assisted content production over the last 18-24 months because hand-writing 500-question banks per cert track at the rate exams update is not commercially viable. Most providers are using AI in some part of their pipeline now, whether they say so or not. Some are honest about it. Some are not. The ones who say they have only "human-written" content are usually either lying or running on a backlog from 2022 that is rapidly going stale as exam versions update.

Second, hand-written content has its own failure modes. A single human author writing 500 questions across eight CISSP domains will get tired by question 200, will reuse phrasings, will introduce factual drift in domains they personally are weaker in, and will not catch their own mistakes on the second pass. Hand-written is not synonymous with accurate.

The honest framing is: the question is not whether AI is involved. The question is what the validation pipeline looks like.

What a real validation pipeline does

Treat any cert-prep provider's content like you would treat a Wikipedia article. The text might be right. It is more trustworthy when you can see the citation back to an authoritative source, and you can verify it yourself in 10 seconds.

A validation pipeline that catches AI slop has at least four stages:

1. Generate. A model produces the candidate question, the distractors, the correct answer, and the explanation. This is the part Maarek and Frank Kane appear to be doing - and stopping at.

2. Cross-check against the authoritative source. A second model (or the same one in a different role) reads the official documentation for the cert in question and verifies that the answer and explanation match the source. AWS-SAA-C03 questions get checked against AWS Skill Builder and the AWS docs. Azure questions get checked against Microsoft Learn. CompTIA questions get checked against the CompTIA exam objectives PDF. Anything in the explanation that cannot be traced back to a sentence in the source is flagged.

3. Detect contradictions and unsupported claims. The validator reads the question stem and the explanation as a single passage and checks for internal contradictions ("the answer is A because of X" followed by "X means option B is correct"). This is the exact failure mode the Reddit threads called out, and it is mechanically detectable when you actually look for it.

4. Cite the source. The final question carries a reference back to the section of the official material it was validated against. If you challenge an answer, the citation tells you exactly where to look. If the citation does not exist or does not actually contain the claim, the question is held back until a human reviews it.

Each of these stages is cheap to run as a separate pass. Skipping them is what produces AI slop. Running them is what produces practice questions you can actually trust.
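Stage 3 is the easiest one to sketch. The Python below is a minimal illustration, not any provider's actual validator. It assumes explanations refer to options by letter, and the function name `find_contradictions` and both regex patterns are hypothetical. It collects every option letter the explanation asserts is correct and reports any that disagree with the answer key.

```python
import re

# Hypothetical, deliberately narrow patterns for "this option is correct" claims.
# The keywords are matched case-insensitively; the option letter must be uppercase
# so that "the answer is a load balancer" is not read as option A.
_CLAIM_PATTERNS = [
    re.compile(r"(?i:\bthe\s+answer\s+is\s+(?:option\s+)?)([A-E])\b"),
    re.compile(r"(?i:\boption\s+)([A-E])(?i:\s+is\s+(?:the\s+)?(?:correct|right|best)\b)"),
]


def find_contradictions(answer_key: str, explanation: str) -> list[str]:
    """Return option letters the explanation calls correct that differ from the key."""
    claimed = set()
    for pattern in _CLAIM_PATTERNS:
        claimed.update(match.group(1) for match in pattern.finditer(explanation))
    return sorted(claimed - {answer_key})


if __name__ == "__main__":
    text = "The answer is A because of X. X means option B is correct."
    print(find_contradictions("A", text))  # the explanation also argues for B
```

A regex pass like this only catches contradictions stated by letter. A paraphrased contradiction still needs a second model or a human reading the stem and the explanation together.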

ReadRoost's pipeline, in plain language

We use AI. We are honest about it because the alternative - hand-waving about "expert-verified" content while AI is in the pipeline anyway - is the same provenance lie that ends careers in this industry. Here is what we actually do, end to end.

Generation: Kimi K2 (Moonshot AI's model) produces the question and explanation. We chose Kimi K2 because the failure modes it produces are different from the failure modes Claude Opus produces, which means Opus reading Kimi's output catches more issues than Opus reading its own output would.

Validation: Claude Opus reviews each question against the official learning materials for that exam. AWS questions get checked against AWS Skill Builder and the AWS docs. Azure questions against Microsoft Learn. CompTIA against the exam objectives PDF. AI/ML questions against the relevant cloud provider's AI documentation.

Flagging: Anything Opus cannot verify against a citable source gets flagged. Flagged questions go to a manual queue. They are not in the live question bank.

Citation: Every question that ships carries an internal reference to the source paragraph it was validated against. If a candidate challenges the answer, we can show them exactly where in the official material the answer comes from.

High-stakes review: For domains where being wrong has career consequences (CISSP, CCSP, AWS Security Specialty, anything involving incident-response procedures), a CISSP-certified human reviewer reads the questions before they ship. This is the only stage that is fully manual, and it is the stage where the editorial position - what counts as the BEST answer for CISSP, for example - lives.
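The routing rules described above can be written down in a few lines. This is a sketch under stated assumptions, not ReadRoost's production code: the `ValidatedQuestion` fields, the `route` function, and the three queue names are hypothetical, and the high-stakes set lists only the exams named in this article.

```python
from dataclasses import dataclass
from typing import Optional

# Exams named in the article as needing a human reviewer before shipping.
HIGH_STAKES_EXAMS = {"CISSP", "CCSP", "AWS Security Specialty"}


@dataclass
class ValidatedQuestion:
    exam: str
    citation: Optional[str]  # reference to the source paragraph, or None if missing
    claim_supported: bool    # validator's verdict: does the cited text contain the claim?


def route(question: ValidatedQuestion) -> str:
    """Decide where a question goes after the validation pass."""
    # Anything without a citable, supporting source stays out of the live bank.
    if not question.citation or not question.claim_supported:
        return "manual_queue"
    # High-stakes domains get a human read even when validation passed.
    if question.exam in HIGH_STAKES_EXAMS:
        return "human_review"
    return "live"
```

The ordering is the design choice worth noting: the citation check runs first, so an unverifiable question never reaches the live bank, whatever the exam.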

We have shipped questions with errors. We will ship more. The honest claim is not that we are perfect; it is that the validation pipeline catches the kind of contradictions Maarek's audience caught this week. When we miss something, we update the question and the validation criteria so the same class of error gets caught next time.

How to evaluate any practice question provider

If you are choosing between providers right now, ask the questions that distinguish a real validation pipeline from a marketing claim:

1. Where does this answer come from? Pick a question. Read the explanation. Check whether you can verify the claim against the official documentation in 30 seconds. If you cannot find the source, that is a yellow flag. If you find the source and it disagrees with the explanation, that is a red flag.

2. What does the provider say about AI? A provider who says "we use AI for X, and here is our validation step Y" is more trustworthy than a provider who says "we are 100% human-written and we promise." The first answer is verifiable. The second is faith-based marketing in an industry that has been quietly running AI pipelines for two years.

3. What is the update cadence when an exam version changes? AWS SAA-C03 had its last refresh in late 2025. Azure AZ-104 had a content update in early 2026. If a provider is still selling questions referencing retired services or deprecated features six months later, the pipeline behind that content is broken regardless of whether AI is involved.

4. How do they handle reported errors? A good provider lets you flag a question, includes the source they validated against in their response, and updates the question publicly with a note about the change. A bad provider deletes your forum post.
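The update-cadence check in point 3 can be partly automated if you keep your own list of retired service names. The sketch below assumes you maintain that list yourself from the official announcements. `find_stale_questions` and the placeholder term `OldService` are hypothetical, not real AWS or Azure names.

```python
import re


def find_stale_questions(questions: dict[str, str], retired_terms: list[str]) -> dict[str, list[str]]:
    """Map question id -> retired terms mentioned in that question's text."""
    hits = {}
    for question_id, text in questions.items():
        found = [
            term
            for term in retired_terms
            # Whole-term, case-insensitive match so "OldService" does not hit "OldServiceV2".
            if re.search(rf"(?<!\w){re.escape(term)}(?!\w)", text, re.IGNORECASE)
        ]
        if found:
            hits[question_id] = found
    return hits


if __name__ == "__main__":
    bank = {
        "q1": "Use OldService to store the objects.",
        "q2": "Place the instances behind a load balancer.",
    }
    print(find_stale_questions(bank, ["OldService"]))
```

A hit is a prompt to go and read the question, not proof that it is wrong. Some questions mention a retired feature precisely to test that you know it is retired.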

What to do this week

If you are mid-prep on AWS SAA-C03 or DVA-C02 and have been using Maarek or Frank Kane, do not panic-bin them. The bulk of their content is still useful. But:

Spot-check 10 questions across the domains you are about to sit. For each, find the cited source in the AWS documentation. If you cannot find the source, or you find it and it disagrees, mark that question as suspect and move on. Do not memorise contested answers.

Cross-validate with a second provider's question on the same topic. If both say the same thing, you are probably safe. If they disagree, defer to the one citing the official AWS docs.
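If you want the spot-check to cover every domain and not just the questions you happen to remember, pick the sample mechanically. This is a minimal sketch: `spot_check_sample` is a hypothetical helper, and it assumes you can export question ids tagged with their exam domain.

```python
import random
from collections import defaultdict


def spot_check_sample(questions, n=10, seed=None):
    """Pick up to n question ids, spread as evenly as possible across domains.

    questions: iterable of (question_id, domain) pairs.
    """
    rng = random.Random(seed)
    by_domain = defaultdict(list)
    for question_id, domain in questions:
        by_domain[domain].append(question_id)

    # Shuffled copy of each domain's ids, so the picks within a domain are random.
    pools = [rng.sample(ids, len(ids)) for ids in by_domain.values()]

    picked = []
    # Round-robin: every domain gets one pick before any domain gets a second.
    while len(picked) < n and any(pools):
        for pool in pools:
            if pool and len(picked) < n:
                picked.append(pool.pop())
    return picked


if __name__ == "__main__":
    bank = [(f"q{i}", f"domain-{i % 4}") for i in range(40)]
    print(spot_check_sample(bank, n=10, seed=1))
```

Record a verdict next to each sampled id (source found and agrees, source not found, source disagrees) so you know which answers not to memorise.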

If you want a question bank built around this validation discipline from the start, our SAA-C03 pack carries 500+ questions through the pipeline above. The free preview shows the first 20 questions, including the citations. Judge for yourself whether the validation holds up.

Frequently Asked Questions

Is Stephane Maarek's course still worth buying for SAA-C03?

The video course is still excellent. The video content has not been called out for accuracy issues - this is specifically about the practice exam sets bundled alongside the course. You can buy the video course, and pair it with a different practice exam provider where you can verify the citations. Many candidates have been doing this for the last six months.

How do I tell if a provider is using AI in their pipeline?

You usually cannot tell from the marketing copy because providers who use AI without admitting it have an incentive not to. The proxy signals are: are explanations cited back to the official documentation? Is the update cadence keeping up with exam version changes? When you flag an error, do they cite a source in their response? A provider doing those three things probably has a real validation pipeline whether or not AI is involved.

Why is ReadRoost transparent about using Kimi K2 and Claude Opus?

Because cert prep is a reputation business and getting caught in a provenance lie is unrecoverable. The Maarek and Frank Kane threads this week prove the cert-candidate community is sharp enough to spot AI failure patterns. If we shipped under a 'human-verified' marketing claim and a careful candidate caught a wrong answer, the brand damage would be worse than the failure itself. Honest is also better marketing in a market that is rapidly losing patience with provenance theatre.

If everyone uses AI, does the validation pipeline really matter?

Yes, more than ever. When everyone is generating with the same set of foundation models, the differentiator is downstream of generation: which validator do they use, what authoritative sources does it check against, what is the flag rate for unsupported claims, and what is the update cadence when the exam changes. Generation is becoming a commodity. Validation is where quality comes from.

What if I find an error in a ReadRoost question?

Report it via the in-platform feedback button on any question. We respond with the source we validated the question against, and update the question and the validation criteria publicly so the same class of error gets caught next time. We have done this more than once. We will do it again.

Why use Kimi K2 instead of just Claude Opus for generation?

Different models produce different failure modes. Claude Opus reading Claude Opus's own output is a weaker validator than Claude Opus reading a different model's output, because Opus has correlated blind spots with its own generation. Using Kimi K2 for generation and Opus for validation means the validator catches errors the generator made that a same-family validator would miss. This is a standard ensemble-validation pattern.
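The arithmetic behind this is simple. An error ships only when the generator gets something wrong and the validator then fails to notice. The rates below are invented purely for illustration, not measurements of Kimi K2, Claude Opus, or any other model, and `shipped_error_rate` is a hypothetical helper.

```python
def shipped_error_rate(p_generator_error: float, p_validator_miss: float) -> float:
    """P(error ships) = P(generator errs) * P(validator misses it, given the error)."""
    return p_generator_error * p_validator_miss


if __name__ == "__main__":
    # Illustrative assumption: the generator is wrong on 5% of questions.
    p_error = 0.05
    # Illustrative assumption: a same-family validator shares blind spots and
    # misses half of those errors; a different-family validator misses a fifth.
    same_family = shipped_error_rate(p_error, 0.5)
    cross_family = shipped_error_rate(p_error, 0.2)
    print(f"same-family validator:  {same_family:.1%} of questions ship with an error")
    print(f"cross-family validator: {cross_family:.1%} of questions ship with an error")
```

The generator is no better in the second case. The improvement comes entirely from the lower miss rate, which is why uncorrelated blind spots matter more than raw model quality here.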

How can I verify a single question against the official source myself?

For AWS: search the AWS documentation site for the service named in the question. Most authoritative answers are in the user guide for the relevant service or in AWS Skill Builder's official course material. For Azure: Microsoft Learn (`learn.microsoft.com`) is the primary source - search for the service and the specific feature mentioned. For CompTIA: the official exam objectives PDF for your exam version is the authoritative reference. If a practice provider's explanation cannot be reconciled to those sources in two minutes of searching, treat the answer as suspect.
