Quick summary
CRISP-DM (Cross-Industry Standard Process for Data Mining) is a six-phase framework that covers the full lifecycle of a data science project, from business understanding through deployment. First published in 1999, it has since become the most widely used methodology for data science projects worldwide. Its real value for modern organizations is not the framework itself, but what it helps prevent: technically sound projects that never translate into business impact.
Key takeaways:
- CRISP-DM consists of six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
- According to a systematic literature review in Procedia Computer Science (2021, via ScienceDirect), it is the de facto standard for data mining projects across industries and tools.
- The EU AI Act (in force since 1 August 2024, with obligations phasing in over the following years) sets data quality requirements for high-risk AI systems that closely align with the CRISP-DM phases of data understanding and data preparation.
- In practice, teams get the best results when CRISP-DM is combined with agile project management.
- Twentynext uses CRISP-DM as its guiding process model because it keeps the business challenge front and center rather than the technology.
Introduction
Picture a data manager at a mid-sized manufacturing company. The business decides to invest in a predictive maintenance model. Three months later, an external agency delivers a model with impressive technical accuracy. But once implementation begins, the maintenance planners do not understand the output, the production data pipeline turns out to be inconsistent, and the business objective has shifted along the way. The model ends up gathering dust.

This plays out in organizations everywhere. Not because the data science is weak, but because the project process lacks structure. That is where CRISP-DM proves its value: it forces teams to define the business goal before they touch the data, and to check at the end whether that goal was actually achieved.
Twentynext uses CRISP-DM as the backbone of its data science projects precisely because it links the business question to the technical execution. As more organizations move into generative AI, that connection becomes even more important: an AI application without a clear business context is an expensive demo, not a sustainable solution. See also how data-driven decision-making consistently breaks down without that link.
Why business understanding is the most underrated phase
The first CRISP-DM phase, business understanding, shapes everything that follows. Yet it is also the phase teams are most likely to skip, rush, or treat as a formality.

The trap of solving the wrong problem
An operations manager at a logistics company asks for a model that predicts delivery times. Fair enough. But what is the goal behind that prediction? Is it to proactively update customers? Optimize workforce planning? Avoid penalty costs for missed deadlines? The answer determines which data matters, what level of accuracy is acceptable, and how the output should fit into existing workflows. CRISP-DM forces that conversation before the first data task even starts.
As Data Science PM puts it, a good project starts with a thorough understanding of what the customer is actually trying to achieve. That sounds obvious, but in practice teams often jump straight into the data the moment access is available.
Business criteria as the benchmark for success
CRISP-DM also requires teams to define measurable business criteria in the first phase. These are not technical metrics such as AUC or RMSE, but questions like: how much error is acceptable before decisions start going wrong in practice? Which KPIs should improve if the model works? That prevents project teams from celebrating technical success while business value never materializes.
Twentynext sees this pattern all the time in organizations taking their first steps into data science: the focus goes to the technology, not the business problem. Following CRISP-DM systematically flips that order.
Get started yourself:
- Write the business question in one sentence without using technical terms. If you cannot do that, the question is still too vague.
- Define at least two measurable success criteria in business language, not model performance metrics.
- Make sure all stakeholders agree on what success looks like before any data work begins.
- Identify which decision in the organization will change if the model works. If there is no clear decision, stop the project.
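The checklist above can be captured in a small data structure so that the "ready to start" check is explicit rather than implied. The following is only a minimal sketch: the class names `SuccessCriterion` and `ProjectCharter`, their fields, and the example values are hypothetical illustrations, not part of CRISP-DM itself or of any Twentynext tooling.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessCriterion:
    """One success criterion in business language (hypothetical structure)."""

    description: str               # plain wording, no model metrics
    kpi: str                       # KPI expected to move if the model works
    target: float                  # agreed target value for that KPI
    higher_is_better: bool = True

    def is_met(self, observed: float) -> bool:
        """Compare an observed KPI value with the agreed target."""
        if self.higher_is_better:
            return observed >= self.target
        return observed <= self.target


@dataclass(frozen=True)
class ProjectCharter:
    """Phase 1 output: question, affected decision, success criteria."""

    business_question: str
    decision_that_changes: str
    criteria: tuple[SuccessCriterion, ...]

    def ready_to_start(self) -> bool:
        """Gate from the checklist: a question, a decision, two or more criteria."""
        return (
            bool(self.business_question.strip())
            and bool(self.decision_that_changes.strip())
            and len(self.criteria) >= 2
        )


# Illustrative values only, loosely based on the delivery-time example.
charter = ProjectCharter(
    business_question="Can we warn customers earlier about late deliveries?",
    decision_that_changes="When customer service sends a delay notice",
    criteria=(
        SuccessCriterion(
            "Fewer missed-deadline penalties",
            "penalties_per_month",
            10,
            higher_is_better=False,
        ),
        SuccessCriterion("More customers warned in advance", "share_warned", 0.8),
    ),
)
print(charter.ready_to_start())
```

A charter with a blank decision or fewer than two criteria fails the gate, which mirrors the advice to stop the project when no clear decision depends on the model.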
Data preparation: the phase that takes the most time and gets the least attention
Data scientists commonly spend more than half of a project's time collecting, cleaning, and transforming data. Even so, data preparation is still the phase most often underestimated in project planning.
Data is rarely where you expect it to be
A typical example: a BI manager at a professional services firm wants to analyze customer behavior over the past three years. The data turns out to be spread across a CRM system, a billing platform, and several Excel files maintained by account managers. Just consolidating those sources takes weeks, and the quality differences between systems require extensive validation.
CRISP-DM surfaces these issues during the data understanding phase and sends the team back to the business phase if the data quality falls short of the agreed business criteria. That is not a weakness of the framework, but one of its strengths: it is far better to reset the project than to build a model on unreliable data.
The link to the EU AI Act
This matters even more now that the EU AI Act (Regulation (EU) 2024/1689) is in force. Article 10 requires providers of high-risk AI systems to use high-quality datasets for training, validation, and testing. The required attention to data collection processes, data preparation, and potential bias maps directly to the CRISP-DM phases of data understanding and data preparation. Organizations that apply CRISP-DM consistently are, in effect, building the kind of documentation regulators are increasingly likely to expect.
Get started yourself:
- Map all data sources before modeling begins. A simple inventory table with source, format, owner, and update frequency is enough.
- Set a minimum quality threshold for each data source: for example, how many missing values are acceptable?
- Document all transformations and merge steps in a data dictionary. This speeds up later iterations and makes auditing easier.
- Compare the available data history to the time horizon of the business question. If you are using two years of data to support five-year decisions, that assumption needs to be made explicit.
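The inventory and quality-threshold steps above could look roughly like this in code. It is a sketch under stated assumptions: `DataSource`, `missing_share`, and `passes_threshold` are hypothetical names, records are plain dictionaries, and "missing" is taken to mean `None` or an empty string. Real projects will usually need a richer definition.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """One row of the source inventory (hypothetical structure)."""

    name: str
    fmt: str
    owner: str
    update_frequency: str
    max_missing_share: float  # agreed threshold, e.g. 0.10 means 10%


def missing_share(rows: list[dict], required_fields: list[str]) -> float:
    """Share of required cells that are None or an empty string."""
    total = len(rows) * len(required_fields)
    if total == 0:
        # An empty extract should be investigated, not silently accepted.
        raise ValueError("no rows or no required fields to check")
    missing = sum(
        1
        for row in rows
        for field in required_fields
        if row.get(field) in (None, "")
    )
    return missing / total


def passes_threshold(
    source: DataSource, rows: list[dict], required_fields: list[str]
) -> bool:
    """True if the source stays within its agreed missing-value threshold."""
    return missing_share(rows, required_fields) <= source.max_missing_share


# Illustrative data only.
crm = DataSource("CRM", "SQL export", "Sales operations", "daily", 0.10)
crm_rows = [
    {"customer_id": 1, "segment": "A"},
    {"customer_id": 2, "segment": ""},
    {"customer_id": 3, "segment": "B"},
]
print(passes_threshold(crm, crm_rows, ["customer_id", "segment"]))
```

In this example one of six required cells is empty, so the source exceeds its 10% threshold and would be flagged before modeling begins.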
Evaluation and deployment: the weakest point in most data science projects
A systematic literature review published in Procedia Computer Science (2021) found that most published CRISP-DM studies do not include a deployment phase. That says a lot: even the community that uses CRISP-DM most often tends to stop short of the final step.

A model without deployment has no business value
CRISP-DM intentionally treats evaluation as a two-layer exercise: first technical model evaluation (does the model hold up statistically?), then business evaluation (does it solve the original business problem?). Those questions are not the same. A model can perform brilliantly on paper and still answer the wrong business question, or present its output in a way users cannot act on.
Twentynext explicitly connects evaluation to deployment in client projects. More information about how Twentynext approaches data science projects is available via the projects page. End-user requirements are central to that process: how does the model output fit into the workflow? Who makes decisions based on it? How will the model be maintained when the underlying data changes?
CRISP-DM is cyclical, not one-and-done
One of the most important features of CRISP-DM is that it is cyclical. Once a model is deployed, the process starts again: lessons learned about data, assumptions, and user behavior feed into the next iteration. That makes it especially well suited to AI applications where models need to be retrained regularly on new data.
For clients in the Brainport region working with complex production data or knowledge-intensive processes, Twentynext applies this principle by building in a post-deployment review with end users, not just with the IT team.
Get started yourself:
- Schedule the business evaluation separately from the technical model evaluation. Involve a domain expert who can test the outcome against the success criteria defined in phase 1.
- Document at least three concrete usage scenarios: who uses the output, when, and what action follows.
- Define a monitoring cadence at deployment. For most predictive models, a quarterly review is a sensible starting point, depending on how quickly the underlying data changes.
- Use feedback from end users during the first six weeks after deployment as input for the second CRISP-DM cycle.
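The two-layer evaluation and the monitoring cadence can be made explicit with a few lines of code. This is a hedged sketch, not a prescribed implementation: `EvaluationResult`, `next_review`, the routing labels, and the 91-day default (roughly one quarter) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class EvaluationResult:
    """Outcome of the two evaluation layers (hypothetical structure)."""

    technical_ok: bool  # does the model hold up statistically?
    business_ok: bool   # does a domain expert confirm the phase 1 criteria?

    @property
    def next_step(self) -> str:
        """Route the project based on both layers, technical check first."""
        if not self.technical_ok:
            return "back to modeling"
        if not self.business_ok:
            return "back to business understanding"
        return "deploy"


def next_review(deployed_on: date, cadence_days: int = 91) -> date:
    """Date of the next monitoring review; 91 days approximates a quarter."""
    return deployed_on + timedelta(days=cadence_days)


# A model that is statistically sound but misses the business goal.
result = EvaluationResult(technical_ok=True, business_ok=False)
print(result.next_step)
print(next_review(date(2025, 1, 6)))
```

Keeping the two flags separate makes it harder to sign off a deployment on technical metrics alone.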
CRISP-DM versus other approaches: which one fits when?
| Feature | CRISP-DM | SEMMA (SAS) | KDD | Agile Data Science |
|---|---|---|---|---|
| Starting point | Business question | Data sampling | Data selection | Sprint planning |
| Business focus | High (phase 1 is mandatory) | Low (starts with data) | Low (academic) | Variable |
| Built-in iteration | Cyclical by design | Limited | Limited | High |
| Suitable for AI Act compliance | Yes (audit trail) | Partly | No | Depends on implementation |
| Team size | Small to mid-sized | Small | Small | Mid-sized to large |
| Adoption barrier | Low | Low (tool-specific) | High | Medium |
CRISP-DM is a process model, not a full project management framework. It implicitly assumes a small, close-knit team working together. For larger teams, practitioners generally recommend combining CRISP-DM with an agile coordination method such as Scrum or Kanban. In practice, that combination of a shared iteration rhythm and the analytical depth of CRISP-DM tends to produce the most reliable results.
Practical example: CRISP-DM in a generative AI implementation
Imagine an IT manager at a mid-sized professional services firm with around 150 employees. The company wants to use generative AI to speed up contract analysis. The ambition sounds clear enough: legal staff should be able to review contracts faster. But what does faster actually mean? Which risks must the model flag? What kind of output is unacceptable?

Without CRISP-DM, a team might jump straight into choosing a language model and collecting contracts. With CRISP-DM, the work starts with a structured business understanding session: which decision is being accelerated, what is the acceptable error rate if a risky clause is missed, and how will the output be integrated into the workflow?
The data understanding phase reveals that contracts exist in three formats, one of which is only available as scanned documents. That means data preparation now includes an OCR step that no one originally planned for. During evaluation with legal end users, the team learns that the summary is useful, but the risk classification needs an additional validation layer before staff will trust it.
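A quick format inventory during data understanding is one way such a surprise could surface early. The sketch below assumes, purely for illustration, that scanned contracts can be recognized by an image file extension. In practice scanned PDFs also exist and would need a content-based check. The names `SCANNED_SUFFIXES`, `format_inventory`, and `needs_ocr` are hypothetical.

```python
from collections import Counter
from pathlib import PurePath

# Assumption: these extensions indicate scanned, image-only documents.
SCANNED_SUFFIXES = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}


def format_inventory(paths: list[str]) -> Counter:
    """Count documents per lower-cased file extension."""
    return Counter(PurePath(p).suffix.lower() for p in paths)


def needs_ocr(paths: list[str]) -> list[str]:
    """Return the documents that would need an OCR step before analysis."""
    return [p for p in paths if PurePath(p).suffix.lower() in SCANNED_SUFFIXES]


# Illustrative file names only.
contracts = ["nda_2021.docx", "lease.PDF", "supplier_scan.tif", "msa.pdf"]
print(format_inventory(contracts))
print(needs_ocr(contracts))
```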
That is exactly the value Twentynext sees in AI implementations built around CRISP-DM: the structure forces realistic planning and helps prevent technically functional systems from being ignored in practice. For organizations exploring a first AI rollout, the Twentynext approach to AI implementations offers a starting point that connects the business challenge and the technology from day one.
In the Netherlands, more than one in five companies with at least ten employees were already using at least one form of AI technology in 2024, according to CBS data. That makes a structured implementation approach not a nice-to-have, but a necessity if AI investments are expected to pay off.
Key insights
CRISP-DM adds three practical benefits to modern data science projects:
- Direction before execution. By making the business question the mandatory starting point, CRISP-DM prevents teams from spending months building a model that solves the wrong problem.
- Traceability. Every phase produces documented decisions. That makes projects repeatable, auditable, and easier to adapt when the business context changes.
- A realistic deployment plan. The deployment phase forces teams to think through who will use the output, how they will use it, and what needs to happen if the model is updated.
For senior professionals interested in career opportunities around this way of working, it is worth reading why data engineers choose a specialist firm like Twentynext over a large corporate environment: the CRISP-DM approach offers both structure and room for professional growth.
Frequently asked questions
What are the six phases of CRISP-DM?
CRISP-DM defines six connected and iterative phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The process is not strictly linear. Teams regularly return to earlier phases when new insights emerge. That cyclical nature makes the framework suitable for both traditional data mining and modern machine learning and AI projects.
Why is CRISP-DM still relevant in the age of generative AI?
Generative AI increases the need for structure rather than reducing it. Larger models require better data quality, clearer business criteria, and a more thoughtful deployment plan. The EU AI Act also sets explicit requirements around data quality and governance for high-risk AI systems, requirements that align directly with the data understanding and data preparation phases of CRISP-DM. Organizations that implement generative AI without a structured process face both quality and compliance risks.
How does Twentynext apply CRISP-DM in data science projects?
Twentynext uses CRISP-DM as the process backbone for data science and AI implementations, with one important emphasis: the business understanding phase is always done together with the client, not just by the technical team. That helps prevent projects from being driven by available technology instead of the actual business challenge. Clients in Brainport and beyond benefit because the end result fits existing processes and decision-making structures.
What are the main limitations of CRISP-DM?
The main limitation of CRISP-DM is that it is not a full project management framework. It assumes a small, tightly aligned team and does not provide role definitions or coordination mechanisms for larger programs. That is why practitioners often combine it with agile methods such as Scrum or Kanban. A second limitation is that the framework does not explicitly address ethics, bias, and privacy, topics that have now become essential in AI projects.
How is CRISP-DM different from SEMMA and KDD?
SEMMA (developed by SAS) starts directly with the data and focuses mainly on model building, which makes the business context less central. KDD (Knowledge Discovery in Databases) is more academic in nature and focused on discovering patterns without explicitly linking them to predefined business goals. CRISP-DM stands out because it makes the business question the mandatory starting point, so technical outcomes are always tested against an agreed business objective.
Conclusion
CRISP-DM does not guarantee a successful data science project. What it does is improve the odds by forcing the right questions at the right time. That distinction matters: teams that treat it like a checklist get far less value from it than teams that use it as a way of thinking.
For organizations that want to implement AI rather than simply experiment with it, CRISP-DM offers a structure that supports both business value and compliance. The combination of a clear business phase, documented data preparation, and a concrete deployment step makes projects repeatable and reviewable, exactly what regulators and internal stakeholders increasingly expect.
Twentynext applies this model consistently, for clients in Brainport and beyond, because experience shows that a well-structured project process delivers better outcomes than racing to build the first model.
Sources
- Data Science PM. What is CRISP DM?
- ScienceDirect (Procedia Computer Science, 2021). A Systematic Literature Review on Applying CRISP-DM Process Model.
- European Commission (digital-strategy.ec.europa.eu). AI Act | Shaping Europe's digital future. Regulation (EU) 2024/1689.
- CBS (Centraal Bureau voor de Statistiek). Increasing use of AI by business.


