Skip to main content
Back to Blog
BDD
2026-05-03

Cucumber vs Behave: Python BDD Comparison 2026

Compare Cucumber, Behave, and pytest-bdd for Python BDD. Syntax, fixtures, parallel execution, pytest integration, IDE tooling, and recommendations for Python QA teams in 2026.

Cucumber vs Behave: Python BDD Comparison 2026

Python QA teams adopting Behavior-Driven Development face an awkward choice: Cucumber, the universally-recognized BDD reference implementation, has no first-party Python port. There is no cucumber-py maintained by the Cucumber organization. Instead, Python teams pick between Behave, pytest-bdd, and Radish -- three independent projects that all claim Cucumber-like behavior with subtly different semantics. The result is real confusion among teams new to Python BDD: "should we use Cucumber" usually means "should we use Behave or pytest-bdd."

This guide settles the question once and for all. We compare the three Python BDD frameworks against each other and against Cucumber-JVM, the canonical reference. We provide real Gherkin feature files, working step definitions in each framework, pytest integration patterns, parallel execution recipes, and recommendations based on team size, project type, and reporting needs.

By the end you will understand exactly why "use Cucumber for Python" is a misnomer, what each Python BDD framework brings to the table, and how to make the right decision for your project in 2026.

Key Takeaways

  • Cucumber has no official Python implementation -- the choice is between Behave, pytest-bdd, and Radish.
  • Behave is the closest to Cucumber-JVM in syntax, hooks, and philosophy.
  • pytest-bdd embeds Gherkin scenarios in pytest and is best for teams already deep in pytest.
  • Radish is the fastest parallel runner but has the smallest community.
  • For new Python BDD projects, Behave is the safe default; pytest-bdd is a strong second.

1. Why "Cucumber for Python" Is Misleading

When teams ask about Cucumber for Python, they almost always mean one of three things: "we want Gherkin syntax in Python," "we want a Cucumber-style BDD framework that runs on Python," or "we want to share .feature files between Python and another language's Cucumber." The good news is that all three Python BDD frameworks parse standard Gherkin, so feature files written for Cucumber-JVM or Cucumber.js will execute against Behave, pytest-bdd, or Radish with only minor changes. The bad news is that step definitions are different across all three, and migrating them is not trivial.

2. Feature Files: 95% Portable Across Frameworks

Here is a feature file that works in all three Python BDD frameworks and in Cucumber-JVM with zero changes:

Feature: User registration
  As a visitor
  I want to create an account
  So that I can use the application

  Background:
    Given the application is running

  @smoke @registration
  Scenario: Successful registration
    When I register with email "new@example.com" and password "Sup3rS3cret!"
    Then the response status should be 201
    And a welcome email should be queued

  @validation
  Scenario Outline: Registration fails for invalid input
    When I register with email "<email>" and password "<password>"
    Then the response status should be <status>
    And the error message should contain "<error>"

    Examples:
      | email             | password    | status | error                |
      | not-an-email      | Sup3rS3cret!| 400    | Invalid email        |
      | valid@example.com | short       | 400    | Password too short   |
      | duplicate@x.com   | Sup3rS3cret!| 409    | Email already exists |

The differences between frameworks emerge in how you bind these steps to Python code.

3. Behave: The Reference Python BDD Framework

Behave is the oldest active Python BDD framework and the closest match to Cucumber-JVM's philosophy. It uses a global context object passed to every step function, organizes hooks in an environment.py file, and has a sensible default project structure.

# features/steps/registration_steps.py
from behave import given, when, then
import requests

@given("the application is running")
def step_app_is_running(context):
    response = requests.get(f"{context.base_url}/health")
    assert response.status_code == 200, "App health check failed"

@when('I register with email "{email}" and password "{password}"')
def step_register(context, email, password):
    context.response = requests.post(
        f"{context.base_url}/users",
        json={"email": email, "password": password},
    )

@then("the response status should be {status:d}")
def step_response_status(context, status):
    assert context.response.status_code == status, (
        f"Expected {status}, got {context.response.status_code}: {context.response.text}"
    )

@then('a welcome email should be queued')
def step_welcome_email(context):
    queue = context.email_queue.list()
    addresses = [m["to"] for m in queue]
    assert "new@example.com" in addresses

@then('the error message should contain "{text}"')
def step_error_contains(context, text):
    body = context.response.json()
    assert text in body.get("error", ""), f"Expected error to contain '{text}'"

The environment.py file centralizes setup, teardown, and tag-based hooks:

# features/environment.py
import os
from email_test_queue import EmailQueue

def before_all(context):
    context.base_url = os.getenv("BASE_URL", "http://localhost:3000")
    context.email_queue = EmailQueue.connect()

def before_scenario(context, scenario):
    context.email_queue.clear()
    if "smoke" in scenario.tags:
        context.config.userdata["timeout"] = 10

def after_scenario(context, scenario):
    if scenario.status == "failed":
        with open(f"reports/{scenario.name}.log", "w") as f:
            f.write(getattr(context, "response", "").text if hasattr(context, "response") else "")

Behave's strengths are the rich hook system, friendly tag expressions, and the simple project structure. It does have weaknesses: parallel execution requires a third-party tool (behavex), and its IDE support outside PyCharm is mediocre in 2026.

4. pytest-bdd: Gherkin Inside pytest

pytest-bdd takes a different approach: instead of running scenarios with its own runner, it generates pytest test functions from your feature files. This means scenarios benefit from pytest's enormous ecosystem -- pytest-xdist for parallel execution, pytest-html for reports, pytest-cov for coverage, all the conftest.py fixture magic.

# tests/features/registration.feature
# Same Gherkin as above, in tests/features/

# tests/test_registration.py
from pytest_bdd import scenarios, given, when, then, parsers
import pytest, requests

scenarios("features/registration.feature")

@given("the application is running")
def app_is_running(base_url):
    r = requests.get(f"{base_url}/health")
    assert r.status_code == 200

@when(parsers.parse('I register with email "{email}" and password "{password}"'))
def register(base_url, email, password, response_ctx):
    response_ctx["last"] = requests.post(
        f"{base_url}/users", json={"email": email, "password": password}
    )

@then(parsers.parse("the response status should be {status:d}"))
def response_status(status, response_ctx):
    assert response_ctx["last"].status_code == status

@pytest.fixture
def response_ctx():
    return {}

The conftest.py for the suite:

# tests/conftest.py
import os, pytest
from email_test_queue import EmailQueue

@pytest.fixture(scope="session")
def base_url():
    return os.getenv("BASE_URL", "http://localhost:3000")

@pytest.fixture(scope="session")
def email_queue():
    return EmailQueue.connect()

@pytest.fixture(autouse=True)
def reset_queue(email_queue):
    email_queue.clear()
    yield

pytest-bdd shines for teams already running pytest at scale. Adding BDD to an existing pytest suite is essentially free -- the same fixtures, the same reporting, the same CI pipeline. The cost is that the BDD experience is more "pytest with Gherkin sprinkled on top" than a pure BDD-first workflow, and pytest-bdd's parsers (parsers.parse, parsers.re, parsers.cfparse) are slightly more verbose than Behave's annotations.

5. Radish: The Performance Pick

Radish is the third option, with a smaller community but excellent runtime performance. It supports Gherkin plus extensions like preconditions and scenario loops, and has built-in parallel execution.

from radish import given, when, then

@given("the application is running")
def app_is_running(step):
    step.context.base_url = "http://localhost:3000"

@when('I register with email "{email}" and password "{password}"')
def register(step, email, password):
    import requests
    step.context.response = requests.post(
        f"{step.context.base_url}/users",
        json={"email": email, "password": password},
    )

@then("the response status should be {status:d}")
def response_status(step, status):
    assert step.context.response.status_code == status

Radish's syntax is similar to Behave but with subtle differences (the step parameter is required, context lives on it). For most teams, the smaller community is a deal-breaker, but if you have a 5,000+ scenario suite and need maximum throughput, Radish is worth evaluating.

6. Side-by-Side Feature Comparison

FeatureBehavepytest-bddRadishCucumber-JVM
Gherkin supportFullFullFull + extensionsFull
Pytest fixturesNoYesNoN/A
Parallel executionbehavex pluginpytest-xdistNativeJUnit 5 Platform
ReportingAllure, JUnit, JSONpytest-html, AllureCucumber JSONAllure, ExtentReports
IDE supportPyCharm strongPyCharm + VS CodeLimitedIntelliJ first-class
Community size (2026)LargeVery largeSmallMassive
MaturityStableStableStableVery stable
DocumentationGoodGoodSparseExcellent

7. Parallel Execution Across the Three

Sample Behave parallel run using behavex:

behavex features/ --parallel-processes 4 --parallel-scheme scenario

Sample pytest-bdd parallel run using pytest-xdist:

pytest tests/ -n 4 --dist=loadscope

Sample Radish parallel run (native):

radish features/ --parallel-runner=4

8. IDE Tooling

PyCharm Professional has excellent BDD support for Behave and pytest-bdd: step navigation, syntax highlighting, autocompletion, and integrated runners. VS Code's Cucumber extension supports Behave and pytest-bdd as of 2026 and is good enough for most workflows. Radish has only basic syntax highlighting.

9. AI Agent Integration

In 2026, AI agents like Claude and Cursor can generate Behave step definitions or pytest-bdd test files from acceptance criteria. The behave-python-bdd-complete-tutorial covers a SKILL.md pack that teaches Claude to author Behave scenarios with consistent style. See also cursor-skills-md-best-practices for how to package Python BDD conventions as a reusable skill.

10. Decision Framework

Pick Behave if: you want the closest Python equivalent to Cucumber-JVM, you have no existing pytest investment, or you value the simple environment.py hook model.

Pick pytest-bdd if: you already run pytest extensively, you want fixture-driven test setup, or you need pytest-xdist parallel execution for free.

Pick Radish if: you have a massive suite (5,000+ scenarios) and need maximum throughput, and you can accept a smaller community.

11. Real-World Adoption Patterns

Greenfield Python projects

Most greenfield Python teams in 2026 choose pytest-bdd when they're already heavily invested in pytest fixtures. The combination of pytest's parallel execution via pytest-xdist, rich fixture composition, and a thriving plugin ecosystem makes pytest-bdd the path of least resistance. Behave is chosen by teams who want a cleaner separation between BDD and unit testing or who are coming from Cucumber-JVM and want a familiar mental model.

Migrations from Behave to pytest-bdd

Several teams have publicly documented migrations from Behave to pytest-bdd, citing pytest's broader ecosystem as the primary motivator. The migration is non-trivial but mechanical: feature files port unchanged, environment.py logic migrates to conftest.py, and step definitions get rewritten to use pytest fixtures instead of the context object.

Hybrid pytest + Behave setups

A small but growing pattern: use pytest for unit and integration tests, Behave for end-to-end acceptance scenarios. The frameworks coexist in the same repo, and CI runs them in parallel. This works well when teams want stakeholder-facing scenarios in Behave's familiar format while engineers continue to use pytest fixtures for everything else.

12. Step Definition Patterns at Scale

Both Behave and pytest-bdd benefit from disciplined step organization. The patterns that work well at scale:

Domain-driven step modules

Organize steps by domain (accounts_steps.py, checkout_steps.py, orders_steps.py) rather than by Gherkin keyword. Each module owns the steps that touch one bounded context. This keeps step definitions navigable as the suite grows.

Page object pattern

For UI scenarios driven by Playwright or Selenium, create page object modules that step definitions import. The page object handles selectors and waits; the step definition handles assertions and business intent.

Factory-driven test data

Use factory_boy or model_bakery to produce valid domain objects with sensible defaults. Step definitions then call factories rather than constructing entities manually.

Async support

Modern Behave (1.2.6+) and pytest-bdd both support async step definitions. For Playwright async API or aiohttp-based tests, this is essential. Use async def step functions and @pytest.mark.asyncio markers in pytest-bdd.

13. Parallel Execution Deep Dive

behave + behavex

behavex parallelizes Behave scenarios across multiple worker processes. Each worker gets its own context and runs independent scenarios. Configure with --parallel-processes N and --parallel-scheme scenario or feature. State isolation requires per-process databases or scenario-unique data keys.

pytest-bdd + pytest-xdist

pytest-xdist works seamlessly with pytest-bdd. Run pytest -n auto to distribute tests across CPU cores. The --dist=loadscope option ensures all scenarios from one feature file run on the same worker, which is useful when scenarios share fixtures.

Radish native parallelism

Radish supports --parallel-runner=N out of the box. It's the fastest of the three for raw runtime but the smaller community means less battle-tested support.

14. Reporting Across the Three

FormatBehavepytest-bddRadish
JUnit XMLBuilt-inBuilt-in (pytest)Built-in
HTMLallure-behavepytest-htmlBuilt-in
AllureMature pluginMature pluginLimited
JSONBuilt-inPluginBuilt-in
Cucumber JSONBuilt-inPluginBuilt-in

allure-behave and allure-pytest are both mature in 2026 and produce comparable reports. For teams that value branded HTML output, allure is the safe choice across all three.

15. CI/CD Integration Patterns

A representative GitHub Actions matrix for Behave:

jobs:
  behave:
    runs-on: ubuntu-22.04
    strategy:
      matrix: { shard: [1, 2, 3, 4] }
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: '3.13' }
      - run: pip install -r requirements.txt
      - run: behavex --parallel-processes 2 --tags @smoke
      - uses: actions/upload-artifact@v4
        if: always()
        with: { name: behave-report-${{ matrix.shard }}, path: reports/ }

For pytest-bdd, the CI workflow is identical to any pytest suite -- standard pytest-xdist parallelism, standard pytest reporters, standard fixtures. This is the major draw for teams already running pytest at scale.

16. IDE Tooling Comparison

IDEBehavepytest-bddRadish
PyCharm ProfessionalExcellentExcellentLimited
VS CodeGood (Cucumber ext)GoodLimited
Vim/NeovimPluginsPluginsLimited
Sublime TextLimitedLimitedLimited

PyCharm Professional's BDD support is the gold standard in 2026 -- step navigation, autocomplete, refactoring, and integrated debugging all work seamlessly. VS Code's Cucumber extension supports Behave and pytest-bdd unified.

17. AI Agent Integration

In 2026, AI agents like Claude and Cursor are equally fluent in Behave and pytest-bdd. The QASkills directory has SKILL.md packs for both that teach the AI to generate step definitions matching your house style. Install via:

npx @qaskills/cli add behave-python
# or
npx @qaskills/cli add pytest-bdd

Then prompt:

Generate Behave step definitions for the checkout flow that covers happy path and three error cases. Use our Playwright page objects in src/pages/.

See behave-python-bdd-complete-tutorial and cursor-skills-md-best-practices for concrete prompts and skill setup.

18. Frequently Asked Questions

Q: Can I share feature files between Behave and pytest-bdd? A: Yes -- both parse standard Gherkin. Step definitions must be rewritten per framework.

Q: Which is faster, Behave or pytest-bdd? A: pytest-bdd inherits pytest's optimized runner and is typically 20-40% faster on identical suites. Behave's runtime overhead is higher.

Q: Is Radish dead? A: No -- still maintained, but the community is small. For teams needing maximum throughput, it's worth evaluating; for everyone else, Behave or pytest-bdd are safer choices.

Q: How do I share fixtures across pytest-bdd scenarios? A: Define fixtures in conftest.py at the directory level. Standard pytest fixture composition rules apply.

Q: Can I use Behave with Playwright? A: Yes -- store the playwright instance and context in the Behave context object, similar to the pattern in behave-python-bdd-complete-tutorial.

Conclusion

There is no "Cucumber for Python" in the strict sense, but the Python BDD ecosystem has three excellent frameworks that cover Cucumber's functionality. For most teams in 2026, the right choice is Behave or pytest-bdd, and the deciding factor is usually existing pytest investment. Whichever you pick, package your conventions as a reusable SKILL.md from the QASkills directory so AI agents author scenarios in your house style. See behave-python-bdd-complete-tutorial for a deeper Behave walk-through.

Cucumber vs Behave: Python BDD Comparison 2026 | QASkills.sh