Immutable Python Dictionary

Pytest fixture scope can cause test config side effects

When running unit tests in Python you might have a lot of config and environment variables in a shared fixture that is loaded in for every test. Session scoped fixtures are nice for loading something once and keeping it around for all tests, but that can be a problem if one of your tests decides to change something in the config.

import pytest

@pytest.fixture(scope='session')
def config(request):
    return {
        'bar': 'biz'
    }

def test_foo(config):
    config['bar'] = 'bat'
    func_under_test(config)

Run on its own, this test may pass. Run as part of the full suite, though, it can break any later test that expects the original value of 'bar', because every test shares the same dictionary.
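The leak can be reproduced without pytest at all. The sketch below is only a simulation: `_session_cache`, `get_config`, `first_test` and `second_test` are hypothetical names, and the cache dict stands in for the way a session-scoped fixture hands the same object to every test.

```python
# Hypothetical stand-in for pytest's session-scoped fixture cache.
_session_cache = {}

def get_config():
    """Return the single shared config object, building it on first use."""
    if 'config' not in _session_cache:
        _session_cache['config'] = {'bar': 'biz'}
    return _session_cache['config']

def first_test(config):
    # Mutates the shared dictionary in place.
    config['bar'] = 'bat'
    return config['bar']

def second_test(config):
    # Written on the assumption that 'bar' is still 'biz'.
    return config['bar']

# The second test sees whatever the first one left behind.
first_result = first_test(get_config())
second_result = second_test(get_config())
print(first_result, second_result)  # bat bat
```

Swap the order of the two calls and `second_test` sees 'biz' again, which is why this kind of failure tends to show up only in full suite runs.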

The simple fix is to drop the session scope from the config() fixture so it falls back to the default function scope. Every test then gets a fresh copy of the config.

But what if that isn’t an option?

I work a lot with PySpark and have a lot of unit tests that test dataframes. To help with this, I use a tool called Hypothesis which, among other things, can intelligently generate sample dataframes to help you find edge cases in your logic.

So I have tests that can look like this:

from hypothesis import given, HealthCheck, settings, strategies as st
from hypothesis.extra.pandas import column, data_frames
from pyspark.sql.types import StringType
import pytest

@pytest.mark.parametrize('class_name, full_history', [
    (Pipeline1, True),
    (Pipeline1, False),
    (Pipeline2, True),
    (Pipeline2, False)])
@given(data_frames([
    column('id', elements=st.integers(min_value=1, max_value=9999999999)),
    column('type', dtype=StringType,
           elements=st.sampled_from(['Admin', 'User'])),
    column('joined', elements=st.datetimes()),
    column('birthday', elements=st.dates()),
    column('name', elements=st.text()),
]))
@settings(max_examples=5, suppress_health_check=[HealthCheck.too_slow])
def test_foo(spark, config, class_name, full_history, pdf):
    df = spark.createDataFrame(pdf)
    result = class_name(spark, config, full_history)

At a high level, two separate things are going on in this test:

pytest.mark.parametrize() will call the test 4 times, once for each set of parameters defined. Function-scoped fixtures are rebuilt for each of those calls. It’s like having 4 separate tests, but since the test code is identical you can refactor them into a single parametrized test like this one.

given(data_frames([])) is a Hypothesis decorator that generates a Pandas dataframe (pdf) for the test. Not only that, but it’ll generate MANY dataframes, using the rules defined (min/max, sample from, etc.) to test the boundaries of the different datatypes. Here I’ve set max_examples to 5 so that each of the 4 parametrized calls is limited to at most 5 generated dataframes.

However, running the test fails the following Hypothesis health check:

>>> hypothesis.errors.FailedHealthCheck:
test/test_foo.py::test_foo[False] uses the 'config' fixture, which is reset between function calls but not between test cases generated by `@given(…)`.
You can change it to a module- or session-scoped fixture if it is safe to reuse; if not we recommend using a context manager inside your test function.
See https://docs.pytest.org/en/latest/fixture.html#sharing-test-data for details on fixture scope.
See https://hypothesis.readthedocs.io/en/latest/healthchecks.html for more information about this.
If you want to disable just this health check, add HealthCheck.function_scoped_fixture to the suppress_health_check settings for this test.
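The context manager that the message recommends could look something like the sketch below. It is only an illustration under assumptions: `BASE_CONFIG`, `fresh_config` and `check_example` are hypothetical names, and `check_example` stands in for the body of a `@given` test that Hypothesis calls once per generated example.

```python
from contextlib import contextmanager
from copy import deepcopy

# Hypothetical base settings, playing the role of the shared config.
BASE_CONFIG = {'bar': 'biz'}

@contextmanager
def fresh_config(**overrides):
    """Yield a private deep copy of the base config, with optional overrides."""
    cfg = deepcopy(BASE_CONFIG)
    cfg.update(overrides)
    yield cfg

def check_example(value):
    # Stand-in for a test body that runs once per generated example.
    with fresh_config() as cfg:
        seen = cfg['bar']    # always the pristine value
        cfg['bar'] = value   # changes stay local to this example
        return seen

seen_values = [check_example(v) for v in ('bat', 'baz', 'qux')]
print(seen_values)   # ['biz', 'biz', 'biz']
print(BASE_CONFIG)   # {'bar': 'biz'}
```

Because the copy is made inside the test body, each generated example starts from the same state no matter what the previous example changed.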

Unlike parametrize(), Hypothesis will NOT rebuild function-scoped fixtures between the examples it generates. In my case this wasn’t a real concern, since none of these tests were making changes to the config fixture. But what if another developer makes a change? What if I need to later? In the past I had intentionally not made config() session scoped, to avoid weird side effects, but this health check prompted me to revisit the question:

Can you make a Python Dictionary Immutable?

The answer is “Yes!”

from types import MappingProxyType

import pytest

@pytest.fixture(scope='session')
def config(request):
    return MappingProxyType({
        'bar': 'biz'
    })

def test_foo(config):
    # Cannot do this. You will now get an error:
    # TypeError: 'mappingproxy' object does not support item assignment
    # config['bar'] = 'bat'
    new_config = config.copy()
    new_config['bar'] = 'bat'
    func_under_test(new_config)

def test_bar(config):
    func_under_test({**config, **{'bar': 'bat'}})

Now if a test needs to change something in the config fixture, the developer has to make a copy that is scoped to just that test. There are a couple of ways to do this; I’ve put two of them above (I like the second one a lot).
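To see the behavior in isolation, here is a small sketch that needs only the standard library. The 'paths' key is a hypothetical nested setting, added purely to show one caveat worth knowing about.

```python
from types import MappingProxyType

# 'paths' is a hypothetical nested setting used to show the shallow-copy caveat.
config = MappingProxyType({'bar': 'biz', 'paths': ['a']})

# Direct assignment through the proxy is rejected with a TypeError.
try:
    config['bar'] = 'bat'
    write_error = None
except TypeError as exc:
    write_error = str(exc)

# Both copy styles produce an ordinary, mutable dict and leave the proxy alone.
copied = config.copy()
copied['bar'] = 'bat'
merged = {**config, 'bar': 'bat'}

# Caveat: the proxy only guards the top level. Nested objects stay mutable,
# and shallow copies share them with the original.
copied['paths'].append('b')

print(write_error)
print(config['bar'], copied['bar'], merged['bar'])  # biz bat bat
print(config['paths'])                              # ['a', 'b']
```

In other words, MappingProxyType is a read-only view of the top-level keys, not a deep freeze. If the config holds nested lists or dictionaries, a test that needs to change them is safer with copy.deepcopy().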

Many will tell you that you should never need an immutable dictionary, and for the most part I agree. Chances are that if you need this capability, you can get a better result with a smart refactor.

I briefly thought about where this same change might benefit my main code, but have thus far not done so as the application doesn’t have this problem and using this feature would add complexity with no real benefit.

However, I am happy to have this new tool in my back pocket.

Written by: Edward Romano

I dabble in, and occasionally obsess over, technology and problems that bug me