Python 3.8 in Production: Walrus Operator and Performance Gains
Python 3.8 was released in October 2019. I’ve been testing it for 3 months and just migrated our production services.
The walrus operator is controversial but useful. Performance improvements are significant. Here’s my experience.
The Walrus Operator (:=)
Assignment expressions let you assign to a variable and use the value in the same expression.
Before:
```python
# Read file in chunks
while True:
    chunk = file.read(8192)
    if not chunk:
        break
    process(chunk)
```
With walrus operator:
```python
# Cleaner
while chunk := file.read(8192):
    process(chunk)
```
More examples:
```python
# List comprehension with filter
# Before
data = [expensive_function(x) for x in items]
filtered = [y for y in data if y > 0]

# After - compute once
filtered = [y for x in items if (y := expensive_function(x)) > 0]

# Regex matching
# Before
match = pattern.search(text)
if match:
    print(match.group(1))

# After
if match := pattern.search(text):
    print(match.group(1))
```
Positional-Only Parameters
Force parameters to be positional:
```python
def calculate_price(base, /, tax, *, discount):
    """
    base: positional-only (before /)
    tax: positional or keyword
    discount: keyword-only (after *)
    """
    return base * (1 + tax) - discount

# Valid
calculate_price(100, 0.1, discount=10)
calculate_price(100, tax=0.1, discount=10)

# Invalid
calculate_price(base=100, tax=0.1, discount=10)  # TypeError!
Useful for APIs where parameter names might change:
```python
def process_data(data, /):
    # Can rename 'data' parameter later without breaking callers
    pass
```
f-string `=` for Debugging
The new `=` specifier echoes the expression along with its value:
```python
user = "Alice"
age = 30

# Before
print(f"user: {user}, age: {age}")

# After
print(f"{user=}, {age=}")
# Output: user='Alice', age=30
```
Great for debugging:
```python
result = expensive_calculation()
print(f"{result=}")
# Output: result=42
```
TypedDict Improvements
Required vs optional keys:
```python
from typing import TypedDict

class User(TypedDict, total=False):
    name: str  # Optional
    age: int   # Optional

class RequiredUser(TypedDict):
    name: str  # Required
    age: int   # Required

# Mix required and optional
class MixedUser(TypedDict):
    name: str  # Required
    age: int   # Required

class OptionalFields(MixedUser, total=False):
    email: str  # Optional
```
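At runtime a TypedDict is just a plain dict — the required/optional split only matters to a type checker. A minimal sketch (mypy would flag a missing `name` or `age` in `OptionalFields`, but Python itself won't):

```python
from typing import TypedDict

class MixedUser(TypedDict):
    name: str
    age: int

class OptionalFields(MixedUser, total=False):
    email: str

# 'email' may be omitted; 'name' and 'age' may not (per mypy)
u1: OptionalFields = {"name": "Alice", "age": 30, "email": "a@example.com"}
u2: OptionalFields = {"name": "Bob", "age": 25}

print(type(u2))  # <class 'dict'> - plain dict at runtime
```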
Literal Types
Specify exact values:
```python
from typing import Literal

def set_status(status: Literal["active", "inactive", "pending"]) -> None:
    print(f"Status: {status}")

set_status("active")   # OK
set_status("deleted")  # mypy error!
```
Useful for enums without an enum class:
```python
HttpMethod = Literal["GET", "POST", "PUT", "DELETE"]

def make_request(method: HttpMethod, url: str) -> None:
    pass

make_request("GET", "/users")    # OK
make_request("PATCH", "/users")  # mypy error!
```
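Literal aliases can also be checked at runtime with `typing.get_args` (also new in 3.8) — a sketch, with this `make_request` standing in for whatever client you actually use:

```python
from typing import Literal, get_args

HttpMethod = Literal["GET", "POST", "PUT", "DELETE"]

def make_request(method: str, url: str) -> None:
    # get_args returns the Literal's allowed values as a tuple
    if method not in get_args(HttpMethod):
        raise ValueError(f"Unsupported method: {method}")
    print(f"{method} {url}")

make_request("GET", "/users")  # OK at runtime and for mypy
```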
Performance Improvements
Python 3.8 is faster:
1. Faster function calls - new vectorcall protocol
```python
import timeit

def simple_function(a, b, c):
    return a + b + c

# Measured with timeit on our hardware:
timeit.timeit(lambda: simple_function(1, 2, 3), number=1_000_000)
# Python 3.7: 0.15 µs per call
# Python 3.8: 0.11 µs per call (27% faster)
```
2. Faster dict operations
```python
# Dict iteration is faster
d = {i: i * 2 for i in range(1000)}

# Python 3.7: 12.5 µs
# Python 3.8: 10.2 µs (18% faster)
```
3. Faster pickle
```python
import pickle

data = [{"id": i, "name": f"user{i}"} for i in range(1000)]
blob = pickle.dumps(data)

# Python 3.7: 1.2 ms
# Python 3.8: 0.8 ms (33% faster)
```
Real-World Performance
Our services after upgrading:
| Service | Python 3.7 | Python 3.8 | Improvement |
|---|---|---|---|
| API Gateway | 520 req/s | 580 req/s | 11.5% |
| User Service | 425 req/s | 475 req/s | 11.8% |
| Email Service | 310 req/s | 340 req/s | 9.7% |
Free performance boost!
Shared Memory for Multiprocessing
New shared_memory module:
```python
from multiprocessing import shared_memory

# Create shared memory
shm = shared_memory.SharedMemory(create=True, size=1000)

# Write data
buffer = shm.buf
buffer[:4] = bytearray([1, 2, 3, 4])

# Access from another process (attach by name)
shm2 = shared_memory.SharedMemory(name=shm.name)
print(bytes(shm2.buf[:4]))  # b'\x01\x02\x03\x04'

# Cleanup: close every handle, unlink exactly once
shm2.close()
shm.close()
shm.unlink()
```
Useful for sharing large arrays between processes:
```python
import numpy as np
from multiprocessing import shared_memory

# Share a numpy array
arr = np.array([1, 2, 3, 4, 5])
shm = shared_memory.SharedMemory(create=True, size=arr.nbytes)
shared_arr = np.ndarray(arr.shape, dtype=arr.dtype, buffer=shm.buf)
shared_arr[:] = arr[:]
# Another process attaches with SharedMemory(name=shm.name) and wraps
# shm.buf in an ndarray the same way; close()/unlink() when done
```
functools.cached_property
Cache property values:
```python
from functools import cached_property

class DataProcessor:
    def __init__(self, data):
        self.data = data

    @cached_property
    def processed_data(self):
        # Expensive operation
        print("Processing...")
        return [x * 2 for x in self.data]

processor = DataProcessor([1, 2, 3])
print(processor.processed_data)  # Processing... then [2, 4, 6]
print(processor.processed_data)  # [2, 4, 6] (cached, no "Processing...")
```
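The cached value is stored in the instance's `__dict__`, so you can invalidate it with `del` — a small sketch of that behavior:

```python
from functools import cached_property

class DataProcessor:
    def __init__(self, data):
        self.data = data

    @cached_property
    def processed_data(self):
        return [x * 2 for x in self.data]

p = DataProcessor([1, 2, 3])
print(p.processed_data)  # [2, 4, 6] - computed and stored on the instance

p.data = [4, 5]
print(p.processed_data)  # still [2, 4, 6] - the cache doesn't see the change

del p.processed_data     # drop the cached value from p.__dict__
print(p.processed_data)  # [8, 10] - recomputed from the new data
```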
importlib.metadata
Access package metadata:
```python
from importlib.metadata import version, requires

# Get package version
print(version('flask'))  # 1.1.1

# Get dependencies
print(requires('flask'))
# ['Werkzeug>=0.15', 'Jinja2>=2.10.1', ...]
```
Useful for debugging dependency issues.
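For example, a small helper (hypothetical, not from the stdlib) that reports whether a package is installed at all, using `PackageNotFoundError` from the same module:

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(package: str) -> str:
    """Return the installed version string, or a marker if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return "not installed"

print(installed_version("pip"))
print(installed_version("no-such-package"))  # not installed
```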
Migration Experience
Upgraded 8 services in 2 weeks:
Week 1: Test in staging
- Update Dockerfile to `FROM python:3.8-slim`
- Run tests
- Fix compatibility issues
Week 2: Production rollout
- Deploy to production one service at a time
- Monitor performance and errors
- Rollback plan ready (didn’t need it)
Compatibility Issues
1. Deprecated warnings
Some libraries use deprecated APIs:
```python
# DeprecationWarning: Using or importing the ABCs from 'collections'
# instead of 'collections.abc' is deprecated
from collections import Mapping      # Old way
from collections.abc import Mapping  # New way
```
2. Type hint changes
Annotation syntax needs care:
```python
# Python 3.7/3.8 - import generic aliases from typing
from typing import Dict
d: Dict[str, int] = {}

# Built-in generics like dict[str, int] work as annotations on 3.8
# only under `from __future__ import annotations` (PEP 563);
# without the future import they need Python 3.9+
d: dict[str, int] = {}
```
3. Library compatibility
Check library support:
```bash
pip list --outdated
```
Most popular libraries support 3.8.
Using Walrus Operator Wisely
Good uses:
```python
# Avoid repeated calls
if (user := get_user(user_id)) and user.is_active:
    process_user(user)

# List comprehensions
[y for x in data if (y := transform(x)) is not None]

# While loops
while line := file.readline():
    process(line)
```
Bad uses (reduces readability):
```python
# Too complex
if (x := a + b) > 10 and (y := x * 2) < 50:
    print(x, y)

# Better
x = a + b
if x > 10:
    y = x * 2
    if y < 50:
        print(x, y)
```
Docker Deployment
Updated Dockerfile:
```dockerfile
FROM python:3.8-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

CMD ["gunicorn", "-w", "4", "-b", "0.0.0.0:5000", "app:app"]
```
Testing Strategy
Run tests on both versions during migration:
```yaml
# .gitlab-ci.yml
test-py37:
  image: python:3.7
  script:
    - pip install -r requirements.txt
    - pytest

test-py38:
  image: python:3.8
  script:
    - pip install -r requirements.txt
    - pytest
```
Monitoring After Upgrade
Tracked metrics:
- Response time (improved 10-12%)
- Error rate (unchanged)
- Memory usage (slightly lower)
- CPU usage (slightly lower)
No regressions!
Should You Upgrade?
Yes, if:
- You’re on Python 3.7 (easy upgrade)
- You want performance improvements
- You like new features (walrus operator, etc.)
Wait, if:
- You’re on Python 3.6 or earlier (bigger jump)
- Your dependencies don’t support 3.8
- You’re risk-averse (wait for 3.8.1+)
Future: Python 3.9
Python 3.9 is coming soon. Features I’m excited about:
- Dict merge operator (`|`)
- Type hint improvements
- String methods (`removeprefix`, `removesuffix`)
- More performance improvements
I’ll upgrade as soon as it’s stable.
Conclusion
Python 3.8 is a solid release. Performance improvements alone justify the upgrade.
Key takeaways:
- Walrus operator is useful but don’t overuse
- Performance improvements are significant (10-12%)
- Migration from 3.7 is painless
- New type hint features improve code quality
- Positional-only parameters improve API design
If you’re on Python 3.7, upgrade to 3.8. The performance gains are worth it.
Python keeps getting better. I’m excited for the future.