Module 1: Foundational FP Concepts¶
Progression Note¶
By the end of Module 1, you'll master purity laws, write pure functions, and refactor impure code using Hypothesis. This builds the foundation for lazy streams in Module 3. See the series progression map in the repo root for full details.
Here's a snippet from the progression map:
| Module | Focus | Key Outcomes |
|---|---|---|
| 1: Foundational FP Concepts | Purity, contracts, refactoring | Spot impurities, write pure functions, prove equivalence with Hypothesis |
| 2: ... | ... | ... |
| ... | ... | ... |
M01C07: Type-Hinted Pure Functions & Higher-Order Pipelines – TypeVar, ParamSpec, Concatenate¶
Core question:
How do you use Python’s static typing (TypeVar, ParamSpec, Concatenate) to precisely describe pure functions and higher-order pipelines—so that composition errors are caught by the type checker instead of at 02:00 in production?
This core builds on Core 1's mindset, Core 2's contracts, Core 3's immutability, Core 4's composition, Core 5's refactorings, and Core 6's combinators by making them machine-checkable:
- TypeVar for generic pure functions over arbitrary types.
- ParamSpec to preserve full call signatures across decorators and wrappers.
- Concatenate to inject context/dependencies without lying to the type checker.
- Typed compose2 and Pipeline that reject incompatible stages, with notes on typing flow/pipe.
We continue the running project from Core 1-6: refactoring the FuncPipe RAG Builder, now with typed pipelines.
Audience: Developers comfortable with Core 6 combinators who now want static guarantees about their pure functions and higher-order utilities.
Outcome:
1. Declare generic pure functions with TypeVar (fmap, ffilter, foldl).
2. Implement type-safe compose/Pipeline utilities that reject incompatible stages.
3. Write decorators with ParamSpec that preserve original signatures.
4. Use Concatenate to bind context (config/logger/db) in a type-safe way.
5. Add Hypothesis properties proving typed pipelines preserve behavior while the type checker guards composition.
1. Conceptual Foundation¶
1.1 The One-Sentence Rule¶
Default to precise type hints on pure functions and higher-order utilities; if the type checker struggles, simplify the API instead of weakening everything to Any.
1.2 Typed FP Pipelines in One Precise Sentence¶
A typed functional pipeline is a chain of pure functions whose input/output types line up via TypeVars, with decorators and context binders expressed using ParamSpec and Concatenate so that invalid pipelines fail to type-check.
1.3 Why This Matters Now¶
Typed FP checks Core 6's combinators before the code runs, which sets up effect extraction (Core 8) and laws (Core 9); without types, mismatches stay hidden until runtime.
1.4 Typed Spectrum Table (Recap with Focus on Typing)¶
| Level | Description | Example |
|---|---|---|
| Untyped | Any everywhere | def fmap(fn, xs): return [fn(x) for x in xs] |
| Partially Typed | Hard-coded types | def fmap(fn: Callable[[int], str], xs: list[int]) -> list[str]: ... |
| Fully Typed | Generics with TypeVar | def fmap(fn: Callable[[T], U], xs: Iterable[T]) -> list[U]: ... |
Note on Typing: This builds on the purity spectrum from earlier cores, focusing on how types make purity enforceable.
2. Mental Model: Untyped Jungle vs Typed Contracts¶
2.1 One Picture¶
        Untyped Jungle                    Typed Contracts
+-----------------------------+   +-----------------------------+
| pipe: Any -> ... -> Any     |   | compose2(f: B->C, g: A->B)  |
| decorators: ... -> Any      |   |   -> (A->C)                 |
| context: *args/**kwargs     |   | with_context:               |
| Everything compiles...      |   |   Ctx, Callable[Concat...   |
| ...until runtime crash      |   |   -> Callable[P, R]         |
+-----------------------------+   +-----------------------------+
2.2 Contract Table¶
| Clause | Violation Example | Detected By |
|---|---|---|
| Pipeline compatibility | Wrong intermediate type | mypy/pyright type error |
| Signature preservation | Decorator returns Callable[..., Any] | Type checker shows lost params |
| Context injection | Hidden ctx via *args/**kwargs | No type hint for ctx |
| Generic reuse | Hard-coded types instead of TypeVar | Duplicate code, manual fixes |
| Refactor safety | Silent breakage on signature change | Type errors guide edits |
Note on Contracts: ParamSpec and Concatenate make these clauses enforceable; the type checker reports violations before the code runs instead of leaving them for a runtime failure.
2.3 Bug Prevention Example¶
Untyped (bug slips through):
def bad_full_rag(docs: list[RawDoc], env: RagEnv) -> tuple[Chunk, ...]:
    return tuple(
        embed_chunk(doc)  # Wrong: doc instead of chunk
        for doc in docs
        for chunk in chunk_doc(clean_doc(doc), env)
    )  # Runtime AttributeError on doc.text
Typed (mypy catches):
def bad_full_rag(docs: list[RawDoc], env: RagEnv) -> tuple[Chunk, ...]:
    return tuple(
        embed_chunk(doc)  # mypy error: embed_chunk expects ChunkWithoutEmbedding, got RawDoc
        for doc in docs
        for chunk in chunk_doc(clean_doc(doc), env)
    )
Wins: Type checker complains immediately; no runtime surprise. Running mypy here will point exactly at embed_chunk(doc) as type-incompatible.
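To see the runtime side of this bug without the full project, here is a reduced sketch. The two dataclasses are hypothetical stand-ins carrying only the fields the demo needs, and embed_len is a toy replacement for embed_chunk that, like it, reads chunk.text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawDoc:  # hypothetical stand-in, not the project's real type
    doc_id: str
    abstract: str


@dataclass(frozen=True)
class ChunkWithoutEmbedding:  # hypothetical stand-in
    doc_id: str
    text: str


def embed_len(chunk: ChunkWithoutEmbedding) -> int:
    """Toy 'embedding' that only reads chunk.text."""
    return len(chunk.text)


doc = RawDoc("d1", "pure functions compose")

try:
    # A checker flags this call; the ignore comment lets the demo reach runtime.
    embed_len(doc)  # type: ignore[arg-type]
except AttributeError as exc:
    runtime_error = str(exc)

print(runtime_error)
```

Without the checker, the mistake surfaces only when the offending line actually executes.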
3. Running Project: Typed Pipelines in RAG¶
Our running project (from module-01/funcpipe-rag-01/README.md) adds types to Core 6's combinators.
- Goal: Make pipelines statically verifiable.
- Start: Core 1-6's pure functions.
- End (this core): Typed full_rag with properties. Semantics aligned with Core 1-6.
3.1 Types (Canonical)¶
These are defined in module-01/funcpipe-rag-01/src/funcpipe_rag/rag_types.py (as in Core 1) and imported as needed. No redefinition here.
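For readers following along without the repository, the sketch below shows one plausible shape for these types. It is an assumption, not the contents of rag_types.py: the field names and order are inferred from how the constructors are called in this core, and the name of the vector field on Chunk (embedding here) is a guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawDoc:
    doc_id: str
    title: str
    abstract: str
    categories: str


@dataclass(frozen=True)
class CleanDoc:
    doc_id: str
    title: str
    abstract: str
    categories: str


@dataclass(frozen=True)
class ChunkWithoutEmbedding:
    doc_id: str
    text: str
    start: int
    end: int


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    start: int
    end: int
    embedding: tuple[float, ...]  # assumed field name


@dataclass(frozen=True)
class RagEnv:
    chunk_size: int


# Frozen dataclasses compare by value, which the equivalence properties rely on.
doc = RawDoc("d1", "Title", "  Some   Abstract ", "cs.AI")
same = RawDoc("d1", "Title", "  Some   Abstract ", "cs.AI")
assert doc == same
```

Frozen dataclasses fit the immutability theme of Core 3: instances cannot be mutated after construction, and equality is structural.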
3.2 Untyped Variants (Anti-Patterns in RAG)¶
Full code:
from funcpipe_rag import RawDoc, CleanDoc, ChunkWithoutEmbedding, Chunk, RagEnv
from typing import Any
import hashlib

# Untyped clean (Any hell)
def untyped_clean_doc(doc) -> Any:
    abstract = " ".join(doc.abstract.strip().lower().split())
    return CleanDoc(doc.doc_id, doc.title, abstract, doc.categories)

# Untyped chunk (no safety)
def untyped_chunk_doc(doc, env) -> Any:
    text = doc.abstract
    chunks = (
        ChunkWithoutEmbedding(doc.doc_id, text[i:i + env.chunk_size], i, i + len(text[i:i + env.chunk_size]))
        for i in range(0, len(text), env.chunk_size)
    )
    return tuple(chunks)

# Untyped embed (Any input/output)
def untyped_embed_chunk(chunk) -> Any:
    h = hashlib.sha256(chunk.text.encode("utf-8")).hexdigest()
    step = 4
    vec = tuple(int(h[i:i + step], 16) / (16 ** step - 1) for i in range(0, 64, step))
    return Chunk(chunk.doc_id, chunk.text, chunk.start, chunk.end, vec)
Smells: Untyped (Any), no checker safety, hard to refactor (mismatches hide).
4. Refactor to Typed: Machine-Checkable Pipelines in RAG¶
4.1 Practical Typing: TypeVar for Generics (Layer 1)¶
Use TypeVar for reusable combinators. This is an evolution of the fp.py from Core 6, adding generics.
Full code:
# module-01/funcpipe-rag-01/src/funcpipe_rag/fp.py (excerpt)
from typing import TypeVar, Callable, Iterable, Generic
from funcpipe_rag import RawDoc, CleanDoc, ChunkWithoutEmbedding, Chunk, RagEnv
import hashlib

T = TypeVar("T")
U = TypeVar("U")
R = TypeVar("R")
A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")

def fmap(fn: Callable[[T], U]) -> Callable[[Iterable[T]], list[U]]:
    def inner(xs: Iterable[T]) -> list[U]:
        return [fn(x) for x in xs]
    return inner

def ffilter(pred: Callable[[T], bool]) -> Callable[[Iterable[T]], list[T]]:
    def inner(xs: Iterable[T]) -> list[T]:
        return [x for x in xs if pred(x)]
    return inner

def foldl(step: Callable[[R, T], R], init: R) -> Callable[[Iterable[T]], R]:
    def inner(xs: Iterable[T]) -> R:
        acc = init
        for x in xs:
            acc = step(acc, x)
        return acc
    return inner

def compose2(f: Callable[[B], C], g: Callable[[A], B]) -> Callable[[A], C]:
    def inner(x: A) -> C:
        return f(g(x))
    return inner

# Simple example (not RAG) to show it off:
to_str: Callable[[int], str] = lambda n: str(n)
length: Callable[[str], int] = len
len_of_int = compose2(length, to_str)  # int -> int

# Typed clean
def clean_doc(doc: RawDoc) -> CleanDoc:
    abstract = " ".join(doc.abstract.strip().lower().split())
    return CleanDoc(doc.doc_id, doc.title, abstract, doc.categories)

# Typed chunk
def chunk_doc(doc: CleanDoc, env: RagEnv) -> list[ChunkWithoutEmbedding]:
    text = doc.abstract
    return [
        ChunkWithoutEmbedding(doc.doc_id, text[i:i + env.chunk_size], i, i + len(text[i:i + env.chunk_size]))
        for i in range(0, len(text), env.chunk_size)
    ]

# Typed embed
def embed_chunk(chunk: ChunkWithoutEmbedding) -> Chunk:
    h = hashlib.sha256(chunk.text.encode("utf-8")).hexdigest()
    step = 4
    vec = tuple(int(h[i:i + step], 16) / (16 ** step - 1) for i in range(0, 64, step))
    return Chunk(chunk.doc_id, chunk.text, chunk.start, chunk.end, vec)

# Typed full_rag
def full_rag(docs: list[RawDoc], env: RagEnv) -> tuple[Chunk, ...]:
    cleaned: list[CleanDoc] = fmap(clean_doc)(docs)
    chunks: list[ChunkWithoutEmbedding] = [
        c
        for d in cleaned
        for c in chunk_doc(d, env)
    ]
    embedded: list[Chunk] = fmap(embed_chunk)(chunks)
    return tuple(embedded)
Wins: TypeVar generics make fmap, ffilter, and foldl reusable across element types, and compose2 rejects stages whose types do not line up.
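A short sketch of what "rejects mismatches" means in practice. compose2 and foldl are repeated from the excerpt above so the block runs on its own; parse_int, double, shout, and add_len are hypothetical helpers introduced only for this illustration.

```python
from typing import Callable, Iterable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")
T = TypeVar("T")
R = TypeVar("R")


def compose2(f: Callable[[B], C], g: Callable[[A], B]) -> Callable[[A], C]:
    def inner(x: A) -> C:
        return f(g(x))
    return inner


def foldl(step: Callable[[R, T], R], init: R) -> Callable[[Iterable[T]], R]:
    def inner(xs: Iterable[T]) -> R:
        acc = init
        for x in xs:
            acc = step(acc, x)
        return acc
    return inner


def parse_int(s: str) -> int:
    return int(s)


def double(n: int) -> int:
    return n * 2


def shout(s: str) -> str:
    return s.upper()


def add_len(acc: int, s: str) -> int:
    return acc + len(s)


parse_then_double = compose2(double, parse_int)  # str -> int
total_length = foldl(add_len, 0)                 # Iterable[str] -> int

print(parse_then_double("21"))
print(total_length(["ab", "cde"]))

# A checker should reject this line: parse_int returns int, but shout expects str.
# broken = compose2(shout, parse_int)
```

Named helper functions are used instead of lambdas here because checkers sometimes struggle to infer lambda parameter types inside generic calls.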
4.2 Advanced Typing: ParamSpec for Decorators (Layer 2)¶
Use ParamSpec for signature-preserving decorators.
Full code:
from typing import ParamSpec

P = ParamSpec("P")
R = TypeVar("R")

# ---------- Decorator with ParamSpec ----------
def log_calls(fn: Callable[P, R]) -> Callable[P, R]:
    """Decorator that logs calls while preserving signature."""
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
        print(f"{fn.__name__} called with {args} {kwargs}")
        return fn(*args, **kwargs)
    return wrapper

@log_calls  # Preserves signature via ParamSpec
def logged_full_rag(docs: list[RawDoc], env: RagEnv) -> tuple[Chunk, ...]:
    cleaned = fmap(clean_doc)(docs)
    chunks = [c for d in cleaned for c in chunk_doc(d, env)]
    return tuple(fmap(embed_chunk)(chunks))
4.3 Advanced Typing: Concatenate for Context (Layer 2)¶
Use Concatenate for type-safe dependency injection.
Full code:
from typing import Concatenate

Ctx = TypeVar("Ctx")

def with_context(
    ctx: Ctx,
    fn: Callable[Concatenate[Ctx, P], R],
) -> Callable[P, R]:
    """Bind context type-safely."""
    def wrapped(*args: P.args, **kwargs: P.kwargs) -> R:
        return fn(ctx, *args, **kwargs)
    return wrapped

# Example: Context injection (e.g., for a logger)
from logging import Logger, getLogger

logger: Logger = getLogger("funcpipe.rag")

def log_clean_doc(logger: Logger, doc: RawDoc) -> CleanDoc:
    logger.info("Cleaning doc %s", doc.doc_id)
    return clean_doc(doc)

typed_log_clean_doc = with_context(logger, log_clean_doc)  # (RawDoc) -> CleanDoc
Note: This example is intentionally impure at the boundary (logging side effect) but shows how to keep the core pure while binding dependencies in a type-safe way.
4.4 Advanced Typing: Pipeline Class (Layer 2)¶
Use a Pipeline class for multi-stage typing.
Full code:
class Pipeline(Generic[A, B]):
    """Type-safe pipeline builder."""
    def __init__(self, fn: Callable[[A], B]):
        self._fn = fn

    def __call__(self, x: A) -> B:
        return self._fn(x)

    def then(self, f: Callable[[B], C]) -> "Pipeline[A, C]":
        return Pipeline(compose2(f, self._fn))
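A usage sketch for the builder. Pipeline is restated in self-contained form (with the composition inlined) so the block runs on its own; normalize, tokenize, and count are hypothetical stages chosen for illustration.

```python
from typing import Callable, Generic, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


class Pipeline(Generic[A, B]):
    def __init__(self, fn: Callable[[A], B]) -> None:
        self._fn = fn

    def __call__(self, x: A) -> B:
        return self._fn(x)

    def then(self, f: Callable[[B], C]) -> "Pipeline[A, C]":
        prev = self._fn

        def composed(x: A) -> C:
            return f(prev(x))

        return Pipeline(composed)


def normalize(text: str) -> str:
    return " ".join(text.lower().split())


def tokenize(text: str) -> list[str]:
    return text.split(" ") if text else []


def count(tokens: list[str]) -> int:
    return len(tokens)


word_total = Pipeline(normalize).then(tokenize).then(count)  # Pipeline[str, int]
print(word_total("  Typed   FP  pipelines "))

# A checker should reject this line: count expects list[str], normalize returns str.
# char_total = Pipeline(normalize).then(count)
```

The rejected pipeline is worth a second look: unchecked, it would not crash at all. It would silently return a character count instead of a word count, which is exactly the kind of bug that only a static check reports.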
4.5 Typed Pipe/Flow (Layer 2)¶
Core 6's pipe and flow also need typed versions. Fixed-length chains can be typed exactly with generics, whereas variadic versions run into limits of Python's type system. For simplicity, here is a typed flow for two stages; extend as needed.
Full code:
def typed_flow2(f: Callable[[A], B], g: Callable[[B], C]) -> Callable[[A], C]:
    return compose2(g, f)

def pipe2(x: A, f: Callable[[A], B], g: Callable[[B], C]) -> C:
    return g(f(x))

# Example usage
typed_clean_chunk = typed_flow2(clean_doc, lambda d: chunk_doc(d, RagEnv(chunk_size=512)))
Note: For longer chains, use Pipeline, or accept partial typing for variadic flow/pipe, because Python cannot express an arbitrary-length chain of linked type variables; later modules strengthen this.
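One common way to extend the two-stage version is a set of typing.overload signatures over a single variadic implementation. This is a sketch of that pattern rather than the series' own flow; the helper names strip, length, and is_even are hypothetical. Any appears only in the implementation, while callers see the precise overloads.

```python
from functools import reduce
from typing import Any, Callable, TypeVar, overload

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")
D = TypeVar("D")


@overload
def flow(f: Callable[[A], B], /) -> Callable[[A], B]: ...
@overload
def flow(f: Callable[[A], B], g: Callable[[B], C], /) -> Callable[[A], C]: ...
@overload
def flow(
    f: Callable[[A], B], g: Callable[[B], C], h: Callable[[C], D], /
) -> Callable[[A], D]: ...


def flow(*fns: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Left-to-right composition; the overloads carry the precise types."""
    def run(x: Any) -> Any:
        return reduce(lambda acc, fn: fn(acc), fns, x)
    return run


def strip(s: str) -> str:
    return s.strip()


def length(s: str) -> int:
    return len(s)


def is_even(n: int) -> bool:
    return n % 2 == 0


has_even_length = flow(strip, length, is_even)  # str -> bool
print(has_even_length("  abcd "))
```

Each additional arity needs one more overload, so this approach scales to a handful of stages but not to arbitrary length.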
4.6 Impure Shell (Edge Only)¶
The shell from Core 1 remains; typing focuses on core. Use 'with' for resource safety in impure boundaries—full details in Module 7.
5. Equational Reasoning: Substitution Exercise¶
Hand Exercise: Replace expressions in full_rag.
1. Inline clean_doc(doc) → CleanDoc.
2. Substitute into chunk_doc → list of chunks.
3. Result: Entire call = fixed value.
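Bug Hunt: untyped_clean_doc is just as substitutable at runtime, but its Any return type means the checker cannot confirm that the substituted value fits the next stage.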
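The substitution steps can be carried out mechanically on a reduced example. The two functions below are simplified string-only stand-ins for clean_doc and chunk_doc, introduced here as assumptions for the exercise.

```python
def clean(text: str) -> str:
    """Pure normalization: same input, same output, no side effects."""
    return " ".join(text.strip().lower().split())


def chunk(text: str, size: int) -> list[str]:
    """Pure fixed-width chunking."""
    return [text[i:i + size] for i in range(0, len(text), size)]


raw = "  Pure   Functions  "

# Step 0: the original expression.
step0 = chunk(clean(raw), 4)

# Step 1: replace clean(raw) with its value.
step1 = chunk("pure functions", 4)

# Step 2: replace the chunk(...) call with its value.
step2 = ["pure", " fun", "ctio", "ns"]

# Every step denotes the same value, which is what referential transparency promises.
assert step0 == step1 == step2
```

If either function read a global or the clock, step 1 would no longer be a valid rewrite.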
6. Property-Based Testing: Proving Equivalence (Advanced, Optional)¶
Use Hypothesis to check behavior across many generated inputs.
You can safely skip this on a first read and still follow later cores—come back when you want to mechanically verify your own refactors.
To bridge theory and practice, here's a simple Hypothesis example illustrating impurity detection:
import random
from hypothesis import given
import hypothesis.strategies as st

def impure_random_add(x: int) -> int:
    return x + random.randint(1, 10)  # Non-deterministic

@given(st.integers())
def test_detect_impurity(x):
    assert impure_random_add(x) == impure_random_add(x)  # Falsifies due to randomness
    # Hypothesis will quickly find differing outputs for the same x
This property test detects the impurity by showing outputs vary for identical inputs—run it to see Hypothesis in action.
6.1 Custom Strategy (RAG Domain)¶
From module-01/funcpipe-rag-01/tests/conftest.py (as in Core 1).
6.2 Equivalence Property¶
Properties for stages:
Full code:
# module-01/funcpipe-rag-01/tests/test_laws.py (excerpt)
from hypothesis import given
import hypothesis.strategies as st
from funcpipe_rag import clean_doc, chunk_doc, embed_chunk, full_rag
from funcpipe_rag import untyped_clean_doc, untyped_chunk_doc, untyped_embed_chunk
from funcpipe_rag import RawDoc, CleanDoc, ChunkWithoutEmbedding, Chunk, RagEnv
from .conftest import raw_doc_strategy, env_strategy, doc_list_strategy

# Properties for clean_doc
@given(raw_doc_strategy())
def test_clean_doc_deterministic(doc: RawDoc) -> None:
    assert clean_doc(doc) == clean_doc(doc)

# Properties for chunk_doc
@given(st.builds(CleanDoc, doc_id=st.text(min_size=1), title=st.text(), abstract=st.text(), categories=st.text()),
       env_strategy())
def test_chunk_doc_deterministic(doc: CleanDoc, env: RagEnv) -> None:
    assert chunk_doc(doc, env) == chunk_doc(doc, env)

@given(st.builds(CleanDoc, doc_id=st.text(min_size=1), title=st.text(), abstract=st.text(), categories=st.text()),
       env_strategy())
def test_chunk_doc_covers_abstract(doc: CleanDoc, env: RagEnv) -> None:
    chunks = chunk_doc(doc, env)
    reconstructed = "".join(c.text for c in chunks)
    # Invariant: covers entire text; assumes positive chunk sizes from env_strategy
    assert reconstructed == doc.abstract

# Properties for embed_chunk
@given(st.builds(ChunkWithoutEmbedding, doc_id=st.text(min_size=1), text=st.text(min_size=1),
                 start=st.integers(min_value=0), end=st.integers(min_value=1)))
def test_embed_chunk_deterministic(chunk: ChunkWithoutEmbedding) -> None:
    assert embed_chunk(chunk) == embed_chunk(chunk)

# Composite property (full_rag)
@given(doc_list_strategy(), env_strategy())
def test_full_rag_deterministic(docs: list[RawDoc], env: RagEnv) -> None:
    assert full_rag(docs, env) == full_rag(docs, env)

@given(doc_list_strategy(), env_strategy())
def test_full_rag_equivalence(docs: list[RawDoc], env: RagEnv) -> None:
    untyped = tuple(
        untyped_embed_chunk(c)
        for d in docs
        for c in untyped_chunk_doc(untyped_clean_doc(d), env)
    )
    assert full_rag(docs, env) == untyped  # Equivalence
Note: Properties enforce determinism, equivalence, invariants.
6.3 Shrinking Demo: Catching a Bug¶
Bad refactor (wrong intermediate in pipeline):
from funcpipe_rag import clean_doc, chunk_doc, embed_chunk
from funcpipe_rag import RawDoc, Chunk, RagEnv

def bad_full_rag(docs: list[RawDoc], env: RagEnv) -> tuple[Chunk, ...]:
    return tuple(
        embed_chunk(doc)  # Wrong: doc instead of chunk
        for doc in docs
        for chunk in chunk_doc(clean_doc(doc), env)
    )
Property:
from hypothesis import given
from funcpipe_rag import clean_doc, chunk_doc, embed_chunk
from funcpipe_rag import RawDoc, RagEnv
from .conftest import doc_list_strategy, env_strategy

# bad_full_rag is the function defined above.
@given(doc_list_strategy(), env_strategy())
def test_bad_full_rag_equivalence(docs: list[RawDoc], env: RagEnv) -> None:
    imperative = tuple(
        embed_chunk(c)
        for d in docs
        for c in chunk_doc(clean_doc(d), env)
    )
    assert bad_full_rag(docs, env) == imperative  # Falsifies on mismatch
Hypothesis failure trace (run to verify; example):
Falsifying example: test_bad_full_rag_equivalence(
    docs=[RawDoc(doc_id='a', title='', abstract='a', categories='')],
    env=RagEnv(chunk_size=128),
)
AttributeError: 'RawDoc' object has no attribute 'text'
- Hypothesis shrinks to a minimal one-document input. Because bad_full_rag passes a RawDoc to embed_chunk, the call fails on doc.text before the comparison even runs, so the property catches the bug.
7. When Typed FP Isn't Worth It¶
Typed FP is rarely not worth it. The main exception is a dynamic boundary (e.g., JSON parsing): use Any there and rely on property tests to cover what the checker cannot see.
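One way to keep Any confined to such a boundary is to narrow it immediately into a typed value. The sketch below assumes a hypothetical DocRecord type and parse_doc function; it uses only the standard library's json module.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class DocRecord:  # hypothetical typed record for this sketch
    doc_id: str
    abstract: str


def parse_doc(payload: str) -> DocRecord:
    """Boundary function: Any comes in from json.loads, a typed value goes out."""
    data: Any = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    doc_id = data.get("doc_id")
    abstract = data.get("abstract")
    if not isinstance(doc_id, str) or not isinstance(abstract, str):
        raise ValueError("doc_id and abstract must be strings")
    return DocRecord(doc_id, abstract)


record = parse_doc('{"doc_id": "d1", "abstract": "typed pipelines"}')
print(record)
```

Everything downstream of parse_doc can then be fully typed, and the runtime isinstance checks are natural targets for property tests.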
8. Pre-Core Quiz¶
1. Callable[..., Any] vs Callable[P, R] – which preserves signature? → ParamSpec version
2. When to use Concatenate instead of normal arg? → Context injection
3. What does TypeVar("T") buy over Any? → Generic reuse + safety
4. Why ParamSpec for decorators? → Preserve exact args/kwargs
5. Tool to prove typed pipeline ≡ untyped? → Hypothesis equivalence
9. Post-Core Reflection & Exercise¶
Reflect: In your code, find one higher-order utility. Add types; run mypy.
Project Exercise: Type RAG pipelines; run properties on sample data.
All claims (e.g., referential transparency) are verifiable via the provided Hypothesis examples—run them to confirm.
Further Reading: For more on purity pitfalls, see 'Fluent Python' Chapter on Functions as Objects. Check free resources like Python.org's FP section or Codecademy's Advanced Python course for readers wanting basics.
Next: Core 8 – Extracting Side Effects. (Builds on this RAG pure core.)