函数式编程艺术 / 05 Map/Filter/Reduce

05 Map/Filter/Reduce

“map/filter/reduce 是函数式编程的三驾马车——掌握它们，你就掌握了函数式数据处理的基础。”

5.1 Map（映射）

map 对集合中的每个元素应用一个函数，返回新的集合。

5.1.1 类型签名

map :: (a -> b) -> [a] -> [b]
-- 接收一个 a→b 的函数和一个 [a] 列表
-- 返回一个 [b] 列表

5.1.2 各语言实现

Haskell：

-- 内置 map
map (*2) [1, 2, 3, 4]    -- [2, 4, 6, 8]
map toUpper "hello"       -- "HELLO"（String 即 [Char]）
map length ["hello", "world"]  -- [5, 5]

-- 自定义 map
myMap :: (a -> b) -> [a] -> [b]
myMap _ []     = []
myMap f (x:xs) = f x : myMap f xs

JavaScript：

// 基本 map
[1, 2, 3, 4].map(x => x * 2);       // [2, 4, 6, 8]
['hello', 'world'].map(s => s.toUpperCase()); // ['HELLO', 'WORLD']

// 对象数组 map
const users = [
  { name: 'Alice', age: 30 },
  { name: 'Bob', age: 25 },
];
users.map(u => ({ ...u, age: u.age + 1 }));

// async map
const asyncMap = async (fn, arr) =>
  Promise.all(arr.map(fn));

const results = await asyncMap(
  async (url) => fetch(url).then(r => r.json()),
  ['/api/users', '/api/posts']
);

Python：

# 使用 map 函数
list(map(lambda x: x * 2, [1, 2, 3, 4]))  # [2, 4, 6, 8]

# 列表推导式（Python 风格）
[x * 2 for x in [1, 2, 3, 4]]  # [2, 4, 6, 8]

# 嵌套 map
matrix = [[1, 2], [3, 4], [5, 6]]
[list(map(lambda x: x * 2, row)) for row in matrix]
# [[2, 4], [6, 8], [10, 12]]

Rust：

// 迭代器 map
let doubled: Vec<i32> = vec![1, 2, 3, 4].iter().map(|x| x * 2).collect();
// [2, 4, 6, 8]

// Option map
let some_number = Some(5);
let doubled = some_number.map(|x| x * 2);  // Some(10)

// Result map
let ok_value: Result<i32, &str> = Ok(5);
let doubled = ok_value.map(|x| x * 2);  // Ok(10)

Clojure：

(map #(* 2 %) [1 2 3 4])  ;; (2 4 6 8)
(map clojure.string/upper-case ["hello" "world"])  ;; ("HELLO" "WORLD")

;; 多集合 map
(map + [1 2 3] [10 20 30])  ;; (11 22 33)

5.1.3 Map 的定律

定律	表达式	含义
恒等	`map id xs ≡ xs`	映射恒等函数不变
组合	`map (f . g) xs ≡ map f (map g xs)`	先 map g 再 map f 等于一次 map 组合

-- 恒等律
map id [1, 2, 3] == [1, 2, 3]  -- True

-- 组合律
map ((*2) . (+1)) [1, 2, 3] == map (*2) (map (+1) [1, 2, 3])  -- True

5.2 Filter（过滤）

filter 根据谓词函数保留满足条件的元素。

5.2.1 类型签名

filter :: (a -> Bool) -> [a] -> [a]

5.2.2 各语言实现

Haskell：

filter even [1..10]            -- [2, 4, 6, 8, 10]
filter (>5) [1..10]            -- [6, 7, 8, 9, 10]
filter (\c -> c /= ' ') "hello world"  -- "helloworld"

-- 自定义 filter
myFilter :: (a -> Bool) -> [a] -> [a]
myFilter _ []     = []
myFilter f (x:xs)
  | f x       = x : myFilter f xs
  | otherwise = myFilter f xs

JavaScript：

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10].filter(x => x % 2 === 0);
// [2, 4, 6, 8, 10]

const users = [
  { name: 'Alice', active: true, age: 30 },
  { name: 'Bob', active: false, age: 25 },
  { name: 'Charlie', active: true, age: 35 },
];

// 多条件 filter
users
  .filter(u => u.active)
  .filter(u => u.age >= 30);
// [{ name: 'Alice', ... }, { name: 'Charlie', ... }]

// reject（filter 的反面）
const reject = (fn, arr) => arr.filter((...args) => !fn(...args));
reject(x => x > 3, [1, 2, 3, 4, 5]);  // [1, 2, 3]

Python：

list(filter(lambda x: x % 2 == 0, range(1, 11)))
# [2, 4, 6, 8, 10]

# 列表推导式
[x for x in range(1, 11) if x % 2 == 0]
# [2, 4, 6, 8, 10]

# 多条件
users = [
    {'name': 'Alice', 'active': True, 'age': 30},
    {'name': 'Bob', 'active': False, 'age': 25},
    {'name': 'Charlie', 'active': True, 'age': 35},
]

[u for u in users if u['active'] and u['age'] >= 30]

Rust：

let evens: Vec<i32> = (1..=10).filter(|x| x % 2 == 0).collect();
// [2, 4, 6, 8, 10]

// 组合 filter
let result: Vec<i32> = (1..=100)
    .filter(|x| x % 2 == 0)
    .filter(|x| x % 3 == 0)
    .collect();
// [6, 12, 18, 24, ...]

Clojure：

(filter even? (range 1 11))  ;; (2 4 6 8 10)
(filter #(> % 5) (range 1 11))  ;; (6 7 8 9 10)

;; remove 是 filter 的反面
(remove even? (range 1 11))  ;; (1 3 5 7 9)

5.2.3 Filter 的定律

定律	表达式	含义
恒等	`filter (const True) xs ≡ xs`	保留所有元素不变
空	`filter (const False) xs ≡ []`	过滤掉所有元素
幂等	`filter p (filter p xs) ≡ filter p xs`	多次过滤同条件结果不变
分配	`filter p (map f xs) ≡ map f (filter (p . f) xs)`	先 map 再 filter 等价

5.3 Reduce/Fold（归约）

reduce/fold 将集合归约为单个值，是最强大的列表操作——map 和 filter 都可以用 fold 实现。

5.3.1 左折叠 vs 右折叠

操作	类型签名	求值顺序	结合性
foldl (左折叠)	`(b -> a -> b) -> b -> [a] -> b`	从左到右	尾递归优化
foldr (右折叠)	`(a -> b -> b) -> b -> [a] -> b`	从右到左	惰性求值友好

foldl (+) 0 [1, 2, 3]  →  ((0 + 1) + 2) + 3  =  6
foldr (+) 0 [1, 2, 3]  →  1 + (2 + (3 + 0))  =  6

5.3.2 各语言实现

Haskell：

-- foldl'（严格左折叠，推荐使用）
import Data.List (foldl')

foldl' (+) 0 [1, 2, 3, 4, 5]     -- 15
foldl' (*) 1 [1, 2, 3, 4, 5]     -- 120

-- foldr（右折叠）
foldr (:) [] [1, 2, 3]            -- [1, 2, 3]（append）
foldr (\x acc -> acc ++ [x]) [] [1, 2, 3]  -- [3, 2, 1]（反转）

-- 用 fold 实现 map
myMap :: (a -> b) -> [a] -> [b]
myMap f = foldr (\x acc -> f x : acc) []

-- 用 fold 实现 filter
myFilter :: (a -> Bool) -> [a] -> [a]
myFilter p = foldr (\x acc -> if p x then x : acc else acc) []

-- 用 fold 实现 length
myLength :: [a] -> Int
myLength = foldl' (\acc _ -> acc + 1) 0

-- 用 fold 实现 reverse
myReverse :: [a] -> [a]
myReverse = foldl' (flip (:)) []

JavaScript：

// 左折叠
[1, 2, 3, 4, 5].reduce((acc, x) => acc + x, 0);  // 15

// 用 reduce 实现 map
const map = (fn, arr) =>
  arr.reduce((acc, x) => [...acc, fn(x)], []);

// 用 reduce 实现 filter
const filter = (fn, arr) =>
  arr.reduce((acc, x) => fn(x) ? [...acc, x] : acc, []);

// 用 reduce 实现 flat
const flatten = (arr) =>
  arr.reduce((acc, x) =>
    Array.isArray(x) ? [...acc, ...flatten(x)] : [...acc, x], []);

// 复杂示例：统计词频
const wordFrequency = (text) =>
  text.toLowerCase().split(/\s+/).reduce((freq, word) => ({
    ...freq,
    [word]: (freq[word] || 0) + 1,
  }), {});

Python：

from functools import reduce

# 左折叠
reduce(lambda acc, x: acc + x, [1, 2, 3, 4, 5], 0)  # 15

# 用 reduce 实现 map
def my_map(fn, xs):
    return reduce(lambda acc, x: acc + [fn(x)], xs, [])

# 用 reduce 实现 filter
def my_filter(fn, xs):
    return reduce(lambda acc, x: acc + [x] if fn(x) else acc, xs, [])

# 统计词频
def word_frequency(text):
    return reduce(
        lambda freq, word: {**freq, word: freq.get(word, 0) + 1},
        text.lower().split(),
        {}
    )

Rust：

// fold（左折叠）
let sum: i32 = (1..=5).fold(0, |acc, x| acc + x);  // 15

// 用 fold 实现 filter + map
let result: Vec<i32> = (1..=10).fold(Vec::new(), |mut acc, x| {
    if x % 2 == 0 {
        acc.push(x * x);
    }
    acc
});

// iter().fold vs collect
// fold 更灵活，collect 更简洁
let sum1: i32 = (1..=5).fold(0, |acc, x| acc + x);
let sum2: i32 = (1..=5).sum();  // 等价

Clojure：

(reduce + [1 2 3 4 5])       ;; 15
(reduce + 0 [1 2 3 4 5])     ;; 15（带初始值）

;; 用 reduce 实现 map
(defn my-map [f xs]
  (reduce (fn [acc x] (conj acc (f x))) [] xs))

;; 用 reduce 实现 filter
(defn my-filter [pred xs]
  (reduce (fn [acc x] (if (pred x) (conj acc x) acc)) [] xs))

;; 统计词频
(defn word-frequency [text]
  (reduce
    (fn [freq word] (update freq word (fnil inc 0)))
    {}
    (clojure.string/split (clojure.string/lower-case text) #"\s+")))

5.3.3 Reduce 的定律

定律	表达式	含义
左恒等	`foldl (flip (:)) [] xs ≡ reverse xs`	左折叠 cons 等于反转
右恒等	`foldr (:) [] xs ≡ xs`	右折叠 cons 等于原列表
fold/build	`foldr k z (build g) ≡ g k z`	GHC 融合优化

5.4 Scan（扫描）

scan 类似 reduce，但返回每一步的中间结果。

5.4.1 类型签名

scanl :: (b -> a -> b) -> b -> [a] -> [b]
scanr :: (a -> b -> b) -> b -> [a] -> [b]

5.4.2 各语言实现

Haskell：

scanl (+) 0 [1, 2, 3, 4]  -- [0, 1, 3, 6, 10]
scanl (*) 1 [1, 2, 3, 4]  -- [1, 1, 2, 6, 24]

-- 实用示例：累计和
runningSums :: Num a => [a] -> [a]
runningSums = scanl1 (+)

runningSums [1, 2, 3, 4]  -- [1, 3, 6, 10]

JavaScript：

// 实现 scan
const scan = (fn, initial) => (arr) =>
  arr.reduce((acc, x) => {
    acc.push(fn(acc[acc.length - 1], x));
    return acc;
  }, [initial]);

const runningSum = scan((a, b) => a + b, 0);
runningSum([1, 2, 3, 4]);  // [0, 1, 3, 6, 10]

// RxJS 中的 scan
import { from, scan } from 'rxjs';
from([1, 2, 3, 4]).pipe(
  scan((acc, x) => acc + x, 0)
).subscribe(console.log);
// 输出: 0, 1, 3, 6, 10

Python：

from itertools import accumulate

list(accumulate([1, 2, 3, 4]))           # [1, 3, 6, 10]
list(accumulate([1, 2, 3, 4], lambda a, b: a * b))  # [1, 2, 6, 24]

Rust：

let sums: Vec<i32> = vec![1, 2, 3, 4]
    .iter()
    .scan(0, |acc, &x| { *acc += x; Some(*acc) })
    .collect();
// [1, 3, 6, 10]

Clojure：

;; Clojure 没有内置 scan，但可以实现
(defn scanl [f init xs]
  (reductions f init xs))

(reductions + 0 [1 2 3 4])  ;; (0 1 3 6 10)

5.5 管道操作（Pipeline）

管道操作是将一系列变换串联起来的模式，数据从左到右流过每个变换。

5.5.1 各语言的管道

Haskell：

-- 使用 $ 和 &
import Data.Function ((&))

-- $ 从右到左
result1 = sum . map (*2) . filter even $ [1..10]
-- 等价于
result1 = sum (map (*2) (filter even [1..10]))

-- & 从左到右（管道风格）
result2 = [1..10]
  & filter even
  & map (*2)
  & sum

JavaScript：

// 管道函数
const pipe = (...fns) => (x) => fns.reduce((v, f) => f(v), x);

const process = pipe(
  arr => arr.filter(x => x % 2 === 0),
  arr => arr.map(x => x * 2),
  arr => arr.reduce((sum, x) => sum + x, 0)
);

process([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);  // 60

// TC39 Pipeline Operator (Stage 2)
// result = [1,2,3,4,5,6,7,8,9,10]
//   |> arr => arr.filter(x => x % 2 === 0)
//   |> arr => arr.map(x => x * 2)
//   |> arr => arr.reduce((sum, x) => sum + x, 0)

Python：

# 使用 pipe 库
from pipe import select, where

result = (
    range(1, 11)
    | where(lambda x: x % 2 == 0)
    | select(lambda x: x * 2)
)

# 手动管道
def pipeline(*fns):
    return lambda x: reduce(lambda v, f: f(v), fns, x)

process = pipeline(
    lambda xs: [x for x in xs if x % 2 == 0],
    lambda xs: [x * 2 for x in xs],
    sum
)

process(range(1, 11))  # 60

Rust：

// 迭代器链式调用就是管道
let result: i32 = (1..=10)
    .filter(|x| x % 2 == 0)
    .map(|x| x * 2)
    .sum();
// result = 60

// 使用 tap（inspect）调试
let result: Vec<i32> = (1..=10)
    .filter(|x| x % 2 == 0)
    .inspect(|x| println!("after filter: {}", x))
    .map(|x| x * 2)
    .inspect(|x| println!("after map: {}", x))
    .collect();

Clojure：

;; 线程宏 -> （管道）
(->> (range 1 11)
     (filter even?)
     (map #(* 2 %))
     (reduce +))
;; 60

;; -> 用于第一个参数位置
(-> "  Hello World  "
    clojure.string/trim
    clojure.string/lower-case
    (clojure.string/split #"\s+"))
;; ["hello" "world"]

5.6 Transducer（转换器）

Transducer 是 composable（可组合的）算法转换，与集合的遍历策略解耦。

5.6.1 核心思想

传统 map/filter:  每次操作都创建中间集合
Transducer:       将所有转换合并为一次遍历，无中间集合

5.6.2 Clojure 中的 Transducer

;; 传统方式：三次遍历
(->> (range 1000000)
     (filter even?)      ;; 中间集合 1
     (map #(* 2 %))      ;; 中间集合 2
     (take 10))          ;; 中间集合 3

;; Transducer 方式：一次遍历
(transduce
  (comp
    (filter even?)
    (map #(* 2 %))
    (take 10))
  conj
  []
  (range 1000000))

;; Transducer 可以复用到不同上下文
(def xform (comp (filter even?) (map #(* 2 %))))

;; 用于向量
(into [] xform (range 100))

;; 用于惰性序列
(sequence xform (range 100))

;; 用于异步通道
(let [ch (async/chan 10 xform)]
  (async/>!! ch 1)
  (async/<!! ch))

5.6.3 JavaScript 中的 Transducer

// 简化的 Transducer 实现
const map = (fn) => (reducer) => (acc, x) => reducer(acc, fn(x));
const filter = (pred) => (reducer) => (acc, x) =>
  pred(x) ? reducer(acc, x) : acc;

const compose = (...fns) => (x) =>
  fns.reduceRight((acc, fn) => fn(acc), x);

const append = (acc, x) => [...acc, x];

// 组合 Transducer
const xform = compose(
  filter(x => x % 2 === 0),
  map(x => x * 2)
);

// 应用到数组
const transduce = (xform, reducer, init, coll) =>
  coll.reduce(xform(reducer), init);

transduce(xform, append, [], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
// [4, 8, 12, 16, 20]

5.6.4 性能对比

方式	遍历次数	中间分配	内存使用
链式 map/filter	N 次	N-1 个中间集合	高
手动 for 循环	1 次	无	低
Transducer	1 次	无	低

5.7 FlatMap（展平映射）

FlatMap 先映射再展平一层。

5.7.1 各语言实现

Haskell：

-- concatMap = concat . map
concatMap (\x -> [x, x*2]) [1, 2, 3]
-- [1, 2, 2, 4, 3, 6]

-- 也叫 >>= (bind)
-- Maybe monad 示例
safeDivide :: Int -> Int -> Maybe Int
safeDivide _ 0 = Nothing
safeDivide a b = Just (a `div` b)

result :: Maybe Int
result = Just 10 >>= \x -> safeDivide x 2  -- Just 5

JavaScript：

// flatMap
[1, 2, 3].flatMap(x => [x, x * 2]);  // [1, 2, 2, 4, 3, 6]

// 展平嵌套数组
[[1, 2], [3, 4], [5, 6]].flatMap(x => x);  // [1, 2, 3, 4, 5, 6]

// 处理可选值
const users = [
  { name: 'Alice', email: '[email protected]' },
  { name: 'Bob', email: null },
  { name: 'Charlie', email: '[email protected]' },
];
const emails = users.flatMap(u => u.email ? [u.email] : []);
// ['[email protected]', '[email protected]']

Python：

from itertools import chain

# 使用 chain.from_iterable 展平
list(chain.from_iterable([[1, 2], [3, 4], [5, 6]]))
# [1, 2, 3, 4, 5, 6]

# flatMap
list(chain.from_iterable([[x, x*2] for x in [1, 2, 3]]))
# [1, 2, 2, 4, 3, 6]

Rust：

// flat_map
let result: Vec<i32> = vec![1, 2, 3]
    .iter()
    .flat_map(|&x| vec![x, x * 2])
    .collect();
// [1, 2, 2, 4, 3, 6]

// Option 的 flat_map
let result: Option<i32> = Some(10)
    .flat_map(|x| if x > 5 { Some(x * 2) } else { None });
// Some(20)

Clojure：

(mapcat (fn [x] [x (* x 2)]) [1 2 3])
;; (1 2 2 4 3 6)

5.8 其他常用操作

5.8.1 操作速查表

操作	Haskell	JavaScript	Python	Rust	Clojure
映射	`map`	`.map()`	`map()`	`.map()`	`map`
过滤	`filter`	`.filter()`	`filter()`	`.filter()`	`filter`
归约	`foldl'`	`.reduce()`	`reduce()`	`.fold()`	`reduce`
扫描	`scanl`	(自定义)	`accumulate()`	`.scan()`	`reductions`
展平	`concat`	`.flat()`	`chain()`	`.flatten()`	`flatten`
扁平映射	`concatMap`	`.flatMap()`	`chain()`	`.flat_map()`	`mapcat`
存在	`any`	`.some()`	`any()`	`.any()`	`some`
全部	`all`	`.every()`	`all()`	`.all()`	`every?`
查找	`find`	`.find()`	`next(filter())`	`.find()`	`some`
计数	`length`	`.length`	`len()`	`.count()`	`count`
分组	`groupBy`	`Object.groupBy`	`groupby()`	(itertools)	`group-by`
去重	`nub`	`[...new Set()]`	`set()`	`.unique()`	`distinct`

5.8.2 分组与聚合

JavaScript：

const orders = [
  { product: 'Laptop', category: 'Electronics', price: 999 },
  { product: 'Mouse', category: 'Electronics', price: 25 },
  { product: 'Book', category: 'Education', price: 30 },
  { product: 'Pen', category: 'Education', price: 5 },
];

// 分组
const groupByCategory = (orders) =>
  orders.reduce((groups, order) => ({
    ...groups,
    [order.category]: [...(groups[order.category] || []), order],
  }), {});

// 聚合
const salesByCategory = (orders) =>
  Object.entries(groupByCategory(orders)).map(([category, items]) => ({
    category,
    totalSales: items.reduce((sum, i) => sum + i.price, 0),
    count: items.length,
  }));

Haskell：

import qualified Data.Map as Map
import Data.List (groupBy, sortOn)
import Data.Function (on)

groupByCategory :: [Order] -> Map.Map String [Order]
groupByCategory = Map.fromListWith (++) . map (\o -> (category o, [o]))

salesByCategory :: [Order] -> [(String, Double, Int)]
salesByCategory orders =
  let grouped = Map.toList $ groupByCategory orders
  in map (\(cat, items) -> (cat, sum (map price items), length items)) grouped

5.9 业务场景

5.9.1 日志分析管道

// 日志分析管道
const analyzeLogs = (logs) => {
  const parseLog = (line) => {
    const [timestamp, level, ...msgParts] = line.split(' ');
    return { timestamp, level, message: msgParts.join(' ') };
  };

  return logs
    .map(parseLog)                                    // 解析
    .filter(log => log.level === 'ERROR')             // 只保留错误
    .map(log => ({                                     // 提取关键信息
      ...log,
      module: log.message.match(/\[(\w+)\]/)?.[1],
    }))
    .reduce((stats, log) => ({                        // 统计
      ...stats,
      [log.module]: (stats[log.module] || 0) + 1,
    }), {});
};

// 使用
const logs = [
  '2026-01-01 ERROR [Auth] Login failed for user alice',
  '2026-01-01 INFO [Auth] Login success for user bob',
  '2026-01-01 ERROR [DB] Connection timeout',
  '2026-01-01 ERROR [Auth] Token expired',
];
analyzeLogs(logs);  // { Auth: 2, DB: 1 }

5.9.2 电商报表

from functools import reduce
from itertools import groupby
from operator import itemgetter

def generate_sales_report(orders):
    # 管道：筛选已完成订单 → 按类别分组 → 计算统计
    completed = [o for o in orders if o['status'] == 'completed']

    # 按月份分组
    def key_func(o):
        return o['date'][:7]  # YYYY-MM

    completed.sort(key=key_func)

    reports = []
    for month, group in groupby(completed, key=key_func):
        items = list(group)
        reports.append({
            'month': month,
            'total_orders': len(items),
            'total_revenue': sum(o['total'] for o in items),
            'avg_order_value': sum(o['total'] for o in items) / len(items),
            'top_category': max(
                reduce(lambda acc, o: {**acc, o['category']: acc.get(o['category'], 0) + o['total']},
                       items, {}).items(),
                key=lambda x: x[1]
            )[0],
        })

    return reports

5.10 注意事项

注意事项	说明
惰性 vs 急切	Haskell 默认惰性，其他语言通常急切
中间集合开销	链式调用产生中间集合，考虑使用 Transducer
短路求值	`any`/`all`/`find` 可以提前终止
并行化	`map` 天然可并行，`reduce` 需要结合律
错误处理	大规模数据流中考虑 Either/Result

5.11 小结

要点	说明
Map	转换每个元素，保持长度不变
Filter	根据谓词筛选元素
Reduce/Fold	将集合归约为单个值
Scan	像 reduce 但保留中间结果
Transducer	可组合的算法转换，无中间集合
定律	恒等律、组合律、分配律保证可预测性

扩展阅读

Transducers: Efficient Data Processing Pipelines in JavaScript — Eric Elliott
Clojure Transducers — Rich Hickey
Functors, Applicatives, And Monads In Pictures — Aditya Bhargava

下一章：06 模式匹配 — 强大的数据解构工具