强曰为道
与天地相似,故不违。知周乎万物,而道济天下,故不过。旁行而不流,乐天知命,故不忧.
文档目录

函数式编程艺术 / 05 Map/Filter/Reduce

05 Map/Filter/Reduce

“map/filter/reduce 是函数式编程的三驾马车——掌握它们,你就掌握了函数式数据处理的基础。”


5.1 Map(映射)

map 对集合中的每个元素应用一个函数,返回新的集合。

5.1.1 类型签名

map :: (a -> b) -> [a] -> [b]
-- 接收一个 a→b 的函数和一个 [a] 列表
-- 返回一个 [b] 列表

5.1.2 各语言实现

Haskell:

-- 内置 map
map (*2) [1, 2, 3, 4]    -- [2, 4, 6, 8]
map toUpper "hello"       -- "HELLO"(String 即 [Char])
map length ["hello", "world"]  -- [5, 5]

-- 自定义 map
myMap :: (a -> b) -> [a] -> [b]
myMap _ []     = []
myMap f (x:xs) = f x : myMap f xs

JavaScript:

// 基本 map
[1, 2, 3, 4].map(x => x * 2);       // [2, 4, 6, 8]
['hello', 'world'].map(s => s.toUpperCase()); // ['HELLO', 'WORLD']

// 对象数组 map
const users = [
  { name: 'Alice', age: 30 },
  { name: 'Bob', age: 25 },
];
users.map(u => ({ ...u, age: u.age + 1 }));

// async map
const asyncMap = async (fn, arr) =>
  Promise.all(arr.map(fn));

const results = await asyncMap(
  async (url) => fetch(url).then(r => r.json()),
  ['/api/users', '/api/posts']
);

Python:

# 使用 map 函数
list(map(lambda x: x * 2, [1, 2, 3, 4]))  # [2, 4, 6, 8]

# 列表推导式(Python 风格)
[x * 2 for x in [1, 2, 3, 4]]  # [2, 4, 6, 8]

# 嵌套 map
matrix = [[1, 2], [3, 4], [5, 6]]
[list(map(lambda x: x * 2, row)) for row in matrix]
# [[2, 4], [6, 8], [10, 12]]

Rust:

// 迭代器 map
let doubled: Vec<i32> = vec![1, 2, 3, 4].iter().map(|x| x * 2).collect();
// [2, 4, 6, 8]

// Option map
let some_number = Some(5);
let doubled = some_number.map(|x| x * 2);  // Some(10)

// Result map
let ok_value: Result<i32, &str> = Ok(5);
let doubled = ok_value.map(|x| x * 2);  // Ok(10)

Clojure:

(map #(* 2 %) [1 2 3 4])  ;; (2 4 6 8)
(map clojure.string/upper-case ["hello" "world"])  ;; ("HELLO" "WORLD")

;; 多集合 map
(map + [1 2 3] [10 20 30])  ;; (11 22 33)

5.1.3 Map 的定律

定律 表达式 含义
恒等 map id xs ≡ xs 映射恒等函数不变
组合 map (f . g) xs ≡ map f (map g xs) 先 map g 再 map f 等于一次 map 组合
-- 恒等律
map id [1, 2, 3] == [1, 2, 3]  -- True

-- 组合律
map ((*2) . (+1)) [1, 2, 3] == map (*2) (map (+1) [1, 2, 3])  -- True

5.2 Filter(过滤)

filter 根据谓词函数保留满足条件的元素。

5.2.1 类型签名

filter :: (a -> Bool) -> [a] -> [a]

5.2.2 各语言实现

Haskell:

filter even [1..10]            -- [2, 4, 6, 8, 10]
filter (>5) [1..10]            -- [6, 7, 8, 9, 10]
filter (\c -> c /= ' ') "hello world"  -- "helloworld"

-- 自定义 filter
myFilter :: (a -> Bool) -> [a] -> [a]
myFilter _ []     = []
myFilter f (x:xs)
  | f x       = x : myFilter f xs
  | otherwise = myFilter f xs

JavaScript:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10].filter(x => x % 2 === 0);
// [2, 4, 6, 8, 10]

const users = [
  { name: 'Alice', active: true, age: 30 },
  { name: 'Bob', active: false, age: 25 },
  { name: 'Charlie', active: true, age: 35 },
];

// 多条件 filter
users
  .filter(u => u.active)
  .filter(u => u.age >= 30);
// [{ name: 'Alice', ... }, { name: 'Charlie', ... }]

// reject(filter 的反面)
const reject = (fn, arr) => arr.filter((...args) => !fn(...args));
reject(x => x > 3, [1, 2, 3, 4, 5]);  // [1, 2, 3]

Python:

list(filter(lambda x: x % 2 == 0, range(1, 11)))
# [2, 4, 6, 8, 10]

# 列表推导式
[x for x in range(1, 11) if x % 2 == 0]
# [2, 4, 6, 8, 10]

# 多条件
users = [
    {'name': 'Alice', 'active': True, 'age': 30},
    {'name': 'Bob', 'active': False, 'age': 25},
    {'name': 'Charlie', 'active': True, 'age': 35},
]

[u for u in users if u['active'] and u['age'] >= 30]

Rust:

let evens: Vec<i32> = (1..=10).filter(|x| x % 2 == 0).collect();
// [2, 4, 6, 8, 10]

// 组合 filter
let result: Vec<i32> = (1..=100)
    .filter(|x| x % 2 == 0)
    .filter(|x| x % 3 == 0)
    .collect();
// [6, 12, 18, 24, ...]

Clojure:

(filter even? (range 1 11))  ;; (2 4 6 8 10)
(filter #(> % 5) (range 1 11))  ;; (6 7 8 9 10)

;; remove 是 filter 的反面
(remove even? (range 1 11))  ;; (1 3 5 7 9)

5.2.3 Filter 的定律

定律 表达式 含义
恒等 filter (const True) xs ≡ xs 保留所有元素不变
filter (const False) xs ≡ [] 过滤掉所有元素
幂等 filter p (filter p xs) ≡ filter p xs 多次过滤同条件结果不变
分配 filter p (map f xs) ≡ map f (filter (p . f) xs) 先 map 再 filter 等价

5.3 Reduce/Fold(归约)

reduce/fold 将集合归约为单个值,是最强大的列表操作——map 和 filter 都可以用 fold 实现。

5.3.1 左折叠 vs 右折叠

操作 类型签名 求值顺序 结合性
foldl (左折叠) (b -> a -> b) -> b -> [a] -> b 从左到右 尾递归优化
foldr (右折叠) (a -> b -> b) -> b -> [a] -> b 从右到左 惰性求值友好
foldl (+) 0 [1, 2, 3]  →  ((0 + 1) + 2) + 3  =  6
foldr (+) 0 [1, 2, 3]  →  1 + (2 + (3 + 0))  =  6

5.3.2 各语言实现

Haskell:

-- foldl'(严格左折叠,推荐使用)
import Data.List (foldl')

foldl' (+) 0 [1, 2, 3, 4, 5]     -- 15
foldl' (*) 1 [1, 2, 3, 4, 5]     -- 120

-- foldr(右折叠)
foldr (:) [] [1, 2, 3]            -- [1, 2, 3](append)
foldr (\x acc -> acc ++ [x]) [] [1, 2, 3]  -- [3, 2, 1](反转)

-- 用 fold 实现 map
myMap :: (a -> b) -> [a] -> [b]
myMap f = foldr (\x acc -> f x : acc) []

-- 用 fold 实现 filter
myFilter :: (a -> Bool) -> [a] -> [a]
myFilter p = foldr (\x acc -> if p x then x : acc else acc) []

-- 用 fold 实现 length
myLength :: [a] -> Int
myLength = foldl' (\acc _ -> acc + 1) 0

-- 用 fold 实现 reverse
myReverse :: [a] -> [a]
myReverse = foldl' (flip (:)) []

JavaScript:

// 左折叠
[1, 2, 3, 4, 5].reduce((acc, x) => acc + x, 0);  // 15

// 用 reduce 实现 map
const map = (fn, arr) =>
  arr.reduce((acc, x) => [...acc, fn(x)], []);

// 用 reduce 实现 filter
const filter = (fn, arr) =>
  arr.reduce((acc, x) => fn(x) ? [...acc, x] : acc, []);

// 用 reduce 实现 flat
const flatten = (arr) =>
  arr.reduce((acc, x) =>
    Array.isArray(x) ? [...acc, ...flatten(x)] : [...acc, x], []);

// 复杂示例:统计词频
const wordFrequency = (text) =>
  text.toLowerCase().split(/\s+/).reduce((freq, word) => ({
    ...freq,
    [word]: (freq[word] || 0) + 1,
  }), {});

Python:

from functools import reduce

# 左折叠
reduce(lambda acc, x: acc + x, [1, 2, 3, 4, 5], 0)  # 15

# 用 reduce 实现 map
def my_map(fn, xs):
    return reduce(lambda acc, x: acc + [fn(x)], xs, [])

# 用 reduce 实现 filter
def my_filter(fn, xs):
    return reduce(lambda acc, x: acc + [x] if fn(x) else acc, xs, [])

# 统计词频
def word_frequency(text):
    return reduce(
        lambda freq, word: {**freq, word: freq.get(word, 0) + 1},
        text.lower().split(),
        {}
    )

Rust:

// fold(左折叠)
let sum: i32 = (1..=5).fold(0, |acc, x| acc + x);  // 15

// 用 fold 实现 filter + map
let result: Vec<i32> = (1..=10).fold(Vec::new(), |mut acc, x| {
    if x % 2 == 0 {
        acc.push(x * x);
    }
    acc
});

// iter().fold vs collect
// fold 更灵活,collect 更简洁
let sum1: i32 = (1..=5).fold(0, |acc, x| acc + x);
let sum2: i32 = (1..=5).sum();  // 等价

Clojure:

(reduce + [1 2 3 4 5])       ;; 15
(reduce + 0 [1 2 3 4 5])     ;; 15(带初始值)

;; 用 reduce 实现 map
(defn my-map [f xs]
  (reduce (fn [acc x] (conj acc (f x))) [] xs))

;; 用 reduce 实现 filter
(defn my-filter [pred xs]
  (reduce (fn [acc x] (if (pred x) (conj acc x) acc)) [] xs))

;; 统计词频
(defn word-frequency [text]
  (reduce
    (fn [freq word] (update freq word (fnil inc 0)))
    {}
    (clojure.string/split (clojure.string/lower-case text) #"\s+")))

5.3.3 Reduce 的定律

定律 表达式 含义
左恒等 foldl (flip (:)) [] xs ≡ reverse xs 左折叠 cons 等于反转
右恒等 foldr (:) [] xs ≡ xs 右折叠 cons 等于原列表
fold/build foldr k z (build g) ≡ g k z GHC 融合优化

5.4 Scan(扫描)

scan 类似 reduce,但返回每一步的中间结果。

5.4.1 类型签名

scanl :: (b -> a -> b) -> b -> [a] -> [b]
scanr :: (a -> b -> b) -> b -> [a] -> [b]

5.4.2 各语言实现

Haskell:

scanl (+) 0 [1, 2, 3, 4]  -- [0, 1, 3, 6, 10]
scanl (*) 1 [1, 2, 3, 4]  -- [1, 1, 2, 6, 24]

-- 实用示例:累计和
runningSums :: Num a => [a] -> [a]
runningSums = scanl1 (+)

runningSums [1, 2, 3, 4]  -- [1, 3, 6, 10]

JavaScript:

// 实现 scan
const scan = (fn, initial) => (arr) =>
  arr.reduce((acc, x) => {
    acc.push(fn(acc[acc.length - 1], x));
    return acc;
  }, [initial]);

const runningSum = scan((a, b) => a + b, 0);
runningSum([1, 2, 3, 4]);  // [0, 1, 3, 6, 10]

// RxJS 中的 scan
import { from, scan } from 'rxjs';
from([1, 2, 3, 4]).pipe(
  scan((acc, x) => acc + x, 0)
).subscribe(console.log);
// 输出: 0, 1, 3, 6, 10

Python:

from itertools import accumulate

list(accumulate([1, 2, 3, 4]))           # [1, 3, 6, 10]
list(accumulate([1, 2, 3, 4], lambda a, b: a * b))  # [1, 2, 6, 24]

Rust:

let sums: Vec<i32> = vec![1, 2, 3, 4]
    .iter()
    .scan(0, |acc, &x| { *acc += x; Some(*acc) })
    .collect();
// [1, 3, 6, 10]

Clojure:

;; Clojure 没有内置 scan,但可以实现
(defn scanl [f init xs]
  (reductions f init xs))

(reductions + 0 [1 2 3 4])  ;; (0 1 3 6 10)

5.5 管道操作(Pipeline)

管道操作是将一系列变换串联起来的模式,数据从左到右流过每个变换。

5.5.1 各语言的管道

Haskell:

-- 使用 $ 和 &
import Data.Function ((&))

-- $ 从右到左
result1 = sum . map (*2) . filter even $ [1..10]
-- 等价于
result1 = sum (map (*2) (filter even [1..10]))

-- & 从左到右(管道风格)
result2 = [1..10]
  & filter even
  & map (*2)
  & sum

JavaScript:

// 管道函数
const pipe = (...fns) => (x) => fns.reduce((v, f) => f(v), x);

const process = pipe(
  arr => arr.filter(x => x % 2 === 0),
  arr => arr.map(x => x * 2),
  arr => arr.reduce((sum, x) => sum + x, 0)
);

process([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);  // 60

// TC39 Pipeline Operator (Stage 2)
// result = [1,2,3,4,5,6,7,8,9,10]
//   |> arr => arr.filter(x => x % 2 === 0)
//   |> arr => arr.map(x => x * 2)
//   |> arr => arr.reduce((sum, x) => sum + x, 0)

Python:

# 使用 pipe 库
from pipe import select, where

result = (
    range(1, 11)
    | where(lambda x: x % 2 == 0)
    | select(lambda x: x * 2)
)

# 手动管道
def pipeline(*fns):
    return lambda x: reduce(lambda v, f: f(v), fns, x)

process = pipeline(
    lambda xs: [x for x in xs if x % 2 == 0],
    lambda xs: [x * 2 for x in xs],
    sum
)

process(range(1, 11))  # 60

Rust:

// 迭代器链式调用就是管道
let result: i32 = (1..=10)
    .filter(|x| x % 2 == 0)
    .map(|x| x * 2)
    .sum();
// result = 60

// 使用 tap(inspect)调试
let result: Vec<i32> = (1..=10)
    .filter(|x| x % 2 == 0)
    .inspect(|x| println!("after filter: {}", x))
    .map(|x| x * 2)
    .inspect(|x| println!("after map: {}", x))
    .collect();

Clojure:

;; 线程宏 -> (管道)
(->> (range 1 11)
     (filter even?)
     (map #(* 2 %))
     (reduce +))
;; 60

;; -> 用于第一个参数位置
(-> "  Hello World  "
    clojure.string/trim
    clojure.string/lower-case
    (clojure.string/split #"\s+"))
;; ["hello" "world"]

5.6 Transducer(转换器)

Transducer 是 composable(可组合的)算法转换,与集合的遍历策略解耦。

5.6.1 核心思想

传统 map/filter:  每次操作都创建中间集合
Transducer:       将所有转换合并为一次遍历,无中间集合

5.6.2 Clojure 中的 Transducer

;; 传统方式:三次遍历
(->> (range 1000000)
     (filter even?)      ;; 中间集合 1
     (map #(* 2 %))      ;; 中间集合 2
     (take 10))          ;; 中间集合 3

;; Transducer 方式:一次遍历
(transduce
  (comp
    (filter even?)
    (map #(* 2 %))
    (take 10))
  conj
  []
  (range 1000000))

;; Transducer 可以复用到不同上下文
(def xform (comp (filter even?) (map #(* 2 %))))

;; 用于向量
(into [] xform (range 100))

;; 用于惰性序列
(sequence xform (range 100))

;; 用于异步通道
(let [ch (async/chan 10 xform)]
  (async/>!! ch 1)
  (async/<!! ch))

5.6.3 JavaScript 中的 Transducer

// 简化的 Transducer 实现
const map = (fn) => (reducer) => (acc, x) => reducer(acc, fn(x));
const filter = (pred) => (reducer) => (acc, x) =>
  pred(x) ? reducer(acc, x) : acc;

const compose = (...fns) => (x) =>
  fns.reduceRight((acc, fn) => fn(acc), x);

const append = (acc, x) => [...acc, x];

// 组合 Transducer
const xform = compose(
  filter(x => x % 2 === 0),
  map(x => x * 2)
);

// 应用到数组
const transduce = (xform, reducer, init, coll) =>
  coll.reduce(xform(reducer), init);

transduce(xform, append, [], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
// [4, 8, 12, 16, 20]

5.6.4 性能对比

方式 遍历次数 中间分配 内存使用
链式 map/filter N 次 N-1 个中间集合
手动 for 循环 1 次
Transducer 1 次

5.7 FlatMap(展平映射)

FlatMap 先映射再展平一层。

5.7.1 各语言实现

Haskell:

-- concatMap = concat . map
concatMap (\x -> [x, x*2]) [1, 2, 3]
-- [1, 2, 2, 4, 3, 6]

-- 也叫 >>= (bind)
-- Maybe monad 示例
safeDivide :: Int -> Int -> Maybe Int
safeDivide _ 0 = Nothing
safeDivide a b = Just (a `div` b)

result :: Maybe Int
result = Just 10 >>= \x -> safeDivide x 2  -- Just 5

JavaScript:

// flatMap
[1, 2, 3].flatMap(x => [x, x * 2]);  // [1, 2, 2, 4, 3, 6]

// 展平嵌套数组
[[1, 2], [3, 4], [5, 6]].flatMap(x => x);  // [1, 2, 3, 4, 5, 6]

// 处理可选值
const users = [
  { name: 'Alice', email: '[email protected]' },
  { name: 'Bob', email: null },
  { name: 'Charlie', email: '[email protected]' },
];
const emails = users.flatMap(u => u.email ? [u.email] : []);
// ['[email protected]', '[email protected]']

Python:

from itertools import chain

# 使用 chain.from_iterable 展平
list(chain.from_iterable([[1, 2], [3, 4], [5, 6]]))
# [1, 2, 3, 4, 5, 6]

# flatMap
list(chain.from_iterable([[x, x*2] for x in [1, 2, 3]]))
# [1, 2, 2, 4, 3, 6]

Rust:

// flat_map
let result: Vec<i32> = vec![1, 2, 3]
    .iter()
    .flat_map(|&x| vec![x, x * 2])
    .collect();
// [1, 2, 2, 4, 3, 6]

// Option 的 flat_map
let result: Option<i32> = Some(10)
    .flat_map(|x| if x > 5 { Some(x * 2) } else { None });
// Some(20)

Clojure:

(mapcat (fn [x] [x (* x 2)]) [1 2 3])
;; (1 2 2 4 3 6)

5.8 其他常用操作

5.8.1 操作速查表

操作 Haskell JavaScript Python Rust Clojure
映射 map .map() map() .map() map
过滤 filter .filter() filter() .filter() filter
归约 foldl' .reduce() reduce() .fold() reduce
扫描 scanl (自定义) accumulate() .scan() reductions
展平 concat .flat() chain() .flatten() flatten
扁平映射 concatMap .flatMap() chain() .flat_map() mapcat
存在 any .some() any() .any() some
全部 all .every() all() .all() every?
查找 find .find() next(filter()) .find() some
计数 length .length len() .count() count
分组 groupBy Object.groupBy groupby() (itertools) group-by
去重 nub [...new Set()] set() .unique() distinct

5.8.2 分组与聚合

JavaScript:

const orders = [
  { product: 'Laptop', category: 'Electronics', price: 999 },
  { product: 'Mouse', category: 'Electronics', price: 25 },
  { product: 'Book', category: 'Education', price: 30 },
  { product: 'Pen', category: 'Education', price: 5 },
];

// 分组
const groupByCategory = (orders) =>
  orders.reduce((groups, order) => ({
    ...groups,
    [order.category]: [...(groups[order.category] || []), order],
  }), {});

// 聚合
const salesByCategory = (orders) =>
  Object.entries(groupByCategory(orders)).map(([category, items]) => ({
    category,
    totalSales: items.reduce((sum, i) => sum + i.price, 0),
    count: items.length,
  }));

Haskell:

import qualified Data.Map as Map
import Data.List (groupBy, sortOn)
import Data.Function (on)

groupByCategory :: [Order] -> Map.Map String [Order]
groupByCategory = Map.fromListWith (++) . map (\o -> (category o, [o]))

salesByCategory :: [Order] -> [(String, Double, Int)]
salesByCategory orders =
  let grouped = Map.toList $ groupByCategory orders
  in map (\(cat, items) -> (cat, sum (map price items), length items)) grouped

5.9 业务场景

5.9.1 日志分析管道

// 日志分析管道
const analyzeLogs = (logs) => {
  const parseLog = (line) => {
    const [timestamp, level, ...msgParts] = line.split(' ');
    return { timestamp, level, message: msgParts.join(' ') };
  };

  return logs
    .map(parseLog)                                    // 解析
    .filter(log => log.level === 'ERROR')             // 只保留错误
    .map(log => ({                                     // 提取关键信息
      ...log,
      module: log.message.match(/\[(\w+)\]/)?.[1],
    }))
    .reduce((stats, log) => ({                        // 统计
      ...stats,
      [log.module]: (stats[log.module] || 0) + 1,
    }), {});
};

// 使用
const logs = [
  '2026-01-01 ERROR [Auth] Login failed for user alice',
  '2026-01-01 INFO [Auth] Login success for user bob',
  '2026-01-01 ERROR [DB] Connection timeout',
  '2026-01-01 ERROR [Auth] Token expired',
];
analyzeLogs(logs);  // { Auth: 2, DB: 1 }

5.9.2 电商报表

from functools import reduce
from itertools import groupby
from operator import itemgetter

def generate_sales_report(orders):
    # 管道:筛选已完成订单 → 按类别分组 → 计算统计
    completed = [o for o in orders if o['status'] == 'completed']

    # 按月份分组
    def key_func(o):
        return o['date'][:7]  # YYYY-MM

    completed.sort(key=key_func)

    reports = []
    for month, group in groupby(completed, key=key_func):
        items = list(group)
        reports.append({
            'month': month,
            'total_orders': len(items),
            'total_revenue': sum(o['total'] for o in items),
            'avg_order_value': sum(o['total'] for o in items) / len(items),
            'top_category': max(
                reduce(lambda acc, o: {**acc, o['category']: acc.get(o['category'], 0) + o['total']},
                       items, {}).items(),
                key=lambda x: x[1]
            )[0],
        })

    return reports

5.10 注意事项

注意事项 说明
惰性 vs 急切 Haskell 默认惰性,其他语言通常急切
中间集合开销 链式调用产生中间集合,考虑使用 Transducer
短路求值 any/all/find 可以提前终止
并行化 map 天然可并行,reduce 需要结合律
错误处理 大规模数据流中考虑 Either/Result

5.11 小结

要点 说明
Map 转换每个元素,保持长度不变
Filter 根据谓词筛选元素
Reduce/Fold 将集合归约为单个值
Scan 像 reduce 但保留中间结果
Transducer 可组合的算法转换,无中间集合
定律 恒等律、组合律、分配律保证可预测性

扩展阅读

  1. Transducers: Efficient Data Processing Pipelines in JavaScript — Eric Elliott
  2. Clojure Transducers — Rich Hickey
  3. Functors, Applicatives, And Monads In Pictures — Aditya Bhargava

下一章06 模式匹配 — 强大的数据解构工具