Chapter 11: Data Files and Dynamic Content

11.1 Data Files Overview

Data files live in the _data/ directory and serve as Jekyll's structured data source. Templates read them through the site.data object.

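Supported formats

Format   Extension      Typical use
YAML     .yml, .yaml    Configuration, structured data
JSON     .json          API responses, complex nested data
CSV      .csv           Tabular data, bulk imports
TSV      .tsv           Tab-separated tabular data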

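Naming rules

The file name without its extension becomes the key under site.data, and each subdirectory adds one more level: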

_data/
├── navigation.yml       → site.data.navigation
├── authors.yml          → site.data.authors
├── products.json        → site.data.products
├── stats.csv            → site.data.stats
└── settings/
    └── theme.yml        → site.data.settings.theme
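
The mapping above can be sketched in plain Ruby. This is a simplified illustration, not Jekyll's own reader: the helper name data_key_for is hypothetical, and real Jekyll additionally sanitizes unusual file names.

```ruby
# Simplified sketch: map a path under _data/ to the Liquid expression
# that reaches it. data_key_for is a hypothetical helper, not a Jekyll API.
DATA_EXTENSIONS = %w[.yml .yaml .json .csv .tsv].freeze

def data_key_for(path)
  relative = path.sub(%r{\A_data/}, '')
  ext = File.extname(relative)
  # Files with other extensions are not treated as data files here.
  return nil unless DATA_EXTENSIONS.include?(ext.downcase)

  segments = relative.delete_suffix(ext).split('/')
  (%w[site data] + segments).join('.')
end

puts data_key_for('_data/navigation.yml')
puts data_key_for('_data/settings/theme.yml')
```

Each directory level becomes one more dot-separated segment, which is why settings/theme.yml is reached through site.data.settings.theme.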

11.2 YAML Data Files

Basic structure

# _data/faq.yml
- question: "What is Jekyll?"
  answer: "Jekyll is a static site generator."
  category: "Basics"
  order: 1

- question: "How do I install Jekyll?"
  answer: "Install it with gem install jekyll."
  category: "Installation"
  order: 2

Nested structure

# _data/config.yml
site:
  title: "My Blog"
  description: "Sharing technology and life"
  logo: "/images/logo.png"
  social:
    github: "https://github.com/user"
    twitter: "https://twitter.com/user"
    email: "user@example.com"

features:
  dark_mode: true
  search: true
  comments:
    provider: "disqus"
    shortname: "myblog"

navigation:
  main:
    - title: "Home"
      url: "/"
      icon: "home"
    - title: "Blog"
      url: "/blog/"
      icon: "book"
      children:
        - title: "Archive"
          url: "/archive/"
        - title: "Categories"
          url: "/categories/"
    - title: "About"
      url: "/about/"
      icon: "user"
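
To make the shape of the nested data concrete, here is a minimal Ruby sketch that parses a trimmed copy of the file above (with English labels) and walks it the way a template would. It assumes nothing about Jekyll itself; the CONFIG_YAML constant is only a stand-in for _data/config.yml.

```ruby
require 'yaml'

# Inline, trimmed stand-in for _data/config.yml (assumption: in a real
# site Jekyll reads the file and exposes it as site.data.config).
CONFIG_YAML = <<~CONFIG
  site:
    title: "My Blog"
    social:
      github: "https://github.com/user"
  features:
    dark_mode: true
  navigation:
    main:
      - title: "Home"
        url: "/"
      - title: "Blog"
        url: "/blog/"
        children:
          - title: "Archive"
            url: "/archive/"
CONFIG

config = YAML.safe_load(CONFIG_YAML)

# Nested mappings become nested Hashes, so a dotted Liquid path
# corresponds to a chain of key lookups.
github = config.dig('site', 'social', 'github')

# Sequences become Arrays; only some items carry a "children" key.
with_children = config['navigation']['main'].select { |item| item['children'] }

puts github
puts with_children.map { |item| item['title'] }.inspect
```

Mappings turn into hashes and sequences into arrays, so an optional key such as children is simply absent on items that do not define it.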

Accessing YAML data in templates

<!-- Access simple values -->
{{ site.data.config.site.title }}
{{ site.data.config.site.social.github }}

<!-- Iterate over an array -->
{% for item in site.data.faq %}
  <div class="faq-item">
    <h3>{{ item.question }}</h3>
    <p>{{ item.answer }}</p>
  </div>
{% endfor %}

<!-- Conditional check -->
{% if site.data.config.features.dark_mode %}
  <link rel="stylesheet" href="{{ '/assets/css/dark.css' | relative_url }}">
{% endif %}

<!-- Nested iteration -->
{% for item in site.data.config.navigation.main %}
  <li>
    <a href="{{ item.url }}">{{ item.title }}</a>
    {% if item.children %}
      <ul>
        {% for child in item.children %}
          <li><a href="{{ child.url }}">{{ child.title }}</a></li>
        {% endfor %}
      </ul>
    {% endif %}
  </li>
{% endfor %}

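11.3 JSON Data Files

Basic structure

_data/team.json (JSON has no comment syntax, so the file name is given here rather than inside the file):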
[
  {
    "id": "alice",
    "name": "Alice Wang",
    "role": "Lead Developer",
    "department": "Engineering",
    "skills": ["Ruby", "JavaScript", "Go"],
    "projects": ["Project A", "Project B"],
    "contact": {
      "email": "alice@example.com",
      "github": "alice-dev"
    },
    "joined": "2020-03-15",
    "active": true
  },
  {
    "id": "bob",
    "name": "Bob Chen",
    "role": "Designer",
    "department": "Design",
    "skills": ["Figma", "CSS", "Illustration"],
    "projects": ["Project A", "Project C"],
    "contact": {
      "email": "bob@example.com",
      "dribbble": "bob-design"
    },
    "joined": "2021-06-01",
    "active": true
  }
]

Using JSON data in templates

<!-- Iterate over the JSON array -->
{% for member in site.data.team %}
  {% if member.active %}
    <div class="team-card" data-department="{{ member.department }}">
      <h3>{{ member.name }}</h3>
      <p class="role">{{ member.role }}</p>
      <div class="skills">
        {% for skill in member.skills %}
          <span class="badge">{{ skill }}</span>
        {% endfor %}
      </div>
      <p class="joined">Joined: {{ member.joined }}</p>
    </div>
  {% endif %}
{% endfor %}

<!-- Filter by a specific condition -->
{% assign developers = site.data.team | where: "department", "Engineering" %}
{% for dev in developers %}
  {{ dev.name }}
{% endfor %}

<!-- Filter by skill -->
{% for member in site.data.team %}
  {% if member.skills contains "Ruby" %}
    <p>{{ member.name }} is skilled in Ruby</p>
  {% endif %}
{% endfor %}
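
For readers who think in Ruby, the two filters above correspond roughly to the selections below. This is only an approximation of what the Liquid where filter and the contains operator do, run against an inline, trimmed copy of the team data.

```ruby
require 'json'

# Trimmed stand-in for _data/team.json.
TEAM_JSON = <<~TEAM
  [
    {"name": "Alice Wang", "department": "Engineering",
     "skills": ["Ruby", "JavaScript", "Go"], "active": true},
    {"name": "Bob Chen", "department": "Design",
     "skills": ["Figma", "CSS", "Illustration"], "active": true}
  ]
TEAM

team = JSON.parse(TEAM_JSON)

# Roughly what `where: "department", "Engineering"` selects.
developers = team.select { |member| member['department'] == 'Engineering' }

# Roughly what `member.skills contains "Ruby"` tests when skills is an array.
rubyists = team.select { |member| member['skills'].include?('Ruby') }

puts developers.map { |member| member['name'] }.inspect
puts rubyists.map { |member| member['name'] }.inspect
```

The where filter only matches on a top-level property, which is why filtering on an array field such as skills falls back to a loop with contains.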

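11.4 CSV Data Files

CSV file format

_data/products.csv (CSV has no comment syntax; a leading # line would be read as the header row, so the file name is given here rather than inside the file):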
id,name,category,price,currency,in_stock,description
1,Jekyll Book,books,49.99,USD,true,A comprehensive Jekyll guide
2,Ruby T-Shirt,apparel,29.99,USD,true,Comfortable cotton t-shirt
3,Sticker Pack,accessories,9.99,USD,false,Set of 10 developer stickers
4,Coffee Mug,accessories,14.99,USD,true,12oz ceramic mug with logo
5,Online Course,digital,99.99,USD,true,40-hour video course

Using CSV data in templates

<!-- Product table -->
<table class="products-table">
  <thead>
    <tr>
      <th>Name</th>
      <th>Category</th>
      <th>Price</th>
      <th>Status</th>
    </tr>
  </thead>
  <tbody>
    {% for product in site.data.products %}
      <tr class="{% unless product.in_stock == 'true' %}out-of-stock{% endunless %}">
        <td>{{ product.name }}</td>
        <td>{{ product.category }}</td>
        <td>{{ product.currency }} {{ product.price }}</td>
        <td>
          {% if product.in_stock == 'true' %}
            <span class="badge success">In stock</span>
          {% else %}
            <span class="badge danger">Out of stock</span>
          {% endif %}
        </td>
      </tr>
    {% endfor %}
  </tbody>
</table>

<!-- Group by category -->
{% assign categories = site.data.products | group_by: "category" %}
{% for category in categories %}
  <h2>{{ category.name | capitalize }}</h2>
  <div class="product-grid">
    {% for product in category.items %}
      <div class="product-card">
        <h3>{{ product.name }}</h3>
        <p class="price">{{ product.currency }} {{ product.price }}</p>
        <p>{{ product.description }}</p>
      </div>
    {% endfor %}
  </div>
{% endfor %}

<!-- Filter in-stock products -->
{% assign available = site.data.products | where: "in_stock", "true" %}
<p>{{ available.size }} products are currently in stock</p>

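Notes

  • Every CSV value is read as a string, so numeric comparisons need care: "9.99" and "49.99" compare as text unless converted.
  • Booleans arrive as the strings "true" / "false", which is why the templates above compare against 'true'.
  • The first row of a CSV file must be the header row with the column names.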
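
The string-typing caveat is easy to see in a short Ruby sketch. It parses a few rows of the products file with Ruby's standard csv library and no converters, which is an assumption about how the values reach the template rather than a statement about Jekyll internals.

```ruby
require 'csv'

# Three rows taken from _data/products.csv above.
PRODUCTS_CSV = <<~ROWS
  id,name,category,price,currency,in_stock,description
  1,Jekyll Book,books,49.99,USD,true,A comprehensive Jekyll guide
  3,Sticker Pack,accessories,9.99,USD,false,Set of 10 developer stickers
  5,Online Course,digital,99.99,USD,true,40-hour video course
ROWS

rows = CSV.parse(PRODUCTS_CSV, headers: true).map(&:to_h)

# Without converters every cell is a String, prices and booleans included.
puts rows.first['price'].class
puts rows.first['in_stock'].class

# Sorting the raw strings is lexicographic, not numeric.
by_string = rows.sort_by { |row| row['price'] }.map { |row| row['name'] }
by_number = rows.sort_by { |row| row['price'].to_f }.map { |row| row['name'] }

# Booleans must be compared against the string "true".
available = rows.select { |row| row['in_stock'] == 'true' }

puts by_string.inspect
puts by_number.inspect
puts available.size
```

The two orderings differ because "49.99" sorts before "9.99" as text. In a template, a filter such as plus: 0 is a commonly used way to coerce the string to a number before comparing.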

11.5 Organizing Data Across Multiple Files

Directories by function

_data/
├── navigation/
│   ├── main.yml
│   ├── footer.yml
│   └── sidebar.yml
├── content/
│   ├── features.yml
│   ├── testimonials.yml
│   ├── pricing.yml
│   └── faq.yml
├── i18n/
│   ├── en.yml
│   └── zh.yml
└── config/
    ├── seo.yml
    ├── analytics.yml
    └── social.yml

<!-- Access nested data -->
{% for item in site.data.navigation.main %}
  {{ item.title }}
{% endfor %}

{% for feature in site.data.content.features %}
  {{ feature.title }}
{% endfor %}

{% assign lang = "zh" %}
{{ site.data.i18n[lang].welcome }}

Internationalization data files

# _data/i18n/en.yml
welcome: "Welcome"
read_more: "Read More"
posted_by: "Posted by"
categories: "Categories"
tags: "Tags"
search_placeholder: "Search..."
404_title: "Page Not Found"
404_message: "The page you're looking for doesn't exist."

# _data/i18n/zh.yml
welcome: "欢迎"
read_more: "阅读更多"
posted_by: "作者"
categories: "分类"
tags: "标签"
search_placeholder: "搜索..."
404_title: "页面未找到"
404_message: "您访问的页面不存在。"

<!-- Usage -->
{% assign lang = page.lang | default: site.default_lang | default: "en" %}
{% assign t = site.data.i18n[lang] %}

<h1>{{ t.welcome }}</h1>
<a href="/blog/">{{ t.read_more }}</a>
<input type="text" placeholder="{{ t.search_placeholder }}">
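
The template above falls back between languages but not between keys: if a key is missing from the chosen language file, the output is empty. The sketch below shows one way a per-key fallback could work. The translate helper is hypothetical and the two tables are trimmed stand-ins for the files above, with read_more deliberately left out of the Chinese table.

```ruby
require 'yaml'

# Trimmed stand-ins for _data/i18n/en.yml and _data/i18n/zh.yml.
EN_YAML = <<~EN
  welcome: "Welcome"
  read_more: "Read More"
EN

ZH_YAML = <<~ZH
  welcome: "欢迎"
ZH

I18N = {
  'en' => YAML.safe_load(EN_YAML),
  'zh' => YAML.safe_load(ZH_YAML)
}.freeze

# Hypothetical helper: try the requested language, then the default
# language, and finally return the key itself so gaps stay visible.
def translate(key, lang, default_lang = 'en')
  table = I18N[lang] || I18N[default_lang]
  table[key] || I18N[default_lang][key] || key
end

puts translate('welcome', 'zh')
puts translate('read_more', 'zh')
puts translate('welcome', 'fr')
```

Returning the key as a last resort is a design choice: a visible "read_more" on the page is easier to spot during review than an empty link.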

11.6 Generating Content Dynamically from Data

Generating content with a plugin

# _plugins/generators/data_page_generator.rb

module Jekyll
  class DataPageGenerator < Generator
    safe true

    def generate(site)
      # Generate a standalone page for each entry in the data file
      if site.data.key?('team')
        site.data['team'].each do |member|
          site.pages << DataPage.new(site, 'team', member)
        end
      end
    end
  end

  class DataPage < Page
    def initialize(site, type, data)
      @site = site
      @base = site.source
      @dir  = File.join(type, data['id'])
      @name = 'index.html'

      process(@name)
      read_yaml(File.join(site.source, '_layouts'), "#{type}.html")

      self.data['title'] = data['name']
      self.data['member'] = data
      self.data['permalink'] = "/#{type}/#{data['id']}/"
    end
  end
end
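
Because the generator builds each directory and permalink from data['id'], an entry without an id would fail at File.join. The following plain-Ruby sketch previews what the generator would produce and skips such entries. planned_pages is a hypothetical helper for illustration and does not touch Jekyll.

```ruby
# Hypothetical helper (not a Jekyll API): list the directory, permalink
# and title the generator above would assign to each entry.
def planned_pages(type, entries)
  entries.filter_map do |entry|
    id = entry['id']
    # Skip entries without a usable id; File.join(type, nil) would raise.
    next if id.nil? || id.to_s.strip.empty?

    {
      dir: File.join(type, id),
      permalink: "/#{type}/#{id}/",
      title: entry['name']
    }
  end
end

team = [
  { 'id' => 'alice', 'name' => 'Alice Wang' },
  { 'id' => 'bob', 'name' => 'Bob Chen' },
  { 'name' => 'Entry without an id' }
]

planned_pages('team', team).each do |page|
  puts "#{page[:permalink]} -> #{page[:title]}"
end
```

A similar guard inside the generator's each loop would keep one malformed entry from aborting the whole build.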

Fetching data from an API

# _plugins/fetch_api_data.rb

require 'net/http'
require 'json'
require 'uri'

module Jekyll
  class FetchAPIData < Generator
    safe true
    priority :high

    def generate(site)
      # Fetch GitHub repository information at build time
      repos = fetch_github_repos('username')
      site.data['github_repos'] = repos if repos
    end

    private

    def fetch_github_repos(username)
      uri = URI("https://api.github.com/users/#{username}/repos?sort=updated&per_page=10")
      response = Net::HTTP.get_response(uri)

      return nil unless response.is_a?(Net::HTTPSuccess)

      JSON.parse(response.body).map do |repo|
        {
          'name' => repo['name'],
          'description' => repo['description'],
          'url' => repo['html_url'],
          'stars' => repo['stargazers_count'],
          'language' => repo['language'],
          'updated' => repo['updated_at']
        }
      end
    rescue StandardError => e
      Jekyll.logger.warn "API:", "Failed to fetch GitHub data: #{e.message}"
      nil
    end
  end
end

<!-- Use the API data -->
{% if site.data.github_repos %}
  <h2>GitHub Projects</h2>
  {% for repo in site.data.github_repos %}
    <div class="repo-card">
      <h3><a href="{{ repo.url }}">{{ repo.name }}</a></h3>
      <p>{{ repo.description }}</p>
      <span>⭐ {{ repo.stars }}</span>
      <span>{{ repo.language }}</span>
    </div>
  {% endfor %}
{% endif %}
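
The field mapping inside fetch_github_repos can be exercised without a network call by pulling it into a pure function. The sketch below does that; normalize_repos is a hypothetical name, and SAMPLE_BODY is a made-up payload that only mimics the fields the plugin reads.

```ruby
require 'json'

# Hypothetical extraction of the mapping step from the plugin above,
# so it can be checked offline.
def normalize_repos(body)
  JSON.parse(body).map do |repo|
    {
      'name' => repo['name'],
      'description' => repo['description'],
      'url' => repo['html_url'],
      'stars' => repo['stargazers_count'],
      'language' => repo['language'],
      'updated' => repo['updated_at']
    }
  end
end

# Made-up sample response; not real API output.
SAMPLE_BODY = <<~BODY
  [
    {"name": "demo", "description": null,
     "html_url": "https://example.com/demo",
     "stargazers_count": 3, "language": "Ruby",
     "updated_at": "2024-01-01T00:00:00Z", "fork": false}
  ]
BODY

repos = normalize_repos(SAMPLE_BODY)
puts repos.first['name']
puts repos.first['stars']
```

Fields the template does not use are dropped, and a null description becomes nil, which renders as an empty paragraph in the template above.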

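11.7 Data File Best Practices

Handling large data files

# Not recommended: one huge YAML file
# _data/all_products.yml (5000+ entries)

# Recommended: split by category
_data/
├── products/
│   ├── electronics.yml    # 100 entries
│   ├── books.yml          # 200 entries
│   └── apparel.yml        # 150 entries

<!-- Merge multiple data sources -->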
{% assign all_products = "" | split: "" %}
{% for product in site.data.products.electronics %}
  {% assign all_products = all_products | push: product %}
{% endfor %}
{% for product in site.data.products.books %}
  {% assign all_products = all_products | push: product %}
{% endfor %}
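
In data terms, the merge above is a flattening of one list per file into a single list. A minimal Ruby sketch of the same idea follows; products_by_file is a stand-in for what site.data.products would hold, and the product names are placeholders.

```ruby
# Stand-in for site.data['products']: one key per file in _data/products/,
# each holding that file's list of entries (placeholder data).
products_by_file = {
  'electronics' => [{ 'name' => 'Keyboard' }],
  'books' => [{ 'name' => 'Jekyll Book' }, { 'name' => 'Ruby Guide' }],
  'apparel' => [{ 'name' => 'Ruby T-Shirt' }]
}

# Flatten all lists into one and remember which file each entry came from.
all_products = products_by_file.flat_map do |file, items|
  items.map { |item| item.merge('source' => file) }
end

puts all_products.size
puts all_products.map { |item| item['source'] }.uniq.inspect
```

Keeping the source file on each entry is optional, but it lets a merged listing still group or label products by category.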

Data validation

# _plugins/hooks/validate_data.rb

Jekyll::Hooks.register :site, :post_read do |site|
  # Check that the required data files exist
  required_files = %w[navigation config]

  required_files.each do |file|
    unless site.data.key?(file)
      Jekyll.logger.warn "Data:", "Missing required data file: _data/#{file}.yml"
    end
  end

  # Check the structure of the data
  if site.data['navigation'] && site.data['navigation']['main']
    site.data['navigation']['main'].each_with_index do |item, index|
      unless item.key?('title') && item.key?('url')
        Jekyll.logger.warn "Data:", "Navigation item #{index} missing title or url"
      end
    end
  end
end
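
The checks in the hook can also be written as a pure function that returns a list of problems, which makes them testable without running a build. This is a sketch under that assumption; navigation_problems is a hypothetical name, and the extra type guards are additions not present in the hook above.

```ruby
# Hypothetical pure version of the checks in the hook above. It takes a
# Hash shaped like site.data and returns human-readable problem strings.
def navigation_problems(data, required_files: %w[navigation config])
  problems = required_files
             .reject { |file| data.key?(file) }
             .map { |file| "Missing required data file: _data/#{file}.yml" }

  # Guard against navigation being a list or main not being a list.
  navigation = data['navigation']
  main = navigation.is_a?(Hash) ? navigation['main'] : nil
  items = main.is_a?(Array) ? main : []

  items.each_with_index do |item, index|
    next if item.is_a?(Hash) && item.key?('title') && item.key?('url')

    problems << "Navigation item #{index} missing title or url"
  end

  problems
end

sample = {
  'navigation' => {
    'main' => [
      { 'title' => 'Home', 'url' => '/' },
      { 'title' => 'Blog' }
    ]
  }
}

puts navigation_problems(sample)
```

Inside the hook, each returned string could be passed to Jekyll.logger.warn, keeping the reporting separate from the checking.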

11.8 Use Case: Pricing Page

# _data/pricing.yml
plans:
  - id: free
    name: "Free"
    price: 0
    period: "forever"
    features:
      - "5 projects"
      - "1 GB storage"
      - "Community support"
      - "Basic analytics"
    cta: "Start for free"
    cta_url: "/signup/"
    popular: false

  - id: pro
    name: "Pro"
    price: 49
    period: "/month"
    features:
      - "Unlimited projects"
      - "100 GB storage"
      - "Priority support"
      - "Advanced analytics"
      - "Custom domain"
    cta: "Upgrade to Pro"
    cta_url: "/signup/pro/"
    popular: true

  - id: enterprise
    name: "Enterprise"
    price: 199
    period: "/month"
    features:
      - "Everything in Pro"
      - "1 TB storage"
      - "Dedicated support"
      - "SLA guarantee"
      - "SSO integration"
      - "Custom contract"
    cta: "Contact us"
    cta_url: "/contact/"
    popular: false

<!-- pricing.html -->
<section class="pricing">
  {% for plan in site.data.pricing.plans %}
    <div class="plan-card {% if plan.popular %}popular{% endif %}">
      {% if plan.popular %}
        <div class="badge">Most popular</div>
      {% endif %}
      <h3>{{ plan.name }}</h3>
      <div class="price">
        <span class="amount">¥{{ plan.price }}</span>
        <span class="period">{{ plan.period }}</span>
      </div>
      <ul class="features">
        {% for feature in plan.features %}
          <li>✅ {{ feature }}</li>
        {% endfor %}
      </ul>
      <a href="{{ plan.cta_url | relative_url }}" class="cta-button">
        {{ plan.cta }}
      </a>
    </div>
  {% endfor %}
</section>


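Chapter summary

Topic           Notes
Data files      YAML, JSON, CSV, and TSV files in the _data/ directory
Access          site.data.filename, site.data.dir.filename
YAML            Recommended for configuration and structured data
JSON            Suited to API data and complex nested structures
CSV             Suited to tabular data and bulk imports
Dynamic data    Fetched from an API at build time through a plugin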
