
v5.2.0: Add AI translation and other features; fix several bugs

sansan 4 months ago
parent
commit 0246c8d09a
48 files changed, 4227 additions, 4529 deletions
  1. .github/ISSUE_TEMPLATE/01-bug-report.yml (+22 -0)
  2. .github/workflows/crawler.yml (+1 -1)
  3. README-EN.md (+261 -595)
  4. README-MCP-FAQ-EN.md (+1 -1)
  5. README-MCP-FAQ.md (+1 -1)
  6. README.md (+296 -594)
  7. _image/ai.jpg (BIN)
  8. _image/ai4.png (BIN)
  9. config/ai_analysis_prompt.txt (+66 -50)
  10. config/ai_translation_prompt.txt (+27 -0)
  11. config/config.yaml (+178 -67)
  12. config/frequency_words.txt (+1 -0)
  13. docker/.env (+2 -15)
  14. docker/docker-compose-build.yml (+1 -6)
  15. docker/docker-compose.yml (+1 -6)
  16. docker/manage.py (+0 -5)
  17. mcp_server/__init__.py (+1 -1)
  18. mcp_server/server.py (+70 -326)
  19. mcp_server/services/data_service.py (+6 -14)
  20. mcp_server/services/parser_service.py (+1 -1)
  21. mcp_server/tools/config_mgmt.py (+35 -18)
  22. mcp_server/utils/errors.py (+22 -2)
  23. pyproject.toml (+1 -1)
  24. trendradar/__init__.py (+1 -1)
  25. trendradar/__main__.py (+227 -268)
  26. trendradar/ai/__init__.py (+11 -2)
  27. trendradar/ai/analyzer.py (+175 -97)
  28. trendradar/ai/formatter.py (+178 -97)
  29. trendradar/ai/translator.py (+428 -0)
  30. trendradar/context.py (+39 -14)
  31. trendradar/core/__init__.py (+0 -2)
  32. trendradar/core/analyzer.py (+4 -0)
  33. trendradar/core/data.py (+2 -20)
  34. trendradar/core/loader.py (+83 -33)
  35. trendradar/notification/dispatcher.py (+185 -84)
  36. trendradar/notification/push_manager.py (+3 -2)
  37. trendradar/notification/renderer.py (+60 -54)
  38. trendradar/notification/senders.py (+20 -25)
  39. trendradar/notification/splitter.py (+275 -137)
  40. trendradar/report/generator.py (+36 -35)
  41. trendradar/report/html.py (+150 -29)
  42. trendradar/storage/__init__.py (+7 -0)
  43. trendradar/storage/base.py (+5 -0)
  44. trendradar/storage/local.py (+91 -996)
  45. trendradar/storage/remote.py (+114 -927)
  46. trendradar/storage/sqlite_mixin.py (+1137 -0)
  47. version (+1 -1)
  48. version_mcp (+1 -1)

+ 22 - 0
.github/ISSUE_TEMPLATE/01-bug-report.yml

@@ -8,8 +8,30 @@ body:
   - type: markdown
     attributes:
       value: |
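+        ### ⚠️ Read before submitting
+        **Please make sure you are using the latest version of TrendRadar.**
+        Many issues may already be fixed in the latest code. If you are on an older version, I cannot handle the report; please update first and try again.
+
         **A brief description + key screenshots** is the most effective way to communicate.
 
+  - type: input
+    id: version
+    attributes:
+      label: 📦 TrendRadar version
+      description: Required. (e.g. v5.2.0 or a git commit id)
+      placeholder: v5.2.0 or commit hash
+    validations:
+      required: true
+
+  - type: input
+    id: mcp-version
+    attributes:
+      label: 🔌 MCP Server version (optional)
+      description: If you use TrendRadar through MCP, enter the MCP Server version.
+      placeholder: v3.1.6 (leave blank if you do not use MCP)
+    validations:
+      required: false
+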
+
   - type: dropdown
     id: bug-category
     attributes:

+ 1 - 1
.github/workflows/crawler.yml

@@ -154,7 +154,7 @@ jobs:
           # Generic webhook configuration
           GENERIC_WEBHOOK_URL: ${{ secrets.GENERIC_WEBHOOK_URL }}
           GENERIC_WEBHOOK_TEMPLATE: ${{ secrets.GENERIC_WEBHOOK_TEMPLATE }}
-          # AI analysis configuration
+          # AI configuration (ai_analysis and ai_translation share the model configuration)
           AI_ANALYSIS_ENABLED: ${{ secrets.AI_ANALYSIS_ENABLED }}
           AI_API_KEY: ${{ secrets.AI_API_KEY }}
           AI_PROVIDER: ${{ secrets.AI_PROVIDER }}

File diff suppressed because it is too large
+ 261 - 595
README-EN.md


+ 1 - 1
README-MCP-FAQ-EN.md

@@ -6,7 +6,7 @@
 
 # TrendRadar MCP Tool Usage Q&A
 
-> AI Query Guide - How to Use News Trend Analysis Tools Through Natural Conversation (v3.1.5)
+> AI Query Guide - How to Use News Trend Analysis Tools Through Natural Conversation (v3.1.6)
 
 ---
 

+ 1 - 1
README-MCP-FAQ.md

@@ -6,7 +6,7 @@
 
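 # TrendRadar MCP Tool Usage Q&A
 
-> AI Query Guide - How to use the trending-news analysis tools through natural conversation (v3.1.5)
+> AI Query Guide - How to use the trending-news analysis tools through natural conversation (v3.1.6)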
 
 ---
 

File diff suppressed because it is too large
+ 296 - 594
README.md


BIN
_image/ai.jpg


BIN
_image/ai4.png


+ 66 - 50
config/ai_analysis_prompt.txt

@@ -3,96 +3,112 @@
 # ═══════════════════════════════════════════════════════════════
 #
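 # This file defines the prompt template used when the AI analyzes trending news
-# You can customize the analysis angle and output format as needed
 #
 # Available variables (substituted at analysis time):
-#   {report_mode}     - current report mode (daily/current/incremental)
+#   {language}        - output language (set by ai_analysis.language)
+#   {report_mode}     - current report mode
 #   {report_type}     - report type description
 #   {current_time}    - current time
 #   {news_count}      - number of hotlist news items
 #   {rss_count}       - number of RSS news items
 #   {keywords}        - list of matched keywords
 #   {platforms}       - list of source platforms
+#   {news_content}    - hotlist news content
+#   {rss_content}     - RSS feed content (requires ai_analysis.include_rss)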
 #
 # ═══════════════════════════════════════════════════════════════
 
 [system]
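 You are a professional news analyst and trend observer. Your task is to analyze trending news data and provide valuable insights.
 
-Analysis principles:
-1. Objective and neutral - analyze based on facts and avoid subjective speculation
-2. Deep insight - uncover the trends and patterns behind surface phenomena
-3. Practical value - provide actionable insights and suggestions
-4. Concise and clear - express the core points in refined language
+## Core principles
 
-## Data source notes
+1. Get to the point: skip filler and state directly "what it is", "how hot it is", and "what to watch out for".
+2. Close the loop: connect "phenomenon", "cause", and "recommendation", so readers learn what action the information calls for.
+3. Take a clear stance: state explicitly whether it is a "bubble" or an "opportunity", a "controversy" or a "consensus".
+4. Plain language: use words the general public understands (e.g. "overheated", "cooling down", "reversal", "going mainstream") and avoid coining complex concepts.
+5. Dialectical thinking: apply the perspective of contradiction theory to identify the "principal contradiction" and "secondary contradictions" behind a hot topic, and grasp the key internal causes driving it.
 
-This system fetches news data from multiple hotlist platforms (such as Weibo, Zhihu, and Toutiao) and from RSS feeds.
-The data is filtered by the keywords defined in frequency_words.txt; only matching news is kept.
+## In-depth guide to interpreting data fields
 
-## Data field descriptions
+To make accurate judgments, make full use of the following data dimensions:
 
-### Hotlist news fields
-Each hotlist news item has the following dimensions:
-- Source: the hotlist platform the item appears on (e.g. Weibo Hot Search, Zhihu Hot List, Toutiao)
-- Title: the news headline
-- Rank: the item's rank range on the source platform's hotlist, formatted as "highest rank-lowest rank" (e.g. "1" means it held steady at No. 1; "3-8" means it peaked at No. 3 and fell as low as No. 8)
-- Time: the period the item appeared on the hotlist, formatted as "first seen~last seen" (e.g. "09:30~12:45" means it first appeared at 9:30 and was last seen at 12:45)
-- Occurrences: the number of times the item was captured during the monitoring period (more occurrences mean a longer stay on the hotlist and more sustained popularity)
+### 1. Basic dimensions
+- Rank: "1" is the top spot; the smaller the number, the hotter. "3-8" means the rank fluctuated between No. 3 and No. 8.
+- Occurrences: the more occurrences, the longer the item stayed on the hotlist and the more sustained its popularity.
+- Time range: e.g. "09:30~12:45"; the wider the span, the stronger the topic's staying power.
 
-### RSS news fields
-Each RSS news item contains
-- Source: RSS feed name
-- Title: article title
-- Published: the article's original publication time
+### 2. Quantitative trajectory analysis (important)
+When the data includes trajectory information (e.g. `1(09:30)→0(10:00)→2(10:30)`), watch for:
+- Surge/breakout: the rank rises sharply in a short time (e.g. from No. 20 to No. 3), which often signals a major breaking event.
+- Zombie trending topic: the rank keeps sliding with no rebound (e.g. 10→15→20), indicating fading interest.
+- Return/reversal: the item drops off the list (shown as 0) and then returns to a high position, which usually means new revelations or a plot twist.
 
-## Analysis points
+### 3. Cross-platform characteristics (tiering criteria)
+- Network-wide dominance: on the list on 5 or more platforms at once. A truly "national-level" topic with blanket coverage.
+- Breaking out: on the list on 3-4 platforms at once. The topic has broken through a single community and is spreading outward.
+- Niche hot topic: hot on only 1-2 platforms. A frenzy within a specific group (e.g. only in tech communities or on entertainment charts).
 
-Using these data dimensions, you can analyze:
-1. Heat intensity: the higher the rank (smaller number) and the more occurrences, the hotter the topic
-2. Duration: a wide time span and many occurrences indicate that the topic keeps developing
-3. Rank volatility: a wide rank range (e.g. "1-20") indicates unstable popularity; a narrow range (e.g. "2-4") indicates stable popularity
-4. Cross-platform heat: the same topic appearing on multiple platforms indicates broader influence
-5. Emerging trends: topics rising quickly in rank or appearing for the first time
-6. Timeliness: the RSS publication time indicates how fresh the information is
+## Analysis sections (5 core sections)
+
+1. Core Trends & Momentum
+   - Combines: "trend overview", "heat trajectory", "cross-platform correlation".
+   - Task: characterize the hottest current topics directly. Combine rank and cross-platform data to judge whether a topic is "flooding the whole network" or "a niche discussion".
+   - Style: avoid simply listing data; summarize the situation instead. For example: "A topic tops the charts on multiple platforms, has stayed hot for over 6 hours, and shows an explosive surge."
+
+2. Sentiment & Controversy
+   - Task: use contradiction analysis to uncover the core of public sentiment. Identify the "fundamental opposition" (principal contradiction) and "trends of transformation" in public opinion, and analyze the contest between mainstream and non-mainstream views.
+   - Focus: are there opposing views? (e.g. tech optimists vs privacy worriers). Is sentiment positive (anticipation, excitement), negative (anger, worry), or mixed (mockery, skepticism)?
+
+3. Signals (anomalies and weak signals)
+   - Task: catch anomalies through "trajectory" and "rank changes".
+   - Watch for: breaking events with sudden rank jumps, fresh topics appearing for the first time, or counterintuitive heat fluctuations (e.g. a sudden spike late at night).
+
+4. RSS Insights
+   - Task: analyze the professional content in RSS feeds and extract industry developments and in-depth information.
+   - Watch for: cutting-edge views from tech blogs, exclusive reports from industry media, and links to or differences from hotlist topics.
+   - Style: highlight the "information gain" of RSS content, i.e. unique perspectives or in-depth analysis that RSS has but the hotlist lacks.
+
+5. Outlook & Strategy
+   - Combines: "potential impact" and "recommendations".
+   - Task: close the loop. Based on the analysis above, predict what comes next (e.g. "may draw regulatory attention") and give concrete recommendations.
+   - Audience: recommendations may target investors, brands, or the general public, and should be actionable.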
 
 [user]
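 Please analyze the following trending news data:
 
 ## Data overview
-- Report mode: {report_mode}
-- Report type: {report_type}
+- Report mode: {report_mode} ({report_type})
 - Analysis time: {current_time}
-- Hotlist news: {news_count} items
-- RSS news: {rss_count} items
-- Data sources: {platforms}
+- Data volume: {news_count} hotlist items + {rss_count} RSS items
+- Sources: {platforms}
 
 ## Matched keywords
 {keywords}
 
-## News content
+## Hotlist news
 {news_content}
 
+## RSS feeds
+{rss_content}
+
 ---
 
-Based on the data above, perform a multi-dimensional analysis and return the result in JSON format:
+Based on the data above, write an analysis report and return the result in JSON format: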
 
 ```json
 {
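-  "summary": "Core hot topic overview (summarize the most important current events in concise language, avoid mentioning specific rank data, within 80 characters)",
-  "keyword_analysis": "Heat trajectory analysis (combine rank fluctuation, occurrence count, and time span to analyze the explosiveness and persistence of core topics, within 80 characters)",
-  "sentiment": "Sentiment analysis (extremely important: analyze in depth the public's emotional response to core topics, e.g. positive, negative, worried, neutral, or controversial, and briefly explain why, within 80 characters)",
-  "cross_platform": "Cross-platform analysis (analyze how far topics trend simultaneously across platforms and how their influence differs, within 60 characters)",
-  "impact": "Potential impact assessment (assess the topic's impact on public opinion, industry developments, or public decisions, within 60 characters)",
-  "signals": "Anomaly and weak-signal detection (watch for sudden rank jumps, first appearances, or counterintuitive fluctuations, within 60 characters)",
-  "conclusion": "Conclusion and recommendations (give 1-2 actionable recommendations of reference value, within 40 characters)"
+  "core_trends": "Core Trends & Momentum (within 200 characters). The language should be as plain as 'everyday talk' yet as precise as a 'scalpel'. No academic vocabulary. Split into paragraphs strictly in the following format (mind the line breaks):\n(One opening sentence that gets straight to the essence)\n\n【Macro theme】:\n(Summarize the big picture in plain words, e.g. foreign giants are busy building infrastructure while the domestic market hypes applications...)\n\n【Micro areas】:\n1. (Sub-point 1): (description)\n2. (Sub-point 2): (description)",
+  "sentiment_controversy": "Sentiment & Controversy (within 100 characters). First characterize the 【overall】 tone as positive or negative, then look at what the 【local】 disputes are. Format:\n【Overall assessment】:\n(e.g. the whole internet is criticizing it, but some people are making money from the traffic...)\n\n【Points of contention】:\n1. (Point 1): ...\n2. (Point 2): ...",
+  "signals": "Signals (within 100 characters). List by signal type:\n1. Surge signal: ...\n2. Anomaly signal: ...\n3. Weak signal: ...",
+  "rss_insights": "RSS Insights (within 100 characters; write 'No RSS data' when there is no RSS data). Highlight the information gain from RSS:\n【Exclusive perspective】:\n(Unique views or in-depth analysis that RSS has but the hotlist lacks)\n\n【Industry developments】:\n(Cutting-edge information from tech blogs and industry media)",
+  "outlook_strategy": "Outlook & Strategy. Give recommendations by audience:\n1. Investors: ...\n2. Brands: ...\n3. Public: ..."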
 }
 ```
 
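 Requirements:
 - Must return valid JSON
-- The analysis should draw on data dimensions such as rank, occurrence count, and time span
-- Sentiment analysis is the priority; make sure you capture the direction of public opinion accurately
-- Fill in every field; if nothing stands out, write "No notable features"
-- Use Chinese
-- Keep it concise and avoid repeating content across fields
+- Write the output in {language}, in concise and professional language
+- Make sure the 5 sections do not overlap and carry no redundant information
+- If a section has nothing notable, write "No significant anomalies"
+- Do not use Markdown formatting (such as **bold**); use plain text only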
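
The five-field JSON contract that the new prompt asks for can be checked on the consuming side. The following is a minimal sketch, assuming the model reply arrives as a string that may or may not be wrapped in a Markdown fence. `parse_analysis` and `EXPECTED_FIELDS` are hypothetical names for illustration; the actual handling in `trendradar/ai/analyzer.py` is not shown in this excerpt.

```python
import json

# Field names copied from the JSON template in the new prompt.
EXPECTED_FIELDS = (
    "core_trends",
    "sentiment_controversy",
    "signals",
    "rss_insights",
    "outlook_strategy",
)

FENCE = "`" * 3  # Markdown code fence marker, built indirectly for readability


def parse_analysis(raw: str) -> dict:
    """Parse a model reply into the five report sections (hypothetical helper)."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with or without a "json" tag) ...
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # ... and everything from the closing fence onward.
        text = text.rsplit(FENCE, 1)[0]
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    # Missing sections become empty strings so callers can render unconditionally.
    return {field: str(data.get(field, "")) for field in EXPECTED_FIELDS}
```

Returning every expected key, even when the model omits one, keeps downstream rendering free of per-field existence checks.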

+ 27 - 0
config/ai_translation_prompt.txt

@@ -0,0 +1,27 @@
+# ═══════════════════════════════════════════════════════════════
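+#                    TrendRadar AI translation prompt configuration
+# ═══════════════════════════════════════════════════════════════
+#
+# This file defines the prompt template used when the AI translates content
+#
+# Available variables:
+#   {target_language} - target language
+#   {content}         - the text to translate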
+#
+# ═══════════════════════════════════════════════════════════════
+
+[system]
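+You are a professional translation assistant fluent in multiple languages. Your task is to translate news content into the target language while keeping it professional, accurate, and concise.
+
+Requirements:
+1. Convey the original meaning accurately; do not omit key information.
+2. Keep news headlines engaging, but do not resort to clickbait.
+3. For proper nouns (names of people, places, organizations), use the common translation if one exists; otherwise keep the original or note it in parentheses.
+4. The output format must strictly follow the requirements; do not output any extra explanatory text.
+
+[user]
+Please translate the following content into {target_language}:
+
+{content}
+
+Please output the translation directly.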
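
Both prompt files share the same layout: `#` comment lines at the top, then a `[system]` and a `[user]` section with `{variable}` placeholders. The following is a rough sketch of how such a file could be split and filled. The helper names `split_prompt` and `fill` are assumptions for illustration, not functions from `trendradar/ai/translator.py`. Plain `str.replace` is used instead of `str.format` because the analysis prompt also contains a literal JSON object whose braces `str.format` would reject.

```python
import re

# A section header is a line consisting only of [system] or [user].
SECTION_RE = re.compile(r"^\[(system|user)\]\s*$")


def split_prompt(text: str) -> dict:
    """Split a prompt file into its sections; lines before the first header are ignored."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


def fill(template: str, variables: dict) -> str:
    """Substitute only the named placeholders, leaving any other braces untouched."""
    for name, value in variables.items():
        template = template.replace("{" + name + "}", str(value))
    return template
```

A call such as `fill(split_prompt(text)["user"], {"target_language": "English", "content": title})` would then yield the user message for one translation request.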

+ 178 - 67
config/config.yaml

@@ -20,32 +20,37 @@ app:
 # ===============================================================
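 # 2. Data sources - hotlist platforms
 #
-# id: unique platform identifier (do not modify)
-# name: display name (customizable; changing it does not affect operation)
+# enabled: whether to enable hotlist fetching (master switch)
+# sources: platform list
+#   - id: unique platform identifier (do not modify)
+#   - name: display name (customizable; changing it does not affect operation)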
 # ===============================================================
 platforms:
-  - id: "toutiao"
-    name: "今日头条"
-  - id: "baidu"
-    name: "百度热搜"
-  - id: "wallstreetcn-hot"
-    name: "华尔街见闻"
-  - id: "thepaper"
-    name: "澎湃新闻"
-  - id: "bilibili-hot-search"
-    name: "bilibili 热搜"
-  - id: "cls-hot"
-    name: "财联社热门"
-  - id: "ifeng"
-    name: "凤凰网"
-  - id: "tieba"
-    name: "贴吧"
-  - id: "weibo"
-    name: "微博"
-  - id: "douyin"
-    name: "抖音"
-  - id: "zhihu"
-    name: "知乎"
+  enabled: true                         # whether to enable hotlist platform fetching
+  sources:
+    - id: "toutiao"
+      name: "今日头条"
+    - id: "baidu"
+      name: "百度热搜"
+    - id: "wallstreetcn-hot"
+      name: "华尔街见闻"
+    - id: "thepaper"
+      name: "澎湃新闻"
+    - id: "bilibili-hot-search"
+      name: "bilibili 热搜"
+    - id: "cls-hot"
+      name: "财联社热门"
+    - id: "ifeng"
+      name: "凤凰网"
+    - id: "tieba"
+      name: "贴吧"
+    - id: "weibo"
+      name: "微博"
+    - id: "douyin"
+      name: "抖音"
+    - id: "zhihu"
+      name: "知乎"
+
 
 
 # ===============================================================
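
This hunk changes `platforms` from a bare list into a mapping with `enabled` and `sources`. Whether the project's loader still accepts the old layout is not visible in this excerpt. As a sketch under that uncertainty, a reader migrating a custom config could normalize both shapes with a hypothetical helper like this:

```python
def normalize_platforms(section):
    """Return (enabled, sources) for either layout of the `platforms` key.

    Hypothetical helper: accepts the pre-5.2.0 bare list as well as the
    new {"enabled": ..., "sources": [...]} mapping shown in the diff.
    """
    if isinstance(section, list):
        # Old layout: a bare list of {id, name} entries with no master switch.
        return True, list(section)
    if isinstance(section, dict):
        enabled = bool(section.get("enabled", True))
        return enabled, list(section.get("sources") or [])
    # Missing or malformed section: treat as disabled with no sources.
    return False, []
```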
@@ -71,6 +76,7 @@ rss:
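   #    - Only fresh articles are pushed to notification channels
   freshness_filter:
     enabled: true                     # whether to enable freshness filtering (enabled by default)
+
     max_age_days: 3                   # maximum article age (days)
                                       # - positive integer: push only articles from the last N days
                                       # - 0: disable filtering and push all articles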
@@ -123,17 +129,72 @@ rss:
 # ===============================================================
 report:
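   mode: "current"                     # options: daily | current | incremental
-  display_mode: "keyword"             # options: keyword | platform
+  display_mode: "keyword"             # grouping dimension: keyword | platform
                                       # keyword: group by keyword (default)
                                       # platform: group by platform/source
+
+  # Sort order of keyword groups (only effective when display_mode: keyword)
+  # true: follow the order defined in frequency_words.txt
+  # false: sort by number of matched hot items (most items first)
+  sort_by_position_first: false
+
   rank_threshold: 5                   # rank highlight threshold
-  sort_by_position_first: false       # true=sort by config position, false=sort by number of hot items
+
   max_news_per_keyword: 0             # maximum items shown per keyword (0=unlimited)
-  reverse_content_order: false        # false=hot keyword statistics first, true=newly added hot news first
 
 
 # ===============================================================
-# 5. Push notifications
+# 5. Push content control
+#
+# Centrally manage which regions appear in push messages and in what order
+# ===============================================================
+display:
+  # 📋 Region display order
+  # Top-to-bottom order in this list = top-to-bottom order in the push message
+  # Want to change the order? Cut and paste whole lines, e.g. move ai_analysis to the front:
+  #   region_order:
+  #     - ai_analysis    ← moved to the first line, so AI analysis appears at the very top
+  #     - new_items
+  #     - hotlist
+  #     - ...
+  # Note: a region is shown only when both conditions hold:
+  #   1. It is in this list
+  #   2. Its switch in regions below is true
+  region_order:
+    - new_items                       # 1️⃣ New hot items region
+    - hotlist                         # 2️⃣ Hotlist region (keyword matches)
+    - rss                             # 3️⃣ RSS feed region
+    - standalone                      # 4️⃣ Standalone display region
+    - ai_analysis                     # 5️⃣ AI analysis region
+
+  # Push region switches
+  # Control whether each region is enabled (used together with region_order)
+  regions:
+    hotlist: true                     # Hotlist region (hot news matching keywords)
+    new_items: true                   # New hot items region (new hotlist items + new RSS items)
+                                      # Note: the 🆕 marker in hot keyword statistics is not affected by this setting
+
+    rss: true                         # RSS feed region
+                                      # When on, RSS is analyzed by keyword and shown in notifications
+                                      # When off, the analysis is skipped, but the standalone display region is unaffected
+
+    standalone: false                 # Standalone display region (full hotlist/RSS, not filtered by keywords)
+    ai_analysis: true                 # AI analysis region
+
+  # 📋 Standalone display region settings (only effective when regions.standalone: true)
+  # Purpose: show the full hotlist/RSS of specified platforms separately, unaffected by keyword filtering
+  # Use cases:
+  #   - You want to see a platform's complete hotlist ranking
+  #   - An RSS source has little content and you want to show all of it rather than only keyword matches
+  # Note: the same news item may appear in both the keyword-match region and the standalone display region
+  standalone:
+    platforms: []                     # hotlist platform ID list (e.g. ["zhihu", "weibo"])
+    rss_feeds: []                     # RSS feed ID list (e.g. ["hacker-news"])
+    max_items: 20                     # maximum items shown per source (0=unlimited)
+
+
+# ===============================================================
+# 6. Push notifications
 #
 # ⚠️ Important security warning ⚠️
 #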
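
The new `display` section states that a region is rendered only if it appears in `region_order` and its switch under `regions` is `true`. The following is a small sketch of that rule, assuming the YAML has already been loaded into a dict. `visible_regions` is a hypothetical name, not a function from `trendradar/notification`.

```python
def visible_regions(display: dict) -> list:
    """Return region names in display order, keeping only those switched on.

    Implements the two conditions from the config comment:
    1. the region is listed in region_order;
    2. its switch under regions is true.
    """
    order = display.get("region_order") or []
    switches = display.get("regions") or {}
    return [name for name in order if switches.get(name) is True]
```

With the defaults from the diff (`standalone: false`, everything else `true`), this yields `new_items`, `hotlist`, `rss`, `ai_analysis` in that order. Moving `ai_analysis` to the top of `region_order` moves it to the front of the result.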
@@ -158,28 +219,17 @@ notification:
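   # 🕐 Push time window control (optional)
   # Purpose: restrict the time range for pushes to avoid disturbance outside working hours
   # Use cases:
-  #   - You only want pushes during weekday daytime (e.g. 09:00-18:00)
-  #   - You want a summary at a fixed time in the evening (e.g. 20:00-22:00)
-  # Note: GitHub Actions run times are unstable; leave a window of at least 2 hours
-  #       For precise scheduled pushes, deploy with Docker on your own server
+  #   • You only want pushes during weekday daytime (e.g. 09:00-18:00)
+  #   • You want a summary at a fixed time in the evening (e.g. 20:00-22:00)
+  # ⚠️ Note for GitHub Actions users:
+  #   Run times are unstable; leave a window of at least 2 hours
+  # 💡 Want precise scheduling? Deploy with Docker on your own server
   push_window:
     enabled: false                    # whether to enable push time window control
     start: "20:00"                    # start time (Beijing time)
     end: "22:00"                      # end time (Beijing time)
     once_per_day: true                # true=push only once within the window, false=push on every run within the window
 
-  # 📋 Standalone display region settings (optional)
-  # Purpose: show the full hotlist/RSS of specified platforms separately, unaffected by keyword filtering
-  # Use cases:
-  #   - You want to see a platform's complete hotlist ranking
-  #   - An RSS source has little content and you want to show all of it rather than only keyword matches
-  # Note: the same news item may appear in both the keyword-match region and the standalone display region
-  standalone_display:
-    enabled: false                    # whether to enable the standalone display region
-    platforms: []                     # hotlist platform ID list (e.g. ["zhihu", "weibo"])
-    rss_feeds: []                     # RSS feed ID list (e.g. ["hacker-news"])
-    max_items: 20                     # maximum items shown per source (0=unlimited)
-
   # Push channel configuration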
   channels:
     feishu:
@@ -222,7 +272,7 @@ notification:
 
 
 # ===============================================================
-# 6. Storage configuration
+# 7. Storage configuration
 # ===============================================================
 storage:
   # Storage backend selection
@@ -247,6 +297,7 @@ storage:
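   # It is recommended to keep sensitive information in GitHub Secrets or environment variables
   remote:
     retention_days: 0                 # retention period in days (0=keep forever)
+
     # S3-compatible configuration (or use environment variables such as S3_ENDPOINT_URL)
     endpoint_url: ""                  # service endpoint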
                                       # Cloudflare R2: https://<account_id>.r2.cloudflarestorage.com
@@ -265,14 +316,12 @@ storage:
 
 
 # ===============================================================
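-# 7. AI analysis feature
+# 8. AI model configuration (shared)
 #
-# Use large AI models for in-depth analysis of pushed content
-# Supports OpenAI, Anthropic, DeepSeek, and other compatible APIs
+# ai_analysis and ai_translation share this model configuration
+# Supports OpenAI, DeepSeek, Google Gemini, and other compatible APIs
 # ===============================================================
-ai_analysis:
-  enabled: true                    # whether to enable AI analysis
-
+ai:
   # AI provider configuration
   # Supported providers:
   #   - deepseek: DeepSeek (default)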
@@ -294,39 +343,102 @@ ai_analysis:
 
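   timeout: 90                       # request timeout (seconds)
 
-  # Push mode (only effective when enabled: true)
-  # - only_analysis: push only the AI analysis result (the "standalone display region" is kept if enabled; the raw hotlist/RSS lists are suppressed)
-  # - both: push both (analysis appended after the original content)
-  # Note: if you do not need AI analysis, set enabled above to false; there is no need to control it with push_mode
-  push_mode: "both"
+  # AI parameter settings
+  temperature: 1.0                  # sampling temperature (0.0-2.0)
+                                    # Note: some models (e.g. gpt-5) may require exactly 1.0 and return an error otherwise
+
+  max_tokens: 5000                  # maximum number of generated tokens
+                                    # Note: if the API does not support this parameter (HTTP 400), set it to 0 to stop sending it
+
+  # ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+  # Extra custom parameters (advanced)
+  # ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+  # Description: passes model-specific advanced generation parameters to the AI.
+  # ⚠️ Warning: if you do not understand what these parameters mean, it is strongly recommended that you do NOT change them and leave them commented out.
+  #          Parameters that do not meet the model's requirements will cause AI analysis to fail and stop working.
+  #
+  # Tip: you are not limited to the examples below; you can add any supported field according to the model's API documentation.
+  #
+  # How: if you are sure you need a change, delete the leading "# " (hash and space) on that line.
+  # Note: if all of these lines still start with a hash, no extra parameters are used (the recommended setup).
+  # -------------------------------------------------------------
+  # extra_params:
+  #   top_p: 1.0            # [general] nucleus sampling: the smaller the value, the more focused the output
+  #   topK: 40              # [Gemini only] limits the number of candidate tokens
+  #   presence_penalty: 0.0 # [OpenAI only] encourages the model to talk about new topics
+  #   # You can also add other fields the model supports here, e.g. stop, logit_bias
+
+
+# ===============================================================
+# 9. AI analysis feature
+#
+# Use large AI models for in-depth analysis of pushed content
+# For model configuration, see the ai section above
+# ===============================================================
+ai_analysis:
+  enabled: true                     # whether to enable AI analysis
 
-  # Analysis options
+  # Output language of the analysis report
+  # Format: natural-language description
+  # Examples: "English", "Korean", "法语"
+  language: "Chinese"
+
+  # Prompt file path (relative to the config directory)
+  prompt_file: "ai_analysis_prompt.txt"
+
+  # Analysis content settings
   max_news_for_analysis: 50         # upper limit on news items included in the analysis (key cost control)
+                                    # The default report mode is the current-list mode (current), so only news currently on the hotlist is analyzed
+                                    # If you want the report to show a more useful full-day trend and you have plenty of tokens
+                                      # you can switch to daily (daily summary mode)
+                                      # and raise max_news_for_analysis to 150 (adjust as you see fit; the number of items analyzed by AI is shown at the top of the push message for reference)
+
                                     # API cost estimate (for reference only)
-                                      # With the default push frequency and model
-                                      # GitHub Action: about 0.1 CNY/day
-                                      # Docker deployment: about 0.2 CNY/day
+                                      # With the default push frequency and model (deepseek)
+                                      # and include_rank_timeline set to false
+                                    # then
+                                      # GitHub Action deployment pushes about 20 times by default (once per hour), about 0.1 CNY/day
+                                      # Docker deployment pushes 48 times by default (once every half hour), about 0.2 CNY/day
+
+  include_rss: false                 # whether to include RSS content in the analysis
+
+  include_rank_timeline: true      # whether to pass the full rank timeline
+                                    # false: use the simplified format (rank range + time range + occurrence count)
+                                    # true: pass the full rank trajectory (e.g. 1(09:30)→2(10:00)→0(11:00))
+                                    # When enabled, the AI can analyze heat trends more precisely, at the cost of an extra 0.5x to 1x token consumption
 
-  include_rss: false                # whether to include RSS content in the analysis
+
+# ===============================================================
+# 10. AI translation feature
+#
+# Translate pushed content into other languages; content produced by ai_analysis is not included
+# For model configuration, see the ai section above
+# ===============================================================
+ai_translation:
+  enabled: false                    # whether to enable translation
+
+  # Translation target language
+  # Format: natural-language description
+  # Examples: "Chinese", "Korean", "法语"
+  language: "English"
 
   # Prompt file path (relative to the config directory)
-  prompt_file: "ai_analysis_prompt.txt"
+  prompt_file: "ai_translation_prompt.txt"
 
 
 # ===============================================================
-# 8. Advanced settings (usually no need to modify)
+# 11. Advanced settings (usually no need to modify)
 # ===============================================================
 advanced:
   # Debug mode
-  debug: false
+  debug: true
 
   # Version check
   version_check_url: "https://raw.githubusercontent.com/sansan0/TrendRadar/refs/heads/master/version"
   mcp_version_check_url: "https://raw.githubusercontent.com/sansan0/TrendRadar/refs/heads/master/version_mcp"
 
-  # Crawler settings
+  # Technical parameters for the hotlist crawler
   crawler:
-    enabled: true                     # whether to enable news crawling
     request_interval: 2000            # request interval (milliseconds)
     use_proxy: false                  # whether to enable the proxy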
     default_proxy: "http://127.0.0.1:10801"
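
The `include_rank_timeline` option earlier in this hunk passes trajectories such as `1(09:30)→2(10:00)→0(11:00)` to the model, where rank `0` means the item dropped off the list. To make the format concrete, here is a sketch of a parser plus a check for the "return to the list" pattern named in the analysis prompt. Both function names are hypothetical, and the grammar is inferred only from the examples in the diff.

```python
import re

# One trajectory point: a rank followed by HH:MM in parentheses, e.g. 1(09:30).
POINT_RE = re.compile(r"(\d+)\((\d{2}:\d{2})\)")


def parse_timeline(timeline: str) -> list:
    """Turn '1(09:30)→0(10:00)' into [(1, '09:30'), (0, '10:00')]."""
    points = []
    for part in timeline.split("→"):
        match = POINT_RE.fullmatch(part.strip())
        if match is None:
            raise ValueError(f"bad timeline point: {part!r}")
        points.append((int(match.group(1)), match.group(2)))
    return points


def returned_to_list(points: list) -> bool:
    """True if the item fell off the list (rank 0) and later reappeared."""
    ranks = [rank for rank, _ in points]
    return any(prev == 0 and cur > 0 for prev, cur in zip(ranks, ranks[1:]))
```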
@@ -337,7 +449,6 @@ advanced:
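     timeout: 15                       # request timeout (seconds)
     use_proxy: false                  # whether to use a proxy
     proxy_url: ""                     # RSS-specific proxy (falls back to crawler.default_proxy if empty)
-    notification_enabled: true        # whether to enable RSS notification pushes
 
   # Sorting weights (used to re-rank hot searches across platforms)
   # Must sum to 1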
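
The shared `ai` section in this file documents two conventions: `max_tokens: 0` means the parameter is not sent at all, and anything under `extra_params` is passed through to the model API. The following is a minimal sketch of how a request payload could honor both, assuming the config is already a dict. `build_generation_params` is a hypothetical name, and the real request construction in `trendradar/ai` is not shown in this excerpt.

```python
def build_generation_params(ai_cfg: dict) -> dict:
    """Assemble generation parameters from the shared `ai` config section."""
    params = {"temperature": ai_cfg.get("temperature", 1.0)}

    max_tokens = ai_cfg.get("max_tokens", 0)
    if max_tokens:
        # 0 (or a missing value) disables the parameter, as the config comment describes.
        params["max_tokens"] = max_tokens

    # extra_params stays commented out by default, so it is usually absent.
    params.update(ai_cfg.get("extra_params") or {})
    return params
```

Merging `extra_params` last lets an advanced user override `temperature` or `max_tokens` there. That ordering is a design choice of this sketch, not something the diff specifies.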

+ 1 - 0
config/frequency_words.txt

@@ -1,6 +1,7 @@
 # ═══════════════════════════════════════════════════════════════
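 #                    TrendRadar frequency words configuration file
 # ═══════════════════════════════════════════════════════════════
+# Any line with a # on the left is explanatory text for reading only
 #
 # This file sets the news keywords you want to follow.
 # The system automatically fetches hotlist news containing these keywords and pushes it to you.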

+ 2 - 15
docker/.env

@@ -1,16 +1,3 @@
-# ============================================
-# Core configuration (environment variables take priority over config.yaml)
-# ============================================
-
-# Whether to enable the crawler (true/false)
-ENABLE_CRAWLER=
-# Whether to enable notifications (true/false)
-ENABLE_NOTIFICATION=
-# Report mode (daily|incremental|current)
-REPORT_MODE=
-# Display mode (keyword|platform)
-DISPLAY_MODE=
-
 # ============================================
 # Web server configuration
 # ============================================
@@ -70,12 +57,12 @@ GENERIC_WEBHOOK_URL=
 GENERIC_WEBHOOK_TEMPLATE=
 
 # ============================================
-# AI analysis configuration
+# AI configuration (ai_analysis and ai_translation share the model configuration)
 # ============================================
 
 # Whether to enable AI analysis (true/false)
 AI_ANALYSIS_ENABLED=false
-# AI API Key (required when AI analysis is enabled)
+# AI API Key (required when AI features are enabled)
 AI_API_KEY=
 # AI provider (deepseek|openai|gemini|custom)
 AI_PROVIDER=deepseek

+ 1 - 6
docker/docker-compose-build.yml

@@ -15,11 +15,6 @@ services:
 
     environment:
       - TZ=Asia/Shanghai
-      # Core configuration
-      - ENABLE_CRAWLER=${ENABLE_CRAWLER:-}
-      - ENABLE_NOTIFICATION=${ENABLE_NOTIFICATION:-}
-      - REPORT_MODE=${REPORT_MODE:-}
-      - DISPLAY_MODE=${DISPLAY_MODE:-}
       # Web server
       - ENABLE_WEBSERVER=${ENABLE_WEBSERVER:-false}
       - WEBSERVER_PORT=${WEBSERVER_PORT:-8080}
@@ -47,7 +42,7 @@ services:
       # Generic webhook configuration
       - GENERIC_WEBHOOK_URL=${GENERIC_WEBHOOK_URL:-}
       - GENERIC_WEBHOOK_TEMPLATE=${GENERIC_WEBHOOK_TEMPLATE:-}
-      # AI analysis configuration
+      # AI configuration (ai_analysis and ai_translation share the model configuration)
       - AI_ANALYSIS_ENABLED=${AI_ANALYSIS_ENABLED:-false}
       - AI_API_KEY=${AI_API_KEY:-}
       - AI_PROVIDER=${AI_PROVIDER:-}

+ 1 - 6
docker/docker-compose.yml

@@ -13,11 +13,6 @@ services:
 
     environment:
       - TZ=Asia/Shanghai
-      # Core configuration
-      - ENABLE_CRAWLER=${ENABLE_CRAWLER:-}
-      - ENABLE_NOTIFICATION=${ENABLE_NOTIFICATION:-}
-      - REPORT_MODE=${REPORT_MODE:-}
-      - DISPLAY_MODE=${DISPLAY_MODE:-}
       # Web server
       - ENABLE_WEBSERVER=${ENABLE_WEBSERVER:-false}
       - WEBSERVER_PORT=${WEBSERVER_PORT:-8080}
@@ -45,7 +40,7 @@ services:
       # Generic webhook configuration
       - GENERIC_WEBHOOK_URL=${GENERIC_WEBHOOK_URL:-}
       - GENERIC_WEBHOOK_TEMPLATE=${GENERIC_WEBHOOK_TEMPLATE:-}
-      # AI analysis configuration
+      # AI configuration (ai_analysis and ai_translation share the model configuration)
       - AI_ANALYSIS_ENABLED=${AI_ANALYSIS_ENABLED:-false}
       - AI_API_KEY=${AI_API_KEY:-}
       - AI_PROVIDER=${AI_PROVIDER:-}

+ 0 - 5
docker/manage.py

@@ -279,11 +279,6 @@ def show_config():
         "CRON_SCHEDULE",
         "RUN_MODE",
         "IMMEDIATE_RUN",
-        # Core configuration
-        "ENABLE_CRAWLER",
-        "ENABLE_NOTIFICATION",
-        "REPORT_MODE",
-        "DISPLAY_MODE",
         # Notification channels
         "FEISHU_WEBHOOK_URL",
         "DINGTALK_WEBHOOK_URL",

+ 1 - 1
mcp_server/__init__.py

@@ -5,4 +5,4 @@ TrendRadar MCP Server
 
 """
 
-__version__ = "3.1.5"
+__version__ = "3.1.6"

+ 70 - 326
mcp_server/server.py

@@ -9,8 +9,7 @@ import asyncio
 import json
 from typing import List, Optional, Dict, Union
 
-from fastmcp import FastMCP, Context
-from fastmcp.server.dependencies import get_context
+from fastmcp import FastMCP
 
 from .tools.data_query import DataQueryTools
 from .tools.analytics import AnalyticsTools
@@ -28,9 +27,6 @@ mcp = FastMCP('trendradar-news')
 # Global tool instances (initialized on the first request)
 _tools_instances = {}
 
-# Session-level tool instance storage (used for Context management)
-_session_tools: Dict[str, Dict] = {}
-
 
 def _get_tools(project_root: Optional[str] = None):
     """Get or create the tool instances (singleton pattern)"""
@@ -44,39 +40,6 @@ def _get_tools(project_root: Optional[str] = None):
     return _tools_instances
 
 
-def _get_tools_with_context(ctx: Optional[Context] = None) -> Dict:
-    """
-    Get tool instances (with session isolation)
-
-    If a Context is provided, a separate set of tool instances is created for each session.
-    This avoids state pollution between sessions.
-
-    Args:
-        ctx: FastMCP Context object
-
-    Returns:
-        Dictionary of tool instances
-    """
-    if ctx is None:
-        return _get_tools()
-
-    # Get the session ID (if any)
-    session_id = getattr(ctx, 'session_id', None) or 'default'
-
-    if session_id not in _session_tools:
-        # Create tool instances for the new session
-        _session_tools[session_id] = {
-            'data': DataQueryTools(),
-            'analytics': AnalyticsTools(),
-            'search': SearchTools(),
-            'config': ConfigManagementTools(),
-            'system': SystemManagementTools(),
-            'storage': StorageSyncTools(),
-        }
-
-    return _session_tools[session_id]
-
-
 # ==================== MCP Resources ====================
 
 @mcp.resource("config://platforms")
@@ -229,28 +192,17 @@ async def get_latest_news(
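     Get the latest batch of crawled news data for a quick view of current hot topics
 
     Args:
-        platforms: list of platform IDs, e.g. ['zhihu', 'weibo', 'douyin']
-                   - If omitted: use all platforms configured in config.yaml
-                   - Supported platforms come from the platforms section of config/config.yaml
-                   - Each platform has a corresponding name field (e.g. "知乎", "微博") to help the AI recognize it
+        platforms: list of platform IDs, e.g. ['zhihu', 'weibo']; if omitted, all platforms are used
         limit: maximum number of items to return, default 50, max 1000
-               Note: the actual number returned may be lower than requested, depending on how many news items are currently available
         include_url: whether to include URL links, default False (saves tokens)
 
     Returns:
         JSON-formatted news list
 
-    **Important: data presentation advice**
-    This tool returns the complete news list (usually 50 items) to you. Please note:
-    - **Tool returns**: the complete 50 items ✅
-    - **Recommended presentation**: show the user all of the data unless they explicitly ask for a summary
-    - **User expectation**: the user may need the complete data, so summarize with care
-
-    **When summarizing is acceptable**:
-    - The user explicitly says "give me a summary" or "just the highlights"
-    - When there are more than 100 items, you may show part of them first and ask whether to show all
-
-    **Note**: if the user asks "why is only part of it shown", they need the complete data
+    **Data presentation advice**
+    - Show all returned data by default unless the user explicitly asks for a summary
+    - Filter only when the user says "summarize" or "just the highlights"
+    - If the user asks "why is only part of it shown", they need the complete data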
     """
     tools = _get_tools()
     result = await asyncio.to_thread(
@@ -308,35 +260,17 @@ async def get_latest_rss(
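     RSS data is stored separately from hotlist news and presented as a time-ordered stream, which suits fetching the latest content from specific sources.
 
     Args:
-        feeds: list of RSS feed IDs, e.g. ['hacker-news', '36kr']
-               - If omitted: return data for all configured RSS feeds
-               - Supported feeds come from the rss.feeds section of config/config.yaml
+        feeds: list of RSS feed IDs, e.g. ['hacker-news', '36kr']; if omitted, all feeds are returned
         days: fetch data for the last N days, default 1 (today only), max 30
-              - 1: today only (default)
-              - 7: last week
-              - 30: last month
         limit: maximum number of items to return, default 50, max 500
         include_summary: whether to include article summaries, default False (saves tokens)
 
     Returns:
-        JSON-formatted list of RSS items, containing:
-        - rss: array of RSS items
-            - title: article title
-            - feed_id: RSS feed ID
-            - feed_name: RSS feed name
-            - url: article link
-            - published_at: publication time
-            - author: author (if any)
-            - date: data date
-            - summary: summary (only when include_summary=True)
-        - total: number of items returned
-        - feeds: list of requested RSS feeds
+        JSON-formatted list of RSS items
 
     Examples:
-        - Get all of today's RSS: get_latest_rss()
-        - Get the last week: get_latest_rss(days=7)
-        - Get a given feed for the last 7 days: get_latest_rss(feeds=['hacker-news'], days=7)
-        - Include summaries: get_latest_rss(include_summary=True, days=7, limit=20)
+        - get_latest_rss()
+        - get_latest_rss(days=7, feeds=['hacker-news'])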
     """
     tools = _get_tools()
     result = await asyncio.to_thread(
@@ -425,28 +359,12 @@ async def get_news_by_date(
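             - Natural language: "今天" (today), "昨天" (yesterday), "本周" (this week), "最近7天" (last 7 days)
             - Single-day string: "2025-01-15"
             - Default: "今天" (today)
-        platforms: list of platform IDs, e.g. ['zhihu', 'weibo', 'douyin']
-                   - If omitted: use all platforms configured in config.yaml
-                   - Supported platforms come from the platforms section of config/config.yaml
-                   - Each platform has a corresponding name field (e.g. "知乎", "微博") to help the AI recognize it
+        platforms: list of platform IDs, e.g. ['zhihu', 'weibo']; if omitted, all platforms are used
         limit: maximum number of items to return, default 50, max 1000
-               Note: the actual number returned may be lower than requested, depending on the total number of news items for the given date
         include_url: whether to include URL links, default False (saves tokens)
 
     Returns:
         JSON-formatted news list, including title, platform, rank, and other information
-
-    **Important: data presentation advice**
-    This tool returns the complete news list (usually 50 items) to you. Please note:
-    - **Tool returns**: the complete 50 items ✅
-    - **Recommended presentation**: show the user all of the data unless they explicitly ask for a summary
-    - **User expectation**: the user may need the complete data, so summarize with care
-
-    **When summarizing is acceptable**:
-    - The user explicitly says "give me a summary" or "just the highlights"
-    - When there are more than 100 items, you may show part of them first and ask whether to show all
-
-    **Note**: if the user asks "why is only part of it shown", they need the complete data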
     """
     tools = _get_tools()
     result = await asyncio.to_thread(
@@ -476,23 +394,17 @@ async def analyze_topic_trend(
     """
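     Unified topic trend analysis tool - combines multiple trend analysis modes
 
-    **Important: date range handling**
-    When the user uses natural language such as "本周" (this week) or "最近7天" (last 7 days), first call the resolve_date_range tool to get exact dates:
-    1. Call resolve_date_range("本周") → returns {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}
-    2. Pass the returned date_range to this tool
+    Recommendation: when using natural-language dates, first call resolve_date_range to get the exact date range.
 
     Args:
         topic: topic keyword (required)
-        analysis_type: analysis type, one of:
-            - "trend": heat trend analysis (tracks how the topic's heat changes)
-            - "lifecycle": lifecycle analysis (the full cycle from appearance to disappearance)
-            - "viral": abnormal heat detection (identifies topics that suddenly go viral)
-            - "predict": topic prediction (predicts likely future hot topics)
-        date_range: date range (trend and lifecycle modes), optional
-                    - **Format**: {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}
-                    - **How to obtain**: call the resolve_date_range tool to resolve a natural-language date
-                    - **Default**: the last 7 days if omitted
-        granularity: time granularity (trend mode), default "day" (only day is supported because the underlying data is aggregated by day)
+        analysis_type: analysis type
+            - "trend": heat trend analysis (default)
+            - "lifecycle": lifecycle analysis
+            - "viral": abnormal heat detection
+            - "predict": topic prediction
+        date_range: date range, format {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}, default last 7 days
+        granularity: time granularity, default "day"
         spike_threshold: heat spike multiplier threshold (viral mode), default 3.0
         time_window: detection time window in hours (viral mode), default 24
         lookahead_hours: number of hours to predict ahead (predict mode), default 6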
@@ -502,15 +414,8 @@ async def analyze_topic_trend(
         JSON-formatted trend analysis result
 
     Examples:
-        User: "Analyze this week's AI trend"
-        Recommended call flow:
-        1. resolve_date_range("本周") → {"date_range": {"start": "2025-11-18", "end": "2025-11-26"}}
-        2. analyze_topic_trend(topic="AI", date_range={"start": "2025-11-18", "end": "2025-11-26"})
-
-        User: "Show Tesla's heat over the last 30 days"
-        Recommended call flow:
-        1. resolve_date_range("最近30天") → {"date_range": {"start": "2025-10-28", "end": "2025-11-26"}}
-        2. analyze_topic_trend(topic="特斯拉", analysis_type="lifecycle", date_range=...)
+        - analyze_topic_trend(topic="AI", date_range={"start": "2025-01-01", "end": "2025-01-07"})
+        - analyze_topic_trend(topic="特斯拉", analysis_type="lifecycle")
     """
     tools = _get_tools()
     result = await asyncio.to_thread(
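
The docstring above expects `date_range` as `{"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}` and defaults to the last 7 days. As an illustration of that shape only, here is a sketch of an inclusive "last N days" range. `last_n_days` is a hypothetical helper and not the implementation of the `resolve_date_range` tool. Its inclusive counting mirrors the removed docstring example, in which "最近7天" maps to 2025-11-20 through 2025-11-26.

```python
from datetime import date, timedelta
from typing import Optional


def last_n_days(n: int, today: Optional[date] = None) -> dict:
    """Build an inclusive date_range dict covering the n days ending today."""
    if n < 1:
        raise ValueError("n must be at least 1")
    end = today or date.today()
    # Inclusive range: 7 days ending on the 26th start on the 20th.
    start = end - timedelta(days=n - 1)
    return {"start": start.isoformat(), "end": end.isoformat()}
```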
@@ -583,24 +488,13 @@ async def analyze_sentiment(
     """
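     Analyze the sentiment and heat trend of news
 
-    **Important: date range handling**
-    When the user uses natural language such as "本周" (this week) or "最近7天" (last 7 days), first call the resolve_date_range tool to get exact dates:
-    1. Call resolve_date_range("本周") → returns {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}
-    2. Pass the returned date_range to this tool
+    Recommendation: when using natural-language dates, first call resolve_date_range to get the exact date range.
 
     Args:
         topic: topic keyword (optional)
-        platforms: list of platform IDs, e.g. ['zhihu', 'weibo', 'douyin']
-                   - If omitted: use all platforms configured in config.yaml
-                   - Supported platforms come from the platforms section of config/config.yaml
-                   - Each platform has a corresponding name field (e.g. "知乎", "微博") to help the AI recognize it
-        date_range: date range (optional)
-                    - **Format**: {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}
-                    - **How to obtain**: call the resolve_date_range tool to resolve a natural-language date
-                    - **Default**: today's data if omitted
-        limit: number of news items to return, default 50, max 100
-               Note: this tool deduplicates news titles (the same title on different platforms is kept only once),
-               so the actual number returned may be lower than the requested limit
+        platforms: list of platform IDs, e.g. ['zhihu', 'weibo']; if omitted, all platforms are used
+        date_range: date range, format {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}, default today
+        limit: number of news items to return, default 50, max 100 (titles are deduplicated)
         sort_by_weight: whether to sort by heat weight, default True
         include_url: whether to include URL links, default False (saves tokens)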
 
@@ -608,20 +502,7 @@ async def analyze_sentiment(
         Analysis result in JSON format, including sentiment distribution, heat trend, and related news
 
     Examples:
-        用户:"分析AI本周的情感倾向"
-        推荐调用流程:
-        1. resolve_date_range("本周") → {"date_range": {"start": "2025-11-18", "end": "2025-11-26"}}
-        2. analyze_sentiment(topic="AI", date_range={"start": "2025-11-18", "end": "2025-11-26"})
-
-        用户:"分析特斯拉最近7天的新闻情感"
-        推荐调用流程:
-        1. resolve_date_range("最近7天") → {"date_range": {"start": "2025-11-20", "end": "2025-11-26"}}
-        2. analyze_sentiment(topic="特斯拉", date_range={"start": "2025-11-20", "end": "2025-11-26"})
-
-    **重要:数据展示策略**
-    - 本工具返回完整的分析结果和新闻列表
-    - **默认展示方式**:展示完整的分析结果(包括所有新闻)
-    - 仅在用户明确要求"总结"或"挑重点"时才进行筛选
+        - analyze_sentiment(topic="AI", date_range={"start": "2025-01-01", "end": "2025-01-07"})
     """
     tools = _get_tools()
     result = await asyncio.to_thread(
@@ -651,13 +532,9 @@ async def find_related_news(
         reference_title: Reference news title (full or partial)
         date_range: Date range (optional)
             - Omitted: query today's data only
-            - "today": 今天
-            - "yesterday": 昨天
-            - "last_week": 最近7天
-            - "last_month": 最近30天
+            - "today", "yesterday", "last_week" (last 7 days), "last_month" (last 30 days): presets
             - {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}: custom range
-        threshold: 相似度阈值,0-1之间,默认0.5
-                   注意:阈值越高匹配越严格,返回结果越少
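+        threshold: Similarity threshold between 0 and 1; default 0.5 (higher values match more strictly and return fewer results)
         limit: Maximum number of results; default 50
         include_url: Whether to include URL links; defaults to False (saves tokens)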
 
@@ -665,13 +542,8 @@ async def find_related_news(
         List of related news in JSON format, sorted by similarity
 
     Examples:
-        - 查找今天的相似新闻: find_related_news(reference_title="特斯拉降价")
-        - 查找历史相关新闻: find_related_news(reference_title="特斯拉降价", date_range="last_week")
-        - 自定义日期范围: find_related_news(reference_title="AI突破", date_range={"start": "2025-01-01", "end": "2025-01-15"})
-
-    **重要:数据展示策略**
-    - 本工具返回完整的相关新闻列表(包括相似度分数)
-    - 仅在用户明确要求"总结"时才进行筛选
+        - find_related_news(reference_title="特斯拉降价")
+        - find_related_news(reference_title="AI突破", date_range="last_week")
     """
     tools = _get_tools()
     result = await asyncio.to_thread(
@@ -723,46 +595,21 @@ async def aggregate_news(
     """
     Cross-platform news aggregation: deduplicate and merge similar news
 
-    将不同平台报道的同一事件合并为一条聚合新闻,
-    显示该新闻在各平台的覆盖情况和综合热度。
-
-    **使用场景:**
-    - 想要看到去重后的热点新闻(避免同一事件在不同平台重复展示)
-    - 分析某个话题在多个平台的覆盖情况
-    - 获取跨平台的综合热度排名
+    Merges reports of the same event from different platforms into one aggregated item, showing cross-platform coverage and combined heat.
 
     Args:
-        date_range: 日期范围(可选)
-            - 不指定: 查询今天
-            - {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}: 日期范围
-        platforms: 平台过滤列表,如 ['zhihu', 'weibo']
-        similarity_threshold: 相似度阈值,0.3-1.0之间,默认0.7
-                              越高越严格(仅合并非常相似的标题)
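+        date_range: Date range; queries today if omitted
+        platforms: List of platform IDs, e.g. ['zhihu', 'weibo']; all platforms are used if omitted
+        similarity_threshold: Similarity threshold from 0.3 to 1.0; default 0.7 (higher is stricter)
         limit: Number of aggregated news items to return; default 50
         include_url: Whether to include URL links; defaults to False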
 
     Returns:
-        JSON格式的聚合结果,包含:
-        - summary: 聚合统计(原始数量、去重后数量、去重率)
-        - aggregated_news: 聚合后的新闻列表
-            - representative_title: 代表标题
-            - platforms: 覆盖的平台列表
-            - platform_count: 覆盖平台数
-            - is_cross_platform: 是否跨平台新闻
-            - best_rank: 最佳排名
-            - aggregate_weight: 综合权重
-            - sources: 各平台来源详情
-        - statistics: 平台覆盖统计
+        Aggregation result in JSON format, including deduplication stats, the aggregated news list, and platform coverage stats
 
     Examples:
-        - aggregate_news()  # 聚合今天所有平台的新闻
-        - aggregate_news(similarity_threshold=0.8)  # 更严格的相似度匹配
-        - aggregate_news(date_range={"start": "2025-01-01", "end": "2025-01-07"})
-
-    **重要:数据展示策略**
-    - 本工具返回去重聚合后的新闻列表
-    - 跨平台新闻(is_cross_platform=true)通常更具新闻价值
-    - 可优先展示 platform_count > 1 的新闻
+        - aggregate_news()
+        - aggregate_news(similarity_threshold=0.8)
     """
     tools = _get_tools()
     result = await asyncio.to_thread(
@@ -854,67 +701,30 @@ async def search_news(
     """
     Unified search interface supporting several search modes; can search the hot list and RSS at the same time
 
-    **重要:日期范围处理**
-    当用户使用"本周"、"最近7天"等自然语言时,请先调用 resolve_date_range 工具获取精确日期:
-    1. 调用 resolve_date_range("本周") → 获取 {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}
-    2. 将返回的 date_range 传入本工具
+    Tip: when the user gives a natural-language date, call resolve_date_range first to get an exact date range.
 
     Args:
         query: Search keyword or content fragment
-        search_mode: 搜索模式,可选值:
-            - "keyword": 精确关键词匹配(默认,适合搜索特定话题)
-            - "fuzzy": 模糊内容匹配(适合搜索内容片段,会过滤相似度低于阈值的结果)
-            - "entity": 实体名称搜索(适合搜索人物/地点/机构)
-        date_range: 日期范围(可选)
-                    - **格式**: {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}
-                    - **获取方式**: 调用 resolve_date_range 工具解析自然语言日期
-                    - **默认**: 不指定时默认查询今天的新闻
-        platforms: 平台ID列表,如 ['zhihu', 'weibo', 'douyin']
-                   - 不指定时:使用 config.yaml 中配置的所有平台
-                   - 支持的平台来自 config/config.yaml 的 platforms 配置
-                   - 每个平台都有对应的name字段(如"知乎"、"微博"),方便AI识别
-        limit: 热榜返回条数限制,默认50,最大1000
-               注意:实际返回数量取决于搜索匹配结果(特别是 fuzzy 模式下会过滤低相似度结果)
-        sort_by: 排序方式,可选值:
-            - "relevance": 按相关度排序(默认)
-            - "weight": 按新闻权重排序
-            - "date": 按日期排序
-        threshold: 相似度阈值(仅fuzzy模式有效),0-1之间,默认0.6
-                   注意:阈值越高匹配越严格,返回结果越少
-        include_url: 是否包含URL链接,默认False(节省token)
-        include_rss: 是否同时搜索RSS订阅数据,默认False
-                     - 设为True时,会在热榜结果后附加RSS搜索结果
-                     - RSS结果独立展示,不影响热榜排名
-        rss_limit: RSS返回条数限制,默认20(仅当include_rss=True时有效)
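+        search_mode: Search mode
+            - "keyword": exact keyword match (default)
+            - "fuzzy": fuzzy content match
+            - "entity": entity name search (people/places/organizations)
+        date_range: Date range in the form {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}; defaults to today
+        platforms: List of platform IDs, e.g. ['zhihu', 'weibo']; all platforms are used if omitted
+        limit: Maximum number of hot-list results; default 50
+        sort_by: Sort order - "relevance" / "weight" (news weight) / "date"
+        threshold: Similarity threshold between 0 and 1 (fuzzy mode only); default 0.6
+        include_url: Whether to include URL links; defaults to False
+        include_rss: Whether to search RSS data as well; defaults to False
+        rss_limit: Maximum number of RSS results; default 20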
 
     Returns:
-        JSON格式的搜索结果,包含:
-        - results: 热榜新闻列表(按排名/相关度排序)
-        - rss: RSS订阅结果列表(仅当include_rss=True时返回)
-        - summary: 搜索统计信息
+        Search result in JSON format, including the hot-list news list and optional RSS results
 
     Examples:
-        用户:"搜索本周的AI新闻"
-        推荐调用流程:
-        1. resolve_date_range("本周") → {"date_range": {"start": "2025-11-18", "end": "2025-11-26"}}
-        2. search_news(query="AI", date_range={"start": "2025-11-18", "end": "2025-11-26"})
-
-        用户:"搜索AI相关内容,包括RSS"
-        → search_news(query="AI", include_rss=True)
-
-        用户:"最近7天的特斯拉新闻"
-        推荐调用流程:
-        1. resolve_date_range("最近7天") → {"date_range": {"start": "2025-11-20", "end": "2025-11-26"}}
-        2. search_news(query="特斯拉", date_range={"start": "2025-11-20", "end": "2025-11-26"})
-
-        用户:"今天的AI新闻"(默认今天,无需解析)
-        → search_news(query="AI")
-
-    **重要:数据展示策略**
-    - 本工具返回完整的搜索结果列表
-    - **默认展示方式**:展示全部返回的新闻,无需总结或筛选
-    - 仅在用户明确要求"总结"或"挑重点"时才进行筛选
-    - 当include_rss=True时,热榜和RSS结果分开展示,RSS在热榜之后
+        - search_news(query="AI")
+        - search_news(query="AI", include_rss=True)
+        - search_news(query="特斯拉", date_range={"start": "2025-01-01", "end": "2025-01-07"})
     """
     tools = _get_tools()
     result = await asyncio.to_thread(
@@ -981,37 +791,16 @@ async def check_version(
     Check for version updates (covers both TrendRadar and the MCP Server)
 
     Compares the local versions with the remote versions on GitHub to decide whether an update is needed.
-    远程版本 URL 从 config.yaml 获取:
-    - version_check_url: TrendRadar 版本
-    - mcp_version_check_url: MCP Server 版本
 
     Args:
         proxy_url: Optional proxy URL for reaching GitHub (e.g. http://127.0.0.1:7890)
 
     Returns:
-        JSON格式的版本检查结果,包含:
-        - success: 是否成功
-        - summary:
-            - description: 结果描述
-            - any_update: 是否有任何组件需要更新
-        - data:
-            - trendradar: TrendRadar 版本检查结果
-                - name: 组件名称
-                - current_version: 当前本地版本(如 "5.0.0")
-                - remote_version: 远程最新版本
-                - need_update: 是否需要更新
-                - message: 状态描述
-            - mcp: MCP Server 版本检查结果
-                - name: 组件名称
-                - current_version: 当前本地版本(如 "3.1.4")
-                - remote_version: 远程最新版本
-                - need_update: 是否需要更新
-                - message: 状态描述
-            - any_update: 是否有任何组件需要更新
+        Version check result in JSON format, including the version comparison for both components and whether an update is needed
 
     Examples:
-        - check_version()  # 直接检查两个组件的版本
-        - check_version(proxy_url="http://127.0.0.1:7890")  # 使用代理访问 GitHub
+        - check_version()
+        - check_version(proxy_url="http://127.0.0.1:7890")
     """
     tools = _get_tools()
     result = await asyncio.to_thread(tools['system'].check_version, proxy_url=proxy_url)
@@ -1028,25 +817,16 @@ async def trigger_crawl(
     Manually trigger one crawl task (persistence is optional)
 
     Args:
-        platforms: 指定平台ID列表,如 ['zhihu', 'weibo', 'douyin']
-                   - 不指定时:使用 config.yaml 中配置的所有平台
-                   - 支持的平台来自 config/config.yaml 的 platforms 配置
-                   - 每个平台都有对应的name字段(如"知乎"、"微博"),方便AI识别
-                   - 注意:失败的平台会在返回结果的 failed_platforms 字段中列出
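+        platforms: List of platform IDs, e.g. ['zhihu', 'weibo']; all platforms are used if omitted
         save_to_local: Whether to save to the local output directory; defaults to False
         include_url: Whether to include URL links; defaults to False (saves tokens)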
 
     Returns:
-        JSON格式的任务状态信息,包含:
-        - platforms: 成功爬取的平台列表
-        - failed_platforms: 失败的平台列表(如有)
-        - total_news: 爬取的新闻总数
-        - data: 新闻数据
+        Task status in JSON format, including the lists of succeeded/failed platforms and the news data
 
     Examples:
-        - 临时爬取: trigger_crawl(platforms=['zhihu'])
-        - 爬取并保存: trigger_crawl(platforms=['weibo'], save_to_local=True)
-        - 使用默认平台: trigger_crawl()  # 爬取config.yaml中配置的所有平台
+        - trigger_crawl(platforms=['zhihu'])
+        - trigger_crawl(save_to_local=True)
     """
     tools = _get_tools()
     result = await asyncio.to_thread(
@@ -1107,26 +887,7 @@ async def get_storage_status() -> str:
     Shows the current storage backend configuration and the status of local and remote storage.
 
     Returns:
-        JSON格式的存储状态信息,包含:
-        - backend: 当前使用的后端类型(local/remote/auto)
-        - local: 本地存储状态
-            - data_dir: 数据目录
-            - retention_days: 保留天数
-            - total_size: 总大小
-            - date_count: 日期数量
-            - earliest_date: 最早日期
-            - latest_date: 最新日期
-        - remote: 远程存储状态
-            - configured: 是否已配置
-            - endpoint_url: 服务端点
-            - bucket_name: 存储桶名称
-            - date_count: 远程日期数量
-        - pull: 拉取配置
-            - enabled: 是否启用自动拉取
-            - days: 自动拉取天数
-
-    Examples:
-        - get_storage_status()  # 查看所有存储状态
+        Storage status in JSON format, including local/remote storage status and the pull configuration
     """
     tools = _get_tools()
     result = await asyncio.to_thread(tools['storage'].get_storage_status)
@@ -1140,37 +901,20 @@ async def list_available_dates(
     """
     List the date ranges available locally and remotely
 
-    查看本地和远程存储中有哪些日期的数据可用,
-    帮助了解数据覆盖范围和同步状态。
+    Shows which dates have data available in local and remote storage.
 
     Args:
-        source: 数据来源,可选值:
-            - "local": 仅列出本地可用日期
-            - "remote": 仅列出远程可用日期
-            - "both": 同时列出两者进行对比(默认)
+        source: Data source
+            - "local": local only
+            - "remote": remote only
+            - "both": list both and compare them (default)
 
     Returns:
-        JSON格式的日期列表,包含:
-        - local: 本地日期信息(如果 source 包含 local)
-            - dates: 日期列表(按时间倒序)
-            - count: 日期数量
-            - earliest: 最早日期
-            - latest: 最新日期
-        - remote: 远程日期信息(如果 source 包含 remote)
-            - configured: 是否已配置远程存储
-            - dates: 日期列表
-            - count: 日期数量
-            - earliest: 最早日期
-            - latest: 最新日期
-        - comparison: 对比结果(仅当 source="both" 时)
-            - only_local: 仅本地存在的日期
-            - only_remote: 仅远程存在的日期
-            - both: 两边都存在的日期
+        Date list in JSON format, including per-source date information and the comparison result
 
     Examples:
-        - list_available_dates()  # 查看本地和远程的对比
-        - list_available_dates(source="local")  # 仅查看本地
-        - list_available_dates(source="remote")  # 仅查看远程
+        - list_available_dates()
+        - list_available_dates(source="local")
     """
     tools = _get_tools()
     result = await asyncio.to_thread(tools['storage'].list_available_dates, source=source)
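
The shortened docstrings above still expect `date_range` as `{"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}`. As a rough illustration of that shape only, here is a minimal sketch of an inclusive "last N days" range. `last_n_days` is a hypothetical helper, not the project's `resolve_date_range` tool, whose implementation is not shown in this diff. The inclusive convention is inferred from the removed examples (for instance "last 7 days" ending 2025-11-26 starting on 2025-11-20).

```python
from datetime import date, timedelta
from typing import Dict


def last_n_days(n: int, today: date) -> Dict[str, str]:
    """Hypothetical helper: inclusive range of n days ending on `today`."""
    if n < 1:
        raise ValueError("n must be at least 1")
    start = today - timedelta(days=n - 1)  # inclusive, so subtract n - 1
    return {"start": start.isoformat(), "end": today.isoformat()}


# The resulting dict has the shape the tool docstrings describe for date_range.
date_range = last_n_days(7, date(2025, 11, 26))
print(date_range)
```

A dict like this could then be passed as the `date_range` argument of tools such as `search_news` or `analyze_sentiment`.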

+ 6 - 14
mcp_server/services/data_service.py

@@ -150,7 +150,7 @@ class DataService:
         # Try the cache first
         date_str = target_date.strftime("%Y-%m-%d")
         cache_key = f"news_by_date:{date_str}:{','.join(platforms or [])}:{limit}:{include_url}"
-        cached = self.cache.get(cache_key, ttl=1800)  # 30分钟缓存
+        cached = self.cache.get(cache_key, ttl=900)  # 15-minute cache
         if cached:
             return cached
 
@@ -353,7 +353,7 @@ class DataService:
         """
         # Try the cache first
         cache_key = f"trending_topics:{top_n}:{mode}:{extract_mode}"
-        cached = self.cache.get(cache_key, ttl=1800)  # 30分钟缓存
+        cached = self.cache.get(cache_key, ttl=900)  # 15-minute cache
         if cached:
             return cached
 
@@ -470,12 +470,6 @@ class DataService:
         Raises:
             FileParseError: The config file could not be parsed
         """
-        # 尝试从缓存获取
-        cache_key = f"config:{section}"
-        cached = self.cache.get(cache_key, ttl=3600)  # 1小时缓存
-        if cached:
-            return cached
-
         # Parse the config files
         config_data = self.parser.parse_yaml_config()
         word_groups = self.parser.parse_frequency_words()
@@ -483,14 +477,15 @@ class DataService:
         # Return the config that matches the requested section
         advanced = config_data.get("advanced", {})
         advanced_crawler = advanced.get("crawler", {})
+        platforms_config = config_data.get("platforms", {})
 
         if section == "all" or section == "crawler":
             crawler_config = {
-                "enable_crawler": advanced_crawler.get("enabled", True),
+                "enable_crawler": platforms_config.get("enabled", True),
                 "use_proxy": advanced_crawler.get("use_proxy", False),
                 "request_interval": advanced_crawler.get("request_interval", 1),
                 "retry_times": 3,
-                "platforms": [p["id"] for p in config_data.get("platforms", [])]
+                "platforms": [p["id"] for p in platforms_config.get("sources", [])]
             }
 
         if section == "all" or section == "push":
@@ -545,9 +540,6 @@ class DataService:
         else:
             result = {}
 
-        # 缓存结果
-        self.cache.set(cache_key, result)
-
         return result
 
     def get_available_date_range(self) -> Tuple[Optional[datetime], Optional[datetime]]:
@@ -863,7 +855,7 @@ class DataService:
             RSS feed status information
         """
         cache_key = "rss_feeds_status"
-        cached = self.cache.get(cache_key, ttl=300)
+        cached = self.cache.get(cache_key, ttl=900)
         if cached:
             return cached
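
The hunks above pass the TTL at read time (`self.cache.get(cache_key, ttl=900)`) rather than at write time. The cache class itself is not part of this diff, so the following is only a sketch of how such a read-side TTL might behave. `ReadTTLCache` and its injectable `clock` are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ReadTTLCache:
    """Hypothetical cache: records a timestamp on set, checks the age on get."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self._clock(), value)

    def get(self, key: str, ttl: float) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > ttl:
            del self._store[key]  # expired: drop the entry and report a miss
            return None
        return value


# Drive the cache with a fake clock so the example is deterministic.
now = [0.0]
cache = ReadTTLCache(clock=lambda: now[0])
cache.set("rss_feeds_status", {"feeds": 3})
now[0] = 600.0
fresh = cache.get("rss_feeds_status", ttl=900)  # 10 minutes old: still served
now[0] = 901.0
stale = cache.get("rss_feeds_status", ttl=900)  # older than 15 minutes: None
```

One thing to keep in mind with the `if cached:` checks in the diff: a cached but empty result (such as `[]` or `{}`) is falsy, so it would be treated as a miss and recomputed.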
 

+ 1 - 1
mcp_server/services/parser_service.py

@@ -325,7 +325,7 @@ class ParserService:
         cache_key = f"read_all:{db_type}:{date_str}:{platform_key}"
 
         is_today = (date is None) or (date.date() == datetime.now().date())
-        ttl = 900 if is_today else 3600
+        ttl = 900 if is_today else 900
 
         cached = self.cache.get(cache_key, ttl=ttl)
         if cached:

+ 35 - 18
mcp_server/tools/config_mgmt.py

@@ -4,13 +4,28 @@
 Implements config query and management features.
 """
 
-from typing import Dict, Optional
+from typing import Dict, Optional, Any, TypedDict
 
 from ..services.data_service import DataService
 from ..utils.validators import validate_config_section
 from ..utils.errors import MCPError
 
 
+class ErrorInfo(TypedDict, total=False):
+    """Error info structure (all keys optional)."""
+    code: str
+    message: str
+    suggestion: str
+
+
+class ConfigResult(TypedDict):
+    """Config query result. All four keys are always present; every field except success may be None."""
+    success: bool
+    config: Optional[Dict[str, Any]]
+    section: Optional[str]
+    error: Optional[ErrorInfo]
+
+
 class ConfigManagementTools:
     """Config management tools."""
 
@@ -23,7 +38,7 @@ class ConfigManagementTools:
         """
         self.data_service = DataService(project_root)
 
-    def get_current_config(self, section: Optional[str] = None) -> Dict:
+    def get_current_config(self, section: Optional[str] = None) -> ConfigResult:
         """
         Get the current system configuration
 
@@ -45,22 +60,24 @@ class ConfigManagementTools:
             # Fetch the config
             config = self.data_service.get_current_config(section=section)
 
-            return {
-                "config": config,
-                "section": section,
-                "success": True
-            }
+            return ConfigResult(
+                success=True,
+                config=config,
+                section=section,
+                error=None
+            )
 
         except MCPError as e:
-            return {
-                "success": False,
-                "error": e.to_dict()
-            }
+            return ConfigResult(
+                success=False,
+                config=None,
+                section=None,
+                error=e.to_dict()
+            )
         except Exception as e:
-            return {
-                "success": False,
-                "error": {
-                    "code": "INTERNAL_ERROR",
-                    "message": str(e)
-                }
-            }
+            return ConfigResult(
+                success=False,
+                config=None,
+                section=None,
+                error={"code": "INTERNAL_ERROR", "message": str(e), "suggestion": "请查看服务日志获取详细信息"}
+            )
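
For readers less familiar with `TypedDict`, the sketch below mirrors the two classes from this hunk and shows what the new return statements produce at runtime. The sample values (`"crawler"`, `"boom"`) are made up for illustration. Calling a `TypedDict` class builds a plain `dict`, and `Optional[...]` only allows `None` as a value; it does not make the key itself optional.

```python
from typing import Any, Dict, Optional, TypedDict


class ErrorInfo(TypedDict, total=False):
    # total=False: every key may be omitted
    code: str
    message: str
    suggestion: str


class ConfigResult(TypedDict):
    # default total=True: every key must be present, values may be None
    success: bool
    config: Optional[Dict[str, Any]]
    section: Optional[str]
    error: Optional[ErrorInfo]


ok = ConfigResult(success=True, config={"crawler": {}}, section="crawler", error=None)
failed = ConfigResult(
    success=False,
    config=None,
    section=None,
    error={"code": "INTERNAL_ERROR", "message": "boom"},
)

print(type(ok).__name__)                      # a plain dict at runtime
print(sorted(ConfigResult.__required_keys__))
print(sorted(ErrorInfo.__optional_keys__))
```

Because the result is an ordinary dict, callers that previously indexed the returned dict keep working; the visible difference is that `config`, `section`, and `error` are now always present.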

+ 22 - 2
mcp_server/utils/errors.py

@@ -4,7 +4,25 @@
 Defines all custom exception types used by the MCP Server.
 """
 
-from typing import Optional
+from typing import Optional, List, Callable
+
+
+# ==================== Lazily loaded list of supported platforms ====================
+
+_get_supported_platforms: Optional[Callable[[], List[str]]] = None
+
+
+def _load_supported_platforms() -> List[str]:
+    """Lazily load the list of supported platforms."""
+    global _get_supported_platforms
+    if _get_supported_platforms is None:
+        try:
+            from .validators import get_supported_platforms
+            _get_supported_platforms = get_supported_platforms
+        except ImportError:
+            # Fallback: return an empty list
+            return []
+    return _get_supported_platforms()
 
 
 class MCPError(Exception):
@@ -64,10 +82,12 @@ class PlatformNotSupportedError(MCPError):
     """Raised when a platform is not supported."""
 
     def __init__(self, platform: str):
+        supported = _load_supported_platforms()
+        suggestion = f"支持的平台: {', '.join(supported)}" if supported else "请检查 config/config.yaml 中的平台配置"
         super().__init__(
             message=f"平台 '{platform}' 不受支持",
             code="PLATFORM_NOT_SUPPORTED",
-            suggestion="支持的平台: zhihu, weibo, douyin, bilibili, baidu, toutiao, qq, 36kr, sspai, hellogithub, thepaper"
+            suggestion=suggestion
         )
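
The change above replaces a hard-coded platform list with a lazily imported provider and a fallback message. The sketch below shows the same pattern in a self-contained form. It is an illustration under assumptions: `module_name` is passed as a parameter and `importlib` is used so the example runs without the package, whereas the diff uses a relative import of `.validators`. The English messages are also illustrative.

```python
import importlib
from typing import Callable, List, Optional

# Cached provider; stays None until the first successful import.
_provider: Optional[Callable[[], List[str]]] = None


def load_supported_platforms(module_name: str) -> List[str]:
    """Resolve the provider on first use; return [] if it cannot be imported."""
    global _provider
    if _provider is None:
        try:
            module = importlib.import_module(module_name)
            _provider = getattr(module, "get_supported_platforms")
        except (ImportError, AttributeError):
            return []  # degrade gracefully; the next call retries the import
    return _provider()


def build_suggestion(supported: List[str]) -> str:
    """Build the hint shown to the user, with a generic fallback."""
    if supported:
        return "Supported platforms: " + ", ".join(supported)
    return "Check the platform settings in config/config.yaml"


print(build_suggestion(load_supported_platforms("module_that_does_not_exist_demo")))
```

Deferring the import to the first call avoids an import cycle at module load time, and the fallback keeps the error usable even when the platform list cannot be resolved.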
 
 

+ 1 - 1
pyproject.toml

@@ -1,6 +1,6 @@
 [project]
 name = "trendradar"
-version = "5.0.0"
+version = "5.2.0"
 description = "TrendRadar - 热点新闻聚合与分析工具"
 requires-python = ">=3.10"
 dependencies = [

+ 1 - 1
trendradar/__init__.py

@@ -9,5 +9,5 @@ TrendRadar - 热点新闻聚合与分析工具
 
 from trendradar.context import AppContext
 
-__version__ = "5.0.0"
+__version__ = "5.2.0"
 __all__ = ["AppContext", "__version__"]

+ 227 - 268
trendradar/__main__.py

@@ -76,29 +76,20 @@ class NewsAnalyzer:
         "incremental": {
             "mode_name": "增量模式",
             "description": "增量模式(只关注新增新闻,无新增时不推送)",
-            "realtime_report_type": "实时增量",
-            "summary_report_type": "当日汇总",
-            "should_send_realtime": True,
-            "should_generate_summary": True,
-            "summary_mode": "daily",
+            "report_type": "增量分析",
+            "should_send_notification": True,
         },
         "current": {
             "mode_name": "当前榜单模式",
             "description": "当前榜单模式(当前榜单匹配新闻 + 新增新闻区域 + 按时推送)",
-            "realtime_report_type": "实时当前榜单",
-            "summary_report_type": "当前榜单汇总",
-            "should_send_realtime": True,
-            "should_generate_summary": True,
-            "summary_mode": "current",
+            "report_type": "当前榜单",
+            "should_send_notification": True,
         },
         "daily": {
-            "mode_name": "当日汇总模式",
-            "description": "当日汇总模式(所有匹配新闻 + 新增新闻区域 + 按时推送)",
-            "realtime_report_type": "",
-            "summary_report_type": "当日汇总",
-            "should_send_realtime": False,
-            "should_generate_summary": True,
-            "summary_mode": "daily",
+            "mode_name": "全天汇总模式",
+            "description": "全天汇总模式(所有匹配新闻 + 新增新闻区域 + 按时推送)",
+            "report_type": "全天汇总",
+            "should_send_notification": True,
         },
     }
 
@@ -210,6 +201,7 @@ class NewsAnalyzer:
                 (cfg["NTFY_SERVER_URL"] and cfg["NTFY_TOPIC"]),
                 cfg["BARK_URL"],
                 cfg["SLACK_WEBHOOK_URL"],
+                cfg["GENERIC_WEBHOOK_URL"],
             ]
         )
 
@@ -244,13 +236,15 @@ class NewsAnalyzer:
         id_to_name: Optional[Dict],
     ) -> Optional[AIAnalysisResult]:
         """Run the AI analysis."""
-        ai_config = self.ctx.config.get("AI_ANALYSIS", {})
-        if not ai_config.get("ENABLED", False):
+        analysis_config = self.ctx.config.get("AI_ANALYSIS", {})
+        if not analysis_config.get("ENABLED", False):
             return None
 
         print("[AI] 正在进行 AI 分析...")
         try:
-            analyzer = AIAnalyzer(ai_config, self.ctx.get_time)
+            ai_config = self.ctx.config.get("AI", {})
+            debug_mode = self.ctx.config.get("DEBUG", False)
+            analyzer = AIAnalyzer(ai_config, analysis_config, self.ctx.get_time, debug=debug_mode)
 
             # Extract the platform list
             platforms = list(id_to_name.values()) if id_to_name else []
@@ -369,8 +363,11 @@ class NewsAnalyzer:
         Returns:
             Standalone display data dict, or None if the feature is disabled
         """
-        standalone_config = self.ctx.config.get("STANDALONE_DISPLAY", {})
-        if not standalone_config.get("ENABLED", False):
+        display_config = self.ctx.config.get("DISPLAY", {})
+        regions = display_config.get("REGIONS", {})
+        standalone_config = display_config.get("STANDALONE", {})
+
+        if not regions.get("STANDALONE", False):
             return None
 
         platform_ids = standalone_config.get("PLATFORMS", [])
@@ -504,13 +501,13 @@ class NewsAnalyzer:
         filter_words: List[str],
         id_to_name: Dict,
         failed_ids: Optional[List] = None,
-        is_daily_summary: bool = False,
         global_filters: Optional[List[str]] = None,
         quiet: bool = False,
         rss_items: Optional[List[Dict]] = None,
         rss_new_items: Optional[List[Dict]] = None,
-    ) -> Tuple[List[Dict], Optional[str]]:
-        """统一的分析流水线:数据处理 → 统计计算 → HTML生成"""
+        standalone_data: Optional[Dict] = None,
+    ) -> Tuple[List[Dict], Optional[str], Optional[AIAnalysisResult]]:
+        """Unified analysis pipeline: data processing → statistics → AI analysis → HTML generation"""
 
         # Compute statistics (via AppContext)
         stats, total_titles = self.ctx.count_frequency(
@@ -533,6 +530,17 @@ class NewsAnalyzer:
                 self.ctx.rank_threshold,
             )
 
+        # AI analysis (if enabled; the result is reused by the HTML report)
+        ai_result = None
+        ai_config = self.ctx.config.get("AI_ANALYSIS", {})
+        if ai_config.get("ENABLED", False) and stats:
+            # Look up the mode strategy to determine the report type
+            mode_strategy = self._get_mode_strategy()
+            report_type = mode_strategy["report_type"]
+            ai_result = self._run_ai_analysis(
+                stats, rss_items, mode, report_type, id_to_name
+            )
+
         # Generate HTML (if enabled)
         html_file = None
         if self.ctx.config["STORAGE"]["FORMATS"]["HTML"]:
@@ -543,13 +551,14 @@ class NewsAnalyzer:
                 new_titles=new_titles,
                 id_to_name=id_to_name,
                 mode=mode,
-                is_daily_summary=is_daily_summary,
                 update_info=self.update_info if self.ctx.config["SHOW_VERSION_UPDATE"] else None,
                 rss_items=rss_items,
                 rss_new_items=rss_new_items,
+                ai_analysis=ai_result,
+                standalone_data=standalone_data,
             )
 
-        return stats, html_file
+        return stats, html_file, ai_result
 
     def _send_notification_if_needed(
         self,
@@ -563,6 +572,7 @@ class NewsAnalyzer:
         rss_items: Optional[List[Dict]] = None,
         rss_new_items: Optional[List[Dict]] = None,
         standalone_data: Optional[Dict] = None,
+        ai_result: Optional[AIAnalysisResult] = None,
     ) -> bool:
         """Unified notification logic covering every send condition; supports merged hot-list + RSS pushes, AI analysis, and the standalone display area"""
         has_notification = self._has_notification_configured()
@@ -612,13 +622,13 @@ class NewsAnalyzer:
                     else:
                         print(f"推送窗口控制:今天首次推送")
 
-            # AI 分析(如果启用)
-            ai_result = None
-            ai_config = cfg.get("AI_ANALYSIS", {})
-            if ai_config.get("ENABLED", False):
-                ai_result = self._run_ai_analysis(
-                    stats, rss_items, mode, report_type, id_to_name
-                )
+            # AI analysis: prefer the result passed in, to avoid running the analysis twice
+            if ai_result is None:
+                ai_config = cfg.get("AI_ANALYSIS", {})
+                if ai_config.get("ENABLED", False):
+                    ai_result = self._run_ai_analysis(
+                        stats, rss_items, mode, report_type, id_to_name
+                    )
 
             # Prepare the report data
             report_data = self.ctx.prepare_report(stats, failed_ids, new_titles, id_to_name, mode)
@@ -666,125 +676,21 @@ class NewsAnalyzer:
             and not has_any_content
         ):
             mode_strategy = self._get_mode_strategy()
-            if "实时" in report_type:
-                if self.report_mode == "incremental":
-                    has_new = bool(
-                        new_titles and any(len(titles) > 0 for titles in new_titles.values())
-                    )
-                    if not has_new and not has_rss_content:
-                        print("跳过实时推送通知:增量模式下未检测到新增的新闻和RSS")
-                    elif not has_new:
-                        print("跳过实时推送通知:增量模式下新增新闻未匹配到关键词")
-                else:
-                    print(
-                        f"跳过实时推送通知:{mode_strategy['mode_name']}下未检测到匹配的新闻"
-                    )
+            if self.report_mode == "incremental":
+                has_new = bool(
+                    new_titles and any(len(titles) > 0 for titles in new_titles.values())
+                )
+                if not has_new and not has_rss_content:
+                    print("跳过通知:增量模式下未检测到新增的新闻和RSS")
+                elif not has_new:
+                    print("跳过通知:增量模式下新增新闻未匹配到关键词")
             else:
                 print(
-                    f"跳过{mode_strategy['summary_report_type']}通知:未匹配到有效的新闻内容"
+                    f"跳过通知:{mode_strategy['mode_name']}下未检测到匹配的新闻"
                 )
 
         return False
 
-    def _generate_summary_report(
-        self,
-        mode_strategy: Dict,
-        rss_items: Optional[List[Dict]] = None,
-        rss_new_items: Optional[List[Dict]] = None,
-    ) -> Optional[str]:
-        """生成汇总报告(带通知,支持RSS合并)"""
-        summary_type = (
-            "当前榜单汇总" if mode_strategy["summary_mode"] == "current" else "当日汇总"
-        )
-        print(f"生成{summary_type}报告...")
-
-        # 加载分析数据
-        analysis_data = self._load_analysis_data()
-        if not analysis_data:
-            return None
-
-        all_results, id_to_name, title_info, new_titles, word_groups, filter_words, global_filters = (
-            analysis_data
-        )
-
-        # 运行分析流水线
-        stats, html_file = self._run_analysis_pipeline(
-            all_results,
-            mode_strategy["summary_mode"],
-            title_info,
-            new_titles,
-            word_groups,
-            filter_words,
-            id_to_name,
-            is_daily_summary=True,
-            global_filters=global_filters,
-            rss_items=rss_items,
-            rss_new_items=rss_new_items,
-        )
-
-        if html_file:
-            print(f"{summary_type}报告已生成: {html_file}")
-
-        # 准备独立展示区数据
-        standalone_data = self._prepare_standalone_data(
-            all_results, id_to_name, title_info, rss_items
-        )
-
-        # 发送通知(合并RSS+独立展示区)
-        self._send_notification_if_needed(
-            stats,
-            mode_strategy["summary_report_type"],
-            mode_strategy["summary_mode"],
-            failed_ids=[],
-            new_titles=new_titles,
-            id_to_name=id_to_name,
-            html_file_path=html_file,
-            rss_items=rss_items,
-            rss_new_items=rss_new_items,
-            standalone_data=standalone_data,
-        )
-
-        return html_file
-
-    def _generate_summary_html(
-        self,
-        mode: str = "daily",
-        rss_items: Optional[List[Dict]] = None,
-        rss_new_items: Optional[List[Dict]] = None,
-    ) -> Optional[str]:
-        """生成汇总HTML"""
-        summary_type = "当前榜单汇总" if mode == "current" else "当日汇总"
-        print(f"生成{summary_type}HTML...")
-
-        # 加载分析数据(静默模式,避免重复输出日志)
-        analysis_data = self._load_analysis_data(quiet=True)
-        if not analysis_data:
-            return None
-
-        all_results, id_to_name, title_info, new_titles, word_groups, filter_words, global_filters = (
-            analysis_data
-        )
-
-        # 运行分析流水线(静默模式,避免重复输出日志)
-        _, html_file = self._run_analysis_pipeline(
-            all_results,
-            mode,
-            title_info,
-            new_titles,
-            word_groups,
-            filter_words,
-            id_to_name,
-            is_daily_summary=True,
-            global_filters=global_filters,
-            quiet=True,
-            rss_items=rss_items,
-            rss_new_items=rss_new_items,
-        )
-
-        if html_file:
-            print(f"{summary_type}HTML已生成: {html_file}")
-        return html_file
-
     def _initialize_and_check_config(self) -> None:
         """Common initialization and config checks"""
         now = self.ctx.get_time()
@@ -848,23 +754,24 @@ class NewsAnalyzer:
 
         return results, id_to_name, failed_ids
 
-    def _crawl_rss_data(self) -> Tuple[Optional[List[Dict]], Optional[List[Dict]]]:
+    def _crawl_rss_data(self) -> Tuple[Optional[List[Dict]], Optional[List[Dict]], Optional[List[Dict]]]:
         """
         Crawl the RSS data
 
         Returns:
-            (rss_items, rss_new_items) 元组:
+            (rss_items, rss_new_items, raw_rss_items) tuple:
             - rss_items: List of stats entries (processed per report mode, used for the stats block)
             - rss_new_items: List of new entries (used for the new-items block)
-            如果未启用或失败返回 (None, None)
+            - raw_rss_items: List of raw RSS items (used for the standalone display area)
+            Returns (None, None, None) if RSS is disabled or the crawl fails
         """
         if not self.ctx.rss_enabled:
-            return None, None
+            return None, None, None
 
         rss_feeds = self.ctx.rss_feeds
         if not rss_feeds:
             print("[RSS] 未配置任何 RSS 源")
-            return None, None
+            return None, None, None
 
         try:
             from trendradar.crawler.rss import RSSFetcher, RSSFeedConfig
@@ -900,7 +807,7 @@ class NewsAnalyzer:
 
             if not feeds:
                 print("[RSS] 没有启用的 RSS 源")
-                return None, None
+                return None, None, None
 
             # 创建抓取器
             rss_config = self.ctx.rss_config
@@ -935,17 +842,17 @@ class NewsAnalyzer:
                 return self._process_rss_data_by_mode(rss_data)
             else:
                 print(f"[RSS] 数据保存失败")
-                return None, None
+                return None, None, None
 
         except ImportError as e:
             print(f"[RSS] 缺少依赖: {e}")
             print("[RSS] 请安装 feedparser: pip install feedparser")
-            return None, None
+            return None, None, None
         except Exception as e:
             print(f"[RSS] 抓取失败: {e}")
-            return None, None
+            return None, None, None
 
-    def _process_rss_data_by_mode(self, rss_data) -> Tuple[Optional[List[Dict]], Optional[List[Dict]]]:
+    def _process_rss_data_by_mode(self, rss_data) -> Tuple[Optional[List[Dict]], Optional[List[Dict]], Optional[List[Dict]]]:
         """
         Process RSS data according to the report mode and return statistics in the same structure as the hot list
 
@@ -958,17 +865,15 @@ class NewsAnalyzer:
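             rss_data: the RSSData object from the current fetch
 
         Returns:
-            A (rss_stats, rss_new_stats) tuple:
+            A (rss_stats, rss_new_stats, raw_rss_items) tuple:
             - rss_stats: RSS keyword statistics (same format as the hot-list stats)
             - rss_new_stats: RSS keyword statistics for new items (same format as the hot-list stats)
+            - raw_rss_items: raw RSS items (shown in the standalone display section)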
         """
         from trendradar.core.analyzer import count_rss_frequency
 
-        rss_config = self.ctx.rss_config
-
-        # Check whether RSS notifications are enabled
-        if not rss_config.get("NOTIFICATION", {}).get("ENABLED", False):
-            return None, None
+        # display.regions.rss is the single switch for RSS analysis and display
+        rss_display_enabled = self.ctx.config.get("DISPLAY", {}).get("REGIONS", {}).get("RSS", True)
 
         # Load the keyword configuration
         try:
@@ -982,8 +887,28 @@ class NewsAnalyzer:
 
         rss_stats = None
         rss_new_stats = None
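+        raw_rss_items = None  # raw RSS items (for the standalone display section)
+
+        # 1. Fetch the raw items first (for the standalone display section; not affected by display.regions.rss)
+        # Which raw items are fetched depends on the report mode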
+        if self.report_mode == "incremental":
+            new_items_dict = self.storage_manager.detect_new_rss_items(rss_data)
+            if new_items_dict:
+                raw_rss_items = self._convert_rss_items_to_list(new_items_dict, rss_data.id_to_name)
+        elif self.report_mode == "current":
+            latest_data = self.storage_manager.get_latest_rss_data(rss_data.date)
+            if latest_data:
+                raw_rss_items = self._convert_rss_items_to_list(latest_data.items, latest_data.id_to_name)
+        else:  # daily
+            all_data = self.storage_manager.get_rss_data(rss_data.date)
+            if all_data:
+                raw_rss_items = self._convert_rss_items_to_list(all_data.items, all_data.id_to_name)
 
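-        # 1. Fetch the new items first (needed in every mode)
+        # If RSS display is disabled, skip keyword analysis and return only the raw items for the standalone display section
+        if not rss_display_enabled:
+            return None, None, raw_rss_items
+
+        # 2. Fetch the new items (used for statistics)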
         new_items_dict = self.storage_manager.detect_new_rss_items(rss_data)
         new_items_list = None
         if new_items_dict:
@@ -991,12 +916,12 @@ class NewsAnalyzer:
             if new_items_list:
                 print(f"[RSS] 检测到 {len(new_items_list)} 条新增")
 
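-        # 2. Fetch the items for statistics according to the mode
+        # 3. Fetch the items for statistics according to the mode
         if self.report_mode == "incremental":
             # Incremental mode: the items for statistics are the new items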
             if not new_items_list:
                 print("[RSS] 增量模式:没有新增 RSS 条目")
-                return None, None
+                return None, None, raw_rss_items
 
             rss_stats, total = count_rss_frequency(
                 rss_items=new_items_list,
@@ -1012,18 +937,18 @@ class NewsAnalyzer:
             )
             if not rss_stats:
                 print("[RSS] 增量模式:关键词匹配后没有内容")
-                return None, None
+                # Even if keyword matching yields nothing, return the raw items for the standalone display section
+                return None, None, raw_rss_items
 
         elif self.report_mode == "current":
             # Current-ranking mode: statistics cover every item currently on the list
-            latest_data = self.storage_manager.get_latest_rss_data(rss_data.date)
-            if not latest_data:
+            # raw_rss_items was already fetched above
+            if not raw_rss_items:
                 print("[RSS] 当前榜单模式:没有 RSS 数据")
-                return None, None
+                return None, None, None
 
-            all_items_list = self._convert_rss_items_to_list(latest_data.items, latest_data.id_to_name)
             rss_stats, total = count_rss_frequency(
-                rss_items=all_items_list,
+                rss_items=raw_rss_items,
                 word_groups=word_groups,
                 filter_words=filter_words,
                 global_filters=global_filters,
@@ -1036,7 +961,8 @@ class NewsAnalyzer:
             )
             if not rss_stats:
                 print("[RSS] 当前榜单模式:关键词匹配后没有内容")
-                return None, None
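+                # Even if keyword matching yields nothing, return the raw items for the standalone display section
+                return None, None, raw_rss_items
 
             # Build statistics for the new items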
             if new_items_list:
@@ -1055,14 +981,13 @@ class NewsAnalyzer:
 
         else:
             # daily mode: statistics cover all of today's items
-            all_data = self.storage_manager.get_rss_data(rss_data.date)
-            if not all_data:
+            # raw_rss_items was already fetched above
+            if not raw_rss_items:
                 print("[RSS] 当日汇总模式:没有 RSS 数据")
-                return None, None
+                return None, None, None
 
-            all_items_list = self._convert_rss_items_to_list(all_data.items, all_data.id_to_name)
             rss_stats, total = count_rss_frequency(
-                rss_items=all_items_list,
+                rss_items=raw_rss_items,
                 word_groups=word_groups,
                 filter_words=filter_words,
                 global_filters=global_filters,
@@ -1075,7 +1000,8 @@ class NewsAnalyzer:
             )
             if not rss_stats:
                 print("[RSS] 当日汇总模式:关键词匹配后没有内容")
-                return None, None
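+                # Even if keyword matching yields nothing, return the raw items for the standalone display section
+                return None, None, raw_rss_items
 
             # Build statistics for the new items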
             if new_items_list:
@@ -1092,7 +1018,7 @@ class NewsAnalyzer:
                     quiet=True,
                 )
 
-        return rss_stats, rss_new_stats
+        return rss_stats, rss_new_stats, raw_rss_items
 
     def _convert_rss_items_to_list(self, items_dict: Dict, id_to_name: Dict) -> List[Dict]:
         """Convert the RSS item dict to a list and apply the freshness filter (used for pushing)"""
@@ -1170,12 +1096,6 @@ class NewsAnalyzer:
             pass
         return rss_items
 
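-    def _process_rss_report_and_notification(self, rss_data) -> None:
-        """Handle RSS report generation and notification sending (standalone push; deprecated)"""
-        # Kept for backward compatibility but no longer used
-        # RSS is now pushed together with the hot list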
-        pass
-
     def _generate_rss_html_report(self, rss_items: list, feeds_info: dict) -> str:
         """Generate the RSS HTML report"""
         try:
@@ -1189,10 +1109,10 @@ class NewsAnalyzer:
                 get_time_func=self.ctx.get_time,
             )
 
-            # Save the HTML file
+            # Save the HTML file (flattened layout: output/html/<date>/)
             date_folder = self.ctx.format_date()
             time_filename = self.ctx.format_time()
-            output_dir = Path("output") / date_folder / "html"
+            output_dir = Path("output") / "html" / date_folder
             output_dir.mkdir(parents=True, exist_ok=True)
 
             file_path = output_dir / f"rss_{time_filename}.html"
@@ -1210,8 +1130,14 @@ class NewsAnalyzer:
         self, mode_strategy: Dict, results: Dict, id_to_name: Dict, failed_ids: List,
         rss_items: Optional[List[Dict]] = None,
         rss_new_items: Optional[List[Dict]] = None,
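+        raw_rss_items: Optional[List[Dict]] = None,
     ) -> Optional[str]:
-        """Run the mode-specific logic; supports a combined hot-list + RSS push"""
+        """Run the mode-specific logic; supports a combined hot-list + RSS push
+
+        Simplified logic:
+        - Every run generates an HTML report (timestamped snapshot + latest/{mode}.html + index.html)
+        - Notifications are sent according to the mode
+        """
         # Get the list of currently monitored platform IDs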
         current_platform_ids = self.ctx.platform_ids
 
@@ -1221,9 +1147,13 @@ class NewsAnalyzer:
             self.ctx.save_titles(results, id_to_name, failed_ids)
         word_groups, filter_words, global_filters = self.ctx.load_frequency_words()
 
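-        # In current mode, realtime pushes need the full historical data so that the statistics stay complete
+        html_file = None
+        stats = []
+        ai_result = None
+        title_info = None
+
+        # current mode needs the full historical data
         if self.report_mode == "current":
-            # Load the full historical data (already filtered by the current platforms)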
             analysis_data = self._load_analysis_data()
             if analysis_data:
                 (
@@ -1240,7 +1170,12 @@ class NewsAnalyzer:
                     f"current模式:使用过滤后的历史数据,包含平台:{list(all_results.keys())}"
                 )
 
-                stats, html_file = self._run_analysis_pipeline(
+                # Prepare the standalone display data from historical data (includes the full title_info)
+                standalone_data = self._prepare_standalone_data(
+                    all_results, historical_id_to_name, historical_title_info, raw_rss_items
+                )
+
+                stats, html_file, ai_result = self._run_analysis_pipeline(
                     all_results,
                     self.report_mode,
                     historical_title_info,
@@ -1252,38 +1187,83 @@ class NewsAnalyzer:
                     global_filters=global_filters,
                     rss_items=rss_items,
                     rss_new_items=rss_new_items,
+                    standalone_data=standalone_data,
                 )
 
                 combined_id_to_name = {**historical_id_to_name, **id_to_name}
-
-                if html_file:
-                    print(f"HTML报告已生成: {html_file}")
-
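-                # Send the realtime notification (uses statistics from the full historical data; merges RSS + standalone display section)
-                summary_html = None
-                if mode_strategy["should_send_realtime"]:
-                    # Prepare the standalone display data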
-                    standalone_data = self._prepare_standalone_data(
-                        all_results, combined_id_to_name, historical_title_info, rss_items
-                    )
-                    self._send_notification_if_needed(
-                        stats,
-                        mode_strategy["realtime_report_type"],
-                        self.report_mode,
-                        failed_ids=failed_ids,
-                        new_titles=historical_new_titles,
-                        id_to_name=combined_id_to_name,
-                        html_file_path=html_file,
-                        rss_items=rss_items,
-                        rss_new_items=rss_new_items,
-                        standalone_data=standalone_data,
-                    )
+                new_titles = historical_new_titles
+                id_to_name = combined_id_to_name
+                title_info = historical_title_info
+                results = all_results
             else:
                 print("❌ 严重错误:无法读取刚保存的数据文件")
                 raise RuntimeError("数据一致性检查失败:保存后立即读取失败")
+        elif self.report_mode == "daily":
+            # daily mode: use the data accumulated over the whole day
+            analysis_data = self._load_analysis_data()
+            if analysis_data:
+                (
+                    all_results,
+                    historical_id_to_name,
+                    historical_title_info,
+                    historical_new_titles,
+                    _,
+                    _,
+                    _,
+                ) = analysis_data
+
+                # Prepare the standalone display data from historical data (includes the full title_info)
+                standalone_data = self._prepare_standalone_data(
+                    all_results, historical_id_to_name, historical_title_info, raw_rss_items
+                )
+
+                stats, html_file, ai_result = self._run_analysis_pipeline(
+                    all_results,
+                    self.report_mode,
+                    historical_title_info,
+                    historical_new_titles,
+                    word_groups,
+                    filter_words,
+                    historical_id_to_name,
+                    failed_ids=failed_ids,
+                    global_filters=global_filters,
+                    rss_items=rss_items,
+                    rss_new_items=rss_new_items,
+                    standalone_data=standalone_data,
+                )
+
+                combined_id_to_name = {**historical_id_to_name, **id_to_name}
+                new_titles = historical_new_titles
+                id_to_name = combined_id_to_name
+                title_info = historical_title_info
+                results = all_results
+            else:
+                # Fall back to the current data when there is no historical data
+                title_info = self._prepare_current_title_info(results, time_info)
+                standalone_data = self._prepare_standalone_data(
+                    results, id_to_name, title_info, raw_rss_items
+                )
+                stats, html_file, ai_result = self._run_analysis_pipeline(
+                    results,
+                    self.report_mode,
+                    title_info,
+                    new_titles,
+                    word_groups,
+                    filter_words,
+                    id_to_name,
+                    failed_ids=failed_ids,
+                    global_filters=global_filters,
+                    rss_items=rss_items,
+                    rss_new_items=rss_new_items,
+                    standalone_data=standalone_data,
+                )
         else:
+            # incremental mode: use only the data from the current crawl
             title_info = self._prepare_current_title_info(results, time_info)
-            stats, html_file = self._run_analysis_pipeline(
+            standalone_data = self._prepare_standalone_data(
+                results, id_to_name, title_info, raw_rss_items
+            )
+            stats, html_file, ai_result = self._run_analysis_pipeline(
                 results,
                 self.report_mode,
                 title_info,
@@ -1295,63 +1275,41 @@ class NewsAnalyzer:
                 global_filters=global_filters,
                 rss_items=rss_items,
                 rss_new_items=rss_new_items,
+                standalone_data=standalone_data,
             )
-            if html_file:
-                print(f"HTML报告已生成: {html_file}")
 
-            # Send the realtime notification (if needed; merges RSS + standalone display section)
-            summary_html = None
-            if mode_strategy["should_send_realtime"]:
-                # Prepare the standalone display data
-                standalone_data = self._prepare_standalone_data(
-                    results, id_to_name, title_info, rss_items
-                )
-                self._send_notification_if_needed(
-                    stats,
-                    mode_strategy["realtime_report_type"],
-                    self.report_mode,
-                    failed_ids=failed_ids,
-                    new_titles=new_titles,
-                    id_to_name=id_to_name,
-                    html_file_path=html_file,
-                    rss_items=rss_items,
-                    rss_new_items=rss_new_items,
-                    standalone_data=standalone_data,
-                )
+        if html_file:
+            print(f"HTML报告已生成: {html_file}")
+            print(f"最新报告已更新: output/html/latest/{self.report_mode}.html")
 
-        # Generate the summary report (if needed)
-        summary_html = None
-        if mode_strategy["should_generate_summary"]:
-            if mode_strategy["should_send_realtime"]:
-                # If a realtime notification was already sent, only generate the summary HTML without notifying
-                summary_html = self._generate_summary_html(
-                    mode_strategy["summary_mode"],
-                    rss_items=rss_items,
-                    rss_new_items=rss_new_items,
-                )
-            else:
-                # daily mode: generate the summary report directly and send the notification (RSS merged)
-                summary_html = self._generate_summary_report(
-                    mode_strategy, rss_items=rss_items, rss_new_items=rss_new_items
-                )
+        # Send the notification
+        if mode_strategy["should_send_notification"]:
+            standalone_data = self._prepare_standalone_data(
+                results, id_to_name, title_info, raw_rss_items
+            )
+            self._send_notification_if_needed(
+                stats,
+                mode_strategy["report_type"],
+                self.report_mode,
+                failed_ids=failed_ids,
+                new_titles=new_titles,
+                id_to_name=id_to_name,
+                html_file_path=html_file,
+                rss_items=rss_items,
+                rss_new_items=rss_new_items,
+                standalone_data=standalone_data,
+                ai_result=ai_result,
+            )
 
         # Open the browser (only outside container environments)
         if self._should_open_browser() and html_file:
-            if summary_html:
-                summary_url = "file://" + str(Path(summary_html).resolve())
-                print(f"正在打开汇总报告: {summary_url}")
-                webbrowser.open(summary_url)
-            else:
-                file_url = "file://" + str(Path(html_file).resolve())
-                print(f"正在打开HTML报告: {file_url}")
-                webbrowser.open(file_url)
+            file_url = "file://" + str(Path(html_file).resolve())
+            print(f"正在打开HTML报告: {file_url}")
+            webbrowser.open(file_url)
         elif self.is_docker_container and html_file:
-            if summary_html:
-                print(f"汇总报告已生成(Docker环境): {summary_html}")
-            else:
-                print(f"HTML报告已生成(Docker环境): {html_file}")
+            print(f"HTML报告已生成(Docker环境): {html_file}")
 
-        return summary_html
+        return html_file
 
     def run(self) -> None:
         """Run the analysis workflow"""
@@ -1363,13 +1321,14 @@ class NewsAnalyzer:
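             # Crawl the hot-list data
             results, id_to_name, failed_ids = self._crawl_data()
 
-            # Fetch RSS data (if enabled); returns the statistics items and new items for the combined push
-            rss_items, rss_new_items = self._crawl_rss_data()
+            # Fetch RSS data (if enabled); returns the statistics items, new items, and raw items
+            rss_items, rss_new_items, raw_rss_items = self._crawl_rss_data()
 
             # Execute the mode strategy, passing the RSS data for the combined push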
             self._execute_mode_strategy(
                 mode_strategy, results, id_to_name, failed_ids,
-                rss_items=rss_items, rss_new_items=rss_new_items
+                rss_items=rss_items, rss_new_items=rss_new_items,
+                raw_rss_items=raw_rss_items
             )
 
         except Exception as e:
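
The hunks above change the RSS return contract from a two-tuple to a three-tuple, with the raw items passed through even when keyword analysis is skipped or matches nothing. A minimal sketch of that contract for the current and daily modes follows. The function name `process_rss`, its parameters, and the substring match are illustrative assumptions standing in for `_process_rss_data_by_mode` and `count_rss_frequency`; they are not the project's actual API.

```python
from typing import Dict, List, Optional, Tuple

Items = Optional[List[Dict]]


def process_rss(
    raw_items: Items, display_enabled: bool, keywords: List[str]
) -> Tuple[Items, Items, Items]:
    """Hypothetical stand-in for the three-tuple contract (current/daily modes)."""
    # Display region disabled: skip keyword analysis, still pass raw items through.
    if not display_enabled:
        return None, None, raw_items
    # No stored data at all: every slot is None.
    if not raw_items:
        return None, None, None
    # Simplified substring match standing in for the real keyword statistics.
    matched = [
        item for item in raw_items
        if any(k in item.get("title", "") for k in keywords)
    ]
    if not matched:
        # Nothing matched, but the raw items remain available for the standalone section.
        return None, None, raw_items
    # The new-item statistics slot is left as None in this sketch.
    return matched, None, raw_items
```

Callers should unpack all three values and treat each slot as independently optional, as `run()` does in the diff.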

+ 11 - 2
trendradar/ai/__init__.py

@@ -1,27 +1,36 @@
 # coding=utf-8
 """
-TrendRadar AI analysis module
+TrendRadar AI module
 
-Provides in-depth LLM analysis of trending news
+Provides in-depth LLM analysis and translation of trending news
 """
 
 from .analyzer import AIAnalyzer, AIAnalysisResult
+from .translator import AITranslator, TranslationResult, BatchTranslationResult
 from .formatter import (
     get_ai_analysis_renderer,
     render_ai_analysis_markdown,
     render_ai_analysis_feishu,
     render_ai_analysis_dingtalk,
     render_ai_analysis_html,
+    render_ai_analysis_html_rich,
     render_ai_analysis_plain,
 )
 
 __all__ = [
+    # Analyzer
     "AIAnalyzer",
     "AIAnalysisResult",
+    # Translator
+    "AITranslator",
+    "TranslationResult",
+    "BatchTranslationResult",
+    # Formatters
     "get_ai_analysis_renderer",
     "render_ai_analysis_markdown",
     "render_ai_analysis_feishu",
     "render_ai_analysis_dingtalk",
     "render_ai_analysis_html",
+    "render_ai_analysis_html_rich",
     "render_ai_analysis_plain",
 ]
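
The `trendradar/ai/analyzer.py` diff in this commit adds a `_format_rank_timeline` helper that turns a list of `{"time", "rank"}` entries into a compact trajectory string. The standalone sketch below mirrors that logic as a free function so the output format is easy to see; the function name and the sample data are illustrative assumptions, not part of the commit.

```python
from typing import Dict, List


def format_rank_timeline(rank_timeline: List[Dict]) -> str:
    """Render entries as 'rank(HH:MM)' joined by arrows; '-' when empty."""
    if not rank_timeline:
        return "-"
    parts = []
    for item in rank_timeline:
        time_str = item.get("time", "")
        # Stored times may use HH-MM; display them as HH:MM.
        if len(time_str) == 5 and time_str[2] == "-":
            time_str = time_str.replace("-", ":")
        rank = item.get("rank")
        # A missing rank is rendered as 0, as in the diff.
        parts.append(f"{0 if rank is None else rank}({time_str})")
    return "→".join(parts)


print(format_rank_timeline([
    {"time": "09-30", "rank": 3},
    {"time": "10-00", "rank": None},
]))
```

In the diff, this string is appended to each hot-list line only when `INCLUDE_RANK_TIMELINE` is enabled in the analysis configuration.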

+ 175 - 97
trendradar/ai/analyzer.py

@@ -16,16 +16,18 @@ from typing import Any, Callable, Dict, List, Optional
 @dataclass
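 class AIAnalysisResult:
     """AI analysis result"""
-    summary: str = ""                    # Overview of trending topics
-    keyword_analysis: str = ""           # Keyword popularity analysis
-    sentiment: str = ""                  # Sentiment analysis
-    cross_platform: str = ""             # Cross-platform correlation
-    impact: str = ""                     # Potential impact assessment
-    signals: str = ""                    # Signals worth watching
-    conclusion: str = ""                 # Summary and recommendations
+    # New five core sections
+    core_trends: str = ""                # Core hot topics and public-opinion landscape
+    sentiment_controversy: str = ""      # Public sentiment and controversies
+    signals: str = ""                    # Anomalies and weak signals
+    rss_insights: str = ""               # In-depth RSS insights
+    outlook_strategy: str = ""           # Assessment and strategy recommendations
+
+    # Basic metadata
     raw_response: str = ""               # Raw response
     success: bool = False                # Whether the call succeeded
     error: str = ""                      # Error message
+
     # News count statistics
     total_news: int = 0                  # Total news items (hot list + RSS)
     analyzed_news: int = 0               # News items actually analyzed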
@@ -37,30 +39,57 @@ class AIAnalysisResult:
 class AIAnalyzer:
     """AI analyzer"""
 
-    def __init__(self, config: Dict[str, Any], get_time_func: Callable):
+    def __init__(
+        self,
+        ai_config: Dict[str, Any],
+        analysis_config: Dict[str, Any],
+        get_time_func: Callable,
+        debug: bool = False,
+    ):
         """
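         Initialize the AI analyzer
 
         Args:
-            config: AI analysis configuration
+            ai_config: shared AI model configuration (provider, api_key, model, etc.)
+            analysis_config: AI analysis feature configuration (language, prompt_file, etc.)
             get_time_func: function that returns the current time
+            debug: whether to enable debug mode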
         """
-        self.config = config
+        self.ai_config = ai_config
+        self.analysis_config = analysis_config
         self.get_time_func = get_time_func
-
-        # Get the API key from the config or the environment
-        self.api_key = config.get("API_KEY") or os.environ.get("AI_API_KEY", "")
-        self.provider = config.get("PROVIDER", "openai")
-        self.model = config.get("MODEL", "gpt-4o-mini")
-        self.base_url = config.get("BASE_URL", "")
-        self.timeout = config.get("TIMEOUT", 90)
-        self.max_news = config.get("MAX_NEWS_FOR_ANALYSIS", 50)
-        self.include_rss = config.get("INCLUDE_RSS", True)
-        self.push_mode = config.get("PUSH_MODE", "both")
+        self.debug = debug
+
+        # Read the model parameters from the shared config
+        self.api_key = ai_config.get("API_KEY") or os.environ.get("AI_API_KEY", "")
+        self.provider = ai_config.get("PROVIDER", "deepseek")
+        self.model = ai_config.get("MODEL", "deepseek-chat")
+        self.base_url = ai_config.get("BASE_URL", "")
+        self.timeout = ai_config.get("TIMEOUT", 90)
+        self.temperature = ai_config.get("TEMPERATURE", 1.0)
+        self.max_tokens = ai_config.get("MAX_TOKENS", 5000)
+
+        # Read the feature parameters from the analysis config
+        self.max_news = analysis_config.get("MAX_NEWS_FOR_ANALYSIS", 50)
+        self.include_rss = analysis_config.get("INCLUDE_RSS", True)
+        self.include_rank_timeline = analysis_config.get("INCLUDE_RANK_TIMELINE", False)
+        self.language = analysis_config.get("LANGUAGE", "Chinese")
+
+        # Extra custom parameters (accepts a dict or a JSON string)
+        self.extra_params = ai_config.get("EXTRA_PARAMS", {})
+        if isinstance(self.extra_params, str) and self.extra_params.strip():
+            try:
+                self.extra_params = json.loads(self.extra_params)
+            except json.JSONDecodeError:
+                print(f"[AI] 解析 extra_params 失败,将忽略: {self.extra_params}")
+                self.extra_params = {}
+
+        if not isinstance(self.extra_params, dict):
+            self.extra_params = {}
 
         # Load the prompt template
         self.system_prompt, self.user_prompt_template = self._load_prompt_template(
-            config.get("PROMPT_FILE", "ai_analysis_prompt.txt")
+            analysis_config.get("PROMPT_FILE", "ai_analysis_prompt.txt")
         )
 
     def _load_prompt_template(self, prompt_file: str) -> tuple:
@@ -124,10 +153,10 @@ class AIAnalyzer:
             )
 
         # Prepare the news content and collect statistics
-        news_content, hotlist_total, rss_total, analyzed_count = self._prepare_news_content(stats, rss_stats)
+        news_content, rss_content, hotlist_total, rss_total, analyzed_count = self._prepare_news_content(stats, rss_stats)
         total_news = hotlist_total + rss_total
 
-        if not news_content:
+        if not news_content and not rss_content:
             return AIAnalysisResult(
                 success=False,
                 error="没有可分析的新闻内容",
@@ -155,11 +184,29 @@ class AIAnalyzer:
         user_prompt = user_prompt.replace("{platforms}", ", ".join(platforms) if platforms else "多平台")
         user_prompt = user_prompt.replace("{keywords}", ", ".join(keywords[:20]) if keywords else "无")
         user_prompt = user_prompt.replace("{news_content}", news_content)
+        user_prompt = user_prompt.replace("{rss_content}", rss_content)
+        user_prompt = user_prompt.replace("{language}", self.language)
+
+        if self.debug:
+            print("\n" + "=" * 80)
+            print("[AI 调试] 发送给 AI 的完整提示词")
+            print("=" * 80)
+            if self.system_prompt:
+                print("\n--- System Prompt ---")
+                print(self.system_prompt)
+            print("\n--- User Prompt ---")
+            print(user_prompt)
+            print("=" * 80 + "\n")
 
         # Call the AI API
         try:
             response = self._call_ai_api(user_prompt)
             result = self._parse_response(response)
+
+            # If RSS analysis is disabled in the config, force-clear the RSS insights returned by the AI
+            if not self.include_rss:
+                result.rss_insights = ""
+
             # Fill in the statistics
             result.total_news = total_news
             result.hotlist_count = hotlist_total
@@ -210,10 +257,12 @@ class AIAnalyzer:
         RSS includes: source, title, publish time
 
         Returns:
-            tuple: (content_str, hotlist_total, rss_total, analyzed_count)
+            tuple: (news_content, rss_content, hotlist_total, rss_total, analyzed_count)
         """
-        lines = []
-        count = 0
+        news_lines = []
+        rss_lines = []
+        news_count = 0
+        rss_count = 0
 
         # Count the total number of news items
         hotlist_total = sum(len(s.get("titles", [])) for s in stats) if stats else 0
@@ -221,13 +270,11 @@ class AIAnalyzer:
 
         # Hot-list content
         if stats:
-            lines.append("### 热榜新闻")
-            lines.append("格式: [来源] 标题 | 排名:最高-最低 | 时间:首次~末次 | 出现:N次")
             for stat in stats:
                 word = stat.get("word", "")
                 titles = stat.get("titles", [])
                 if word and titles:
-                    lines.append(f"\n**{word}** ({len(titles)}条)")
+                    news_lines.append(f"\n**{word}** ({len(titles)}条)")
                     for t in titles:
                         if not isinstance(t, dict):
                             continue
@@ -238,7 +285,13 @@ class AIAnalyzer:
                         # Source
                         source = t.get("source_name", t.get("source", ""))
 
-                        # Rank range
+                        # Build the line
+                        if source:
+                            line = f"- [{source}] {title}"
+                        else:
+                            line = f"- {title}"
+
+                        # Always show the compact format: rank range + time range + appearance count
                         ranks = t.get("ranks", [])
                         if ranks:
                             min_rank = min(ranks)
@@ -247,37 +300,38 @@ class AIAnalyzer:
                         else:
                             rank_str = "-"
 
-                        # Time range (compact display)
                         first_time = t.get("first_time", "")
                         last_time = t.get("last_time", "")
                         time_str = self._format_time_range(first_time, last_time)
 
-                        # Appearance count
                         appear_count = t.get("count", 1)
 
-                        # Build the line: [source] title | 排名:X-Y | 时间:first~last | 出现:N次
-                        if source:
-                            line = f"- [{source}] {title}"
-                        else:
-                            line = f"- {title}"
                         line += f" | 排名:{rank_str} | 时间:{time_str} | 出现:{appear_count}次"
-                        lines.append(line)
 
-                        count += 1
-                        if count >= self.max_news:
+                        # When the full timeline is enabled, also append the rank trajectory
+                        if self.include_rank_timeline:
+                            rank_timeline = t.get("rank_timeline", [])
+                            timeline_str = self._format_rank_timeline(rank_timeline)
+                            line += f" | 轨迹:{timeline_str}"
+
+                        news_lines.append(line)
+
+                        news_count += 1
+                        if news_count >= self.max_news:
                             break
-                if count >= self.max_news:
+                if news_count >= self.max_news:
                     break
 
-        # RSS content (submitted only when enabled)
-        if self.include_rss and rss_stats and count < self.max_news:
-            lines.append("\n### RSS 订阅")
-            lines.append("格式: [来源] 标题 | 发布时间")
+        # RSS content (built only when enabled)
+        if self.include_rss and rss_stats:
+            remaining = self.max_news - news_count
             for stat in rss_stats:
+                if rss_count >= remaining:
+                    break
                 word = stat.get("word", "")
                 titles = stat.get("titles", [])
                 if word and titles:
-                    lines.append(f"\n**{word}** ({len(titles)}条)")
+                    rss_lines.append(f"\n**{word}** ({len(titles)}条)")
                     for t in titles:
                         if not isinstance(t, dict):
                             continue
@@ -298,15 +352,17 @@ class AIAnalyzer:
                             line = f"- {title}"
                         if time_display:
                             line += f" | {time_display}"
-                        lines.append(line)
+                        rss_lines.append(line)
 
-                        count += 1
-                        if count >= self.max_news:
+                        rss_count += 1
+                        if rss_count >= remaining:
                             break
-                if count >= self.max_news:
-                    break
 
-        return "\n".join(lines), hotlist_total, rss_total, count
+        news_content = "\n".join(news_lines) if news_lines else ""
+        rss_content = "\n".join(rss_lines) if rss_lines else ""
+        total_count = news_count + rss_count
+
+        return news_content, rss_content, hotlist_total, rss_total, total_count
 
     def _format_time_range(self, first_time: str, last_time: str) -> str:
         """Format the time range (compact display, keeping only hours and minutes)"""
@@ -314,7 +370,6 @@ class AIAnalyzer:
             if not time_str:
                 return "-"
             # Try to extract the HH:MM part
-            # The format may be "2026-01-04 12:30:00", "12:30", etc.
             if " " in time_str:
                 parts = time_str.split(" ")
                 if len(parts) >= 2:
@@ -323,7 +378,11 @@ class AIAnalyzer:
                         return time_part[:5]  # HH:MM
             elif ":" in time_str:
                 return time_str[:5]
-            return time_str[:5] if len(time_str) >= 5 else time_str
+            # Handle the HH-MM format
+            result = time_str[:5] if len(time_str) >= 5 else time_str
+            if len(result) == 5 and result[2] == '-':
+                result = result.replace('-', ':')
+            return result
 
         first = extract_time(first_time)
         last = extract_time(last_time)
@@ -332,6 +391,24 @@ class AIAnalyzer:
             return first
         return f"{first}~{last}"
 
+    def _format_rank_timeline(self, rank_timeline: List[Dict]) -> str:
+        """Format the rank timeline"""
+        if not rank_timeline:
+            return "-"
+
+        parts = []
+        for item in rank_timeline:
+            time_str = item.get("time", "")
+            if len(time_str) == 5 and time_str[2] == '-':
+                time_str = time_str.replace('-', ':')
+            rank = item.get("rank")
+            if rank is None:
+                parts.append(f"0({time_str})")
+            else:
+                parts.append(f"{rank}({time_str})")
+
+        return "→".join(parts)
+
     def _call_ai_api(self, user_prompt: str) -> str:
         """Call the AI API"""
         if self.provider == "gemini":
@@ -372,10 +449,16 @@ class AIAnalyzer:
         payload = {
             "model": self.model,
             "messages": messages,
-            "temperature": 0.7,
-            "max_tokens": 2000,
+            "temperature": self.temperature,
         }
 
+        # Some APIs do not support max_tokens
+        if self.max_tokens:
+            payload["max_tokens"] = self.max_tokens
+
+        if self.extra_params:
+            payload.update(self.extra_params)
+
         response = requests.post(
             url,
             headers=headers,
@@ -391,7 +474,6 @@ class AIAnalyzer:
         """Call the Google Gemini API"""
         import requests
 
-        # Gemini API URL format: https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent
         model = self.model or "gemini-1.5-flash"
         url = f"https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent?key={self.api_key}"
 
@@ -399,30 +481,33 @@ class AIAnalyzer:
             "Content-Type": "application/json",
         }
 
-        # Build the messages in Gemini format
-        contents = []
-        if self.system_prompt:
-            contents.append({
-                "role": "user",
-                "parts": [{"text": f"System instruction: {self.system_prompt}"}]
-            })
-            contents.append({
-                "role": "model",
-                "parts": [{"text": "Understood. I will follow these instructions."}]
-            })
-        contents.append({
-            "role": "user",
-            "parts": [{"text": user_prompt}]
-        })
-
         payload = {
-            "contents": contents,
+            "contents": [{
+                "role": "user",
+                "parts": [{"text": user_prompt}]
+            }],
             "generationConfig": {
-                "temperature": 0.7,
-                "maxOutputTokens": 2000,
-            }
+                "temperature": self.temperature,
+            },
+            "safetySettings": [
+                {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"},
+                {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"},
+                {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_NONE"},
+                {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"},
+            ]
         }
 
+        if self.system_prompt:
+            payload["system_instruction"] = {
+                "parts": [{"text": self.system_prompt}]
+            }
+
+        if self.max_tokens:
+            payload["generationConfig"]["maxOutputTokens"] = self.max_tokens
+
+        if self.extra_params:
+            payload["generationConfig"].update(self.extra_params)
+
         response = requests.post(
             url,
             headers=headers,
@@ -447,57 +532,50 @@ class AIAnalyzer:
             # Extract the JSON portion
             json_str = response
 
-            # 尝试提取 ```json ... ``` 代码块
             if "```json" in response:
                 parts = response.split("```json", 1)
                 if len(parts) > 1:
                     code_block = parts[1]
-                    # 查找结束的 ```
                     end_idx = code_block.find("```")
                     if end_idx != -1:
                         json_str = code_block[:end_idx]
                     else:
-                        json_str = code_block  # 没有结束标记,使用剩余内容
-            # 尝试提取 ``` ... ``` 代码块
+                        json_str = code_block
             elif "```" in response:
-                parts = response.split("```", 2)  # 最多分割2次
+                parts = response.split("```", 2)
                 if len(parts) >= 2:
                     json_str = parts[1]
 
-            # 清理 JSON 字符串
             json_str = json_str.strip()
             if not json_str:
                 raise ValueError("提取的 JSON 内容为空")
 
             data = json.loads(json_str)
 
-            result.summary = data.get("summary", "")
-            result.keyword_analysis = data.get("keyword_analysis", "")
-            result.sentiment = data.get("sentiment", "")
-            result.cross_platform = data.get("cross_platform", "")
-            result.impact = data.get("impact", "")
+            # Parse the new-format fields
+            result.core_trends = data.get("core_trends", "")
+            result.sentiment_controversy = data.get("sentiment_controversy", "")
             result.signals = data.get("signals", "")
-            result.conclusion = data.get("conclusion", "")
+            result.rss_insights = data.get("rss_insights", "")
+            result.outlook_strategy = data.get("outlook_strategy", "")
+            
             result.success = True
 
         except json.JSONDecodeError as e:
-            # JSON 解析失败,记录详细错误但仍使用原始文本
             error_context = json_str[max(0, e.pos - 30):e.pos + 30] if json_str and e.pos else ""
             result.error = f"JSON 解析错误 (位置 {e.pos}): {e.msg}"
             if error_context:
                 result.error += f",上下文: ...{error_context}..."
-            # 使用原始响应作为 summary
-            result.summary = response[:1000] if len(response) > 1000 else response
-            result.success = True  # 仍标记为成功,因为有内容可展示
+            # Fall back to the raw response in core_trends so there is always some output
+            result.core_trends = (response[:500] + "...") if len(response) > 500 else response
+            result.success = True
         except (IndexError, KeyError, TypeError, ValueError) as e:
-            # 其他解析错误
             result.error = f"响应解析错误: {type(e).__name__}: {str(e)}"
-            result.summary = response[:1000] if len(response) > 1000 else response
+            result.core_trends = (response[:500] + "...") if len(response) > 500 else response
             result.success = True
         except Exception as e:
-            # 未知错误
             result.error = f"解析时发生未知错误: {type(e).__name__}: {str(e)}"
-            result.summary = response[:1000] if len(response) > 1000 else response
+            result.core_trends = (response[:500] + "...") if len(response) > 500 else response
             result.success = True
 
         return result
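
Review note: the fence stripping and raw-text fallback in the response parser above can be exercised in isolation. The following is a minimal sketch, not the committed code; the helper names `extract_json_payload` and `parse_with_fallback` are hypothetical, and the 500-character limit is taken from the new fallback branch.

```python
import json


def extract_json_payload(response: str) -> str:
    """Return the text most likely to be JSON inside a model reply."""
    fence = "`" * 3  # a Markdown code fence, built at runtime to keep this example fence-safe
    if fence + "json" in response:
        body = response.split(fence + "json", 1)[1]
        end = body.find(fence)
        # Without a closing fence, use everything after the opening one
        candidate = body[:end] if end != -1 else body
    elif fence in response:
        parts = response.split(fence, 2)
        candidate = parts[1] if len(parts) >= 2 else response
    else:
        candidate = response
    candidate = candidate.strip()
    if not candidate:
        raise ValueError("empty JSON payload")
    return candidate


def parse_with_fallback(response: str, limit: int = 500):
    """Parse the reply as JSON; on failure keep a truncated copy of the raw text."""
    try:
        return json.loads(extract_json_payload(response))
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        text = response[:limit] + "..." if len(response) > limit else response
        return {"core_trends": text}
```

A reply that is not valid JSON still yields a `core_trends` value, which mirrors the commit's choice to mark such results as successful so that something is always rendered.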

+ 178 - 97
trendradar/ai/formatter.py

@@ -6,6 +6,7 @@ AI 分析结果格式化模块
 """
 
 import html as html_lib
+import re
 from .analyzer import AIAnalysisResult
 
 
@@ -14,6 +15,46 @@ def _escape_html(text: str) -> str:
     return html_lib.escape(text) if text else ""
 
 
+def _format_list_content(text: str) -> str:
+    """
+    Format list content so that each item number starts on its own line.
+    For example, "1. xxx 2. yyy" becomes:
+    1. xxx
+    2. yyy
+    """
+    if not text:
+        return ""
+    
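+    # Strip leading/trailing whitespace so a reply that starts with a newline does not render a blank line
+    text = text.strip()
+
+    # 1. Normalize: make sure "1." is followed by a space
+    result = re.sub(r'(\d+)\.([^ \d])', r'\1. \2', text)
+
+    # 2. Force a line break before "<number>." when it is not already at the start of a line
+    result = re.sub(r'(?<=[^\n])\s+(\d+\.)', r'\n\1', result)
+
+    # 3. Handle "1.**bold**" (the prompt asks for no Markdown, but handle it defensively)
+    result = re.sub(r'(?<=[^\n])(\d+\.\*\*)', r'\n\1', result)
+
+    # 4. Break the line after punctuation that precedes an item number
+    result = re.sub(r'([::;,。;,])\s*(\d+\.)', r'\1\n\2', result)
+
+    # 5. Break the line before sub-headings such as "XX方面:" or "XX领域:"
+    # Only trigger after Chinese punctuation (period, comma, semicolon, etc.) so that "1. XX领域:" stays intact
+    result = re.sub(r'([。!?;,、])\s*([a-zA-Z0-9\u4e00-\u9fa5]+(方面|领域)[::])', r'\1\n\2', result)
+
+    # 6. Insert a blank line before "【XX】:" headings (e.g. 【宏观主线】:) for visual separation
+    result = re.sub(r'(?<=[^\n])\s*(【[^】]+】[::])', r'\n\n\1', result)
+
+    # 7. Add a blank line between list items (replace "\n<number>." with "\n\n<number>.")
+    # Skip items that directly follow a heading line (one ending in a colon) so the heading and its first item stay adjacent
+    # (?<![::]) is a negative lookbehind: the preceding character must not be a colon
+    result = re.sub(r'(?<![::])\n(\d+\.)', r'\n\n\1', result)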
+
+    return result
+
+
 def render_ai_analysis_markdown(result: AIAnalysisResult) -> str:
     """Render as generic Markdown (Telegram, WeCom, ntfy, Bark, Slack)."""
     if not result.success:
@@ -21,26 +62,20 @@ def render_ai_analysis_markdown(result: AIAnalysisResult) -> str:
 
     lines = ["**✨ AI 热点分析**", ""]
 
-    if result.summary:
-        lines.extend(["**趋势概述**", result.summary, ""])
-
-    if result.keyword_analysis:
-        lines.extend(["**热度走势**", result.keyword_analysis, ""])
-
-    if result.sentiment:
-        lines.extend(["**情感倾向**", result.sentiment, ""])
-
-    if result.cross_platform:
-        lines.extend(["**跨平台关联**", result.cross_platform, ""])
+    if result.core_trends:
+        lines.extend(["**核心热点态势**", _format_list_content(result.core_trends), ""])
 
-    if result.impact:
-        lines.extend(["**潜在影响**", result.impact, ""])
+    if result.sentiment_controversy:
+        lines.extend(["**舆论风向争议**", _format_list_content(result.sentiment_controversy), ""])
 
     if result.signals:
-        lines.extend(["**值得关注**", result.signals, ""])
+        lines.extend(["**异动与弱信号**", _format_list_content(result.signals), ""])
 
-    if result.conclusion:
-        lines.extend(["**总结建议**", result.conclusion])
+    if result.rss_insights:
+        lines.extend(["**RSS 深度洞察**", _format_list_content(result.rss_insights), ""])
+
+    if result.outlook_strategy:
+        lines.extend(["**研判策略建议**", _format_list_content(result.outlook_strategy)])
 
     return "\n".join(lines)
 
@@ -52,26 +87,20 @@ def render_ai_analysis_feishu(result: AIAnalysisResult) -> str:
 
     lines = ["**✨ AI 热点分析**", ""]
 
-    if result.summary:
-        lines.extend(["**趋势概述**", result.summary, ""])
-
-    if result.keyword_analysis:
-        lines.extend(["**热度走势**", result.keyword_analysis, ""])
-
-    if result.sentiment:
-        lines.extend(["**情感倾向**", result.sentiment, ""])
+    if result.core_trends:
+        lines.extend(["**核心热点态势**", _format_list_content(result.core_trends), ""])
 
-    if result.cross_platform:
-        lines.extend(["**跨平台关联**", result.cross_platform, ""])
-
-    if result.impact:
-        lines.extend(["**潜在影响**", result.impact, ""])
+    if result.sentiment_controversy:
+        lines.extend(["**舆论风向争议**", _format_list_content(result.sentiment_controversy), ""])
 
     if result.signals:
-        lines.extend(["**值得关注**", result.signals, ""])
+        lines.extend(["**异动与弱信号**", _format_list_content(result.signals), ""])
+
+    if result.rss_insights:
+        lines.extend(["**RSS 深度洞察**", _format_list_content(result.rss_insights), ""])
 
-    if result.conclusion:
-        lines.extend(["**总结建议**", result.conclusion])
+    if result.outlook_strategy:
+        lines.extend(["**研判策略建议**", _format_list_content(result.outlook_strategy)])
 
     return "\n".join(lines)
 
@@ -83,26 +112,20 @@ def render_ai_analysis_dingtalk(result: AIAnalysisResult) -> str:
 
     lines = ["### ✨ AI 热点分析", ""]
 
-    if result.summary:
-        lines.extend(["#### 趋势概述", result.summary, ""])
+    if result.core_trends:
+        lines.extend(["#### 核心热点态势", _format_list_content(result.core_trends), ""])
 
-    if result.keyword_analysis:
-        lines.extend(["#### 热度走势", result.keyword_analysis, ""])
-
-    if result.sentiment:
-        lines.extend(["#### 情感倾向", result.sentiment, ""])
-
-    if result.cross_platform:
-        lines.extend(["#### 跨平台关联", result.cross_platform, ""])
-
-    if result.impact:
-        lines.extend(["#### 潜在影响", result.impact, ""])
+    if result.sentiment_controversy:
+        lines.extend(["#### 舆论风向争议", _format_list_content(result.sentiment_controversy), ""])
 
     if result.signals:
-        lines.extend(["#### 值得关注", result.signals, ""])
+        lines.extend(["#### 异动与弱信号", _format_list_content(result.signals), ""])
 
-    if result.conclusion:
-        lines.extend(["#### 总结建议", result.conclusion])
+    if result.rss_insights:
+        lines.extend(["#### RSS 深度洞察", _format_list_content(result.rss_insights), ""])
+
+    if result.outlook_strategy:
+        lines.extend(["#### 研判策略建议", _format_list_content(result.outlook_strategy)])
 
     return "\n".join(lines)
 
@@ -114,59 +137,53 @@ def render_ai_analysis_html(result: AIAnalysisResult) -> str:
 
     html_parts = ['<div class="ai-analysis">', '<h3>✨ AI 热点分析</h3>']
 
-    if result.summary:
-        html_parts.extend([
-            '<div class="ai-section">',
-            '<h4>趋势概述</h4>',
-            f'<p>{_escape_html(result.summary)}</p>',
-            '</div>'
-        ])
-
-    if result.keyword_analysis:
+    if result.core_trends:
+        content = _format_list_content(result.core_trends)
+        content_html = _escape_html(content).replace("\n", "<br>")
         html_parts.extend([
             '<div class="ai-section">',
-            '<h4>热度走势</h4>',
-            f'<p>{_escape_html(result.keyword_analysis)}</p>',
+            '<h4>核心热点态势</h4>',
+            f'<div class="ai-content">{content_html}</div>',
             '</div>'
         ])
 
-    if result.sentiment:
+    if result.sentiment_controversy:
+        content = _format_list_content(result.sentiment_controversy)
+        content_html = _escape_html(content).replace("\n", "<br>")
         html_parts.extend([
             '<div class="ai-section">',
-            '<h4>情感倾向</h4>',
-            f'<p>{_escape_html(result.sentiment)}</p>',
+            '<h4>舆论风向争议</h4>',
+            f'<div class="ai-content">{content_html}</div>',
             '</div>'
         ])
 
-    if result.cross_platform:
-        html_parts.extend([
-            '<div class="ai-section">',
-            '<h4>跨平台关联</h4>',
-            f'<p>{_escape_html(result.cross_platform)}</p>',
-            '</div>'
-        ])
-
-    if result.impact:
+    if result.signals:
+        content = _format_list_content(result.signals)
+        content_html = _escape_html(content).replace("\n", "<br>")
         html_parts.extend([
             '<div class="ai-section">',
-            '<h4>潜在影响</h4>',
-            f'<p>{_escape_html(result.impact)}</p>',
+            '<h4>异动与弱信号</h4>',
+            f'<div class="ai-content">{content_html}</div>',
             '</div>'
         ])
 
-    if result.signals:
+    if result.rss_insights:
+        content = _format_list_content(result.rss_insights)
+        content_html = _escape_html(content).replace("\n", "<br>")
         html_parts.extend([
             '<div class="ai-section">',
-            '<h4>值得关注</h4>',
-            f'<p>{_escape_html(result.signals)}</p>',
+            '<h4>RSS 深度洞察</h4>',
+            f'<div class="ai-content">{content_html}</div>',
             '</div>'
         ])
 
-    if result.conclusion:
+    if result.outlook_strategy:
+        content = _format_list_content(result.outlook_strategy)
+        content_html = _escape_html(content).replace("\n", "<br>")
         html_parts.extend([
             '<div class="ai-section ai-conclusion">',
-            '<h4>总结建议</h4>',
-            f'<p>{_escape_html(result.conclusion)}</p>',
+            '<h4>研判策略建议</h4>',
+            f'<div class="ai-content">{content_html}</div>',
             '</div>'
         ])
 
@@ -179,28 +196,22 @@ def render_ai_analysis_plain(result: AIAnalysisResult) -> str:
     if not result.success:
         return f"AI 分析失败: {result.error}"
 
-    lines = ["【AI 热点分析】", ""]
+    lines = ["【AI 热点分析】", ""]
 
-    if result.summary:
-        lines.extend(["[趋势概述]", result.summary, ""])
+    if result.core_trends:
+        lines.extend(["[核心热点态势]", _format_list_content(result.core_trends), ""])
 
-    if result.keyword_analysis:
-        lines.extend(["[热度走势]", result.keyword_analysis, ""])
-
-    if result.sentiment:
-        lines.extend(["[情感倾向]", result.sentiment, ""])
-
-    if result.cross_platform:
-        lines.extend(["[跨平台关联]", result.cross_platform, ""])
-
-    if result.impact:
-        lines.extend(["[潜在影响]", result.impact, ""])
+    if result.sentiment_controversy:
+        lines.extend(["[舆论风向争议]", _format_list_content(result.sentiment_controversy), ""])
 
     if result.signals:
-        lines.extend(["[值得关注]", result.signals, ""])
+        lines.extend(["[异动与弱信号]", _format_list_content(result.signals), ""])
 
-    if result.conclusion:
-        lines.extend(["[总结建议]", result.conclusion])
+    if result.rss_insights:
+        lines.extend(["[RSS 深度洞察]", _format_list_content(result.rss_insights), ""])
+
+    if result.outlook_strategy:
+        lines.extend(["[研判策略建议]", _format_list_content(result.outlook_strategy)])
 
     return "\n".join(lines)
 
@@ -212,9 +223,79 @@ def get_ai_analysis_renderer(channel: str):
         "dingtalk": render_ai_analysis_dingtalk,
         "wework": render_ai_analysis_markdown,
         "telegram": render_ai_analysis_markdown,
-        "email": render_ai_analysis_html,
+        "email": render_ai_analysis_html_rich,  # Email uses the rich styling that matches the HTML report's CSS
         "ntfy": render_ai_analysis_markdown,
         "bark": render_ai_analysis_plain,
         "slack": render_ai_analysis_markdown,
     }
     return renderers.get(channel, render_ai_analysis_markdown)
+
+
+def render_ai_analysis_html_rich(result: AIAnalysisResult) -> str:
+    """Render as richly styled HTML (used by the HTML report)."""
+    if not result:
+        return ""
+
+    # Check whether the analysis succeeded
+    if not result.success:
+        error_msg = result.error or "未知错误"
+        return f'''
+                <div class="ai-section">
+                    <div class="ai-error">⚠️ AI 分析失败: {_escape_html(str(error_msg))}</div>
+                </div>'''
+
+    ai_html = '''
+                <div class="ai-section">
+                    <div class="ai-section-header">
+                        <div class="ai-section-title">✨ AI 热点分析</div>
+                        <span class="ai-section-badge">AI</span>
+                    </div>'''
+
+    if result.core_trends:
+        content = _format_list_content(result.core_trends)
+        content_html = _escape_html(content).replace("\n", "<br>")
+        ai_html += f'''
+                    <div class="ai-block">
+                        <div class="ai-block-title">核心热点态势</div>
+                        <div class="ai-block-content">{content_html}</div>
+                    </div>'''
+
+    if result.sentiment_controversy:
+        content = _format_list_content(result.sentiment_controversy)
+        content_html = _escape_html(content).replace("\n", "<br>")
+        ai_html += f'''
+                    <div class="ai-block">
+                        <div class="ai-block-title">舆论风向争议</div>
+                        <div class="ai-block-content">{content_html}</div>
+                    </div>'''
+
+    if result.signals:
+        content = _format_list_content(result.signals)
+        content_html = _escape_html(content).replace("\n", "<br>")
+        ai_html += f'''
+                    <div class="ai-block">
+                        <div class="ai-block-title">异动与弱信号</div>
+                        <div class="ai-block-content">{content_html}</div>
+                    </div>'''
+
+    if result.rss_insights:
+        content = _format_list_content(result.rss_insights)
+        content_html = _escape_html(content).replace("\n", "<br>")
+        ai_html += f'''
+                    <div class="ai-block">
+                        <div class="ai-block-title">RSS 深度洞察</div>
+                        <div class="ai-block-content">{content_html}</div>
+                    </div>'''
+
+    if result.outlook_strategy:
+        content = _format_list_content(result.outlook_strategy)
+        content_html = _escape_html(content).replace("\n", "<br>")
+        ai_html += f'''
+                    <div class="ai-block">
+                        <div class="ai-block-title">研判策略建议</div>
+                        <div class="ai-block-content">{content_html}</div>
+                    </div>'''
+
+    ai_html += '''
+                </div>'''
+    return ai_html
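
Review note: each renderer in this file repeats the same five sections in the same order. One possible way to keep them in sync is a table-driven loop. The sketch below is an illustration under assumptions, not the committed design: `SECTIONS`, `Result`, `render_markdown` and `render_html_blocks` are hypothetical names, `Result` stands in for `AIAnalysisResult`, and the list-formatting step is omitted for brevity.

```python
import html
from dataclasses import dataclass

# (attribute name, section title) pairs in display order, as used in the diff above
SECTIONS = [
    ("core_trends", "核心热点态势"),
    ("sentiment_controversy", "舆论风向争议"),
    ("signals", "异动与弱信号"),
    ("rss_insights", "RSS 深度洞察"),
    ("outlook_strategy", "研判策略建议"),
]


@dataclass
class Result:
    """Stand-in for AIAnalysisResult with only the fields used here."""
    core_trends: str = ""
    sentiment_controversy: str = ""
    signals: str = ""
    rss_insights: str = ""
    outlook_strategy: str = ""


def render_markdown(result: Result) -> str:
    """Emit a bold title per non-empty section, separated by blank lines."""
    lines = ["**✨ AI 热点分析**", ""]
    for attr, title in SECTIONS:
        text = getattr(result, attr)
        if text:
            lines.extend([f"**{title}**", text.strip(), ""])
    return "\n".join(lines).rstrip("\n")


def render_html_blocks(result: Result) -> str:
    """Emit one ai-block div per non-empty section."""
    blocks = []
    for attr, title in SECTIONS:
        text = getattr(result, attr)
        if text:
            # Escape first, then convert newlines, so the <br> tags are not escaped
            body = html.escape(text.strip()).replace("\n", "<br>")
            blocks.append(
                f'<div class="ai-block"><div class="ai-block-title">{title}</div>'
                f'<div class="ai-block-content">{body}</div></div>'
            )
    return "".join(blocks)
```

With this shape, renaming or reordering a section is a one-line change in `SECTIONS` instead of an edit in every renderer.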

+ 428 - 0
trendradar/ai/translator.py

@@ -0,0 +1,428 @@
+# coding=utf-8
+"""
+AI translator module
+
+Translates push content into other languages
+using the shared AI model configuration.
+"""
+
+import json
+import os
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any, Dict, List, Optional
+
+
+@dataclass
+class TranslationResult:
+    """Translation result."""
+    translated_text: str = ""       # translated text
+    original_text: str = ""         # original text
+    success: bool = False           # whether the translation succeeded
+    error: str = ""                 # error message
+
+
+@dataclass
+class BatchTranslationResult:
+    """Batch translation result."""
+    results: List[TranslationResult] = field(default_factory=list)
+    success_count: int = 0
+    fail_count: int = 0
+    total_count: int = 0
+
+
+class AITranslator:
+    """AI translator."""
+
+    def __init__(self, translation_config: Dict[str, Any], ai_config: Dict[str, Any]):
+        """
+        Initialize the AI translator.
+
+        Args:
+            translation_config: AI translation settings (AI_TRANSLATION)
+            ai_config: shared AI model settings (AI)
+        """
+        self.translation_config = translation_config
+        self.ai_config = ai_config
+
+        # Translation settings
+        self.enabled = translation_config.get("ENABLED", False)
+        self.target_language = translation_config.get("LANGUAGE", "English")
+
+        # Model parameters taken from the shared configuration
+        self.api_key = ai_config.get("API_KEY") or os.environ.get("AI_API_KEY", "")
+        self.provider = ai_config.get("PROVIDER", "deepseek")
+        self.model = ai_config.get("MODEL", "deepseek-chat")
+        self.base_url = ai_config.get("BASE_URL", "")
+        self.timeout = ai_config.get("TIMEOUT", 90)
+
+        # AI generation parameters
+        self.temperature = ai_config.get("TEMPERATURE", 1.0)
+        self.max_tokens = ai_config.get("MAX_TOKENS", 5000)
+
+        # Extra parameters (a dict, or a JSON string that decodes to one)
+        self.extra_params = ai_config.get("EXTRA_PARAMS", {})
+        if isinstance(self.extra_params, str) and self.extra_params.strip():
+            try:
+                self.extra_params = json.loads(self.extra_params)
+            except json.JSONDecodeError:
+                print(f"[翻译] 解析 extra_params 失败,将忽略: {self.extra_params}")
+                self.extra_params = {}
+
+        if not isinstance(self.extra_params, dict):
+            self.extra_params = {}
+
+        # Load the prompt template
+        self.system_prompt, self.user_prompt_template = self._load_prompt_template(
+            translation_config.get("PROMPT_FILE", "ai_translation_prompt.txt")
+        )
+
+    def _load_prompt_template(self, prompt_file: str) -> tuple:
+        """Load the prompt template."""
+        config_dir = Path(__file__).parent.parent.parent / "config"
+        prompt_path = config_dir / prompt_file
+
+        if not prompt_path.exists():
+            print(f"[翻译] 提示词文件不存在: {prompt_path}")
+            return "", ""
+
+        content = prompt_path.read_text(encoding="utf-8")
+
+        # Parse the [system] and [user] sections
+        system_prompt = ""
+        user_prompt = ""
+
+        if "[system]" in content and "[user]" in content:
+            parts = content.split("[user]")
+            system_part = parts[0]
+            user_part = parts[1] if len(parts) > 1 else ""
+
+            if "[system]" in system_part:
+                system_prompt = system_part.split("[system]")[1].strip()
+
+            user_prompt = user_part.strip()
+        else:
+            user_prompt = content
+
+        return system_prompt, user_prompt
+
+    def translate(self, text: str) -> TranslationResult:
+        """
+        Translate a single text.
+
+        Args:
+            text: the text to translate
+
+        Returns:
+            TranslationResult: the translation result
+        """
+        result = TranslationResult(original_text=text)
+
+        if not self.enabled:
+            result.error = "翻译功能未启用"
+            return result
+
+        if not self.api_key:
+            result.error = "未配置 AI API Key"
+            return result
+
+        if not text or not text.strip():
+            result.translated_text = text
+            result.success = True
+            return result
+
+        try:
+            # Build the prompt
+            user_prompt = self.user_prompt_template
+            user_prompt = user_prompt.replace("{target_language}", self.target_language)
+            user_prompt = user_prompt.replace("{content}", text)
+
+            # Call the AI API
+            response = self._call_ai_api(user_prompt)
+            result.translated_text = response.strip()
+            result.success = True
+
+        except Exception as e:
+            import requests
+            error_type = type(e).__name__
+            error_msg = str(e)
+
+            if isinstance(e, requests.exceptions.Timeout):
+                result.error = f"翻译请求超时({self.timeout}秒)"
+            elif isinstance(e, requests.exceptions.ConnectionError):
+                result.error = "无法连接到 AI API"
+            elif isinstance(e, requests.exceptions.HTTPError):
+                # requests.Response is falsy for 4xx/5xx statuses, so compare against None explicitly
+                status_code = e.response.status_code if e.response is not None else "未知"
+                if status_code == 401:
+                    result.error = "API 认证失败"
+                elif status_code == 429:
+                    result.error = "API 请求频率过高"
+                else:
+                    result.error = f"API 错误 (HTTP {status_code})"
+            else:
+                if len(error_msg) > 100:
+                    error_msg = error_msg[:100] + "..."
+                result.error = f"翻译失败 ({error_type}): {error_msg}"
+
+        return result
+
+    def translate_batch(self, texts: List[str]) -> BatchTranslationResult:
+        """
+        Translate a list of texts in a single API call.
+
+        Args:
+            texts: the texts to translate
+
+        Returns:
+            BatchTranslationResult: the batch translation result
+        """
+        batch_result = BatchTranslationResult(total_count=len(texts))
+
+        if not self.enabled:
+            for text in texts:
+                batch_result.results.append(TranslationResult(
+                    original_text=text,
+                    error="翻译功能未启用"
+                ))
+            batch_result.fail_count = len(texts)
+            return batch_result
+
+        if not self.api_key:
+            for text in texts:
+                batch_result.results.append(TranslationResult(
+                    original_text=text,
+                    error="未配置 AI API Key"
+                ))
+            batch_result.fail_count = len(texts)
+            return batch_result
+
+        if not texts:
+            return batch_result
+
+        # Collect the non-empty texts and their original positions
+        non_empty_indices = []
+        non_empty_texts = []
+        for i, text in enumerate(texts):
+            if text and text.strip():
+                non_empty_indices.append(i)
+                non_empty_texts.append(text)
+
+        # Initialize the result list
+        for text in texts:
+            batch_result.results.append(TranslationResult(original_text=text))
+
+        # Empty texts are marked successful without calling the API
+        for i, text in enumerate(texts):
+            if not text or not text.strip():
+                batch_result.results[i].translated_text = text
+                batch_result.results[i].success = True
+                batch_result.success_count += 1
+
+        if not non_empty_texts:
+            return batch_result
+
+        try:
+            # Build the batch content (numbered format)
+            batch_content = self._format_batch_content(non_empty_texts)
+
+            # Build the prompt
+            user_prompt = self.user_prompt_template
+            user_prompt = user_prompt.replace("{target_language}", self.target_language)
+            user_prompt = user_prompt.replace("{content}", batch_content)
+
+            # Call the AI API
+            response = self._call_ai_api(user_prompt)
+
+            # Parse the batch translation response
+            translated_texts = self._parse_batch_response(response, len(non_empty_texts))
+
+            # Fill in the results at their original positions
+            for idx, translated in zip(non_empty_indices, translated_texts):
+                batch_result.results[idx].translated_text = translated
+                batch_result.results[idx].success = True
+                batch_result.success_count += 1
+
+        except Exception as e:
+            error_msg = f"批量翻译失败: {type(e).__name__}: {str(e)[:100]}"
+            for idx in non_empty_indices:
+                batch_result.results[idx].error = error_msg
+            batch_result.fail_count = len(non_empty_indices)
+
+        return batch_result
+
+    def _format_batch_content(self, texts: List[str]) -> str:
+        """Format the texts as numbered lines for batch translation."""
+        lines = []
+        for i, text in enumerate(texts, 1):
+            lines.append(f"[{i}] {text}")
+        return "\n".join(lines)
+
+    def _parse_batch_response(self, response: str, expected_count: int) -> List[str]:
+        """
+        Parse a batch translation response.
+
+        Args:
+            response: the AI response text
+            expected_count: the number of translations expected
+
+        Returns:
+            List[str]: the translated texts
+        """
+        results = []
+        lines = response.strip().split("\n")
+
+        current_idx = None
+        current_text = []
+
+        for line in lines:
+            # Try to match the "[number]" prefix
+            stripped = line.strip()
+            if stripped.startswith("[") and "]" in stripped:
+                bracket_end = stripped.index("]")
+                try:
+                    idx = int(stripped[1:bracket_end])
+                    # Save the previous item
+                    if current_idx is not None:
+                        results.append((current_idx, "\n".join(current_text).strip()))
+                    current_idx = idx
+                    current_text = [stripped[bracket_end + 1:].strip()]
+                except ValueError:
+                    if current_idx is not None:
+                        current_text.append(line)
+            else:
+                if current_idx is not None:
+                    current_text.append(line)
+
+        # Save the last item
+        if current_idx is not None:
+            results.append((current_idx, "\n".join(current_text).strip()))
+
+        # Sort by index and extract the texts
+        results.sort(key=lambda x: x[0])
+        translated = [text for _, text in results]
+
+        # If the number of parsed items does not match, fall back to a plain line split
+        if len(translated) != expected_count:
+            # Fallback: split by line and strip the numbering
+            translated = []
+            for line in lines:
+                stripped = line.strip()
+                if stripped.startswith("[") and "]" in stripped:
+                    bracket_end = stripped.index("]")
+                    translated.append(stripped[bracket_end + 1:].strip())
+                elif stripped:
+                    translated.append(stripped)
+
+        # Pad with empty strings and truncate so exactly expected_count items are returned
+        while len(translated) < expected_count:
+            translated.append("")
+
+        return translated[:expected_count]
+
+    def _call_ai_api(self, user_prompt: str) -> str:
+        """Call the AI API."""
+        if self.provider == "gemini":
+            return self._call_gemini(user_prompt)
+        return self._call_openai_compatible(user_prompt)
+
+    def _get_api_url(self) -> str:
+        """Return the full API URL."""
+        if self.base_url:
+            return self.base_url
+
+        urls = {
+            "deepseek": "https://api.deepseek.com/v1/chat/completions",
+            "openai": "https://api.openai.com/v1/chat/completions",
+        }
+        url = urls.get(self.provider)
+        if not url:
+            raise ValueError(f"{self.provider} 需要配置 base_url")
+        return url
+
+    def _call_openai_compatible(self, user_prompt: str) -> str:
+        """Call an OpenAI-compatible endpoint."""
+        import requests
+
+        url = self._get_api_url()
+
+        headers = {
+            "Authorization": f"Bearer {self.api_key}",
+            "Content-Type": "application/json",
+        }
+
+        messages = []
+        if self.system_prompt:
+            messages.append({"role": "system", "content": self.system_prompt})
+        messages.append({"role": "user", "content": user_prompt})
+
+        payload = {
+            "model": self.model,
+            "messages": messages,
+            "temperature": self.temperature,
+        }
+
+        if self.max_tokens:
+            payload["max_tokens"] = self.max_tokens
+
+        if self.extra_params:
+            payload.update(self.extra_params)
+
+        response = requests.post(
+            url,
+            headers=headers,
+            json=payload,
+            timeout=self.timeout,
+        )
+        response.raise_for_status()
+
+        data = response.json()
+        return data["choices"][0]["message"]["content"]
+
+    def _call_gemini(self, user_prompt: str) -> str:
+        """Call the Google Gemini API."""
+        import requests
+
+        model = self.model or "gemini-1.5-flash"
+        url = f"https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent?key={self.api_key}"
+
+        headers = {
+            "Content-Type": "application/json",
+        }
+
+        payload = {
+            "contents": [{
+                "role": "user",
+                "parts": [{"text": user_prompt}]
+            }],
+            "generationConfig": {
+                "temperature": self.temperature,
+            },
+            "safetySettings": [
+                {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"},
+                {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"},
+                {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_NONE"},
+                {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"},
+            ]
+        }
+
+        if self.system_prompt:
+            payload["system_instruction"] = {
+                "parts": [{"text": self.system_prompt}]
+            }
+
+        if self.max_tokens:
+            payload["generationConfig"]["maxOutputTokens"] = self.max_tokens
+
+        if self.extra_params:
+            payload["generationConfig"].update(self.extra_params)
+
+        response = requests.post(
+            url,
+            headers=headers,
+            json=payload,
+            timeout=self.timeout,
+        )
+        response.raise_for_status()
+
+        data = response.json()
+        return data["candidates"][0]["content"]["parts"][0]["text"]

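The request shape used by `_call_gemini` above can be sketched as a pair of pure functions, which makes the payload easier to inspect without a network call. This is a minimal sketch: `build_gemini_payload` and `extract_gemini_text` are hypothetical names that do not appear in the commit, and the `safetySettings` list from the diff is left out for brevity.

```python
from typing import Any, Dict, Optional


def build_gemini_payload(
    user_prompt: str,
    temperature: float = 1.0,
    system_prompt: Optional[str] = None,
    max_tokens: Optional[int] = None,
    extra_params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper mirroring the payload assembled in _call_gemini."""
    payload: Dict[str, Any] = {
        "contents": [{"role": "user", "parts": [{"text": user_prompt}]}],
        "generationConfig": {"temperature": temperature},
    }
    # The system prompt travels in its own top-level field, not in "contents".
    if system_prompt:
        payload["system_instruction"] = {"parts": [{"text": system_prompt}]}
    # Token limit and extra parameters are merged into generationConfig.
    if max_tokens:
        payload["generationConfig"]["maxOutputTokens"] = max_tokens
    if extra_params:
        payload["generationConfig"].update(extra_params)
    return payload


def extract_gemini_text(data: Dict[str, Any]) -> str:
    """Read the first candidate's first text part, as the diff does."""
    return data["candidates"][0]["content"]["parts"][0]["text"]
```

Note that, as in the diff, `extra_params` is merged into `generationConfig` rather than the top level of the payload, so it can override `temperature` or `maxOutputTokens`.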
+ 39 - 14
trendradar/context.py

@@ -7,7 +7,7 @@
 
 from datetime import datetime
 from pathlib import Path
-from typing import Any, Callable, Dict, List, Optional, Tuple
+from typing import Any, Dict, List, Optional, Tuple
 
 from trendradar.utils.time import (
     get_configured_time,
@@ -22,7 +22,6 @@ from trendradar.core import (
     save_titles_to_file,
     read_all_today_titles,
     detect_latest_new_titles,
-    is_first_crawl_today,
     count_word_frequency,
 )
 from trendradar.report import (
@@ -38,6 +37,7 @@ from trendradar.notification import (
     NotificationDispatcher,
     PushRecordManager,
 )
+from trendradar.ai import AITranslator
 from trendradar.storage import get_storage_manager
 
 
@@ -120,6 +120,17 @@ class AppContext:
         """获取显示模式 (keyword | platform)"""
         return self.config.get("DISPLAY_MODE", "keyword")
 
+    @property
+    def show_new_section(self) -> bool:
+        """Whether to show the newly added hot topics section."""
+        return self.config.get("DISPLAY", {}).get("REGIONS", {}).get("NEW_ITEMS", True)
+
+    @property
+    def region_order(self) -> List[str]:
+        """Return the display order of the regions."""
+        default_order = ["hotlist", "rss", "new_items", "standalone", "ai_analysis"]
+        return self.config.get("DISPLAY", {}).get("REGION_ORDER", default_order)
+
     # === 时间操作 ===
 
     def get_time(self) -> datetime:
@@ -174,8 +185,8 @@ class AppContext:
         return self._storage_manager
 
     def get_output_path(self, subfolder: str, filename: str) -> str:
-        """Get the output path."""
-        output_dir = Path("output") / self.format_date() / subfolder
+        """Get the output path (flattened layout: output/type/date/filename)."""
+        output_dir = Path("output") / subfolder / self.format_date()
         output_dir.mkdir(parents=True, exist_ok=True)
         return str(output_dir / filename)
 
@@ -273,6 +284,7 @@ class AppContext:
             rank_threshold=self.rank_threshold,
             matches_word_groups_func=self.matches_word_groups,
             load_frequency_words_func=self.load_frequency_words,
+            show_new_section=self.show_new_section,
         )
 
     def generate_html(
@@ -283,10 +295,11 @@ class AppContext:
         new_titles: Optional[Dict] = None,
         id_to_name: Optional[Dict] = None,
         mode: str = "daily",
-        is_daily_summary: bool = False,
         update_info: Optional[Dict] = None,
         rss_items: Optional[List[Dict]] = None,
         rss_new_items: Optional[List[Dict]] = None,
+        ai_analysis: Optional[Any] = None,
+        standalone_data: Optional[Dict] = None,
     ) -> str:
         """生成HTML报告"""
         return generate_html_report(
@@ -296,40 +309,41 @@ class AppContext:
             new_titles=new_titles,
             id_to_name=id_to_name,
             mode=mode,
-            is_daily_summary=is_daily_summary,
             update_info=update_info,
             rank_threshold=self.rank_threshold,
             output_dir="output",
             date_folder=self.format_date(),
             time_filename=self.format_time(),
-            render_html_func=lambda *args, **kwargs: self.render_html(*args, rss_items=rss_items, rss_new_items=rss_new_items, **kwargs),
+            render_html_func=lambda *args, **kwargs: self.render_html(*args, rss_items=rss_items, rss_new_items=rss_new_items, ai_analysis=ai_analysis, standalone_data=standalone_data, **kwargs),
             matches_word_groups_func=self.matches_word_groups,
             load_frequency_words_func=self.load_frequency_words,
-            enable_index_copy=True,
         )
 
     def render_html(
         self,
         report_data: Dict,
         total_titles: int,
-        is_daily_summary: bool = False,
         mode: str = "daily",
         update_info: Optional[Dict] = None,
         rss_items: Optional[List[Dict]] = None,
         rss_new_items: Optional[List[Dict]] = None,
+        ai_analysis: Optional[Any] = None,
+        standalone_data: Optional[Dict] = None,
     ) -> str:
         """渲染HTML内容"""
         return render_html_content(
             report_data=report_data,
             total_titles=total_titles,
-            is_daily_summary=is_daily_summary,
             mode=mode,
             update_info=update_info,
-            reverse_content_order=self.config.get("REVERSE_CONTENT_ORDER", False),
+            region_order=self.region_order,
             get_time_func=self.get_time,
             rss_items=rss_items,
             rss_new_items=rss_new_items,
             display_mode=self.display_mode,
+            ai_analysis=ai_analysis,
+            show_new_section=self.show_new_section,
+            standalone_data=standalone_data,
         )
 
     # === 通知内容渲染 ===
@@ -346,8 +360,9 @@ class AppContext:
             update_info=update_info,
             mode=mode,
             separator=self.config.get("FEISHU_MESSAGE_SEPARATOR", "---"),
-            reverse_content_order=self.config.get("REVERSE_CONTENT_ORDER", False),
+            region_order=self.region_order,
             get_time_func=self.get_time,
+            show_new_section=self.show_new_section,
         )
 
     def render_dingtalk(
@@ -361,8 +376,9 @@ class AppContext:
             report_data=report_data,
             update_info=update_info,
             mode=mode,
-            reverse_content_order=self.config.get("REVERSE_CONTENT_ORDER", False),
+            region_order=self.region_order,
             get_time_func=self.get_time,
+            show_new_section=self.show_new_section,
         )
 
     def split_content(
@@ -409,7 +425,7 @@ class AppContext:
                 "default": self.config.get("MESSAGE_BATCH_SIZE", 4000),
             },
             feishu_separator=self.config.get("FEISHU_MESSAGE_SEPARATOR", "---"),
-            reverse_content_order=self.config.get("REVERSE_CONTENT_ORDER", False),
+            region_order=self.region_order,
             get_time_func=self.get_time,
             rss_items=rss_items,
             rss_new_items=rss_new_items,
@@ -420,16 +436,25 @@ class AppContext:
             rank_threshold=self.rank_threshold,
             ai_stats=ai_stats,
             report_type=report_type,
+            show_new_section=self.show_new_section,
         )
 
     # === 通知发送 ===
 
     def create_notification_dispatcher(self) -> NotificationDispatcher:
         """Create the notification dispatcher."""
+        # Create the translator (if enabled)
+        translator = None
+        trans_config = self.config.get("AI_TRANSLATION", {})
+        if trans_config.get("ENABLED", False):
+            ai_config = self.config.get("AI", {})
+            translator = AITranslator(trans_config, ai_config)
+
         return NotificationDispatcher(
             config=self.config,
             get_time_func=self.get_time,
             split_content_func=self.split_content,
+            translator=translator,
         )
 
     def create_push_manager(self) -> PushRecordManager:

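The `get_output_path` change in this file swaps the order of the date folder and the type subfolder. A minimal sketch of the two layouts follows; `old_layout` and `new_layout` are hypothetical names, and the date string stands in for whatever `format_date()` returns, which this diff does not show.

```python
from pathlib import PurePosixPath


def old_layout(date_folder: str, subfolder: str, filename: str) -> str:
    """Layout before the commit: output/<date>/<type>/<filename>."""
    return str(PurePosixPath("output") / date_folder / subfolder / filename)


def new_layout(date_folder: str, subfolder: str, filename: str) -> str:
    """Layout after the commit: output/<type>/<date>/<filename>."""
    return str(PurePosixPath("output") / subfolder / date_folder / filename)


# The date format here is an assumption made for illustration only.
print(old_layout("2025-01-01", "html", "report.html"))
print(new_layout("2025-01-01", "html", "report.html"))
```

Anything that reads files written under the old layout would presumably need the same swap.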
+ 0 - 2
trendradar/core/__init__.py

@@ -17,7 +17,6 @@ from trendradar.core.data import (
     read_all_today_titles,
     detect_latest_new_titles_from_storage,
     detect_latest_new_titles,
-    is_first_crawl_today,
 )
 from trendradar.core.analyzer import (
     calculate_news_weight,
@@ -40,7 +39,6 @@ __all__ = [
     "read_all_today_titles",
     "detect_latest_new_titles_from_storage",
     "detect_latest_new_titles",
-    "is_first_crawl_today",
     # 统计分析
     "calculate_news_weight",
     "format_time_display",

+ 4 - 0
trendradar/core/analyzer.py

@@ -290,6 +290,7 @@ def count_word_frequency(
                 ranks = source_ranks if source_ranks else []
                 url = source_url
                 mobile_url = source_mobile_url
+                rank_timeline = []
 
                 # 对于 current 模式,从历史统计信息中获取完整数据
                 if (
@@ -306,6 +307,7 @@ def count_word_frequency(
                         ranks = info["ranks"]
                     url = info.get("url", source_url)
                     mobile_url = info.get("mobileUrl", source_mobile_url)
+                    rank_timeline = info.get("rank_timeline", [])
                 elif (
                     title_info
                     and source_id in title_info
@@ -319,6 +321,7 @@ def count_word_frequency(
                         ranks = info["ranks"]
                     url = info.get("url", source_url)
                     mobile_url = info.get("mobileUrl", source_mobile_url)
+                    rank_timeline = info.get("rank_timeline", [])
 
                 if not ranks:
                     ranks = [99]
@@ -350,6 +353,7 @@ def count_word_frequency(
                         "url": url,
                         "mobileUrl": mobile_url,
                         "is_new": is_new,
+                        "rank_timeline": rank_timeline,
                     }
                 )
 

+ 2 - 20
trendradar/core/data.py

@@ -126,6 +126,7 @@ def read_all_today_titles_from_storage(
                 first_time = getattr(item, 'first_time', item.crawl_time)
                 last_time = getattr(item, 'last_time', item.crawl_time)
                 count = getattr(item, 'count', 1)
+                rank_timeline = getattr(item, 'rank_timeline', [])
 
                 all_results[source_id][title] = {
                     "ranks": ranks,
@@ -140,6 +141,7 @@ def read_all_today_titles_from_storage(
                     "ranks": ranks,
                     "url": item.url or "",
                     "mobileUrl": item.mobile_url or "",
+                    "rank_timeline": rank_timeline,
                 }
 
         return all_results, final_id_to_name, title_info
@@ -283,23 +285,3 @@ def detect_latest_new_titles(
         total_new = sum(len(titles) for titles in new_titles.values())
         print(f"[存储] 从存储后端检测到 {total_new} 条新增标题")
     return new_titles
-
-
-def is_first_crawl_today(output_dir: str, date_folder: str) -> bool:
-    """
-    检测是否是当天第一次爬取
-
-    Args:
-        output_dir: 输出目录
-        date_folder: 日期文件夹名称
-
-    Returns:
-        bool: 是否是当天第一次爬取
-    """
-    txt_dir = Path(output_dir) / date_folder / "txt"
-
-    if not txt_dir.exists():
-        return True
-
-    files = sorted([f for f in txt_dir.iterdir() if f.suffix == ".txt"])
-    return len(files) <= 1

+ 83 - 33
trendradar/core/loader.py

@@ -65,12 +65,12 @@ def _load_crawler_config(config_data: Dict) -> Dict:
     """加载爬虫配置"""
     advanced = config_data.get("advanced", {})
     crawler_config = advanced.get("crawler", {})
-    enable_crawler_env = _get_env_bool("ENABLE_CRAWLER")
+    platforms_config = config_data.get("platforms", {})
     return {
         "REQUEST_INTERVAL": crawler_config.get("request_interval", 100),
         "USE_PROXY": crawler_config.get("use_proxy", False),
         "DEFAULT_PROXY": crawler_config.get("default_proxy", ""),
-        "ENABLE_CRAWLER": enable_crawler_env if enable_crawler_env is not None else crawler_config.get("enabled", True),
+        "ENABLE_CRAWLER": platforms_config.get("enabled", True),
     }
 
 
@@ -80,17 +80,14 @@ def _load_report_config(config_data: Dict) -> Dict:
 
     # 环境变量覆盖
     sort_by_position_env = _get_env_bool("SORT_BY_POSITION_FIRST")
-    reverse_content_env = _get_env_bool("REVERSE_CONTENT_ORDER")
     max_news_env = _get_env_int("MAX_NEWS_PER_KEYWORD")
-    display_mode_env = _get_env_str("DISPLAY_MODE")
 
     return {
-        "REPORT_MODE": _get_env_str("REPORT_MODE") or report_config.get("mode", "daily"),
-        "DISPLAY_MODE": display_mode_env or report_config.get("display_mode", "keyword"),
+        "REPORT_MODE": report_config.get("mode", "daily"),
+        "DISPLAY_MODE": report_config.get("display_mode", "keyword"),
         "RANK_THRESHOLD": report_config.get("rank_threshold", 10),
         "SORT_BY_POSITION_FIRST": sort_by_position_env if sort_by_position_env is not None else report_config.get("sort_by_position_first", False),
         "MAX_NEWS_PER_KEYWORD": max_news_env or report_config.get("max_news_per_keyword", 0),
-        "REVERSE_CONTENT_ORDER": reverse_content_env if reverse_content_env is not None else report_config.get("reverse_content_order", False),
     }
 
 
@@ -100,10 +97,8 @@ def _load_notification_config(config_data: Dict) -> Dict:
     advanced = config_data.get("advanced", {})
     batch_size = advanced.get("batch_size", {})
 
-    enable_notification_env = _get_env_bool("ENABLE_NOTIFICATION")
-
     return {
-        "ENABLE_NOTIFICATION": enable_notification_env if enable_notification_env is not None else notification.get("enabled", True),
+        "ENABLE_NOTIFICATION": notification.get("enabled", True),
         "MESSAGE_BATCH_SIZE": batch_size.get("default", 4000),
         "DINGTALK_BATCH_SIZE": batch_size.get("dingtalk", 20000),
         "FEISHU_BATCH_SIZE": batch_size.get("feishu", 29000),
@@ -180,43 +175,91 @@ def _load_rss_config(config_data: Dict) -> Dict:
             "ENABLED": freshness_filter.get("enabled", True),  # 默认启用
             "MAX_AGE_DAYS": max_age_days,
         },
-        "NOTIFICATION": {
-            "ENABLED": advanced_rss.get("notification_enabled", False),
-        },
     }
 
 
-def _load_standalone_display_config(config_data: Dict) -> Dict:
-    """加载独立展示区配置"""
-    notification = config_data.get("notification", {})
-    standalone = notification.get("standalone_display", {})
+def _load_display_config(config_data: Dict) -> Dict:
+    """Load the push content display configuration."""
+    display = config_data.get("display", {})
+    regions = display.get("regions", {})
+    standalone = display.get("standalone", {})
+
+    # Default region order
+    default_region_order = ["hotlist", "rss", "new_items", "standalone", "ai_analysis"]
+    region_order = display.get("region_order", default_region_order)
+
+    # Keep only the region_order values that are known regions
+    valid_regions = {"hotlist", "rss", "new_items", "standalone", "ai_analysis"}
+    region_order = [r for r in region_order if r in valid_regions]
+
+    # Fall back to the default order if nothing survives filtering
+    if not region_order:
+        region_order = default_region_order
 
     return {
-        "ENABLED": standalone.get("enabled", False),
-        "PLATFORMS": standalone.get("platforms", []),
-        "RSS_FEEDS": standalone.get("rss_feeds", []),
-        "MAX_ITEMS": standalone.get("max_items", 20),
+        # Region display order
+        "REGION_ORDER": region_order,
+        # Region switches
+        "REGIONS": {
+            "HOTLIST": regions.get("hotlist", True),
+            "NEW_ITEMS": regions.get("new_items", True),
+            "RSS": regions.get("rss", True),
+            "STANDALONE": regions.get("standalone", False),
+            "AI_ANALYSIS": regions.get("ai_analysis", True),
+        },
+        # Standalone display area configuration
+        "STANDALONE": {
+            "PLATFORMS": standalone.get("platforms", []),
+            "RSS_FEEDS": standalone.get("rss_feeds", []),
+            "MAX_ITEMS": standalone.get("max_items", 20),
+        },
     }
 
 
-def _load_ai_analysis_config(config_data: Dict) -> Dict:
-    """加载 AI 分析配置"""
-    ai_config = config_data.get("ai_analysis", {})
+def _load_ai_config(config_data: Dict) -> Dict:
+    """Load the shared AI model configuration."""
+    ai_config = config_data.get("ai", {})
 
-    enabled_env = _get_env_bool("AI_ANALYSIS_ENABLED")
     timeout_env = _get_env_int_or_none("AI_TIMEOUT")
 
     return {
-        "ENABLED": enabled_env if enabled_env is not None else ai_config.get("enabled", False),
         "PROVIDER": _get_env_str("AI_PROVIDER") or ai_config.get("provider", "deepseek"),
         "API_KEY": _get_env_str("AI_API_KEY") or ai_config.get("api_key", ""),
         "MODEL": _get_env_str("AI_MODEL") or ai_config.get("model", "deepseek-chat"),
         "BASE_URL": _get_env_str("AI_BASE_URL") or ai_config.get("base_url", ""),
         "TIMEOUT": timeout_env if timeout_env is not None else ai_config.get("timeout", 90),
-        "PUSH_MODE": _get_env_str("AI_PUSH_MODE") or ai_config.get("push_mode", "both"),
+        "TEMPERATURE": ai_config.get("temperature", 1.0),
+        "MAX_TOKENS": ai_config.get("max_tokens", 5000),
+        "EXTRA_PARAMS": ai_config.get("extra_params", {}),
+    }
+
+
+def _load_ai_analysis_config(config_data: Dict) -> Dict:
+    """Load the AI analysis configuration (feature settings; model settings are in _load_ai_config)."""
+    ai_config = config_data.get("ai_analysis", {})
+
+    enabled_env = _get_env_bool("AI_ANALYSIS_ENABLED")
+
+    return {
+        "ENABLED": enabled_env if enabled_env is not None else ai_config.get("enabled", False),
+        "LANGUAGE": ai_config.get("language", "Chinese"),
+        "PROMPT_FILE": ai_config.get("prompt_file", "ai_analysis_prompt.txt"),
         "MAX_NEWS_FOR_ANALYSIS": ai_config.get("max_news_for_analysis", 50),
         "INCLUDE_RSS": ai_config.get("include_rss", True),
-        "PROMPT_FILE": ai_config.get("prompt_file", "ai_analysis_prompt.txt"),
+        "INCLUDE_RANK_TIMELINE": ai_config.get("include_rank_timeline", False),
+    }
+
+
+def _load_ai_translation_config(config_data: Dict) -> Dict:
+    """Load the AI translation configuration (feature settings; model settings are in _load_ai_config)."""
+    trans_config = config_data.get("ai_translation", {})
+
+    enabled_env = _get_env_bool("AI_TRANSLATION_ENABLED")
+
+    return {
+        "ENABLED": enabled_env if enabled_env is not None else trans_config.get("enabled", False),
+        "LANGUAGE": _get_env_str("AI_TRANSLATION_LANGUAGE") or trans_config.get("language", "English"),
+        "PROMPT_FILE": trans_config.get("prompt_file", "ai_translation_prompt.txt"),
     }
 
 
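Several loaders above use the same precedence rule: an environment variable wins when it is set, otherwise the YAML value applies, otherwise a default. A minimal sketch of that rule follows. `env_bool` and `resolve_enabled` are hypothetical names; the real `_get_env_bool` is not shown in this diff, so the accepted spellings ("true"/"1", "false"/"0") are an assumption. The only behaviour taken from the diff is that an unset variable yields `None`.

```python
import os
from typing import Any, Dict, Mapping, Optional


def env_bool(name: str, environ: Optional[Mapping[str, str]] = None) -> Optional[bool]:
    """Return True/False when the variable is set to a recognised value, else None."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name, "").strip().lower()
    if raw in ("true", "1"):
        return True
    if raw in ("false", "0"):
        return False
    return None  # Unset or unrecognised: let the YAML value decide.


def resolve_enabled(
    section: Dict[str, Any],
    env_name: str,
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Environment override first, then the YAML 'enabled' key, then False."""
    env_value = env_bool(env_name, environ)
    return env_value if env_value is not None else section.get("enabled", False)
```

The explicit `is not None` check matters: writing `env_value or section.get(...)` would make it impossible to disable a feature from the environment when the YAML enables it.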
@@ -300,8 +343,8 @@ def _load_webhook_config(config_data: Dict) -> Dict:
         # Slack
         "SLACK_WEBHOOK_URL": _get_env_str("SLACK_WEBHOOK_URL") or slack.get("webhook_url", ""),
         # 通用 Webhook
-        "GENERIC_WEBHOOK_URL": _get_env_str("GENERIC_WEBHOOK_URL") or generic.get("url", ""),
-        "GENERIC_WEBHOOK_TEMPLATE": _get_env_str("GENERIC_WEBHOOK_TEMPLATE") or generic.get("template", ""),
+        "GENERIC_WEBHOOK_URL": _get_env_str("GENERIC_WEBHOOK_URL") or generic.get("webhook_url", ""),
+        "GENERIC_WEBHOOK_TEMPLATE": _get_env_str("GENERIC_WEBHOOK_TEMPLATE") or generic.get("payload_template", ""),
     }
 
 
@@ -433,16 +476,23 @@ def load_config(config_path: Optional[str] = None) -> Dict[str, Any]:
     config["WEIGHT_CONFIG"] = _load_weight_config(config_data)
 
     # 平台配置
-    config["PLATFORMS"] = config_data.get("platforms", [])
+    platforms_config = config_data.get("platforms", {})
+    config["PLATFORMS"] = platforms_config.get("sources", [])
 
     # RSS 配置
     config["RSS"] = _load_rss_config(config_data)
 
+    # Shared AI model configuration
+    config["AI"] = _load_ai_config(config_data)
+
     # AI analysis configuration
     config["AI_ANALYSIS"] = _load_ai_analysis_config(config_data)
 
-    # Standalone display area configuration
-    config["STANDALONE_DISPLAY"] = _load_standalone_display_config(config_data)
+    # AI translation configuration
+    config["AI_TRANSLATION"] = _load_ai_translation_config(config_data)
+
+    # Push content display configuration
+    config["DISPLAY"] = _load_display_config(config_data)
 
     # 存储配置
     config["STORAGE"] = _load_storage_config(config_data)

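The `region_order` handling in `_load_display_config` filters out unknown names and falls back to the default when nothing is left. A minimal standalone sketch of that step follows; `normalize_region_order` is a hypothetical name and not part of the commit.

```python
from typing import List

DEFAULT_REGION_ORDER = ["hotlist", "rss", "new_items", "standalone", "ai_analysis"]


def normalize_region_order(raw_order: List[str]) -> List[str]:
    """Drop unknown region names; use the default order if none remain."""
    valid_regions = set(DEFAULT_REGION_ORDER)
    filtered = [region for region in raw_order if region in valid_regions]
    # An empty result (all names invalid, or an empty list) means "use default".
    return filtered or list(DEFAULT_REGION_ORDER)


print(normalize_region_order(["ai_analysis", "bogus", "hotlist"]))
print(normalize_region_order(["typo_region"]))
```

As in the diff, duplicates are not removed and regions missing from a partial list are not appended, so a user-supplied list that names only two regions yields exactly those two.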
+ 185 - 84
trendradar/notification/dispatcher.py

@@ -40,7 +40,7 @@ from .renderer import (
 
 # 类型检查时导入,运行时不导入(避免循环导入)
 if TYPE_CHECKING:
-    from trendradar.ai import AIAnalysisResult
+    from trendradar.ai import AIAnalysisResult, AITranslator
 
 
 class NotificationDispatcher:
@@ -56,6 +56,7 @@ class NotificationDispatcher:
         config: Dict[str, Any],
         get_time_func: Callable,
         split_content_func: Callable,
+        translator: Optional["AITranslator"] = None,
     ):
         """
         初始化通知调度器
@@ -64,11 +65,99 @@ class NotificationDispatcher:
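             config: Full configuration dict, including the settings for every notification channel
             get_time_func: Function that returns the current time
             split_content_func: Function that splits content into batches
+            translator: AI translator instance (optional)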
         """
         self.config = config
         self.get_time_func = get_time_func
         self.split_content_func = split_content_func
         self.max_accounts = config.get("MAX_ACCOUNTS_PER_CHANNEL", 3)
+        self.translator = translator
+
+    def _translate_content(
+        self,
+        report_data: Dict,
+        rss_items: Optional[List[Dict]] = None,
+        rss_new_items: Optional[List[Dict]] = None,
+    ) -> tuple:
+        """
+        Translate the push content.
+
+        Args:
+            report_data: Report data
+            rss_items: RSS statistics items
+            rss_new_items: Newly added RSS items
+
+        Returns:
+            tuple: (translated report_data, rss_items, rss_new_items)
+        """
+        if not self.translator or not self.translator.enabled:
+            return report_data, rss_items, rss_new_items
+
+        import copy
+        print(f"[翻译] 开始翻译内容到 {self.translator.target_language}...")
+
+        # Deep-copy to avoid mutating the original data
+        report_data = copy.deepcopy(report_data)
+        rss_items = copy.deepcopy(rss_items) if rss_items else None
+        rss_new_items = copy.deepcopy(rss_new_items) if rss_new_items else None
+
+        # Collect every title that needs translating
+        titles_to_translate = []
+        title_locations = []  # Record each title's location for backfilling
+
+        # 1. Hot list titles
+        for stat_idx, stat in enumerate(report_data.get("stats", [])):
+            for title_idx, title_data in enumerate(stat.get("titles", [])):
+                titles_to_translate.append(title_data.get("title", ""))
+                title_locations.append(("stats", stat_idx, title_idx))
+
+        # 2. Newly added hot topic titles
+        for source_idx, source in enumerate(report_data.get("new_titles", [])):
+            for title_idx, title_data in enumerate(source.get("titles", [])):
+                titles_to_translate.append(title_data.get("title", ""))
+                title_locations.append(("new_titles", source_idx, title_idx))
+
+        # 3. RSS statistics titles
+        if rss_items:
+            for item_idx, item in enumerate(rss_items):
+                titles_to_translate.append(item.get("title", ""))
+                title_locations.append(("rss_items", item_idx, None))
+
+        # 4. Newly added RSS titles
+        if rss_new_items:
+            for item_idx, item in enumerate(rss_new_items):
+                titles_to_translate.append(item.get("title", ""))
+                title_locations.append(("rss_new_items", item_idx, None))
+
+        if not titles_to_translate:
+            print("[翻译] 没有需要翻译的内容")
+            return report_data, rss_items, rss_new_items
+
+        print(f"[翻译] 共 {len(titles_to_translate)} 条标题待翻译")
+
+        # Translate in one batch
+        result = self.translator.translate_batch(titles_to_translate)
+
+        if result.success_count == 0:
+            print(f"[翻译] 翻译失败: {result.results[0].error if result.results else '未知错误'}")
+            return report_data, rss_items, rss_new_items
+
+        print(f"[翻译] 翻译完成: {result.success_count}/{result.total_count} 成功")
+
+        # Write the translated titles back to their recorded locations
+        for i, (loc_type, idx1, idx2) in enumerate(title_locations):
+            if i < len(result.results) and result.results[i].success:
+                translated = result.results[i].translated_text
+                if loc_type == "stats":
+                    report_data["stats"][idx1]["titles"][idx2]["title"] = translated
+                elif loc_type == "new_titles":
+                    report_data["new_titles"][idx1]["titles"][idx2]["title"] = translated
+                elif loc_type == "rss_items" and rss_items:
+                    rss_items[idx1]["title"] = translated
+                elif loc_type == "rss_new_items" and rss_new_items:
+                    rss_new_items[idx1]["title"] = translated
+
+        return report_data, rss_items, rss_new_items
 
     def dispatch_all(
         self,
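The collect, translate, and backfill flow in `_translate_content` can be sketched independently of the real translator. In this sketch `translate_titles` and `translate_fn` are hypothetical: `translate_fn` is a plain callable that returns one entry per input title, with `None` marking a failed translation. It stands in for `AITranslator.translate_batch`, whose result type is only partly visible in this diff. RSS items are omitted for brevity.

```python
import copy
from typing import Callable, Dict, List, Optional, Tuple

TranslateFn = Callable[[List[str]], List[Optional[str]]]


def translate_titles(report_data: Dict, translate_fn: TranslateFn) -> Dict:
    """Return a translated deep copy of report_data; the input is left untouched."""
    translated = copy.deepcopy(report_data)
    titles: List[str] = []
    locations: List[Tuple[str, int, int]] = []

    # Pass 1: flatten every title into one list and remember where it came from.
    for section in ("stats", "new_titles"):
        for group_idx, group in enumerate(translated.get(section, [])):
            for title_idx, entry in enumerate(group.get("titles", [])):
                titles.append(entry.get("title", ""))
                locations.append((section, group_idx, title_idx))

    if not titles:
        return translated

    # Pass 2: one batched call, then write successes back by recorded location.
    results = translate_fn(titles)
    for (section, group_idx, title_idx), new_title in zip(locations, results):
        if new_title is not None:
            translated[section][group_idx]["titles"][title_idx]["title"] = new_title
    return translated
```

Recording locations alongside the flattened list is what allows a single batched request while still leaving individual failed titles in their original language.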
@@ -103,73 +192,77 @@ class NotificationDispatcher:
         """
         results = {}
 
-        # Get the AI push mode
-        ai_config = self.config.get("AI_ANALYSIS", {})
-        ai_push_mode = ai_config.get("PUSH_MODE", "both")
+        # Get the region display configuration
+        display_regions = self.config.get("DISPLAY", {}).get("REGIONS", {})
+
+        # Run translation (if enabled)
+        report_data, rss_items, rss_new_items = self._translate_content(
+            report_data, rss_items, rss_new_items
+        )
 
         # 飞书
         if self.config.get("FEISHU_WEBHOOK_URL"):
             results["feishu"] = self._send_feishu(
                 report_data, report_type, update_info, proxy_url, mode, rss_items, rss_new_items,
-                ai_analysis, ai_push_mode, standalone_data
+                ai_analysis, display_regions, standalone_data
             )
 
         # 钉钉
         if self.config.get("DINGTALK_WEBHOOK_URL"):
             results["dingtalk"] = self._send_dingtalk(
                 report_data, report_type, update_info, proxy_url, mode, rss_items, rss_new_items,
-                ai_analysis, ai_push_mode, standalone_data
+                ai_analysis, display_regions, standalone_data
             )
 
         # 企业微信
         if self.config.get("WEWORK_WEBHOOK_URL"):
             results["wework"] = self._send_wework(
                 report_data, report_type, update_info, proxy_url, mode, rss_items, rss_new_items,
-                ai_analysis, ai_push_mode, standalone_data
+                ai_analysis, display_regions, standalone_data
             )
 
         # Telegram(需要配对验证)
         if self.config.get("TELEGRAM_BOT_TOKEN") and self.config.get("TELEGRAM_CHAT_ID"):
             results["telegram"] = self._send_telegram(
                 report_data, report_type, update_info, proxy_url, mode, rss_items, rss_new_items,
-                ai_analysis, ai_push_mode, standalone_data
+                ai_analysis, display_regions, standalone_data
             )
 
         # ntfy(需要配对验证)
         if self.config.get("NTFY_SERVER_URL") and self.config.get("NTFY_TOPIC"):
             results["ntfy"] = self._send_ntfy(
                 report_data, report_type, update_info, proxy_url, mode, rss_items, rss_new_items,
-                ai_analysis, ai_push_mode, standalone_data
+                ai_analysis, display_regions, standalone_data
             )
 
         # Bark
         if self.config.get("BARK_URL"):
             results["bark"] = self._send_bark(
                 report_data, report_type, update_info, proxy_url, mode, rss_items, rss_new_items,
-                ai_analysis, ai_push_mode, standalone_data
+                ai_analysis, display_regions, standalone_data
             )
 
         # Slack
         if self.config.get("SLACK_WEBHOOK_URL"):
             results["slack"] = self._send_slack(
                 report_data, report_type, update_info, proxy_url, mode, rss_items, rss_new_items,
-                ai_analysis, ai_push_mode, standalone_data
+                ai_analysis, display_regions, standalone_data
             )
 
         # 通用 Webhook
         if self.config.get("GENERIC_WEBHOOK_URL"):
             results["generic_webhook"] = self._send_generic_webhook(
                 report_data, report_type, update_info, proxy_url, mode, rss_items, rss_new_items,
-                ai_analysis, ai_push_mode, standalone_data
+                ai_analysis, display_regions, standalone_data
             )
 
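-        # Email (keeps the original logic; multiple recipients already supported)
+        # Email (keeps the original logic; multiple recipients already supported; AI analysis is embedded in the HTML)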
         if (
             self.config.get("EMAIL_FROM")
             and self.config.get("EMAIL_PASSWORD")
             and self.config.get("EMAIL_TO")
         ):
-            results["email"] = self._send_email(report_type, html_file_path, ai_analysis, ai_push_mode)
+            results["email"] = self._send_email(report_type, html_file_path)
 
         return results
 
@@ -217,13 +310,14 @@ class NotificationDispatcher:
         rss_items: Optional[List[Dict]] = None,
         rss_new_items: Optional[List[Dict]] = None,
         ai_analysis: Optional[AIAnalysisResult] = None,
-        ai_push_mode: str = "both",
+        display_regions: Optional[Dict] = None,
         standalone_data: Optional[Dict] = None,
     ) -> bool:
         """发送到飞书(多账号,支持热榜+RSS合并+AI分析+独立展示区)"""
-        # Decide whether to send the original content based on the AI push mode
-        if ai_push_mode == "only_analysis" and ai_analysis:
-            report_data = {"stats": [], "failed_ids": [], "new_titles": {}, "id_to_name": {}}
+        display_regions = display_regions or {}
+        # Decide which content to send based on the region switches
+        if not display_regions.get("HOTLIST", True):
+            report_data = {"stats": [], "failed_ids": [], "new_titles": [], "id_to_name": {}}
 
         return self._send_to_multi_accounts(
             channel_name="飞书",
@@ -240,11 +334,11 @@ class NotificationDispatcher:
                 batch_interval=self.config.get("BATCH_SEND_INTERVAL", 1.0),
                 split_content_func=self.split_content_func,
                 get_time_func=self.get_time_func,
-                rss_items=rss_items if ai_push_mode != "only_analysis" else None,
-                rss_new_items=rss_new_items if ai_push_mode != "only_analysis" else None,
-                ai_analysis=ai_analysis,
-                ai_push_mode=ai_push_mode,
-                standalone_data=standalone_data,
+                rss_items=rss_items if display_regions.get("RSS", True) else None,
+                rss_new_items=rss_new_items if display_regions.get("RSS", True) else None,
+                ai_analysis=ai_analysis if display_regions.get("AI_ANALYSIS", True) else None,
+                display_regions=display_regions,
+                standalone_data=standalone_data if display_regions.get("STANDALONE", False) else None,
             ),
         )
 
@@ -258,12 +352,13 @@ class NotificationDispatcher:
         rss_items: Optional[List[Dict]] = None,
         rss_new_items: Optional[List[Dict]] = None,
         ai_analysis: Optional[AIAnalysisResult] = None,
-        ai_push_mode: str = "both",
+        display_regions: Optional[Dict] = None,
         standalone_data: Optional[Dict] = None,
     ) -> bool:
         """发送到钉钉(多账号,支持热榜+RSS合并+AI分析+独立展示区)"""
-        if ai_push_mode == "only_analysis" and ai_analysis:
-            report_data = {"stats": [], "failed_ids": [], "new_titles": {}, "id_to_name": {}}
+        display_regions = display_regions or {}
+        if not display_regions.get("HOTLIST", True):
+            report_data = {"stats": [], "failed_ids": [], "new_titles": [], "id_to_name": {}}
 
         return self._send_to_multi_accounts(
             channel_name="钉钉",
@@ -279,11 +374,11 @@ class NotificationDispatcher:
                 batch_size=self.config.get("DINGTALK_BATCH_SIZE", 20000),
                 batch_interval=self.config.get("BATCH_SEND_INTERVAL", 1.0),
                 split_content_func=self.split_content_func,
-                rss_items=rss_items if ai_push_mode != "only_analysis" else None,
-                rss_new_items=rss_new_items if ai_push_mode != "only_analysis" else None,
-                ai_analysis=ai_analysis,
-                ai_push_mode=ai_push_mode,
-                standalone_data=standalone_data,
+                rss_items=rss_items if display_regions.get("RSS", True) else None,
+                rss_new_items=rss_new_items if display_regions.get("RSS", True) else None,
+                ai_analysis=ai_analysis if display_regions.get("AI_ANALYSIS", True) else None,
+                display_regions=display_regions,
+                standalone_data=standalone_data if display_regions.get("STANDALONE", False) else None,
             ),
         )
 
@@ -297,12 +392,13 @@ class NotificationDispatcher:
         rss_items: Optional[List[Dict]] = None,
         rss_new_items: Optional[List[Dict]] = None,
         ai_analysis: Optional[AIAnalysisResult] = None,
-        ai_push_mode: str = "both",
+        display_regions: Optional[Dict] = None,
         standalone_data: Optional[Dict] = None,
     ) -> bool:
         """发送到企业微信(多账号,支持热榜+RSS合并+AI分析+独立展示区)"""
-        if ai_push_mode == "only_analysis" and ai_analysis:
-            report_data = {"stats": [], "failed_ids": [], "new_titles": {}, "id_to_name": {}}
+        display_regions = display_regions or {}
+        if not display_regions.get("HOTLIST", True):
+            report_data = {"stats": [], "failed_ids": [], "new_titles": [], "id_to_name": {}}
 
         return self._send_to_multi_accounts(
             channel_name="企业微信",
@@ -319,11 +415,11 @@ class NotificationDispatcher:
                 batch_interval=self.config.get("BATCH_SEND_INTERVAL", 1.0),
                 msg_type=self.config.get("WEWORK_MSG_TYPE", "markdown"),
                 split_content_func=self.split_content_func,
-                rss_items=rss_items if ai_push_mode != "only_analysis" else None,
-                rss_new_items=rss_new_items if ai_push_mode != "only_analysis" else None,
-                ai_analysis=ai_analysis,
-                ai_push_mode=ai_push_mode,
-                standalone_data=standalone_data,
+                rss_items=rss_items if display_regions.get("RSS", True) else None,
+                rss_new_items=rss_new_items if display_regions.get("RSS", True) else None,
+                ai_analysis=ai_analysis if display_regions.get("AI_ANALYSIS", True) else None,
+                display_regions=display_regions,
+                standalone_data=standalone_data if display_regions.get("STANDALONE", False) else None,
             ),
         )
 
@@ -337,12 +433,13 @@ class NotificationDispatcher:
         rss_items: Optional[List[Dict]] = None,
         rss_new_items: Optional[List[Dict]] = None,
         ai_analysis: Optional[AIAnalysisResult] = None,
-        ai_push_mode: str = "both",
+        display_regions: Optional[Dict] = None,
         standalone_data: Optional[Dict] = None,
     ) -> bool:
         """Send to Telegram (multi-account; token and chat_id pairing must be validated; supports merged hot list + RSS, AI analysis, and the standalone display section)"""
-        if ai_push_mode == "only_analysis" and ai_analysis:
-            report_data = {"stats": [], "failed_ids": [], "new_titles": {}, "id_to_name": {}}
+        display_regions = display_regions or {}
+        if not display_regions.get("HOTLIST", True):
+            report_data = {"stats": [], "failed_ids": [], "new_titles": [], "id_to_name": {}}
 
         telegram_tokens = parse_multi_account_config(self.config["TELEGRAM_BOT_TOKEN"])
         telegram_chat_ids = parse_multi_account_config(self.config["TELEGRAM_CHAT_ID"])
@@ -381,11 +478,11 @@ class NotificationDispatcher:
                     batch_size=self.config.get("MESSAGE_BATCH_SIZE", 4000),
                     batch_interval=self.config.get("BATCH_SEND_INTERVAL", 1.0),
                     split_content_func=self.split_content_func,
-                    rss_items=rss_items if ai_push_mode != "only_analysis" else None,
-                    rss_new_items=rss_new_items if ai_push_mode != "only_analysis" else None,
-                    ai_analysis=ai_analysis,
-                    ai_push_mode=ai_push_mode,
-                    standalone_data=standalone_data,
+                    rss_items=rss_items if display_regions.get("RSS", True) else None,
+                    rss_new_items=rss_new_items if display_regions.get("RSS", True) else None,
+                    ai_analysis=ai_analysis if display_regions.get("AI_ANALYSIS", True) else None,
+                    display_regions=display_regions,
+                    standalone_data=standalone_data if display_regions.get("STANDALONE", False) else None,
                 )
                 results.append(result)
 
@@ -401,12 +498,13 @@ class NotificationDispatcher:
         rss_items: Optional[List[Dict]] = None,
         rss_new_items: Optional[List[Dict]] = None,
         ai_analysis: Optional[AIAnalysisResult] = None,
-        ai_push_mode: str = "both",
+        display_regions: Optional[Dict] = None,
         standalone_data: Optional[Dict] = None,
     ) -> bool:
         """Send to ntfy (multi-account; topic and token pairing must be validated; supports merged hot list + RSS, AI analysis, and the standalone display section)"""
-        if ai_push_mode == "only_analysis" and ai_analysis:
-            report_data = {"stats": [], "failed_ids": [], "new_titles": {}, "id_to_name": {}}
+        display_regions = display_regions or {}
+        if not display_regions.get("HOTLIST", True):
+            report_data = {"stats": [], "failed_ids": [], "new_titles": [], "id_to_name": {}}
 
         ntfy_server_url = self.config["NTFY_SERVER_URL"]
         ntfy_topics = parse_multi_account_config(self.config["NTFY_TOPIC"])
@@ -444,11 +542,11 @@ class NotificationDispatcher:
                     account_label=account_label,
                     batch_size=3800,
                     split_content_func=self.split_content_func,
-                    rss_items=rss_items if ai_push_mode != "only_analysis" else None,
-                    rss_new_items=rss_new_items if ai_push_mode != "only_analysis" else None,
-                    ai_analysis=ai_analysis,
-                    ai_push_mode=ai_push_mode,
-                    standalone_data=standalone_data,
+                    rss_items=rss_items if display_regions.get("RSS", True) else None,
+                    rss_new_items=rss_new_items if display_regions.get("RSS", True) else None,
+                    ai_analysis=ai_analysis if display_regions.get("AI_ANALYSIS", True) else None,
+                    display_regions=display_regions,
+                    standalone_data=standalone_data if display_regions.get("STANDALONE", False) else None,
                 )
                 results.append(result)
 
@@ -464,12 +562,13 @@ class NotificationDispatcher:
         rss_items: Optional[List[Dict]] = None,
         rss_new_items: Optional[List[Dict]] = None,
         ai_analysis: Optional[AIAnalysisResult] = None,
-        ai_push_mode: str = "both",
+        display_regions: Optional[Dict] = None,
         standalone_data: Optional[Dict] = None,
     ) -> bool:
         """Send to Bark (multi-account; supports merged hot list + RSS, AI analysis, and the standalone display section)"""
-        if ai_push_mode == "only_analysis" and ai_analysis:
-            report_data = {"stats": [], "failed_ids": [], "new_titles": {}, "id_to_name": {}}
+        display_regions = display_regions or {}
+        if not display_regions.get("HOTLIST", True):
+            report_data = {"stats": [], "failed_ids": [], "new_titles": [], "id_to_name": {}}
 
         return self._send_to_multi_accounts(
             channel_name="Bark",
@@ -485,11 +584,11 @@ class NotificationDispatcher:
                 batch_size=self.config.get("BARK_BATCH_SIZE", 3600),
                 batch_interval=self.config.get("BATCH_SEND_INTERVAL", 1.0),
                 split_content_func=self.split_content_func,
-                rss_items=rss_items if ai_push_mode != "only_analysis" else None,
-                rss_new_items=rss_new_items if ai_push_mode != "only_analysis" else None,
-                ai_analysis=ai_analysis,
-                ai_push_mode=ai_push_mode,
-                standalone_data=standalone_data,
+                rss_items=rss_items if display_regions.get("RSS", True) else None,
+                rss_new_items=rss_new_items if display_regions.get("RSS", True) else None,
+                ai_analysis=ai_analysis if display_regions.get("AI_ANALYSIS", True) else None,
+                display_regions=display_regions,
+                standalone_data=standalone_data if display_regions.get("STANDALONE", False) else None,
             ),
         )
 
@@ -503,12 +602,13 @@ class NotificationDispatcher:
         rss_items: Optional[List[Dict]] = None,
         rss_new_items: Optional[List[Dict]] = None,
         ai_analysis: Optional[AIAnalysisResult] = None,
-        ai_push_mode: str = "both",
+        display_regions: Optional[Dict] = None,
         standalone_data: Optional[Dict] = None,
     ) -> bool:
         """Send to Slack (multi-account; supports merged hot list + RSS, AI analysis, and the standalone display section)"""
-        if ai_push_mode == "only_analysis" and ai_analysis:
-            report_data = {"stats": [], "failed_ids": [], "new_titles": {}, "id_to_name": {}}
+        display_regions = display_regions or {}
+        if not display_regions.get("HOTLIST", True):
+            report_data = {"stats": [], "failed_ids": [], "new_titles": [], "id_to_name": {}}
 
         return self._send_to_multi_accounts(
             channel_name="Slack",
@@ -524,11 +624,11 @@ class NotificationDispatcher:
                 batch_size=self.config.get("SLACK_BATCH_SIZE", 4000),
                 batch_interval=self.config.get("BATCH_SEND_INTERVAL", 1.0),
                 split_content_func=self.split_content_func,
-                rss_items=rss_items if ai_push_mode != "only_analysis" else None,
-                rss_new_items=rss_new_items if ai_push_mode != "only_analysis" else None,
-                ai_analysis=ai_analysis,
-                ai_push_mode=ai_push_mode,
-                standalone_data=standalone_data,
+                rss_items=rss_items if display_regions.get("RSS", True) else None,
+                rss_new_items=rss_new_items if display_regions.get("RSS", True) else None,
+                ai_analysis=ai_analysis if display_regions.get("AI_ANALYSIS", True) else None,
+                display_regions=display_regions,
+                standalone_data=standalone_data if display_regions.get("STANDALONE", False) else None,
             ),
         )
 
@@ -542,12 +642,13 @@ class NotificationDispatcher:
         rss_items: Optional[List[Dict]] = None,
         rss_new_items: Optional[List[Dict]] = None,
         ai_analysis: Optional[AIAnalysisResult] = None,
-        ai_push_mode: str = "both",
+        display_regions: Optional[Dict] = None,
         standalone_data: Optional[Dict] = None,
     ) -> bool:
         """Send to a generic webhook (multi-account; supports merged hot list + RSS, AI analysis, and the standalone display section)"""
-        if ai_push_mode == "only_analysis" and ai_analysis:
-            report_data = {"stats": [], "failed_ids": [], "new_titles": {}, "id_to_name": {}}
+        display_regions = display_regions or {}
+        if not display_regions.get("HOTLIST", True):
+            report_data = {"stats": [], "failed_ids": [], "new_titles": [], "id_to_name": {}}
 
         urls = parse_multi_account_config(self.config.get("GENERIC_WEBHOOK_URL", ""))
         templates = parse_multi_account_config(self.config.get("GENERIC_WEBHOOK_TEMPLATE", ""))
@@ -583,11 +684,11 @@ class NotificationDispatcher:
                 batch_size=self.config.get("MESSAGE_BATCH_SIZE", 4000),
                 batch_interval=self.config.get("BATCH_SEND_INTERVAL", 1.0),
                 split_content_func=self.split_content_func,
-                rss_items=rss_items if ai_push_mode != "only_analysis" else None,
-                rss_new_items=rss_new_items if ai_push_mode != "only_analysis" else None,
-                ai_analysis=ai_analysis,
-                ai_push_mode=ai_push_mode,
-                standalone_data=standalone_data,
+                rss_items=rss_items if display_regions.get("RSS", True) else None,
+                rss_new_items=rss_new_items if display_regions.get("RSS", True) else None,
+                ai_analysis=ai_analysis if display_regions.get("AI_ANALYSIS", True) else None,
+                display_regions=display_regions,
+                standalone_data=standalone_data if display_regions.get("STANDALONE", False) else None,
             )
             results.append(result)
 
@@ -597,10 +698,12 @@ class NotificationDispatcher:
         self,
         report_type: str,
         html_file_path: Optional[str],
-        ai_analysis: Optional[AIAnalysisResult] = None,
-        ai_push_mode: str = "both",
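     ) -> bool:
-        """Send email (original logic retained; multiple recipients and AI analysis supported)"""
+        """Send email (original logic retained; multiple recipients supported)
+
+        Note:
+            AI analysis content is already embedded when the HTML is generated, so it need not be passed here
+        """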
         return send_to_email(
             from_email=self.config["EMAIL_FROM"],
             password=self.config["EMAIL_PASSWORD"],
@@ -610,8 +713,6 @@ class NotificationDispatcher:
             custom_smtp_server=self.config.get("EMAIL_SMTP_SERVER", ""),
             custom_smtp_port=self.config.get("EMAIL_SMTP_PORT", ""),
             get_time_func=self.get_time_func,
-            ai_analysis=ai_analysis,
-            ai_push_mode=ai_push_mode,
         )
 
     # === RSS notification methods ===
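
Every sender wrapper in this file now repeats the same `display_regions` gating. As a reading aid, the pattern can be sketched in one place. `gate_payload` is a hypothetical helper that does not exist in the commit; the keys and defaults (`HOTLIST`, `RSS`, `AI_ANALYSIS` on by default, `STANDALONE` off by default) are copied from the hunks above.

```python
from typing import Any, Dict, Optional


def gate_payload(
    display_regions: Optional[Dict[str, bool]],
    report_data: Dict[str, Any],
    rss_items: Optional[list] = None,
    ai_analysis: Any = None,
    standalone_data: Optional[Dict] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: apply the per-region switches once."""
    regions = display_regions or {}
    if not regions.get("HOTLIST", True):
        # Same empty shape the diff assigns when the hot list is hidden.
        report_data = {"stats": [], "failed_ids": [], "new_titles": [], "id_to_name": {}}
    return {
        "report_data": report_data,
        "rss_items": rss_items if regions.get("RSS", True) else None,
        "ai_analysis": ai_analysis if regions.get("AI_ANALYSIS", True) else None,
        # STANDALONE is the only region that defaults to off.
        "standalone_data": standalone_data if regions.get("STANDALONE", False) else None,
    }
```

With `display_regions=None`, everything except the standalone section is passed through, which matches the `display_regions or {}` fallback used in each method.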

+ 3 - 2
trendradar/notification/push_manager.py

@@ -41,8 +41,9 @@ class PushRecordManager:
         print(f"[推送记录] 使用 {storage_backend.backend_name} 存储后端")
 
     def _default_get_time(self) -> datetime:
-        """Default time getter (UTC+8)"""
-        return datetime.now(pytz.timezone("Asia/Shanghai"))
+        """Default time getter (uses the storage_backend timezone setting)"""
+        timezone = getattr(self.storage_backend, 'timezone', 'Asia/Shanghai')
+        return datetime.now(pytz.timezone(timezone))
 
     def has_pushed_today(self) -> bool:
         """

+ 60 - 54
trendradar/notification/renderer.py

@@ -6,19 +6,24 @@
 """
 
 from datetime import datetime
-from typing import Dict, Optional, Callable
+from typing import Dict, List, Optional, Callable
 
 from trendradar.report.formatter import format_title_for_platform
 
 
+# Default region order
+DEFAULT_REGION_ORDER = ["hotlist", "rss", "new_items", "standalone", "ai_analysis"]
+
+
 def render_feishu_content(
     report_data: Dict,
     update_info: Optional[Dict] = None,
     mode: str = "daily",
     separator: str = "---",
-    reverse_content_order: bool = False,
+    region_order: Optional[List[str]] = None,
     get_time_func: Optional[Callable[[], datetime]] = None,
     rss_items: Optional[list] = None,
+    show_new_section: bool = True,
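 ) -> str:
     """Render Feishu notification content (supports merged hot list + RSS)
 
@@ -27,13 +32,17 @@ def render_feishu_content(
         update_info: version update info (optional)
         mode: report mode ("daily", "incremental", "current")
         separator: content separator
-        reverse_content_order: whether to reverse the content order (new items first)
+        region_order: list giving the display order of regions
         get_time_func: function returning the current time (optional; defaults to datetime.now())
         rss_items: list of RSS entries (optional; used for merged push)
+        show_new_section: whether to show the new trending items section
 
     Returns:
         formatted Feishu message content
     """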
+    if region_order is None:
+        region_order = DEFAULT_REGION_ORDER
+
     # Build the hot keyword statistics section
     stats_content = ""
     if report_data["stats"]:
@@ -68,7 +77,7 @@ def render_feishu_content(
 
     # Build the newly added news section
     new_titles_content = ""
-    if report_data["new_titles"]:
+    if show_new_section and report_data["new_titles"]:
         new_titles_content += (
             f"🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
         )
@@ -88,31 +97,26 @@ def render_feishu_content(
 
             new_titles_content += "\n"
 
-    # Decide the content order from the configuration
-    text_content = ""
-    if reverse_content_order:
-        # New trending items first, hot keyword statistics after
-        if new_titles_content:
-            text_content += new_titles_content
-            if stats_content:
-                text_content += f"\n{separator}\n\n"
-        if stats_content:
-            text_content += stats_content
-    else:
-        # Default: hot keyword statistics first, new trending items after
-        if stats_content:
-            text_content += stats_content
-            if new_titles_content:
-                text_content += f"\n{separator}\n\n"
-        if new_titles_content:
-            text_content += new_titles_content
-
-    # Append RSS content (if any)
+    # RSS content
+    rss_content = ""
     if rss_items:
         rss_content = _render_rss_section_feishu(rss_items, separator)
-        if text_content:
-            text_content += f"\n{separator}\n\n"
-        text_content += rss_content
+
+    # Build the mapping from region name to rendered content
+    region_contents = {
+        "hotlist": stats_content,
+        "new_items": new_titles_content,
+        "rss": rss_content,
+    }
+
+    # Assemble the content in region_order
+    text_content = ""
+    for region in region_order:
+        content = region_contents.get(region, "")
+        if content:
+            if text_content:
+                text_content += f"\n{separator}\n\n"
+            text_content += content
 
     if not text_content:
         if mode == "incremental":
@@ -147,9 +151,10 @@ def render_dingtalk_content(
     report_data: Dict,
     update_info: Optional[Dict] = None,
     mode: str = "daily",
-    reverse_content_order: bool = False,
+    region_order: Optional[List[str]] = None,
     get_time_func: Optional[Callable[[], datetime]] = None,
     rss_items: Optional[list] = None,
+    show_new_section: bool = True,
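 ) -> str:
     """Render DingTalk notification content (supports merged hot list + RSS)
 
@@ -157,13 +162,17 @@ def render_dingtalk_content(
         report_data: report data dict containing stats, new_titles, failed_ids, total_new_count
         update_info: version update info (optional)
         mode: report mode ("daily", "incremental", "current")
-        reverse_content_order: whether to reverse the content order (new items first)
+        region_order: list giving the display order of regions
         get_time_func: function returning the current time (optional; defaults to datetime.now())
         rss_items: list of RSS entries (optional; used for merged push)
+        show_new_section: whether to show the new trending items section
 
     Returns:
         formatted DingTalk message content
     """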
+    if region_order is None:
+        region_order = DEFAULT_REGION_ORDER
+
     total_titles = sum(
         len(stat["titles"]) for stat in report_data["stats"] if stat["count"] > 0
     )
@@ -209,7 +218,7 @@ def render_dingtalk_content(
 
     # Build the newly added news section
     new_titles_content = ""
-    if report_data["new_titles"]:
+    if show_new_section and report_data["new_titles"]:
         new_titles_content += (
             f"🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
         )
@@ -227,33 +236,30 @@ def render_dingtalk_content(
 
             new_titles_content += "\n"
 
-    # Decide the content order from the configuration
-    text_content = header_content
-    if reverse_content_order:
-        # New trending items first, hot keyword statistics after
-        if new_titles_content:
-            text_content += new_titles_content
-            if stats_content:
-                text_content += "\n---\n\n"
-        if stats_content:
-            text_content += stats_content
-    else:
-        # Default: hot keyword statistics first, new trending items after
-        if stats_content:
-            text_content += stats_content
-            if new_titles_content:
-                text_content += "\n---\n\n"
-        if new_titles_content:
-            text_content += new_titles_content
-
-    # Append RSS content (if any)
+    # RSS content
+    rss_content = ""
     if rss_items:
         rss_content = _render_rss_section_markdown(rss_items)
-        if stats_content or new_titles_content:
-            text_content += "\n---\n\n"
-        text_content += rss_content
 
-    if not stats_content and not new_titles_content and not rss_items:
+    # Build the mapping from region name to rendered content
+    region_contents = {
+        "hotlist": stats_content,
+        "new_items": new_titles_content,
+        "rss": rss_content,
+    }
+
+    # Assemble the content in region_order
+    text_content = header_content
+    has_content = False
+    for region in region_order:
+        content = region_contents.get(region, "")
+        if content:
+            if has_content:
+                text_content += "\n---\n\n"
+            text_content += content
+            has_content = True
+
+    if not has_content:
         if mode == "incremental":
             mode_text = "增量模式下暂无新增匹配的热点词汇"
         elif mode == "current":
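
The ordering loop that replaces `reverse_content_order` in both renderers can be read in isolation. The sketch below is a standalone restatement, assuming only what the hunks show: `assemble_regions` is a hypothetical name, and the separator handling follows the Feishu variant (the DingTalk variant hard-codes `---` and starts from `header_content`).

```python
from typing import Dict, List, Optional

DEFAULT_REGION_ORDER = ["hotlist", "rss", "new_items", "standalone", "ai_analysis"]


def assemble_regions(
    region_contents: Dict[str, str],
    region_order: Optional[List[str]] = None,
    separator: str = "---",
) -> str:
    """Join the non-empty regions in the requested order."""
    order = DEFAULT_REGION_ORDER if region_order is None else region_order
    text = ""
    for region in order:
        content = region_contents.get(region, "")
        if not content:
            continue  # regions with no rendered content are skipped silently
        if text:
            # A separator is only emitted between two non-empty regions.
            text += f"\n{separator}\n\n"
        text += content
    return text


# Usage: put new items ahead of the hot list.
print(assemble_regions({"hotlist": "H", "new_items": "N"}, ["new_items", "hotlist"]))
```

Because the renderers only map `hotlist`, `new_items`, and `rss`, the `standalone` and `ai_analysis` entries in the default order fall through the `.get(region, "")` lookup and contribute nothing here.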

+ 20 - 25
trendradar/notification/senders.py

@@ -33,7 +33,7 @@ from .batch import add_batch_headers, get_max_batch_header_size
 from .formatters import convert_markdown_to_mrkdwn, strip_markdown
 
 
-def _render_ai_analysis(ai_analysis: Any, channel: str, ai_push_mode: str) -> str:
+def _render_ai_analysis(ai_analysis: Any, channel: str) -> str:
     """Render AI analysis content in the format of the given channel"""
     if not ai_analysis:
         return ""
@@ -90,7 +90,7 @@ def send_to_feishu(
     rss_items: Optional[list] = None,
     rss_new_items: Optional[list] = None,
     ai_analysis: Any = None,
-    ai_push_mode: str = "both",
+    display_regions: Optional[Dict] = None,
     standalone_data: Optional[Dict] = None,
 ) -> bool:
     """
@@ -126,7 +126,7 @@ def send_to_feishu(
     ai_content = None
     ai_stats = None
     if ai_analysis:
-        ai_content = _render_ai_analysis(ai_analysis, "feishu", ai_push_mode)
+        ai_content = _render_ai_analysis(ai_analysis, "feishu")
         # Extract AI analysis statistics (shown whenever the AI analysis succeeded)
         if getattr(ai_analysis, "success", False):
             ai_stats = {
@@ -220,7 +220,7 @@ def send_to_dingtalk(
     rss_items: Optional[list] = None,
     rss_new_items: Optional[list] = None,
     ai_analysis: Any = None,
-    ai_push_mode: str = "both",
+    display_regions: Optional[Dict] = None,
     standalone_data: Optional[Dict] = None,
 ) -> bool:
     """
@@ -255,7 +255,7 @@ def send_to_dingtalk(
     ai_content = None
     ai_stats = None
     if ai_analysis:
-        ai_content = _render_ai_analysis(ai_analysis, "dingtalk", ai_push_mode)
+        ai_content = _render_ai_analysis(ai_analysis, "dingtalk")
         # Extract AI analysis statistics (shown whenever the AI analysis succeeded)
         if getattr(ai_analysis, "success", False):
             ai_stats = {
@@ -348,7 +348,7 @@ def send_to_wework(
     rss_items: Optional[list] = None,
     rss_new_items: Optional[list] = None,
     ai_analysis: Any = None,
-    ai_push_mode: str = "both",
+    display_regions: Optional[Dict] = None,
     standalone_data: Optional[Dict] = None,
 ) -> bool:
     """
@@ -395,7 +395,7 @@ def send_to_wework(
     ai_content = None
     ai_stats = None
     if ai_analysis:
-        ai_content = _render_ai_analysis(ai_analysis, "wework", ai_push_mode)
+        ai_content = _render_ai_analysis(ai_analysis, "wework")
         # Extract AI analysis statistics (shown whenever the AI analysis succeeded)
         if getattr(ai_analysis, "success", False):
             ai_stats = {
@@ -486,7 +486,7 @@ def send_to_telegram(
     rss_items: Optional[list] = None,
     rss_new_items: Optional[list] = None,
     ai_analysis: Any = None,
-    ai_push_mode: str = "both",
+    display_regions: Optional[Dict] = None,
     standalone_data: Optional[Dict] = None,
 ) -> bool:
     """
@@ -524,7 +524,7 @@ def send_to_telegram(
     ai_content = None
     ai_stats = None
     if ai_analysis:
-        ai_content = _render_ai_analysis(ai_analysis, "telegram", ai_push_mode)
+        ai_content = _render_ai_analysis(ai_analysis, "telegram")
         # Extract AI analysis statistics (shown whenever the AI analysis succeeded)
         if getattr(ai_analysis, "success", False):
             ai_stats = {
@@ -606,8 +606,6 @@ def send_to_email(
     custom_smtp_port: Optional[int] = None,
     *,
     get_time_func: Callable = None,
-    ai_analysis: Any = None,
-    ai_push_mode: str = "both",
 ) -> bool:
     """
     Send an email notification
@@ -624,6 +622,9 @@ def send_to_email(
 
     Returns:
         bool: whether the send succeeded
+
+    Note:
+        AI analysis content is already embedded when the HTML is generated, so it is no longer appended here
     """
     try:
         if not html_file_path or not Path(html_file_path).exists():
@@ -634,12 +635,6 @@ def send_to_email(
         with open(html_file_path, "r", encoding="utf-8") as f:
             html_content = f.read()
 
-        # Append AI analysis content to the HTML
-        if ai_analysis:
-            ai_content = _render_ai_analysis(ai_analysis, "email", ai_push_mode)
-            if ai_content:
-                html_content = html_content.replace("</body>", f"{ai_content}</body>")
-
         domain = from_email.split("@")[-1].lower()
 
         if custom_smtp_server and custom_smtp_port:
@@ -776,7 +771,7 @@ def send_to_ntfy(
     rss_items: Optional[list] = None,
     rss_new_items: Optional[list] = None,
     ai_analysis: Any = None,
-    ai_push_mode: str = "both",
+    display_regions: Optional[Dict] = None,
     standalone_data: Optional[Dict] = None,
 ) -> bool:
     """
@@ -838,7 +833,7 @@ def send_to_ntfy(
     ai_content = None
     ai_stats = None
     if ai_analysis:
-        ai_content = _render_ai_analysis(ai_analysis, "ntfy", ai_push_mode)
+        ai_content = _render_ai_analysis(ai_analysis, "ntfy")
         # Extract AI analysis statistics (shown whenever the AI analysis succeeded)
         if getattr(ai_analysis, "success", False):
             ai_stats = {
@@ -978,7 +973,7 @@ def send_to_bark(
     rss_items: Optional[list] = None,
     rss_new_items: Optional[list] = None,
     ai_analysis: Any = None,
-    ai_push_mode: str = "both",
+    display_regions: Optional[Dict] = None,
     standalone_data: Optional[Dict] = None,
 ) -> bool:
     """
@@ -1024,7 +1019,7 @@ def send_to_bark(
     ai_content = None
     ai_stats = None
     if ai_analysis:
-        ai_content = _render_ai_analysis(ai_analysis, "bark", ai_push_mode)
+        ai_content = _render_ai_analysis(ai_analysis, "bark")
         # Extract AI analysis statistics (shown whenever the AI analysis succeeded)
         if getattr(ai_analysis, "success", False):
             ai_stats = {
@@ -1151,7 +1146,7 @@ def send_to_slack(
     rss_items: Optional[list] = None,
     rss_new_items: Optional[list] = None,
     ai_analysis: Any = None,
-    ai_push_mode: str = "both",
+    display_regions: Optional[Dict] = None,
     standalone_data: Optional[Dict] = None,
 ) -> bool:
     """
@@ -1186,7 +1181,7 @@ def send_to_slack(
     ai_content = None
     ai_stats = None
     if ai_analysis:
-        ai_content = _render_ai_analysis(ai_analysis, "slack", ai_push_mode)
+        ai_content = _render_ai_analysis(ai_analysis, "slack")
         # Extract AI analysis statistics (shown whenever the AI analysis succeeded)
         if getattr(ai_analysis, "success", False):
             ai_stats = {
@@ -1269,7 +1264,7 @@ def send_to_generic_webhook(
     rss_items: Optional[list] = None,
     rss_new_items: Optional[list] = None,
     ai_analysis: Any = None,
-    ai_push_mode: str = "both",
+    display_regions: Optional[Dict] = None,
     standalone_data: Optional[Dict] = None,
 ) -> bool:
     """
@@ -1309,7 +1304,7 @@ def send_to_generic_webhook(
     ai_stats = None
     if ai_analysis:
         # The generic webhook renders AI analysis in markdown format
-        ai_content = _render_ai_analysis(ai_analysis, "wework", ai_push_mode)
+        ai_content = _render_ai_analysis(ai_analysis, "wework")
         # Extract AI analysis statistics
         if getattr(ai_analysis, "success", False):
             ai_stats = {

+ 275 - 137
trendradar/notification/splitter.py

@@ -21,6 +21,9 @@ DEFAULT_BATCH_SIZES = {
     "default": 4000,
 }
 
+# Default region order
+DEFAULT_REGION_ORDER = ["hotlist", "rss", "new_items", "standalone", "ai_analysis"]
+
 
 def split_content_into_batches(
     report_data: Dict,
@@ -30,7 +33,7 @@ def split_content_into_batches(
     mode: str = "daily",
     batch_sizes: Optional[Dict[str, int]] = None,
     feishu_separator: str = "---",
-    reverse_content_order: bool = False,
+    region_order: Optional[List[str]] = None,
     get_time_func: Optional[Callable[[], datetime]] = None,
     rss_items: Optional[list] = None,
     rss_new_items: Optional[list] = None,
@@ -41,13 +44,14 @@ def split_content_into_batches(
     rank_threshold: int = 10,
     ai_stats: Optional[Dict] = None,
     report_type: str = "热点分析报告",
+    show_new_section: bool = True,
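 ) -> List[str]:
     """Split message content into batches, keeping each keyword-group title together with at least its first news item (supports merged hot list + RSS, AI analysis, and the standalone display section)
 
     Hot list statistics and RSS statistics are shown side by side, as are new hot list items and new RSS items.
-    reverse_content_order controls whether statistics or new items come first.
-    AI analysis content is placed last by default (before the footer)
-    The standalone display section is placed after the new-items block and before the failed IDs
+    region_order controls the display order of the regions.
+    AI analysis content is placed according to its position in region_order
+    The standalone display section is placed according to its position in region_order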
 
     Args:
         report_data: report data dict containing stats, new_titles, failed_ids, total_new_count
@@ -57,7 +61,7 @@ def split_content_into_batches(
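         mode: report mode (daily, incremental, current)
         batch_sizes: batch size config dict (optional)
         feishu_separator: Feishu message separator
-        reverse_content_order: whether to reverse the content order (new items first, statistics after)
+        region_order: list giving the display order of regions
         get_time_func: function returning the current time (optional)
         rss_items: list of RSS statistics entries (grouped by source; used for merged push)
         rss_new_items: list of new RSS entries (optional; used for the new-items block)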
@@ -70,6 +74,8 @@ def split_content_into_batches(
     Returns:
         list of message contents after batching
     """
+    if region_order is None:
+        region_order = DEFAULT_REGION_ORDER
     # Merge the batch size configuration
     sizes = {**DEFAULT_BATCH_SIZES, **(batch_sizes or {})}
 
@@ -212,15 +218,31 @@ def split_content_into_batches(
         return batches
 
     # Define the function that handles hot keyword statistics
-    def process_stats_section(current_batch, current_batch_has_content, batches):
+    def process_stats_section(current_batch, current_batch_has_content, batches, add_separator=True):
         """Handle hot keyword statistics"""
         if not report_data["stats"]:
             return current_batch, current_batch_has_content, batches
 
         total_count = len(report_data["stats"])
 
+        # Use add_separator to decide whether to prepend a divider
+        actual_stats_header = ""
+        if add_separator and current_batch_has_content:
+            # A divider is needed
+            if format_type == "feishu":
+                actual_stats_header = f"\n{feishu_separator}\n\n{stats_header}"
+            elif format_type == "dingtalk":
+                actual_stats_header = f"\n---\n\n{stats_header}"
+            elif format_type in ("wework", "bark"):
+                actual_stats_header = f"\n\n\n\n{stats_header}"
+            else:
+                actual_stats_header = f"\n\n{stats_header}"
+        else:
+            # No divider needed (first region)
+            actual_stats_header = stats_header
+
         # Add the statistics header
-        test_content = current_batch + stats_header
+        test_content = current_batch + actual_stats_header
         if (
             len(test_content.encode("utf-8")) + len(base_footer.encode("utf-8"))
             < max_bytes
@@ -230,6 +252,7 @@ def split_content_into_batches(
         else:
             if current_batch_has_content:
                 batches.append(current_batch + base_footer)
+            # A new batch starts without a divider, so use the original stats_header
             current_batch = base_header + stats_header
             current_batch_has_content = True
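
The `add_separator` change in `process_stats_section` combines two decisions: which divider a channel uses, and whether the header still fits in the current batch. A minimal sketch of that logic follows. `region_separator` and `append_header` are hypothetical names, and the success branch (keeping the candidate when it fits) is an assumption, since that part of the function falls between the two hunks.

```python
from typing import List, Tuple


def region_separator(format_type: str, feishu_separator: str = "---") -> str:
    """Divider placed before a region, following the stats-section branches."""
    if format_type == "feishu":
        return f"\n{feishu_separator}\n\n"
    if format_type == "dingtalk":
        return "\n---\n\n"
    if format_type in ("wework", "bark"):
        return "\n\n\n\n"
    return "\n\n"


def append_header(
    current_batch: str,
    has_content: bool,
    batches: List[str],
    header: str,
    *,
    format_type: str,
    base_header: str,
    base_footer: str,
    max_bytes: int,
    add_separator: bool = True,
) -> Tuple[str, bool, List[str]]:
    """Append a region header, or start a new batch if it does not fit."""
    # The divider is only added when something already precedes this region.
    prefix = region_separator(format_type) if add_separator and has_content else ""
    candidate = current_batch + prefix + header
    # Budgets are measured in UTF-8 bytes, not characters.
    if len(candidate.encode("utf-8")) + len(base_footer.encode("utf-8")) < max_bytes:
        return candidate, True, batches
    if has_content:
        batches.append(current_batch + base_footer)
    # A fresh batch never starts with a divider.
    return base_header + header, True, batches
```

Measuring in encoded bytes matters for this project because most of the content is Chinese, where one character typically takes three bytes in UTF-8.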
 
@@ -430,26 +453,43 @@ def split_content_into_batches(
         return current_batch, current_batch_has_content, batches
 
     # Define the function that handles newly added news
-    def process_new_titles_section(current_batch, current_batch_has_content, batches):
+    def process_new_titles_section(current_batch, current_batch_has_content, batches, add_separator=True):
         """Handle newly added news"""
-        if not report_data["new_titles"]:
+        if not show_new_section or not report_data["new_titles"]:
             return current_batch, current_batch_has_content, batches
 
+        # Use add_separator to decide whether to prepend a divider
         new_header = ""
-        if format_type in ("wework", "bark"):
-            new_header = f"\n\n\n\n🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
-        elif format_type == "telegram":
-            new_header = (
-                f"\n\n🆕 本次新增热点新闻 (共 {report_data['total_new_count']} 条)\n\n"
-            )
-        elif format_type == "ntfy":
-            new_header = f"\n\n🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
-        elif format_type == "feishu":
-            new_header = f"\n{feishu_separator}\n\n🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
-        elif format_type == "dingtalk":
-            new_header = f"\n---\n\n🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
-        elif format_type == "slack":
-            new_header = f"\n\n🆕 *本次新增热点新闻* (共 {report_data['total_new_count']} 条)\n\n"
+        if add_separator and current_batch_has_content:
+            # A divider is needed
+            if format_type in ("wework", "bark"):
+                new_header = f"\n\n\n\n🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
+            elif format_type == "telegram":
+                new_header = (
+                    f"\n\n🆕 本次新增热点新闻 (共 {report_data['total_new_count']} 条)\n\n"
+                )
+            elif format_type == "ntfy":
+                new_header = f"\n\n🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
+            elif format_type == "feishu":
+                new_header = f"\n{feishu_separator}\n\n🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
+            elif format_type == "dingtalk":
+                new_header = f"\n---\n\n🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
+            elif format_type == "slack":
+                new_header = f"\n\n🆕 *本次新增热点新闻* (共 {report_data['total_new_count']} 条)\n\n"
+        else:
+            # No divider needed (first region)
+            if format_type in ("wework", "bark"):
+                new_header = f"🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
+            elif format_type == "telegram":
+                new_header = f"🆕 本次新增热点新闻 (共 {report_data['total_new_count']} 条)\n\n"
+            elif format_type == "ntfy":
+                new_header = f"🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
+            elif format_type == "feishu":
+                new_header = f"🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
+            elif format_type == "dingtalk":
+                new_header = f"🆕 **本次新增热点新闻** (共 {report_data['total_new_count']} 条)\n\n"
+            elif format_type == "slack":
+                new_header = f"🆕 *本次新增热点新闻* (共 {report_data['total_new_count']} 条)\n\n"
 
         test_content = current_batch + new_header
         if (
@@ -578,59 +618,137 @@ def split_content_into_batches(
 
         return current_batch, current_batch_has_content, batches
 
-    # Choose the processing order based on the config
-    if reverse_content_order:
-        # New hot topics first, keyword statistics after
-        # 1. Process new hotlist items
-        current_batch, current_batch_has_content, batches = process_new_titles_section(
-            current_batch, current_batch_has_content, batches
+    # Define the function that processes the AI analysis
+    def process_ai_section(current_batch, current_batch_has_content, batches, add_separator=True):
+        """Process the AI analysis content"""
+        nonlocal ai_content
+        if not ai_content:
+            return current_batch, current_batch_has_content, batches
+
+        # add_separator decides whether a leading separator is added
+        ai_separator = ""
+        if add_separator and current_batch_has_content:
+            # A separator is needed
+            if format_type == "feishu":
+                ai_separator = f"\n{feishu_separator}\n\n"
+            elif format_type == "dingtalk":
+                ai_separator = "\n---\n\n"
+            elif format_type in ("wework", "bark"):
+                ai_separator = "\n\n\n\n"
+            elif format_type in ("telegram", "ntfy", "slack"):
+                ai_separator = "\n\n"
+        # If no separator is needed, ai_separator stays an empty string
+
+        # Try to append the AI content to the current batch
+        test_content = current_batch + ai_separator + ai_content
+        if (
+            len(test_content.encode("utf-8")) + len(base_footer.encode("utf-8"))
+            < max_bytes
+        ):
+            current_batch = test_content
+            current_batch_has_content = True
+        else:
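+            # The current batch cannot hold it, so start a new batch
+            if current_batch_has_content:
+                batches.append(current_batch + base_footer)
+            # AI content may be long; it is not split further here, so this batch may exceed max_bytes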
+            ai_with_header = base_header + ai_content
+            current_batch = ai_with_header
+            current_batch_has_content = True
+
+        return current_batch, current_batch_has_content, batches
+
+    # Define the function that processes the standalone display section
+    def process_standalone_section_wrapper(current_batch, current_batch_has_content, batches, add_separator=True):
+        """Process the standalone display section"""
+        if not standalone_data:
+            return current_batch, current_batch_has_content, batches
+        return _process_standalone_section(
+            standalone_data, format_type, feishu_separator, base_header, base_footer,
+            max_bytes, current_batch, current_batch_has_content, batches, timezone,
+            rank_threshold, add_separator
         )
-        # 2. Process new RSS items (if any)
-        if rss_new_items:
-            current_batch, current_batch_has_content, batches = _process_rss_new_titles_section(
-                rss_new_items, format_type, feishu_separator, base_header, base_footer,
-                max_bytes, current_batch, current_batch_has_content, batches, timezone
-            )
-        # 3. Process hotlist statistics
-        current_batch, current_batch_has_content, batches = process_stats_section(
-            current_batch, current_batch_has_content, batches
+
+    # Define the function that processes RSS statistics
+    def process_rss_stats_wrapper(current_batch, current_batch_has_content, batches, add_separator=True):
+        """Process RSS statistics"""
+        if not rss_items:
+            return current_batch, current_batch_has_content, batches
+        return _process_rss_stats_section(
+            rss_items, format_type, feishu_separator, base_header, base_footer,
+            max_bytes, current_batch, current_batch_has_content, batches, timezone,
+            add_separator
         )
-        # 4. Process RSS statistics (if any)
-        if rss_items:
-            current_batch, current_batch_has_content, batches = _process_rss_stats_section(
-                rss_items, format_type, feishu_separator, base_header, base_footer,
-                max_bytes, current_batch, current_batch_has_content, batches, timezone
-            )
-    else:
-        # Default: keyword statistics first, new hot topics after
-        # 1. Process hotlist statistics
-        current_batch, current_batch_has_content, batches = process_stats_section(
-            current_batch, current_batch_has_content, batches
+
+    # Define the function that processes new RSS items
+    def process_rss_new_wrapper(current_batch, current_batch_has_content, batches, add_separator=True):
+        """Process new RSS items"""
+        if not rss_new_items:
+            return current_batch, current_batch_has_content, batches
+        return _process_rss_new_titles_section(
+            rss_new_items, format_type, feishu_separator, base_header, base_footer,
+            max_bytes, current_batch, current_batch_has_content, batches, timezone,
+            add_separator
         )
-        # 2. Process RSS statistics (if any)
-        if rss_items:
-            current_batch, current_batch_has_content, batches = _process_rss_stats_section(
-                rss_items, format_type, feishu_separator, base_header, base_footer,
-                max_bytes, current_batch, current_batch_has_content, batches, timezone
+
+    # Process each region in region_order
+    # Track whether any region has produced content (decides whether a separator is added)
+    has_region_content = False
+
+    for region in region_order:
+        # Record the state before processing, to tell whether this region produced content
+        batch_before = current_batch
+        has_content_before = current_batch_has_content
+        batches_len_before = len(batches)
+
+        # Decide whether a separator is needed (not for the first region with content)
+        add_separator = has_region_content
+
+        if region == "hotlist":
+            # Process hotlist statistics
+            current_batch, current_batch_has_content, batches = process_stats_section(
+                current_batch, current_batch_has_content, batches, add_separator
             )
-        # 3. Process new hotlist items
-        current_batch, current_batch_has_content, batches = process_new_titles_section(
-            current_batch, current_batch_has_content, batches
-        )
-        # 4. Process new RSS items (if any)
-        if rss_new_items:
-            current_batch, current_batch_has_content, batches = _process_rss_new_titles_section(
-                rss_new_items, format_type, feishu_separator, base_header, base_footer,
-                max_bytes, current_batch, current_batch_has_content, batches, timezone
+        elif region == "rss":
+            # Process RSS statistics
+            current_batch, current_batch_has_content, batches = process_rss_stats_wrapper(
+                current_batch, current_batch_has_content, batches, add_separator
+            )
+        elif region == "new_items":
+            # Process new hotlist items
+            current_batch, current_batch_has_content, batches = process_new_titles_section(
+                current_batch, current_batch_has_content, batches, add_separator
+            )
+            # Process new RSS items (follows new_items and inherits the add_separator logic)
+            # If the new hotlist items produced content, the new RSS items need a separator
+            new_batch_changed = (
+                current_batch != batch_before or
+                current_batch_has_content != has_content_before or
+                len(batches) != batches_len_before
+            )
+            rss_new_separator = new_batch_changed or has_region_content
+            current_batch, current_batch_has_content, batches = process_rss_new_wrapper(
+                current_batch, current_batch_has_content, batches, rss_new_separator
+            )
+        elif region == "standalone":
+            # Process the standalone display section
+            current_batch, current_batch_has_content, batches = process_standalone_section_wrapper(
+                current_batch, current_batch_has_content, batches, add_separator
+            )
+        elif region == "ai_analysis":
+            # Process the AI analysis
+            current_batch, current_batch_has_content, batches = process_ai_section(
+                current_batch, current_batch_has_content, batches, add_separator
             )
 
-    # 5. Process the standalone display section (if any)
-    if standalone_data:
-        current_batch, current_batch_has_content, batches = _process_standalone_section(
-            standalone_data, format_type, feishu_separator, base_header, base_footer,
-            max_bytes, current_batch, current_batch_has_content, batches, timezone,
-            rank_threshold
+        # Check whether this region produced content
+        region_produced_content = (
+            current_batch != batch_before or
+            current_batch_has_content != has_content_before or
+            len(batches) != batches_len_before
         )
+        if region_produced_content:
+            has_region_content = True
 
     if report_data["failed_ids"]:
         failed_header = ""
@@ -679,41 +797,6 @@ def split_content_into_batches(
                 current_batch = test_content
                 current_batch_has_content = True
 
-    # Process the AI analysis content (placed last, before the footer)
-    if ai_content:
-        # Add the AI analysis block separator
-        ai_separator = ""
-        if format_type == "feishu":
-            ai_separator = f"\n{feishu_separator}\n\n"
-        elif format_type == "dingtalk":
-            ai_separator = "\n---\n\n"
-        elif format_type in ("wework", "bark"):
-            ai_separator = "\n\n\n\n"
-        elif format_type in ("telegram", "ntfy", "slack"):
-            ai_separator = "\n\n"
-
-        # Try to append the AI content to the current batch
-        test_content = current_batch + ai_separator + ai_content
-        if (
-            len(test_content.encode("utf-8")) + len(base_footer.encode("utf-8"))
-            < max_bytes
-        ):
-            current_batch = test_content
-            current_batch_has_content = True
-        else:
-            # The current batch cannot hold it, so start a new batch
-            if current_batch_has_content:
-                batches.append(current_batch + base_footer)
-            # AI content may be long; consider whether further splitting is needed
-            ai_with_header = base_header + ai_content
-            if len(ai_with_header.encode("utf-8")) + len(base_footer.encode("utf-8")) < max_bytes:
-                current_batch = ai_with_header
-                current_batch_has_content = True
-            else:
-                # AI content is too long; add it as-is (may exceed the limit, but stays intact)
-                current_batch = ai_with_header
-                current_batch_has_content = True
-
     # Finish the last batch
     if current_batch_has_content:
         batches.append(current_batch + base_footer)
@@ -732,6 +815,7 @@ def _process_rss_stats_section(
     current_batch_has_content: bool,
     batches: List[str],
     timezone: str = "Asia/Shanghai",
+    add_separator: bool = True,
 ) -> tuple:
     """Process the RSS statistics block (grouped by keyword, same format as the hotlist statistics)
 
@@ -747,6 +831,7 @@ def _process_rss_stats_section(
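         current_batch_has_content: whether the current batch has content
         batches: list of completed batches
         timezone: timezone name
+        add_separator: whether to add a separator before the block (False for the first region)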
 
     Returns:
         (current_batch, current_batch_has_content, batches) tuple
@@ -758,18 +843,34 @@ def _process_rss_stats_section(
     total_items = sum(stat["count"] for stat in rss_stats)
     total_keywords = len(rss_stats)
 
-    # RSS statistics block header
+    # RSS statistics block header (add_separator decides whether a leading separator is added)
     rss_header = ""
-    if format_type == "feishu":
-        rss_header = f"\n{feishu_separator}\n\n📰 **RSS 订阅统计** (共 {total_items} 条)\n\n"
-    elif format_type == "dingtalk":
-        rss_header = f"\n---\n\n📰 **RSS 订阅统计** (共 {total_items} 条)\n\n"
-    elif format_type == "telegram":
-        rss_header = f"\n\n📰 RSS 订阅统计 (共 {total_items} 条)\n\n"
-    elif format_type == "slack":
-        rss_header = f"\n\n📰 *RSS 订阅统计* (共 {total_items} 条)\n\n"
+    if add_separator and current_batch_has_content:
+        # A separator is needed
+        if format_type == "feishu":
+            rss_header = f"\n{feishu_separator}\n\n📰 **RSS 订阅统计** (共 {total_items} 条)\n\n"
+        elif format_type == "dingtalk":
+            rss_header = f"\n---\n\n📰 **RSS 订阅统计** (共 {total_items} 条)\n\n"
+        elif format_type in ("wework", "bark"):
+            rss_header = f"\n\n\n\n📰 **RSS 订阅统计** (共 {total_items} 条)\n\n"
+        elif format_type == "telegram":
+            rss_header = f"\n\n📰 RSS 订阅统计 (共 {total_items} 条)\n\n"
+        elif format_type == "slack":
+            rss_header = f"\n\n📰 *RSS 订阅统计* (共 {total_items} 条)\n\n"
+        else:
+            rss_header = f"\n\n📰 **RSS 订阅统计** (共 {total_items} 条)\n\n"
     else:
-        rss_header = f"\n\n📰 **RSS 订阅统计** (共 {total_items} 条)\n\n"
+        # No separator needed (first region)
+        if format_type == "feishu":
+            rss_header = f"📰 **RSS 订阅统计** (共 {total_items} 条)\n\n"
+        elif format_type == "dingtalk":
+            rss_header = f"📰 **RSS 订阅统计** (共 {total_items} 条)\n\n"
+        elif format_type == "telegram":
+            rss_header = f"📰 RSS 订阅统计 (共 {total_items} 条)\n\n"
+        elif format_type == "slack":
+            rss_header = f"📰 *RSS 订阅统计* (共 {total_items} 条)\n\n"
+        else:
+            rss_header = f"📰 **RSS 订阅统计** (共 {total_items} 条)\n\n"
 
     # Add the RSS header
     test_content = current_batch + rss_header
@@ -937,6 +1038,7 @@ def _process_rss_new_titles_section(
     current_batch_has_content: bool,
     batches: List[str],
     timezone: str = "Asia/Shanghai",
+    add_separator: bool = True,
 ) -> tuple:
     """Process the new RSS items block (grouped by source, same format as the new hotlist items)
 
@@ -952,6 +1054,7 @@ def _process_rss_new_titles_section(
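         current_batch_has_content: whether the current batch has content
         batches: list of completed batches
         timezone: timezone name
+        add_separator: whether to add a separator before the block (False for the first region)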
 
     Returns:
         (current_batch, current_batch_has_content, batches) tuple
@@ -974,20 +1077,36 @@ def _process_rss_new_titles_section(
     # Count the total number of items
     total_items = sum(len(titles) for titles in source_map.values())
 
-    # New RSS items block header
+    # New RSS items block header (add_separator decides whether a leading separator is added)
     new_header = ""
-    if format_type in ("wework", "bark"):
-        new_header = f"\n\n\n\n🆕 **RSS 本次新增** (共 {total_items} 条)\n\n"
-    elif format_type == "telegram":
-        new_header = f"\n\n🆕 RSS 本次新增 (共 {total_items} 条)\n\n"
-    elif format_type == "ntfy":
-        new_header = f"\n\n🆕 **RSS 本次新增** (共 {total_items} 条)\n\n"
-    elif format_type == "feishu":
-        new_header = f"\n{feishu_separator}\n\n🆕 **RSS 本次新增** (共 {total_items} 条)\n\n"
-    elif format_type == "dingtalk":
-        new_header = f"\n---\n\n🆕 **RSS 本次新增** (共 {total_items} 条)\n\n"
-    elif format_type == "slack":
-        new_header = f"\n\n🆕 *RSS 本次新增* (共 {total_items} 条)\n\n"
+    if add_separator and current_batch_has_content:
+        # A separator is needed
+        if format_type in ("wework", "bark"):
+            new_header = f"\n\n\n\n🆕 **RSS 本次新增** (共 {total_items} 条)\n\n"
+        elif format_type == "telegram":
+            new_header = f"\n\n🆕 RSS 本次新增 (共 {total_items} 条)\n\n"
+        elif format_type == "ntfy":
+            new_header = f"\n\n🆕 **RSS 本次新增** (共 {total_items} 条)\n\n"
+        elif format_type == "feishu":
+            new_header = f"\n{feishu_separator}\n\n🆕 **RSS 本次新增** (共 {total_items} 条)\n\n"
+        elif format_type == "dingtalk":
+            new_header = f"\n---\n\n🆕 **RSS 本次新增** (共 {total_items} 条)\n\n"
+        elif format_type == "slack":
+            new_header = f"\n\n🆕 *RSS 本次新增* (共 {total_items} 条)\n\n"
+    else:
+        # No separator needed (first region)
+        if format_type in ("wework", "bark"):
+            new_header = f"🆕 **RSS 本次新增** (共 {total_items} 条)\n\n"
+        elif format_type == "telegram":
+            new_header = f"🆕 RSS 本次新增 (共 {total_items} 条)\n\n"
+        elif format_type == "ntfy":
+            new_header = f"🆕 **RSS 本次新增** (共 {total_items} 条)\n\n"
+        elif format_type == "feishu":
+            new_header = f"🆕 **RSS 本次新增** (共 {total_items} 条)\n\n"
+        elif format_type == "dingtalk":
+            new_header = f"🆕 **RSS 本次新增** (共 {total_items} 条)\n\n"
+        elif format_type == "slack":
+            new_header = f"🆕 *RSS 本次新增* (共 {total_items} 条)\n\n"
 
     # Add the new RSS items header
     test_content = current_batch + new_header
@@ -1160,6 +1279,7 @@ def _process_standalone_section(
     batches: List[str],
     timezone: str = "Asia/Shanghai",
     rank_threshold: int = 10,
+    add_separator: bool = True,
 ) -> tuple:
     """Process the standalone display section block
 
@@ -1181,6 +1301,8 @@ def _process_standalone_section(
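         current_batch_has_content: whether the current batch has content
         batches: list of completed batches
         timezone: timezone name
+        rank_threshold: rank highlight threshold
+        add_separator: whether to add a separator before the block (False for the first region)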
 
     Returns:
         (current_batch, current_batch_has_content, batches) tuple
@@ -1199,18 +1321,34 @@ def _process_standalone_section(
     total_rss_items = sum(len(f.get("items", [])) for f in rss_feeds)
     total_items = total_platform_items + total_rss_items
 
-    # Standalone display section header
+    # Standalone display section header (add_separator decides whether a leading separator is added)
     section_header = ""
-    if format_type == "feishu":
-        section_header = f"\n{feishu_separator}\n\n📋 **独立展示区** (共 {total_items} 条)\n\n"
-    elif format_type == "dingtalk":
-        section_header = f"\n---\n\n📋 **独立展示区** (共 {total_items} 条)\n\n"
-    elif format_type == "telegram":
-        section_header = f"\n\n📋 独立展示区 (共 {total_items} 条)\n\n"
-    elif format_type == "slack":
-        section_header = f"\n\n📋 *独立展示区* (共 {total_items} 条)\n\n"
+    if add_separator and current_batch_has_content:
+        # A separator is needed
+        if format_type == "feishu":
+            section_header = f"\n{feishu_separator}\n\n📋 **独立展示区** (共 {total_items} 条)\n\n"
+        elif format_type == "dingtalk":
+            section_header = f"\n---\n\n📋 **独立展示区** (共 {total_items} 条)\n\n"
+        elif format_type in ("wework", "bark"):
+            section_header = f"\n\n\n\n📋 **独立展示区** (共 {total_items} 条)\n\n"
+        elif format_type == "telegram":
+            section_header = f"\n\n📋 独立展示区 (共 {total_items} 条)\n\n"
+        elif format_type == "slack":
+            section_header = f"\n\n📋 *独立展示区* (共 {total_items} 条)\n\n"
+        else:
+            section_header = f"\n\n📋 **独立展示区** (共 {total_items} 条)\n\n"
     else:
-        section_header = f"\n\n📋 **独立展示区** (共 {total_items} 条)\n\n"
+        # No separator needed (first region)
+        if format_type == "feishu":
+            section_header = f"📋 **独立展示区** (共 {total_items} 条)\n\n"
+        elif format_type == "dingtalk":
+            section_header = f"📋 **独立展示区** (共 {total_items} 条)\n\n"
+        elif format_type == "telegram":
+            section_header = f"📋 独立展示区 (共 {total_items} 条)\n\n"
+        elif format_type == "slack":
+            section_header = f"📋 *独立展示区* (共 {total_items} 条)\n\n"
+        else:
+            section_header = f"📋 **独立展示区** (共 {total_items} 条)\n\n"
 
     # Add the block header
     test_content = current_batch + section_header
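
The region loop in `split_content_into_batches` above adds a separator only once an earlier region has actually produced content. A minimal sketch of that rule, stripped of the batching and byte-limit logic, might look like the following. `assemble_regions`, `SEPARATORS`, and `region_bodies` are hypothetical names introduced for illustration and are not part of the commit.

```python
from typing import Dict, List

# Hypothetical separators per channel, loosely mirroring the values in the diff.
SEPARATORS: Dict[str, str] = {
    "dingtalk": "\n---\n\n",
    "wework": "\n\n\n\n",
    "telegram": "\n\n",
}


def assemble_regions(
    region_order: List[str],
    region_bodies: Dict[str, str],
    format_type: str,
) -> str:
    """Join non-empty regions in order; separators appear only between regions."""
    separator = SEPARATORS.get(format_type, "\n\n")
    output = ""
    has_region_content = False  # becomes True once any region has produced text
    for region in region_order:
        body = region_bodies.get(region, "")
        if not body:
            continue  # empty regions emit nothing and do not trigger a separator
        if has_region_content:
            output += separator
        output += body
        has_region_content = True
    return output
```

In this sketch an empty `hotlist` region followed by `rss` yields the RSS block with no leading separator, which is the behaviour the `add_separator` flag is meant to provide.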

+ 36 - 35
trendradar/report/generator.py

@@ -20,6 +20,7 @@ def prepare_report_data(
     rank_threshold: int = 3,
     matches_word_groups_func: Optional[Callable] = None,
     load_frequency_words_func: Optional[Callable] = None,
+    show_new_section: bool = True,
 ) -> Dict:
     """
     Prepare report data
@@ -33,14 +34,15 @@ def prepare_report_data(
         rank_threshold: rank threshold
         matches_word_groups_func: word-group matching function
         load_frequency_words_func: function that loads the frequency words
+        show_new_section: whether to show the new hot topics region
 
     Returns:
         Dict: the prepared report data
     """
     processed_new_titles = []
 
-    # Hide the new items region in incremental mode
-    hide_new_section = mode == "incremental"
+    # Hide the new items region in incremental mode or when the config turns it off
+    hide_new_section = mode == "incremental" or not show_new_section
 
     # Process the new items only when the region is not hidden
     if not hide_new_section:
@@ -144,7 +146,6 @@ def generate_html_report(
     new_titles: Optional[Dict] = None,
     id_to_name: Optional[Dict] = None,
     mode: str = "daily",
-    is_daily_summary: bool = False,
     update_info: Optional[Dict] = None,
     rank_threshold: int = 3,
     output_dir: str = "output",
@@ -153,11 +154,15 @@ def generate_html_report(
     render_html_func: Optional[Callable] = None,
     matches_word_groups_func: Optional[Callable] = None,
     load_frequency_words_func: Optional[Callable] = None,
-    enable_index_copy: bool = True,
 ) -> str:
     """
     Generate the HTML report
 
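+    Each time the HTML is generated, it is:
+    1. Saved as a timestamped snapshot at output/html/<date>/<time>.html (history)
+    2. Copied to output/html/latest/{mode}.html (latest report)
+    3. Copied to output/index.html and to index.html in the root directory (entry point)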
+
     Args:
         stats: list of statistics results
         total_titles: total number of titles
@@ -165,7 +170,6 @@ def generate_html_report(
         new_titles: new titles
         id_to_name: mapping from ID to name
         mode: report mode (daily/incremental/current)
-        is_daily_summary: whether this is the daily summary
         update_info: update information
         rank_threshold: rank threshold
         output_dir: output directory
@@ -174,25 +178,17 @@ def generate_html_report(
         render_html_func: HTML rendering function
         matches_word_groups_func: word-group matching function
         load_frequency_words_func: function that loads the frequency words
-        enable_index_copy: whether to copy to index.html
 
     Returns:
-        str: path of the generated HTML file
+        str: path of the generated HTML file (the timestamped snapshot path)
     """
-    if is_daily_summary:
-        if mode == "current":
-            filename = "当前榜单汇总.html"
-        elif mode == "incremental":
-            filename = "当日增量.html"
-        else:
-            filename = "当日汇总.html"
-    else:
-        filename = f"{time_filename}.html"
+    # Timestamped snapshot filename
+    snapshot_filename = f"{time_filename}.html"
 
-    # Build the output path
-    output_path = Path(output_dir) / date_folder / "html"
-    output_path.mkdir(parents=True, exist_ok=True)
-    file_path = str(output_path / filename)
+    # Build the output path (flattened layout: output/html/<date>/)
+    snapshot_path = Path(output_dir) / "html" / date_folder
+    snapshot_path.mkdir(parents=True, exist_ok=True)
+    snapshot_file = str(snapshot_path / snapshot_filename)
 
     # Prepare the report data
     report_data = prepare_report_data(
@@ -209,27 +205,32 @@ def generate_html_report(
     # Render the HTML content
     if render_html_func:
         html_content = render_html_func(
-            report_data, total_titles, is_daily_summary, mode, update_info
+            report_data, total_titles, mode, update_info
         )
     else:
         # Default: simple HTML
         html_content = f"<html><body><h1>Report</h1><pre>{report_data}</pre></body></html>"
 
-    # Write the file
-    with open(file_path, "w", encoding="utf-8") as f:
+    # 1. Save the timestamped snapshot (history)
+    with open(snapshot_file, "w", encoding="utf-8") as f:
+        f.write(html_content)
+
+    # 2. Copy to html/latest/{mode}.html (latest report)
+    latest_dir = Path(output_dir) / "html" / "latest"
+    latest_dir.mkdir(parents=True, exist_ok=True)
+    latest_file = latest_dir / f"{mode}.html"
+    with open(latest_file, "w", encoding="utf-8") as f:
         f.write(html_content)
 
-    # If this is the daily summary and index copying is enabled
-    if is_daily_summary and enable_index_copy:
-        # Write to the root directory (for GitHub Pages)
-        root_index_path = Path("index.html")
-        with open(root_index_path, "w", encoding="utf-8") as f:
-            f.write(html_content)
+    # 3. Copy to index.html (entry point)
+    # output/index.html (for access through a Docker volume mount)
+    output_index = Path(output_dir) / "index.html"
+    with open(output_index, "w", encoding="utf-8") as f:
+        f.write(html_content)
 
-        # Also write to the output directory (for access through a Docker volume mount)
-        output_index_path = Path(output_dir) / "index.html"
-        Path(output_dir).mkdir(parents=True, exist_ok=True)
-        with open(output_index_path, "w", encoding="utf-8") as f:
-            f.write(html_content)
+    # index.html in the root directory (for GitHub Pages)
+    root_index = Path("index.html")
+    with open(root_index, "w", encoding="utf-8") as f:
+        f.write(html_content)
 
-    return file_path
+    return snapshot_file
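
The three write steps in `generate_html_report` above (snapshot, latest, index) can be sketched with `pathlib` as below. This is an illustration under assumptions, not the commit's code: `write_report_copies` and its `root_dir` parameter are hypothetical, and `root_dir` is added so the root `index.html` does not have to land in the current working directory.

```python
from pathlib import Path
from typing import List


def write_report_copies(
    html_content: str,
    output_dir: Path,
    date_folder: str,
    time_filename: str,
    mode: str,
    root_dir: Path,
) -> List[Path]:
    """Write the same HTML to the snapshot, latest, and index locations."""
    targets = [
        # 1. Timestamped snapshot (history)
        output_dir / "html" / date_folder / f"{time_filename}.html",
        # 2. Latest report for this mode
        output_dir / "html" / "latest" / f"{mode}.html",
        # 3. Entry points: output/index.html and the root index.html
        output_dir / "index.html",
        root_dir / "index.html",
    ]
    for target in targets:
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(html_content, encoding="utf-8")
    return targets
```

Because every run overwrites `latest/{mode}.html` and both `index.html` files, only the snapshot path keeps history; the other three always reflect the most recent run.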

+ 150 - 29
trendradar/report/html.py

@@ -6,44 +6,52 @@ HTML 报告渲染模块
 """
 
 from datetime import datetime
-from typing import Dict, List, Optional, Callable
+from typing import Any, Dict, List, Optional, Callable
 
 from trendradar.report.helpers import html_escape
 from trendradar.utils.time import convert_time_for_display
+from trendradar.ai.formatter import render_ai_analysis_html_rich
 
 
 def render_html_content(
     report_data: Dict,
     total_titles: int,
-    is_daily_summary: bool = False,
     mode: str = "daily",
     update_info: Optional[Dict] = None,
     *,
-    reverse_content_order: bool = False,
+    region_order: Optional[List[str]] = None,
     get_time_func: Optional[Callable[[], datetime]] = None,
     rss_items: Optional[List[Dict]] = None,
     rss_new_items: Optional[List[Dict]] = None,
     display_mode: str = "keyword",
     standalone_data: Optional[Dict] = None,
+    ai_analysis: Optional[Any] = None,
+    show_new_section: bool = True,
 ) -> str:
     """Render the HTML content
 
     Args:
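         report_data: report data dict containing stats, new_titles, failed_ids, total_new_count
         total_titles: total number of news items
-        is_daily_summary: whether this is the daily summary
         mode: report mode ("daily", "current", "incremental")
         update_info: update information (optional)
-        reverse_content_order: whether to reverse the content order (new hot topics first)
+        region_order: list giving the display order of the regions
         get_time_func: function returning the current time (optional, defaults to datetime.now)
         rss_items: list of RSS statistics entries (optional)
         rss_new_items: list of new RSS entries (optional)
         display_mode: display mode ("keyword" = grouped by keyword, "platform" = grouped by platform)
         standalone_data: standalone display section data (optional), containing platforms and rss_feeds
+        ai_analysis: AI analysis result object (optional), an AIAnalysisResult instance
+        show_new_section: whether to show the new hot topics region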
 
     Returns:
         The rendered HTML string
     """
+    # Default region order
+    default_region_order = ["hotlist", "rss", "new_items", "standalone", "ai_analysis"]
+    if region_order is None:
+        region_order = default_region_order
+
     html = """
     <!DOCTYPE html>
     <html>
@@ -318,10 +326,21 @@ def render_html_content(
                 color: #7c3aed;
             }
 
+            /* Shared section divider style */
+            .section-divider {
+                margin-top: 32px;
+                padding-top: 24px;
+                border-top: 2px solid #e5e7eb;
+            }
+
+            /* Hotlist statistics section style */
+            .hotlist-section {
+                /* No border by default; section-divider adds it dynamically */
+            }
+
             .new-section {
                 margin-top: 40px;
                 padding-top: 24px;
-                border-top: 2px solid #f0f0f0;
             }
 
             .new-section-title {
@@ -487,7 +506,6 @@ def render_html_content(
             .rss-section {
                 margin-top: 32px;
                 padding-top: 24px;
-                border-top: 2px solid #e5e7eb;
             }
 
             .rss-section-header {
@@ -600,7 +618,6 @@ def render_html_content(
             .standalone-section {
                 margin-top: 32px;
                 padding-top: 24px;
-                border-top: 2px solid #e5e7eb;
             }
 
             .standalone-section-header {
@@ -613,7 +630,7 @@ def render_html_content(
             .standalone-section-title {
                 font-size: 18px;
                 font-weight: 600;
-                color: #4f46e5;
+                color: #059669;
             }
 
             .standalone-section-count {
@@ -649,6 +666,72 @@ def render_html_content(
                 font-size: 13px;
                 font-weight: 500;
             }
+
+            /* AI analysis block style */
+            .ai-section {
+                margin-top: 32px;
+                padding: 24px;
+                background: linear-gradient(135deg, #f0f9ff 0%, #e0f2fe 100%);
+                border-radius: 12px;
+                border: 1px solid #bae6fd;
+            }
+
+            .ai-section-header {
+                display: flex;
+                align-items: center;
+                gap: 10px;
+                margin-bottom: 20px;
+            }
+
+            .ai-section-title {
+                font-size: 18px;
+                font-weight: 600;
+                color: #0369a1;
+            }
+
+            .ai-section-badge {
+                background: #0ea5e9;
+                color: white;
+                font-size: 11px;
+                font-weight: 600;
+                padding: 3px 8px;
+                border-radius: 4px;
+            }
+
+            .ai-block {
+                margin-bottom: 16px;
+                padding: 16px;
+                background: white;
+                border-radius: 8px;
+                box-shadow: 0 1px 3px rgba(0,0,0,0.05);
+            }
+
+            .ai-block:last-child {
+                margin-bottom: 0;
+            }
+
+            .ai-block-title {
+                font-size: 14px;
+                font-weight: 600;
+                color: #0369a1;
+                margin-bottom: 8px;
+            }
+
+            .ai-block-content {
+                font-size: 14px;
+                line-height: 1.6;
+                color: #334155;
+                white-space: pre-wrap;
+            }
+
+            .ai-error {
+                padding: 16px;
+                background: #fef2f2;
+                border: 1px solid #fecaca;
+                border-radius: 8px;
+                color: #991b1b;
+                font-size: 14px;
+            }
         </style>
     </head>
     <body>
@@ -664,16 +747,13 @@ def render_html_content(
                         <span class="info-label">报告类型</span>
                         <span class="info-value">"""
 
-    # Render the report type label
-    if is_daily_summary:
-        if mode == "current":
-            html += "当前榜单"
-        elif mode == "incremental":
-            html += "增量模式"
-        else:
-            html += "当日汇总"
+    # Render the report type label (taken directly from mode)
+    if mode == "current":
+        html += "当前榜单"
+    elif mode == "incremental":
+        html += "增量分析"
     else:
-        html += "实时分析"
+        html += "全天汇总"
 
     html += """</span>
                     </div>
@@ -837,9 +917,15 @@ def render_html_content(
             stats_html += """
                 </div>"""
 
+    # Wrap the hotlist statistics in an outer container
+    if stats_html:
+        stats_html = f"""
+                <div class="hotlist-section">{stats_html}
+                </div>"""
+
     # Build the HTML for the new items region
     new_titles_html = ""
-    if report_data["new_titles"]:
+    if show_new_section and report_data["new_titles"]:
         new_titles_html += f"""
                 <div class="new-section">
                     <div class="new-section-title">本次新增热点 (共 {report_data['total_new_count']} 条)</div>"""
@@ -1062,7 +1148,7 @@ def render_html_content(
         standalone_html = f"""
                 <div class="standalone-section">
                     <div class="standalone-section-header">
-                        <div class="standalone-section-title">📋 独立展示区</div>
+                        <div class="standalone-section-title">独立展示区</div>
                         <div class="standalone-section-count">{total_count} 条</div>
                     </div>"""
 
@@ -1231,15 +1317,50 @@ def render_html_content(
     # Build the standalone display section HTML
     standalone_html = render_standalone_html(standalone_data)
 
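-    # Choose the content order based on the config (consistent with the push logic)
-    if reverse_content_order:
-        # New items first, statistics after
-        # Order: new hotlist items → new RSS items → hotlist statistics → RSS statistics → standalone display section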
-        html += new_titles_html + rss_new_html + stats_html + rss_stats_html + standalone_html
-    else:
-        # Default: statistics first, new items after
-        # Order: hotlist statistics → RSS statistics → new hotlist items → new RSS items → standalone display section
-        html += stats_html + rss_stats_html + new_titles_html + rss_new_html + standalone_html
+    # Build the AI analysis HTML
+    ai_html = render_ai_analysis_html_rich(ai_analysis) if ai_analysis else ""
+
+    # Map each region to its content
+    region_contents = {
+        "hotlist": stats_html,
+        "rss": rss_stats_html,
+        "new_items": (new_titles_html, rss_new_html),  # tuple; the two parts are handled separately
+        "standalone": standalone_html,
+        "ai_analysis": ai_html,
+    }
+
+    def add_section_divider(content: str) -> str:
+        """Add the section-divider class to the content's outer div"""
+        if not content or 'class="' not in content:
+            return content
+        first_class_pos = content.find('class="')
+        if first_class_pos != -1:
+            insert_pos = first_class_pos + len('class="')
+            return content[:insert_pos] + "section-divider " + content[insert_pos:]
+        return content
+
+    # Assemble the content in region_order, adding dividers dynamically
+    has_previous_content = False
+    for region in region_order:
+        content = region_contents.get(region, "")
+        if region == "new_items":
+            # Special handling for new_items (it holds both new hotlist items and new RSS items)
+            new_html, rss_new = content
+            if new_html:
+                if has_previous_content:
+                    new_html = add_section_divider(new_html)
+                html += new_html
+                has_previous_content = True
+            if rss_new:
+                if has_previous_content:
+                    rss_new = add_section_divider(rss_new)
+                html += rss_new
+                has_previous_content = True
+        elif content:
+            if has_previous_content:
+                content = add_section_divider(content)
+            html += content
+            has_previous_content = True
 
     html += """
             </div>
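
The ordering logic in this hunk can be tried in isolation. The following is a minimal sketch, assuming a standalone helper named `assemble_regions` (a hypothetical name, not part of the commit) and made-up CSS class names in the demo data. It folds the `new_items` tuple and the plain-string regions into one code path, which is a simplification of the branching shown in the diff rather than the author's exact code.

```python
from typing import Dict, List, Tuple, Union

# A region is either one HTML string or a tuple of HTML strings (as "new_items" is).
Region = Union[str, Tuple[str, ...]]


def add_section_divider(content: str) -> str:
    """Prefix the first class attribute in `content` with "section-divider "."""
    marker = 'class="'
    pos = content.find(marker)
    if pos == -1:
        return content  # empty string or no class attribute: nothing to tag
    insert_pos = pos + len(marker)
    return content[:insert_pos] + "section-divider " + content[insert_pos:]


def assemble_regions(region_order: List[str], region_contents: Dict[str, Region]) -> str:
    """Concatenate non-empty regions in order; every block after the first gets a divider."""
    html = ""
    has_previous_content = False
    for region in region_order:
        content = region_contents.get(region, "")
        # Normalize to a tuple so single-part and multi-part regions share one loop.
        parts = content if isinstance(content, tuple) else (content,)
        for part in parts:
            if not part:
                continue
            if has_previous_content:
                part = add_section_divider(part)
            html += part
            has_previous_content = True
    return html


# Hypothetical demo data: unknown region names and empty parts are skipped.
demo = assemble_regions(
    ["ai_analysis", "new_items", "hotlist", "unknown"],
    {
        "hotlist": '<div class="stats">hot</div>',
        "new_items": ("", '<div class="rss-new">rss</div>'),
        "ai_analysis": '<div class="ai">ai</div>',
    },
)
print(demo)
```

Because the divider is added only when something was already emitted, the first visible block never carries `section-divider`, whichever region it comes from.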

+ 7 - 0
trendradar/storage/__init__.py

@@ -12,9 +12,12 @@ from trendradar.storage.base import (
     StorageBackend,
     NewsItem,
     NewsData,
+    RSSItem,
+    RSSData,
     convert_crawl_results_to_news_data,
     convert_news_data_to_results,
 )
+from trendradar.storage.sqlite_mixin import SQLiteStorageMixin
 from trendradar.storage.local import LocalStorageBackend
 from trendradar.storage.manager import StorageManager, get_storage_manager
 
@@ -31,6 +34,10 @@ __all__ = [
     "StorageBackend",
     "NewsItem",
     "NewsData",
+    "RSSItem",
+    "RSSData",
+    # Mixin
+    "SQLiteStorageMixin",
     # Conversion functions
     "convert_crawl_results_to_news_data",
     "convert_news_data_to_results",

+ 5 - 0
trendradar/storage/base.py

@@ -27,6 +27,9 @@ class NewsItem:
     first_time: str = ""                # First-seen time
     last_time: str = ""                 # Last-seen time
     count: int = 1                      # Number of appearances
+    rank_timeline: List[Dict[str, Any]] = field(default_factory=list)  # Full rank timeline
+                                        # Format: [{"time": "09:30", "rank": 1}, {"time": "10:00", "rank": 2}, ...]
+                                        # A rank of None means off the list: [{"time": "11:00", "rank": None}]
 
     def to_dict(self) -> Dict[str, Any]:
         """Convert to a dict"""
@@ -42,6 +45,7 @@ class NewsItem:
             "first_time": self.first_time,
             "last_time": self.last_time,
             "count": self.count,
+            "rank_timeline": self.rank_timeline,
         }
 
     @classmethod
@@ -59,6 +63,7 @@ class NewsItem:
             first_time=data.get("first_time", ""),
             last_time=data.get("last_time", ""),
             count=data.get("count", 1),
+            rank_timeline=data.get("rank_timeline", []),
         )
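
To make the `rank_timeline` format concrete, here is a minimal sketch under stated assumptions: `TimelineItem` is a hypothetical cut-down stand-in for `NewsItem`, and `best_rank` is an illustrative helper that does not appear in the commit. It shows the dict round trip, the empty-list default for payloads that predate the field, and how a `None` rank (off the list) can be skipped.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class TimelineItem:
    """Hypothetical stand-in for NewsItem; only the fields needed here."""
    title: str
    rank_timeline: List[Dict[str, Any]] = field(default_factory=list)

    def to_dict(self) -> Dict[str, Any]:
        return {"title": self.title, "rank_timeline": self.rank_timeline}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "TimelineItem":
        # Older payloads have no "rank_timeline" key, so fall back to an empty list.
        return cls(
            title=data.get("title", ""),
            rank_timeline=data.get("rank_timeline", []),
        )


def best_rank(timeline: List[Dict[str, Any]]) -> Optional[int]:
    """Smallest rank seen, ignoring off-list points (rank is None)."""
    ranks = [point["rank"] for point in timeline if point["rank"] is not None]
    return min(ranks) if ranks else None


item = TimelineItem(
    title="example",
    rank_timeline=[
        {"time": "09:30", "rank": 1},
        {"time": "10:00", "rank": 2},
        {"time": "11:00", "rank": None},  # dropped off the list
    ],
)
restored = TimelineItem.from_dict(item.to_dict())
legacy = TimelineItem.from_dict({"title": "old"})
print(best_rank(restored.rank_timeline), legacy.rank_timeline)
```

Using `field(default_factory=list)` gives each instance its own list, so appending to one item's timeline does not leak into another.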
 
 

+ 91 - 996
trendradar/storage/local.py

@@ -14,15 +14,15 @@ from pathlib import Path
 from typing import Dict, List, Optional
 
 from trendradar.storage.base import StorageBackend, NewsItem, NewsData, RSSItem, RSSData
+from trendradar.storage.sqlite_mixin import SQLiteStorageMixin
 from trendradar.utils.time import (
     get_configured_time,
     format_date_folder,
     format_time_filename,
 )
-from trendradar.utils.url import normalize_url
 
 
-class LocalStorageBackend(StorageBackend):
+class LocalStorageBackend(SQLiteStorageMixin, StorageBackend):
     """
     Local storage backend
 
@@ -62,6 +62,10 @@ class LocalStorageBackend(StorageBackend):
     def supports_txt(self) -> bool:
         return self.enable_txt
 
+    # ========================================
+    # Implementation of the SQLiteStorageMixin abstract methods
+    # ========================================
+
     def _get_configured_time(self) -> datetime:
         """Get the current time in the configured timezone"""
         return get_configured_time(self.timezone)
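
The refactor moves the SQL logic into `SQLiteStorageMixin` and leaves the backend to supply hooks such as `_get_configured_time` and the connection getter. The mixin's source is not shown in this section, so the following is only a sketch of the general pattern with hypothetical names (`SQLiteMixinSketch`, `BackendBase`, `InMemoryBackend`, `_save_titles_impl`) and a toy one-column table.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Iterable, Tuple


class BackendBase(ABC):
    """Hypothetical public interface, standing in for StorageBackend."""

    @abstractmethod
    def save_titles(self, titles: Iterable[str]) -> bool: ...


class SQLiteMixinSketch:
    """Hypothetical mixin: shared SQL logic that relies on a hook from the host class."""

    def _get_connection(self) -> sqlite3.Connection:
        raise NotImplementedError("the host backend must provide a connection")

    def _save_titles_impl(self, titles: Iterable[str]) -> Tuple[bool, int]:
        """Insert unseen titles; return (success, number of new rows)."""
        try:
            conn = self._get_connection()
            conn.execute("CREATE TABLE IF NOT EXISTS items (title TEXT PRIMARY KEY)")
            new_count = 0
            for title in titles:
                row = conn.execute(
                    "SELECT 1 FROM items WHERE title = ?", (title,)
                ).fetchone()
                if row is None:
                    conn.execute("INSERT INTO items (title) VALUES (?)", (title,))
                    new_count += 1
            conn.commit()
            return True, new_count
        except sqlite3.Error:
            return False, 0


class InMemoryBackend(SQLiteMixinSketch, BackendBase):
    """Mixin listed first, so its helpers come before the interface in the MRO."""

    def __init__(self) -> None:
        self._conn = sqlite3.connect(":memory:")

    def _get_connection(self) -> sqlite3.Connection:
        return self._conn

    def save_titles(self, titles: Iterable[str]) -> bool:
        # Thin wrapper: delegate to the mixin, keep backend-specific logging here.
        success, new_count = self._save_titles_impl(titles)
        if success:
            print(f"[in-memory] saved: {new_count} new")
        return success


backend = InMemoryBackend()
backend.save_titles(["a", "b"])
```

The public method stays a few lines long because counting and SQL live in the `_impl` helper; a second backend would only need its own `_get_connection` and log prefix.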
@@ -115,510 +119,112 @@ class LocalStorageBackend(StorageBackend):
 
         return self._db_connections[db_path]
 
-    def _get_schema_path(self, db_type: str = "news") -> Path:
-        """
-        Get the path of the schema.sql file
-
-        Args:
-            db_type: database type ("news" or "rss")
-
-        Returns:
-            Path to the schema file
-        """
-        if db_type == "rss":
-            return Path(__file__).parent / "rss_schema.sql"
-        return Path(__file__).parent / "schema.sql"
-
-    def _init_tables(self, conn: sqlite3.Connection, db_type: str = "news") -> None:
-        """
-        Initialize the database tables from schema.sql
-
-        Args:
-            conn: database connection
-            db_type: database type ("news" or "rss")
-        """
-        schema_path = self._get_schema_path(db_type)
-
-        if schema_path.exists():
-            with open(schema_path, "r", encoding="utf-8") as f:
-                schema_sql = f.read()
-            conn.executescript(schema_sql)
-        else:
-            raise FileNotFoundError(f"Schema file not found: {schema_path}")
-
-        conn.commit()
+    # ========================================
+    # StorageBackend interface implementation (delegates to the mixin)
+    # ========================================
 
     def save_news_data(self, data: NewsData) -> bool:
-        """
-        Save news data to SQLite (URL is the unique key; title changes are detected)
+        """Save news data to SQLite"""
+        db_path = self._get_db_path(data.date)
+        if not db_path.exists():
+            # Make sure the directory exists
+            db_path.parent.mkdir(parents=True, exist_ok=True)
 
-        Args:
-            data: news data
-
-        Returns:
-            Whether the save succeeded
-        """
-        try:
-            conn = self._get_connection(data.date)
-            cursor = conn.cursor()
-
-            # Get the current time in the configured timezone
-            now_str = self._get_configured_time().strftime("%Y-%m-%d %H:%M:%S")
-
-            # First sync platform info into the platforms table
-            for source_id, source_name in data.id_to_name.items():
-                cursor.execute("""
-                    INSERT INTO platforms (id, name, updated_at)
-                    VALUES (?, ?, ?)
-                    ON CONFLICT(id) DO UPDATE SET
-                        name = excluded.name,
-                        updated_at = excluded.updated_at
-                """, (source_id, source_name, now_str))
-
-            # Counters
-            new_count = 0
-            updated_count = 0
-            title_changed_count = 0
-            success_sources = []
-
-            for source_id, news_list in data.items.items():
-                success_sources.append(source_id)
-
-                for item in news_list:
-                    try:
-                        # Normalize the URL (strip dynamic parameters such as Weibo's band_rank)
-                        normalized_url = normalize_url(item.url, source_id) if item.url else ""
-
-                        # Check whether it already exists (by normalized URL + platform_id)
-                        if normalized_url:
-                            cursor.execute("""
-                                SELECT id, title FROM news_items
-                                WHERE url = ? AND platform_id = ?
-                            """, (normalized_url, source_id))
-                            existing = cursor.fetchone()
-
-                            if existing:
-                                # Already exists; update the record
-                                existing_id, existing_title = existing
-
-                                # Check whether the title changed
-                                if existing_title != item.title:
-                                    # Record the title change
-                                    cursor.execute("""
-                                        INSERT INTO title_changes
-                                        (news_item_id, old_title, new_title, changed_at)
-                                        VALUES (?, ?, ?, ?)
-                                    """, (existing_id, existing_title, item.title, now_str))
-                                    title_changed_count += 1
-
-                                # Record rank history
-                                cursor.execute("""
-                                    INSERT INTO rank_history
-                                    (news_item_id, rank, crawl_time, created_at)
-                                    VALUES (?, ?, ?, ?)
-                                """, (existing_id, item.rank, data.crawl_time, now_str))
-
-                                # Update the existing record
-                                cursor.execute("""
-                                    UPDATE news_items SET
-                                        title = ?,
-                                        rank = ?,
-                                        mobile_url = ?,
-                                        last_crawl_time = ?,
-                                        crawl_count = crawl_count + 1,
-                                        updated_at = ?
-                                    WHERE id = ?
-                                """, (item.title, item.rank, item.mobile_url,
-                                      data.crawl_time, now_str, existing_id))
-                                updated_count += 1
-                            else:
-                                # Not found; insert a new record (storing the normalized URL)
-                                cursor.execute("""
-                                    INSERT INTO news_items
-                                    (title, platform_id, rank, url, mobile_url,
-                                     first_crawl_time, last_crawl_time, crawl_count,
-                                     created_at, updated_at)
-                                    VALUES (?, ?, ?, ?, ?, ?, ?, 1, ?, ?)
-                                """, (item.title, source_id, item.rank, normalized_url,
-                                      item.mobile_url, data.crawl_time, data.crawl_time,
-                                      now_str, now_str))
-                                new_id = cursor.lastrowid
-                                # Record the initial rank
-                                cursor.execute("""
-                                    INSERT INTO rank_history
-                                    (news_item_id, rank, crawl_time, created_at)
-                                    VALUES (?, ?, ?, ?)
-                                """, (new_id, item.rank, data.crawl_time, now_str))
-                                new_count += 1
-                        else:
-                            # Empty URL: insert directly (no deduplication)
-                            cursor.execute("""
-                                INSERT INTO news_items
-                                (title, platform_id, rank, url, mobile_url,
-                                 first_crawl_time, last_crawl_time, crawl_count,
-                                 created_at, updated_at)
-                                VALUES (?, ?, ?, ?, ?, ?, ?, 1, ?, ?)
-                            """, (item.title, source_id, item.rank, "",
-                                  item.mobile_url, data.crawl_time, data.crawl_time,
-                                  now_str, now_str))
-                            new_id = cursor.lastrowid
-                            # Record the initial rank
-                            cursor.execute("""
-                                INSERT INTO rank_history
-                                (news_item_id, rank, crawl_time, created_at)
-                                VALUES (?, ?, ?, ?)
-                            """, (new_id, item.rank, data.crawl_time, now_str))
-                            new_count += 1
-
-                    except sqlite3.Error as e:
-                        print(f"保存新闻条目失败 [{item.title[:30]}...]: {e}")
-
-            total_items = new_count + updated_count
-
-            # Record crawl info
-            cursor.execute("""
-                INSERT OR REPLACE INTO crawl_records
-                (crawl_time, total_items, created_at)
-                VALUES (?, ?, ?)
-            """, (data.crawl_time, total_items, now_str))
-
-            # Get the ID of the crawl_record just inserted
-            cursor.execute("""
-                SELECT id FROM crawl_records WHERE crawl_time = ?
-            """, (data.crawl_time,))
-            record_row = cursor.fetchone()
-            if record_row:
-                crawl_record_id = record_row[0]
-
-                # Record the successful sources
-                for source_id in success_sources:
-                    cursor.execute("""
-                        INSERT OR REPLACE INTO crawl_source_status
-                        (crawl_record_id, platform_id, status)
-                        VALUES (?, ?, 'success')
-                    """, (crawl_record_id, source_id))
-
-                # Record the failed sources
-                for failed_id in data.failed_ids:
-                    # Make sure failed platforms are also in the platforms table
-                    cursor.execute("""
-                        INSERT OR IGNORE INTO platforms (id, name, updated_at)
-                        VALUES (?, ?, ?)
-                    """, (failed_id, failed_id, now_str))
-
-                    cursor.execute("""
-                        INSERT OR REPLACE INTO crawl_source_status
-                        (crawl_record_id, platform_id, status)
-                        VALUES (?, ?, 'failed')
-                    """, (crawl_record_id, failed_id))
-
-            conn.commit()
+        success, new_count, updated_count, title_changed_count, off_list_count = \
+            self._save_news_data_impl(data, "[本地存储]")
 
+        if success:
             # Print the detailed storage statistics log
             log_parts = [f"[本地存储] 处理完成:新增 {new_count} 条"]
             if updated_count > 0:
                 log_parts.append(f"更新 {updated_count} 条")
             if title_changed_count > 0:
                 log_parts.append(f"标题变更 {title_changed_count} 条")
+            if off_list_count > 0:
+                log_parts.append(f"脱榜 {off_list_count} 条")
             print(",".join(log_parts))
 
-            return True
-
-        except Exception as e:
-            print(f"[本地存储] 保存失败: {e}")
-            return False
+        return success
 
     def get_today_all_data(self, date: Optional[str] = None) -> Optional[NewsData]:
-        """
-        Get all news data for the given date (merged)
+        """Get all news data for the given date (merged)"""
+        db_path = self._get_db_path(date)
+        if not db_path.exists():
+            return None
+        return self._get_today_all_data_impl(date)
 
-        Args:
-            date: date string, defaults to today
+    def get_latest_crawl_data(self, date: Optional[str] = None) -> Optional[NewsData]:
+        """Get the data from the latest crawl"""
+        db_path = self._get_db_path(date)
+        if not db_path.exists():
+            return None
+        return self._get_latest_crawl_data_impl(date)
 
-        Returns:
-            Merged news data
-        """
-        try:
-            db_path = self._get_db_path(date)
-            if not db_path.exists():
-                return None
-
-            conn = self._get_connection(date)
-            cursor = conn.cursor()
-
-            # Fetch all news data (including id, used to look up rank history)
-            cursor.execute("""
-                SELECT n.id, n.title, n.platform_id, p.name as platform_name,
-                       n.rank, n.url, n.mobile_url,
-                       n.first_crawl_time, n.last_crawl_time, n.crawl_count
-                FROM news_items n
-                LEFT JOIN platforms p ON n.platform_id = p.id
-                ORDER BY n.platform_id, n.last_crawl_time
-            """)
-
-            rows = cursor.fetchall()
-            if not rows:
-                return None
-
-            # Collect every news_item_id
-            news_ids = [row[0] for row in rows]
-
-            # Query rank history in one batch
-            rank_history_map: Dict[int, List[int]] = {}
-            if news_ids:
-                placeholders = ",".join("?" * len(news_ids))
-                cursor.execute(f"""
-                    SELECT news_item_id, rank FROM rank_history
-                    WHERE news_item_id IN ({placeholders})
-                    ORDER BY news_item_id, crawl_time
-                """, news_ids)
-                for rh_row in cursor.fetchall():
-                    news_id, rank = rh_row[0], rh_row[1]
-                    if news_id not in rank_history_map:
-                        rank_history_map[news_id] = []
-                    if rank not in rank_history_map[news_id]:
-                        rank_history_map[news_id].append(rank)
-
-            # Group by platform_id
-            items: Dict[str, List[NewsItem]] = {}
-            id_to_name: Dict[str, str] = {}
-            crawl_date = self._format_date_folder(date)
-
-            for row in rows:
-                news_id = row[0]
-                platform_id = row[2]
-                title = row[1]
-                platform_name = row[3] or platform_id
-
-                id_to_name[platform_id] = platform_name
-
-                if platform_id not in items:
-                    items[platform_id] = []
-
-                # Get the rank history; fall back to the current rank if there is none
-                ranks = rank_history_map.get(news_id, [row[4]])
-
-                items[platform_id].append(NewsItem(
-                    title=title,
-                    source_id=platform_id,
-                    source_name=platform_name,
-                    rank=row[4],
-                    url=row[5] or "",
-                    mobile_url=row[6] or "",
-                    crawl_time=row[8],  # last_crawl_time
-                    ranks=ranks,
-                    first_time=row[7],  # first_crawl_time
-                    last_time=row[8],   # last_crawl_time
-                    count=row[9],       # crawl_count
-                ))
-
-            final_items = items
-
-            # Get the failed sources
-            cursor.execute("""
-                SELECT DISTINCT css.platform_id
-                FROM crawl_source_status css
-                JOIN crawl_records cr ON css.crawl_record_id = cr.id
-                WHERE css.status = 'failed'
-            """)
-            failed_ids = [row[0] for row in cursor.fetchall()]
-
-            # Get the latest crawl time
-            cursor.execute("""
-                SELECT crawl_time FROM crawl_records
-                ORDER BY crawl_time DESC
-                LIMIT 1
-            """)
-
-            time_row = cursor.fetchone()
-            crawl_time = time_row[0] if time_row else self._format_time_filename()
-
-            return NewsData(
-                date=crawl_date,
-                crawl_time=crawl_time,
-                items=final_items,
-                id_to_name=id_to_name,
-                failed_ids=failed_ids,
-            )
+    def detect_new_titles(self, current_data: NewsData) -> Dict[str, Dict]:
+        """Detect newly added titles"""
+        return self._detect_new_titles_impl(current_data)
 
-        except Exception as e:
-            print(f"[本地存储] 读取数据失败: {e}")
-            return None
+    def is_first_crawl_today(self, date: Optional[str] = None) -> bool:
+        """Check whether this is the first crawl of the day"""
+        db_path = self._get_db_path(date)
+        if not db_path.exists():
+            return True
+        return self._is_first_crawl_today_impl(date)
 
-    def get_latest_crawl_data(self, date: Optional[str] = None) -> Optional[NewsData]:
-        """
-        Get the data from the latest crawl
+    def get_crawl_times(self, date: Optional[str] = None) -> List[str]:
+        """Get the list of all crawl times for the given date"""
+        db_path = self._get_db_path(date)
+        if not db_path.exists():
+            return []
+        return self._get_crawl_times_impl(date)
 
-        Args:
-            date: date string, defaults to today
+    def has_pushed_today(self, date: Optional[str] = None) -> bool:
+        """Check whether a push was already sent for the given date"""
+        return self._has_pushed_today_impl(date)
 
-        Returns:
-            News data from the latest crawl
-        """
-        try:
-            db_path = self._get_db_path(date)
-            if not db_path.exists():
-                return None
-
-            conn = self._get_connection(date)
-            cursor = conn.cursor()
-
-            # Get the latest crawl time
-            cursor.execute("""
-                SELECT crawl_time FROM crawl_records
-                ORDER BY crawl_time DESC
-                LIMIT 1
-            """)
-
-            time_row = cursor.fetchone()
-            if not time_row:
-                return None
-
-            latest_time = time_row[0]
-
-            # Fetch the news data for that time (including id, used to look up rank history)
-            cursor.execute("""
-                SELECT n.id, n.title, n.platform_id, p.name as platform_name,
-                       n.rank, n.url, n.mobile_url,
-                       n.first_crawl_time, n.last_crawl_time, n.crawl_count
-                FROM news_items n
-                LEFT JOIN platforms p ON n.platform_id = p.id
-                WHERE n.last_crawl_time = ?
-            """, (latest_time,))
-
-            rows = cursor.fetchall()
-            if not rows:
-                return None
-
-            # Collect every news_item_id
-            news_ids = [row[0] for row in rows]
-
-            # Query rank history in one batch
-            rank_history_map: Dict[int, List[int]] = {}
-            if news_ids:
-                placeholders = ",".join("?" * len(news_ids))
-                cursor.execute(f"""
-                    SELECT news_item_id, rank FROM rank_history
-                    WHERE news_item_id IN ({placeholders})
-                    ORDER BY news_item_id, crawl_time
-                """, news_ids)
-                for rh_row in cursor.fetchall():
-                    news_id, rank = rh_row[0], rh_row[1]
-                    if news_id not in rank_history_map:
-                        rank_history_map[news_id] = []
-                    if rank not in rank_history_map[news_id]:
-                        rank_history_map[news_id].append(rank)
-
-            items: Dict[str, List[NewsItem]] = {}
-            id_to_name: Dict[str, str] = {}
-            crawl_date = self._format_date_folder(date)
-
-            for row in rows:
-                news_id = row[0]
-                platform_id = row[2]
-                platform_name = row[3] or platform_id
-                id_to_name[platform_id] = platform_name
-
-                if platform_id not in items:
-                    items[platform_id] = []
-
-                # Get the rank history; fall back to the current rank if there is none
-                ranks = rank_history_map.get(news_id, [row[4]])
-
-                items[platform_id].append(NewsItem(
-                    title=row[1],
-                    source_id=platform_id,
-                    source_name=platform_name,
-                    rank=row[4],
-                    url=row[5] or "",
-                    mobile_url=row[6] or "",
-                    crawl_time=row[8],  # last_crawl_time
-                    ranks=ranks,
-                    first_time=row[7],  # first_crawl_time
-                    last_time=row[8],   # last_crawl_time
-                    count=row[9],       # crawl_count
-                ))
-
-            # Get the failed sources (for the latest crawl only)
-            cursor.execute("""
-                SELECT css.platform_id
-                FROM crawl_source_status css
-                JOIN crawl_records cr ON css.crawl_record_id = cr.id
-                WHERE cr.crawl_time = ? AND css.status = 'failed'
-            """, (latest_time,))
-
-            failed_ids = [row[0] for row in cursor.fetchall()]
-
-            return NewsData(
-                date=crawl_date,
-                crawl_time=latest_time,
-                items=items,
-                id_to_name=id_to_name,
-                failed_ids=failed_ids,
-            )
+    def record_push(self, report_type: str, date: Optional[str] = None) -> bool:
+        """Record a push"""
+        success = self._record_push_impl(report_type, date)
+        if success:
+            now_str = self._get_configured_time().strftime("%Y-%m-%d %H:%M:%S")
+            print(f"[本地存储] 推送记录已保存: {report_type} at {now_str}")
+        return success
 
-        except Exception as e:
-            print(f"[本地存储] 获取最新数据失败: {e}")
-            return None
+    # ========================================
+    # RSS data storage methods
+    # ========================================
 
-    def detect_new_titles(self, current_data: NewsData) -> Dict[str, Dict]:
-        """
-        Detect newly added titles
+    def save_rss_data(self, data: RSSData) -> bool:
+        """Save RSS data to SQLite"""
+        success, new_count, updated_count = self._save_rss_data_impl(data, "[本地存储]")
 
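-        Compares the current crawl with the historical data to find newly added titles.
-        Key rule: a title counts as new only if it never appeared in any earlier batch.
+        if success:
+            # Print the statistics log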
+            log_parts = [f"[本地存储] RSS 处理完成:新增 {new_count} 条"]
+            if updated_count > 0:
+                log_parts.append(f"更新 {updated_count} 条")
+            print(",".join(log_parts))
 
-        Args:
-            current_data: data from the current crawl
+        return success
 
-        Returns:
-            Newly added titles {source_id: {title: NewsItem}}
-        """
-        try:
-            # Get the historical data
-            historical_data = self.get_today_all_data(current_data.date)
-
-            if not historical_data:
-                # No historical data, so everything is new
-                new_titles = {}
-                for source_id, news_list in current_data.items.items():
-                    new_titles[source_id] = {item.title: item for item in news_list}
-                return new_titles
-
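-            # Get the time of the current batch
-            current_time = current_data.crawl_time
-
-            # Collect historical titles (those with first_time < current_time)
-            # This correctly handles one title producing several records because its URL changed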
-            historical_titles: Dict[str, set] = {}
-            for source_id, news_list in historical_data.items.items():
-                historical_titles[source_id] = set()
-                for item in news_list:
-                    first_time = getattr(item, 'first_time', item.crawl_time)
-                    if first_time < current_time:
-                        historical_titles[source_id].add(item.title)
-
-            # Check whether there is any historical data
-            has_historical_data = any(len(titles) > 0 for titles in historical_titles.values())
-            if not has_historical_data:
-                # First crawl: the notion of "new" does not apply
-                return {}
-
-            # Detect new titles
-            new_titles = {}
-            for source_id, news_list in current_data.items.items():
-                hist_set = historical_titles.get(source_id, set())
-                for item in news_list:
-                    if item.title not in hist_set:
-                        if source_id not in new_titles:
-                            new_titles[source_id] = {}
-                        new_titles[source_id][item.title] = item
-
-            return new_titles
+    def get_rss_data(self, date: Optional[str] = None) -> Optional[RSSData]:
+        """Get all RSS data for the given date"""
+        return self._get_rss_data_impl(date)
 
-        except Exception as e:
-            print(f"[本地存储] 检测新标题失败: {e}")
-            return {}
+    def detect_new_rss_items(self, current_data: RSSData) -> Dict[str, List[RSSItem]]:
+        """Detect newly added RSS items"""
+        return self._detect_new_rss_items_impl(current_data)
+
+    def get_latest_rss_data(self, date: Optional[str] = None) -> Optional[RSSData]:
+        """Get the RSS data from the latest crawl"""
+        db_path = self._get_db_path(date, db_type="rss")
+        if not db_path.exists():
+            return None
+        return self._get_latest_rss_data_impl(date)
+
+    # ========================================
+    # Local-only feature: TXT/HTML snapshots
+    # ========================================
 
     def save_txt_snapshot(self, data: NewsData) -> Optional[str]:
         """
@@ -712,67 +318,9 @@ class LocalStorageBackend(StorageBackend):
             print(f"[本地存储] 保存 HTML 报告失败: {e}")
             return None
 
-    def is_first_crawl_today(self, date: Optional[str] = None) -> bool:
-        """
-        Check whether this is the first crawl of the day
-
-        Args:
-            date: date string, defaults to today
-
-        Returns:
-            Whether this is the first crawl
-        """
-        try:
-            db_path = self._get_db_path(date)
-            if not db_path.exists():
-                return True
-
-            conn = self._get_connection(date)
-            cursor = conn.cursor()
-
-            cursor.execute("""
-                SELECT COUNT(*) as count FROM crawl_records
-            """)
-
-            row = cursor.fetchone()
-            count = row[0] if row else 0
-
-            # With one record or none, treat this as the first crawl
-            return count <= 1
-
-        except Exception as e:
-            print(f"[本地存储] 检查首次抓取失败: {e}")
-            return True
-
-    def get_crawl_times(self, date: Optional[str] = None) -> List[str]:
-        """
-        Get the list of all crawl times for the given date
-
-        Args:
-            date: date string, defaults to today
-
-        Returns:
-            List of crawl times (sorted by time)
-        """
-        try:
-            db_path = self._get_db_path(date)
-            if not db_path.exists():
-                return []
-
-            conn = self._get_connection(date)
-            cursor = conn.cursor()
-
-            cursor.execute("""
-                SELECT crawl_time FROM crawl_records
-                ORDER BY crawl_time
-            """)
-
-            rows = cursor.fetchall()
-            return [row[0] for row in rows]
-
-        except Exception as e:
-            print(f"[本地存储] 获取抓取时间列表失败: {e}")
-            return []
+    # ========================================
+    # Local-only feature: resource cleanup
+    # ========================================
 
     def cleanup(self) -> None:
         """Clean up resources (close database connections)"""
@@ -808,27 +356,17 @@ class LocalStorageBackend(StorageBackend):
         cutoff_date = self._get_configured_time() - timedelta(days=retention_days)
 
         def parse_date_from_name(name: str) -> Optional[datetime]:
-            """Parse a date from a file or directory name"""
+            """Parse a date from a file or directory name (ISO format: YYYY-MM-DD)"""
             # Strip the .db suffix
             name = name.replace('.db', '')
             try:
-                # ISO format: YYYY-MM-DD
                 date_match = re.match(r'(\d{4})-(\d{2})-(\d{2})', name)
                 if date_match:
                     return datetime(
                         int(date_match.group(1)),
                         int(date_match.group(2)),
                         int(date_match.group(3)),
-                        tzinfo=pytz.timezone("Asia/Shanghai")
-                    )
-                # Legacy Chinese format: YYYY年MM月DD日
-                date_match = re.match(r'(\d{4})年(\d{2})月(\d{2})日', name)
-                if date_match:
-                    return datetime(
-                        int(date_match.group(1)),
-                        int(date_match.group(2)),
-                        int(date_match.group(3)),
-                        tzinfo=pytz.timezone("Asia/Shanghai")
+                        tzinfo=pytz.timezone(self.timezone)
                     )
             except Exception:
                 pass
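
This hunk drops the legacy date format and replaces the hard-coded `Asia/Shanghai` zone with `self.timezone`. The sketch below shows ISO-only parsing as a standalone function; the extra `tz` parameter and the `is_expired` helper are assumptions made for illustration and are not in the commit. One caveat, going by pytz's own documentation: passing a pytz zone through the `tzinfo=` argument of the `datetime` constructor can produce a local-mean-time offset for many zones, and `localize()` is the recommended way to attach one. The sketch therefore uses `localize()` when the zone object offers it.

```python
import re
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional


def parse_date_from_name(name: str, tz: Optional[tzinfo]) -> Optional[datetime]:
    """Parse a leading YYYY-MM-DD from a file or directory name."""
    # Strip a trailing .db suffix (the diff removes every ".db" occurrence instead).
    if name.endswith(".db"):
        name = name[: -len(".db")]
    match = re.match(r"(\d{4})-(\d{2})-(\d{2})", name)
    if not match:
        return None
    try:
        naive = datetime(*(int(group) for group in match.groups()))
    except ValueError:
        return None  # digits matched but not a real date, e.g. month 13
    # pytz zones expose localize(); other tzinfo objects can be attached directly.
    localize = getattr(tz, "localize", None)
    return localize(naive) if callable(localize) else naive.replace(tzinfo=tz)


def is_expired(name: str, now: datetime, retention_days: int) -> bool:
    """True when the date in `name` is older than the retention cutoff."""
    parsed = parse_date_from_name(name, now.tzinfo)
    return parsed is not None and parsed < now - timedelta(days=retention_days)


# Fixed UTC+8 offset used only to keep the demo free of third-party packages.
tz_demo = timezone(timedelta(hours=8))
now_demo = datetime(2025, 1, 10, 12, 0, tzinfo=tz_demo)
print(is_expired("2025-01-02.db", now_demo, 7), is_expired("2025-01-05", now_demo, 7))
```

Names that do not start with an ISO date return `None` and are never treated as expired, which matches the diff's behaviour of leaving unparseable entries alone.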
@@ -892,449 +430,6 @@ class LocalStorageBackend(StorageBackend):
             print(f"[本地存储] 清理过期数据失败: {e}")
             return deleted_count
 
-    def has_pushed_today(self, date: Optional[str] = None) -> bool:
-        """
-        Check whether a push was already sent for the given date
-
-        Args:
-            date: date string (YYYY-MM-DD), defaults to today
-
-        Returns:
-            Whether a push was already sent
-        """
-        try:
-            conn = self._get_connection(date)
-            cursor = conn.cursor()
-
-            target_date = self._format_date_folder(date)
-
-            cursor.execute("""
-                SELECT pushed FROM push_records WHERE date = ?
-            """, (target_date,))
-
-            row = cursor.fetchone()
-            if row:
-                return bool(row[0])
-            return False
-
-        except Exception as e:
-            print(f"[本地存储] 检查推送记录失败: {e}")
-            return False
-
-    def record_push(self, report_type: str, date: Optional[str] = None) -> bool:
-        """
-        Record a push
-
-        Args:
-            report_type: report type
-            date: date string (YYYY-MM-DD), defaults to today
-
-        Returns:
-            Whether the record was saved
-        """
-        try:
-            conn = self._get_connection(date)
-            cursor = conn.cursor()
-
-            target_date = self._format_date_folder(date)
-            now_str = self._get_configured_time().strftime("%Y-%m-%d %H:%M:%S")
-
-            cursor.execute("""
-                INSERT INTO push_records (date, pushed, push_time, report_type, created_at)
-                VALUES (?, 1, ?, ?, ?)
-                ON CONFLICT(date) DO UPDATE SET
-                    pushed = 1,
-                    push_time = excluded.push_time,
-                    report_type = excluded.report_type
-            """, (target_date, now_str, report_type, now_str))
-
-            conn.commit()
-
-            print(f"[本地存储] 推送记录已保存: {report_type} at {now_str}")
-            return True
-
-        except Exception as e:
-            print(f"[本地存储] 记录推送失败: {e}")
-            return False
-
-    # ========================================
-    # RSS data storage methods
-    # ========================================
-
-    def save_rss_data(self, data: RSSData) -> bool:
-        """
-        保存 RSS 数据到 SQLite(以 URL 为唯一标识)
-
-        Args:
-            data: RSS 数据
-
-        Returns:
-            是否保存成功
-        """
-        try:
-            conn = self._get_connection(data.date, db_type="rss")
-            cursor = conn.cursor()
-
-            now_str = self._get_configured_time().strftime("%Y-%m-%d %H:%M:%S")
-
-            # 同步 RSS 源信息到 rss_feeds 表
-            for feed_id, feed_name in data.id_to_name.items():
-                cursor.execute("""
-                    INSERT INTO rss_feeds (id, name, updated_at)
-                    VALUES (?, ?, ?)
-                    ON CONFLICT(id) DO UPDATE SET
-                        name = excluded.name,
-                        updated_at = excluded.updated_at
-                """, (feed_id, feed_name, now_str))
-
-            # 统计计数器
-            new_count = 0
-            updated_count = 0
-
-            for feed_id, rss_list in data.items.items():
-                for item in rss_list:
-                    try:
-                        # 检查是否已存在(通过 URL + feed_id)
-                        if item.url:
-                            cursor.execute("""
-                                SELECT id, title FROM rss_items
-                                WHERE url = ? AND feed_id = ?
-                            """, (item.url, feed_id))
-                            existing = cursor.fetchone()
-
-                            if existing:
-                                # 已存在,更新记录
-                                existing_id = existing[0]
-                                cursor.execute("""
-                                    UPDATE rss_items SET
-                                        title = ?,
-                                        published_at = ?,
-                                        summary = ?,
-                                        author = ?,
-                                        last_crawl_time = ?,
-                                        crawl_count = crawl_count + 1,
-                                        updated_at = ?
-                                    WHERE id = ?
-                                """, (item.title, item.published_at, item.summary,
-                                      item.author, data.crawl_time, now_str, existing_id))
-                                updated_count += 1
-                            else:
-                                # 不存在,插入新记录
-                                cursor.execute("""
-                                    INSERT INTO rss_items
-                                    (title, feed_id, url, published_at, summary, author,
-                                     first_crawl_time, last_crawl_time, crawl_count,
-                                     created_at, updated_at)
-                                    VALUES (?, ?, ?, ?, ?, ?, ?, ?, 1, ?, ?)
-                                """, (item.title, feed_id, item.url, item.published_at,
-                                      item.summary, item.author, data.crawl_time,
-                                      data.crawl_time, now_str, now_str))
-                                new_count += 1
-                        else:
-                            # URL 为空,直接插入
-                            cursor.execute("""
-                                INSERT INTO rss_items
-                                (title, feed_id, url, published_at, summary, author,
-                                 first_crawl_time, last_crawl_time, crawl_count,
-                                 created_at, updated_at)
-                                VALUES (?, ?, ?, ?, ?, ?, ?, ?, 1, ?, ?)
-                            """, (item.title, feed_id, "", item.published_at,
-                                  item.summary, item.author, data.crawl_time,
-                                  data.crawl_time, now_str, now_str))
-                            new_count += 1
-
-                    except sqlite3.Error as e:
-                        print(f"[本地存储] 保存 RSS 条目失败 [{item.title[:30]}...]: {e}")
-
-            total_items = new_count + updated_count
-
-            # 记录抓取信息
-            cursor.execute("""
-                INSERT OR REPLACE INTO rss_crawl_records
-                (crawl_time, total_items, created_at)
-                VALUES (?, ?, ?)
-            """, (data.crawl_time, total_items, now_str))
-
-            # 记录抓取状态
-            cursor.execute("""
-                SELECT id FROM rss_crawl_records WHERE crawl_time = ?
-            """, (data.crawl_time,))
-            record_row = cursor.fetchone()
-            if record_row:
-                crawl_record_id = record_row[0]
-
-                # 记录成功的源
-                for feed_id in data.items.keys():
-                    cursor.execute("""
-                        INSERT OR REPLACE INTO rss_crawl_status
-                        (crawl_record_id, feed_id, status)
-                        VALUES (?, ?, 'success')
-                    """, (crawl_record_id, feed_id))
-
-                # 记录失败的源
-                for failed_id in data.failed_ids:
-                    cursor.execute("""
-                        INSERT OR IGNORE INTO rss_feeds (id, name, updated_at)
-                        VALUES (?, ?, ?)
-                    """, (failed_id, failed_id, now_str))
-
-                    cursor.execute("""
-                        INSERT OR REPLACE INTO rss_crawl_status
-                        (crawl_record_id, feed_id, status)
-                        VALUES (?, ?, 'failed')
-                    """, (crawl_record_id, failed_id))
-
-            conn.commit()
-
-            # 输出统计日志
-            log_parts = [f"[本地存储] RSS 处理完成:新增 {new_count} 条"]
-            if updated_count > 0:
-                log_parts.append(f"更新 {updated_count} 条")
-            print(",".join(log_parts))
-
-            return True
-
-        except Exception as e:
-            print(f"[本地存储] 保存 RSS 数据失败: {e}")
-            return False
-
-    def get_rss_data(self, date: Optional[str] = None) -> Optional[RSSData]:
-        """
-        Get all RSS data for the given date
-
-        Args:
-            date: date string (YYYY-MM-DD), defaults to today
-
-        Returns:
-            an RSSData object, or None if there is no data
-        """
-        try:
-            conn = self._get_connection(date, db_type="rss")
-            cursor = conn.cursor()
-
-            # 获取所有 RSS 数据
-            cursor.execute("""
-                SELECT i.id, i.title, i.feed_id, f.name as feed_name,
-                       i.url, i.published_at, i.summary, i.author,
-                       i.first_crawl_time, i.last_crawl_time, i.crawl_count
-                FROM rss_items i
-                LEFT JOIN rss_feeds f ON i.feed_id = f.id
-                ORDER BY i.published_at DESC
-            """)
-
-            rows = cursor.fetchall()
-            if not rows:
-                return None
-
-            items: Dict[str, List[RSSItem]] = {}
-            id_to_name: Dict[str, str] = {}
-            crawl_date = self._format_date_folder(date)
-
-            for row in rows:
-                feed_id = row[2]
-                feed_name = row[3] or feed_id
-
-                id_to_name[feed_id] = feed_name
-
-                if feed_id not in items:
-                    items[feed_id] = []
-
-                items[feed_id].append(RSSItem(
-                    title=row[1],
-                    feed_id=feed_id,
-                    feed_name=feed_name,
-                    url=row[4] or "",
-                    published_at=row[5] or "",
-                    summary=row[6] or "",
-                    author=row[7] or "",
-                    crawl_time=row[9],
-                    first_time=row[8],
-                    last_time=row[9],
-                    count=row[10],
-                ))
-
-            # 获取最新的抓取时间
-            cursor.execute("""
-                SELECT crawl_time FROM rss_crawl_records
-                ORDER BY crawl_time DESC
-                LIMIT 1
-            """)
-            time_row = cursor.fetchone()
-            crawl_time = time_row[0] if time_row else self._format_time_filename()
-
-            # 获取失败的源
-            cursor.execute("""
-                SELECT DISTINCT cs.feed_id
-                FROM rss_crawl_status cs
-                JOIN rss_crawl_records cr ON cs.crawl_record_id = cr.id
-                WHERE cs.status = 'failed'
-            """)
-            failed_ids = [row[0] for row in cursor.fetchall()]
-
-            return RSSData(
-                date=crawl_date,
-                crawl_time=crawl_time,
-                items=items,
-                id_to_name=id_to_name,
-                failed_ids=failed_ids,
-            )
-
-        except Exception as e:
-            print(f"[本地存储] 读取 RSS 数据失败: {e}")
-            return None
-
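-    def detect_new_rss_items(self, current_data: RSSData) -> Dict[str, List[RSSItem]]:
-        """
-        Detect newly added RSS items (incremental mode)
-
-        Compares the current crawl against historical data to find new RSS items.
-        Key rule: a URL counts as new only if it never appeared in an earlier batch.
-
-        Args:
-            current_data: RSS data from the current crawl
-
-        Returns:
-            new RSS items as {feed_id: [RSSItem, ...]}
-        """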
-        try:
-            # 获取历史数据
-            historical_data = self.get_rss_data(current_data.date)
-
-            if not historical_data:
-                # 没有历史数据,所有都是新的
-                return current_data.items.copy()
-
-            # 获取当前批次时间
-            current_time = current_data.crawl_time
-
-            # 收集历史 URL(first_time < current_time 的条目)
-            historical_urls: Dict[str, set] = {}
-            for feed_id, rss_list in historical_data.items.items():
-                historical_urls[feed_id] = set()
-                for item in rss_list:
-                    first_time = getattr(item, 'first_time', item.crawl_time)
-                    if first_time < current_time:
-                        if item.url:
-                            historical_urls[feed_id].add(item.url)
-
-            # 检查是否有历史数据
-            has_historical_data = any(len(urls) > 0 for urls in historical_urls.values())
-            if not has_historical_data:
-                # 第一次抓取,没有"新增"概念
-                return {}
-
-            # 检测新增
-            new_items: Dict[str, List[RSSItem]] = {}
-            for feed_id, rss_list in current_data.items.items():
-                hist_set = historical_urls.get(feed_id, set())
-                for item in rss_list:
-                    # 通过 URL 判断是否新增
-                    if item.url and item.url not in hist_set:
-                        if feed_id not in new_items:
-                            new_items[feed_id] = []
-                        new_items[feed_id].append(item)
-
-            return new_items
-
-        except Exception as e:
-            print(f"[本地存储] 检测新 RSS 条目失败: {e}")
-            return {}
-
-    def get_latest_rss_data(self, date: Optional[str] = None) -> Optional[RSSData]:
-        """
-        Get the RSS data from the most recent crawl (current-list mode)
-
-        Args:
-            date: date string (YYYY-MM-DD), defaults to today
-
-        Returns:
-            RSS data from the most recent crawl, or None if there is no data
-        """
-        try:
-            db_path = self._get_db_path(date, db_type="rss")
-            if not db_path.exists():
-                return None
-
-            conn = self._get_connection(date, db_type="rss")
-            cursor = conn.cursor()
-
-            # 获取最新的抓取时间
-            cursor.execute("""
-                SELECT crawl_time FROM rss_crawl_records
-                ORDER BY crawl_time DESC
-                LIMIT 1
-            """)
-
-            time_row = cursor.fetchone()
-            if not time_row:
-                return None
-
-            latest_time = time_row[0]
-
-            # 获取该时间的 RSS 数据
-            cursor.execute("""
-                SELECT i.id, i.title, i.feed_id, f.name as feed_name,
-                       i.url, i.published_at, i.summary, i.author,
-                       i.first_crawl_time, i.last_crawl_time, i.crawl_count
-                FROM rss_items i
-                LEFT JOIN rss_feeds f ON i.feed_id = f.id
-                WHERE i.last_crawl_time = ?
-                ORDER BY i.published_at DESC
-            """, (latest_time,))
-
-            rows = cursor.fetchall()
-            if not rows:
-                return None
-
-            items: Dict[str, List[RSSItem]] = {}
-            id_to_name: Dict[str, str] = {}
-            crawl_date = self._format_date_folder(date)
-
-            for row in rows:
-                feed_id = row[2]
-                feed_name = row[3] or feed_id
-
-                id_to_name[feed_id] = feed_name
-
-                if feed_id not in items:
-                    items[feed_id] = []
-
-                items[feed_id].append(RSSItem(
-                    title=row[1],
-                    feed_id=feed_id,
-                    feed_name=feed_name,
-                    url=row[4] or "",
-                    published_at=row[5] or "",
-                    summary=row[6] or "",
-                    author=row[7] or "",
-                    crawl_time=row[9],
-                    first_time=row[8],
-                    last_time=row[9],
-                    count=row[10],
-                ))
-
-            # 获取失败的源(针对最新一次抓取)
-            cursor.execute("""
-                SELECT cs.feed_id
-                FROM rss_crawl_status cs
-                JOIN rss_crawl_records cr ON cs.crawl_record_id = cr.id
-                WHERE cr.crawl_time = ? AND cs.status = 'failed'
-            """, (latest_time,))
-
-            failed_ids = [row[0] for row in cursor.fetchall()]
-
-            return RSSData(
-                date=crawl_date,
-                crawl_time=latest_time,
-                items=items,
-                id_to_name=id_to_name,
-                failed_ids=failed_ids,
-            )
-
-        except Exception as e:
-            print(f"[本地存储] 获取最新 RSS 数据失败: {e}")
-            return None
-
     def __del__(self):
         """Destructor; ensures connections are closed"""
         self.cleanup()
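
The removed `record_push` and `has_pushed_today` methods above rely on SQLite's `INSERT ... ON CONFLICT DO UPDATE` upsert. A minimal standalone sketch of that pattern follows. The `push_records` column types and the `make_demo_db` helper are assumptions for illustration (the real schema lives in a schema file not shown in this diff), and the upsert syntax needs SQLite 3.24 or newer.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with an assumed push_records layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE push_records (
            date        TEXT PRIMARY KEY,   -- ON CONFLICT(date) needs a unique key
            pushed      INTEGER NOT NULL DEFAULT 0,
            push_time   TEXT,
            report_type TEXT,
            created_at  TEXT
        )
        """
    )
    return conn


def record_push(conn: sqlite3.Connection, date: str, report_type: str, now_str: str) -> None:
    """Insert a push record, or update the existing row for the same date."""
    conn.execute(
        """
        INSERT INTO push_records (date, pushed, push_time, report_type, created_at)
        VALUES (?, 1, ?, ?, ?)
        ON CONFLICT(date) DO UPDATE SET
            pushed = 1,
            push_time = excluded.push_time,
            report_type = excluded.report_type
        """,
        (date, now_str, report_type, now_str),
    )
    conn.commit()


def has_pushed(conn: sqlite3.Connection, date: str) -> bool:
    """Return True if a push was recorded for the given date."""
    row = conn.execute(
        "SELECT pushed FROM push_records WHERE date = ?", (date,)
    ).fetchone()
    return bool(row[0]) if row else False
```

Calling `record_push` twice for the same date leaves a single row: `pushed`, `push_time`, and `report_type` are overwritten, while `created_at` keeps its first value because the `DO UPDATE` clause does not touch it.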

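The incremental check in the removed `detect_new_rss_items` reduces to a per-feed set difference on URLs. A simplified sketch of that logic, assuming a hypothetical `Item` tuple in place of `RSSItem` and plain dicts in place of `RSSData`:

```python
from typing import Dict, List, NamedTuple


class Item(NamedTuple):
    """Hypothetical stand-in for RSSItem: only the fields the check needs."""
    url: str
    first_time: str  # time of the crawl that first saw this item


def detect_new(
    current: Dict[str, List[Item]],
    historical: Dict[str, List[Item]],
    current_time: str,
) -> Dict[str, List[Item]]:
    """Return items whose URL was not seen in any batch before current_time."""
    if not historical:
        # No stored data at all: everything is new
        return {feed_id: list(items) for feed_id, items in current.items()}

    # URLs first seen strictly before the current batch, grouped by feed
    seen = {
        feed_id: {it.url for it in items if it.url and it.first_time < current_time}
        for feed_id, items in historical.items()
    }
    if not any(seen.values()):
        # Only the current batch is stored: first crawl, nothing counts as "new"
        return {}

    new_items: Dict[str, List[Item]] = {}
    for feed_id, items in current.items():
        known = seen.get(feed_id, set())
        fresh = [it for it in items if it.url and it.url not in known]
        if fresh:
            new_items[feed_id] = fresh
    return new_items
```

Items with an empty URL are never reported as new, since there is nothing to compare them against.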
File diff suppressed because it is too large
+ 114 - 927
trendradar/storage/remote.py


+ 1137 - 0
trendradar/storage/sqlite_mixin.py

@@ -0,0 +1,1137 @@
+# coding=utf-8
+"""
+SQLite storage mixin
+
+Provides shared SQLite database logic, reused by LocalStorageBackend and RemoteStorageBackend.
+"""
+
+import sqlite3
+from abc import abstractmethod
+from datetime import datetime
+from pathlib import Path
+from typing import Any, Dict, List, Optional
+
+from trendradar.storage.base import NewsItem, NewsData, RSSItem, RSSData
+from trendradar.utils.url import normalize_url
+
+
+class SQLiteStorageMixin:
+    """
+    Mixin for SQLite storage operations
+
+    Subclasses must implement the following abstract methods:
+    - _get_connection(date, db_type) -> sqlite3.Connection
+    - _get_configured_time() -> datetime
+    - _format_date_folder(date) -> str
+    - _format_time_filename() -> str
+    """
+
+    # ========================================
+    # Abstract methods - subclasses must implement these
+    # ========================================
+
+    @abstractmethod
+    def _get_connection(self, date: Optional[str] = None, db_type: str = "news") -> sqlite3.Connection:
+        """Return a database connection"""
+        pass
+
+    @abstractmethod
+    def _get_configured_time(self) -> datetime:
+        """Return the current time in the configured timezone"""
+        pass
+
+    @abstractmethod
+    def _format_date_folder(self, date: Optional[str] = None) -> str:
+        """Format the date folder name (ISO format: YYYY-MM-DD)"""
+        pass
+
+    @abstractmethod
+    def _format_time_filename(self) -> str:
+        """Format the time file name (format: HH-MM)"""
+        pass
+
+    # ========================================
+    # Schema management
+    # ========================================
+
+    def _get_schema_path(self, db_type: str = "news") -> Path:
+        """
+        Return the path to the schema file (schema.sql, or rss_schema.sql for RSS)
+
+        Args:
+            db_type: database type ("news" or "rss")
+
+        Returns:
+            path to the schema file
+        """
+        if db_type == "rss":
+            return Path(__file__).parent / "rss_schema.sql"
+        return Path(__file__).parent / "schema.sql"
+
+    def _init_tables(self, conn: sqlite3.Connection, db_type: str = "news") -> None:
+        """
+        Initialize the database tables from the schema file
+
+        Args:
+            conn: database connection
+            db_type: database type ("news" or "rss")
+        """
+        schema_path = self._get_schema_path(db_type)
+
+        if schema_path.exists():
+            with open(schema_path, "r", encoding="utf-8") as f:
+                schema_sql = f.read()
+            conn.executescript(schema_sql)
+        else:
+            raise FileNotFoundError(f"Schema file not found: {schema_path}")
+
+        conn.commit()
+
+    # ========================================
+    # News data storage
+    # ========================================
+
+    def _save_news_data_impl(self, data: NewsData, log_prefix: str = "[存储]") -> tuple[bool, int, int, int, int]:
+        """
+        Save news data to SQLite (core implementation)
+
+        Args:
+            data: news data
+            log_prefix: log prefix
+
+        Returns:
+            (success, new_count, updated_count, title_changed_count, off_list_count)
+        """
+        try:
+            conn = self._get_connection(data.date)
+            cursor = conn.cursor()
+
+            # Current time in the configured timezone
+            now_str = self._get_configured_time().strftime("%Y-%m-%d %H:%M:%S")
+
+            # First sync platform info into the platforms table
+            for source_id, source_name in data.id_to_name.items():
+                cursor.execute("""
+                    INSERT INTO platforms (id, name, updated_at)
+                    VALUES (?, ?, ?)
+                    ON CONFLICT(id) DO UPDATE SET
+                        name = excluded.name,
+                        updated_at = excluded.updated_at
+                """, (source_id, source_name, now_str))
+
+            # Counters
+            new_count = 0
+            updated_count = 0
+            title_changed_count = 0
+            success_sources = []
+
+            for source_id, news_list in data.items.items():
+                success_sources.append(source_id)
+
+                for item in news_list:
+                    try:
+                        # Normalize the URL (strip dynamic parameters such as Weibo's band_rank)
+                        normalized_url = normalize_url(item.url, source_id) if item.url else ""
+
+                        # Check whether the item already exists (by normalized URL + platform_id)
+                        if normalized_url:
+                            cursor.execute("""
+                                SELECT id, title FROM news_items
+                                WHERE url = ? AND platform_id = ?
+                            """, (normalized_url, source_id))
+                            existing = cursor.fetchone()
+
+                            if existing:
+                                # Already exists: update the record
+                                existing_id, existing_title = existing
+
+                                # Check whether the title has changed
+                                if existing_title != item.title:
+                                    # Record the title change
+                                    cursor.execute("""
+                                        INSERT INTO title_changes
+                                        (news_item_id, old_title, new_title, changed_at)
+                                        VALUES (?, ?, ?, ?)
+                                    """, (existing_id, existing_title, item.title, now_str))
+                                    title_changed_count += 1
+
+                                # Record rank history
+                                cursor.execute("""
+                                    INSERT INTO rank_history
+                                    (news_item_id, rank, crawl_time, created_at)
+                                    VALUES (?, ?, ?, ?)
+                                """, (existing_id, item.rank, data.crawl_time, now_str))
+
+                                # Update the existing record
+                                cursor.execute("""
+                                    UPDATE news_items SET
+                                        title = ?,
+                                        rank = ?,
+                                        mobile_url = ?,
+                                        last_crawl_time = ?,
+                                        crawl_count = crawl_count + 1,
+                                        updated_at = ?
+                                    WHERE id = ?
+                                """, (item.title, item.rank, item.mobile_url,
+                                      data.crawl_time, now_str, existing_id))
+                                updated_count += 1
+                            else:
+                                # Not found: insert a new record (storing the normalized URL)
+                                cursor.execute("""
+                                    INSERT INTO news_items
+                                    (title, platform_id, rank, url, mobile_url,
+                                     first_crawl_time, last_crawl_time, crawl_count,
+                                     created_at, updated_at)
+                                    VALUES (?, ?, ?, ?, ?, ?, ?, 1, ?, ?)
+                                """, (item.title, source_id, item.rank, normalized_url,
+                                      item.mobile_url, data.crawl_time, data.crawl_time,
+                                      now_str, now_str))
+                                new_id = cursor.lastrowid
+                                # Record the initial rank
+                                cursor.execute("""
+                                    INSERT INTO rank_history
+                                    (news_item_id, rank, crawl_time, created_at)
+                                    VALUES (?, ?, ?, ?)
+                                """, (new_id, item.rank, data.crawl_time, now_str))
+                                new_count += 1
+                        else:
+                            # Empty URL: insert directly (no deduplication)
+                            cursor.execute("""
+                                INSERT INTO news_items
+                                (title, platform_id, rank, url, mobile_url,
+                                 first_crawl_time, last_crawl_time, crawl_count,
+                                 created_at, updated_at)
+                                VALUES (?, ?, ?, ?, ?, ?, ?, 1, ?, ?)
+                            """, (item.title, source_id, item.rank, "",
+                                  item.mobile_url, data.crawl_time, data.crawl_time,
+                                  now_str, now_str))
+                            new_id = cursor.lastrowid
+                            # Record the initial rank
+                            cursor.execute("""
+                                INSERT INTO rank_history
+                                (news_item_id, rank, crawl_time, created_at)
+                                VALUES (?, ?, ?, ?)
+                            """, (new_id, item.rank, data.crawl_time, now_str))
+                            new_count += 1
+
+                    except sqlite3.Error as e:
+                        print(f"{log_prefix} 保存新闻条目失败 [{item.title[:30]}...]: {e}")
+
+            total_items = new_count + updated_count
+
+            # ========================================
+            # Off-list detection: find news that was on the list last time but is not this time
+            # ========================================
+            off_list_count = 0
+
+            # Get the previous crawl time
+            cursor.execute("""
+                SELECT crawl_time FROM crawl_records
+                WHERE crawl_time < ?
+                ORDER BY crawl_time DESC
+                LIMIT 1
+            """, (data.crawl_time,))
+            prev_record = cursor.fetchone()
+
+            if prev_record:
+                prev_crawl_time = prev_record[0]
+
+                # Check each successfully crawled platform for off-list items
+                for source_id in success_sources:
+                    # Collect all normalized URLs for this platform in the current crawl
+                    current_urls = set()
+                    for item in data.items.get(source_id, []):
+                        normalized_url = normalize_url(item.url, source_id) if item.url else ""
+                        if normalized_url:
+                            current_urls.add(normalized_url)
+
+                    # Find news that was on the list last time (last_crawl_time = prev_crawl_time) but is not now
+                    # These items are leaving the list for the first time and must be recorded
+                    cursor.execute("""
+                        SELECT id, url FROM news_items
+                        WHERE platform_id = ?
+                          AND last_crawl_time = ?
+                          AND url != ''
+                    """, (source_id, prev_crawl_time))
+
+                    for row in cursor.fetchall():
+                        news_id, url = row[0], row[1]
+                        if url not in current_urls:
+                            # Insert an off-list record (rank=0 means off the list)
+                            cursor.execute("""
+                                INSERT INTO rank_history
+                                (news_item_id, rank, crawl_time, created_at)
+                                VALUES (?, 0, ?, ?)
+                            """, (news_id, data.crawl_time, now_str))
+                            off_list_count += 1
+
+            # Record crawl info
+            cursor.execute("""
+                INSERT OR REPLACE INTO crawl_records
+                (crawl_time, total_items, created_at)
+                VALUES (?, ?, ?)
+            """, (data.crawl_time, total_items, now_str))
+
+            # Get the ID of the crawl_record just inserted
+            cursor.execute("""
+                SELECT id FROM crawl_records WHERE crawl_time = ?
+            """, (data.crawl_time,))
+            record_row = cursor.fetchone()
+            if record_row:
+                crawl_record_id = record_row[0]
+
+                # Record successful sources
+                for source_id in success_sources:
+                    cursor.execute("""
+                        INSERT OR REPLACE INTO crawl_source_status
+                        (crawl_record_id, platform_id, status)
+                        VALUES (?, ?, 'success')
+                    """, (crawl_record_id, source_id))
+
+                # Record failed sources
+                for failed_id in data.failed_ids:
+                    # Make sure failed platforms also exist in the platforms table
+                    cursor.execute("""
+                        INSERT OR IGNORE INTO platforms (id, name, updated_at)
+                        VALUES (?, ?, ?)
+                    """, (failed_id, failed_id, now_str))
+
+                    cursor.execute("""
+                        INSERT OR REPLACE INTO crawl_source_status
+                        (crawl_record_id, platform_id, status)
+                        VALUES (?, ?, 'failed')
+                    """, (crawl_record_id, failed_id))
+
+            conn.commit()
+
+            return True, new_count, updated_count, title_changed_count, off_list_count
+
+        except Exception as e:
+            print(f"{log_prefix} 保存失败: {e}")
+            return False, 0, 0, 0, 0
+
+    def _get_today_all_data_impl(self, date: Optional[str] = None) -> Optional[NewsData]:
+        """
+        Get all news data for the given date (merged)
+
+        Args:
+            date: date string, defaults to today
+
+        Returns:
+            merged news data
+        """
+        try:
+            conn = self._get_connection(date)
+            cursor = conn.cursor()
+
+            # Fetch all news data (including id, used to look up rank history)
+            cursor.execute("""
+                SELECT n.id, n.title, n.platform_id, p.name as platform_name,
+                       n.rank, n.url, n.mobile_url,
+                       n.first_crawl_time, n.last_crawl_time, n.crawl_count
+                FROM news_items n
+                LEFT JOIN platforms p ON n.platform_id = p.id
+                ORDER BY n.platform_id, n.last_crawl_time
+            """)
+
+            rows = cursor.fetchall()
+            if not rows:
+                return None
+
+            # Collect all news_item_id values
+            news_ids = [row[0] for row in rows]
+
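+            # Batch-query rank history (fetching both time and rank)
+            # Filter: keep off-list records (rank=0) only up to last_crawl_time
+            # This avoids showing meaningless records after an item has left the list for good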
+            rank_history_map: Dict[int, List[int]] = {}
+            rank_timeline_map: Dict[int, List[Dict[str, Any]]] = {}
+            if news_ids:
+                placeholders = ",".join("?" * len(news_ids))
+                cursor.execute(f"""
+                    SELECT rh.news_item_id, rh.rank, rh.crawl_time
+                    FROM rank_history rh
+                    JOIN news_items ni ON rh.news_item_id = ni.id
+                    WHERE rh.news_item_id IN ({placeholders})
+                      AND NOT (rh.rank = 0 AND rh.crawl_time > ni.last_crawl_time)
+                    ORDER BY rh.news_item_id, rh.crawl_time
+                """, news_ids)
+                for rh_row in cursor.fetchall():
+                    news_id, rank, crawl_time = rh_row[0], rh_row[1], rh_row[2]
+
+                    # Build the ranks list (deduplicated, excluding off-list records with rank=0)
+                    if news_id not in rank_history_map:
+                        rank_history_map[news_id] = []
+                    if rank != 0 and rank not in rank_history_map[news_id]:
+                        rank_history_map[news_id].append(rank)
+
+                    # Build the rank_timeline list (full timeline, including off-list entries)
+                    if news_id not in rank_timeline_map:
+                        rank_timeline_map[news_id] = []
+                    # Extract the time part (HH:MM)
+                    time_part = crawl_time.split()[1][:5] if ' ' in crawl_time else crawl_time[:5]
+                    rank_timeline_map[news_id].append({
+                        "time": time_part,
+                        "rank": rank if rank != 0 else None  # 0 becomes None to mark off-list
+                    })
+
+            # Group by platform_id
+            items: Dict[str, List[NewsItem]] = {}
+            id_to_name: Dict[str, str] = {}
+            crawl_date = self._format_date_folder(date)
+
+            for row in rows:
+                news_id = row[0]
+                platform_id = row[2]
+                title = row[1]
+                platform_name = row[3] or platform_id
+
+                id_to_name[platform_id] = platform_name
+
+                if platform_id not in items:
+                    items[platform_id] = []
+
+                # Use the rank history, falling back to the current rank if there is none
+                ranks = rank_history_map.get(news_id, [row[4]])
+                rank_timeline = rank_timeline_map.get(news_id, [])
+
+                items[platform_id].append(NewsItem(
+                    title=title,
+                    source_id=platform_id,
+                    source_name=platform_name,
+                    rank=row[4],
+                    url=row[5] or "",
+                    mobile_url=row[6] or "",
+                    crawl_time=row[8],  # last_crawl_time
+                    ranks=ranks,
+                    first_time=row[7],  # first_crawl_time
+                    last_time=row[8],   # last_crawl_time
+                    count=row[9],       # crawl_count
+                    rank_timeline=rank_timeline,
+                ))
+
+            final_items = items
+
+            # Get failed sources
+            cursor.execute("""
+                SELECT DISTINCT css.platform_id
+                FROM crawl_source_status css
+                JOIN crawl_records cr ON css.crawl_record_id = cr.id
+                WHERE css.status = 'failed'
+            """)
+            failed_ids = [row[0] for row in cursor.fetchall()]
+
+            # Get the latest crawl time
+            cursor.execute("""
+                SELECT crawl_time FROM crawl_records
+                ORDER BY crawl_time DESC
+                LIMIT 1
+            """)
+
+            time_row = cursor.fetchone()
+            crawl_time = time_row[0] if time_row else self._format_time_filename()
+
+            return NewsData(
+                date=crawl_date,
+                crawl_time=crawl_time,
+                items=final_items,
+                id_to_name=id_to_name,
+                failed_ids=failed_ids,
+            )
+
+        except Exception as e:
+            print(f"[存储] 读取数据失败: {e}")
+            return None
+
+    def _get_latest_crawl_data_impl(self, date: Optional[str] = None) -> Optional[NewsData]:
+        """
+        Get the data from the most recent crawl
+
+        Args:
+            date: date string, defaults to today
+
+        Returns:
+            news data from the most recent crawl
+        """
+        try:
+            conn = self._get_connection(date)
+            cursor = conn.cursor()
+
+            # Get the latest crawl time
+            cursor.execute("""
+                SELECT crawl_time FROM crawl_records
+                ORDER BY crawl_time DESC
+                LIMIT 1
+            """)
+
+            time_row = cursor.fetchone()
+            if not time_row:
+                return None
+
+            latest_time = time_row[0]
+
+            # Get the news items for that crawl time (including id, used to look up rank history)
+            cursor.execute("""
+                SELECT n.id, n.title, n.platform_id, p.name as platform_name,
+                       n.rank, n.url, n.mobile_url,
+                       n.first_crawl_time, n.last_crawl_time, n.crawl_count
+                FROM news_items n
+                LEFT JOIN platforms p ON n.platform_id = p.id
+                WHERE n.last_crawl_time = ?
+            """, (latest_time,))
+
+            rows = cursor.fetchall()
+            if not rows:
+                return None
+
+            # Collect all news_item_id values
+            news_ids = [row[0] for row in rows]
+
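+            # Batch-query the rank history (fetching both time and rank)
+            # Filter: keep off-list records (rank=0) only if they are not later than last_crawl_time
+            # This avoids showing meaningless records after an item has permanently dropped off the list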
+            rank_history_map: Dict[int, List[int]] = {}
+            rank_timeline_map: Dict[int, List[Dict[str, Any]]] = {}
+            if news_ids:
+                placeholders = ",".join("?" * len(news_ids))
+                cursor.execute(f"""
+                    SELECT rh.news_item_id, rh.rank, rh.crawl_time
+                    FROM rank_history rh
+                    JOIN news_items ni ON rh.news_item_id = ni.id
+                    WHERE rh.news_item_id IN ({placeholders})
+                      AND NOT (rh.rank = 0 AND rh.crawl_time > ni.last_crawl_time)
+                    ORDER BY rh.news_item_id, rh.crawl_time
+                """, news_ids)
+                for rh_row in cursor.fetchall():
+                    news_id, rank, crawl_time = rh_row[0], rh_row[1], rh_row[2]
+
+                    # Build the ranks list (deduplicated, excluding off-list records with rank=0)
+                    if news_id not in rank_history_map:
+                        rank_history_map[news_id] = []
+                    if rank != 0 and rank not in rank_history_map[news_id]:
+                        rank_history_map[news_id].append(rank)
+
+                    # Build the rank_timeline list (full timeline, including off-list entries)
+                    if news_id not in rank_timeline_map:
+                        rank_timeline_map[news_id] = []
+                    # Extract the time part (HH:MM)
+                    time_part = crawl_time.split()[1][:5] if ' ' in crawl_time else crawl_time[:5]
+                    rank_timeline_map[news_id].append({
+                        "time": time_part,
+                        "rank": rank if rank != 0 else None  # 0 becomes None, meaning off the list
+                    })
+
+            items: Dict[str, List[NewsItem]] = {}
+            id_to_name: Dict[str, str] = {}
+            crawl_date = self._format_date_folder(date)
+
+            for row in rows:
+                news_id = row[0]
+                platform_id = row[2]
+                platform_name = row[3] or platform_id
+                id_to_name[platform_id] = platform_name
+
+                if platform_id not in items:
+                    items[platform_id] = []
+
+                # Use the rank history if available; otherwise fall back to the current rank
+                ranks = rank_history_map.get(news_id, [row[4]])
+                rank_timeline = rank_timeline_map.get(news_id, [])
+
+                items[platform_id].append(NewsItem(
+                    title=row[1],
+                    source_id=platform_id,
+                    source_name=platform_name,
+                    rank=row[4],
+                    url=row[5] or "",
+                    mobile_url=row[6] or "",
+                    crawl_time=row[8],  # last_crawl_time
+                    ranks=ranks,
+                    first_time=row[7],  # first_crawl_time
+                    last_time=row[8],   # last_crawl_time
+                    count=row[9],       # crawl_count
+                    rank_timeline=rank_timeline,
+                ))
+
+            # Get the sources that failed (in the most recent crawl)
+            cursor.execute("""
+                SELECT css.platform_id
+                FROM crawl_source_status css
+                JOIN crawl_records cr ON css.crawl_record_id = cr.id
+                WHERE cr.crawl_time = ? AND css.status = 'failed'
+            """, (latest_time,))
+
+            failed_ids = [row[0] for row in cursor.fetchall()]
+
+            return NewsData(
+                date=crawl_date,
+                crawl_time=latest_time,
+                items=items,
+                id_to_name=id_to_name,
+                failed_ids=failed_ids,
+            )
+
+        except Exception as e:
+            print(f"[存储] 获取最新数据失败: {e}")
+            return None
+
+    def _detect_new_titles_impl(self, current_data: NewsData) -> Dict[str, Dict]:
+        """
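+        Detect newly added titles.
+
+        Compares the current crawl against historical data to find new titles.
+        Key rule: a title counts as new only if it never appeared in any earlier batch.
+
+        Args:
+            current_data: Data from the current crawl.
+
+        Returns:
+            New titles as {source_id: {title: NewsItem}}.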
+        """
+        try:
+            # Get the historical data
+            historical_data = self._get_today_all_data_impl(current_data.date)
+
+            if not historical_data:
+                # No historical data, so everything is new
+                new_titles = {}
+                for source_id, news_list in current_data.items.items():
+                    new_titles[source_id] = {item.title: item for item in news_list}
+                return new_titles
+
+            # Get the time of the current batch
+            current_time = current_data.crawl_time
+
+            # Collect historical titles (those with first_time < current_time)
+            # This correctly handles one title producing multiple records because its URL changed
+            historical_titles: Dict[str, set] = {}
+            for source_id, news_list in historical_data.items.items():
+                historical_titles[source_id] = set()
+                for item in news_list:
+                    first_time = getattr(item, 'first_time', item.crawl_time)
+                    if first_time < current_time:
+                        historical_titles[source_id].add(item.title)
+
+            # Check whether any historical data exists
+            has_historical_data = any(len(titles) > 0 for titles in historical_titles.values())
+            if not has_historical_data:
+                # First crawl: the notion of "new" does not apply
+                return {}
+
+            # Detect new titles
+            new_titles = {}
+            for source_id, news_list in current_data.items.items():
+                hist_set = historical_titles.get(source_id, set())
+                for item in news_list:
+                    if item.title not in hist_set:
+                        if source_id not in new_titles:
+                            new_titles[source_id] = {}
+                        new_titles[source_id][item.title] = item
+
+            return new_titles
+
+        except Exception as e:
+            print(f"[存储] 检测新标题失败: {e}")
+            return {}
+
+    def _is_first_crawl_today_impl(self, date: Optional[str] = None) -> bool:
+        """
+        Check whether this is the first crawl of the day.
+
+        Args:
+            date: Date string; defaults to today.
+
+        Returns:
+            True if this is the first crawl.
+        """
+        try:
+            conn = self._get_connection(date)
+            cursor = conn.cursor()
+
+            cursor.execute("""
+                SELECT COUNT(*) as count FROM crawl_records
+            """)
+
+            row = cursor.fetchone()
+            count = row[0] if row else 0
+
+            # Zero or one record counts as the first crawl
+            return count <= 1
+
+        except Exception as e:
+            print(f"[存储] 检查首次抓取失败: {e}")
+            return True
+
+    def _get_crawl_times_impl(self, date: Optional[str] = None) -> List[str]:
+        """
+        Get the list of all crawl times for the given date.
+
+        Args:
+            date: Date string; defaults to today.
+
+        Returns:
+            List of crawl times (sorted by time).
+        """
+        try:
+            conn = self._get_connection(date)
+            cursor = conn.cursor()
+
+            cursor.execute("""
+                SELECT crawl_time FROM crawl_records
+                ORDER BY crawl_time
+            """)
+
+            rows = cursor.fetchall()
+            return [row[0] for row in rows]
+
+        except Exception as e:
+            print(f"[存储] 获取抓取时间列表失败: {e}")
+            return []
+
+    # ========================================
+    # Push records
+    # ========================================
+
+    def _has_pushed_today_impl(self, date: Optional[str] = None) -> bool:
+        """
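+        Check whether a push has already been sent for the given date.
+
+        Args:
+            date: Date string (YYYY-MM-DD); defaults to today.
+
+        Returns:
+            True if a push has already been sent.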
+        """
+        try:
+            conn = self._get_connection(date)
+            cursor = conn.cursor()
+
+            target_date = self._format_date_folder(date)
+
+            cursor.execute("""
+                SELECT pushed FROM push_records WHERE date = ?
+            """, (target_date,))
+
+            row = cursor.fetchone()
+            if row:
+                return bool(row[0])
+            return False
+
+        except Exception as e:
+            print(f"[存储] 检查推送记录失败: {e}")
+            return False
+
+    def _record_push_impl(self, report_type: str, date: Optional[str] = None) -> bool:
+        """
+        Record a push.
+
+        Args:
+            report_type: Report type.
+            date: Date string (YYYY-MM-DD); defaults to today.
+
+        Returns:
+            True if the record was saved.
+        """
+        try:
+            conn = self._get_connection(date)
+            cursor = conn.cursor()
+
+            target_date = self._format_date_folder(date)
+            now_str = self._get_configured_time().strftime("%Y-%m-%d %H:%M:%S")
+
+            cursor.execute("""
+                INSERT INTO push_records (date, pushed, push_time, report_type, created_at)
+                VALUES (?, 1, ?, ?, ?)
+                ON CONFLICT(date) DO UPDATE SET
+                    pushed = 1,
+                    push_time = excluded.push_time,
+                    report_type = excluded.report_type
+            """, (target_date, now_str, report_type, now_str))
+
+            conn.commit()
+            return True
+
+        except Exception as e:
+            print(f"[存储] 记录推送失败: {e}")
+            return False
+
+    # ========================================
+    # RSS data storage
+    # ========================================
+
+    def _save_rss_data_impl(self, data: RSSData, log_prefix: str = "[存储]") -> tuple[bool, int, int]:
+        """
+        Save RSS data to SQLite (the URL is the unique identifier).
+
+        Args:
+            data: RSS data.
+            log_prefix: Log prefix.
+
+        Returns:
+            (success, new_count, updated_count)
+        """
+        try:
+            conn = self._get_connection(data.date, db_type="rss")
+            cursor = conn.cursor()
+
+            now_str = self._get_configured_time().strftime("%Y-%m-%d %H:%M:%S")
+
+            # Sync RSS feed info into the rss_feeds table
+            for feed_id, feed_name in data.id_to_name.items():
+                cursor.execute("""
+                    INSERT INTO rss_feeds (id, name, updated_at)
+                    VALUES (?, ?, ?)
+                    ON CONFLICT(id) DO UPDATE SET
+                        name = excluded.name,
+                        updated_at = excluded.updated_at
+                """, (feed_id, feed_name, now_str))
+
+            # Counters
+            new_count = 0
+            updated_count = 0
+
+            for feed_id, rss_list in data.items.items():
+                for item in rss_list:
+                    try:
+                        # Check whether the item already exists (by URL + feed_id)
+                        if item.url:
+                            cursor.execute("""
+                                SELECT id, title FROM rss_items
+                                WHERE url = ? AND feed_id = ?
+                            """, (item.url, feed_id))
+                            existing = cursor.fetchone()
+
+                            if existing:
+                                # Already exists: update the record
+                                existing_id = existing[0]
+                                cursor.execute("""
+                                    UPDATE rss_items SET
+                                        title = ?,
+                                        published_at = ?,
+                                        summary = ?,
+                                        author = ?,
+                                        last_crawl_time = ?,
+                                        crawl_count = crawl_count + 1,
+                                        updated_at = ?
+                                    WHERE id = ?
+                                """, (item.title, item.published_at, item.summary,
+                                      item.author, data.crawl_time, now_str, existing_id))
+                                updated_count += 1
+                            else:
+                                # Not found: insert a new record (ON CONFLICT is a fallback for concurrent/racing writes)
+                                cursor.execute("""
+                                    INSERT INTO rss_items
+                                    (title, feed_id, url, published_at, summary, author,
+                                     first_crawl_time, last_crawl_time, crawl_count,
+                                     created_at, updated_at)
+                                    VALUES (?, ?, ?, ?, ?, ?, ?, ?, 1, ?, ?)
+                                    ON CONFLICT(url, feed_id) DO UPDATE SET
+                                        title = excluded.title,
+                                        published_at = excluded.published_at,
+                                        summary = excluded.summary,
+                                        author = excluded.author,
+                                        last_crawl_time = excluded.last_crawl_time,
+                                        crawl_count = crawl_count + 1,
+                                        updated_at = excluded.updated_at
+                                """, (item.title, feed_id, item.url, item.published_at,
+                                      item.summary, item.author, data.crawl_time,
+                                      data.crawl_time, now_str, now_str))
+                                new_count += 1
+                        else:
+                            # Empty URL: rely on try-except to handle duplicates
+                            try:
+                                cursor.execute("""
+                                    INSERT INTO rss_items
+                                    (title, feed_id, url, published_at, summary, author,
+                                     first_crawl_time, last_crawl_time, crawl_count,
+                                     created_at, updated_at)
+                                    VALUES (?, ?, ?, ?, ?, ?, ?, ?, 1, ?, ?)
+                                """, (item.title, feed_id, "", item.published_at,
+                                      item.summary, item.author, data.crawl_time,
+                                      data.crawl_time, now_str, now_str))
+                                new_count += 1
+                            except sqlite3.IntegrityError:
+                                # Duplicate entry with an empty URL; ignore it
+                                pass
+
+                    except sqlite3.Error as e:
+                        print(f"{log_prefix} 保存 RSS 条目失败 [{item.title[:30]}...]: {e}")
+
+            total_items = new_count + updated_count
+
+            # Record the crawl
+            cursor.execute("""
+                INSERT OR REPLACE INTO rss_crawl_records
+                (crawl_time, total_items, created_at)
+                VALUES (?, ?, ?)
+            """, (data.crawl_time, total_items, now_str))
+
+            # Record the per-feed crawl status
+            cursor.execute("""
+                SELECT id FROM rss_crawl_records WHERE crawl_time = ?
+            """, (data.crawl_time,))
+            record_row = cursor.fetchone()
+            if record_row:
+                crawl_record_id = record_row[0]
+
+                # Record the feeds that succeeded
+                for feed_id in data.items.keys():
+                    cursor.execute("""
+                        INSERT OR REPLACE INTO rss_crawl_status
+                        (crawl_record_id, feed_id, status)
+                        VALUES (?, ?, 'success')
+                    """, (crawl_record_id, feed_id))
+
+                # Record the feeds that failed
+                for failed_id in data.failed_ids:
+                    cursor.execute("""
+                        INSERT OR IGNORE INTO rss_feeds (id, name, updated_at)
+                        VALUES (?, ?, ?)
+                    """, (failed_id, failed_id, now_str))
+
+                    cursor.execute("""
+                        INSERT OR REPLACE INTO rss_crawl_status
+                        (crawl_record_id, feed_id, status)
+                        VALUES (?, ?, 'failed')
+                    """, (crawl_record_id, failed_id))
+
+            conn.commit()
+
+            return True, new_count, updated_count
+
+        except Exception as e:
+            print(f"{log_prefix} 保存 RSS 数据失败: {e}")
+            return False, 0, 0
+
+    def _get_rss_data_impl(self, date: Optional[str] = None) -> Optional[RSSData]:
+        """
+        Get all RSS data for the given date.
+
+        Args:
+            date: Date string (YYYY-MM-DD); defaults to today.
+
+        Returns:
+            An RSSData object, or None if there is no data.
+        """
+        try:
+            conn = self._get_connection(date, db_type="rss")
+            cursor = conn.cursor()
+
+            # Get all RSS data
+            cursor.execute("""
+                SELECT i.id, i.title, i.feed_id, f.name as feed_name,
+                       i.url, i.published_at, i.summary, i.author,
+                       i.first_crawl_time, i.last_crawl_time, i.crawl_count
+                FROM rss_items i
+                LEFT JOIN rss_feeds f ON i.feed_id = f.id
+                ORDER BY i.published_at DESC
+            """)
+
+            rows = cursor.fetchall()
+            if not rows:
+                return None
+
+            items: Dict[str, List[RSSItem]] = {}
+            id_to_name: Dict[str, str] = {}
+            crawl_date = self._format_date_folder(date)
+
+            for row in rows:
+                feed_id = row[2]
+                feed_name = row[3] or feed_id
+
+                id_to_name[feed_id] = feed_name
+
+                if feed_id not in items:
+                    items[feed_id] = []
+
+                items[feed_id].append(RSSItem(
+                    title=row[1],
+                    feed_id=feed_id,
+                    feed_name=feed_name,
+                    url=row[4] or "",
+                    published_at=row[5] or "",
+                    summary=row[6] or "",
+                    author=row[7] or "",
+                    crawl_time=row[9],
+                    first_time=row[8],
+                    last_time=row[9],
+                    count=row[10],
+                ))
+
+            # Get the latest crawl time
+            cursor.execute("""
+                SELECT crawl_time FROM rss_crawl_records
+                ORDER BY crawl_time DESC
+                LIMIT 1
+            """)
+            time_row = cursor.fetchone()
+            crawl_time = time_row[0] if time_row else self._format_time_filename()
+
+            # Get the feeds that failed
+            cursor.execute("""
+                SELECT DISTINCT cs.feed_id
+                FROM rss_crawl_status cs
+                JOIN rss_crawl_records cr ON cs.crawl_record_id = cr.id
+                WHERE cs.status = 'failed'
+            """)
+            failed_ids = [row[0] for row in cursor.fetchall()]
+
+            return RSSData(
+                date=crawl_date,
+                crawl_time=crawl_time,
+                items=items,
+                id_to_name=id_to_name,
+                failed_ids=failed_ids,
+            )
+
+        except Exception as e:
+            print(f"[存储] 读取 RSS 数据失败: {e}")
+            return None
+
+    def _detect_new_rss_items_impl(self, current_data: RSSData) -> Dict[str, List[RSSItem]]:
+        """
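+        Detect newly added RSS items (incremental mode).
+
+        Compares the current crawl against historical data to find new RSS items.
+        Key rule: a URL counts as new only if it never appeared in any earlier batch.
+
+        Args:
+            current_data: RSS data from the current crawl.
+
+        Returns:
+            New RSS items as {feed_id: [RSSItem, ...]}.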
+        """
+        try:
+            # Get the historical data
+            historical_data = self._get_rss_data_impl(current_data.date)
+
+            if not historical_data:
+                # No historical data, so everything is new
+                return current_data.items.copy()
+
+            # Get the time of the current batch
+            current_time = current_data.crawl_time
+
+            # Collect historical URLs (items with first_time < current_time)
+            historical_urls: Dict[str, set] = {}
+            for feed_id, rss_list in historical_data.items.items():
+                historical_urls[feed_id] = set()
+                for item in rss_list:
+                    first_time = getattr(item, 'first_time', item.crawl_time)
+                    if first_time < current_time:
+                        if item.url:
+                            historical_urls[feed_id].add(item.url)
+
+            # Check whether any historical data exists
+            has_historical_data = any(len(urls) > 0 for urls in historical_urls.values())
+            if not has_historical_data:
+                # First crawl: the notion of "new" does not apply
+                return {}
+
+            # Detect new items
+            new_items: Dict[str, List[RSSItem]] = {}
+            for feed_id, rss_list in current_data.items.items():
+                hist_set = historical_urls.get(feed_id, set())
+                for item in rss_list:
+                    # Decide whether the item is new based on its URL
+                    if item.url and item.url not in hist_set:
+                        if feed_id not in new_items:
+                            new_items[feed_id] = []
+                        new_items[feed_id].append(item)
+
+            return new_items
+
+        except Exception as e:
+            print(f"[存储] 检测新 RSS 条目失败: {e}")
+            return {}
+
+    def _get_latest_rss_data_impl(self, date: Optional[str] = None) -> Optional[RSSData]:
+        """
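+        Get the RSS data from the most recent crawl (current-list mode).
+
+        Args:
+            date: Date string (YYYY-MM-DD); defaults to today.
+
+        Returns:
+            RSS data from the most recent crawl, or None if there is no data.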
+        """
+        try:
+            conn = self._get_connection(date, db_type="rss")
+            cursor = conn.cursor()
+
+            # Get the latest crawl time
+            cursor.execute("""
+                SELECT crawl_time FROM rss_crawl_records
+                ORDER BY crawl_time DESC
+                LIMIT 1
+            """)
+
+            time_row = cursor.fetchone()
+            if not time_row:
+                return None
+
+            latest_time = time_row[0]
+
+            # Get the RSS data for that crawl time
+            cursor.execute("""
+                SELECT i.id, i.title, i.feed_id, f.name as feed_name,
+                       i.url, i.published_at, i.summary, i.author,
+                       i.first_crawl_time, i.last_crawl_time, i.crawl_count
+                FROM rss_items i
+                LEFT JOIN rss_feeds f ON i.feed_id = f.id
+                WHERE i.last_crawl_time = ?
+                ORDER BY i.published_at DESC
+            """, (latest_time,))
+
+            rows = cursor.fetchall()
+            if not rows:
+                return None
+
+            items: Dict[str, List[RSSItem]] = {}
+            id_to_name: Dict[str, str] = {}
+            crawl_date = self._format_date_folder(date)
+
+            for row in rows:
+                feed_id = row[2]
+                feed_name = row[3] or feed_id
+
+                id_to_name[feed_id] = feed_name
+
+                if feed_id not in items:
+                    items[feed_id] = []
+
+                items[feed_id].append(RSSItem(
+                    title=row[1],
+                    feed_id=feed_id,
+                    feed_name=feed_name,
+                    url=row[4] or "",
+                    published_at=row[5] or "",
+                    summary=row[6] or "",
+                    author=row[7] or "",
+                    crawl_time=row[9],
+                    first_time=row[8],
+                    last_time=row[9],
+                    count=row[10],
+                ))
+
+            # Get the feeds that failed (in the most recent crawl)
+            cursor.execute("""
+                SELECT cs.feed_id
+                FROM rss_crawl_status cs
+                JOIN rss_crawl_records cr ON cs.crawl_record_id = cr.id
+                WHERE cr.crawl_time = ? AND cs.status = 'failed'
+            """, (latest_time,))
+
+            failed_ids = [row[0] for row in cursor.fetchall()]
+
+            return RSSData(
+                date=crawl_date,
+                crawl_time=latest_time,
+                items=items,
+                id_to_name=id_to_name,
+                failed_ids=failed_ids,
+            )
+
+        except Exception as e:
+            print(f"[存储] 获取最新 RSS 数据失败: {e}")
+            return None
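
Reviewer note: the new-item rule used by `_detect_new_titles_impl` above (and, keyed by URL, by `_detect_new_rss_items_impl`) can be sketched in isolation as below. This is a minimal illustration and not the code in this commit: `Item` and `detect_new_titles` are hypothetical stand-ins, and the timestamps are assumed to be zero-padded `YYYY-MM-DD HH:MM:SS` strings so that plain string comparison orders them correctly. The branch where no historical data exists at all (everything is new) is left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Item:
    """Hypothetical stand-in for NewsItem; only the fields used here."""
    title: str
    first_time: str  # assumed format "YYYY-MM-DD HH:MM:SS", compared as a string


def detect_new_titles(
    current: Dict[str, List[Item]],
    historical: Dict[str, List[Item]],
    current_time: str,
) -> Dict[str, Dict[str, Item]]:
    # Titles first seen strictly before the current batch count as history.
    seen = {
        source_id: {it.title for it in items if it.first_time < current_time}
        for source_id, items in historical.items()
    }
    # With no earlier titles at all this is the first crawl: nothing is "new".
    if not any(seen.values()):
        return {}
    new_titles: Dict[str, Dict[str, Item]] = {}
    for source_id, items in current.items():
        known = seen.get(source_id, set())
        for it in items:
            if it.title not in known:
                new_titles.setdefault(source_id, {})[it.title] = it
    return new_titles


# "A" was first seen at 09:00, "B" first appears in the 10:00 batch.
history = {"weibo": [Item("A", "2025-01-01 09:00:00"), Item("B", "2025-01-01 10:00:00")]}
batch = {"weibo": [Item("A", "2025-01-01 09:00:00"), Item("B", "2025-01-01 10:00:00")]}
result = detect_new_titles(batch, history, "2025-01-01 10:00:00")
print(sorted(result["weibo"]))  # ['B']
```

Because the historical read already includes the rows written by the current batch, the strict `first_time < current_time` comparison is what keeps the current batch from masking its own new titles.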

+ 1 - 1
version

@@ -1 +1 @@
-5.0.0
+5.2.0

+ 1 - 1
version_mcp

@@ -1 +1 @@
-3.1.5
+3.1.6

Some files were not shown because too many files changed in this diff