
feat: optimize the MCP toolset; see README-MCP-FAQ.md for details

sansan, 4 months ago
Parent
Current commit
cef454c929

+ 257 - 298
README-EN.md

@@ -8,13 +8,11 @@
 
 <a href="https://trendshift.io/repositories/14726" target="_blank"><img src="https://trendshift.io/api/badge/repositories/14726" alt="sansan0%2FTrendRadar | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
 
-<a href="https://shandianshuo.cn" target="_blank" title="AI Voice Input, 4x Faster Than Typing ⚡"><img src="_image/shandianshuo.png" alt="FlashSpeak logo" height="55"/></a>
-
 [![GitHub Stars](https://img.shields.io/github/stars/sansan0/TrendRadar?style=flat-square&logo=github&color=yellow)](https://github.com/sansan0/TrendRadar/stargazers)
 [![GitHub Forks](https://img.shields.io/github/forks/sansan0/TrendRadar?style=flat-square&logo=github&color=blue)](https://github.com/sansan0/TrendRadar/network/members)
 [![License](https://img.shields.io/badge/license-GPL--3.0-blue.svg?style=flat-square)](LICENSE)
 [![Version](https://img.shields.io/badge/version-v4.0.3-blue.svg)](https://github.com/sansan0/TrendRadar)
-[![MCP](https://img.shields.io/badge/MCP-v1.1.1-green.svg)](https://github.com/sansan0/TrendRadar)
+[![MCP](https://img.shields.io/badge/MCP-v1.2.0-green.svg)](https://github.com/sansan0/TrendRadar)
 
 [![WeWork](https://img.shields.io/badge/WeWork-Notification-00D4AA?style=flat-square)](https://work.weixin.qq.com/)
 [![WeChat](https://img.shields.io/badge/WeChat-Notification-00D4AA?style=flat-square)](https://weixin.qq.com/)
@@ -42,44 +40,13 @@
 
 > This project is designed to be lightweight and easy to deploy
 
-<br>
-
-<details>
-<summary>🚨 <strong>【Must Read】Important Announcement: v4.0.0 Deployment & Storage Architecture Changes</strong></summary>
-
-<br>
-
-### 🛠️ Choose the Deployment Method That Fits You
-
-#### 🅰️ Option 1: Docker Deployment (Recommended 🔥)
-
-* **Features**: Most stable and simplest. Data is stored in **local SQLite**, fully under your control.
-
-* **Best for**: Users with their own server, NAS, or an always-on PC.
-
-👉 **[Jump to Docker Deployment Tutorial](#6-docker-deployment)**
-
----
-
-#### 🅱️ Option 2: GitHub Actions Deployment (Restored ✅)
-
-* **Features**: Data is no longer committed directly to the repo. Instead, it is stored in **Remote Cloud Storage**.
-
-* **Recommended**: Configure a remote cloud storage service (Cloudflare R2, Alibaba Cloud OSS, Tencent Cloud COS, etc.).
-
-👉 **[Click to View Detailed Configuration Tutorial](#-quick-start)**
-
-</details>
-
-<br>
-
 ## 📑 Quick Navigation
 
 <div align="center">
 
 | [🚀 Quick Start](#-quick-start) | [🤖 AI Analysis](#-ai-analysis) | [⚙️ Configuration Guide](#configuration-guide) | [📝 Changelog](#-changelog) | [❓ FAQ & Support](#-faq--support) |
 |:---:|:---:|:---:|:---:|:---:|
-| [🐳 Docker Deployment](#6-docker-deployment) | [🔌 MCP Clients](#-mcp-clients) | [📚 Related Projects](#-related-projects) | [🪄 Sponsors](#-sponsors) | |
+| [🐳 Docker Deployment](#6-docker-deployment) | [🔌 MCP Clients](#-mcp-clients) | [📚 Related Projects](#-related-projects) | | |
 
 </div>
 
@@ -146,189 +113,14 @@ After communication, the author indicated no concerns about server pressure, but
 
 <br>
 
-## ✨ Core Features
-
-### **Multi-Platform Trending News Aggregation**
-
-- Zhihu (知乎)
-- Douyin (抖音)
-- Bilibili Hot Search
-- Wallstreetcn (华尔街见闻)
-- Tieba (贴吧)
-- Baidu Hot Search
-- Yicai (财联社)
-- Thepaper (澎湃新闻)
-- Ifeng (凤凰网)
-- Toutiao (今日头条)
-- Weibo (微博)
-
-Default monitoring of 11 mainstream platforms, with support for adding custom platforms.
-
-> 💡 For detailed configuration, see [Configuration Guide - Platform Configuration](#1-platform-configuration)
-
-### **Smart Push Strategies**
-
-**Three Push Modes**:
-
-| Mode | Target Users | Push Feature |
-|------|--------------|--------------|
-| **Daily Summary** (daily) | Managers/Regular Users | Push all matched news of the day (includes previously pushed) |
-| **Current Rankings** (current) | Content Creators | Push current ranking matches (continuously ranked news appear each time) |
-| **Incremental Monitor** (incremental) | Traders/Investors | Push only new content, zero duplication |
-
-> 💡 **Quick Selection Guide:**
-> - 🔄 Don't want duplicate news → Use `incremental`
-> - 📊 Want complete ranking trends → Use `current`
-> - 📝 Need daily summary reports → Use `daily`
->
-> For detailed comparison and configuration, see [Configuration Guide - Push Mode Details](#3-push-mode-details)
-
-**Additional Features** (Optional):
-
-| Feature | Description | Default |
-|---------|-------------|---------|
-| **Push Time Window Control** | Set push time range (e.g., 09:00-18:00) to avoid non-work hours notifications | Disabled |
-| **Content Order Configuration** | Adjust display order of "Trending Keywords Stats" and "New Trending News" (v3.5.0 new) | Stats first |
-
-> 💡 For detailed configuration, see [Configuration Guide - Report Configuration](#7-report-configuration) and [Configuration Guide - Push Window](#8-push-window-configuration)
-
-### **Precise Content Filtering**
-
-Set personal keywords (e.g., AI, BYD, Education Policy) to receive only relevant trending news, filtering out noise.
-
-**Basic Syntax** (5 types):
-- Normal words: Basic matching
-- Required words `+`: Narrow scope
-- Filter words `!`: Exclude noise
-- Count limit `@`: Control display count (v3.2.0 new)
-- Global filter `[GLOBAL_FILTER]`: Globally exclude specified content (v3.5.0 new)
-
-**Advanced Features** (v3.2.0 new):
-- 🔢 **Keyword Sorting Control**: Sort by popularity or config order
-- 📊 **Display Count Limit**: Global config + individual override for flexible control
-
-**Group-based Management**:
-- Separate with blank lines, independent statistics for different topics
-
-> 💡 **Basic Configuration**: [Keyword Configuration - Basic Syntax](#keyword-basic-syntax)
->
-> 💡 **Advanced Configuration**: [Keyword Configuration - Advanced Settings](#keyword-advanced-settings)
->
-> 💡 You can also skip filtering and receive all trending news (leave frequency_words.txt empty)
-
-
-### **Trending Analysis**
-
-Real-time tracking of news popularity changes helps you understand not just "what's trending" but "how trends evolve."
-
-- **Timeline Tracking**: Records complete time span from first to last appearance
-- **Popularity Changes**: Tracks ranking changes and appearance frequency across time periods
-- **New Detection**: Real-time identification of emerging topics, marked with 🆕
-- **Continuity Analysis**: Distinguishes between one-time hot topics and continuously developing news
-- **Cross-Platform Comparison**: Same news across different platforms, showing media attention differences
-
-> 💡 Push format reference: [Configuration Guide - Push Format Reference](#5-push-format-reference)
-
-### **Personalized Trending Algorithm**
-
-No longer controlled by platform algorithms, TrendRadar reorganizes all trending searches:
-
-- **Prioritize High-Ranking News** (60%): Top-ranked news from each platform appears first
-- **Focus on Persistent Topics** (30%): Repeatedly appearing news is more important
-- **Consider Ranking Quality** (10%): Not just frequent, but consistently top-ranked
-
-> 💡 Weight adjustment guide: [Configuration Guide - Advanced Configuration](#4-advanced-configuration---hotspot-weight-adjustment)
-
-### **Multi-Channel Real-Time Push**
-
-Supports **WeWork** (+ WeChat push solution), **Feishu**, **DingTalk**, **Telegram**, **Email**, **ntfy**, **Bark**, **Slack** — messages delivered directly to phone and email.
-
-**📌 Multi-Account Push Notes (v3.5.0 New Feature):**
-
-- ✅ **Multi-Account Configuration Support**: All push channels (Feishu, DingTalk, WeWork, Telegram, ntfy, Bark, Slack) support configuring multiple accounts
-- ✅ **Configuration Method**: Use English semicolon `;` to separate multiple account values
-- ✅ **Example**: Set `FEISHU_WEBHOOK_URL` Secret value to `https://webhook1;https://webhook2`
-- ⚠️ **Paired Configuration**: Telegram and ntfy require paired parameter quantities to match (e.g., token and chat_id both have 2 values)
-- ⚠️ **Quantity Limit**: Default maximum 3 accounts per channel, exceeded values will be truncated
-
-### **Flexible Storage Architecture (v4.0.0 Major Update)**
-
-**Multi-Backend Support**:
-- ☁️ **Remote Cloud Storage**: GitHub Actions environment default, supports S3-compatible protocols (R2/OSS/COS, etc.), data stored in cloud, keeping repository clean
-- 💾 **Local SQLite**: Traditional SQLite database, stable and efficient (Docker/local deployment)
-- 🔀 **Auto Selection**: Auto-selects appropriate backend based on runtime environment
-
-**Data Format Hierarchy**:
-
-| Format | Role | Description |
-|--------|------|-------------|
-| **SQLite** | Primary storage | Complete data with statistics information |
-| **TXT** | Human-readable backup | Optional text records for manual viewing |
-| **HTML** | Web report | Beautiful visual report (GitHub Pages) |
-
-**Data Management Features**:
-- Auto data cleanup (configurable retention period)
-- Timezone support (configurable IANA time zone)
-- Cloud/local seamless switching
-
-> 💡 For storage configuration details, see [Configuration Details - Storage Configuration](#11-storage-configuration-v400-new)
-
-### **Multi-Platform Deployment**
-- **GitHub Actions**: Cloud automated operations (7-day check-in cycle + remote cloud storage)
-- **Docker Deployment**: Supports multi-architecture containerized operation
-- **Local Running**: Python environment direct execution
-
-
-### **AI Smart Analysis (v3.0.0 New)**
-
-AI conversational analysis system based on MCP (Model Context Protocol), enabling deep data mining with natural language.
-
-- **Conversational Query**: Ask in natural language, like "Query yesterday's Zhihu trending" or "Analyze recent Bitcoin popularity trends"
-- **14 Analysis Tools**: Date parsing, basic query, smart search, trend analysis, data insights, sentiment analysis, etc.
-- **Multi-Client Support**: Cherry Studio (GUI config), Claude Desktop, Cursor, Cline, etc.
-- **Deep Analysis Capabilities**:
-  - Topic trend tracking (popularity changes, lifecycle, viral detection, trend prediction)
-  - Cross-platform data comparison (activity stats, keyword co-occurrence)
-  - Smart summary generation, similar news finding, historical correlation search
-
-> **💡 Usage Tip**: AI features require local news data support
-> - Project includes **November 1-15** test data for immediate experience
-> - Recommend deploying the project yourself to get more real-time data
->
-> See [AI Analysis](#-ai-analysis) for details
-
-### **Zero Technical Barrier Deployment**
-
-One-click GitHub Fork to use, no programming required.
-
-> 30-second deployment: GitHub Pages (web browsing) supports one-click save as image for easy sharing
->
-> 1-minute deployment: WeWork (mobile notification)
-
-**💡 Tip:** Want a **real-time updated** web version? After forking, go to your repo Settings → Pages and enable GitHub Pages. [Preview Effect](https://sansan0.github.io/TrendRadar/).
-
-### **Reduce APP Dependencies**
-
-Transform from "algorithm recommendation captivity" to "actively getting the information you want"
-
-**Target Users:** Investors, content creators, PR professionals, news-conscious general users
-
-**Typical Scenarios:** Stock investment monitoring, brand sentiment tracking, industry trend watching, lifestyle news gathering
-
-
-| Github Pages Effect (Mobile Adapted, Email Push) | Feishu Push Effect |
-|:---:|:---:|
-| ![Github Pages Effect](_image/github-pages.png) | ![Feishu Push Effect](_image/feishu.jpg) |
+## 🪄 Sponsors
 
 <br>
 
 ## 📝 Changelog
 
->**Upgrade Instructions**:
-- **📌 Check Latest Updates**: **[Original Repository Changelog](https://github.com/sansan0/TrendRadar?tab=readme-ov-file#-changelog)**
-- **Tip**: Do NOT update this project via **Sync fork**. Check [Changelog] to understand specific [Upgrade Methods] and [Features]
-- **Major Version Upgrade**: Upgrading from v1.x to v2.y, recommend deleting existing fork and re-forking to save effort and avoid config conflicts
-
+>**📌 Check Latest Updates**: **[Original Repository Changelog](https://github.com/sansan0/TrendRadar?tab=readme-ov-file#-changelog)**:
+- **Tip**: Check [Changelog] to understand specific [Features]
 
 ### 2025/12/20 - v4.0.3
 
@@ -336,6 +128,21 @@ Transform from "algorithm recommendation captivity" to "actively getting the inf
 - Fixed incremental mode detection logic to correctly identify historical titles
 
 
+### 2025/12/26 - mcp-v1.2.0
+
+  **MCP Module Update - Optimized toolset, added aggregation & comparison features, merged redundant tools:**
+  - Added `aggregate_news` tool - Cross-platform news deduplication and aggregation
+  - Added `compare_periods` tool - Period comparison analysis (week-over-week/month-over-month)
+  - Merged `find_similar_news` + `search_related_news_history` → `find_related_news`
+  - Enhanced `get_trending_topics` - Added `auto_extract` mode for automatic trending extraction
+  - Fixed miscellaneous bugs
+  - Updated README-MCP-FAQ.md documentation in both Chinese and English (Q1-Q18)
+
+
+<details>
+<summary>👉 Click to expand: <strong>Historical Updates</strong></summary>
+
+
 ### 2025/12/13 - mcp-v1.1.0
 
 **MCP Module Update:**
@@ -346,10 +153,6 @@ Transform from "algorithm recommendation captivity" to "actively getting the inf
   - `list_available_dates`: List available dates in local/remote storage
 
 
-<details>
-<summary>👉 Click to expand: <strong>Historical Updates</strong></summary>
-
-
 ### 2025/12/17 - v4.0.1
 
 - StorageManager adds push record proxy methods
@@ -832,49 +635,201 @@ frequency_words.txt file added **required word** feature, using + sign
 
 <br>
 
-## 🚀 Quick Start
+## ✨ Core Features
 
-> **📖 Reminder**: Fork users should first **[check the latest official documentation](https://github.com/sansan0/TrendRadar?tab=readme-ov-file)** to ensure the configuration steps are up to date.
+### **Multi-Platform Trending News Aggregation**
 
-**⚠️ GitHub Actions Usage Instructions**
+- Zhihu (知乎)
+- Douyin (抖音)
+- Bilibili Hot Search
+- Wallstreetcn (华尔街见闻)
+- Tieba (贴吧)
+- Baidu Hot Search
+- Yicai (财联社)
+- Thepaper (澎湃新闻)
+- Ifeng (凤凰网)
+- Toutiao (今日头条)
+- Weibo (微博)
 
-**v4.0.0 Important Change**: Introduced "Activity Detection" mechanism—GitHub Actions now requires periodic check-in to maintain operation.
+Default monitoring of 11 mainstream platforms, with support for adding custom platforms.
 
-**🔄 Check-In Renewal Mechanism**:
-- **Running Cycle**: Valid for **7 days**—service will automatically suspend when countdown ends.
-- **Renewal Method**: Manually trigger the "Check In" workflow on the Actions page to reset the 7-day validity period.
-- **Operation Path**: `Actions` → `Check In` → `Run workflow`
-- **Design Philosophy**:
-    - If you forget for 7 days, maybe you don't really need it. Letting it stop is a digital detox, freeing you from the constant impact.
-    - GitHub Actions is a valuable public computing resource. The check-in mechanism aims to prevent wasted computing cycles, ensuring resources are allocated to truly active users who need them. Thank you for your understanding and support.
+> 💡 For detailed configuration, see [Configuration Guide - Platform Configuration](#1-platform-configuration)
 
-<details>
-<summary>👉 Click to expand: <strong>Lite Mode vs Full Mode + AI Analysis</strong></summary>
-<br>
+### **Smart Push Strategies**
 
-**📦 Data Storage (Recommended Configuration)**
+**Three Push Modes**:
 
-**Two Deployment Modes:**
+| Mode | Target Users | Push Feature |
+|------|--------------|--------------|
+| **Daily Summary** (daily) | Managers/Regular Users | Push all matched news of the day (includes previously pushed) |
+| **Current Rankings** (current) | Content Creators | Push current ranking matches (continuously ranked news appear each time) |
+| **Incremental Monitor** (incremental) | Traders/Investors | Push only new content, zero duplication |
 
-| Mode | Configuration Required | Features |
-|------|------------------------|----------|
-| **Lite Mode** | No storage configuration needed | Real-time crawling + Keyword filtering + Multi-channel push |
-| **Full Mode** | Configure remote cloud storage | Lite Mode + New detection + Trend tracking + Incremental push + AI analysis |
+> 💡 **Quick Selection Guide:**
+> - 🔄 Don't want duplicate news → Use `incremental`
+> - 📊 Want complete ranking trends → Use `current`
+> - 📝 Need daily summary reports → Use `daily`
+>
+> For detailed comparison and configuration, see [Configuration Guide - Push Mode Details](#3-push-mode-details)
 
-**Lite Mode Description**:
-- ✅ Available: Real-time news crawling, keyword filtering, hotspot weight ranking, current list push
-- ❌ Not Available: New news detection (🆕), trend tracking, incremental mode, daily summary accumulation, MCP AI analysis
+**Additional Features** (Optional):
 
-**Full Mode Description**:
-Configure remote cloud storage to unlock all features (see **Recommended Configuration: Remote Cloud Storage** below)
+| Feature | Description | Default |
+|---------|-------------|---------|
+| **Push Time Window Control** | Set push time range (e.g., 09:00-18:00) to avoid non-work hours notifications | Disabled |
+| **Content Order Configuration** | Adjust display order of "Trending Keywords Stats" and "New Trending News" (v3.5.0 new) | Stats first |
 
-**🚀 Recommended: Docker Deployment**
+> 💡 For detailed configuration, see [Configuration Guide - Report Configuration](#7-report-configuration) and [Configuration Guide - Push Window](#8-push-window-configuration)
 
-For long-term stable operation, we recommend [Docker Deployment](#6-docker-deployment), with data stored locally and no check-in required—though it does require purchasing a cloud server.
+### **Precise Content Filtering**
 
-</details>
+Set personal keywords (e.g., AI, BYD, Education Policy) to receive only relevant trending news, filtering out noise.
 
----
+**Basic Syntax** (5 types):
+- Normal words: Basic matching
+- Required words `+`: Narrow scope
+- Filter words `!`: Exclude noise
+- Count limit `@`: Control display count (v3.2.0 new)
+- Global filter `[GLOBAL_FILTER]`: Globally exclude specified content (v3.5.0 new)
+
+**Advanced Features** (v3.2.0 new):
+- 🔢 **Keyword Sorting Control**: Sort by popularity or config order
+- 📊 **Display Count Limit**: Global config + individual override for flexible control
+
+**Group-based Management**:
+- Separate with blank lines, independent statistics for different topics
+
+> 💡 **Basic Configuration**: [Keyword Configuration - Basic Syntax](#keyword-basic-syntax)
+>
+> 💡 **Advanced Configuration**: [Keyword Configuration - Advanced Settings](#keyword-advanced-settings)
+>
+> 💡 You can also skip filtering and receive all trending news (leave frequency_words.txt empty)
+
+
+### **Trending Analysis**
+
+Real-time tracking of news popularity changes helps you understand not just "what's trending" but "how trends evolve."
+
+- **Timeline Tracking**: Records complete time span from first to last appearance
+- **Popularity Changes**: Tracks ranking changes and appearance frequency across time periods
+- **New Detection**: Real-time identification of emerging topics, marked with 🆕
+- **Continuity Analysis**: Distinguishes between one-time hot topics and continuously developing news
+- **Cross-Platform Comparison**: Same news across different platforms, showing media attention differences
+
+> 💡 Push format reference: [Configuration Guide - Push Format Reference](#5-push-format-reference)
+
+### **Personalized Trending Algorithm**
+
+No longer controlled by platform algorithms, TrendRadar reorganizes all trending searches:
+
+- **Prioritize High-Ranking News** (60%): Top-ranked news from each platform appears first
+- **Focus on Persistent Topics** (30%): Repeatedly appearing news is more important
+- **Consider Ranking Quality** (10%): Not just frequent, but consistently top-ranked
+
+> 💡 Weight adjustment guide: [Configuration Guide - Advanced Configuration](#4-advanced-configuration---hotspot-weight-adjustment)
+
+### **Multi-Channel Real-Time Push**
+
+Supports **WeWork** (+ WeChat push solution), **Feishu**, **DingTalk**, **Telegram**, **Email**, **ntfy**, **Bark**, **Slack** — messages delivered directly to phone and email.
+
+**📌 Multi-Account Push Notes (v3.5.0 New Feature):**
+
+- ✅ **Multi-Account Configuration Support**: All push channels (Feishu, DingTalk, WeWork, Telegram, ntfy, Bark, Slack) support configuring multiple accounts
+- ✅ **Configuration Method**: Use English semicolon `;` to separate multiple account values
+- ✅ **Example**: Set `FEISHU_WEBHOOK_URL` Secret value to `https://webhook1;https://webhook2`
+- ⚠️ **Paired Configuration**: Telegram and ntfy require paired parameter quantities to match (e.g., token and chat_id both have 2 values)
+- ⚠️ **Quantity Limit**: Default maximum 3 accounts per channel, exceeded values will be truncated
+
+### **Flexible Storage Architecture (v4.0.0 Major Update)**
+
+**Multi-Backend Support**:
+- ☁️ **Remote Cloud Storage**: GitHub Actions environment default, supports S3-compatible protocols (R2/OSS/COS, etc.), data stored in cloud, keeping repository clean
+- 💾 **Local SQLite**: Traditional SQLite database, stable and efficient (Docker/local deployment)
+- 🔀 **Auto Selection**: Auto-selects appropriate backend based on runtime environment
+
+**Data Format Hierarchy**:
+
+| Format | Role | Description |
+|--------|------|-------------|
+| **SQLite** | Primary storage | Complete data with statistics information |
+| **TXT** | Human-readable backup | Optional text records for manual viewing |
+| **HTML** | Web report | Beautiful visual report (GitHub Pages) |
+
+**Data Management Features**:
+- Auto data cleanup (configurable retention period)
+- Timezone support (configurable IANA time zone)
+- Cloud/local seamless switching
+
+> 💡 For storage configuration details, see [Configuration Details - Storage Configuration](#11-storage-configuration-v400-new)
+
+### **Multi-Platform Deployment**
+- **GitHub Actions**: Cloud automated operations (7-day check-in cycle + remote cloud storage)
+- **Docker Deployment**: Supports multi-architecture containerized operation
+- **Local Running**: Python environment direct execution
+
+
+### **AI Smart Analysis (v3.0.0 New)**
+
+AI conversational analysis system based on MCP (Model Context Protocol), enabling deep data mining with natural language.
+
+- **Conversational Query**: Ask in natural language, like "Query yesterday's Zhihu trending" or "Analyze recent Bitcoin popularity trends"
+- **14 Analysis Tools**: Date parsing, basic query, smart search, trend analysis, data insights, sentiment analysis, etc.
+- **Multi-Client Support**: Cherry Studio (GUI config), Claude Desktop, Cursor, Cline, etc.
+- **Deep Analysis Capabilities**:
+  - Topic trend tracking (popularity changes, lifecycle, viral detection, trend prediction)
+  - Cross-platform data comparison (activity stats, keyword co-occurrence)
+  - Smart summary generation, similar news finding, historical correlation search
+
+> **💡 Usage Tip**: AI features require local news data support
+> - Project includes **November 1-15** test data for immediate experience
+> - Recommend deploying the project yourself to get more real-time data
+>
+> See [AI Analysis](#-ai-analysis) for details
+
+### **Zero Technical Barrier Deployment**
+
+One-click GitHub Fork to use, no programming required.
+
+> 30-second deployment: GitHub Pages (web browsing) supports one-click save as image for easy sharing
+>
+> 1-minute deployment: WeWork (mobile notification)
+
+**💡 Tip:** Want a **real-time updated** web version? After forking, go to your repo Settings → Pages and enable GitHub Pages. [Preview Effect](https://sansan0.github.io/TrendRadar/).
+
+### **Reduce APP Dependencies**
+
+Transform from "algorithm recommendation captivity" to "actively getting the information you want"
+
+**Target Users:** Investors, content creators, PR professionals, news-conscious general users
+
+**Typical Scenarios:** Stock investment monitoring, brand sentiment tracking, industry trend watching, lifestyle news gathering
+
+
+| Github Pages Effect (Mobile Adapted, Email Push) | Feishu Push Effect |
+|:---:|:---:|
+| ![Github Pages Effect](_image/github-pages.png) | ![Feishu Push Effect](_image/feishu.jpg) |
+
+
+<br>
+
+## 🚀 Quick Start
+
+> **📖 Reminder**: You should first **[check the latest official documentation](https://github.com/sansan0/TrendRadar?tab=readme-ov-file)** to ensure the configuration steps are up to date.
+
+### 🛠️ Choose the Deployment Method That Fits You
+
+#### 🅰️ Option A: Docker Deployment (Recommended 🔥)
+
+* **Features**: More stable than GitHub Actions
+* **Best for**: Users with their own server, NAS, or an always-on PC
+
+👉 **[Jump to Docker Deployment Tutorial](#6-docker-deployment)**
+
+#### 🅱️ Option B: GitHub Actions Deployment (This Chapter ⬇️)
+
+* **Features**: Data is stored in **Remote Cloud Storage** (no longer written to Git repo)
+* **Recommended**: Configure cloud storage service (Cloudflare R2 free tier is sufficient, Alibaba Cloud OSS, Tencent Cloud COS, etc.)
+* **Note**: Requires periodic check-in renewal (every 7 days)
 
 1️⃣ **Get project code**
 
@@ -884,7 +839,7 @@ For long-term stable operation, we recommend [Docker Deployment](#6-docker-deplo
    > - Any mention of "Fork" in this document can be understood as "Use this template"
    > - Using Fork may cause runtime issues, see [Issue #606](https://github.com/sansan0/TrendRadar/issues/606)
 
-2️⃣ **Setup GitHub Secrets (Required + Optional Platforms)**:
+2️⃣ **Setup GitHub Secrets**:
 
    In your forked repo, go to `Settings` > `Secrets and variables` > `Actions` > `New repository secret`
 
@@ -895,9 +850,37 @@ For long-term stable operation, we recommend [Docker Deployment](#6-docker-deplo
    - **DO NOT Create Custom Names**: The Secret Name must **strictly use** the names listed below (e.g., `WEWORK_WEBHOOK_URL`, `FEISHU_WEBHOOK_URL`, etc.). Do not modify or create new names arbitrarily, or the system will not recognize them
    - **Can Configure Multiple Platforms**: The system will send notifications to all configured platforms
 
-<details>
-<summary>👉 Click to expand: <strong>Multi-Account Push Notes (v3.5.0 New Feature)</strong></summary>
-<br>
+   **GitHub Actions Check-In Renewal Mechanism**:
+   - **Running Cycle**: Valid for **7 days**—service will automatically suspend when countdown ends.
+   - **Renewal Method**: Manually trigger the "Check In" workflow on the Actions page to reset the 7-day validity period.
+   - **Operation Path**: `Actions` → `Check In` → `Run workflow`
+   - **Design Philosophy**:
+     - If you forget for 7 days, maybe you don't really need it. Letting it stop is a digital detox, freeing you from the constant impact.
+     - GitHub Actions is a valuable public computing resource. The check-in mechanism aims to prevent wasted computing cycles, ensuring resources are allocated to truly active users who need them. Thank you for your understanding and support.
+
+   <details>
+   <summary>👉 Click to expand: <strong>Lite Mode vs Full Mode + AI Analysis</strong></summary>
+   <br>
+
+**Two Deployment Modes:**
+
+| Mode | Configuration Required | Features |
+|------|------------------------|----------|
+| **Lite Mode** | No storage configuration needed | Real-time crawling + Keyword filtering + Multi-channel push |
+| **Full Mode** | Configure remote cloud storage | Lite Mode + New detection + Trend tracking + Incremental push + AI analysis |
+
+**Lite Mode Description**:
+- ✅ Available: Real-time news crawling, keyword filtering, hotspot weight ranking, current list push
+- ❌ Not Available: New news detection (🆕), trend tracking, incremental mode, daily summary accumulation, MCP AI analysis
+
+**Full Mode Description**:
+Configure remote cloud storage to unlock all features (see **Recommended Configuration: Remote Cloud Storage** below)
+
+   </details>
+
+   <details>
+   <summary>👉 Click to expand: <strong>Multi-Account Push Notes (v3.5.0 New Feature)</strong></summary>
+   <br>
 
 - **Multi-Account Configuration Support**: All push channels (Feishu, DingTalk, WeWork, Telegram, ntfy, Bark, Slack) support configuring multiple accounts
 - **Configuration Method**: Use English semicolon `;` to separate multiple account values
@@ -915,7 +898,7 @@ For long-term stable operation, we recommend [Docker Deployment](#6-docker-deplo
 | `NTFY_TOPIC` | `topic1;topic2` |
 | `NTFY_TOKEN` | `;token2` (1st has no token, use empty string as placeholder) |
 
-</details>
+   </details>
 
    **Configuration Example:**
 
@@ -3494,53 +3477,6 @@ Any client supporting Model Context Protocol can connect to TrendRadar:
 
 </details>
 
-<br>
-
-## ☕ FAQ & Support
-
-> If you want to support this project, you can search **Tencent Charity** on WeChat and donate to **Education Support Programs** as you wish
->
-> Thanks to those who participated in the **one-yuan donation**! You are listed in the **Acknowledgments** at the top. Your support gives more motivation to open source maintenance. Personal donation QR code has been removed.
-
-- **GitHub Issues**: Suitable for targeted answers. Please provide complete info when asking (screenshots, error logs, system environment, etc.)
-- **WeChat Official Account**: Suitable for quick consultation. Suggest priority to communicate in public comment area of related articles. If private message, please use polite language 😉
-- 💡 Deployment successful? Come to our official account to share your experience! Your likes and suggestions are the driving force for continuous updates~
-
-
-<div align="center">
-
-| WeChat Official Account |
-|:---:|
-| <img src="_image/weixin.png" width="400" title="Silicon Tea Room"/> |
-
-</div>
-
-<br>
-
----
-
-## 🪄 Sponsors
-
-> Tracking so many trending topics daily, writing reports, replying messages making your wrists tired?
->
-> Try「FlashSpeak」AI Voice Input - Speak instead of type, 4x faster ⚡
->
-> On-device Model • Lightning Fast • Absolute Privacy • Mac/Win Support
->
-> From reading trends to content output, double your efficiency 👇
-
-<div align="center">
-
-[![Mac Download](https://img.shields.io/badge/Mac-Free_Download-FF6B6B?style=for-the-badge&logo=apple&logoColor=white)](https://shandianshuo.cn) [![Windows Download](https://img.shields.io/badge/Windows-Free_Download-FF6B6B?style=for-the-badge&logo=lightning&logoColor=white)](https://shandianshuo.cn)
-<a href="https://shandianshuo.cn" target="_blank">
-  <img src="_image/banner-shandianshuo.png" alt="FlashSpeak" width="700"/>
-</a>
-</div>
-
-
-
----
-
 
 ### Common Questions
 
@@ -3624,6 +3560,29 @@ Any client supporting Model Context Protocol can connect to TrendRadar:
 
 <br>
 
+## ☕ FAQ & Support
+
+> If you want to support this project, you can search **Tencent Charity** on WeChat and donate to **Education Support Programs** as you wish
+>
+> Thanks to those who participated in the **one-yuan donation**! You are listed in the **Acknowledgments** at the top. Your support gives more motivation to open source maintenance. Personal donation QR code has been removed.
+>
+> 🎯 Interested in sponsoring this project? Your banner will be displayed in the Sponsors section at the top.
+
+- **GitHub Issues**: Suitable for targeted answers. Please provide complete info when asking (screenshots, error logs, system environment, etc.)
+- **WeChat Official Account**: Suitable for quick consultation. Suggest priority to communicate in public comment area of related articles. If private message, please use polite language 😉
+- **Contact**: path@linux.do
+
+
+<div align="center">
+
+| WeChat Official Account |
+|:---:|
+| <img src="_image/weixin.png" width="400" title="Silicon Tea Room"/> |
+
+</div>
+
+<br>
+
 ## 📚 Related Projects
 
 > **4 Related Articles** (Chinese):

+ 134 - 37
README-MCP-FAQ-EN.md

@@ -21,7 +21,7 @@ The following optimization strategies are adopted by default, mainly to save AI
 
 **⚠️ Important:** The choice of AI model directly affects the tool call effectiveness. The smarter the AI, the more accurate the calls. When you remove the above restrictions, for example, from querying today to querying a week, first you need to have a week's data locally, and secondly, token consumption may multiply (why "may", for example, if I query "analyze 'Apple' trend in the last week", if there isn't much Apple news in that week, then token consumption may actually be less).
 
-**💡 Tip:** This project provides a dedicated date parsing tool `resolve_date_range`, which can accurately parse natural language date expressions like "last 7 days", "this week", ensuring all AI models get consistent date ranges. Recommended to use this tool first, see Q14 below for details.
+**💡 Tip:** This project provides a dedicated date parsing tool, `resolve_date_range`, which accurately parses natural language date expressions such as "last 7 days" or "this week", so that all AI models get consistent date ranges. We recommend using this tool first; see Q18 below for details.
 
 
 ## 💰 AI Models
@@ -129,21 +129,32 @@ After testing one query, please immediately check the [SiliconFlow Billing](http
 
 ---
 
-### Q3: How to view my followed topic frequency statistics?
+### Q3: How to view trending topic statistics?
 
 **You can ask like this:**
 
-- "How many times did my followed words appear today"
-- "Check which words in my follow list are most popular"
-- "Count the frequency of followed words in frequency_words.txt"
+- "How many times did my followed words appear today" (using preset keywords)
+- "Automatically analyze what hot topics are in today's news" (auto extract)
+- "See what are the hottest words in the news" (auto extract)
 
 **Tool called:** `get_trending_topics`
 
-**Important note:**
+**Two extraction modes:**
 
-- This tool **does not** automatically extract news hotspots
-- Rather, it counts your **personal followed words** set in `config/frequency_words.txt`
-- This is a **customizable** list, you can add followed words based on your interests
+| Mode | Description | Example Question |
+|------|------|---------|
+| **keywords** | Count preset followed words (based on `config/frequency_words.txt`, default) | "How many times did my followed words appear" |
+| **auto_extract** | Auto-extract high-frequency words from news titles (no preset needed) | "Auto-analyze hot topics" |
+
+**Usage examples:**
+
+```
+# Use preset followed words (default mode)
+get_trending_topics(mode="current")
+
+# Auto-extract high-frequency words (new feature)
+get_trending_topics(extract_mode="auto_extract", top_n=20)
+```
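
The `auto_extract` mode above is described only as extracting high-frequency words from news titles. The following is a minimal sketch of that idea, not the TrendRadar implementation: `extract_top_words`, `STOP_WORDS`, and the regex tokenization are assumptions made for illustration, and Chinese titles would need a word segmenter rather than this ASCII-only split.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the real project may filter differently.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "for", "is"}


def extract_top_words(titles, top_n=10):
    """Return the top_n (word, count) pairs, counting each word once per title."""
    counter = Counter()
    for title in titles:
        # Lowercase, keep alphanumeric runs, and de-duplicate within one title.
        words = set(re.findall(r"[a-z0-9]+", title.lower()))
        counter.update(w for w in words if w not in STOP_WORDS and len(w) > 1)
    return counter.most_common(top_n)
```

Counting each word once per title keeps a single repetitive headline from dominating the ranking; whether the real tool does this is not stated in the diff.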
 
 ---
 
@@ -198,22 +209,30 @@ AI: (date_range={"start": "2025-01-01", "end": "2025-01-31"})
 
 ---
 
-### Q5: How to find historical related news?
+### Q5: How to find related news?
 
 **You can ask like this:**
 
-- "Find news related to 'AI breakthrough' from yesterday"
-- "Search for historical reports about 'Tesla' from last week"
-- "Find news related to 'ChatGPT' from last month"
-- "Look for historical news related to 'iPhone launch event'"
+- "Find news similar to 'Tesla price cut'" (today)
+- "Find news related to 'AI breakthrough' from yesterday" (history)
+- "Search for historical reports about 'Tesla' from last week" (history)
+- "See if there are reports similar to this news in the last 7 days" (history)
 
-**Tool called:** `search_related_news_history`
+**Tool called:** `find_related_news`
+
+**Supported time ranges:**
+
+| Method | Description | Example |
+|--------|-------------|---------|
+| Not specified | Only query today's data (default) | "Find similar news" |
+| Preset values | yesterday, last_week, last_month | "Find related news from yesterday" |
+| Date range | `{"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}` | "Find related reports from Jan 1 to 7" |
 
 **Tool return behavior:**
 
-- Searches yesterday's data
-- Similarity threshold 0.4
+- Similarity threshold 0.5 (adjustable)
 - MCP tool returns up to 50 results to AI
+- Sorted by similarity
 - Does not include URL links
 
 **AI display behavior (Important):**
@@ -221,6 +240,12 @@ AI: (date_range={"start": "2025-01-01", "end": "2025-01-31"})
 - ⚠️ **AI usually auto-summarizes**, only showing partial related news
 - ✅ If you want to see all, need to explicitly request: "show all related news"
 
+**Can be adjusted:**
+
+- Specify time: like "find from last week"
+- Adjust threshold: like "similarity above 0.3"
+- Include links: say "need links"
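
The return behavior listed for `find_related_news` (a similarity threshold, results sorted by similarity, at most 50 items) can be sketched as below. This is only an illustration under stated assumptions: the diff does not show which similarity metric the tool uses, so `difflib.SequenceMatcher` is a stand-in, and `rank_related` is a hypothetical helper name.

```python
from difflib import SequenceMatcher


def rank_related(reference, titles, threshold=0.5, limit=50):
    """Return (title, score) pairs with score >= threshold, highest score first."""
    scored = []
    for title in titles:
        # ratio() lies in [0, 1]; 1.0 means the two strings are identical.
        score = SequenceMatcher(None, reference.lower(), title.lower()).ratio()
        if score >= threshold:
            scored.append((title, round(score, 3)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]
```

Lowering `threshold` (for example to 0.3, as in the adjustment above) admits looser matches, which is why the result cap matters for token usage.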
+
 ---
 
 ## Trend Analysis
@@ -330,27 +355,54 @@ AI: (date_range={"start": "2024-12-01", "end": "2024-12-31"})
 
 ---
 
-### Q9: How to find similar news reports?
+### Q9: How to get deduplicated cross-platform news?
 
 **You can ask like this:**
 
-- "Find news similar to 'Tesla price cut'"
-- "Find similar reports about iPhone launch"
-- "See if there are reports similar to this news"
-- "Find similar news, need links"
+- "Help me aggregate today's news, remove duplicates"
+- "See which news is reported on multiple platforms"
+- "Show me deduplicated hotspot news"
+- "Which news are cross-platform hot topics"
 
-**Tool called:** `find_similar_news`
+**Tool called:** `aggregate_news`
 
-**Tool return behavior:**
+**Tool functionality:**
 
-- Similarity threshold 0.6
-- MCP tool returns up to 50 results to AI
-- Does not include URL links
+- Automatically identifies the same event reported by different platforms
+- Merges similar news into one aggregated news item
+- Shows platform coverage for each news item
+- Calculates comprehensive heat weight
 
-**AI display behavior (Important):**
+**Return information:**
+
+| Field | Description |
+|-------|-------------|
+| **representative_title** | Representative title |
+| **platforms** | List of covered platforms |
+| **platform_count** | Number of covered platforms |
+| **is_cross_platform** | Whether it's cross-platform news |
+| **best_rank** | Best ranking |
+| **aggregate_weight** | Comprehensive weight |
+| **sources** | Details from each platform source |
+
+**Can be adjusted:**
+
+- Specify time: like "from last week"
+- Adjust similarity threshold: like "stricter matching" (0.8) or "looser matching" (0.5)
+- Specify platform: like "only Zhihu and Weibo"
 
-- ⚠️ **AI usually auto-summarizes**, only showing partial similar news
-- ✅ If you want to see all, need to explicitly request: "show all similar news"
+**Usage examples:**
+
+```
+# Aggregate today's news (default)
+aggregate_news()
+
+# Stricter similarity matching
+aggregate_news(similarity_threshold=0.8)
+
+# Specify date range
+aggregate_news(date_range={"start": "2025-01-01", "end": "2025-01-07"})
+```
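
To make the returned fields more concrete, here is a rough sketch of how cross-platform de-duplication could produce them. It is not the project's algorithm: the greedy clustering, the `difflib` similarity, the default of 0.7, the input shape (`title`, `platform`, `rank`), and the function name `aggregate` are all assumptions. `aggregate_weight` is left out because its formula is not given in this diff.

```python
from difflib import SequenceMatcher


def aggregate(items, similarity_threshold=0.7):
    """Greedily merge items whose titles resemble a cluster's first title."""
    clusters = []
    for item in items:
        for cluster in clusters:
            ratio = SequenceMatcher(None, cluster[0]["title"], item["title"]).ratio()
            if ratio >= similarity_threshold:
                cluster.append(item)
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append([item])

    results = []
    for cluster in clusters:
        platforms = sorted({entry["platform"] for entry in cluster})
        results.append({
            "representative_title": cluster[0]["title"],
            "platforms": platforms,
            "platform_count": len(platforms),
            "is_cross_platform": len(platforms) > 1,
            "best_rank": min(entry["rank"] for entry in cluster),
            "sources": cluster,
        })
    return results
```

A higher `similarity_threshold` merges fewer items, which matches the "stricter matching" wording above.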
 
 ---
 
@@ -371,9 +423,54 @@ AI: (date_range={"start": "2024-12-01", "end": "2024-12-31"})
 
 ---
 
+### Q11: How to compare hotspot changes across different periods?
+
+**You can ask like this:**
+
+- "Compare this week and last week's hotspot changes"
+- "See what's different between this month and last month"
+- "Analyze how the heat of 'artificial intelligence' differs between two periods"
+- "Compare platform activity changes"
+
+**Tool called:** `compare_periods`
+
+**Three comparison modes:**
+
+| Mode | Description | Use Case |
+|------|-------------|----------|
+| **overview** | Overall overview | News count change, keyword change, TOP news comparison |
+| **topic_shift** | Topic change analysis | Rising topics, falling topics, newly appeared topics |
+| **platform_activity** | Platform activity comparison | News count change by platform, fastest/slowest growing platforms |
+
+**Time period presets:**
+
+- `today` / `yesterday`: Today/Yesterday
+- `this_week` / `last_week`: This week/Last week
+- `this_month` / `last_month`: This month/Last month
+- Or use custom date range: `{"start": "2025-01-01", "end": "2025-01-07"}`
+
+**Usage examples:**
+
+```
+# Week-over-week analysis
+compare_periods(period1="last_week", period2="this_week")
+
+# Topic shift analysis
+compare_periods(period1="last_month", period2="this_month", compare_type="topic_shift")
+
+# Focus on specific topic
+compare_periods(
+    period1={"start": "2025-01-01", "end": "2025-01-07"},
+    period2={"start": "2025-01-08", "end": "2025-01-14"},
+    topic="artificial intelligence"
+)
+```
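
The period presets listed above have to be turned into concrete dates before two periods can be compared. One possible mapping is sketched below; it assumes weeks start on Monday and that `this_week` and `this_month` end on the current day, neither of which is confirmed by this diff. `resolve_period` is a hypothetical helper, not the project's `resolve_date_range` tool.

```python
from datetime import date, timedelta


def resolve_period(preset, today=None):
    """Map a preset name to an inclusive {"start", "end"} range of ISO dates."""
    today = today or date.today()
    monday = today - timedelta(days=today.weekday())  # assumes Monday-based weeks
    first_of_month = today.replace(day=1)

    if preset == "today":
        start = end = today
    elif preset == "yesterday":
        start = end = today - timedelta(days=1)
    elif preset == "this_week":
        start, end = monday, today
    elif preset == "last_week":
        start, end = monday - timedelta(days=7), monday - timedelta(days=1)
    elif preset == "this_month":
        start, end = first_of_month, today
    elif preset == "last_month":
        end = first_of_month - timedelta(days=1)
        start = end.replace(day=1)
    else:
        raise ValueError(f"unknown preset: {preset}")
    return {"start": start.isoformat(), "end": end.isoformat()}
```

Passing `today` explicitly makes the mapping reproducible, which helps when different AI clients would otherwise disagree about what "this week" means.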
+
+---
+
 ## System Management
 
-### Q11: How to view system configuration?
+### Q12: How to view system configuration?
 
 **You can ask like this:**
 
@@ -393,7 +490,7 @@ AI: (date_range={"start": "2024-12-01", "end": "2024-12-31"})
 
 ---
 
-### Q12: How to check system running status?
+### Q13: How to check system running status?
 
 **You can ask like this:**
 
@@ -413,7 +510,7 @@ AI: (date_range={"start": "2024-12-01", "end": "2024-12-31"})
 
 ---
 
-### Q13: How to manually trigger a crawl task?
+### Q14: How to manually trigger a crawl task?
 
 **You can ask like this:**
 
@@ -452,7 +549,7 @@ AI: (date_range={"start": "2024-12-01", "end": "2024-12-31"})
 
 ## Storage Sync
 
-### Q14: How to sync data from remote storage to local?
+### Q15: How to sync data from remote storage to local?
 
 **You can ask like this:**
 
@@ -484,7 +581,7 @@ Need to configure remote storage in `config/config.yaml` or set environment vari
 
 ---
 
-### Q15: How to view storage status?
+### Q16: How to view storage status?
 
 **You can ask like this:**
 
@@ -505,7 +602,7 @@ Need to configure remote storage in `config/config.yaml` or set environment vari
 
 ---
 
-### Q16: How to view available data dates?
+### Q17: How to view available data dates?
 
 **You can ask like this:**
 
@@ -532,7 +629,7 @@ Need to configure remote storage in `config/config.yaml` or set environment vari
 
 ---
 
-### Q17: How to parse natural language date expressions? (Recommended to use first)
+### Q18: How to parse natural language date expressions? (Recommended to use first)
 
 **You can ask like this:**
 

+ 134 - 37
README-MCP-FAQ.md

@@ -21,7 +21,7 @@
 
 **⚠️ 重要:** AI 模型的选择直接影响工具调用效果,AI 越智能,调用越准确。当你解除上面的限制,比如从今天的查询,放宽到一周的查询,首先你要在本地有一周的数据,其次,token 消耗量可能会倍增(为什么说可能,比如我查询 分析'苹果'最近一周的热度趋势,如果一周中没多少苹果的新闻,那么 token消耗量可能反而很少)
 
-**💡 提示:** 本项目提供了专门的日期解析工具 `resolve_date_range`,可以准确解析"最近7天"、"本周"等自然语言日期表达式,确保所有 AI 模型获得一致的日期范围。推荐优先使用该工具,详见下方 Q14
+**💡 提示:** 本项目提供了专门的日期解析工具 `resolve_date_range`,可以准确解析"最近7天"、"本周"等自然语言日期表达式,确保所有 AI 模型获得一致的日期范围。推荐优先使用该工具,详见下方 Q18
 
 
 ## 💰 AI 模型
@@ -129,21 +129,32 @@
 
 ---
 
-### Q3: 如何查看我关注的话题频率统计?
+### Q3: 如何查看热点话题统计?
 
 **你可以这样问:**
 
-- "我关注的词今天出现了多少次"
-- "看看我的关注词列表中哪些词最热门"
-- "统计一下 frequency_words.txt 中的关注词频率"
+- "我关注的词今天出现了多少次"(使用预设关注词)
+- "自动分析今天新闻里有哪些热门话题"(自动提取)
+- "看看新闻里最热门的词是什么"(自动提取)
 
 **调用的工具:** `get_trending_topics`
 
-**重要说明:**
+**两种提取模式:**
 
-- 本工具**不是**自动提取新闻热点
-- 而是统计你在 `config/frequency_words.txt` 中设置的**个人关注词**
-- 这是一个**可自定义**的列表,你可以根据兴趣添加关注词
+| 模式 | 说明 | 示例问法 |
+|------|------|---------|
+| **keywords** | 统计预设关注词(基于 `config/frequency_words.txt`,默认) | "我的关注词出现了多少次" |
+| **auto_extract** | 自动从新闻标题提取高频词(无需预设) | "自动分析热门话题" |
+
+**使用示例:**
+
+```
+# 使用预设关注词(默认模式)
+get_trending_topics(mode="current")
+
+# 自动提取高频词(新功能)
+get_trending_topics(extract_mode="auto_extract", top_n=20)
+```
 
 ---
 
@@ -198,22 +209,30 @@ AI:(date_range={"start": "2025-01-01", "end": "2025-01-31"})
 
 ---
 
-### Q5: 如何查找历史相关新闻?
+### Q5: 如何查找相关新闻?
 
 **你可以这样问:**
 
-- "查找昨天与'人工智能突破'相关的新闻"
-- "搜索上周关于'特斯拉'的历史报道"
-- "找出上个月与'ChatGPT'相关的新闻"
-- "看看'iPhone 发布会'相关的历史新闻"
+- "找出和'特斯拉降价'相似的新闻"(今天)
+- "查找昨天与'人工智能突破'相关的新闻"(历史)
+- "搜索上周关于'ChatGPT'的相关报道"(历史)
+- "看看最近7天有没有和这条新闻相似的报道"(历史)
+
+**调用的工具:** `find_related_news`
+
+**支持的时间范围:**
 
-**调用的工具:** `search_related_news_history`
+| 方式 | 说明 | 示例 |
+|------|------|------|
+| 不指定 | 只查询今天的数据(默认) | "找相似新闻" |
+| 预设值 | yesterday, last_week, last_month | "查找昨天的相关新闻" |
+| 日期范围 | `{"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}` | "查找1月1日到7日的相关报道" |
 
 **工具返回行为:**
 
-- 搜索昨天的数据
-- 相似度阈值 0.4
+- 相似度阈值 0.5(可调整)
 - MCP 工具会返回最多 50 条结果给 AI
+- 按相似度排序
 - 不包含 URL 链接
 
 **AI 展示行为(重要):**
@@ -221,6 +240,12 @@ AI:(date_range={"start": "2025-01-01", "end": "2025-01-31"})
 - ⚠️ **AI 通常会自动总结**,只展示部分相关新闻
 - ✅ 如果你想看全部,需要明确要求:"展示所有相关新闻"
 
+**可以调整:**
+
+- 指定时间:如"查找上周的"
+- 调整阈值:如"相似度 0.3 以上的都要"
+- 包含链接:说"需要链接"
+
 ---
 
 ## 趋势分析
@@ -330,27 +355,54 @@ AI:(date_range={"start": "2024-12-01", "end": "2024-12-31"})
 
 ---
 
-### Q9: 如何查找相似的新闻报道
+### Q9: 如何获取去重后的跨平台新闻
 
 **你可以这样问:**
 
-- "找出和'特斯拉降价'相似的新闻"
-- "查找关于 iPhone 发布的类似报道"
-- "看看有没有和这条新闻相似的报道"
-- "找相似新闻,需要链接"
+- "帮我聚合今天的新闻,去掉重复的"
+- "看看哪些新闻在多个平台都有报道"
+- "给我看去重后的热点新闻"
+- "哪些新闻是跨平台热点"
 
-**调用的工具:** `find_similar_news`
+**调用的工具:** `aggregate_news`
 
-**工具返回行为:**
+**工具功能:**
 
-- 相似度阈值 0.6
-- MCP 工具会返回最多 50 条结果给 AI
-- 不包含 URL 链接
+- 自动识别不同平台报道的同一事件
+- 将相似新闻合并为一条聚合新闻
+- 显示每条新闻的平台覆盖情况
+- 计算综合热度权重
 
-**AI 展示行为(重要):**
+**返回信息:**
+
+| 字段 | 说明 |
+|------|------|
+| **representative_title** | 代表性标题 |
+| **platforms** | 覆盖的平台列表 |
+| **platform_count** | 覆盖平台数量 |
+| **is_cross_platform** | 是否跨平台新闻 |
+| **best_rank** | 最佳排名 |
+| **aggregate_weight** | 综合权重 |
+| **sources** | 各平台来源详情 |
+
+**可以调整:**
 
-- ⚠️ **AI 通常会自动总结**,只展示部分相似新闻
-- ✅ 如果你想看全部,需要明确要求:"展示所有相似新闻"
+- 指定时间:如"最近一周的"
+- 调整相似度阈值:如"更严格匹配"(0.8)或"宽松匹配"(0.5)
+- 指定平台:如"只看知乎和微博"
+
+**使用示例:**
+
+```
+# 默认聚合今天的新闻
+aggregate_news()
+
+# 更严格的相似度匹配
+aggregate_news(similarity_threshold=0.8)
+
+# 指定日期范围
+aggregate_news(date_range={"start": "2025-01-01", "end": "2025-01-07"})
+```
 
 ---
 
@@ -371,9 +423,54 @@ AI:(date_range={"start": "2024-12-01", "end": "2024-12-31"})
 
 ---
 
+### Q11: 如何对比不同时期的热点变化?
+
+**你可以这样问:**
+
+- "对比本周和上周的热点变化"
+- "看看这个月和上个月有什么不同"
+- "分析'人工智能'在两个时期的热度差异"
+- "对比各平台活跃度的变化"
+
+**调用的工具:** `compare_periods`
+
+**三种对比模式:**
+
+| 模式 | 说明 | 适用场景 |
+|------|------|---------|
+| **overview** | 总体概览 | 新闻数量变化、关键词变化、TOP新闻对比 |
+| **topic_shift** | 话题变化分析 | 上升话题、下降话题、新出现话题 |
+| **platform_activity** | 平台活跃度对比 | 各平台新闻数量变化、增长最快/最慢的平台 |
+
+**时间段预设值:**
+
+- `today` / `yesterday`: 今天/昨天
+- `this_week` / `last_week`: 本周/上周
+- `this_month` / `last_month`: 本月/上月
+- 或使用自定义日期范围:`{"start": "2025-01-01", "end": "2025-01-07"}`
+
+**使用示例:**
+
+```
+# 周环比分析
+compare_periods(period1="last_week", period2="this_week")
+
+# 话题变化分析
+compare_periods(period1="last_month", period2="this_month", compare_type="topic_shift")
+
+# 聚焦特定话题
+compare_periods(
+    period1={"start": "2025-01-01", "end": "2025-01-07"},
+    period2={"start": "2025-01-08", "end": "2025-01-14"},
+    topic="人工智能"
+)
+```
+
+---
+
 ## 系统管理
 
-### Q11: 如何查看系统配置?
+### Q12: 如何查看系统配置?
 
 **你可以这样问:**
 
@@ -393,7 +490,7 @@ AI:(date_range={"start": "2024-12-01", "end": "2024-12-31"})
 
 ---
 
-### Q12: 如何检查系统运行状态?
+### Q13: 如何检查系统运行状态?
 
 **你可以这样问:**
 
@@ -413,7 +510,7 @@ AI:(date_range={"start": "2024-12-01", "end": "2024-12-31"})
 
 ---
 
-### Q13: 如何手动触发爬取任务?
+### Q14: 如何手动触发爬取任务?
 
 **你可以这样问:**
 
@@ -452,7 +549,7 @@ AI:(date_range={"start": "2024-12-01", "end": "2024-12-31"})
 
 ## 存储同步
 
-### Q14: 如何从远程存储同步数据到本地?
+### Q15: 如何从远程存储同步数据到本地?
 
 **你可以这样问:**
 
@@ -484,7 +581,7 @@ AI:(date_range={"start": "2024-12-01", "end": "2024-12-31"})
 
 ---
 
-### Q15: 如何查看存储状态?
+### Q16: 如何查看存储状态?
 
 **你可以这样问:**
 
@@ -505,7 +602,7 @@ AI:(date_range={"start": "2024-12-01", "end": "2024-12-31"})
 
 ---
 
-### Q16: 如何查看可用的数据日期?
+### Q17: 如何查看可用的数据日期?
 
 **你可以这样问:**
 
@@ -532,7 +629,7 @@ AI:(date_range={"start": "2024-12-01", "end": "2024-12-31"})
 
 ---
 
-### Q17: 如何解析自然语言日期表达式?(推荐优先使用)
+### Q18: 如何解析自然语言日期表达式?(推荐优先使用)
 
 **你可以这样问:**
 

+ 222 - 253
README.md

@@ -8,13 +8,11 @@
 
 <a href="https://trendshift.io/repositories/14726" target="_blank"><img src="https://trendshift.io/api/badge/repositories/14726" alt="sansan0%2FTrendRadar | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
 
-<a href="https://shandianshuo.cn" target="_blank" title="AI 语音输入,比打字快 4 倍 ⚡"><img src="_image/shandianshuo.png" alt="闪电说 logo" height="55"/></a>
-
 [![GitHub Stars](https://img.shields.io/github/stars/sansan0/TrendRadar?style=flat-square&logo=github&color=yellow)](https://github.com/sansan0/TrendRadar/stargazers)
 [![GitHub Forks](https://img.shields.io/github/forks/sansan0/TrendRadar?style=flat-square&logo=github&color=blue)](https://github.com/sansan0/TrendRadar/network/members)
 [![License](https://img.shields.io/badge/license-GPL--3.0-blue.svg?style=flat-square)](LICENSE)
 [![Version](https://img.shields.io/badge/version-v4.0.3-blue.svg)](https://github.com/sansan0/TrendRadar)
-[![MCP](https://img.shields.io/badge/MCP-v1.1.1-green.svg)](https://github.com/sansan0/TrendRadar)
+[![MCP](https://img.shields.io/badge/MCP-v1.2.0-green.svg)](https://github.com/sansan0/TrendRadar)
 
 [![企业微信通知](https://img.shields.io/badge/企业微信-通知-00D4AA?style=flat-square)](https://work.weixin.qq.com/)
 [![个人微信通知](https://img.shields.io/badge/个人微信-通知-00D4AA?style=flat-square)](https://weixin.qq.com/)
@@ -42,50 +40,19 @@
 
 > 本项目以轻量,易部署为目标
 
-<br>
-
-<details>
-<summary>🚨 <strong>【必读】重要公告:v4.0.0 部署方式与存储架构变更</strong></summary>
-
-<br>
-
-### 🛠️ 请选择适合你的部署方式
-
-#### 🅰️ 方案一:Docker 部署(推荐 🔥)
-
-* **特点**:最稳定、最简单,数据存储在 **本地 SQLite**,完全自主可控。
-
-* **适用**:有自己的服务器、NAS 或长期运行的电脑。
-
-👉 **[跳转到 Docker 部署教程](#6-docker-部署)**
-
----
-
-#### 🅱️ 方案二:GitHub Actions 部署(已恢复 ✅)
-
-* **特点**:数据不再直接写入仓库(Git Commit),而是存储在 **远程云存储**。
-
-* **推荐**:配置一个远程云存储服务(Cloudflare R2、阿里云 OSS、腾讯云 COS 等)。
-
-👉 **[点击查看详细配置教程](#-快速开始)**
-
-</details>
-
-<br>
-
 ## 📑 快速导航
 
 <div align="center">
 
 | [🚀 快速开始](#-快速开始) | [🤖 AI 智能分析](#-ai-智能分析) | [⚙️ 配置详解](#配置详解) | [📝 更新日志](#-更新日志) | [❓ 答疑与交流](#问题答疑与交流) |
 |:---:|:---:|:---:|:---:|:---:|
-| [🐳 Docker部署](#6-docker-部署) | [🔌 MCP客户端](#-mcp-客户端) | [📚 项目相关](#-项目相关) | [🪄 赞助商](#-赞助商) | |
+| [🐳 Docker部署](#6-docker-部署) | [🔌 MCP客户端](#-mcp-客户端) | [📚 项目相关](#-项目相关) | | |
 
 </div>
 
 - 感谢**耐心反馈 bug** 的贡献者,你们的每一条反馈让项目更加完善😉;  
-- 感谢**为项目点 star** 的观众们,**fork** 你所欲也,**star** 我所欲也,两者得兼😍是对开源精神最好的支持;  
-- 感谢**关注[公众号](#问题答疑与交流)** 的读者们,你们的留言、点赞、分享和推荐等积极互动让内容更有温度😎。  
+- 感谢**为项目点 star** 的观众们,**fork** 你所欲也,**star** 我所欲也,两者得兼😍是对开源精神最好的支持; 
+- 感谢**关注[公众号](#问题答疑与交流)** 的读者们,你们的留言、点赞、分享和推荐等积极互动让内容更有温度😎。 
 
 <details>
 <summary>👉 点击展开:<strong>致谢名单</strong> (当前 <strong>🔥73🔥</strong> 位)</summary>
@@ -195,186 +162,14 @@
 
 <br>
 
-## ✨ 核心功能
-
-### **全网热点聚合**
-
-- 知乎
-- 抖音
-- bilibili 热搜
-- 华尔街见闻
-- 贴吧
-- 百度热搜
-- 财联社热门
-- 澎湃新闻
-- 凤凰网
-- 今日头条
-- 微博
-
-默认监控 11 个主流平台,也可自行增加额外的平台
-
-> 💡 详细配置教程见 [配置详解 - 平台配置](#1-平台配置)
-
-### **智能推送策略**
-
-**三种推送模式**:
-
-| 模式 | 适用场景 | 推送特点 |
-|------|---------|---------|
-| **当日汇总** (daily) | 企业管理者/普通用户 | 按时推送当日所有匹配新闻(会包含之前推送过的) |
-| **当前榜单** (current) | 自媒体人/内容创作者 | 按时推送当前榜单匹配新闻(持续在榜的每次都出现) |
-| **增量监控** (incremental) | 投资者/交易员 | 仅推送新增内容,零重复 |
-
-> 💡 **快速选择指南:**
-> - 🔄 不想看到重复新闻 → 用 `incremental`(增量监控)
-> - 📊 想看完整榜单趋势 → 用 `current`(当前榜单)
-> - 📝 需要每日汇总报告 → 用 `daily`(当日汇总)
->
-> 详细对比和配置教程见 [配置详解 - 推送模式详解](#3-推送模式详解)
-
-**附加功能**(可选):
-
-| 功能 | 说明 | 默认 |
-|------|------|------|
-| **推送时间窗口控制** | 设定推送时间范围(如 09:00-18:00),避免非工作时间打扰 | 关闭 |
-| **内容顺序配置** | 调整"热点词汇统计"和"新增热点新闻"的显示顺序(v3.5.0 新增) | 统计在前 |
-
-> 💡 详细配置教程见 [配置详解 - 报告配置](#7-报告配置) 和 [配置详解 - 推送时间窗口](#8-推送时间窗口配置)
-
-### **精准内容筛选**
-
-设置个人关键词(如:AI、比亚迪、教育政策),只推送相关热点,过滤无关信息
-
-**基础语法**(5种):
-- 普通词:基础匹配
-- 必须词 `+`:限定范围
-- 过滤词 `!`:排除干扰
-- 数量限制 `@`:控制显示数量(v3.2.0 新增)
-- 全局过滤 `[GLOBAL_FILTER]`:全局排除指定内容(v3.5.0 新增)
-
-**高级功能**(v3.2.0 新增):
-- 🔢 **关键词排序控制**:按热度优先 or 配置顺序优先
-- 📊 **显示数量精准限制**:全局配置 + 单独配置,灵活控制推送长度
-
-**词组化管理**:
-- 空行分隔,独立统计不同主题热点
-
-> 💡 **基础配置教程**:[关键词配置 - 基础语法](#关键词基础语法)
->
-> 💡 **高级配置教程**:[关键词配置 - 高级配置](#关键词高级配置)
->
-> 💡 也可以不做筛选,完整推送所有热点(将 frequency_words.txt 留空)
-
-### **热点趋势分析**
-
-实时追踪新闻热度变化,让你不仅知道"什么在热搜",更了解"热点如何演变"
-
-- **时间轴追踪**:记录每条新闻从首次出现到最后出现的完整时间跨度
-- **热度变化**:统计新闻在不同时间段的排名变化和出现频次
-- **新增检测**:实时识别新出现的热点话题,用🆕标记第一时间提醒
-- **持续性分析**:区分一次性热点话题和持续发酵的深度新闻
-- **跨平台对比**:同一新闻在不同平台的排名表现,看出媒体关注度差异
-
-> 💡 推送格式说明见 [配置详解 - 推送格式参考](#5-推送格式参考)
-
-### **个性化热点算法**
-
-不再被各个平台的算法牵着走,TrendRadar 会重新整理全网热搜:
-
-- **看重排名高的新闻**(占60%):各平台前几名的新闻优先显示
-- **关注持续出现的话题**(占30%):反复出现的新闻更重要
-- **考虑排名质量**(占10%):不仅多次出现,还经常排在前列
-
-> 💡 这三个比例可以调整,详见 [配置详解 - 热点权重调整](#4-热点权重调整)
-
-### **多渠道实时推送**
-
-支持**企业微信**(+ 微信推送方案)、**飞书**、**钉钉**、**Telegram**、**邮件**、**ntfy**、**Bark**、**Slack**,消息直达手机和邮箱
-
-**📌 多账号推送说明(v3.5.0 新增):**
-
-- ✅ **支持多账号配置**:所有推送渠道(飞书、钉钉、企业微信、Telegram、ntfy、Bark、Slack)均支持配置多个账号
-- ✅ **配置方式**:使用英文分号 `;` 分隔多个账号值
-- ✅ **示例**:`FEISHU_WEBHOOK_URL` 的 Secret 值填写 `https://webhook1;https://webhook2`
-- ⚠️ **配对配置**:Telegram 和 ntfy 需要保证配对参数数量一致(如 token 和 chat_id 都是 2 个)
-- ⚠️ **数量限制**:默认每个渠道最多 3 个账号,超出会被截断
-
-### **灵活存储架构**(v4.0.0 重大更新)
-
-**多存储后端支持**:
-- ☁️ **远程云存储**:GitHub Actions 环境默认,支持 S3 兼容协议(R2/OSS/COS 等),数据存储在云端,不污染仓库
-- 💾 **本地 SQLite 数据库**:Docker/本地环境默认,数据完全可控
-- 🔄 **自动后端选择**:根据运行环境智能切换存储方式
-
-**数据格式**:
-| 格式 | 用途 | 说明 |
-|------|------|------|
-| **SQLite** | 主存储 | 单文件数据库,查询快速,支持 MCP AI 分析 |
-| **TXT** | 可选快照 | 可读文本格式,方便直接查看 |
-| **HTML** | 报告展示 | 精美可视化页面,PC/移动端适配 |
-
-**数据管理**:
-- ✅ 自动清理过期数据(可配置保留天数)
-- ✅ 时区配置支持(全球时区)
-
-> 💡 详细说明见 [配置详解 - 存储配置](#9-存储配置)
-
-### **多端部署**
-- **GitHub Actions**:定时自动爬取 + 远程云存储(需签到续期)
-- **Docker 部署**:支持多架构容器化运行,数据本地存储
-- **本地运行**:Windows/Mac/Linux 直接运行
-
-
-### **AI 智能分析(v3.0.0 新增)**
-
-基于 MCP (Model Context Protocol) 协议的 AI 对话分析系统,让你用自然语言深度挖掘新闻数据
-
-- **对话式查询**:用自然语言提问,如"查询昨天知乎的热点"、"分析比特币最近的热度趋势"
-- **13 种分析工具**:涵盖基础查询、智能检索、趋势分析、数据洞察、情感分析等
-- **多客户端支持**:Cherry Studio(GUI 配置)、Claude Desktop、Cursor、Cline 等
-- **深度分析能力**:
-  - 话题趋势追踪(热度变化、生命周期、爆火检测、趋势预测)
-  - 跨平台数据对比(活跃度统计、关键词共现)
-  - 智能摘要生成、相似新闻查找、历史关联检索
-
-> **💡 使用提示**:AI 功能需要本地新闻数据支持
-> - 项目自带 **11月1-15日** 测试数据,可立即体验
-> - 建议自行部署运行项目,获取更实时的数据
->
-> 详见 [AI 智能分析](#-ai-智能分析)
-
-### **零技术门槛部署**
-
-GitHub 一键 Fork 即可使用,无需编程基础。
-
-> 30秒部署: GitHub Pages(网页浏览)支持一键保存成图片,随时分享给他人
->
-> 1分钟部署: 企业微信(手机通知)
-
-**💡 提示:** 想要**实时更新**的网页版?fork 后,进入你的仓库 Settings → Pages,启用 GitHub Pages。[效果预览](https://sansan0.github.io/TrendRadar/)。
-
-### **减少 APP 依赖**
-
-从"被算法推荐绑架"变成"主动获取自己想要的信息"
-
-**适合人群:** 投资者、自媒体人、企业公关、关心时事的普通用户
-
-**典型场景:** 股市投资监控、品牌舆情追踪、行业动态关注、生活资讯获取
-
-
-| Github Pages 效果(手机端适配、邮箱推送效果) | 飞书推送效果 |
-|:---:|:---:|
-| ![Github Pages效果](_image/github-pages.png) | ![飞书推送效果](_image/feishu.jpg) |
+## 🪄 赞助商
 
 <br>
 
 ## 📝 更新日志
 
->**升级说明**:
-- **📌 查看最新更新**:**[原仓库更新日志](https://github.com/sansan0/TrendRadar?tab=readme-ov-file#-更新日志)**
-- **提示**:不要通过 **Sync fork** 更新本项目,建议查看【历史更新】,明确具体的【升级方式】和【功能内容】
-- **大版本升级**:从 v1.x 升级到 v2.y,建议删除现有 fork 后重新 fork,这样更省力且避免配置冲突
-
+> **📌 查看最新更新**:**[原仓库更新日志](https://github.com/sansan0/TrendRadar?tab=readme-ov-file#-更新日志)** :
+- **提示**:建议查看【历史更新】,明确具体的【功能内容】
 
 
 ### 2025/12/20 - v4.0.3
@@ -383,6 +178,21 @@ GitHub 一键 Fork 即可使用,无需编程基础。
 - 修复增量模式检测逻辑,正确识别历史标题
 
 
+### 2025/12/26 - mcp-v1.2.0
+
+  **MCP 模块更新 - 优化工具集,新增聚合对比功能,合并冗余工具:**
+  - 新增 `aggregate_news` 工具 - 跨平台新闻去重聚合
+  - 新增 `compare_periods` 工具 - 时期对比分析(周环比/月环比)
+  - 合并 `find_similar_news` + `search_related_news_history` → `find_related_news`
+  - 增强 `get_trending_topics` - 新增 `auto_extract` 模式自动提取热点
+  - 修复若干bug
+  - 同步更新 README-MCP-FAQ.md 文档的中英文版 (Q1-Q18)
+
+
+<details>
+<summary>👉 点击展开:<strong>历史更新</strong></summary>
+
+
 ### 2025/12/13 - mcp-v1.1.0
 
   **MCP 模块更新:**
@@ -393,11 +203,6 @@ GitHub 一键 Fork 即可使用,无需编程基础。
     - `list_available_dates`: 列出本地/远程可用日期范围
 
 
-<details>
-<summary>👉 点击展开:<strong>历史更新</strong></summary>
-
-
-
 ### 2025/12/17 - v4.0.1
 
 - StorageManager 添加推送记录代理方法
@@ -869,11 +674,200 @@ frequency_words.txt 文件增加了一个【必须词】功能,使用 + 号
 
 </details>
 
+<br>
+
+## ✨ 核心功能
+
+### **全网热点聚合**
+
+- 知乎
+- 抖音
+- bilibili 热搜
+- 华尔街见闻
+- 贴吧
+- 百度热搜
+- 财联社热门
+- 澎湃新闻
+- 凤凰网
+- 今日头条
+- 微博
+
+默认监控 11 个主流平台,也可自行增加额外的平台
+
+> 💡 详细配置教程见 [配置详解 - 平台配置](#1-平台配置)
+
+### **智能推送策略**
+
+**三种推送模式**:
+
+| 模式 | 适用场景 | 推送特点 |
+|------|---------|---------|
+| **当日汇总** (daily) | 企业管理者/普通用户 | 按时推送当日所有匹配新闻(会包含之前推送过的) |
+| **当前榜单** (current) | 自媒体人/内容创作者 | 按时推送当前榜单匹配新闻(持续在榜的每次都出现) |
+| **增量监控** (incremental) | 投资者/交易员 | 仅推送新增内容,零重复 |
+
+> 💡 **快速选择指南:**
+> - 🔄 不想看到重复新闻 → 用 `incremental`(增量监控)
+> - 📊 想看完整榜单趋势 → 用 `current`(当前榜单)
+> - 📝 需要每日汇总报告 → 用 `daily`(当日汇总)
+>
+> 详细对比和配置教程见 [配置详解 - 推送模式详解](#3-推送模式详解)
+
+**附加功能**(可选):
+
+| 功能 | 说明 | 默认 |
+|------|------|------|
+| **推送时间窗口控制** | 设定推送时间范围(如 09:00-18:00),避免非工作时间打扰 | 关闭 |
+| **内容顺序配置** | 调整"热点词汇统计"和"新增热点新闻"的显示顺序(v3.5.0 新增) | 统计在前 |
+
+> 💡 详细配置教程见 [配置详解 - 报告配置](#7-报告配置) 和 [配置详解 - 推送时间窗口](#8-推送时间窗口配置)
+
+### **精准内容筛选**
+
+设置个人关键词(如:AI、比亚迪、教育政策),只推送相关热点,过滤无关信息
+
+**基础语法**(5种):
+- 普通词:基础匹配
+- 必须词 `+`:限定范围
+- 过滤词 `!`:排除干扰
+- 数量限制 `@`:控制显示数量(v3.2.0 新增)
+- 全局过滤 `[GLOBAL_FILTER]`:全局排除指定内容(v3.5.0 新增)
+
+**高级功能**(v3.2.0 新增):
+- 🔢 **关键词排序控制**:按热度优先 or 配置顺序优先
+- 📊 **显示数量精准限制**:全局配置 + 单独配置,灵活控制推送长度
+
+**词组化管理**:
+- 空行分隔,独立统计不同主题热点
+
+> 💡 **基础配置教程**:[关键词配置 - 基础语法](#关键词基础语法)
+>
+> 💡 **高级配置教程**:[关键词配置 - 高级配置](#关键词高级配置)
+>
+> 💡 也可以不做筛选,完整推送所有热点(将 frequency_words.txt 留空)
+
+### **热点趋势分析**
+
+实时追踪新闻热度变化,让你不仅知道"什么在热搜",更了解"热点如何演变"
+
+- **时间轴追踪**:记录每条新闻从首次出现到最后出现的完整时间跨度
+- **热度变化**:统计新闻在不同时间段的排名变化和出现频次
+- **新增检测**:实时识别新出现的热点话题,用🆕标记第一时间提醒
+- **持续性分析**:区分一次性热点话题和持续发酵的深度新闻
+- **跨平台对比**:同一新闻在不同平台的排名表现,看出媒体关注度差异
+
+> 💡 推送格式说明见 [配置详解 - 推送格式参考](#5-推送格式参考)
+
+### **个性化热点算法**
+
+不再被各个平台的算法牵着走,TrendRadar 会重新整理全网热搜:
+
+- **看重排名高的新闻**(占60%):各平台前几名的新闻优先显示
+- **关注持续出现的话题**(占30%):反复出现的新闻更重要
+- **考虑排名质量**(占10%):不仅多次出现,还经常排在前列
+
+> 💡 这三个比例可以调整,详见 [配置详解 - 热点权重调整](#4-热点权重调整)
+
+### **多渠道实时推送**
+
+支持**企业微信**(+ 微信推送方案)、**飞书**、**钉钉**、**Telegram**、**邮件**、**ntfy**、**Bark**、**Slack**,消息直达手机和邮箱
+
+**📌 多账号推送说明(v3.5.0 新增):**
+
+- ✅ **支持多账号配置**:所有推送渠道(飞书、钉钉、企业微信、Telegram、ntfy、Bark、Slack)均支持配置多个账号
+- ✅ **配置方式**:使用英文分号 `;` 分隔多个账号值
+- ✅ **示例**:`FEISHU_WEBHOOK_URL` 的 Secret 值填写 `https://webhook1;https://webhook2`
+- ⚠️ **配对配置**:Telegram 和 ntfy 需要保证配对参数数量一致(如 token 和 chat_id 都是 2 个)
+- ⚠️ **数量限制**:默认每个渠道最多 3 个账号,超出会被截断
+
+### **灵活存储架构**(v4.0.0 重大更新)
+
+**多存储后端支持**:
+- ☁️ **远程云存储**:GitHub Actions 环境默认,支持 S3 兼容协议(R2/OSS/COS 等),数据存储在云端,不污染仓库
+- 💾 **本地 SQLite 数据库**:Docker/本地环境默认,数据完全可控
+- 🔄 **自动后端选择**:根据运行环境智能切换存储方式
+
+**数据格式**:
+| 格式 | 用途 | 说明 |
+|------|------|------|
+| **SQLite** | 主存储 | 单文件数据库,查询快速,支持 MCP AI 分析 |
+| **TXT** | 可选快照 | 可读文本格式,方便直接查看 |
+| **HTML** | 报告展示 | 精美可视化页面,PC/移动端适配 |
+
+**数据管理**:
+- ✅ 自动清理过期数据(可配置保留天数)
+- ✅ 时区配置支持(全球时区)
+
+> 💡 详细说明见 [配置详解 - 存储配置](#9-存储配置)
+
+### **多端部署**
+- **GitHub Actions**:定时自动爬取 + 远程云存储(需签到续期)
+- **Docker 部署**:支持多架构容器化运行,数据本地存储
+- **本地运行**:Windows/Mac/Linux 直接运行
+
+
+### **AI 智能分析(v3.0.0 新增)**
+
+基于 MCP (Model Context Protocol) 协议的 AI 对话分析系统,让你用自然语言深度挖掘新闻数据
+
+- **对话式查询**:用自然语言提问,如"查询昨天知乎的热点"、"分析比特币最近的热度趋势"
+- **13 种分析工具**:涵盖基础查询、智能检索、趋势分析、数据洞察、情感分析等
+- **多客户端支持**:Cherry Studio(GUI 配置)、Claude Desktop、Cursor、Cline 等
+- **深度分析能力**:
+  - 话题趋势追踪(热度变化、生命周期、爆火检测、趋势预测)
+  - 跨平台数据对比(活跃度统计、关键词共现)
+  - 智能摘要生成、相似新闻查找、历史关联检索
+
+> **💡 使用提示**:AI 功能需要本地新闻数据支持
+> - 项目自带 **11月1-15日** 测试数据,可立即体验
+> - 建议自行部署运行项目,获取更实时的数据
+>
+> 详见 [AI 智能分析](#-ai-智能分析)
+
+### **零技术门槛部署**
+
+GitHub 一键 Fork 即可使用,无需编程基础。
+
+> 30秒部署: GitHub Pages(网页浏览)支持一键保存成图片,随时分享给他人
+>
+> 1分钟部署: 企业微信(手机通知)
+
+**💡 提示:** 想要**实时更新**的网页版?fork 后,进入你的仓库 Settings → Pages,启用 GitHub Pages。[效果预览](https://sansan0.github.io/TrendRadar/)。
+
+### **减少 APP 依赖**
+
+从"被算法推荐绑架"变成"主动获取自己想要的信息"
+
+**适合人群:** 投资者、自媒体人、企业公关、关心时事的普通用户
+
+**典型场景:** 股市投资监控、品牌舆情追踪、行业动态关注、生活资讯获取
+
+
+| Github Pages 效果(手机端适配、邮箱推送效果) | 飞书推送效果 |
+|:---:|:---:|
+| ![Github Pages效果](_image/github-pages.png) | ![飞书推送效果](_image/feishu.jpg) |
+
+
 <br>
 
 ## 🚀 快速开始
 
-> **📖 提醒**:Fork 用户建议先 **[查看最新官方文档](https://github.com/sansan0/TrendRadar?tab=readme-ov-file)**,确保配置步骤是最新的。
+> **📖 提醒**:建议先 **[查看最新官方文档](https://github.com/sansan0/TrendRadar?tab=readme-ov-file)**,确保配置步骤是最新的。
+
+### 🛠️ 请选择适合你的部署方式
+
+#### 🅰️ 方案一:Docker 部署(推荐 🔥)
+
+* **特点**:比 GitHub Actions 更稳定
+* **适用**:有自己的服务器、NAS 或长期运行的电脑
+
+👉 **[跳转到 Docker 部署教程](#6-docker-部署)**
+
+#### 🅱️ 方案二:GitHub Actions 部署(本章节内容 ⬇️)
+
+* **特点**:数据存储在 **远程云存储**(不再写入 Git 仓库)
+* **推荐**:配置云存储服务(Cloudflare R2 免费额度足够、阿里云 OSS、腾讯云 COS 等)
+* **注意**:需定期签到续期(7天一次)
 
 1️⃣ **获取项目代码**
 
@@ -883,15 +877,18 @@ frequency_words.txt 文件增加了一个【必须词】功能,使用 + 号
    > - 后续文档中提到的 "Fork" 均可理解为 "Use this template"
    > - 使用 Fork 可能导致运行异常,详见 [Issue #606](https://github.com/sansan0/TrendRadar/issues/606)
 
-2️⃣ **设置 GitHub Secrets(必需 + 可选平台)**:
+2️⃣ **设置 GitHub Secrets**:
 
    在你 Fork 后的仓库中,进入 `Settings` > `Secrets and variables` > `Actions` > `New repository secret`
 
-   **⚠️ GitHub Actions 使用说明**
+   **📌 重要说明(请务必仔细阅读):**
 
-   **v4.0.0 重要变更**:引入「活跃度检测」机制,GitHub Actions 需定期签到以维持运行。
+   - **一个 Name 对应一个 Secret**:每添加一个配置项,点击一次"New repository secret"按钮,填写一对"Name"和"Secret"
+   - **保存后看不到值是正常的**:出于安全考虑,保存后重新编辑时,只能看到 Name(名称),看不到 Secret(值)的内容
+   - **严禁自创名称**:Secret 的 Name(名称)必须**严格使用**下方列出的名称(如 `WEWORK_WEBHOOK_URL`、`FEISHU_WEBHOOK_URL` 等),不能自己随意修改或创造新名称,否则系统无法识别
+   - **可以同时配置多个平台**:系统会向所有配置的平台发送通知
 
-   **🔄 签到续期机制**:
+   **GitHub Actions 签到续期机制**:
    - **运行周期**:有效期为 **7 天**,倒计时结束后服务将自动挂起。
    - **续期方式**:在 Actions 页面手动触发 "Check In" workflow,即可重置 7 天有效期。
    - **操作路径**:`Actions` → `Check In` → `Run workflow`
@@ -899,13 +896,6 @@ frequency_words.txt 文件增加了一个【必须词】功能,使用 + 号
      - 如果 7 天都忘了签到,或许这些资讯对你来说并非刚需。适时的暂停,能帮你从信息流中抽离,给大脑留出喘息的空间。
      - GitHub Actions 是宝贵的公共计算资源。引入签到机制旨在避免算力的无效空转,确保资源能分配给真正活跃且需要的用户。感谢你的理解与支持。
 
-   **📌 重要说明(请务必仔细阅读):**
-
-   - **一个 Name 对应一个 Secret**:每添加一个配置项,点击一次"New repository secret"按钮,填写一对"Name"和"Secret"
-   - **保存后看不到值是正常的**:出于安全考虑,保存后重新编辑时,只能看到 Name(名称),看不到 Secret(值)的内容
-   - **严禁自创名称**:Secret 的 Name(名称)必须**严格使用**下方列出的名称(如 `WEWORK_WEBHOOK_URL`、`FEISHU_WEBHOOK_URL` 等),不能自己随意修改或创造新名称,否则系统无法识别
-   - **可以同时配置多个平台**:系统会向所有配置的平台发送通知
-
    <details>
    <summary>👉 点击展开:<strong>轻量模式 vs 完整模式 + AI分析</strong></summary>
    <br>
@@ -924,10 +914,6 @@ frequency_words.txt 文件增加了一个【必须词】功能,使用 + 号
 **完整模式说明**:
 配置远程云存储后解锁全部功能(见下方 **推荐配置:远程云存储**)
 
-**🚀 推荐:Docker 部署**
-
-如需长期稳定运行,建议使用 [Docker 部署](#6-docker-部署),数据存储在本地,无需签到,不过需要额外付费购买云服务器。
-
    </details>
 
    <details>
@@ -3523,10 +3509,12 @@ MCP Inspector 是官方调试工具,用于测试 MCP 连接:
 > 如果你想支持本项目,可通过微信搜索**腾讯公益**,对里面的**助学**相关的项目随心捐助
 >
 > 感谢参与过**一元点赞**的朋友,已收录至顶部**致谢名单**!你们的支持让开源维护更有动力,个人打赏码现已移除。
+>
+> 🎯 如果你有兴趣赞助本项目,你的 Banner 将展示在顶部赞助商位置
 
 - **GitHub Issues**:适合针对性强的解答。提问时请提供完整信息(截图、错误日志、系统环境等)。
 - **公众号交流**:适合快速咨询。建议优先在相关文章下的公共留言区交流,如私信,请文明礼貌用语😉
-- 💡 部署成功了?来公众号说说感受吧,你的点赞和留言都是我继续更新的动力~
+- **联系方式**:path@linux.do
 
 
 <div align="center">
@@ -3537,25 +3525,6 @@ MCP Inspector 是官方调试工具,用于测试 MCP 连接:
 
 </div>
 
-<br>
-
-
-## 🪄 赞助商
-
-> 每天追踪这么多热点,写报告、回复消息是否让手腕疲惫?        
-> 试试「闪电说」AI 语音输入法 —— 用说的,比打字快 4 倍 ⚡ 。从看热点到输出内容,让效率翻倍 👇
-
-<div align="center">
-
-[![Mac下载](https://img.shields.io/badge/Mac-免费下载-FF6B6B?style=for-the-badge&logo=apple&logoColor=white)](https://shandianshuo.cn) [![Windows下载](https://img.shields.io/badge/Windows-免费下载-FF6B6B?style=for-the-badge&logo=lightning&logoColor=white)](https://shandianshuo.cn)
-<a href="https://shandianshuo.cn" target="_blank">
-  <img src="_image/banner-shandianshuo.png" alt="闪电说" width="700"/>
-</a>
-</div>
-
-
-
----
 
 <br>
 

+ 1 - 1
mcp_server/__init__.py

@@ -4,4 +4,4 @@ TrendRadar MCP Server
 提供基于MCP协议的新闻聚合数据查询和系统管理接口。
 """
 
-__version__ = "1.1.1"
+__version__ = "1.2.0"

+ 184 - 87
mcp_server/server.py

@@ -153,31 +153,36 @@ async def get_latest_news(
 @mcp.tool
 async def get_trending_topics(
     top_n: int = 10,
-    mode: str = 'current'
+    mode: str = 'current',
+    extract_mode: str = 'keywords'
 ) -> str:
     """
-    获取个人关注词的新闻出现频率统计(基于 config/frequency_words.txt)
-
-    注意:本工具不是自动提取新闻热点,而是统计你在 config/frequency_words.txt 中
-    设置的个人关注词在新闻中出现的频率。你可以自定义这个关注词列表。
+    获取热点话题统计
 
     Args:
-        top_n: 返回TOP N关注词,默认10
-        mode: 模式选择
-            - daily: 当日累计数据统计
-            - current: 最新一批数据统计(默认)
+        top_n: 返回TOP N话题,默认10
+        mode: 时间模式
+            - "daily": 当日累计数据统计
+            - "current": 最新一批数据统计(默认)
+        extract_mode: 提取模式
+            - "keywords": 统计预设关注词(基于 config/frequency_words.txt,默认)
+            - "auto_extract": 自动从新闻标题提取高频词(无需预设,自动发现热点)
 
     Returns:
-        JSON格式的关注词频率统计列表
+        JSON格式的话题频率统计列表
+
+    Examples:
+        - 使用预设关注词: get_trending_topics(mode="current")
+        - 自动提取热点: get_trending_topics(extract_mode="auto_extract", top_n=20)
     """
     tools = _get_tools()
-    result = tools['data'].get_trending_topics(top_n=top_n, mode=mode)
+    result = tools['data'].get_trending_topics(top_n=top_n, mode=mode, extract_mode=extract_mode)
     return json.dumps(result, ensure_ascii=False, indent=2)
 
 
 @mcp.tool
 async def get_news_by_date(
-    date_query: Optional[str] = None,
+    date_range: Optional[Union[Dict[str, str], str]] = None,
     platforms: Optional[List[str]] = None,
     limit: int = 50,
     include_url: bool = False
@@ -186,10 +191,11 @@ async def get_news_by_date(
     Get news data for a given date, for historical analysis and comparison
 
     Args:
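-        date_query: Date query, accepted formats:
-            - Natural language: "今天" (today), "昨天" (yesterday), "前天" (the day before yesterday), "3天前" (3 days ago)
-            - Standard date: "2024-01-15", "2024/01/15"
-            - Default: "今天" (saves tokens)
+        date_range: Date range, several formats are supported:
+            - Range object: {"start": "2025-01-01", "end": "2025-01-07"}
+            - Natural language: "今天" (today), "昨天" (yesterday), "本周" (this week), "最近7天" (last 7 days)
+            - Single-day string: "2025-01-15"
+            - Default: "今天"
         platforms: List of platform IDs, e.g. ['zhihu', 'weibo', 'douyin']
                    - If omitted: all platforms configured in config.yaml are used
                    - Supported platforms come from the platforms section of config/config.yaml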
@@ -215,7 +221,7 @@ async def get_news_by_date(
     """
     tools = _get_tools()
     result = tools['data'].get_news_by_date(
-        date_query=date_query,
+        date_range=date_range,
         platforms=platforms,
         limit=limit,
         include_url=include_url
@@ -232,7 +238,7 @@ async def analyze_topic_trend(
     analysis_type: str = "trend",
     date_range: Optional[Union[Dict[str, str], str]] = None,
     granularity: str = "day",
-    threshold: float = 3.0,
+    spike_threshold: float = 3.0,
     time_window: int = 24,
     lookahead_hours: int = 6,
     confidence_threshold: float = 0.7
@@ -257,7 +263,7 @@ async def analyze_topic_trend(
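                     - **How to obtain**: call the resolve_date_range tool to parse natural-language dates
                     - **Default**: the last 7 days are analyzed when omitted
         granularity: Time granularity (trend mode), default "day" (only day is supported, because the underlying data is aggregated by day)
-        threshold: Heat spike multiplier threshold (viral mode), default 3.0
+        spike_threshold: Heat spike multiplier threshold (viral mode), default 3.0
         time_window: Detection window in hours (viral mode), default 24
         lookahead_hours: Number of hours to predict ahead (predict mode), default 6
         confidence_threshold: Confidence threshold (predict mode), default 0.7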
@@ -282,7 +288,7 @@ async def analyze_topic_trend(
         analysis_type=analysis_type,
         date_range=date_range,
         granularity=granularity,
-        threshold=threshold,
+        threshold=spike_threshold,
         time_window=time_window,
         lookahead_hours=lookahead_hours,
         confidence_threshold=confidence_threshold
@@ -398,34 +404,46 @@ async def analyze_sentiment(
 
 
 @mcp.tool
-async def find_similar_news(
+async def find_related_news(
     reference_title: str,
-    threshold: float = 0.6,
+    date_range: Optional[Union[Dict[str, str], str]] = None,
+    threshold: float = 0.5,
     limit: int = 50,
     include_url: bool = False
 ) -> str:
     """
-    Find other news similar to the given news title
+    Find other news related to the given news title (supports today's and historical data)
 
     Args:
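-        reference_title: News title (full or partial)
-        threshold: Similarity threshold between 0 and 1, default 0.6
+        reference_title: Reference news title (full or partial)
+        date_range: Date range (optional)
+            - Omitted: only today's data is queried
+            - "today": today
+            - "yesterday": yesterday
+            - "last_week": the last 7 days
+            - "last_month": the last 30 days
+            - {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}: custom range
+        threshold: Similarity threshold between 0 and 1, default 0.5
                    Note: a higher threshold matches more strictly and returns fewer results
-        limit: Maximum number of items to return, default 50, maximum 100
-               Note: the actual number returned depends on the similarity matches and may be lower than requested
+        limit: Maximum number of items to return, default 50
         include_url: Whether to include URL links, default False (saves tokens)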
 
     Returns:
-        JSON-formatted list of similar news, including similarity scores
+        JSON-formatted list of related news, sorted by similarity
+
+    Examples:
+        - Find similar news from today: find_related_news(reference_title="特斯拉降价")
+        - Find related historical news: find_related_news(reference_title="特斯拉降价", date_range="last_week")
+        - Custom date range: find_related_news(reference_title="AI突破", date_range={"start": "2025-01-01", "end": "2025-01-15"})
 
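     **Important: data presentation strategy**
-    - This tool returns the complete list of similar news
-    - **Default presentation**: show all returned news (including similarity scores)
-    - Filter only when the user explicitly asks to "summarize" or "pick the highlights"
+    - This tool returns the complete list of related news (including similarity scores)
+    - Filter only when the user explicitly asks for a "summary"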
     """
     tools = _get_tools()
-    result = tools['analytics'].find_similar_news(
+    result = tools['search'].find_related_news_unified(
         reference_title=reference_title,
+        date_range=date_range,
         threshold=threshold,
         limit=limit,
         include_url=include_url
@@ -459,6 +477,128 @@ async def generate_summary_report(
     return json.dumps(result, ensure_ascii=False, indent=2)
 
 
+@mcp.tool
+async def aggregate_news(
+    date_range: Optional[Union[Dict[str, str], str]] = None,
+    platforms: Optional[List[str]] = None,
+    similarity_threshold: float = 0.7,
+    limit: int = 50,
+    include_url: bool = False
+) -> str:
+    """
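+    Cross-platform news aggregation - deduplicate and merge similar news
+
+    Merges reports of the same event from different platforms into a single aggregated item,
+    showing its coverage on each platform and its combined heat.
+
+    **Use cases:**
+    - Viewing deduplicated trending news (so the same event is not shown once per platform)
+    - Analyzing how a topic is covered across multiple platforms
+    - Getting a cross-platform combined heat ranking
+
+    Args:
+        date_range: Date range (optional)
+            - Omitted: query today
+            - {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}: date range
+        platforms: Platform filter list, e.g. ['zhihu', 'weibo']
+        similarity_threshold: Similarity threshold between 0.3 and 1.0, default 0.7
+                              Higher is stricter (only very similar titles are merged)
+        limit: Number of aggregated news items to return, default 50
+        include_url: Whether to include URL links, default False
+
+    Returns:
+        JSON-formatted aggregation result containing:
+        - summary: aggregation statistics (original count, deduplicated count, deduplication rate)
+        - aggregated_news: list of aggregated news
+            - representative_title: representative title
+            - platforms: list of platforms covered
+            - platform_count: number of platforms covered
+            - is_cross_platform: whether the item appears on more than one platform
+            - best_rank: best rank
+            - aggregate_weight: combined weight
+            - sources: per-platform source details
+        - statistics: platform coverage statistics
+
+    Examples:
+        - aggregate_news()  # aggregate today's news from all platforms
+        - aggregate_news(similarity_threshold=0.8)  # stricter similarity matching
+        - aggregate_news(date_range={"start": "2025-01-01", "end": "2025-01-07"})
+
+    **Important: data presentation strategy**
+    - This tool returns the deduplicated, aggregated news list
+    - Cross-platform news (is_cross_platform=true) is usually more newsworthy
+    - Items with platform_count > 1 can be shown first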
+    """
+    tools = _get_tools()
+    result = tools['analytics'].aggregate_news(
+        date_range=date_range,
+        platforms=platforms,
+        similarity_threshold=similarity_threshold,
+        limit=limit,
+        include_url=include_url
+    )
+    return json.dumps(result, ensure_ascii=False, indent=2)
+
+
+@mcp.tool
+async def compare_periods(
+    period1: Union[Dict[str, str], str],
+    period2: Union[Dict[str, str], str],
+    topic: Optional[str] = None,
+    compare_type: str = "overview",
+    platforms: Optional[List[str]] = None,
+    top_n: int = 10
+) -> str:
+    """
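+    Period comparison - compare news data across two time periods
+
+    Compares trending topics, platform activity, news volume and other dimensions between periods.
+
+    **Use cases:**
+    - Comparing trending-topic changes between this week and last week
+    - Analyzing the difference in a topic's heat between two periods
+    - Viewing periodic changes in each platform's activity
+
+    Args:
+        period1: First period (baseline)
+            - {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}: date range
+            - "today", "yesterday", "this_week", "last_week", "this_month", "last_month": presets
+        period2: Second period (comparison period, same format as period1)
+        topic: Optional topic keyword (focuses the comparison on a specific topic)
+        compare_type: Comparison type
+            - "overview": overall overview (default) - news count, keyword changes, top news
+            - "topic_shift": topic shift analysis - rising, falling and newly appearing topics
+            - "platform_activity": platform activity comparison - change in news count per platform
+        platforms: Platform filter list, e.g. ['zhihu', 'weibo']
+        top_n: Number of top results to return, default 10
+
+    Returns:
+        JSON-formatted comparison result containing:
+        - periods: date ranges of the two periods
+        - compare_type: comparison type
+        - overview/topic_shift/platform_comparison: the comparison result itself (depends on the type)
+
+    Examples:
+        - compare_periods(period1="last_week", period2="this_week")  # week over week
+        - compare_periods(period1="last_month", period2="this_month", compare_type="topic_shift")
+        - compare_periods(
+            period1={"start": "2025-01-01", "end": "2025-01-07"},
+            period2={"start": "2025-01-08", "end": "2025-01-14"},
+            topic="人工智能"
+          )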
+    """
+    tools = _get_tools()
+    result = tools['analytics'].compare_periods(
+        period1=period1,
+        period2=period2,
+        topic=topic,
+        compare_type=compare_type,
+        platforms=platforms,
+        top_n=top_n
+    )
+    return json.dumps(result, ensure_ascii=False, indent=2)
+
+
 # ==================== Smart search tools ====================
 
 @mcp.tool
@@ -540,50 +680,6 @@ async def search_news(
     return json.dumps(result, ensure_ascii=False, indent=2)
 
 
-@mcp.tool
-async def search_related_news_history(
-    reference_text: str,
-    time_preset: str = "yesterday",
-    threshold: float = 0.4,
-    limit: int = 50,
-    include_url: bool = False
-) -> str:
-    """
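-    Search historical data for news related to a seed news item
-
-    Args:
-        reference_text: Reference news title (full or partial)
-        time_preset: Time range preset, one of:
-            - "yesterday": yesterday
-            - "last_week": last week (7 days)
-            - "last_month": last month (30 days)
-            - "custom": custom date range (requires start_date and end_date)
-        threshold: Relevance threshold between 0 and 1, default 0.4
-                   Note: combined similarity score (70% keyword overlap + 30% text similarity)
-                   A higher threshold matches more strictly and returns fewer results
-        limit: Maximum number of items to return, default 50, maximum 100
-               Note: the actual number returned depends on the relevance matches and may be lower than requested
-        include_url: Whether to include URL links, default False (saves tokens)
-
-    Returns:
-        JSON-formatted list of related news, including relevance scores and time distribution
-
-    **Important: data presentation strategy**
-    - This tool returns the complete list of related news
-    - **Default presentation**: show all returned news (including relevance scores)
-    - Filter only when the user explicitly asks to "summarize" or "pick the highlights"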
-    """
-    tools = _get_tools()
-    result = tools['search'].search_related_news_history(
-        reference_text=reference_text,
-        time_preset=time_preset,
-        threshold=threshold,
-        limit=limit,
-        include_url=include_url
-    )
-    return json.dumps(result, ensure_ascii=False, indent=2)
-
-
 # ==================== Configuration and system management tools ====================
 
 @mcp.tool
@@ -827,28 +923,29 @@ def run_server(
     print("    === 基础数据查询(P0核心)===")
     print("    1. get_latest_news        - 获取最新新闻")
     print("    2. get_news_by_date       - 按日期查询新闻(支持自然语言)")
-    print("    3. get_trending_topics    - 获取趋势话题")
+    print("    3. get_trending_topics    - 获取趋势话题(支持自动提取)")
     print()
     print("    === 智能检索工具 ===")
-    print("    4. search_news                  - 统一新闻搜索(关键词/模糊/实体)")
-    print("    5. search_related_news_history  - 历史相关新闻检索")
+    print("    4. search_news            - 统一新闻搜索(关键词/模糊/实体)")
+    print("    5. find_related_news      - 相关新闻查找(支持历史数据)")
     print()
     print("    === 高级数据分析 ===")
     print("    6. analyze_topic_trend      - 统一话题趋势分析(热度/生命周期/爆火/预测)")
     print("    7. analyze_data_insights    - 统一数据洞察分析(平台对比/活跃度/关键词共现)")
     print("    8. analyze_sentiment        - 情感倾向分析")
-    print("    9. find_similar_news        - 相似新闻查找")
-    print("    10. generate_summary_report - 每日/每周摘要生成")
+    print("    9. aggregate_news           - 跨平台新闻聚合去重")
+    print("    10. compare_periods         - 时期对比分析(周环比/月环比)")
+    print("    11. generate_summary_report - 每日/每周摘要生成")
     print()
     print("    === 配置与系统管理 ===")
-    print("    11. get_current_config      - 获取当前系统配置")
-    print("    12. get_system_status       - 获取系统运行状态")
-    print("    13. trigger_crawl           - 手动触发爬取任务")
+    print("    12. get_current_config      - 获取当前系统配置")
+    print("    13. get_system_status       - 获取系统运行状态")
+    print("    14. trigger_crawl           - 手动触发爬取任务")
     print()
     print("    === 存储同步工具 ===")
-    print("    14. sync_from_remote        - 从远程存储拉取数据到本地")
-    print("    15. get_storage_status      - 获取存储配置和状态")
-    print("    16. list_available_dates    - 列出本地/远程可用日期")
+    print("    15. sync_from_remote        - 从远程存储拉取数据到本地")
+    print("    16. get_storage_status      - 获取存储配置和状态")
+    print("    17. list_available_dates    - 列出本地/远程可用日期")
     print("=" * 60)
     print()
 

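The `aggregate_news` docstring in `mcp_server/server.py` above describes merging near-duplicate titles across platforms and flagging cross-platform items. A minimal sketch of that greedy grouping idea follows. It is not the project's implementation: `title_similarity` is a stand-in built on the standard library's `difflib.SequenceMatcher` (the project's own similarity metric is not shown in this part of the diff), and the `aggregate` helper and the sample items are hypothetical.

```python
from difflib import SequenceMatcher


def title_similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1]; an assumption, not the project's metric."""
    return SequenceMatcher(None, a, b).ratio()


def aggregate(news, threshold=0.7):
    """Greedily merge items whose title is at least `threshold` similar to a group's representative."""
    # Highest-weight items come first so that they become the group representatives.
    ordered = sorted(news, key=lambda n: n["weight"], reverse=True)
    used = set()
    groups = []
    for i, item in enumerate(ordered):
        if i in used:
            continue
        used.add(i)
        group = {
            "representative_title": item["title"],
            "platforms": [item["platform"]],
            "best_rank": item["rank"],
        }
        # Compare only against items that have not been assigned to a group yet.
        for j in range(i + 1, len(ordered)):
            if j in used:
                continue
            other = ordered[j]
            if title_similarity(item["title"], other["title"]) >= threshold:
                if other["platform"] not in group["platforms"]:
                    group["platforms"].append(other["platform"])
                group["best_rank"] = min(group["best_rank"], other["rank"])
                used.add(j)
        group["platform_count"] = len(group["platforms"])
        group["is_cross_platform"] = group["platform_count"] > 1
        groups.append(group)
    return groups


# Hypothetical sample data, for illustration only.
sample = [
    {"title": "Tesla cuts Model 3 price", "platform": "weibo", "rank": 2, "weight": 9.0},
    {"title": "Tesla cuts Model 3 prices", "platform": "zhihu", "rank": 5, "weight": 7.0},
    {"title": "New AI chip announced", "platform": "weibo", "rank": 1, "weight": 8.0},
]
groups = aggregate(sample)
```

Because each item is compared only with the group representative, the result depends on the sort order; a stricter `threshold` produces more, smaller groups.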
+ 94 - 55
mcp_server/services/data_service.py

@@ -17,6 +17,22 @@ from ..utils.errors import DataNotFoundError
 class DataService:
     """数据访问服务类"""
 
+    # Chinese stopword list (used in auto_extract mode)
+    STOPWORDS = {
+        '的', '了', '在', '是', '我', '有', '和', '就', '不', '人', '都', '一',
+        '一个', '上', '也', '很', '到', '说', '要', '去', '你', '会', '着', '没有',
+        '看', '好', '自己', '这', '那', '来', '被', '与', '为', '对', '将', '从',
+        '以', '及', '等', '但', '或', '而', '于', '中', '由', '可', '可以', '已',
+        '已经', '还', '更', '最', '再', '因为', '所以', '如果', '虽然', '然而',
+        '什么', '怎么', '如何', '哪', '哪些', '多少', '几', '这个', '那个',
+        '他', '她', '它', '他们', '她们', '我们', '你们', '大家', '自己',
+        '这样', '那样', '怎样', '这么', '那么', '多么', '非常', '特别',
+        '应该', '可能', '能够', '需要', '必须', '一定', '肯定', '确实',
+        '正在', '已经', '曾经', '将要', '即将', '刚刚', '马上', '立刻',
+        '回应', '发布', '表示', '称', '曝', '官方', '最新', '重磅', '突发',
+        '热搜', '刷屏', '引发', '关注', '网友', '评论', '转发', '点赞'
+    }
+
     def __init__(self, project_root: str = None):
         """
         初始化数据服务
@@ -282,29 +298,61 @@ class DataService:
             }
         }
 
+    def _extract_words_from_title(self, title: str, min_length: int = 2) -> List[str]:
+        """
+        Extract meaningful words from a title (used in auto_extract mode)
+
+        Args:
+            title: News title
+            min_length: Minimum word length
+
+        Returns:
+            List of keywords
+        """
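+        # Remove URLs and special characters
+        title = re.sub(r'http[s]?://\S+', '', title)
+        title = re.sub(r'\[.*?\]', '', title)  # remove bracketed content
+        title = re.sub(r'[【】《》「」『』""''・·•]', '', title)  # remove Chinese punctuation
+
+        # Tokenize with a regular expression (Chinese and English)
+        # Matches runs of consecutive Chinese characters, or English words
+        words = re.findall(r'[\u4e00-\u9fff]{2,}|[a-zA-Z]{2,}[a-zA-Z0-9]*', title)
+
+        # Filter out stopwords and short words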
+        keywords = [
+            word for word in words
+            if word and len(word) >= min_length and word.lower() not in self.STOPWORDS
+            and word not in self.STOPWORDS
+        ]
+
+        return keywords
+
     def get_trending_topics(
         self,
         top_n: int = 10,
-        mode: str = "current"
+        mode: str = "current",
+        extract_mode: str = "keywords"
     ) -> Dict:
         """
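-        Get frequency statistics for personal watch words in the news
-
-        Note: this tool counts words from the personal watch-word list in config/frequency_words.txt,
-        it does not extract trending topics from the news automatically. Users can customize this list.
+        Get trending-topic statistics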
 
         Args:
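-            top_n: Number of top watch words to return
-            mode: Mode - daily (accumulated over the day), current (latest batch)
+            top_n: Number of top topics to return
+            mode: Time mode
+                - "daily": statistics over the data accumulated during the current day
+                - "current": statistics over the latest batch of data (default)
+            extract_mode: Extraction mode
+                - "keywords": count preset watch words (based on config/frequency_words.txt)
+                - "auto_extract": extract high-frequency words from news titles automatically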
 
         Returns:
-            Dict of watch-word frequency statistics
+            Dict of topic frequency statistics
 
         Raises:
             DataNotFoundError: the data does not exist
         """
         # Try the cache first
-        cache_key = f"trending_topics:{top_n}:{mode}"
+        cache_key = f"trending_topics:{top_n}:{mode}:{extract_mode}"
         cached = self.cache.get(cache_key, ttl=1800)  # 30-minute cache
         if cached:
             return cached
@@ -318,38 +366,13 @@ class DataService:
                 suggestion="请确保爬虫已经运行并生成了数据"
             )
 
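-        # Load the keyword configuration
-        word_groups = self.parser.parse_frequency_words()
-
-        # Choose which title data to process based on mode
-        titles_to_process = {}
-
+        # Choose which title data to process based on mode
         if mode == "daily":
-            # daily mode: process all data accumulated during the day
             titles_to_process = all_titles
-
         elif mode == "current":
-            # current mode: process only the latest batch (the file with the newest timestamp)
-            if timestamps:
-                # Find the newest timestamp
-                latest_timestamp = max(timestamps.values())
-
-                # Re-read to get only the data from the latest time
-                # Here the timestamps dict is used to look up the platforms of the latest file
-                latest_titles, _, _ = self.parser.read_all_titles_for_date()
-
-                # read_all_titles_for_date returns the merged data of all files,
-                # so the latest batch would have to be filtered out via timestamps
-                # Simplified implementation: use all current data as the latest batch
-                # (a more precise implementation needs the parser service to support filtering by time)
-                titles_to_process = latest_titles
-            else:
-                titles_to_process = all_titles
-
+            titles_to_process = all_titles  # simplified: currently the same data as daily mode
         else:
-            raise ValueError(
-                f"不支持的模式: {mode}。支持的模式: daily, current"
-            )
+            raise ValueError(f"不支持的模式: {mode}。支持的模式: daily, current")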
 
         # 统计词频
         word_frequency = Counter()
@@ -358,17 +381,26 @@ class DataService:
         # 遍历要处理的标题
         for platform_id, titles in titles_to_process.items():
             for title in titles.keys():
-                # Match against every keyword group
-                for group in word_groups:
-                    all_words = group.get("required", []) + group.get("normal", [])
-
-                    for word in all_words:
-                        if word and word in title:
-                            word_frequency[word] += 1
-
-                            if word not in keyword_to_news:
-                                keyword_to_news[word] = []
-                            keyword_to_news[word].append(title)
+                if extract_mode == "keywords":
+                    # Count based on the preset keywords
+                    word_groups = self.parser.parse_frequency_words()
+                    for group in word_groups:
+                        all_words = group.get("required", []) + group.get("normal", [])
+                        for word in all_words:
+                            if word and word in title:
+                                word_frequency[word] += 1
+                                if word not in keyword_to_news:
+                                    keyword_to_news[word] = []
+                                keyword_to_news[word].append(title)
+
+                elif extract_mode == "auto_extract":
+                    # Extract keywords automatically
+                    extracted_words = self._extract_words_from_title(title)
+                    for word in extracted_words:
+                        word_frequency[word] += 1
+                        if word not in keyword_to_news:
+                            keyword_to_news[word] = []
+                        keyword_to_news[word].append(title)
 
         # 获取TOP N关键词
         top_keywords = word_frequency.most_common(top_n)
@@ -382,8 +414,8 @@ class DataService:
                 "keyword": keyword,
                 "frequency": frequency,
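                 "matched_news": len(set(matched_news)),  # number of news items after deduplication
-                "trend": "stable",  # TODO: historical data is needed to compute the trend
-                "weight_score": 0.0  # TODO: weight calculation still needs to be implemented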
+                "trend": "stable",
+                "weight_score": 0.0
             })
 
         # 构建结果
@@ -391,8 +423,9 @@ class DataService:
             "topics": topics,
             "generated_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
             "mode": mode,
+            "extract_mode": extract_mode,
             "total_keywords": len(word_frequency),
-            "description": self._get_mode_description(mode)
+            "description": self._get_mode_description(mode, extract_mode)
         }
 
         # 缓存结果
@@ -400,13 +433,19 @@ class DataService:
 
         return result
 
-    def _get_mode_description(self, mode: str) -> str:
+    def _get_mode_description(self, mode: str, extract_mode: str = "keywords") -> str:
         """Get the mode description"""
-        descriptions = {
+        mode_desc = {
             "daily": "当日累计统计",
             "current": "最新一批统计"
-        }
-        return descriptions.get(mode, "未知模式")
+        }.get(mode, "未知时间模式")
+
+        extract_desc = {
+            "keywords": "基于预设关注词",
+            "auto_extract": "自动提取高频词"
+        }.get(extract_mode, "未知提取模式")
+
+        return f"{mode_desc} - {extract_desc}"
 
     def get_current_config(self, section: str = "all") -> Dict:
         """

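The `auto_extract` mode added in `mcp_server/services/data_service.py` above counts words pulled out of titles with a regular expression instead of matching a preset list. The sketch below illustrates that idea on its own, reusing the same word pattern from the diff. The tiny `STOPWORDS` set, the helper names `extract_words` and `top_topics`, and the sample titles are assumptions for illustration. Note that the pattern treats an unbroken run of Chinese characters as one token, so this is pattern matching rather than true word segmentation.

```python
import re
from collections import Counter

# Tiny illustrative stopword set; the real DataService.STOPWORDS is much longer.
STOPWORDS = {"the", "of", "and", "官方", "最新"}


def extract_words(title: str, min_length: int = 2) -> list:
    """Pull CJK runs and ASCII words out of a title, dropping URLs and stopwords."""
    title = re.sub(r"https?://\S+", "", title)
    # Same word pattern as in the diff: runs of 2+ Chinese characters, or ASCII words.
    words = re.findall(r"[\u4e00-\u9fff]{2,}|[a-zA-Z]{2,}[a-zA-Z0-9]*", title)
    return [w for w in words if len(w) >= min_length and w.lower() not in STOPWORDS]


def top_topics(titles, top_n=10):
    """Count extracted words across all titles and return the most common ones."""
    counts = Counter()
    for t in titles:
        counts.update(extract_words(t))
    return counts.most_common(top_n)


# Hypothetical sample titles, for illustration only.
titles = [
    "Tesla Model3 price cut https://example.com/a",
    "Tesla 降价 the latest",
    "官方 回应 Tesla",
]
result = top_topics(titles, top_n=1)
```

For titles without spaces between Chinese words, a dedicated segmenter would be needed to split the run into individual words; that is outside what this diff does.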
+ 572 - 20
mcp_server/tools/analytics.py

@@ -16,7 +16,8 @@ from ..utils.validators import (
     validate_limit,
     validate_keyword,
     validate_top_n,
-    validate_date_range
+    validate_date_range,
+    validate_threshold
 )
 from ..utils.errors import MCPError, InvalidParameterError, DataNotFoundError
 
@@ -943,13 +944,7 @@ class AnalyticsTools:
         try:
             # 参数验证
             reference_title = validate_keyword(reference_title)
-
-            if not 0 <= threshold <= 1:
-                raise InvalidParameterError(
-                    "threshold 必须在 0 到 1 之间",
-                    suggestion="推荐值:0.5-0.8"
-                )
-
+            threshold = validate_threshold(threshold, default=0.6, min_value=0.0, max_value=1.0)
             limit = validate_limit(limit, default=50)
 
             # 读取数据
@@ -1650,12 +1645,7 @@ class AnalyticsTools:
         """
         try:
             # 参数验证
-            if threshold < 1.0:
-                raise InvalidParameterError(
-                    "threshold 必须大于等于 1.0",
-                    suggestion="推荐值:2.0-5.0"
-                )
-
+            threshold = validate_threshold(threshold, default=3.0, min_value=1.0, max_value=100.0)
             time_window = validate_limit(time_window, default=24, max_limit=72)
 
             # 读取当前和之前的数据
@@ -1787,12 +1777,13 @@ class AnalyticsTools:
         try:
             # 参数验证
             lookahead_hours = validate_limit(lookahead_hours, default=6, max_limit=48)
-
-            if not 0 <= confidence_threshold <= 1:
-                raise InvalidParameterError(
-                    "confidence_threshold 必须在 0 到 1 之间",
-                    suggestion="推荐值:0.6-0.8"
-                )
+            confidence_threshold = validate_threshold(
+                confidence_threshold,
+                default=0.7,
+                min_value=0.0,
+                max_value=1.0,
+                param_name="confidence_threshold"
+            )
 
             # 收集最近3天的数据用于预测
             keyword_trends = defaultdict(list)
@@ -1993,3 +1984,564 @@ class AnalyticsTools:
                 unique_topics[platform] = list(unique)[:5]  # 最多5个
 
         return unique_topics
+
+    # ==================== Cross-platform aggregation tools ====================
+
+    def aggregate_news(
+        self,
+        date_range: Optional[Union[Dict[str, str], str]] = None,
+        platforms: Optional[List[str]] = None,
+        similarity_threshold: float = 0.7,
+        limit: int = 50,
+        include_url: bool = False
+    ) -> Dict:
+        """
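+        Cross-platform news aggregation - deduplicate and merge similar news
+
+        Merges reports of the same event from different platforms into a single aggregated item,
+        showing its coverage on each platform and its combined heat.
+
+        Args:
+            date_range: Date range (optional)
+                - Omitted: query today
+                - {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}: date range
+            platforms: Platform filter list, e.g. ['zhihu', 'weibo']
+            similarity_threshold: Similarity threshold between 0.3 and 1.0, default 0.7
+            limit: Number of aggregated news items to return, default 50
+            include_url: Whether to include URL links, default False
+
+        Returns:
+            Aggregation result dict containing:
+            - aggregated_news: list of aggregated news
+            - statistics: aggregation statistics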
+        """
+        try:
+            # Validate parameters
+            platforms = validate_platforms(platforms)
+            similarity_threshold = validate_threshold(
+                similarity_threshold, default=0.7, min_value=0.3, max_value=1.0
+            )
+            limit = validate_limit(limit, default=50)
+
+            # Resolve the date range
+            if date_range:
+                date_range_tuple = validate_date_range(date_range)
+                start_date, end_date = date_range_tuple
+            else:
+                start_date = end_date = datetime.now()
+
+            # Collect all news
+            all_news = []
+            current_date = start_date
+
+            while current_date <= end_date:
+                try:
+                    all_titles, id_to_name, _ = self.data_service.parser.read_all_titles_for_date(
+                        date=current_date,
+                        platform_ids=platforms
+                    )
+
+                    for platform_id, titles in all_titles.items():
+                        platform_name = id_to_name.get(platform_id, platform_id)
+
+                        for title, info in titles.items():
+                            news_item = {
+                                "title": title,
+                                "platform": platform_id,
+                                "platform_name": platform_name,
+                                "date": current_date.strftime("%Y-%m-%d"),
+                                "ranks": info.get("ranks", []),
+                                "count": len(info.get("ranks", [])),
+                                "rank": info["ranks"][0] if info["ranks"] else 999
+                            }
+
+                            if include_url:
+                                news_item["url"] = info.get("url", "")
+                                news_item["mobileUrl"] = info.get("mobileUrl", "")
+
+                            # Compute the weight
+                            news_item["weight"] = calculate_news_weight(news_item)
+                            all_news.append(news_item)
+
+                except DataNotFoundError:
+                    pass
+
+                current_date += timedelta(days=1)
+
+            if not all_news:
+                return {
+                    "success": True,
+                    "aggregated_news": [],
+                    "total": 0,
+                    "message": "未找到新闻数据"
+                }
+
+            # Run the aggregation
+            aggregated = self._aggregate_similar_news(
+                all_news, similarity_threshold, include_url
+            )
+
+            # Sort by combined weight
+            aggregated.sort(key=lambda x: x["aggregate_weight"], reverse=True)
+
+            # Limit the number of returned items
+            results = aggregated[:limit]
+
+            # Statistics
+            total_original = len(all_news)
+            total_aggregated = len(aggregated)
+            dedup_rate = 1 - (total_aggregated / total_original) if total_original > 0 else 0
+
+            platform_coverage = Counter()
+            for item in aggregated:
+                for p in item["platforms"]:
+                    platform_coverage[p] += 1
+
+            return {
+                "success": True,
+                "summary": {
+                    "original_count": total_original,
+                    "aggregated_count": total_aggregated,
+                    "returned_count": len(results),
+                    "deduplication_rate": f"{dedup_rate * 100:.1f}%",
+                    "similarity_threshold": similarity_threshold,
+                    "date_range": {
+                        "start": start_date.strftime("%Y-%m-%d"),
+                        "end": end_date.strftime("%Y-%m-%d")
+                    }
+                },
+                "aggregated_news": results,
+                "statistics": {
+                    "platform_coverage": dict(platform_coverage),
+                    "multi_platform_news": len([a for a in aggregated if len(a["platforms"]) > 1]),
+                    "single_platform_news": len([a for a in aggregated if len(a["platforms"]) == 1])
+                }
+            }
+
+        except MCPError as e:
+            return {"success": False, "error": e.to_dict()}
+        except Exception as e:
+            return {"success": False, "error": {"code": "INTERNAL_ERROR", "message": str(e)}}
+
+    def _aggregate_similar_news(
+        self,
+        news_list: List[Dict],
+        threshold: float,
+        include_url: bool
+    ) -> List[Dict]:
+        """
+        Aggregate a news list by title similarity
+
+        Args:
+            news_list: List of news items
+            threshold: Similarity threshold
+            include_url: Whether to include URLs
+
+        Returns:
+            List of aggregated news
+        """
+        if not news_list:
+            return []
+
+        # Sort by weight so that higher-weight news is kept as the representative
+        sorted_news = sorted(news_list, key=lambda x: x.get("weight", 0), reverse=True)
+
+        aggregated = []
+        used_indices = set()
+
+        for i, news in enumerate(sorted_news):
+            if i in used_indices:
+                continue
+
+            # Create the aggregation group
+            group = {
+                "representative_title": news["title"],
+                "platforms": [news["platform_name"]],
+                "platform_ids": [news["platform"]],
+                "dates": [news["date"]],
+                "best_rank": news["rank"],
+                "total_count": news["count"],
+                "aggregate_weight": news.get("weight", 0),
+                "sources": [{
+                    "platform": news["platform_name"],
+                    "rank": news["rank"],
+                    "date": news["date"]
+                }]
+            }
+
+            if include_url and news.get("url"):
+                group["urls"] = [{
+                    "platform": news["platform_name"],
+                    "url": news.get("url", ""),
+                    "mobileUrl": news.get("mobileUrl", "")
+                }]
+
+            used_indices.add(i)
+
+            # Look for similar news
+            for j, other_news in enumerate(sorted_news):
+                if j in used_indices:
+                    continue
+
+                similarity = self._calculate_similarity(news["title"], other_news["title"])
+
+                if similarity >= threshold:
+                    # Merge into the current group
+                    if other_news["platform_name"] not in group["platforms"]:
+                        group["platforms"].append(other_news["platform_name"])
+                        group["platform_ids"].append(other_news["platform"])
+
+                    if other_news["date"] not in group["dates"]:
+                        group["dates"].append(other_news["date"])
+
+                    group["best_rank"] = min(group["best_rank"], other_news["rank"])
+                    group["total_count"] += other_news["count"]
+                    group["aggregate_weight"] += other_news.get("weight", 0) * 0.5  # merged items add half of their weight
+
+                    group["sources"].append({
+                        "platform": other_news["platform_name"],
+                        "rank": other_news["rank"],
+                        "date": other_news["date"]
+                    })
+
+                    if include_url and other_news.get("url"):
+                        if "urls" not in group:
+                            group["urls"] = []
+                        group["urls"].append({
+                            "platform": other_news["platform_name"],
+                            "url": other_news.get("url", ""),
+                            "mobileUrl": other_news.get("mobileUrl", "")
+                        })
+
+                    used_indices.add(j)
+
+            # Add aggregation info
+            group["platform_count"] = len(group["platforms"])
+            group["is_cross_platform"] = len(group["platforms"]) > 1
+
+            aggregated.append(group)
+
+        return aggregated
+
+    # ==================== Period comparison tools ====================
+
+    def compare_periods(
+        self,
+        period1: Union[Dict[str, str], str],
+        period2: Union[Dict[str, str], str],
+        topic: Optional[str] = None,
+        compare_type: str = "overview",
+        platforms: Optional[List[str]] = None,
+        top_n: int = 10
+    ) -> Dict:
+        """
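+        Period comparison - compare news data across two time periods
+
+        Supports several comparison dimensions: heat, topic changes, platform activity and more.
+
+        Args:
+            period1: First period
+                - {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}: date range
+                - "today", "yesterday", "last_week", "last_month": presets
+            period2: Second period (same format as period1)
+            topic: Optional topic keyword (focuses the comparison on a specific topic)
+            compare_type: Comparison type
+                - "overview": overall overview (default)
+                - "topic_shift": topic shift analysis
+                - "platform_activity": platform activity comparison
+            platforms: Platform filter list
+            top_n: Number of top results to return, default 10
+
+        Returns:
+            Dict with the comparison result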
+        """
+        try:
+            # Validate parameters
+            platforms = validate_platforms(platforms)
+            top_n = validate_top_n(top_n, default=10)
+
+            if compare_type not in ["overview", "topic_shift", "platform_activity"]:
+                raise InvalidParameterError(
+                    f"不支持的对比类型: {compare_type}",
+                    suggestion="支持的类型: overview, topic_shift, platform_activity"
+                )
+
+            # Parse the periods
+            date_range1 = self._parse_period(period1)
+            date_range2 = self._parse_period(period2)
+
+            if not date_range1 or not date_range2:
+                raise InvalidParameterError(
+                    "无效的时间段格式",
+                    suggestion="使用 {'start': 'YYYY-MM-DD', 'end': 'YYYY-MM-DD'} 或预设值如 'last_week'"
+                )
+
+            # 收集两个时期的数据
+            data1 = self._collect_period_data(date_range1, platforms, topic)
+            data2 = self._collect_period_data(date_range2, platforms, topic)
+
+            # 根据对比类型执行不同的分析
+            if compare_type == "overview":
+                result = self._compare_overview(data1, data2, date_range1, date_range2, top_n)
+            elif compare_type == "topic_shift":
+                result = self._compare_topic_shift(data1, data2, date_range1, date_range2, top_n)
+            else:  # platform_activity
+                result = self._compare_platform_activity(data1, data2, date_range1, date_range2)
+
+            result["success"] = True
+            result["compare_type"] = compare_type
+            result["periods"] = {
+                "period1": {
+                    "start": date_range1[0].strftime("%Y-%m-%d"),
+                    "end": date_range1[1].strftime("%Y-%m-%d")
+                },
+                "period2": {
+                    "start": date_range2[0].strftime("%Y-%m-%d"),
+                    "end": date_range2[1].strftime("%Y-%m-%d")
+                }
+            }
+
+            if topic:
+                result["topic_filter"] = topic
+
+            return result
+
+        except MCPError as e:
+            return {"success": False, "error": e.to_dict()}
+        except Exception as e:
+            return {"success": False, "error": {"code": "INTERNAL_ERROR", "message": str(e)}}
+
+    def _parse_period(self, period: Union[Dict[str, str], str]) -> Optional[tuple]:
+        """Parse a time period into a (start, end) date-range tuple; return None if it is invalid"""
+        today = datetime.now()
+
+        if isinstance(period, str):
+            if period == "today":
+                return (today, today)
+            elif period == "yesterday":
+                yesterday = today - timedelta(days=1)
+                return (yesterday, yesterday)
+            elif period == "last_week":
+                return (today - timedelta(days=7), today - timedelta(days=1))
+            elif period == "this_week":
+                # Monday of this week through today
+                days_since_monday = today.weekday()
+                monday = today - timedelta(days=days_since_monday)
+                return (monday, today)
+            elif period == "last_month":
+                return (today - timedelta(days=30), today - timedelta(days=1))
+            elif period == "this_month":
+                first_of_month = today.replace(day=1)
+                return (first_of_month, today)
+            else:
+                return None
+        elif isinstance(period, dict):
+            try:
+                start = datetime.strptime(period["start"], "%Y-%m-%d")
+                end = datetime.strptime(period["end"], "%Y-%m-%d")
+                return (start, end)
+            except (KeyError, ValueError):
+                return None
+        return None
+
+    def _collect_period_data(
+        self,
+        date_range: tuple,
+        platforms: Optional[List[str]],
+        topic: Optional[str]
+    ) -> Dict:
+        """Collect news data for the given period"""
+        start_date, end_date = date_range
+        all_news = []
+        all_keywords = Counter()
+        platform_stats = Counter()
+
+        current_date = start_date
+        while current_date <= end_date:
+            try:
+                all_titles, id_to_name, _ = self.data_service.parser.read_all_titles_for_date(
+                    date=current_date,
+                    platform_ids=platforms
+                )
+
+                for platform_id, titles in all_titles.items():
+                    platform_name = id_to_name.get(platform_id, platform_id)
+
+                    for title, info in titles.items():
+                        # If a topic is given, skip news that does not mention it
+                        if topic and topic.lower() not in title.lower():
+                            continue
+
+                        news_item = {
+                            "title": title,
+                            "platform": platform_id,
+                            "platform_name": platform_name,
+                            "date": current_date.strftime("%Y-%m-%d"),
+                            "ranks": info.get("ranks", []),
+                            "rank": info["ranks"][0] if info.get("ranks") else 999
+                        }
+                        news_item["weight"] = calculate_news_weight(news_item)
+                        all_news.append(news_item)
+
+                        # Count per platform
+                        platform_stats[platform_name] += 1
+
+                        # Extract keywords
+                        keywords = self._extract_keywords(title)
+                        all_keywords.update(keywords)
+
+            except DataNotFoundError:
+                pass
+
+            current_date += timedelta(days=1)
+
+        return {
+            "news": all_news,
+            "news_count": len(all_news),
+            "keywords": all_keywords,
+            "platform_stats": platform_stats,
+            "date_range": date_range
+        }
+
+    def _compare_overview(
+        self,
+        data1: Dict,
+        data2: Dict,
+        range1: tuple,
+        range2: tuple,
+        top_n: int
+    ) -> Dict:
+        """Overall overview comparison"""
+        # Compute the change in news volume
+        count_change = data2["news_count"] - data1["news_count"]
+        count_change_pct = (count_change / data1["news_count"] * 100) if data1["news_count"] > 0 else 0
+
+        # Compare TOP keywords
+        top_kw1 = [kw for kw, _ in data1["keywords"].most_common(top_n)]
+        top_kw2 = [kw for kw, _ in data2["keywords"].most_common(top_n)]
+
+        new_keywords = [kw for kw in top_kw2 if kw not in top_kw1]
+        disappeared_keywords = [kw for kw in top_kw1 if kw not in top_kw2]
+        persistent_keywords = [kw for kw in top_kw1 if kw in top_kw2]
+
+        # Compare TOP news
+        top_news1 = sorted(data1["news"], key=lambda x: x.get("weight", 0), reverse=True)[:top_n]
+        top_news2 = sorted(data2["news"], key=lambda x: x.get("weight", 0), reverse=True)[:top_n]
+
+        return {
+            "overview": {
+                "period1_count": data1["news_count"],
+                "period2_count": data2["news_count"],
+                "count_change": count_change,
+                "count_change_percent": f"{count_change_pct:+.1f}%"
+            },
+            "keyword_analysis": {
+                "new_keywords": new_keywords[:5],
+                "disappeared_keywords": disappeared_keywords[:5],
+                "persistent_keywords": persistent_keywords[:5]
+            },
+            "top_news": {
+                "period1": [{"title": n["title"], "platform": n["platform_name"]} for n in top_news1],
+                "period2": [{"title": n["title"], "platform": n["platform_name"]} for n in top_news2]
+            }
+        }
+
+    def _compare_topic_shift(
+        self,
+        data1: Dict,
+        data2: Dict,
+        range1: tuple,
+        range2: tuple,
+        top_n: int
+    ) -> Dict:
+        """Topic shift analysis"""
+        kw1 = data1["keywords"]
+        kw2 = data2["keywords"]
+
+        # Compute the change in heat
+        all_keywords = set(kw1.keys()) | set(kw2.keys())
+        keyword_changes = []
+
+        for kw in all_keywords:
+            count1 = kw1.get(kw, 0)
+            count2 = kw2.get(kw, 0)
+            change = count2 - count1
+
+            if count1 > 0:
+                change_pct = (change / count1) * 100
+            elif count2 > 0:
+                change_pct = 100  # newly appeared
+            else:
+                change_pct = 0
+
+            keyword_changes.append({
+                "keyword": kw,
+                "period1_count": count1,
+                "period2_count": count2,
+                "change": change,
+                "change_percent": round(change_pct, 1)
+            })
+
+        # Sort by magnitude of change
+        rising = sorted([k for k in keyword_changes if k["change"] > 0],
+                       key=lambda x: x["change"], reverse=True)[:top_n]
+        falling = sorted([k for k in keyword_changes if k["change"] < 0],
+                        key=lambda x: x["change"])[:top_n]
+        new_topics = [k for k in keyword_changes if k["period1_count"] == 0 and k["period2_count"] > 0][:top_n]
+
+        return {
+            "rising_topics": rising,
+            "falling_topics": falling,
+            "new_topics": new_topics,
+            "total_keywords": {
+                "period1": len(kw1),
+                "period2": len(kw2)
+            }
+        }
+
+    def _compare_platform_activity(
+        self,
+        data1: Dict,
+        data2: Dict,
+        range1: tuple,
+        range2: tuple
+    ) -> Dict:
+        """Platform activity comparison"""
+        ps1 = data1["platform_stats"]
+        ps2 = data2["platform_stats"]
+
+        all_platforms = set(ps1.keys()) | set(ps2.keys())
+        platform_changes = []
+
+        for platform in all_platforms:
+            count1 = ps1.get(platform, 0)
+            count2 = ps2.get(platform, 0)
+            change = count2 - count1
+
+            if count1 > 0:
+                change_pct = (change / count1) * 100
+            elif count2 > 0:
+                change_pct = 100
+            else:
+                change_pct = 0
+
+            platform_changes.append({
+                "platform": platform,
+                "period1_count": count1,
+                "period2_count": count2,
+                "change": change,
+                "change_percent": round(change_pct, 1)
+            })
+
+        # Sort by change
+        platform_changes.sort(key=lambda x: x["change"], reverse=True)
+
+        return {
+            "platform_comparison": platform_changes,
+            "most_active_growth": platform_changes[0] if platform_changes else None,
+            "least_active_growth": platform_changes[-1] if platform_changes else None,
+            "total_activity": {
+                "period1": sum(ps1.values()),
+                "period2": sum(ps2.values())
+            }
+        }
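
The rise/fall ranking in `_compare_topic_shift` above can be tried in isolation. The sketch below is a standalone approximation, not code from this commit: `keyword_shift` is a hypothetical helper name, and the tie-breaking on keyword text is an added assumption so that the output order is deterministic.

```python
from collections import Counter


def keyword_shift(kw1: Counter, kw2: Counter, top_n: int = 10) -> dict:
    """Hypothetical helper: rank keywords by count change between two periods."""
    changes = []
    for kw in set(kw1) | set(kw2):
        count1, count2 = kw1.get(kw, 0), kw2.get(kw, 0)
        change = count2 - count1
        if count1 > 0:
            pct = change / count1 * 100
        else:
            # A keyword absent from period 1 is reported as +100%, as in the diff
            pct = 100.0 if count2 > 0 else 0.0
        changes.append({"keyword": kw, "change": change, "change_percent": round(pct, 1)})

    # Sort on (change, keyword) so that ties come out in a stable order (assumption)
    rising = sorted((c for c in changes if c["change"] > 0),
                    key=lambda c: (-c["change"], c["keyword"]))[:top_n]
    falling = sorted((c for c in changes if c["change"] < 0),
                     key=lambda c: (c["change"], c["keyword"]))[:top_n]
    return {"rising": rising, "falling": falling}


period1 = Counter({"AI": 4, "chips": 2})
period2 = Counter({"AI": 6, "robots": 3})
shift = keyword_shift(period1, period2)
```

With these sample counters, `robots` and `AI` land in `rising` and `chips` in `falling`.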

+ 46 - 24
mcp_server/tools/data_query.py

@@ -154,39 +154,55 @@ class DataQueryTools:
     def get_trending_topics(
         self,
         top_n: Optional[int] = None,
-        mode: Optional[str] = None
+        mode: Optional[str] = None,
+        extract_mode: Optional[str] = None
     ) -> Dict:
         """
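-        Get occurrence-frequency statistics for the user's watch words in the news
-
-        Note: this tool computes statistics from the personal watch-word list in config/frequency_words.txt,
-        rather than automatically extracting trending topics from the news. The list is user-customizable;
-        users can add or remove watch words according to their interests.
+        Get trending topic statistics
 
         Args:
-            top_n: Return the TOP N watch words, default 10
-            mode: Mode - daily (cumulative for the day), current (latest batch), incremental
+            top_n: Return the TOP N topics, default 10
+            mode: Time mode
+                - "daily": statistics over the day's cumulative data
+                - "current": statistics over the latest batch of data (default)
+            extract_mode: Extraction mode
+                - "keywords": count preset watch words (based on config/frequency_words.txt, default)
+                - "auto_extract": automatically extract high-frequency words from news titles
 
         Returns:
-            Watch-word frequency dictionary, containing how many times each watch word appears in the news
+            Topic frequency statistics dictionary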
 
         Example:
             >>> tools = DataQueryTools()
+            >>> # Use the preset watch words
             >>> result = tools.get_trending_topics(top_n=5, mode="current")
-            >>> print(len(result['topics']))
-            5
-            >>> # What is returned is frequency statistics for the watch words you set in frequency_words.txt
+            >>> # Automatically extract high-frequency words
+            >>> result = tools.get_trending_topics(top_n=10, extract_mode="auto_extract")
         """
         try:
             # Validate parameters
             top_n = validate_top_n(top_n, default=10)
-            valid_modes = ["daily", "current", "incremental"]
+            valid_modes = ["daily", "current"]
             mode = validate_mode(mode, valid_modes, default="current")
 
+            # Validate extract_mode
+            if extract_mode is None:
+                extract_mode = "keywords"
+            elif extract_mode not in ["keywords", "auto_extract"]:
+                return {
+                    "success": False,
+                    "error": {
+                        "code": "INVALID_PARAMETER",
+                        "message": f"Unsupported extraction mode: {extract_mode}",
+                        "suggestion": "Supported modes: keywords, auto_extract"
+                    }
+                }
+
             # Get trending topics
             trending_result = self.data_service.get_trending_topics(
                 top_n=top_n,
-                mode=mode
+                mode=mode,
+                extract_mode=extract_mode
             )
 
             return {
@@ -210,7 +226,7 @@ class DataQueryTools:
 
     def get_news_by_date(
         self,
-        date_query: Optional[str] = None,
+        date_range: Optional[Union[Dict[str, str], str]] = None,
         platforms: Optional[List[str]] = None,
         limit: Optional[int] = None,
         include_url: bool = False
@@ -219,10 +235,10 @@ class DataQueryTools:
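         Query news by date; natural-language dates are supported
 
         Args:
-            date_query: Date query string (optional, defaults to "今天", i.e. today). Supported forms:
-                - Relative dates: 今天, 昨天, 前天, 3天前, yesterday, 3 days ago
-                - Weekdays: 上周一, 本周三, last monday, this friday
-                - Absolute dates: 2025-10-10, 10月10日, 2025年10月10日
+            date_range: Date range (optional, defaults to "今天", i.e. today). Supported forms:
+                - Range object: {"start": "2025-01-01", "end": "2025-01-07"} (only "start" is used)
+                - Relative dates: 今天 (today), 昨天 (yesterday), 前天 (the day before yesterday), 3天前 (3 days ago)
+                - Single-day string: 2025-10-10
             platforms: List of platform IDs, e.g. ['zhihu', 'weibo']
             limit: Maximum number of items to return, default 50
             include_url: Whether to include URL links, default False (saves tokens)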
@@ -236,7 +252,7 @@ class DataQueryTools:
             >>> result = tools.get_news_by_date(platforms=['zhihu'], limit=20)
             >>> # Specify a date
             >>> result = tools.get_news_by_date(
-            ...     date_query="昨天",
+            ...     date_range="昨天",
             ...     platforms=['zhihu'],
             ...     limit=20
             ... )
@@ -245,9 +261,15 @@ class DataQueryTools:
         """
         try:
             # Validate parameters - default to today
-            if date_query is None:
-                date_query = "今天"
-            target_date = validate_date_query(date_query)
+            if date_range is None:
+                date_range = "今天"
+            # Handle date_range: accepts a string or an object
+            if isinstance(date_range, dict):
+                # Range object: use the start date
+                date_str = date_range.get('start', '今天')
+            else:
+                date_str = date_range
+            target_date = validate_date_query(date_str)
             platforms = validate_platforms(platforms)
             limit = validate_limit(limit, default=50)
 
@@ -263,7 +285,7 @@ class DataQueryTools:
                 "news": news_list,
                 "total": len(news_list),
                 "date": target_date.strftime("%Y-%m-%d"),
-                "date_query": date_query,
+                "date_range": date_range,
                 "platforms": platforms,
                 "success": True
             }
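
The `date_range` handling in `get_news_by_date` above reduces every input to a single date string before validation. A minimal sketch of that branch follows; `pick_date_string` is a hypothetical name used only for illustration. It shows that a range object contributes only its `start` key and that `end` is ignored.

```python
from typing import Dict, Optional, Union


def pick_date_string(
    date_range: Optional[Union[Dict[str, str], str]],
    default: str = "今天",  # "today", the default used in the diff
) -> str:
    """Hypothetical helper mirroring the branch above."""
    if date_range is None:
        return default
    if isinstance(date_range, dict):
        # Only the start of the range is used; "end" is not consulted
        return date_range.get("start", default)
    return date_range


single_day = pick_date_string({"start": "2025-01-01", "end": "2025-01-07"})
```

A caller who needs every day in a range would have to loop over the dates themselves, since this path resolves to one day.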

+ 190 - 11
mcp_server/tools/search_tools.py

@@ -11,7 +11,7 @@ from difflib import SequenceMatcher
 from typing import Dict, List, Optional, Tuple, Union
 
 from ..services.data_service import DataService
-from ..utils.validators import validate_keyword, validate_limit
+from ..utils.validators import validate_keyword, validate_limit, validate_threshold
 from ..utils.errors import MCPError, InvalidParameterError, DataNotFoundError
 
 
@@ -95,7 +95,7 @@ class SearchTools:
                 )
 
             limit = validate_limit(limit, default=50)
-            threshold = max(0.0, min(1.0, threshold))
+            threshold = validate_threshold(threshold, default=0.6, min_value=0.0, max_value=1.0)
 
             # Handle the date range
             if date_range:
@@ -491,9 +491,34 @@ class SearchTools:
 
         return intersection / union
 
+    def _jaccard_similarity(self, list1: List[str], list2: List[str]) -> float:
+        """
+        Compute the Jaccard similarity of two lists
+
+        Args:
+            list1: First list
+            list2: Second list
+
+        Returns:
+            Jaccard similarity (between 0 and 1)
+        """
+        if not list1 or not list2:
+            return 0.0
+
+        set1 = set(list1)
+        set2 = set(list2)
+
+        intersection = len(set1 & set2)
+        union = len(set1 | set2)
+
+        if union == 0:
+            return 0.0
+
+        return intersection / union
+
     def search_related_news_history(
         self,
-        reference_text: str,
+        reference_title: str,
         time_preset: str = "yesterday",
         start_date: Optional[datetime] = None,
         end_date: Optional[datetime] = None,
@@ -505,7 +530,7 @@ class SearchTools:
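         Search historical data for news related to the given news item
 
         Args:
-            reference_text: Reference news title or content
+            reference_title: Reference news title or content
             time_preset: Preset time range, one of:
                 - "yesterday": yesterday
                 - "last_week": last week (7 days)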
@@ -523,7 +548,7 @@ class SearchTools:
         Example:
             >>> tools = SearchTools()
             >>> result = tools.search_related_news_history(
-            ...     reference_text="人工智能技术突破",
+            ...     reference_title="人工智能技术突破",
             ...     time_preset="last_week",
             ...     threshold=0.4,
             ...     limit=50
@@ -533,8 +558,8 @@ class SearchTools:
         """
         try:
             # Validate parameters
-            reference_text = validate_keyword(reference_text)
-            threshold = max(0.0, min(1.0, threshold))
+            reference_title = validate_keyword(reference_title)
+            threshold = validate_threshold(threshold, default=0.4, min_value=0.0, max_value=1.0)
             limit = validate_limit(limit, default=50)
 
             # Determine the date range to query
@@ -564,7 +589,7 @@ class SearchTools:
                 )
 
             # Extract keywords from the reference text
-            reference_keywords = self._extract_keywords(reference_text)
+            reference_keywords = self._extract_keywords(reference_title)
 
             if not reference_keywords:
                 raise InvalidParameterError(
@@ -587,7 +612,7 @@ class SearchTools:
 
                         for title, info in titles.items():
                             # Compute title similarity
-                            title_similarity = self._calculate_similarity(reference_text, title)
+                            title_similarity = self._calculate_similarity(reference_title, title)
 
                             # Extract keywords from the title
                             title_keywords = self._extract_keywords(title)
@@ -636,7 +661,7 @@ class SearchTools:
                     "success": True,
                     "results": [],
                     "total": 0,
-                    "query": reference_text,
+                    "query": reference_title,
                     "time_preset": time_preset,
                     "date_range": {
                         "start": search_start.strftime("%Y-%m-%d"),
@@ -662,7 +687,7 @@ class SearchTools:
                     "returned_count": len(results),
                     "requested_limit": limit,
                     "threshold": threshold,
-                    "reference_text": reference_text,
+                    "reference_title": reference_title,
                     "reference_keywords": reference_keywords,
                     "time_preset": time_preset,
                     "date_range": {
@@ -699,3 +724,157 @@ class SearchTools:
                     "message": str(e)
                 }
             }
+
+    def find_related_news_unified(
+        self,
+        reference_title: str,
+        date_range: Optional[Union[Dict[str, str], str]] = None,
+        threshold: float = 0.5,
+        limit: int = 50,
+        include_url: bool = False
+    ) -> Dict:
+        """
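+        Unified related-news lookup tool - merges similar-news search and historical related-news search
+
+        Args:
+            reference_title: Reference news title
+            date_range: Date range (optional)
+                - Omitted: query only today's data
+                - {"start": "YYYY-MM-DD", "end": "YYYY-MM-DD"}: query the given date range
+                - "YYYY-MM-DD": a single day
+                - "today": today
+                - "yesterday": yesterday
+                - "last_week": the last 7 days
+                - "last_month": the last 30 days
+            threshold: Similarity threshold between 0 and 1, default 0.5
+            limit: Maximum number of items to return, default 50
+            include_url: Whether to include URL links, default False
+
+        Returns:
+            List of related news, sorted by similarity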
+        """
+        try:
+            # Validate parameters
+            reference_title = validate_keyword(reference_title)
+            threshold = validate_threshold(threshold, default=0.5, min_value=0.0, max_value=1.0)
+            limit = validate_limit(limit, default=50)
+
+            # Determine the date range
+            today = datetime.now()
+            
+            if date_range is None or date_range == "today":
+                # Query today only
+                search_dates = [today]
+            elif isinstance(date_range, str):
+                # Preset time ranges
+                if date_range == "yesterday":
+                    search_dates = [today - timedelta(days=1)]
+                elif date_range == "last_week":
+                    search_dates = [today - timedelta(days=i) for i in range(7)]
+                elif date_range == "last_month":
+                    search_dates = [today - timedelta(days=i) for i in range(30)]
+                else:
+                    # Single-day string format
+                    try:
+                        single_date = datetime.strptime(date_range, "%Y-%m-%d")
+                        search_dates = [single_date]
+                    except ValueError:
+                        search_dates = [today]
+            elif isinstance(date_range, dict):
+                # Date range object
+                start_str = date_range.get("start")
+                end_str = date_range.get("end")
+                if start_str and end_str:
+                    start_date = datetime.strptime(start_str, "%Y-%m-%d")
+                    end_date = datetime.strptime(end_str, "%Y-%m-%d")
+                    search_dates = []
+                    current = start_date
+                    while current <= end_date:
+                        search_dates.append(current)
+                        current += timedelta(days=1)
+                else:
+                    search_dates = [today]
+            else:
+                search_dates = [today]
+
+            # Extract keywords from the reference title
+            reference_keywords = self._extract_keywords(reference_title)
+
+            # Collect all related news
+            all_related_news = []
+            
+            for search_date in search_dates:
+                try:
+                    all_titles, id_to_name, _ = self.data_service.parser.read_all_titles_for_date(search_date)
+                    
+                    for platform_id, titles in all_titles.items():
+                        platform_name = id_to_name.get(platform_id, platform_id)
+                        
+                        for title, info in titles.items():
+                            if title == reference_title:
+                                continue
+                            
+                            # Compute similarity (using a blended algorithm)
+                            text_similarity = self._calculate_similarity(reference_title, title)
+                            
+                            # If there are keywords, also compute keyword overlap
+                            if reference_keywords:
+                                title_keywords = self._extract_keywords(title)
+                                keyword_similarity = self._jaccard_similarity(reference_keywords, title_keywords)
+                                # Blended similarity: 70% text + 30% keywords
+                                similarity = 0.7 * text_similarity + 0.3 * keyword_similarity
+                            else:
+                                similarity = text_similarity
+                            
+                            if similarity >= threshold:
+                                news_item = {
+                                    "title": title,
+                                    "platform": platform_id,
+                                    "platform_name": platform_name,
+                                    "date": search_date.strftime("%Y-%m-%d"),
+                                    "similarity": round(similarity, 3),
+                                    # .get() keeps one item without ranks from discarding the whole day via the except below
+                                    "rank": info["ranks"][0] if info.get("ranks") else 0
+                                }
+                                
+                                if include_url:
+                                    news_item["url"] = info.get("url", "")
+                                
+                                all_related_news.append(news_item)
+                                
+                except Exception:
+                    # Reading a day's data failed; skip it
+                    continue
+
+            # Sort by similarity
+            all_related_news.sort(key=lambda x: x["similarity"], reverse=True)
+            
+            # Apply the limit
+            results = all_related_news[:limit]
+
+            # Statistics
+            from collections import Counter
+            platform_dist = Counter([n["platform_name"] for n in all_related_news])
+            date_dist = Counter([n["date"] for n in all_related_news])
+
+            return {
+                "success": True,
+                "summary": {
+                    "total_found": len(all_related_news),
+                    "returned_count": len(results),
+                    "reference_title": reference_title,
+                    "threshold": threshold,
+                    "date_range": {
+                        "start": min(search_dates).strftime("%Y-%m-%d"),
+                        "end": max(search_dates).strftime("%Y-%m-%d")
+                    } if search_dates else None
+                },
+                "results": results,
+                "statistics": {
+                    "platform_distribution": dict(platform_dist),
+                    "date_distribution": dict(date_dist)
+                }
+            }
+
+        except MCPError as e:
+            return {"success": False, "error": e.to_dict()}
+        except Exception as e:
+            return {"success": False, "error": {"code": "INTERNAL_ERROR", "message": str(e)}}
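
The 70/30 blend used in `find_related_news_unified` above can be sketched on its own. This is an illustration under assumptions rather than the commit's implementation: the body of `_calculate_similarity` is not shown in this diff, so `difflib.SequenceMatcher` (which the module imports) stands in for the text score, and `jaccard` and `blended_similarity` are hypothetical names.

```python
from difflib import SequenceMatcher
from typing import List


def jaccard(a: List[str], b: List[str]) -> float:
    """Jaccard overlap of two keyword lists; 0.0 if either is empty."""
    set_a, set_b = set(a), set(b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def blended_similarity(ref: str, title: str, ref_kw: List[str], title_kw: List[str]) -> float:
    """Hypothetical stand-in: 70% text similarity + 30% keyword overlap."""
    # Assumption: SequenceMatcher.ratio() approximates the text similarity score
    text_sim = SequenceMatcher(None, ref, title).ratio()
    if not ref_kw:
        # No reference keywords: fall back to the text score alone, as in the diff
        return text_sim
    return 0.7 * text_sim + 0.3 * jaccard(ref_kw, title_kw)


score = blended_similarity("AI chip breakthrough", "AI chip shortage",
                           ["AI", "chip", "breakthrough"], ["AI", "chip", "shortage"])
```

Weighting the text score more heavily means a title with no keyword overlap at all needs a text similarity of about 0.72 or higher to clear the default threshold of 0.5.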

+ 207 - 6
mcp_server/utils/validators.py

@@ -2,6 +2,7 @@
 Parameter validation utilities
 
 Provides unified parameter validation.
+Handles the case where an MCP client serializes parameters as strings.
 """
 
 from datetime import datetime
@@ -9,11 +10,144 @@ from typing import List, Optional, Union
 import os
 import json
 import yaml
+import ast
 
 from .errors import InvalidParameterError
 from .date_parser import DateParser
 
 
+# ==================== Helper functions: handling string serialization ====================
+
+def _parse_string_to_list(value: str) -> List[str]:
+    """
+    Parse a string into a list
+
+    Supported formats:
+    - JSON array: '["zhihu", "weibo"]'
+    - Python list string: "['zhihu', 'weibo']"
+    - Comma-separated: "zhihu, weibo" or "zhihu,weibo"
+
+    Args:
+        value: String value
+
+    Returns:
+        The parsed list; input that matches none of the formats is returned as a single-item list
+    """
+    value = value.strip()
+
+    if not value:
+        return []
+
+    # Try JSON parsing: '["zhihu", "weibo"]'
+    try:
+        parsed = json.loads(value)
+        if isinstance(parsed, list):
+            return [str(item) for item in parsed]
+        # If the result is not a list, fall through to the other formats
+    except json.JSONDecodeError:
+        pass
+
+    # Try Python literal parsing: "['zhihu', 'weibo']"
+    try:
+        parsed = ast.literal_eval(value)
+        if isinstance(parsed, list):
+            return [str(item) for item in parsed]
+        if isinstance(parsed, str):
+            # A single string: wrap it in a list
+            return [parsed]
+    except (ValueError, SyntaxError):
+        pass
+
+    # Try comma-separated: "zhihu, weibo" or "zhihu,weibo"
+    if ',' in value:
+        items = [item.strip() for item in value.split(',')]
+        return [item for item in items if item]
+
+    # Single value
+    return [value]
+
+
+def _parse_string_to_int(value: str, param_name: str = "parameter") -> int:
+    """
+    Parse a string into an integer
+
+    Args:
+        value: String value
+        param_name: Parameter name (used in error messages)
+
+    Returns:
+        The parsed integer
+
+    Raises:
+        InvalidParameterError: parsing failed
+    """
+    value = value.strip()
+
+    try:
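+        # Try a direct conversion first
+        return int(value)
+    except ValueError:
+        pass
+
+    # Otherwise parse as a float and truncate
+    try:
+        return int(float(value))
+    except (ValueError, OverflowError):
+        # OverflowError covers inputs such as "inf", which float() accepts but int() rejects
+        raise InvalidParameterError(
+            f"{param_name} must be an integer; could not parse: {value}",
+            suggestion="Provide a valid integer value, e.g. 10, 50, 100"
+        )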
+
+
+def _parse_string_to_float(value: str, param_name: str = "parameter") -> float:
+    """
+    Parse a string into a float
+
+    Args:
+        value: String value
+        param_name: Parameter name (used in error messages)
+
+    Returns:
+        The parsed float
+
+    Raises:
+        InvalidParameterError: parsing failed
+    """
+    value = value.strip()
+
+    try:
+        return float(value)
+    except ValueError:
+        raise InvalidParameterError(
+            f"{param_name} must be a number; could not parse: {value}",
+            suggestion="Provide a valid numeric value, e.g. 0.6, 3.0"
+        )
+
+
+def _parse_string_to_bool(value: str) -> bool:
+    """
+    Parse a string into a boolean
+
+    Args:
+        value: String value
+
+    Returns:
+        The parsed boolean
+    """
+    value = value.strip().lower()
+
+    if value in ('true', '1', 'yes', 'on'):
+        return True
+    elif value in ('false', '0', 'no', 'off', ''):
+        return False
+    else:
+        # Any other non-empty string counts as True
+        return bool(value)
+
+
 def get_supported_platforms() -> List[str]:
     """
     Dynamically load the list of supported platforms from config.yaml
@@ -41,12 +175,19 @@ def get_supported_platforms() -> List[str]:
         return []
 
 
-def validate_platforms(platforms: Optional[List[str]]) -> List[str]:
+def validate_platforms(platforms: Optional[Union[List[str], str]]) -> List[str]:
     """
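     Validate the platform list
 
     Args:
-        platforms: List of platform IDs; None means use all platforms configured in config.yaml
+        platforms: List of platform IDs, or a string; None means use all platforms configured in config.yaml
+                   Several formats are supported:
+                   - None: use the default platforms
+                   - ["zhihu", "weibo"]: JSON array
+                   - '["zhihu", "weibo"]': JSON array string
+                   - "['zhihu', 'weibo']": Python list string
+                   - "zhihu, weibo": comma-separated string
+                   - "zhihu": single platform string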
 
     Returns:
         The validated platform list
@@ -65,6 +206,13 @@ def validate_platforms(platforms: Optional[List[str]]) -> List[str]:
         # Return the platform list from the config file (the user's default configuration)
         return supported_platforms if supported_platforms else []
 
+    # Accept list input given as a string (some MCP clients serialize JSON arrays as strings)
+    if isinstance(platforms, str):
+        platforms = _parse_string_to_list(platforms)
+        if not platforms:
+            # Empty string, or empty after parsing: use the default platforms
+            return supported_platforms if supported_platforms else []
+
     if not isinstance(platforms, list):
         raise InvalidParameterError("platforms must be a list")
 
@@ -88,12 +236,12 @@ def validate_platforms(platforms: Optional[List[str]]) -> List[str]:
     return platforms
 
 
-def validate_limit(limit: Optional[int], default: int = 20, max_limit: int = 1000) -> int:
+def validate_limit(limit: Optional[Union[int, str]], default: int = 20, max_limit: int = 1000) -> int:
     """
     Validate the limit parameter
 
     Args:
-        limit: Limit count
+        limit: Limit count (integer or string)
         default: Default value
         max_limit: Maximum allowed limit
 
@@ -106,6 +254,10 @@ def validate_limit(limit: Optional[int], default: int = 20, max_limit: int = 100
     if limit is None:
         return default
 
+    # Accept integers given as strings (some MCP clients serialize numbers as strings)
+    if isinstance(limit, str):
+        limit = _parse_string_to_int(limit, "limit")
+
     if not isinstance(limit, int):
         raise InvalidParameterError("limit must be an integer")
 
@@ -256,12 +408,12 @@ def validate_keyword(keyword: str) -> str:
     return keyword
 
 
-def validate_top_n(top_n: Optional[int], default: int = 10) -> int:
+def validate_top_n(top_n: Optional[Union[int, str]], default: int = 10) -> int:
     """
     Validate the TOP N parameter
 
     Args:
-        top_n: TOP N count
+        top_n: TOP N count (integer or string)
         default: Default value
 
     Returns:
@@ -320,6 +472,55 @@ def validate_config_section(section: Optional[str]) -> str:
     return validate_mode(section, valid_sections, "all")
 
 
+def validate_threshold(
+    threshold: Optional[Union[float, int, str]],
+    default: float = 0.6,
+    min_value: float = 0.0,
+    max_value: float = 1.0,
+    param_name: str = "threshold"
+) -> float:
+    """
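+    Validate a threshold parameter (float)
+
+    Args:
+        threshold: Threshold (float, integer, or string)
+        default: Default value
+        min_value: Minimum value
+        max_value: Maximum value
+        param_name: Parameter name (used in error messages)
+
+    Returns:
+        The validated threshold
+
+    Raises:
+        InvalidParameterError: the parameter is invalid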
+    """
+    if threshold is None:
+        return default
+
+    # Accept numbers given as strings (some MCP clients serialize numbers as strings)
+    if isinstance(threshold, str):
+        threshold = _parse_string_to_float(threshold, param_name)
+
+    # Convert integers to float
+    if isinstance(threshold, int):
+        threshold = float(threshold)
+
+    if not isinstance(threshold, float):
+        raise InvalidParameterError(
+            f"{param_name} must be a number",
+            suggestion=f"Provide a number between {min_value} and {max_value}"
+        )
+
+    # Written as a chained comparison so that NaN (e.g. from the string "nan") is rejected too
+    if not (min_value <= threshold <= max_value):
+        raise InvalidParameterError(
+            f"{param_name} must be between {min_value} and {max_value}; got: {threshold}",
+            suggestion=f"Recommended value: {default}"
+        )
+
+    return threshold
+
+
 def validate_date_query(
     date_query: str,
     allow_future: bool = False,
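
To see how the string-tolerant parsing in `_parse_string_to_list` above behaves on the three documented formats, here is a condensed standalone sketch. `parse_list_param` is a hypothetical name, and the sketch is deliberately simplified: unlike the function in the diff, it does not unwrap a quoted single string such as `"'zhihu'"`.

```python
import ast
import json
from typing import List


def parse_list_param(value: str) -> List[str]:
    """Hypothetical, simplified variant of the list parser shown in the diff."""
    value = value.strip()
    if not value:
        return []
    # Try each structured syntax in turn: JSON first, then a Python literal
    for loader, errors in ((json.loads, (json.JSONDecodeError,)),
                           (ast.literal_eval, (ValueError, SyntaxError))):
        try:
            parsed = loader(value)
        except errors:
            continue
        if isinstance(parsed, list):
            return [str(item) for item in parsed]
    # Fall back to comma splitting; a value without commas becomes a one-item list
    return [item.strip() for item in value.split(",") if item.strip()]


from_json = parse_list_param('["zhihu", "weibo"]')
from_python = parse_list_param("['zhihu', 'weibo']")
from_commas = parse_list_param("zhihu, weibo")
```

All three inputs resolve to the same two-element list, which is the point of accepting several serializations from MCP clients.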

+ 6 - 5
pyproject.toml

@@ -1,7 +1,7 @@
 [project]
-name = "trendradar-mcp"
-version = "1.1.0"
-description = "TrendRadar MCP Server - trending news aggregation tool"
+name = "trendradar"
+version = "4.0.3"
+description = "TrendRadar - trending news aggregation and analysis tool"
 requires-python = ">=3.10"
 dependencies = [
     "requests>=2.32.5,<3.0.0",
@@ -12,7 +12,8 @@ dependencies = [
 ]
 
 [project.scripts]
-trendradar = "mcp_server.server:run_server"
+trendradar = "trendradar.__main__:main"
+trendradar-mcp = "mcp_server.server:run_server"
 
 [dependency-groups]
 dev = []
@@ -22,4 +23,4 @@ requires = ["hatchling"]
 build-backend = "hatchling.build"
 
 [tool.hatch.build.targets.wheel]
-packages = ["mcp_server"]
+packages = ["trendradar", "mcp_server"]