An AI Paper-Writing Helper! A LaTeX Paper-Writing Assistant Based on the MCP Protocol


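Project Background

When writing a paper, I find the Introduction the hardest part: weaving the paragraphs together with the literature in a sensible way is tedious. Even though I had read a lot of related work beforehand, what I wrote always felt awkward. So I wondered whether an AI application could generate the Introduction for me, together with the reference list. I also wanted it to imitate the writing style of other papers, since each type of journal has its own style.

My editing environment is VS Code + LaTeX, so I wanted to call MCP tools directly for writing assistance. I therefore built this on the VS Code + Roo Code + MCP setup described in an earlier article.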

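Project Goals

  • Implement a LaTeX paper-writing assistant based on the MCP protocol.
  • Imitate the writing style of other papers.
  • Save the generated paper content to local files.
  • When generating a section, automatically cite relevant literature and tie the citations into the text coherently.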

Implementation

1. Create the MCP server

As with a traditional service, the first step is to create a server for clients to interact with.

    import { Server } from "@modelcontextprotocol/sdk/server/index.js";

    // Create the MCP server
    const server = new Server(
      {
        name: "mdpi-server-mcp-server",
        version: "1.0.0",
      },
      {
        capabilities: {
          tools: {},
        },
      }
    );

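2. Register the tool-list handler

The ListToolsRequestSchema handler in MCP does two main things:

  • Describes the list of tools

    • name: the unique identifier of the tool
    • description: what the tool is for (read by the LLM to decide when to call it)
    • inputSchema: the parameter definition (in JSON Schema format, which the client uses to validate input)
  • Responds according to the protocol

MCP defines a standard request for listing tools (tools/list), which the TypeScript SDK exposes as ListToolsRequestSchema. The client sends this request after connecting, and the server's handler must return JSON that conforms to the specification.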
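
To make "JSON that conforms to the specification" more concrete, here is a rough sketch of what the exchange looks like on the wire, assuming the JSON-RPC 2.0 framing that MCP uses. The id value and the trimmed-down schema are illustrative only; the SDK builds and parses these messages for you.

```typescript
// JSON-RPC 2.0 request a client sends to list the server's tools.
const listToolsRequest = {
  jsonrpc: "2.0",
  id: 1, // illustrative id; the client picks it
  method: "tools/list",
};

// Response shape: whatever the handler returns becomes `result`.
const listToolsResponse = {
  jsonrpc: "2.0",
  id: listToolsRequest.id, // the response echoes the request id
  result: {
    tools: [
      {
        name: "search_arxiv",
        description: "Search arXiv papers",
        inputSchema: {
          type: "object",
          properties: { query: { type: "string" } },
          required: ["query"],
        },
      },
    ],
  },
};

// Messages travel as JSON text, so the result must survive a round trip.
const wire = JSON.stringify(listToolsResponse);
const decoded = JSON.parse(wire);
console.log(decoded.result.tools.map((t: { name: string }) => t.name));
```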

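    import { ListToolsRequestSchema } from "@modelcontextprotocol/sdk/types.js";

    // Register the tool-list handler
    server.setRequestHandler(ListToolsRequestSchema, async () => {
      return {
        tools: [
          {
            name: "search_arxiv",
            description: "Search arXiv papers",
            inputSchema: {
              type: "object",
              properties: {
                query: { type: "string", description: "English search keywords" },
                maxResults: { type: "number", description: "Maximum number of results", default: 5 }
              },
              required: ["query"]
            }
          },
          {
            name: "revise_section",
            description: "Polish, expand, or adjust existing section content in an academic style while keeping the MDPI LaTeX format.",
            inputSchema: {
              type: "object",
              properties: {
                content: { type: "string", description: "Original text to revise (LaTeX fragment)" },
                instruction: { type: "string", description: "Revision request, e.g. 'expand to 500 words', 'strengthen logical flow', 'convert to passive voice'" },
                autoRefs: { type: "boolean", description: "Whether to automatically search for and cite related arXiv papers (default false)", default: false },
                refQuery: { type: "string", description: "Custom literature search keywords (if omitted, generated from the section topic)" },
                refCount: { type: "number", description: "Desired number of references to cite (default 5)", default: 5 }
              },
              required: ["content", "instruction"]
            }
          },
          {
            name: "search_and_format_references",
            description: "Search arXiv by keyword and return formatted BibTeX entries that can be used directly in a LaTeX document.",
            inputSchema: {
              type: "object",
              properties: {
                query: { type: "string", description: "Search keywords (e.g. 'sEMG joint moment prediction')" },
                maxResults: { type: "number", description: "Number of results to return (default 5)", default: 5 }
              },
              required: ["query"]
            }
          },
          {
            name: "generate_paper_section",
            description: "Generate a LaTeX draft of a section from a user-specified section type (e.g. introduction, related_work, experiments, conclusion) and custom key points. A directory of reference papers can be given to imitate their writing style.",
            inputSchema: {
              type: "object",
              properties: {
                section: {
                  type: "string",
                  enum: ["abstract", "introduction", "related_work", "method", "experiments", "results", "discussion", "conclusion"],
                  description: "Type of section to generate"
                },
                additionalInstructions: { type: "string", description: "Extra requirements (e.g. emphasize novelty, include baseline comparisons)" },
                previousContent: { type: "string", description: "Preceding content already written (for contextual coherence)" },
                referenceDir: { type: "string", description: "Local directory containing reference files whose writing style should be imitated (optional)" },
                autoRefs: { type: "boolean", description: "Whether to automatically search for and cite related arXiv papers (default false)", default: false },
                refQuery: { type: "string", description: "Custom literature search keywords (if omitted, generated from the section topic)" },
                refCount: { type: "number", description: "Desired number of references to cite (default 5)", default: 5 }
              },
              required: ["section"]
            }
          }
        ]
      };
    });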
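
The handlers in the next step cast their arguments with `args as any`, so the `required` and `default` entries declared in `inputSchema` only take effect if the client enforces them. Below is a minimal sketch of applying them on the server side as well, assuming a simplified schema shape. `PropertySpec`, `SimpleSchema`, and `applySchema` are hypothetical names for illustration, not part of the MCP SDK or of this project.

```typescript
// Hypothetical, simplified view of the inputSchema objects declared above.
type PropertySpec = { type: "string" | "number" | "boolean"; default?: unknown };
type SimpleSchema = {
  properties: Record<string, PropertySpec>;
  required?: string[];
};

// Check required fields and primitive types, and fill in declared defaults.
// Throws an Error describing the first problem found.
function applySchema(
  schema: SimpleSchema,
  args: Record<string, unknown>
): Record<string, unknown> {
  for (const key of schema.required ?? []) {
    if (args[key] === undefined) {
      throw new Error(`Missing required argument: ${key}`);
    }
  }
  const out: Record<string, unknown> = {};
  for (const [key, spec] of Object.entries(schema.properties)) {
    const value = args[key] ?? spec.default;
    if (value === undefined) continue; // optional argument without a default
    if (typeof value !== spec.type) {
      throw new Error(`Argument ${key} must be of type ${spec.type}`);
    }
    out[key] = value;
  }
  return out;
}

// Mirrors the search_arxiv schema from the tool list.
const searchArxivSchema: SimpleSchema = {
  properties: {
    query: { type: "string" },
    maxResults: { type: "number", default: 5 },
  },
  required: ["query"],
};

// maxResults is not supplied, so the declared default is filled in.
const parsed = applySchema(searchArxivSchema, { query: "sEMG joint moment" });
console.log(parsed);
```

A full JSON Schema validator would cover nested objects and `enum` as well; this sketch only handles the flat, primitive-typed parameters that the four tools here use.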

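3. Register the tool-call handler

The CallToolRequestSchema handler processes the client's tool calls. Its responsibilities are:

  • Parse the request: extract the tool name and arguments.
  • Dispatch: jump to the business logic that matches the tool name (usually a switch branch).
  • Return the result: wrap the outcome in the response format MCP prescribes ({ content: […] }) and send it back to the client.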

    import * as fs from "fs";
    import * as path from "path";
    import { CallToolRequestSchema } from "@modelcontextprotocol/sdk/types.js";

    // WORK_DIR (the output directory) is defined elsewhere in the project.
    server.setRequestHandler(CallToolRequestSchema, async (request) => {
      const { name, arguments: args } = request.params;
      try {
        switch (name) {
          case "revise_section": {
            const { content, instruction, autoRefs, refQuery, refCount = 5 } = args as any;
            let refContext = "";
            const bibPaths: string[] = [];
            // Automatically search for references
            if (autoRefs) {
              const query = refQuery || buildRefQuery("", instruction, content);
              try {
                const { refPrompt, bibContent } = await buildRefContext(query, refCount);
                if (refPrompt.trim()) {
                  refContext = refPrompt;
                  const bibFileName = `refs_${Date.now()}.bib`;
                  fs.writeFileSync(path.join(WORK_DIR, bibFileName), bibContent, "utf-8");
                  bibPaths.push(bibFileName);
                }
              } catch (err) {
                console.warn("Automatic reference search failed:", err);
              }
            }
            // Build the prompts
            const systemPrompt = `You are a meticulous academic editor. Revise the provided LaTeX text according to the user's instruction. Maintain the LaTeX formatting and existing citation commands. ${autoRefs ? 'If you add new statements, cite the provided references where appropriate using \\cite{id}.' : ''}`;
            let userPrompt = `Original text:\n${content}\n\nRevision instruction: ${instruction}`;
            if (refContext) {
              userPrompt += `\n\n${refContext}\n\nAdd or keep citations as appropriate.`;
            }
            const revised = await callOpenRouterAPI(userPrompt, systemPrompt);
            const outPath = path.join(WORK_DIR, `revised_${Date.now()}.tex`);
            fs.writeFileSync(outPath, revised, "utf-8");
            let resultText = revised;
            if (bibPaths.length > 0) {
              resultText += `\n\n📚 Reference file saved: ${bibPaths.join(", ")}`;
            }
            return {
              content: [{
                type: "text",
                text: resultText,
                file: path.basename(outPath),
                ...(bibPaths.length > 0 && { bibFiles: bibPaths })
              }]
            };
          }
          case "search_and_format_references": {
            const { query, maxResults = 5 } = args as any;
            const bib = await searchAndFormatReferences(query, maxResults);
            const outPath = path.join(WORK_DIR, `refs_${Date.now()}.bib`);
            fs.writeFileSync(outPath, bib, "utf-8");
            return {
              content: [{ type: "text", text: bib, file: path.basename(outPath) }]
            };
          }
          case "generate_paper_section": {
            const { section, additionalInstructions, previousContent, autoRefs, refQuery, refCount = 5 } = args as any;
            let refContext = "";
            const bibPaths: string[] = [];
            // Automatically search for references
            if (autoRefs) {
              const query = refQuery || buildRefQuery(section, additionalInstructions, previousContent);
              try {
                const { refPrompt, bibContent } = await buildRefContext(query, refCount);
                if (refPrompt.trim()) {
                  refContext = refPrompt;
                  // Save the BibTeX file
                  const bibFileName = `refs_${Date.now()}.bib`;
                  fs.writeFileSync(path.join(WORK_DIR, bibFileName), bibContent, "utf-8");
                  bibPaths.push(bibFileName);
                }
              } catch (err) {
                console.warn("Automatic reference search failed:", err);
              }
            }
            // Build the system prompt
            const systemPrompt = `You are an academic writer specializing in lower-limb joint moment estimation using deep learning. Write a ${section} section in LaTeX format suitable for MDPI template. Use professional, coherent English. ${autoRefs ? 'You MUST cite the provided references at appropriate places using \\cite{id}.' : ''}`;
            let userPrompt = `Generate the ${section} section for a paper on deep learning-based joint moment estimation.`;
            if (previousContent) {
              userPrompt += `\n\nContext from previous sections:\n${previousContent.slice(0, 1000)}`;
            }
            if (additionalInstructions) {
              userPrompt += `\n\nAdditional instructions: ${additionalInstructions}`;
            }
            if (refContext) {
              userPrompt += `\n\n${refContext}\n\nPlease incorporate citations naturally in the generated text.`;
            }
            userPrompt += `\n\nOutput only the LaTeX content for this section.`;
            const sectionText = await callOpenRouterAPI(userPrompt, systemPrompt);
            const outPath = path.join(WORK_DIR, `${section}_${Date.now()}.tex`);
            fs.writeFileSync(outPath, sectionText, "utf-8");
            let resultText = sectionText;
            if (bibPaths.length > 0) {
              resultText += `\n\n📚 Reference file saved: ${bibPaths.join(", ")}`;
            }
            return {
              content: [{
                type: "text",
                text: resultText,
                file: path.basename(outPath),
                ...(bibPaths.length > 0 && { bibFiles: bibPaths })
              }]
            };
          }
          case "search_arxiv": {
            const { query, maxResults = 5 } = args as { query: string; maxResults?: number };
            const results = await searchArxivPapers(query, maxResults);
            return {
              content: [{
                type: "text",
                text: `Found ${results.papers.length} related papers (${results.totalResults} in total):\n\n${results.papers.map((paper, index) =>
                  `${index + 1}. **${paper.title}**\n   ID: ${paper.id}\n   Published: ${paper.published}\n   Authors: ${paper.authors.map((author: any) => author.name || author).join(', ')}\n   Abstract: ${paper.summary.substring(0, 300)}...\n   URL: ${paper.url}\n`
                ).join('\n')}`
              }]
            };
          }
          default:
            throw new Error(`Unknown tool: ${name}`);
        }
      } catch (error) {
        return {
          content: [{
            type: "text",
            text: `Tool execution failed: ${error instanceof Error ? error.message : String(error)}`
          }],
          isError: true
        };
      }
    });
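
The parse, dispatch, and return steps can also be expressed with a lookup table instead of a switch, which keeps each tool's logic in its own function. The following is only a minimal sketch of that idea: `handlers`, `dispatch`, and the `echo` tool are hypothetical, and the handlers are synchronous purely to keep the example short (the real ones are async).

```typescript
// Shape of a tool result as used in this article.
type ToolResult = {
  content: Array<{ type: "text"; text: string }>;
  isError?: boolean;
};

type ToolHandler = (args: Record<string, unknown>) => ToolResult;

// Hypothetical handler table: one entry per tool name.
const handlers = new Map<string, ToolHandler>();
handlers.set("echo", (args: Record<string, unknown>): ToolResult => ({
  content: [{ type: "text", text: String(args.message ?? "") }],
}));

// Parse the request, dispatch by tool name, and convert any thrown
// error into an isError result instead of letting it escape.
function dispatch(params: {
  name: string;
  arguments?: Record<string, unknown>;
}): ToolResult {
  const { name, arguments: args = {} } = params;
  try {
    const handler = handlers.get(name);
    if (!handler) throw new Error(`Unknown tool: ${name}`);
    return handler(args);
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return {
      content: [{ type: "text", text: `Tool execution failed: ${message}` }],
      isError: true,
    };
  }
}

const ok = dispatch({ name: "echo", arguments: { message: "hello" } });
const failed = dispatch({ name: "no_such_tool" });
console.log(ok.content.map((c) => c.text).join(""), failed.isError);
```

Returning `isError: true` rather than throwing lets the calling model read the failure message and decide whether to retry with different arguments.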

4. Utility and helper functions

    import axios from "axios";

    // arxivClient, OPENROUTER_API_URL, OPENROUTER_MODEL and OPENROUTER_API_KEY
    // are defined elsewhere in the project.

    // Search for papers and format them as BibTeX entries
    async function searchAndFormatReferences(query: string, maxResults: number): Promise<string> {
      const results = await searchArxivPapers(query, maxResults);
      const bibEntries: string[] = [];
      for (const paper of results.papers) {
        const entry = `@article{${paper.id},
      title = {${paper.title}},
      author = {${paper.authors.map((a: any) => a.name).join(' and ')}},
      journal = {arXiv preprint},
      year = {${paper.published.slice(0, 4)}},
      eprint = {${paper.id}},
      archivePrefix = {arXiv},
      primaryClass = {cs.LG},
      url = {${paper.url}}
    }`;
        bibEntries.push(entry);
      }
      return bibEntries.join('\n\n');
    }

    // Call the OpenRouter API
    async function callOpenRouterAPI(prompt: string, systemPrompt?: string): Promise<string> {
      try {
        const messages: Array<{ role: string; content: string }> = [];
        if (systemPrompt) {
          messages.push({ role: "system", content: systemPrompt });
        }
        messages.push({ role: "user", content: prompt });
        const response = await axios.post(OPENROUTER_API_URL, {
          model: OPENROUTER_MODEL,
          messages: messages,
          stream: false,
          max_tokens: 8192,
          temperature: 0.7,
          top_p: 0.7,
        }, {
          headers: {
            "Authorization": `Bearer ${OPENROUTER_API_KEY}`,
            "Content-Type": "application/json",
            "HTTP-Referer": "http://localhost", // added as OpenRouter requests
            "X-Title": "Mdpi Writer MCP Server"
          }
        });
        return response.data.choices[0].message.content;
      } catch (error) {
        console.error("Error calling the OpenRouter API:", error);
        throw new Error(`AI call failed: ${error instanceof Error ? error.message : String(error)}`);
      }
    }

    // Build the literature search query
    function buildRefQuery(section: string, additionalInstructions?: string, textContent?: string): string {
      const baseQuery = "lower limb joint moment estimation deep learning";
      const sectionKeywords: Record<string, string> = {
        abstract: "summary abstract",
        introduction: "introduction background",
        related_work: "related work literature review",
        method: "method model architecture",
        experiments: "experiment dataset evaluation",
        results: "results analysis",
        discussion: "discussion implications",
        conclusion: "conclusion future work"
      };
      let query = baseQuery;
      if (sectionKeywords[section]) query += " " + sectionKeywords[section];
      if (additionalInstructions) query += " " + additionalInstructions;
      if (textContent) {
        // Use the first 200 characters as a supplement
        const snippet = textContent.slice(0, 200).replace(/[^a-zA-Z0-9\s]/g, " ");
        query += " " + snippet;
      }
      return query.slice(0, 300); // cap the length
    }

    // Build the citation context: a prompt fragment plus the BibTeX content
    async function buildRefContext(query: string, maxResults: number): Promise<{
      refPrompt: string;
      bibContent: string;
    }> {
      const results = await searchArxivPapers(query, maxResults);
      const bibEntries: string[] = [];
      const summaries: string[] = [];
      for (const paper of results.papers) {
        bibEntries.push(`@article{${paper.id},
      title = {${paper.title}},
      author = {${paper.authors.map((a: any) => a.name).join(' and ')}},
      journal = {arXiv preprint},
      year = {${paper.published.slice(0, 4)}},
      eprint = {${paper.id}},
      archivePrefix = {arXiv},
      primaryClass = {cs.LG},
      url = {${paper.url}}
    }`);
        summaries.push(`- \\cite{${paper.id}}: ${paper.title}. ${paper.summary.substring(0, 200)}...`);
      }
      const refPrompt = `**Relevant literature for citation (use \\cite{id} in the text):**\n${summaries.join('\n')}`;
      const bibContent = bibEntries.join('\n\n');
      return { refPrompt, bibContent };
    }

    // Supported reference file extensions
    const REFERENCE_EXTS = ['.txt', '.md', '.tex', '.bib', '.pdf'];

    // Read reference contents (the text is truncated here; adjust as needed)
    async function readReferenceContents(dirPath: string): Promise<string> {
      if (!fs.existsSync(dirPath)) {
        throw new Error(`Directory does not exist: ${dirPath}`);
      }
      const files = fs.readdirSync(dirPath);
      let combinedText = "";
      for (const file of files) {
        const ext = path.extname(file).toLowerCase();
        if (!REFERENCE_EXTS.includes(ext)) continue;
        const filePath = path.join(dirPath, file);
        const content = fs.readFileSync(filePath, "utf-8");
        // Keep only the first 5000 characters to avoid using too many tokens
        combinedText += `\n\n--- File: ${file} ---\n${content.slice(0, 5000)}`;
      }
      if (!combinedText) {
        throw new Error("No readable reference files in the directory (supported: .txt, .md, .tex, .bib, .pdf (text content))");
      }
      return combinedText;
    }

    // Analyze the writing style
    async function analyzeWritingStyle(referenceText: string): Promise<string> {
      const systemPrompt = "You are an expert in academic writing analysis.";
      const userPrompt = `Analyze the following academic writing samples and summarize the key characteristics of the writing style. Include aspects such as:
    - Sentence structure (e.g., long and complex vs. short and direct)
    - Use of technical terminology and jargon
    - Tone (e.g., formal, objective, cautious)
    - Paragraph organization and flow
    - Typical expressions or phrase patterns

    Provide a concise paragraph describing the style, which can be used as guidance for generating new text in the same style.

    Samples:
    ${referenceText}`;
      return await callOpenRouterAPI(userPrompt, systemPrompt);
    }

    // Utility: search arXiv papers
    async function searchArxivPapers(query: string, maxResults: number = 5): Promise<{ totalResults: number; papers: any[] }> {
      try {
        const results = await arxivClient.search({
          start: 0,
          searchQuery: {
            include: [
              { field: "all", value: query }
            ]
          },
          maxResults: maxResults
        });
        const papers = results.entries.map((entry: any) => {
          const urlParts = entry.url.split('/');
          const arxivId = urlParts[urlParts.length - 1];
          return {
            id: arxivId,
            url: entry.url,
            title: entry.title.replace(/\s+/g, ' ').trim(),
            summary: entry.summary.replace(/\s+/g, ' ').trim(),
            published: entry.published,
            authors: entry.authors || []
          };
        });
        return {
          totalResults: results.totalResults,
          papers: papers
        };
      } catch (error) {
        console.error("Error searching arXiv papers:", error);
        throw new Error(`Search failed: ${error instanceof Error ? error.message : String(error)}`);
      }
    }
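
searchAndFormatReferences and buildRefContext both build the same BibTeX entry, so that template could live in one shared function. Here is a minimal sketch under that assumption. `Paper` and `toBibEntry` are hypothetical names, the sample record is made up, and the field values simply follow the article's code (including the fixed `primaryClass`).

```typescript
// Hypothetical paper shape, matching the objects returned by searchArxivPapers.
interface Paper {
  id: string;
  url: string;
  title: string;
  published: string; // date string whose first four characters are the year
  authors: Array<{ name: string }>;
}

// Build one BibTeX entry; the arXiv ID doubles as the citation key.
function toBibEntry(paper: Paper): string {
  const authors = paper.authors.map((a) => a.name).join(" and ");
  return [
    `@article{${paper.id},`,
    `  title = {${paper.title}},`,
    `  author = {${authors}},`,
    `  journal = {arXiv preprint},`,
    `  year = {${paper.published.slice(0, 4)}},`,
    `  eprint = {${paper.id}},`,
    `  archivePrefix = {arXiv},`,
    `  primaryClass = {cs.LG},`,
    `  url = {${paper.url}}`,
    `}`,
  ].join("\n");
}

// Made-up sample record, for illustration only.
const sample: Paper = {
  id: "0000.00000v1",
  url: "http://arxiv.org/abs/0000.00000v1",
  title: "An Example Title",
  published: "2024-05-01T00:00:00Z",
  authors: [{ name: "A. Author" }, { name: "B. Author" }],
};

console.log(toBibEntry(sample));
```

With a helper like this, both functions could map `results.papers` through `toBibEntry` and join the entries with blank lines, so a change to the entry format would only need to be made once.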

Importing the MCP Service

See my earlier article, which explains in detail how to import an MCP service in VS Code.

Link: Importing an MCP service in VS Code [1]

Sample Results

  • Abstract generation
  • Citations
  • BibTeX reference list generation

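Closing Thoughts

This whole workflow burns quite a few tokens, and it can basically only produce a rough Introduction, so it may not be as fast as working directly with a CLI or a web chat interface. Still, most of the retrieved references are usable, and the citations mostly match the references accurately. If you are interested in the project, see the project link [2].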

References

  1. Importing an MCP service in VS Code: https://zhuanlan.zhihu.com/p/2008950958814672364
  2. Project link: https://github.com/yeffky/mdpi-writer-mcp-server/