OpenClaw 面临架构大考,VLM成广告走私新渠道

2026-03-23 ~ 2026-03-30 · arXiv cs.CR · 共 174 篇

上周（3月23日至30日）arXiv 安全方向（cs.CR）共收录 174 篇新论文。其中 75 篇（约 43%） 聚焦核心 AI 大模型与 Agent 安全。本周最大的变化是：针对底层 Agent 架构（特别是 OpenClaw）的系统级漏洞分类首次系统化公开，同时 VLM (视觉语言模型) 开始遇到极具实战性的“语义级后门”注入威胁。

174本周总论文

75AI 核心安全

43%占比

2OpenClaw 专题

🔐 核心聚焦：AI 基础设施的真实脆弱面

本周的安全研究高度集中在“架构层”和“语义层”的深水区。研究者们不再局限于简单的提示词注入，而是开始系统性清算现有 Agent 框架底层的隔离设计缺陷，并探索了 VLM 的语义攻击盲区。

为何我们要特别关注“分类学”（Taxonomy）研究？
面对 OpenClaw 这样复杂的系统，单一的 CVE 漏洞往往只是冰山一角。本周的重大研究通过分析横跨沙箱、插件、提示词等架构层的190多个安全漏洞，揭开了“无需认证权限即可完成远程代码执行 (RCE)”的系统性隐患。

精选论文：漏洞体系、语义后门与去中心化操纵

🎫 A Systematic Taxonomy of Security Vulnerabilities in the OpenClaw Framework

提出了首个针对 OpenClaw 运行时的漏洞全景分类图。它发现仅仅通过插件和网关中看似不高的低危漏洞进行组合，攻击者可以构建出完整的 RCE 攻击链，并指出其底层的命令白名单很容易被类似于 busybox 或 shell option 的特性绕过。

📄 2603.27517v1

框架安全

🚧 Hidden Ads: Behavior Triggered Semantic Backdoors for VLM

提出了一种令人毛骨悚然的 VLM (视觉语言模型) 后门注入方式：不需要传统的诡异像素块作为触发器，而是利用“自然用户行为”。当用户上传如“美食”等语义相关的图片时，预埋了后门的推荐模型会在回答的尾部自动拼接上恶意的推广和植入广告。

📄 2603.27522v1

视觉后门

⏪ Ordering Power is Sanctioning Power: Sanction Evasion-MEV

重新审视了加密领域的制裁有效性。揭示了当 USDT/USDC 主体发起针对黑名单地址的冻结时，黑客可以通过 MEV（最大可提取价值）贿赂矿工节点，抢在冻结交易打包前强行将被制裁资产转移的黑暗产业链（成功率惊人）。

📄 2603.27739v1

链上逃逸

📊 本周 AI 安全全貌概览

子领域	论文数	占比
AI Agent 架构与自动化攻防	28	16%
大模型安全、对齐与越狱注入	22	12%
AI 数据隐私与边缘联邦学习	25	14%
Web3、DeFi与区块链治理	11	6%
前沿密码学与量子女巫抵抗	13	7%
AI赋能的下一代威胁感知	9	5%
系统底座、硬件与云原生安全	12	6%
基础安全前沿杂项	54	31%

💡 编者观察

本周研究观察到的行业转向信号：

⚠ 框架底层防御的匮乏
Agent框架不能单纯信赖执行沙箱。OpenClaw 本周被爆出的系统级组合漏洞，证明了只在隔离网关加一层权限白名单是不起作用的。参数注入一旦发生，整个调用链都会被迫配合执行。

🔬 语义型后门的暗网化
多模态模型的后门攻击已经极其优雅，从贴图攻击演化为了“自然行为触发”。对于 VLM 的商业化落地，这种植入恶性连带信息（广告、造黄谣等）的隐式后门攻击将是巨大的信任灾难。

📚 本周论文全列表 (174 篇)

AI Agent 架构与自动化攻防 (28)

Towards Context-Aware Image Anonymization with Multi-Agent Reasoning[2603.27817v1]
A Systematic Taxonomy of Security Vulnerabilities in the OpenClaw AI Agent Framework[2603.27517v1]
SkillTester: Benchmarking Utility and Security of Agent Skills[2603.28815v1]
SafetyDrift: Predicting When AI Agents Cross the Line Before They Actually Do[2603.27148v1]
SafeClaw-R: Towards Safe and Secure Multi-Agent Personal Assistants[2603.28807v1]
Red-MIRROR: Agentic LLM-based Autonomous Penetration Testing with Reflective Verification and Knowledge-augmented Interaction[2603.27127v1]
Hermes Seal: Zero-Knowledge Assurance for Autonomous Vehicle Communications[2603.26343v1]
Knowdit: Agentic Smart Contract Vulnerability Detection with Auditing Knowledge Summarization[2603.26270v1]
Clawed and Dangerous: Can We Trust Open Agentic Systems?[2603.26221v1]
AVDA: Autonomous Vibe Detection Authoring for Cybersecurity[2603.25930v2]
From Logic Monopoly to Social Contract: Separation of Power and the Institutional Foundations for Autonomous Agent Economies[2603.25100v1]
The System Prompt Is the Attack Surface: How LLM Agent Configuration Shapes Security and Creates Exploitable Vulnerabilities[2603.25056v1]
AIP: Agent Identity Protocol for Verifiable Delegation Across MCP and A2A[2603.24775v1]
Infrastructure for Valuable, Tradable, and Verifiable Agent Memory[2603.24564v1]
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers[2603.24414v1]
AgentRFC: Security Design Principles and Conformance Testing for Agent Protocols[2603.23801v1]
The Cognitive Firewall:Securing Browser Based AI Agents Against Indirect Prompt Injection Via Hybrid Edge Cloud Defense[2603.23791v1]
RTS-ABAC: Real-Time Server-Aided Attribute-Based Authorization & Access Control for Substation Automation Systems[2603.23012v1]
AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screenshots-based Mobile GUI Agents[2603.23007v1]
SoK: The Attack Surface of Agentic AI -- Tools, and Autonomy[2603.22928v1]
Agent-Sentry: Bounding LLM Agents via Execution Provenance[2603.22868v1]
Agent Audit: A Security Analysis System for LLM Agent Applications[2603.22853v1]
Observable Channels, Not Just Storage: Evaluating Privacy Leakage in LLM Agent Pipelines[2603.22751v2]
CAPTCHA Solving for Native GUI Agents: Automated Reasoning-Action Data Generation and Self-Corrective Training[2603.23559v1]
STRIATUM-CTF: A Protocol-Driven Agentic Framework for General-Purpose CTF Solving[2603.22577v1]
Model Context Protocol Threat Modeling and Analyzing Vulnerabilities to Prompt Injection with Tool Poisoning[2603.22489v1]
Are AI-assisted Development Tools Immune to Prompt Injection?[2603.21642v1]
Auditing MCP Servers for Over-Privileged Tool Capabilities[2603.21641v1]

大模型安全、对齐与越狱注入 (22)

Hidden Ads: Behavior Triggered Semantic Backdoors for Advertisement Injection in Vision Language Models[2603.27522v1]
GUARD-SLM: Token Activation-Based Defense Against Jailbreak Attacks for Small Language Models[2603.28817v1]
Sovereign Context Protocol: An Open Attribution Layer for Human-Generated Content in the Age of Large Language Models[2603.27094v1]
Reentrancy Detection in the Age of LLMs[2603.26497v1]
Protecting User Prompts Via Character-Level Differential Privacy[2603.26032v1]
Unveiling the Resilience of LLM-Enhanced Search Engines against Black-Hat SEO Manipulation[2603.25500v1]
Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Models[2603.25412v1]
Shape and Substance: Dual-Layer Side-Channel Attacks on Local Vision-Language Models[2603.25403v2]
PIDP-Attack: Combining Prompt Injection with Database Poisoning Attacks on Retrieval-Augmented Generation Systems[2603.25164v1]
IrisFP: Adversarial-Example-based Model Fingerprinting with Enhanced Uniqueness and Robustness[2603.24996v2]
Bridging Code Property Graphs and Language Models for Program Analysis[2603.24837v1]
Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs[2603.24511v1]
Policy-Guided Threat Hunting: An LLM enabled Framework with Splunk SOC Triage[2603.23966v2]
How Vulnerable Are Edge LLMs?[2603.23822v1]
Leveraging Large Language Models for Trustworthiness Assessment of Web Applications[2603.23781v1]
Not All Tokens Are Created Equal: Query-Efficient Jailbreak Fuzzing for LLMs[2603.23269v1]
Robust Safety Monitoring of Language Models via Activation Watermarking[2603.23171v2]
Does Teaming-Up LLMs Improve Secure Code Generation? A Comprehensive Evaluation with Multi-LLMSecCodeEval[2603.22717v1]
BioShield: A Context-Aware Firewall for Securing Bio-LLMs[2603.22612v1]
OrgForge-IT: A Verifiable Synthetic Benchmark for LLM-Based Insider Threat Detection[2603.22499v1]
Evaluating the Reliability and Fidelity of Automated Judgment Systems of Large Language Models[2603.22214v1]
Structured Visual Narratives Undermine Safety Alignment in Multimodal Large Language Models[2603.21697v1]

AI 数据隐私与边缘联邦学习 (25)

SNEAKDOOR: Stealthy Backdoor Attacks against Distribution Matching-based Dataset Condensation[2603.28824v1]
Gender-Based Heterogeneity in Youth Privacy-Protective Behavior for Smart Voice Assistants: Evidence from Multigroup PLS-SEM[2603.27117v1]
Privacy-Preserving Iris Recognition: Performance Challenges and Outlook[2603.26890v1]
Towards Privacy-Preserving Federated Learning using Hybrid Homomorphic Encryption[2603.26417v1]
Privacy-Enhancing Encryption in Data Sharing: A Survey on Security, Performance and Functionality[2603.26224v1]
EPDQ: Efficient and Privacy-Preserving Exact Distance Query on Encrypted Graphs[2603.26219v1]
Gaussian Shannon: High-Precision Diffusion Model Watermarking Based on Communication[2603.26167v1]
Not All Entities are Created Equal: A Dynamic Anonymization Framework for Privacy-Preserving Retrieval-Augmented Generation[2603.26074v1]
Supercharging Federated Intelligence Retrieval[2603.25374v1]
On the Vulnerability of Deep Automatic Modulation Classifiers to Explainable Backdoor Threats[2603.25310v1]
Physical Backdoor Attack Against Deep Learning-Based Modulation Classification[2603.25304v1]
An Explainable Federated Framework for Zero Trust Micro-Segmentation in IIoT Networks[2603.24754v1]
Amplified Patch-Level Differential Privacy for Free via Random Cropping[2603.24695v1]
PAC-DP: Personalized Adaptive Clipping for Differentially Private Federated Learning[2603.24003v1]
An Empirical Analysis of Google Play Data Safety Disclosures: A Consistency Study of Privacy Indicators in Mobile Gaming Apps[2603.23935v1]
Byzantine-Robust and Differentially Private Federated Optimization under Weaker Assumptions[2603.23472v1]
PRETTINESS -- Privacy pResErving aTTrIbute maNagEment SyStem[2603.23221v1]
Privacy-Aware Smart Cameras: View Coverage via Socially Responsible Coordination[2603.23197v1]
Multi-User Multi-Key Image Steganography with Key Isolation[2603.23005v1]
A Critical Review on the Effectiveness and Privacy Threats of Membership Inference Attacks[2603.22987v1]
Beyond Theoretical Bounds: Empirical Privacy Loss Calibration for Text Rewriting Under Local Differential Privacy[2603.22968v1]
Privacy-Preserving EHR Data Transformation via Geometric Operators: A Human-AI Co-Design Technical Report[2603.22954v1]
Combinatorial Privacy: Private Multi-Party Bitstream Grand Sum by Hiding in Birkhoff Polytopes[2603.22808v4]
In-network Attack Detection with Federated Deep Learning in IoT Networks: Real Implementation and Analysis[2603.21596v1]
Hardening Confidential Federated Compute against Side-channel Attacks[2603.21469v1]

Web3、DeFi与区块链治理 (11)

Ordering Power is Sanctioning Power: Sanction Evasion-MEV and the Limits of On-Chain Enforcement[2603.27739v1]
HFIPay: Privacy-Preserving, Cross-Chain Cryptocurrency Payments to Human-Friendly Identifiers[2603.26970v1]
Auditing Blockchain Innovations: Technical Challenges Beyond Traditional Finance[2603.26361v1]
Bitcoin Smart Accounts: Trust-Minimized Native Bitcoin DeFi Infrastructure[2603.26293v1]
PEB Separation and State Migration: Unmasking the New Frontiers of DeFi AML Evasion[2603.26290v1]
zk-X509: Privacy-Preserving On-Chain Identity from Legacy PKI via Zero-Knowledge Proofs[2603.25190v2]
SolRugDetector: Investigating Rug Pulls on Solana[2603.24625v1]
An Adaptive Neuro-Fuzzy Blockchain-AI Framework for Secure and Intelligent FinTech Transactions[2603.23829v1]
n-VM: A Multi-VM Layer-1 Architecture with Shared Identity and Token State[2603.23670v1]
Albank -- a case study on the use of ethereum blockchain technology and smart contracts for secure decentralized bank application[2603.21894v1]
Connecting Distributed Ledgers: Surveying Novel Interoperability Solutions in On-chain Finance[2603.21797v1]

前沿密码学与量子女巫抵抗 (13)

Quantum Bit Error Rate Analysis in BB84 Quantum Key Distribution: Measurement, Statistical Estimation, and Eavesdropping Detection[2603.27278v1]
Attacks on Sparse LWE and Sparse LPN with new Sample-Time tradeoffs[2603.27190v1]
Information-Theoretic Solutions for Seedless QRNG Bootstrapping and Hybrid PQC-QKD Key Combination[2603.26907v1]
Cryptanalysis of a PIR Scheme based on Linear Codes over Rings[2603.26409v1]
Send the Key in Cleartext: Halving Key Consumption while Preserving Unconditional Security in QKD Authentication[2603.25496v1]
Efficient ML-DSA Public Key Management Method with Identity for PKI and Its Application[2603.25043v1]
IPsec based on Quantum Key Distribution: Adapting non-3GPP access to 5G Networks to the Quantum Era[2603.24426v1]
Efficient Encrypted Computation in Convolutional Spiking Neural Networks with TFHE[2603.26781v1]
On the Vulnerability of FHE Computation to Silent Data Corruption[2603.23253v1]
mmFHE: mmWave Sensing with End-to-End Fully Homomorphic Encryption[2603.22437v1]
Asymptotically Ideal Hierarchical Secret Sharing Based on CRT for Integer Ring[2603.22011v1]
Asymptotically Ideal Conjunctive Hierarchical Secret Sharing Scheme Based on CRT for Polynomial Ring[2603.22001v1]
Q-AGNN: Quantum-Enhanced Attentive Graph Neural Network for Intrusion Detection[2603.22365v1]

AI赋能的下一代威胁感知 (9)

Context-Aware Phishing Email Detection Using Machine Learning and NLP[2603.27326v1]
Machine Learning Transferability for Malware Detection[2603.26632v1]
Understanding AI Methods for Intrusion Detection and Cryptographic Leakage[2603.25826v1]
CANGuard: A Spatio-Temporal CNN-GRU-Attention Hybrid Architecture for Intrusion Detection in In-Vehicle CAN Networks[2603.25763v1]
Targeted Adversarial Traffic Generation : Black-box Approach to Evade Intrusion Detection Systems in IoT Networks[2603.23438v1]
An Experimental Study of Machine Learning-Based Intrusion Detection for OPC UA over Industrial Private 5G Networks[2603.23416v1]
Security Barriers to Trustworthy AI-Driven Cyber Threat Intelligence in Finance: Evidence from Practitioners[2603.23304v1]
How Far Should We Need to Go : Evaluate Provenance-based Intrusion Detection Systems in Industrial Scenarios[2603.22982v1]
TLS Certificate and Domain Feature Analysis of Phishing Domains in the Danish .dk Namespace[2603.21652v1]

系统底座、硬件与云原生安全 (12)

Finding Memory Leaks in C/C++ Programs via Neuro-Symbolic Augmented Static Analysis[2603.27224v2]
SPARK: Secure Predictive Autoscaling for Robust Kubernetes[2603.26833v1]
Disguising Topology and Side-Channel Information through Covert Gate- and ML-Enabled IP Camouflaging[2603.25904v1]
ALPS: Automated Least-Privilege Enforcement for Securing Serverless Functions[2603.25393v1]
Design and Development of an ML/DL Attack Resistance of RC-Based PUF for IoT Security[2603.28798v1]
Towards Remote Attestation of Microarchitectural Attacks: The Case of Rowhammer[2603.24172v2]
Walma: Learning to See Memory Corruption in WebAssembly[2603.24167v1]
Toward a Multi-Layer ML-Based Security Framework for Industrial IoT[2603.24111v2]
Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution[2603.23064v2]
Explainable Threat Attribution for IoT Networks Using Conditional SHAP and Flow Behavior Modelling[2603.22771v1]
Semi-Automated Threat Modeling of Cloud-Based Systems Through Extracting Software Architecture from Configuration and Network Flow[2603.22603v1]
Framework for Risk-Based IoT Cybersecurity Audit Engagements[2603.22191v1]

基础安全前沿杂项 (54)

Decentralized Proof-of-Location for Content Provenance: Towards Capture-Time Authenticity[2603.27883v1]
Attacking AI Accelerators by Leveraging Arithmetic Properties of Addition[2603.27439v1]
"Elementary, My Dear Watson." Detecting Malicious Skills via Neuro-Symbolic Reasoning across Heterogeneous Artifacts[2603.27204v1]
Detecting Protracted Vulnerabilities in Open Source Projects[2603.27067v1]
On the Optimal Number of Grids for Differentially Private Non-Interactive $K$-Means Clustering[2603.26963v1]
Evolution-Based Timed Opacity under a Universal Observation Model[2603.26573v1]
Hidden Elo: Private Matchmaking through Encrypted Rating Systems[2603.26407v2]
ROAST: Risk-aware Outlier-exposure for Adversarial Selective Training of Anomaly Detectors Against Evasion Attacks[2603.26093v1]
A Large-scale Empirical Study on the Generalizability of Disclosed Java Library Vulnerability Exploits[2603.25997v1]
Neighbor-Aware Localized Concept Erasure in Text-to-Image Diffusion Models[2603.25994v1]
Why Safety Probes Catch Liars But Miss Fanatics[2603.25861v1]
TAAC: A gate into Trustable Audio Affective Computing[2603.25570v1]
Multi-target Coverage-based Greybox Fuzzing[2603.25354v1]
Second order Recurrences, quadratic number fields and cyclic codes[2603.25343v1]
Usability of Passwordless Authentication in Wi-Fi Networks: A Comparative Study of Passkeys and Passwords in Captive Portals[2603.25290v1]
Mitigating Evasion Attacks in Fog Computing Resource Provisioning Through Proactive Hardening[2603.25257v1]
A Public Theory of Distillation Resistance via Constraint-Coupled Reasoning Architectures[2603.25022v1]
LiteGuard: Efficient Task-Agnostic Model Fingerprinting with Enhanced Generalization[2603.24982v1]
On the Foundations of Trustworthy Artificial Intelligence[2603.24904v1]
Sovereign AI at the Front Door of Care: A Physically Unidirectional Architecture for Secure Clinical Intelligence[2603.24898v1]
An Approach to Generate Attack Graphs with a Case Study on Siemens PCS7 Blueprint for Water Treatment Plants[2603.24888v1]
Trusted-Execution Environment (TEE) for Solving the Replication Crisis in Academia[2603.24878v1]
AI Security in the Foundation Model Era: A Comprehensive Survey from a Unified Perspective[2603.24857v1]
Analysing the Safety Pitfalls of Steering Vectors[2603.24543v1]
A Large-Scale Study of Telegram Bots[2603.24302v1]
Software Supply Chain Smells: Lightweight Analysis for Secure Dependency Management[2603.24282v2]
Attack Assessment and Augmented Identity Recognition for Human Skeleton Data[2603.24232v1]
Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search[2603.24203v1]
When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm[2603.24079v1]
Forensic Implications of Localized AI: Artifact Analysis of Ollama, LM Studio, and llama.cpp[2603.23996v1]
AetherWeave: Sybil-Resistant Robust Peer Discovery with Stake[2603.23793v1]
Space Fabric: A Satellite-Enhanced Trusted Execution Architecture[2603.23745v1]
CSTS: A Canonical Security Telemetry Substrate for AI-Native Cyber Detection[2603.23459v1]
Canonical Byte-String Encoding for Finite-Ring Cryptosystems[2603.23364v1]
What a Mesh: Formal Security Analysis of WPA3 SAE Wireless Authentication[2603.23352v1]
The Power of Power Codes: New Classes of Easy Instances for the Linear Equivalence Problem[2603.23230v1]
Gyokuro: Source-assisted Private Membership Testing using Trusted Execution Environments[2603.23226v1]
TRAP: Hijacking VLA CoT-Reasoning via Adversarial Patches[2603.23117v1]
Secure Two-Party Matrix Multiplication from Lattices and Its Application to Encrypted Control[2603.22857v1]
Digital Twin Enabled Simultaneous Learning and Modeling for UAV-assisted Secure Communications with Eavesdropping Attacks[2603.22753v1]
BlindMarket: Enabling Verifiable, Confidential, and Traceable IP Core Distribution in Zero-Trust Settings[2603.22685v1]
Precision-Varying Prediction (PVP): Robustifying ASR systems against adversarial attacks[2603.22590v1]
Tock: From Research to Securing 10 Million Computers[2603.22585v1]
Adversarial Vulnerabilities in Neural Operator Digital Twins: Gradient-Free Attacks on Nuclear Thermal-Hydraulic Surrogates[2603.22525v1]
CTF as a Service: A reproducible and scalable infrastructure for cybersecurity training[2603.22511v2]
Architecture-Derived CBOMs for Cryptographic Migration: A Security-Aware Architecture Tradeoff Method[2603.22442v1]
TALUS: Threshold ML-DSA with One-Round Online Signing via Boundary Clearance and Carry Elimination[2603.22109v2]
SecureBreak -- A dataset towards safe and secure models[2603.21975v1]
Publicly Understandable Electronic Voting: A Non-Cryptographic, End-to-End Verifiable Scheme[2603.21833v1]
Cybersecurity Guidance for Smart Homes: A Cross-National Review of Government Sources[2603.21703v1]
Bridges connecting Encryption Schemes[2603.21694v1]
Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks[2603.21654v1]
A Survey of Web Application Security Tutorials[2603.21556v1]
When the Abyss Looks Back: Unveiling Evolving Dark Patterns in Cookie Consent Banners[2603.21515v1]

数据来源：arXiv cs.CR，2026-03-23 至 2026-03-30 · 共 174 篇 · 自动化筛选 + 人工评述

如需本周全部论文列表或详细分析资料，欢迎留言交流。