Is the PDF Format Done For? | The Economist

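Background

When the PDF was launched in 1993 it was mocked as the dumbest of ideas, yet today it is the world's most widespread file format, with more than 2.5 trillion files in existence. In the AI era, however, this classic format faces a challenge.

PDFs are awkward to read on phones, hard to copy content from, unfriendly to accessibility tools, and often used to spread malware. More importantly, large language models struggle to parse complex PDF layouts correctly: they tend to get the reading order wrong or mistake headers and footers for body text, which is an important cause of AI “hallucinations”. In response, startups such as Factify are trying to develop a new file format better suited to AI, with the aim of replacing the PDF. The PDF Association and Adobe's experts, however, argue that the problem lies not in the format itself but in AI tools' limited ability to handle it. Adobe, Google and other companies have already released AI features that process PDFs better. Thanks to its ubiquity and compatibility, the PDF is set to remain mainstream in the AI era, and its reign is far from over.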

Word count: 390 words

【Para. 1】When Adobe introduced the portable document format (PDF) in 1993, a consultant from Gartner called it “the dumbest idea I’ve ever heard in my life”. Users would have to twiddle their thumbs waiting for the megabyte-sized files to download over their dial-up internet, then wait again for their PCs to render them. The software-maker’s board wanted to kill the project. But as sharing digital files became essential, the PDF triumphed, particularly after the Internal Revenue Service, America’s tax authority, started using it for its forms. Today more than 2.5trn PDFs float in the ether. But will the format survive the AI revolution?

【Para. 2】PDFs still have drawbacks. They are a pain to view on a smartphone. Copying data from them is fiddly. Software tools that read screens for blind people struggle with PDFs. The file type, which Adobe relinquished control over in 2008, is also a vehicle for malware: a fifth of email-based cyber-attacks utilise PDF attachments, according to Check Point, a cyber-security firm.

【Para. 3】Lately another source of criticism has emerged. The large language models underpinning generative AI are often bamboozled by PDFs, reading a page set in columns from left to right rather than top to bottom, say, or getting confused by headers and footers. Trouble parsing PDFs is one of the reasons AI chatbots occasionally “hallucinate”, generating nonsense.

【Para. 4】Enter the disrupters. Startups such as Factify are on a mission to build a new file type that is better suited to the technology. Matan Gavish, its boss, talks of his “megalomaniac” vision of displacing the PDF.

【Para. 5】Yet Duff Johnson, head of the PDF Association, protector of the format, argues that the fault lies not in the file type but in ourselves. He contends that there is no reason developers cannot build bots that are able to use PDFs. The AI assistant embedded in Acrobat, Adobe’s PDF reader, is designed to do precisely that, notes Leonard Rosenthol, the software firm’s PDF guru. Google, a leader in AI, has rolled out a tool for developers using its Gemini models that makes it easier to ingest PDFs. The format’s reign is not over yet.

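【Disclaimer】: This article is excerpted from The Economist. Copyright in the original belongs to the magazine; it is reproduced here for personal study and exchange only.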
