
The humble PDF is becoming a problem for AI

PDFs are structurally hostile to large language models
Looking ahead: Three decades after Adobe introduced the Portable Document Format – a design intended to preserve the appearance of printed pages across devices – PDFs are facing pressure from a completely different kind of reader: artificial intelligence. The same fixed layouts that made PDFs indispensable to human users now make them difficult for large language models to interpret. Unlike web pages or plain-text files, PDFs contain columns, embedded graphics, and hidden metadata that often confuse machine parsing systems trained to process linear text.
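To make the column problem concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a page has already been reduced to positioned text fragments of the form (x, y, text). The fragment data, the function names, and the `split_x` column boundary are all hypothetical and do not come from any particular PDF library.

```python
# Hypothetical positioned fragments from a two-column page: (x, y, text),
# with y measured downward from the top of the page.
fragments = [
    (50, 100, "Left column, line 1."),
    (300, 100, "Right column, line 1."),
    (50, 120, "Left column, line 2."),
    (300, 120, "Right column, line 2."),
]


def naive_order(frags):
    """Read top to bottom, then left to right, ignoring columns."""
    ordered = sorted(frags, key=lambda f: (f[1], f[0]))
    return [text for _, _, text in ordered]


def column_aware_order(frags, split_x=175):
    """Read the whole left column, then the right one.

    split_x is an assumed, already-known column boundary; real
    documents would need it to be detected.
    """
    ordered = sorted(frags, key=lambda f: (f[0] >= split_x, f[1]))
    return [text for _, _, text in ordered]


if __name__ == "__main__":
    print(" ".join(naive_order(fragments)))
    print(" ".join(column_aware_order(fragments)))
```

The naive ordering interleaves lines from the two columns, which is the kind of scrambled input a model built for linear text may receive. The column-aware ordering only works here because the boundary is given, and finding that boundary in arbitrary layouts is the hard part.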