A clearer signal of how inventors learn and build on each other's ideas

© 2025 EPFL

A new study published in the prestigious Strategic Management Journal by EPFL researcher Gaétan de Rassenfosse and collaborators introduces a novel way to trace how knowledge flows among inventors and firms—by looking inside the body of patent documents. The authors show that “in-text” patent citations provide a much cleaner signal of where ideas originate than front-page citations, which researchers have traditionally relied on.

Scholars and policymakers widely use patent citations to map the diffusion of technical knowledge. Yet the conventional source—citations listed on the patent’s front page—often reflects legal formalities and examiner practices more than genuine inventor input.

The new study, co-authored by Cyril Verluise (QuantumBlack), Gabriele Cristelli (London School of Economics), Kyle Higham (Motu Economic and Public Policy Research), and Gaétan de Rassenfosse (EPFL), demonstrates that in-text citations tell a different and more authentic story.

“When a patent cites another patent within its technical description, it’s often because the inventor(s) relied on it,” explains de Rassenfosse. “These in-text citations act as footprints of inventive knowledge: direct traces of how ideas build on previous work.”

Analyzing nearly 50 million in-text citations across 8 million U.S. patents, the researchers find that these references link inventions that are closer to each other both geographically and thematically, two patterns that earlier research has treated as evidence of knowledge flows. Survey evidence from U.S. patent attorneys confirms the pattern: in-text citations are up to 44% more likely to originate with inventors than comparable front-page citations.

Despite being less frequent than front-page references, in-text citations prove statistically robust in economic models of knowledge diffusion.

“Front-page citations have long been treated as the paper trail of knowledge flows,” says de Rassenfosse. “Our work shows that part of that trail was hidden in plain sight—inside the patent text itself.”

Until recently, in-text citations were not available to researchers, so the team processed all U.S. patent documents to identify, extract, and disambiguate them. The team has also made the entire dataset and code publicly available, providing an open, reproducible foundation for future research on innovation dynamics and firm knowledge strategies.

The findings open new possibilities for studying how ideas spread, how firms reuse their own inventions, and how regions develop technological capabilities. They also offer managers and policymakers a sharper tool to understand the geography of knowledge and the mechanisms that drive cumulative innovation.