A cross-verified database of notable people, 3500BC-2018AD
Morgane Laouenan,
Palaash Bhargava,
Jean-Benoît Eyméoud,
Olivier Gergaud,
Guillaume Plique and
Etienne Wasmer
Additional contact information
Morgane Laouenan: CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique, LIEPP - Laboratoire interdisciplinaire d'évaluation des politiques publiques (Sciences Po) - Sciences Po - Sciences Po, CNRS - Centre National de la Recherche Scientifique
Palaash Bhargava: Department of Economics Columbia University - Columbia University [New York]
Jean-Benoît Eyméoud: LIEPP - Laboratoire interdisciplinaire d'évaluation des politiques publiques (Sciences Po) - Sciences Po - Sciences Po
Guillaume Plique: médialab - médialab (Sciences Po) - Sciences Po - Sciences Po, Kedge BS - Kedge Business School
Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) from HAL
Abstract:
A new strand of literature aims at building the most comprehensive and accurate database of notable individuals. We collect a massive amount of data from various editions of Wikipedia and Wikidata. Using deduplication techniques over these partially overlapping sources, we cross-verify each piece of retrieved information. For some variables, Wikipedia adds 15% more information when it is missing in Wikidata. We find very few errors in the part of the database that contains the most documented individuals but nontrivial error rates at the bottom of the notability distribution, due to sparse information and classification errors or ambiguity. Our strategy results in a cross-verified database of 2.29 million individuals (an elite of 1/43,000 of the human beings who have ever lived), including a third who are not present in the English edition of Wikipedia. Data collection is driven by specific social science questions on gender, economic growth, and urban and cultural development. We document an Anglo-Saxon bias present in the English edition of Wikipedia, and document when it matters and when it does not.
Date: 2022
New Economics Papers: this item is included in nep-big and nep-his
Note: View the original document on HAL open archive server: https://hal.science/hal-03930666
Published in Scientific Data, 2022, 9 (1), pp.290. ⟨10.1038/s41597-022-01369-4⟩
Downloads:
https://hal.science/hal-03930666/document (application/pdf)
Related works:
Working Paper: A cross-verified database of notable people, 3500BC-2018AD (2022)
Working Paper: A Cross-verified Database of Notable People, 3500BC-2018AD (2021)
Persistent link: https://EconPapers.repec.org/RePEc:hal:cesptp:hal-03930666
DOI: 10.1038/s41597-022-01369-4