Name-gender coding

A thin wrapper around the World Gender Name Dictionary 2.0.

The World Gender Name Dictionary 2.0 (WGND 2.0) is a dataset of name-gender pairs. It was originally produced to help historians of science and intellectual property measure “women’s contribution to all fields of innovation and creativity.” The WGND 2.0 contains “26 million records linking given names and 195 different countries and territories.”

This implementation is limited. It only includes name-gender pairs for names that have no conflicts across sources, geography, and gender. Put differently, this wrapper only reports the gendered valence of a name when there is no controversy within the larger WGND 2.0 database.
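The filtering idea described above can be illustrated with a minimal sketch. It is not the code that produced the WGND 2.0 name-gender file: the function name `unambiguous_names`, the `(name, gender)` record layout, and the sample rows are all assumptions made up for illustration.

```python
from collections import defaultdict


def unambiguous_names(records):
    """Return {name: gender} for names that map to exactly one gender.

    `records` is an iterable of (name, gender) pairs. This layout is a
    hypothetical simplification, not the WGND 2.0 file format.
    """
    genders_by_name = defaultdict(set)
    for name, gender in records:
        genders_by_name[name].add(gender)
    # Keep a name only if every record for it agrees on a single gender.
    return {
        name: next(iter(genders))
        for name, genders in genders_by_name.items()
        if len(genders) == 1
    }


# Made-up sample rows; "Vic" appears with two different labels.
sample = [
    ("Simon", "M"),
    ("Simon", "M"),
    ("Dana", "F"),
    ("Vic", "M"),
    ("Vic", "F"),
]
clean = unambiguous_names(sample)
print(clean)  # {'Simon': 'M', 'Dana': 'F'}
```

In this sketch a name with any disagreement is dropped entirely rather than assigned a majority label, which mirrors the conservative behavior described above.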

Demonstration

Initialize the program

from obiter.wgnd import wgnd
database = wgnd()
Downloading data from WGND2.0
WGND 2.0 name-gender (_i.e._ No code) contains 3,491,141 unique name observations. 
This file is based on WGND 2.0 name-gender-code but it omits all known conflicting names across sources, geography and gender.

Read about the project here: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/MSEGSJ

Dataset citation: Raffo, Julio, 2021, "WGND 2.0", https://doi.org/10.7910/DVN/MSEGSJ, Harvard Dataverse, V1, UNF:6:5rI3h1mXzd6zkVhHurelLw== [fileUNF]
Data downloaded
print(database.get_gender('Simon'))
M
print(database.get_gender('Dana'))
F
print(database.get_gender('Vic'))
unknown
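
When coding many names at once, it can help to count how many come back as M, F, or unknown. The sketch below is one possible way to do that. The helper `tally_genders` and the stand-in `fake_lookup` are hypothetical and not part of obiter. The stand-in exists only so the example runs without downloading the data, and it copies the three results shown in the demonstration.

```python
from collections import Counter


def tally_genders(names, lookup):
    """Count the labels that `lookup` returns for each name in `names`."""
    return Counter(lookup(name) for name in names)


# Stand-in lookup for illustration only. With the real data one might
# pass database.get_gender from the demonstration instead.
_demo_labels = {"Simon": "M", "Dana": "F"}


def fake_lookup(name):
    return _demo_labels.get(name, "unknown")


counts = tally_genders(["Simon", "Dana", "Vic", "Simon"], fake_lookup)
print(counts["M"], counts["F"], counts["unknown"])  # 2 1 1
```

Taking the lookup as a parameter keeps the helper independent of how the database object is created, so the same function works with the stand-in or with `database.get_gender`.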