Experiments with the Getty’s Provenance Data
The Getty Provenance Index (GPI) is a set of databases comprising 1.5 million records relating to the sale and transmission of works of art, primarily in Western Europe.
The GPI is a remarkably rich resource for art historians, and at times an overwhelming one. Its contents didn’t start out as database records; each data point started as an entry in an archival document, each document with its own history, recording conventions, and shorthand. The reality of collecting this data from imperfect, eccentric, and far-flung sources means that the extent and scope of the index are quite complex.
We were curious about the kinds of questions the GPI could and couldn’t answer. Could we use this collection of data to develop meaningful art historical insights? The result of our ten-week investigation is the project website you see here: five case studies, each exploring a different art history question with the aid of the GPI.
About the Team
We’re a group of nine UCLA scholars: three graduate students, five undergraduates, and one faculty member. We completed this project between March and June 2015 as a capstone class for the UCLA Digital Humanities program. The capstone is a collaborative project that forms the final step in students’ course sequence through the undergraduate minor and graduate certificate.
About This Project
We undertook this project in collaboration with the Getty Research Institute (GRI). The GRI is in the process of redesigning the provenance index and felt the redesign could benefit from additional insight into how scholars might like to use the resource.
We started by trying to understand the contours of this resource. It took us much longer than we expected (perhaps four weeks) to begin to feel we had a sense of what the databases contained. Our breakthrough was our trip to the GRI’s Special Collections, where head of special collections Sally McKay showed us some of the paper records from which the index originated. We also had the chance to ask questions directly of Getty database specialist Ruth Cuadra, head of provenance research Christian Huemer, and digital humanities specialist Emily Pugh. For the first time, we began to understand the material basis of this collection, and the ways in which the documents’ eccentric histories informed the data contained in the GPI.
Once we began to grasp the shape of the database, we were able to formulate research questions. With the assistance of many secondary sources and the help of a number of experts, we developed our hypotheses, created visualizations, and produced the work you see here.
To clean and refine data, we used Excel and OpenRefine. To produce these visualizations, we used a variety of tools, including RAW, Google Fusion Tables, Plotly, Wordle, D3, and QGIS.