Met Museum Open Access
The entire collection of one of the world's biggest museums, exposed as a 290 MB CSV. You can grep it, plot it, train a model on it. None of it requires attribution.
| Maintained by | The Metropolitan Museum of Art |
|---|---|
| Released | February 2017, updated weekly since |
| Size | Around 492,000 objects; the CSV is about 290 MB |
| License | CC0 1.0 — public domain, no attribution required |
| Formats | CSV for bulk work, JSON via a REST API for live queries |
| Images included? | Yes, for roughly 80% of objects (those flagged public-domain) |
| API auth | None. Rate limit is generous (80 requests / second). |
What you're getting
Every catalogued object the Met holds: paintings, sculpture, textiles, weapons, musical instruments, photographs, prints, costumes, the decorative arts — about half a million records, give or take depending on the week. Each row has over fifty fields, from the obvious ones (title, artist, year) to the surprisingly granular (which gallery the object is installed in, the exact credit line printed under it, the regional culture it came from). For most objects, the high-resolution image is included too — you can download a sixty-megapixel scan of Wheat Field with Cypresses and do whatever you want with it.
Why this one is special
Most museums treat their collection like the family jewels: a handful of items photographed for press releases, the rest behind a paywall, with watermarks on anything you can see. The Met went the other direction in 2017 and just gave it all away. The metadata, the images, the API access. No attribution required. No commercial restriction. For a 150-year-old institution holding objects worth billions of dollars, that was a pretty radical thing to do — and other museums have been quietly copying the model ever since (the Smithsonian, the Art Institute of Chicago, Cleveland, all came later).
How to get at it
Two ways in. For bulk work, grab the CSV from GitHub. It's a flat file — awk it, pandas it, drop it into SQLite, whatever suits you:
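```shell
# -L follows the redirect GitHub issues for /raw/ URLs; without it curl saves the redirect response, not the CSV
curl -L -O https://github.com/metmuseum/openaccess/raw/master/MetObjects.csv
```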
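If SQLite is where you want the data, the load step can be sketched with the standard library alone. This is a minimal sketch, not the Met's own tooling: the function name `load_into_sqlite`, the table name `objects`, and the choice of six columns are all hypothetical, and the CSV header names used here (`Object ID`, `Is Public Domain`, `Department`, `Title`, `Artist Display Name`, `Object Date`) are assumed to match the published file, so check them against the header row of your download.

```python
import csv
import sqlite3


def load_into_sqlite(csv_file, conn):
    """Copy a handful of columns from an open CSV file into a table named 'objects'.

    Header names below are assumptions about the published CSV; adjust if they differ.
    Returns the number of rows in the table after loading.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS objects ("
        "object_id INTEGER PRIMARY KEY, is_public_domain INTEGER, "
        "department TEXT, title TEXT, artist TEXT, object_date TEXT)"
    )
    reader = csv.DictReader(csv_file)
    # A generator keeps memory flat: rows stream from the file into SQLite.
    rows = (
        (
            int(row["Object ID"]),
            1 if row["Is Public Domain"].strip().lower() == "true" else 0,
            row["Department"],
            row["Title"],
            row["Artist Display Name"],
            row["Object Date"],
        )
        for row in reader
    )
    conn.executemany("INSERT OR REPLACE INTO objects VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM objects").fetchone()[0]
```

To use it, pass `open("MetObjects.csv", newline="", encoding="utf-8-sig")` and `sqlite3.connect("met.db")`. The `utf-8-sig` encoding strips a byte-order mark if the file has one and is harmless if it does not.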
For one-off lookups or live queries, hit the REST API directly. No key, no auth, just curl:
```shell
# List all object IDs (about 492,000 of them)
curl https://collectionapi.metmuseum.org/public/collection/v1/objects

# Fetch a single object: Van Gogh's "Wheat Field with Cypresses"
curl https://collectionapi.metmuseum.org/public/collection/v1/objects/436535

# Search for "Vermeer"
curl "https://collectionapi.metmuseum.org/public/collection/v1/search?q=vermeer"
```
Full API docs live at metmuseum.github.io.
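The same calls can be made from a script. Below is a minimal sketch using only Python's standard library; the helper names `object_url`, `search_url`, and `fetch_json` are hypothetical, and the `opener` parameter exists only so the fetch can be exercised without touching the network.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://collectionapi.metmuseum.org/public/collection/v1"


def object_url(object_id):
    """Build the URL for a single object record."""
    return f"{BASE_URL}/objects/{int(object_id)}"


def search_url(query):
    """Build a search URL, percent-encoding the query string."""
    return f"{BASE_URL}/search?" + urllib.parse.urlencode({"q": query})


def fetch_json(url, opener=urllib.request.urlopen, timeout=30):
    """GET a URL and decode the JSON body. `opener` is injectable for testing."""
    with opener(url, timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_json(object_url(436535))["title"]` should give the title of the Van Gogh, and `fetch_json(search_url("vermeer"))` should return the matching object IDs, which you then fetch one at a time. If you loop over many IDs, pace the requests to stay under the 80 requests per second limit.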
What a record looks like
Here's a trimmed version of what comes back for that Van Gogh:
```json
{
  "objectID": 436535,
  "isPublicDomain": true,
  "primaryImage": "https://images.metmuseum.org/.../DT1567.jpg",
  "department": "European Paintings",
  "objectName": "Painting",
  "title": "Wheat Field with Cypresses",
  "culture": "",
  "period": "",
  "artistDisplayName": "Vincent van Gogh",
  "artistNationality": "Dutch",
  "objectDate": "1889",
  "medium": "Oil on canvas",
  "dimensions": "28 7/8 x 36 3/4 in. (73.2 x 93.4 cm)",
  "creditLine": "Purchase, The Annenberg Foundation Gift, 1993",
  "GalleryNumber": "822"
}
```
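Two fields in that record decide whether there is an image you can actually use: `isPublicDomain` and `primaryImage`. A small check like the one below is one way to combine them. The function name `usable_image_url` is hypothetical, and it assumes a missing image shows up as an empty string or an absent key, which is worth verifying against real responses.

```python
def usable_image_url(record):
    """Return the primary image URL if the record is public domain and has one, else None.

    Assumes `record` is the decoded JSON for one object, and that a missing
    image appears as an empty string or an absent key.
    """
    url = (record.get("primaryImage") or "").strip()
    if record.get("isPublicDomain") is True and url:
        return url
    return None
```

Returning `None` instead of an empty string makes it harder to pass a blank URL on to a downloader by accident.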
What people have done with it
Google Arts & Culture ingests Met images for its zoom-deep viewers. Researchers have used the dataset to study how colour palettes shifted across centuries, the geographic provenance of catalogued objects, and gender representation among named artists. Generative-art projects train on the public-domain image subset. Hobbyists have plotted every Met object on a world map by its culture's origin and built timelines of acquisition year by year. The dataset gets paired with Wikidata to fill in the often-thin artist biographies. There's plenty of room left for new ideas — most of the collection's quieter corners haven't been visualised yet.
A few gotchas before you dive in
- Not every record has an image. "Public domain" and "has primary image" are separate columns — filter on both, or you'll get a lot of empty image URLs.
- The CSV is updated weekly. The GitHub repo is the source of truth — don't rely on third-party mirrors.
- Some fields are very sparse. Artist Wikidata URL exists as a column but is blank more often than not.
- Object dimensions are free-text, not numeric. You'll need a bit of regex if you want comparable measurements.
- The API returns image URLs, not image bytes. Fetch those separately from images.metmuseum.org.
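For the dimensions gotcha, here is a rough sketch of the kind of regex involved. It assumes the common pattern seen in the Van Gogh record, where metric values sit in a trailing parenthesised group ending in `cm`. Plenty of records will not follow that pattern, so treat an empty result as "unparsed", not as "no dimensions". The function name `dimensions_cm` is hypothetical.

```python
import re

# Matches a parenthesised group that ends in "cm", e.g. "(73.2 x 93.4 cm)".
_CM_GROUP = re.compile(r"\(([^()]*?)\s*cm\)")
# Matches plain decimal numbers such as "73.2" or "93".
_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def dimensions_cm(text):
    """Return the numbers in the first '(... cm)' group as floats, or [] if none is found."""
    match = _CM_GROUP.search(text or "")
    if match is None:
        return []
    return [float(n) for n in _NUMBER.findall(match.group(1))]
```

Called on `"28 7/8 x 36 3/4 in. (73.2 x 93.4 cm)"`, this returns `[73.2, 93.4]`. The order of the values (height, width, depth) is whatever the cataloguer wrote, so it is not guaranteed to be consistent across departments.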