Met Museum Open Access

The entire collection of one of the world's biggest museums, exposed as a 290 MB CSV. You can grep it, plot it, train a model on it. None of it requires attribution.

Maintained by	The Metropolitan Museum of Art
Released	February 2017, updated weekly since
Size	Around 492,000 objects; the CSV is about 290 MB
License	CC0 1.0 — public domain, no attribution required
Formats	CSV for bulk work, JSON via a REST API for live queries
Images included?	Yes, for roughly 80% of objects (those flagged public-domain)
API auth	None. Rate limit is generous (80 requests / second).

What you're getting

Every catalogued object the Met holds: paintings, sculpture, textiles, weapons, musical instruments, photographs, prints, costumes, the decorative arts — about half a million records, give or take depending on the week. Each row has over fifty fields, from the obvious ones (title, artist, year) to the surprisingly granular (which gallery the object is installed in, the exact credit line printed under it, the regional culture it came from). For most objects, the high-resolution image is included too — you can download a sixty-megapixel scan of Wheat Field with Cypresses and do whatever you want with it.

Why this one is special

Most museums treat their collection like the family jewels: a handful of items photographed for press releases, the rest behind a paywall, with watermarks on anything you can see. The Met went the other direction in 2017 and just gave it all away. The metadata, the images, the API access. No attribution required. No commercial restriction. For a 150-year-old institution holding objects worth billions of dollars, that was a pretty radical thing to do, and other museums have been quietly copying the model ever since (the Art Institute of Chicago, the Cleveland Museum of Art, and the Smithsonian all came later).

How to get at it

Two ways in. For bulk work, grab the CSV from GitHub. It's a flat file — awk it, pandas it, drop it into SQLite, whatever suits you:

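# -L follows the redirect GitHub issues for /raw/ URLs; without it curl saves the redirect response, not the CSV
curl -L -O https://github.com/metmuseum/openaccess/raw/master/MetObjects.csv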
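If SQLite is where you want the file to end up, a minimal sketch using only the Python standard library might look like this. The function name `load_csv_into_sqlite`, the table name `met_objects`, and the three-column sample header are illustrative assumptions, not anything the Met ships; every column is stored as TEXT, and rows whose field count doesn't match the header are skipped.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(csv_file, conn, table="met_objects"):
    """Copy the rows of an open CSV file into a SQLite table of TEXT columns.

    Hypothetical helper: the table name and the all-TEXT schema are assumptions.
    Returns the number of rows inserted.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    # Quote column names, since headers may contain spaces.
    columns = ", ".join('"{}" TEXT'.format(name.replace('"', '""')) for name in header)
    conn.execute('CREATE TABLE IF NOT EXISTS "{}" ({})'.format(table, columns))
    placeholders = ", ".join("?" for _ in header)
    insert_sql = 'INSERT INTO "{}" VALUES ({})'.format(table, placeholders)
    # The generator streams rows one at a time, so the file is never fully in memory.
    well_formed = (row for row in reader if len(row) == len(header))
    cursor = conn.executemany(insert_sql, well_formed)
    conn.commit()
    return cursor.rowcount


# Tiny in-memory stand-in for the real file (header names are illustrative).
sample = io.StringIO(
    "Object ID,Title,Is Public Domain\n"
    "436535,Wheat Field with Cypresses,True\n"
)
demo_conn = sqlite3.connect(":memory:")
load_csv_into_sqlite(sample, demo_conn)
```

For the real download you would pass something like `open("MetObjects.csv", newline="", encoding="utf-8-sig")` and a connection to a file on disk; `utf-8-sig` simply tolerates a byte-order mark if the file has one.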

For one-off lookups or live queries, hit the REST API directly. No key, no auth, just curl:

# List all object IDs (about 492,000 of them)
curl https://collectionapi.metmuseum.org/public/collection/v1/objects

# Fetch a single object — Van Gogh's "Wheat Field with Cypresses"
curl https://collectionapi.metmuseum.org/public/collection/v1/objects/436535

# Search for "Vermeer"
curl "https://collectionapi.metmuseum.org/public/collection/v1/search?q=vermeer"

Full API docs live at metmuseum.github.io.
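The same three calls can be sketched in Python with nothing but the standard library. The helper names (`object_url`, `search_url`, `get_json`, `fetch_objects`) are made up for this example, and the pause between requests is one simple way, under the assumption of a single-threaded client, to stay below the 80 requests per second mentioned above. The `opener` parameter exists only so the network call can be swapped out in tests.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "https://collectionapi.metmuseum.org/public/collection/v1"


def object_url(object_id):
    """URL for a single object record."""
    return "{}/objects/{}".format(BASE_URL, int(object_id))


def search_url(query):
    """URL for a keyword search; urlencode handles spaces and punctuation."""
    return "{}/search?{}".format(BASE_URL, urllib.parse.urlencode({"q": query}))


def get_json(url, opener=urllib.request.urlopen):
    """Fetch a URL and decode the JSON body."""
    with opener(url) as response:
        return json.load(response)


def fetch_objects(object_ids, opener=urllib.request.urlopen, min_interval=1 / 80):
    """Yield one record per ID, sleeping so requests stay under 80 per second."""
    for object_id in object_ids:
        started = time.monotonic()
        yield get_json(object_url(object_id), opener)
        elapsed = time.monotonic() - started
        if elapsed < min_interval:
            time.sleep(min_interval - elapsed)
```

A typical pairing would be `get_json(search_url("vermeer"))` to get a list of IDs, then `fetch_objects(...)` over that list. Check the API docs for the exact shape of the search response before relying on any particular key.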

What a record looks like

Here's a trimmed version of what comes back for that Van Gogh:

{
  "objectID": 436535,
  "isPublicDomain": true,
  "primaryImage": "https://images.metmuseum.org/.../DT1567.jpg",
  "department": "European Paintings",
  "objectName": "Painting",
  "title": "Wheat Field with Cypresses",
  "culture": "",
  "period": "",
  "artistDisplayName": "Vincent van Gogh",
  "artistNationality": "Dutch",
  "objectDate": "1889",
  "medium": "Oil on canvas",
  "dimensions": "28 7/8 x 36 3/4 in. (73.2 x 93.4 cm)",
  "creditLine": "Purchase, The Annenberg Foundation Gift, 1993",
  "GalleryNumber": "822"
}
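Blank strings like `culture` and `period` above are easy to trip over, so it can help to normalise a record into a small typed structure. This is only a sketch: the `MetObject` class and `parse_record` function are hypothetical names, and they cover just the fields shown in the trimmed record, turning empty strings into `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MetObject:
    """Hypothetical container for the handful of fields shown above."""
    object_id: int
    title: str
    artist: Optional[str]
    culture: Optional[str]
    date: Optional[str]
    medium: Optional[str]
    gallery: Optional[str]
    image_url: Optional[str]
    is_public_domain: bool


def _blank_to_none(value):
    """Treat missing values and whitespace-only strings as None."""
    text = "" if value is None else str(value).strip()
    return text or None


def parse_record(record):
    """Build a MetObject from one decoded API response (a dict)."""
    return MetObject(
        object_id=int(record["objectID"]),
        title=record.get("title") or "",
        artist=_blank_to_none(record.get("artistDisplayName")),
        culture=_blank_to_none(record.get("culture")),
        date=_blank_to_none(record.get("objectDate")),
        medium=_blank_to_none(record.get("medium")),
        # The API capitalises this one key, unlike its neighbours.
        gallery=_blank_to_none(record.get("GalleryNumber")),
        image_url=_blank_to_none(record.get("primaryImage")),
        is_public_domain=bool(record.get("isPublicDomain")),
    )
```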

What people have done with it

Google Arts & Culture ingests Met images for its zoom-deep viewers. Researchers have used the dataset to study how colour palettes shifted across centuries, the geographic provenance of catalogued objects, and gender representation among named artists. Generative-art projects train on the public-domain image subset. Hobbyists have plotted every Met object on a world map by its culture's origin and built timelines of acquisition year by year. The dataset gets paired with Wikidata to fill in the often-thin artist biographies. There's plenty of room left for new ideas — most of the collection's quieter corners haven't been visualised yet.
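As a starting point for something like the acquisition timeline, here is a rough sketch that tallies rows by year straight from the CSV. The column name `AccessionYear` is an assumption about the file's header, so check it against your copy; `count_by_year` is a hypothetical helper, and it keeps only values that begin with four digits.

```python
import csv
import io
from collections import Counter


def count_by_year(csv_file, column="AccessionYear"):
    """Count rows per year, reading the year from the first four characters.

    The default column name is an assumption; pass your own if it differs.
    Rows with a blank or non-numeric value are ignored.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or "").strip()
        if len(value) >= 4 and value[:4].isdigit():
            counts[int(value[:4])] += 1
    return counts


# Made-up rows, just to show the shape of the result.
demo = io.StringIO("Object ID,AccessionYear\n1,1993\n2,1993\n3,\n4,2005-02-15\n")
demo_counts = count_by_year(demo)
```

Sorting `demo_counts.items()` gives year-ordered pairs ready to hand to whatever plotting library you prefer.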

A few gotchas before you dive in

Not every record has an image. "Public domain" (isPublicDomain) and "has primary image" (a non-empty primaryImage) are separate fields. Filter on both, or you'll get a lot of empty image URLs.
The CSV is updated weekly. The GitHub repo is the source of truth — don't rely on third-party mirrors.
Some fields are very sparse. Artist Wikidata URL exists as a column but is blank more often than not.
Object dimensions are free-text, not numeric. You'll need a bit of regex if you want comparable measurements.
The API returns image URLs, not image bytes. Fetch those separately from images.metmuseum.org.
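Two of these gotchas lend themselves to small helpers. The sketch below works on API-style records like the one shown earlier; `has_usable_image` and `dimensions_cm` are hypothetical names, and the regex assumes the centimetre values sit in a parenthesised group ending in "cm", as they do in the Van Gogh record. Dimension strings that follow other patterns will simply return `None`.

```python
import re

# A parenthesised group that ends in "cm", e.g. "(73.2 x 93.4 cm)".
_CM_GROUP = re.compile(r"\(([^()]*?)\s*cm\)")
_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def has_usable_image(record):
    """True only if the record is public domain AND has a non-empty image URL."""
    image_url = (record.get("primaryImage") or "").strip()
    return bool(record.get("isPublicDomain")) and bool(image_url)


def dimensions_cm(text):
    """Pull the centimetre values out of a free-text dimensions string.

    Returns a tuple of floats in the order they appear, or None if no
    parenthesised "cm" group with numbers is found.
    """
    match = _CM_GROUP.search(text or "")
    if match is None:
        return None
    values = tuple(float(number) for number in _NUMBER.findall(match.group(1)))
    return values or None


print(dimensions_cm("28 7/8 x 36 3/4 in. (73.2 x 93.4 cm)"))  # (73.2, 93.4)
```

Which number is height and which is width is not encoded in the string, so treat the order as "as printed" rather than as a guaranteed convention.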