The Pillbox Database
The National Library of Medicine's Pillbox dataset contained 8,693 photographs of pills, with an accompanying database of drug information. It was built to help with the identification of unknown pills.

What is it?
A dataset containing 8,693 high-resolution photographs of pills marketed in the U.S., along with an accompanying database containing 83,925 rows of detailed information on the medication in each pill.
The database includes each pill’s shape (“CAPSULE”, “HEXAGON”, “DIAMOND”, “OVAL”, etc.), color, imprint (the markings on the pill), size, active ingredients, prescription name, medicine name and manufacturer. Most of the pill information comes from the Food and Drug Administration’s drug labels database.
Each pill is photographed in high resolution against a uniform gray background, along with rulers to show the dimensions of each pill.
The website described the project as “...one of the largest free databases of prescription and over-the-counter drug information and images, combining data from pharmaceutical companies, Food and Drug Administration, National Institutes of Health, and Department of Veterans Affairs.”
Why was this data collected?
According to the site, the purpose of the database and the accompanying website was to provide a tool to help identify unknown pills based on their shape, color, marking and other attributes.
“This system is designed for use by emergency physicians, first responders, other health care providers, Poison Control Center staff, and concerned citizens,” read a press release from the project’s beta launch in 2009.
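To make the identification idea concrete, here is a rough sketch in Python of narrowing down an unknown pill by the attributes you can see. The `Pill` fields, the `find_matches` helper and the sample rows are all invented for illustration. They are not the real Pillbox column names or records, and this is not how the Pillbox site itself was implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pill:
    # Hypothetical field names, loosely modeled on the attributes described above.
    shape: str
    color: str
    imprint: str
    medicine_name: str


def _norm(text):
    """Normalize text so comparisons ignore case and surrounding spaces."""
    return text.strip().upper()


def find_matches(pills, shape=None, color=None, imprint=None):
    """Return the pills that match every attribute that was supplied."""
    results = []
    for pill in pills:
        if shape and _norm(pill.shape) != _norm(shape):
            continue
        if color and _norm(pill.color) != _norm(color):
            continue
        # Imprints are often only partly legible, so allow a partial match.
        if imprint and _norm(imprint) not in _norm(pill.imprint):
            continue
        results.append(pill)
    return results


# Made-up rows for demonstration only; these are not real drugs or imprints.
SAMPLE = [
    Pill("OVAL", "WHITE", "XX 10", "Hypothetical Drug A"),
    Pill("OVAL", "WHITE", "YY 20", "Hypothetical Drug B"),
    Pill("CAPSULE", "BLUE", "XX 10", "Hypothetical Drug C"),
]

if __name__ == "__main__":
    for pill in find_matches(SAMPLE, shape="oval", color="white", imprint="xx"):
        print(pill.medicine_name)
```

Each extra attribute narrows the candidate list, which is why a lookup tool would ask for shape, color and imprint together rather than any one of them alone.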

Who collected this data?
An archived version of the website says the dataset “...contains images which were either submitted by pharmaceutical companies to the Food and Drug Administration (FDA) as part of the drug label submission process or created through pill photography programs at the National Library of Medicine and the Department of Veterans Affairs.”
When was this collected?
The project launched in 2009 but was officially retired on January 29, 2021. The photos and data are still available “to support research, development, and education”. The database appeared to be continually updated during the lifespan of the project.
I reached out to the National Library of Medicine about why Pillbox was shut down, and I received this response, attributed to "the Pillbox team":
"Pillbox initially provided pill image data from a variety of data sources, however providers discontinued submitting content over the last several years, leaving pill information static and increasingly out of date. Additionally, much of the information provided in Pillbox was duplicative of NLM’s DailyMed. DailyMed is a highly used drug information resource that contains drug labeling information as supplied by the Food and Drug Administration (FDA). Given the availability of information from other NLM resources, we do not currently plan to reinstate Pillbox."
For this post, I curated a collection of 1,500 of the more interesting-looking pills from the database and processed them to present them in a more eye-catching way.
Using a command-line graphics tool (ImageMagick), I replaced the gray background in each photo with a tinted version of the pill's dominant color, then assembled the photos into a quilt.
I found Color Thief very helpful for this.
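Here is a rough sketch of that idea in plain Python. It is not the ImageMagick commands or the Color Thief code actually used for the quilt. It swaps in a much simpler technique, a coarse color histogram, to pick a dominant color, and it works on a plain list of RGB tuples rather than a real image file. The function names, the tolerance and the tint amount are all assumptions chosen for illustration.

```python
from collections import Counter


def is_near(a, b, tolerance):
    """True if every RGB channel of a is within tolerance of b."""
    return all(abs(x - y) <= tolerance for x, y in zip(a, b))


def dominant_color(pixels, background, tolerance=30, bucket=32):
    """Most common coarse color among pixels that are not background."""
    counts = Counter()
    for pixel in pixels:
        if is_near(pixel, background, tolerance):
            continue  # skip the gray backdrop
        # Group similar shades by dividing each channel into coarse buckets.
        counts[tuple(c // bucket for c in pixel)] += 1
    if not counts:
        return background
    top = counts.most_common(1)[0][0]
    # Report the center of the winning bucket as the representative color.
    return tuple(c * bucket + bucket // 2 for c in top)


def tint(color, amount=0.7):
    """Lighten a color by blending it toward white."""
    return tuple(round(c + (255 - c) * amount) for c in color)


def replace_background(pixels, background, new_color, tolerance=30):
    """Swap every near-background pixel for new_color."""
    return [new_color if is_near(p, background, tolerance) else p
            for p in pixels]


if __name__ == "__main__":
    gray = (128, 128, 128)
    # A tiny stand-in "photo": gray backdrop, a mostly red pill, one blue speck.
    photo = [gray] * 6 + [(200, 30, 40)] * 3 + [(20, 20, 200)]
    backdrop = tint(dominant_color(photo, gray))
    print(replace_background(photo, gray, backdrop))
```

With a real photo you would read the pixels using an imaging library, and a tolerance-based match like this would need tuning to cope with shadows and the rulers in the frame.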
You can view the slides I made a few years back for a NICAR talk that used Pillbox as an example here.

Who decides the pill colors and shapes?
In the U.S., the Food and Drug Administration publishes detailed guidelines for the design of pills, with the goal of reducing drug misidentification. A central concept in these guidelines is understanding the context in which any given pill will be taken: Who will actually be administering the pill? What is the environment like in which the pill is administered? What other pills might this be taken with?
A few years ago, I spoke with the leader of a manufacturing group at a major pharmaceutical company about how these pills get their colors, shapes and sizes. It was a fascinating conversation, and here are some of the interesting takeaways:
- I learned that size is one of the most important considerations, for both handling and ingestion. Anything below 100 mg is very hard for people to handle. If it is a controlled-release drug, that could make the pill larger.
- The shape can be subject to aesthetic considerations, but more often than not it is determined by practical ones. Round pills and tablets are the most common shapes, as they are the easiest to manufacture, and also the strongest and least likely to chip, which really matters. A chipped or broken tablet can reduce the dosage that the person receives.
- Score marks on the pills may be aesthetic or functional to allow for splitting into smaller doses.
- Colors can be determined by the composition of the drug, but more often it is the coating on the pill that gives it its color. There is a lot of purpose behind the colors that are chosen, but it is a mixture of marketing and R&D (research and development). The key thing manufacturers are trying to prevent with color and shape is mix-ups at the pharmacy.
- Size, colors and embossments can help people identify not only different drugs but also different dosages. Also, not all colorants are acceptable to all regulators around the world, so the market it will be sold in is a consideration.
- The markings on the tablet are very specifically designed by each company and for each medicine. They are designed to be unique, and are submitted to regulatory agencies as part of the approval process.

Thanks for reading! 💊💊💊
- Jon Keegan (@jonkeegan)