Who’s Buying Books?

Why We Still Lack Transparent Sales Data

Photograph of an open laptop on a glass table. The screen is on and displaying statistics with a large area graph containing two overlapping color sets of data.

My senior year of college has been a time of capstone classes and final projects. As the last requirement of Susquehanna University’s Honors Program, of which I am a member, students conduct an interdisciplinary research project over two semesters. For my project, I am combining my majors, Publishing & Editing and Marketing, to determine the extent to which book banning is merely a marketing ploy and the extent to which it keeps titles out of the hands of people who are not otherwise discussing their topics. In my research, though, I came across an even more troubling issue: the unavailability of publishing sales data and the utter lack of transparency from the largest publishers and booksellers.

At first glance, you might think this data is easy to come by, given the abundance of bestseller lists from major outlets like The New York Times, but these lists don’t include any actual sales information. Bits and pieces of sales data are available through outlets such as NPD, a global market information company that offers data, industry information, and analytics to clients who pay for its services. In one article, NPD names a few specific titles, such as Gender Queer, The Complete Maus, and Antiracist Baby, whose sales skyrocketed after their controversies were publicized. However, you can’t draw any meaningful conclusions from a handful of token examples. I wanted to see the reality behind the few big-picture numbers that we are given. In attempting to answer vital questions about these figures and find other examples to enhance my research, I kept hitting wall after wall. Publishing companies and large retailers like Amazon, Barnes & Noble, Target, and Walmart are incredibly restrictive about sharing their sales data. Their data is proprietary, and services like NPD’s are available only to the few who can afford them. Keeping this data locked behind gates is truly a disservice to scholarship and the publishing industry. It is nearly impossible to know which readers are obtaining which titles and whether readers in some areas of the country are less likely to read certain kinds of books than those in others. This is especially significant when it comes to banned books because of the cultural implications. If we aren’t able to see where these banned titles are being purchased or who is purchasing them, it is impossible to know how far the cultural conversations surrounding their topics and themes are spreading. Still, I wanted to try to break through the many walls in my path to at least start bringing awareness to the issue of book banning.

I started by reaching out to NPD, the Association of American Publishers, and the American Booksellers Association (ABA) to ask whether they had a method that would allow me to access publishing sales data, but I had no success. Instead, I had to use the Wayback Machine, a digital archive of the Internet that lets you view earlier versions of a website, to compile a list of a title’s appearances on the ABA’s nine regional bestseller lists (since the ABA does not archive the lists itself). I thought I might be able to gain some insights if a title consistently appeared much higher on the bestseller list in one region or if it made significantly fewer appearances on one list than on the others. This was a tedious and exhausting process, and at the end of it, the data I’d collected was not enough to reveal any patterns or support real conclusions.

In the frustration that followed these disappointing results, I discovered that others are running into the same problem of proprietary publishing data. For instance, Melanie Walsh, a data scientist and literary scholar, went looking for book sales data during the COVID-19 pandemic and was surprised to find it purposefully hidden from anyone outside the industry. This confused her because the data is so influential in determining book contracts and, therefore, authors’ lives. It is also the only way for us to understand the contemporary literary world. Even so, most sales data is housed on BookScan, an exclusive subscription service operated by NPD, which all the big publishing companies, publishing professionals, and authors rely on but just about everyone else is barred from using. It also doesn’t help that the terms of service for Amazon and Barnes & Noble prohibit openly discussing exact sales figures for books. With the immense amount of effort that goes into keeping this data secret, you have to wonder what these companies and sellers have to hide.

I understand that publishers and retailers may not want to make their sales information public in order to keep their trade secrets from competitors, avoid interference, or save face, but how much impact can that knowledge truly have? For huge companies like Amazon, Target, HarperCollins, and Penguin Random House, just to name a few, there is no way that sharing sales data would give competitors leverage over them, especially if all the firms in the marketplace allowed broader access to their sales figures. Some also suggest that book data should remain inaccessible to the general public because, if it became broadly available, the variety of what is published would narrow to replicate what proves to be popular.

I simply don’t agree with this. Already, trade publishers are not taking chances on many titles that aren’t coming out of the “bestseller factory.” The inability of anyone outside the publishing industry to analyze the sales and distribution information for books truly speaks to something larger. If we can’t understand how far books spread, especially those that contribute to difficult but necessary conversations, we can’t understand our own culture. We have no way of knowing which areas are underserved or where efforts to support the reading or sales of certain titles can be improved. It is concerning that this data isn’t easier to come by so that we can assess the cultural sphere of our nation. It’s impossible to see what books do, to know how many lives they touch, and to understand the conversations being had across our country and beyond when sales data from publishing companies and large book retailers alike is locked behind an impenetrable paywall.


A portrait of Julia Adams, a recent Susquehanna University graduate.

Julia Adams (’23) is a recent graduate of Susquehanna University, where she earned a dual degree in publishing & editing and marketing. She currently resides in New Jersey with her family and her dog, Ruby. Julia’s favorite place to be is on a beach with a book, and she hopes to always be involved with the sport of swimming in one way or another. She will be attending The Columbia Publishing Course at Oxford this September.
