The datasets we’re looking at this week


You’re reading Data Is Plural, a weekly newsletter of useful/curious datasets. Below you will find the May 25, 2022, edition, reprinted with permission from FiveThirtyEight.

Edition 2022.05.25

Supercomputers, infrastructure permitting, European election results, Moreno and Jennings’ sociograms, and Art Garfunkel’s library.

Supercomputers. Since 1993, a team of researchers has regularly evaluated the most powerful computers in the world. The resulting TOP500 lists are released twice a year, in June and November, using a performance benchmark developed by team member Jack Dongarra, who became a Turing Award winner this year. The downloadable versions show each supercomputer’s name, rank, location, manufacturer, year of construction, power consumption, technical specifications and more. As seen in: “The Race to Build the Fastest Supercomputer”, by Datawrapper’s Edurne Morillo, who recommends visiting Barcelona’s MareNostrum, ranked 74th on the latest list and housed in a former chapel.

Infrastructure permitting. The U.S. government’s Federal Infrastructure Permitting Dashboard tracks “environmental review and permitting processes for large or complex infrastructure projects,” particularly those funded by the Department of Transportation and those participating in a voluntary review-coordination effort known as FAST-41. The dashboard’s comprehensive dataset describes more than 12,000 milestones related to nearly 1,000 projects, about half of which have been completed. Online, you can search through projects and browse their characteristics and schedules.

European election results. Dominik Schraff et al. have created EU-NED, a dataset that harmonizes European election results at the subnational level, providing party vote totals for NUTS 2 and NUTS 3 geographical units in 31 countries. The dataset covers 1990 to 2020 and uses party IDs from PartyFacts (DIP 2019.01.16), making it easy to link records to other projects. [h/t Christian Breuer]

The sociograms of Moreno and Jennings. In the 1930s, Jacob Moreno and Helen Hall Jennings created a series of “sociograms” representing the seating preferences of classmates. These graphs “are often considered the earliest examples of social network analysis and visualization,” according to historian and network analysis practitioner Martin Grandjean, who translated them into simple data files. [h/t Christian Miles + Jer Thorp]

Art Garfunkel’s library. The legendary folk singer’s official website includes a catalog of “every book Art has read since 1968.” It lists each book’s title, author, year published, month/year read, page count, and whether it was one of the musician’s favorites. Recently, AI engineer Corey Christensen converted the HTML pages into a downloadable dataset.

Dataset suggestions? Criticism? Praise? Send your comments to [email protected]. Looking for past datasets? This spreadsheet has them all. Visit the newsletter’s website to subscribe and browse previous editions.

