Digital scholarship (DS) and digital humanities (DH) projects take many forms, including digital exhibits, multimedia essays, text analysis, data visualizations, interactive maps, comparative timelines, and more. Below is a long list of software, digital tools, frameworks, tutorials, and other resources used by scholars engaged in DS and DH research. Please note this list is for informational purposes only. Falvey Library does not subscribe to or provide access to all of the software on this list.
Draw.Chat is an online whiteboard tool that offers free collaborative drawing board solutions for online meetings. Users can draw, chat, or communicate via audio and video conference.
hypothes.is is a handy web-based annotation tool used to annotate text on websites and documents. It gives the user the ability to highlight and create notes for private or public use as well as create private groups for collaborations.
Padlet allows you to create pads and boards that students can post to and annotate, which is a great way to make a class interactive. Students can post comments, links to videos, and images anonymously.
Perusall is a PDF and web screen-capture annotation platform that allows students and faculty to engage with, and transform, written material in a style akin to social media posting. Student and faculty engagement with the material in the text helps facilitate and transform class discussions.
DATA ANALYSIS AND VISUALIZATION
Airtable is an easy-to-use online platform for creating and sharing relational databases and visualizations. The user interface is simple, colorful, and friendly, and it allows anyone to spin up a database in minutes.
Atlas.ti is a qualitative research tool that can be used for coding and analyzing transcripts and field notes, building literature reviews, creating network diagrams, and data visualization.
ChartBlocks is an online data visualization tool with a free plan that lets you build and host 30 charts at a time. All charts are publicly viewable, so private data should not be used.
Datawrapper is a free, easy-to-use, web-based data visualization tool that allows only basic customizations but renders visually attractive graphics. Charts built in Datawrapper are meant to be embedded in a website.
A catalogue that will help you select appropriate chart types, along with helpful links to different data visualization tools.
A website gallery of all relevant data visualizations, so you can find the right visualization for your research project and get inspiration on how to do it.
A useful compilation of free data visualization tools.
A cross-platform app for analyzing qualitative and mixed methods research with text, photos, audio, videos, spreadsheet data and more.
Excel is a familiar tool used by many people as part of Microsoft Office Suite. Instead of using default chart styles, users should take care to customize and refine their charts in Excel for best results.
NVivo is a qualitative data and text analysis (QDA) computer software package produced by QSR International. NVivo helps qualitative researchers organize, analyze, and find insights in unstructured or qualitative data such as interviews, open-ended survey responses, journal articles, social media, and web content, where deep levels of analysis on small or large volumes of data are required.
OpenRefine (previously Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data. OpenRefine always keeps your data private on your own computer until you want to share or collaborate.
Raw is an easy-to-use open web app for creating visualizations using D3.js.
Tabula allows you to extract data tables locked inside PDF files into a CSV or Microsoft Excel spreadsheet using a simple, easy-to-use interface. Tabula works on Mac, Windows, and Linux.
Taguette is a free and open source qualitative research tool (which works on all operating systems!). With Taguette, users can import PDFs, Word Docs (.docx), Text files (.txt), HTML, EPUB, MOBI, Open Documents (.odt), and Rich Text Files (.rtf).
After uploading documents, users can highlight words, sentences, or paragraphs and tag them with codes they create. All work done in Taguette is completely exportable, including tagged documents, codebooks, highlights for a specific tag, highlights for all tags, and a list of tags with their descriptions.
Users can upload documents in any language to Taguette. The interface is currently available in English (US), French, German, Italian, and Spanish.
Tableau produces a wide variety of beautiful interactive visualizations and maps, and its drag-and-drop interface makes it easy to learn and use. A free academic license is available to students, instructors, and non-profit academic researchers. Publishing visualizations on the web requires a Tableau Public account. Public visualizations should not use private or personally identifiable information.
VizCheck lets you check your visualization for colorblind safeness.
DIGITAL EXHIBITS AND DIGITAL PUBLISHING
CollectionBuilder is an open source tool for creating digital collections and exhibit websites that are driven by metadata and powered by modern static web technology, such as Jekyll.
Ed is a Jekyll theme designed for textual editors, based on minimal computing principles and focused on legibility, durability, ease, and flexibility. Its underlying technology is easy to learn and teach, and it can produce beautifully rendered scholarly or reading editions of texts meant to last. To start using Ed, see the project's documentation for installation instructions and more.
Exhibit is a IIIF (see IIIF under Standards and Frameworks in this list) storytelling tool from the University of St Andrews. It was created by Mnemoscene with support from the Esmée Fairbairn Collections Fund.
Mukurtu (MOOK-oo-too) is a grassroots project aiming to empower communities to manage, share, narrate, and exchange their digital heritage in culturally relevant and ethically-minded ways. They are committed to maintaining an open, community-driven approach to Mukurtu’s continued development. Their first priority is to help build a platform that fosters relationships of respect and trust.
Omeka provides open-source web publishing platforms for sharing digital collections and creating media-rich online exhibits. There are currently three versions of Omeka: Omeka.net, Omeka Classic, and Omeka S.
Omeka.net: If you’re new to Omeka, it is recommended that you use Omeka.net. Here, you can build an Omeka website hosted for free on the Omeka.net servers. Omeka.net works well for classroom instruction, allowing each student to manage their own site independently, but it limits the number of free themes you can use. For more storage space and access to a wider variety of plugins and themes, various paid plans are available.
Omeka Classic: This is the classic version of Omeka and is best for individual projects and educators. It suits those who run an independent website on their own servers, where they can install their own copy of Omeka and download and install plugins and themes themselves.
Omeka S: This is the next-generation version of Omeka Classic and is best for multiple projects and users. Omeka S is designed for universities, galleries, libraries, archives and museums working with larger digital collections and multiple projects, where users can collaborate and create exhibits from a shared pool of items, media, and metadata.
Open Journal Systems (OJS) is an open source software application for managing and publishing scholarly journals. Originally developed and released by PKP (Public Knowledge Project) in 2001 to improve access to research, it is the most widely used open source journal publishing platform in existence, with over 10,000 journals using it worldwide.
Pressbooks is a WordPress-based online platform for self-publishing books in multiple formats: e-books, webbooks, and print-ready PDFs. The software, which is open source, makes significant changes to the WordPress admin interface, web presentation, and export routines. Although popular with monographs, edited collections, and open textbooks, its built-in interactive H5P plug-in also makes it a good choice for Open Educational Resources.
Scalar is a free, open source authoring and publishing platform that’s designed to make it easy for authors to write long-form, born-digital scholarship online. Scalar enables users to assemble media from multiple sources and juxtapose them with their own writing in a variety of ways, with minimal technical expertise required.
Wax incorporates IIIF (see IIIF under Standards and Frameworks in this list). Its underlying technology is designed to be easy to learn and teach, and it can produce beautifully rendered, high-quality image collections and scholarly exhibits.
GIS AND OTHER MAPPING TOOLS
Based on the classic Esri Story Map templates, ArcGIS Story Maps provides a new, modern platform for telling stories with maps, text, images, and other media.
Esri Story Maps let you combine ArcGIS Online maps with narrative text, images, and multimedia content. They make it easy to harness the power of maps and geography to tell your story. Users can create maps using multiple Esri Story Map templates, such as the Swiper, SpyGlass, Shortlist, Journal, and Cascade templates.
Flâneur is a Jekyll theme for maps and texts using Leaflet.
A tool for editing GeoJSON data on the internet. It enables editing through a map interface, raw GeoJSON, and exporting and importing a large number of formats.
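For readers who have not worked with GeoJSON before, the following is a minimal sketch of what such data looks like when built in Python with the standard library. The helper name `point_feature`, the place name, and the coordinates are hypothetical and chosen only for illustration.

```python
import json

def point_feature(name, lon, lat):
    """Return a GeoJSON Feature describing a single named point."""
    return {
        "type": "Feature",
        # GeoJSON lists coordinates as [longitude, latitude], in that order.
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name},
    }

# Hypothetical place and coordinates, for illustration only.
collection = {
    "type": "FeatureCollection",
    "features": [point_feature("Example archive", -75.0, 40.0)],
}

# Serialize to text that could be pasted into a GeoJSON editor.
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

The resulting text is a FeatureCollection with one Point feature; additional features can be appended to the `features` list in the same way.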
A proprietary online georeferencing tool that assigns geographical location to any image. Georeferencer is used by the David Rumsey Historical Maps collection and Old Maps Online.
Geographic Resources Analysis Support System (GRASS) is a geographic information system software suite used for geospatial data management and analysis, image processing, producing graphics and maps, spatial and temporal modeling, and visualizing. It can handle raster, topological vector, image processing, and graphic data.
This GitHub repository by HandsOnDataViz allows users to create a Leaflet story map with a linked Google Sheets template and scrolling narrative. It supports images, audio and video embeddings, and Leaflet TileLayer/GeoJSON overlays.
Developed by Tim Waters, Map Warper is a free-to-use, open source map georectifier and image georeferencer tool for individuals and small groups. Map Warper has been used by organizations like the New York Public Library.
A mapping platform that allows you to create, share and collaborate on interactive maps online.
A tool for topologically aware shape simplification. Reads and writes Shapefile, GeoJSON and TopoJSON formats.
Mapping platform designed for quick publishing of zoomable maps online for web applications, mobile devices and 3D visualisations.
Tableau is a visual analytics platform that allows you to create data visualizations and interactive maps.
A geotemporal exhibit-builder that allows you to create beautiful, complex maps, image annotations, and narrative sequences from Omeka collections of archives and artifacts, and to connect your maps and narratives with timelines that are more-than-usually sensitive to ambiguity and nuance (See also Omeka).
A free and open-source cross-platform desktop geographic information system that supports viewing, editing, and analysis of geospatial data.
This web-GIS platform allows you to publish your spatial data and QGIS maps on the internet without having to set up a server or infrastructure.
Open-source layers and designs for displaying maps available for reuse. These map layers and styles were created by Stamen, a San Francisco development studio focused on data visualization and map making. Stamen heavily uses OpenStreetMap data for displaying and creating maps and visuals.
A free tool developed by Northwestern University's Knight Lab to help you tell stories on the web that highlight the locations of a series of events.
Written by Douglas Luke, this book introduces modern network analysis techniques in R to social, physical, and health scientists. The mathematical foundations of network analysis are emphasized in an accessible way and readers are guided through the basic steps of network studies: network conceptualization, data collection and management, network description, visualization, and building and testing statistical models of networks.
Cytoscape is an open source bioinformatics software platform for visualizing molecular interaction networks and integrating with gene expression profiles and other state data.
Gephi is a visualization tool that allows users to make colorful graphs and networks from textual data by revealing links between textual objects, social network patterns, and more. Gephi is easy to use and popular among humanists and social scientists alike.
"Introduction to Network Analysis with R" is a blog post and tutorial written by Jesse Sadler. This tutorial begins with a short introduction to the basic vocabulary of network analysis, followed by a discussion of the process for getting data into the proper structure for network analysis and the R packages for network analysis.
This interactive application by Elijah Meeks and Maya Krishnan is designed to provide an overview of various network analysis principles used for analysis and representation. It also provides a few examples of untraditional networks used in digital humanities scholarship.
A series of blog posts by Scott Weingart about Network Analysis.
NodeXL Basic and NodeXL Pro are add-ins for Microsoft® Excel® (2010, 2013, 2016, 365) that support social network and content analysis. NodeXL Basic is available freely and openly to all. It is positioned as a browser for files created with NodeXL Pro which offers advanced features for professional social network and content analysis.
VOSviewer is a software tool for constructing and visualizing bibliometric networks. These networks may for instance include journals, researchers, or individual publications, and they can be constructed based on citation, bibliographic coupling, co-citation, or co-authorship relations. VOSviewer also offers text mining functionality that can be used to construct and visualize co-occurrence networks of important terms extracted from a body of scientific literature.
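To make the idea of a co-authorship network concrete, here is a minimal sketch in Python that turns author lists into a node list and a weighted edge list. It is not how VOSviewer itself is implemented; the publication data and the function name `coauthorship_edges` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical publications, each given as a list of author names.
publications = [
    ["Alice", "Bob", "Carol"],
    ["Alice", "Bob"],
    ["Carol", "Dana"],
]

def coauthorship_edges(pubs):
    """Count how often each unordered pair of authors shares a publication."""
    edges = Counter()
    for authors in pubs:
        # Sort and de-duplicate so ("Alice", "Bob") and ("Bob", "Alice") are one edge.
        for pair in combinations(sorted(set(authors)), 2):
            edges[pair] += 1
    return edges

edges = coauthorship_edges(publications)
nodes = sorted({author for pub in publications for author in pub})

for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

The edge weight here is simply the number of shared publications; a node list and a weighted edge list of this kind are a common starting point for network tools.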
OTHER DH RESOURCES
DH Box is a laboratory in the cloud that can be deployed quickly and easily. It's simply accessed from any computer as long as you've got an internet connection and some contextual knowledge.
Ready-to-go configurations of Omeka, NLTK, IPython, RStudio, and Mallet are included in the DH Box platform. Through this praxis-friendly environment, professors and students have instant classroom access to a cadre of gold-standard DH tools. Professors will be able to launch a DH computer lab in just a few minutes.
Doing Digital Scholarship from the Social Science Research Council (SSRC) Labs offers a self-guided introduction to digital scholarship, designed for digital novices. It allows you to dip a toe into a very large field of practice. It starts with the basics, such as securing web server space, preserving data, and improving your search techniques. It then moves forward to explore different methods used for analyzing data, designing digitally inflected teaching assignments, and creating the building blocks required for publishing digital work.
Humanities Data in R: Exploring Networks, Geospatial Data, Images and Text, by Taylor Arnold and Lauren Tilton and published by Springer, provides a user-friendly beginner's guide to key concepts in the digital humanities by focusing on four major types of humanities data structures: networks, text corpora, geospatial data, and images.
The Programming Historian is a peer-reviewed academic journal of digital humanities and digital history methodology. It publishes tutorials that help humanists learn a wide range of digital tools, techniques, and workflows to facilitate research and teaching.
Created by Dr. William Mattingly, this site is dedicated to all things Python for DH projects, ranging from history to art, and from courses to projects. The courses provided on this site will help you gain an understanding of Python, its syntax, and how to apply it to your own DH project. Begin with the Introduction to Python for DH course before moving into the more advanced task-specific courses.
STANDARDS AND FRAMEWORKS
The IIIF (International Image Interoperability Framework) defines several application programming interfaces that provide a standardised method of describing and delivering images over the web, as well as "presentation based metadata" about structured sequences of images. IIIF is driven by a community of research, national and state libraries, museums, companies and image repositories committed to providing access to high quality image-based resources.
Using JSON-LD, linked data, and standard W3C web protocols such as Web Annotation, IIIF makes it easy to parse and share digitized materials, migrate across technology systems, and provide enhanced image access for scholars and researchers, allowing for:
- Fast, rich, zoom and pan delivery of images
- Manipulation of size, scale, region of interest, rotation, quality and format.
- Annotation - IIIF has native compatibility with the W3C annotation working group’s Web Annotation Data Model, which supports annotating content on the Web. Users can comment on, transcribe, and draw on image-based resources using the Web’s inherent architecture.
- Assemble and use image-based resources from across the Web, regardless of source. Compare pages, build an exhibit, or view a virtual collection of items served from different sites.
- Citing and sharing: IIIF uses stable image URIs, so an image can be cited with confidence or shared for reference by others, or by yourself in a different environment.
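The manipulation options listed above (region, size, rotation, quality, and format) map onto path segments of a IIIF Image API request URL. The following is a minimal sketch in Python, assuming the Image API URL pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The function name `iiif_image_url`, the server address, and the identifier are hypothetical, and the default size `max` assumes Image API 3.0 (version 2 servers use `full` instead).

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an image request URL following the IIIF Image API path pattern."""
    # The identifier must be percent-encoded so that characters such as "/"
    # or spaces are not mistaken for path separators.
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"

# Hypothetical server and identifier: request the top-left 500x500 pixel
# region, scaled to 250 pixels wide and rotated 90 degrees.
url = iiif_image_url(
    "https://example.org/iiif",
    "page 1",
    region="0,0,500,500",
    size="250,",
    rotation="90",
)
print(url)
```

Because every request is just a URL, the same image can be cropped, resized, or rotated by changing one path segment, with no change to the stored source file.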
Below is a list of IIIF compliant viewers and image servers that can be used to show and host IIIF Manifests and Images:
IIIF compliant image viewers:
- Universal Viewer: an open source project to enable cultural heritage institutions to present their digital artifacts in a IIIF-compliant and highly customisable user interface.
- Leaflet IIIF: A Leaflet plugin that enables zoomable IIIF images to be easily and quickly displayed. Leaflet + Leaflet-IIIF weigh in at just 35 KB and include great features like accessible keyboard controls and native touch/mobile support.
- Canvas Panel: A React library to build IIIF Presentation 3 level viewing experiences including support for annotations.
IIIF compliant image servers:
- Cantaloupe: an open-source image server enabling on-demand generation of derivatives of high-resolution source images. With available operations including cropping, scaling, and rotation, it can support deep-zooming image viewers, as well as on-the-fly thumbnail generation.
- IIPImage Server: a high-performance image server for streaming high-resolution images. It supports advanced image features such as 16- and 32-bit color depths, floating point data, CIELAB colorimetric images, and scientific imagery such as multispectral or hyperspectral images. The server is a FastCGI module written in C++ that is designed to be embedded within a host web server such as Apache, Lighttpd, MyServer, or Nginx.
- Loris: an open source, Python-based image server that supports the IIIF Image API version 2.0. Loris supports JPEG and TIFF sources as well as JPEG2000.
- digilib: an open source, Java-based image server for high-resolution images. It supports the IIIF Image API and a native API that also allows brightness, contrast, and color corrections and other operations. digilib supports JPEG, TIFF, PNG, JPEG2000, and other image formats via Java ImageIO. digilib also has a web client that offers continuous zoom, referenceable views, image annotations, and other features for scholarly work.
- Riiif: a Ruby IIIF image server packaged as a Rails engine. Note that Riiif is meant for development convenience and will not scale to the needs of most production-level applications.
TEXT ANALYSIS / TEXT AND DATA MINING
A freeware corpus analysis toolkit for concordancing and text analysis.
The Gale Digital Scholar Lab is an online suite of text analysis, data mining, and data visualization tools that can be used to build, clean, and analyze corpuses from Gale’s Primary Sources, or any texts uploaded. The tools cover document clustering, Named Entity Recognition, Ngrams, parts of speech, sentiment analysis, and topic modeling.
An online search engine that charts the frequencies of any set of search strings using a yearly count of n-grams found in sources printed between 1500 and 2019 in Google's text corpora in English, Chinese, French, German, Hebrew, Italian, Russian, or Spanish.
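The idea of a yearly n-gram count can be sketched in a few lines of Python. This is an illustration of the concept only, not the search engine's actual implementation; the toy corpus, the whitespace tokenization, and the function names are assumptions made for the example.

```python
from collections import Counter, defaultdict

def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def yearly_ngram_counts(docs, n):
    """Map each year to a Counter of the n-grams found in that year's texts."""
    counts = defaultdict(Counter)
    for year, text in docs:
        tokens = text.lower().split()  # naive whitespace tokenization
        counts[year].update(ngrams(tokens, n))
    return counts

# Toy corpus of (year, text) pairs, invented for illustration.
docs = [
    (1850, "the old map and the old book"),
    (1851, "the new map"),
]

counts = yearly_ngram_counts(docs, 2)
for year in sorted(counts):
    print(year, counts[year]["the old"])
```

Charting the printed numbers by year gives the kind of frequency curve such a tool displays, although real systems normalize by the total number of n-grams published each year.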
An integrated system for text modeling making it simple to go from a set of documents to an interactive visualization of Latent Dirichlet allocation (LDA) topic models generated using the InPhO VSM module. More advanced analysis is made possible by a built-in pipeline to Jupyter (iPython) notebooks.
A tutorial by François Dominic Laramée published in The Programming Historian. This tutorial teaches users how to conduct ‘stylometric analysis’ on texts and determine authorship of disputed texts.
Mallet is a Java-based machine learning toolkit that is run from the command line. Though it requires some technical skill to install and run, it can produce powerful results by generating “topics,” or lists of words that frequently appear together in corpora.
A leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.
A visualization tool designed to help researchers explore large collections of text documents through the use of probabilistic topic modeling. SerendipSlim is an updated version of an earlier tool called Serendip, which first appeared in a publication at IEEE VAST 2014. Serendip was created by Eric Alexander and Joe Kohlmann, working as part of the Visualizing English Print project, a cross-disciplinary collaboration of computer scientists and literature scholars interested in bringing the practices of data visualization and statistical analysis to the study of historical documents.
This book explores text-mining techniques in R. Using tidytext, an R package developed by authors Julia Silge and David Robinson, you will learn more about text analysis methods such as sentiment analysis, n-grams, and correlations.
Written by Matthew Jockers, this book provides a practical introduction to computational text analysis using the open source programming language R. Readers begin working with text right away, and each chapter works through a new technique or process so that readers gain broad exposure to core R procedures and a basic understanding of the possibilities of computational text analysis at both the micro and macro scale.
A guide introducing concepts and methodologies for literary text analysis programming. This guide uses Jupyter Notebooks based on the Python scripting language and the Natural Language Toolkit (NLTK).
An open-source, web-based application for performing text and data mining. Developed by Stéfan Sinclair at McGill University and Geoffrey Rockwell at the University of Alberta, Voyant Tools was created to support scholarly reading and interpretation of texts. See Falvey's How-to-Guide on Text Analysis 101: Voyant Tools to learn more.
Preceden is a lightweight timeline maker that helps you quickly create great-looking timelines and project roadmaps. It offers an intuitive web-based interface, export options, and more.
Sutori is a "collaborative instructional and presentation tool for the classroom." It can be used as an alternative to traditional presentations such as PowerPoint or Prezi. The stories can be viewed one panel at a time, like a slideshow, or scrolled through, like a timeline.
Timeline JS is an open-source tool from Northwestern University's Knight Lab that enables anyone to build visually rich, interactive timelines. Beginners can create a timeline using nothing more than a Google spreadsheet.
Timetoast is a web-based tool for creating interactive timelines. Users create a profile and add events to make a timeline. Each event can include text, a photo, and a link.
Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages. Say you’ve found some webpages that display data relevant to your research, such as date or address information, but that do not provide any way of downloading the data directly. Beautiful Soup helps you pull particular content from a webpage, remove the HTML markup, and save the information. It is a tool for web scraping that helps you clean up and parse the documents you have pulled down from the web.
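The task described above (pulling particular content out of a page and discarding the markup) can be sketched as follows. To keep the example runnable without installing anything, it uses Python's standard-library `html.parser` rather than Beautiful Soup itself, which offers a considerably friendlier interface for the same job. The HTML snippet and the class name `CellTextExtractor` are hypothetical.

```python
from html.parser import HTMLParser

class CellTextExtractor(HTMLParser):
    """Collect the text of every <td> cell, dropping the markup."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
            self.cells.append("")  # start a new, empty cell

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.cells[-1] += data  # nested tags such as <b> are ignored

# A made-up fragment standing in for a downloaded web page.
html = "<table><tr><td>1848-03-12</td><td>12 <b>Main</b> St.</td></tr></table>"

parser = CellTextExtractor()
parser.feed(html)
parser.close()

cells = [cell.strip() for cell in parser.cells]
print(cells)
```

The extracted values are plain strings with no HTML left in them, ready to be written to a spreadsheet or CSV file.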
Twarc is a command line tool and Python library for archiving Twitter JSON data. Each tweet is represented as a JSON object that is exactly what was returned from the Twitter API. Twarc will handle the Twitter API's rate limits for you. In addition to letting you collect tweets, twarc can also help you collect users and trends and hydrate tweet IDs. Twarc was developed as part of the Documenting the Now project, which was funded by the Mellon Foundation.
TAGS is a free Google Sheet template that lets you set up and run automated collection of search results from Twitter. Developed by Martin Hawksey, TAGS allows you to collect tweets from up to seven days in the past as well as tweets posted going forward. Specifically, TAGS allows you to collect tweets from specific users and gather tweets under a certain hashtag.
Developed by Michael Kearney, rtweet is a package used by R programmers to extract data from Twitter's REST and streaming APIs.
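Once tweets have been archived as JSON, they can be read back with Python's standard library. The sketch below assumes the archive stores one JSON object per line; the two sample records are made up, and the field names (`id_str`, `full_text`, `user.screen_name`) are assumptions about the tweet object that should be checked against your own data.

```python
import json

# Two invented records standing in for an archive file, one JSON object per line.
sample_jsonl = "\n".join([
    json.dumps({"id_str": "1", "full_text": "First example",
                "user": {"screen_name": "alice"}}),
    json.dumps({"id_str": "2", "full_text": "Second example",
                "user": {"screen_name": "bob"}}),
])

def read_tweets(jsonl_text):
    """Yield one dictionary per non-empty line of line-delimited JSON."""
    for line in jsonl_text.splitlines():
        if line.strip():
            yield json.loads(line)

# Keep only the fields needed for analysis.
rows = [
    (tweet["id_str"], tweet["user"]["screen_name"], tweet["full_text"])
    for tweet in read_tweets(sample_jsonl)
]
print(rows)
```

Reading line by line keeps memory use low, which matters when an archive contains many thousands of tweets.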
WEBSITES, CONTENT MANAGEMENT SYSTEMS (CMS), AND BLOGGING
Concrete5 is an open-source content management system for publishing content on the World Wide Web and intranets. Concrete5 is designed for ease of use, for users with a minimum of technical skills. It enables users to edit site content directly from the page.
Drupal is a free and open-source web content management framework written in PHP and distributed under the GNU General Public License. Drupal provides a back-end framework for at least 12% of the top 10,000 websites worldwide – ranging from personal blogs to corporate, political, and government sites.
If your project site is published from a private or internal repository owned by an organization using GitHub Enterprise Cloud, you can manage access control for the site. For more information, see "Changing the visibility of your GitHub Pages site."
To get started, see "Creating a GitHub Pages site."
Squarespace is a service that offers bundled web-hosting and customizable website templates. They offer a student discount for the first year.
Wix.com provides cloud-based web development services. It allows users to create HTML5 websites and mobile sites through the use of online drag and drop tools.
WordPress is a content management system written in PHP that features a plugin architecture and a template system, referred to within WordPress as Themes, for creating websites. WordPress was originally created as a blog publishing system but has evolved to support other web content types, including more traditional mailing lists, media galleries, membership sites, and online stores. Two versions of WordPress are listed below:
WordPress.org is where you'll find the free WordPress software that you can download and install on your own web server to create a website.
WordPress.com, on the other hand, offers web hosting for you. You don't have to download software, pay for hosting, or manage a web server. You will, however, be required to create an account on WordPress.com, and many website features are paid upgrades.
VIRTUAL REALITY, AUGMENTED REALITY, AND 3D MODELING
A-Frame is an open-source web framework for building virtual reality experiences. It is maintained by developers from Supermedium and Google. A-Frame is an entity component system framework for Three.js where developers can create 3D and WebVR scenes using HTML and CSS.
Agisoft Metashape, previously named Agisoft PhotoScan, is stand-alone photogrammetry software. Its features include photogrammetric triangulation, point cloud data, measurements of distances, volumes, and areas, and 3D model generation and texturing. It is useful for applications such as cultural heritage documentation and visual effects production.
Blender is a free and open source 3D creation suite. It supports the entirety of the 3D pipeline: modeling, rigging, animation, simulation, rendering, compositing, and motion tracking, as well as video editing and a 2D animation pipeline.
Autodesk Maya, commonly shortened to just Maya, is a 3D computer graphics application that runs on Windows, macOS and Linux, originally developed by Alias Systems Corporation and currently owned and developed by Autodesk.
Mozilla Hubs is a place where you can get together with friends online in a virtual-social space. With a single click, you can create a virtual 3D space and invite others to join you using a URL.
ReCap Photo is an Autodesk 360 service designed to create high resolution 3D data from photos to enable users to visualize and share 3D data. By leveraging the power of the cloud to process and store massive data files, users can upload images on Autodesk 360 and instantly create a 3D mesh model.
Sketchfab is a platform to publish, share, discover, buy and sell 3D, VR and AR content. It provides a viewer based on the WebGL and WebXR technologies that allows users to display 3D models on the web, to be viewed on any mobile browser, desktop browser or Virtual Reality headset.
Tinkercad is a free, online 3D modeling program that runs in a web browser and is known for its simplicity and ease of use.
Unity lets users create games and experiences in both 2D and 3D. The engine offers a primary scripting API in C#, used both for Unity editor plugins and for games themselves, as well as drag-and-drop functionality.