HIST4805 Artificial Intelligence in/and History

[Image: robot archivist]

Maybe this isn’t an image of a robot archivist. Maybe it’s a robot student. Still generated with DALL-E though.

Schedule

On this page you’ll find the schedule, readings, links to materials, and general instructions.

Part One is the Fall 2024 Term.

Part Two is the Winter 2025 Term.

General Structure of Any Given Session

Each session will open with twenty to thirty minutes of freewriting. Try to identify the things that perplexed you in the readings before you came to class, and that you would like to talk about now. Revisit what you wrote in previous classes to connect with what you’re thinking now. These pieces of writing will be collected each week into a digital garden. At the end of this term we will use this writing in conjunction with an AI to examine what it is we, as a group, think about AI.

In weeks with readings, designated students will take us through them. When it is your turn, use the guidance under Reading Like A Predator and Leading a Session to direct your attention and to build up your general summary of a piece. Oh, see that bar down the right-hand side of the screen? That’s the Hypothes.is toolbar, and you can use it to annotate webtext if you join our reading group and install the related plugin (for sites that don’t have it installed).

Everyone will have already read the piece, so you shouldn’t need to go into overly deep detail in your summary. Your job is to draw connections for us to other things you have read or experienced (whether in this class or elsewhere) and to highlight the elements that are critical in your view. Your job is to tell us the questions that occurred to you, the things that didn’t make sense to you, the things you think we need to explore as a group. Never say, ‘What do you guys think?’ Rather, tie the critical elements expressly to the doing of history. For instance: “When person X discusses the problem of ‘hallucination’, how does this challenge the authority of a historian? Where are the sources of authority in this context?”

See the updates for detailed instructions about free writing, session leading, and other sundry matters.

Hey! I hear you. Isn’t AI kinda naff, now? Maybe. The hype bubble has certainly been something to behold. So should we still care about all this? When bubbles deflate, what remains still needs to be reckoned with, so read the piece by Widder & Hicks soon!

Widder, DG and Hicks, M. 2024 Watching the Generative AI Hype Bubble Deflate. arxiv.link.


Part One

If podcasts are your thing, you can supplement pretty much any week below with an episode of Mystery AI Hype Theater 3000 by Emily Bender and Alex Hanna. If you do, let me know.

☞ Week 1 The Dream of AI

…and why LLMs are models of culture and history first and foremost. You will come to class already having had a look at the readings. In this session I will introduce the main themes of the course and my expectations, we’ll find out about your expectations, and we’ll play with a simple language model and wonder: how can this represent intelligence? And if it doesn’t, then what does it represent?


Activities

  • Intro activity: Have you used GPT or similar, and for what?
  • Overview of this course and what we’re trying to do
  • Markov Chains (a nice explanation here; don’t worry about the code section). Let’s play.
  • Setting up Obsidian.md for our freewriting/digital garden (we’ll want to set up some file-naming conventions). You might prefer Tangent to Obsidian as a note-making app, since it has a more restricted feature set and, perhaps, a gentler learning curve. The main thing is to have a plain-text note-making app that allows you to generate connections.
  • Determine session leaders for next week (who will take us through the readings below for next week)
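If you do want to peek under the hood of the Markov chain activity, here is a minimal sketch in Python. It is not the code from the linked explanation; the function names `build_chain` and `generate` and the tiny sample corpus are made up for illustration. It records which words follow which, then strings words together by picking a recorded follower at random.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the list of words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=12, seed=None):
    """Start from a random state and repeatedly pick a recorded follower."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))
    output = list(state)
    for _ in range(length):
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state was never followed by anything
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Hypothetical sample text; swap in anything you like.
corpus = (
    "the archive remembers what the historian forgets "
    "and the historian remembers what the archive hides"
)
chain = build_chain(corpus)
print(generate(chain, length=10, seed=4805))
```

Every word the sketch produces was already in the sample text, and every pair of neighbouring words was already a pair there. Keep that in mind when we ask what, exactly, such a model represents.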

Readings

  • Tiffany Chan, Author Function, ‘Context’ section. 2017.
  • Karawynn Long. “Language is a Poor Heuristic for Intelligence”. Nine Lives blog, June 26, 2023. link
  • Allison Parrish. Language models can only write poetry. 2021.
  • Ted Underwood. “Mapping the Latent Spaces of Culture” Startwords 3: Parrots. August 1, 2022. original, archived.
  • Angie Wang. “Is My Toddler a Stochastic Parrot?”. The New Yorker, November 15, 2023. link

I expect everyone to have these read for next week. For the rest of this course, if a reading is given for a particular week, I expect you to have read that work before class.


☞ Week 2 Flavours of AI

…culminating in neural networks, transformers, and attention. We’ll start with the cyberneticists. And the Dartmouth meeting of 1956 (proposed in 1955). We’ll talk about ‘intelligence’ and what a problem that is.

Activities

  • Freewriting: given what we saw last week, and given what you read for this week, where are you starting from? What do you want to know? What questions have already emerged for you? (Write in chunks; interlink as appropriate)
  • Session Leaders: take us through your article
  • Graham: ‘The Last Decade Has Been Wild’
  • Intro to AI for GLAM: a ‘Software Carpentries’ lesson on AI. We’ll explore their materials and, since their lesson is a work-in-progress, we’ll contribute by offering feedback; we’ll see how far we get.
  • Determine session leaders for next week (who will take us through the readings for next week)

Bonus

McCarthy, Minsky, Rochester & Shannon. “A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence” (1955)


☞ Week 3 Earlier Antecedents and A Digression into the Longer History of Representing Information as Bits

In this week, I want to talk about the longer history of encoding ‘information’ or ‘knowledge’ outside of human brains (and why information might not be the same as knowledge). This’ll be a deep dive, back to the origins of writing, but will also include a whirlwind tour of the foundations of digital computing.

Activities

  • Freewriting; plant in our digital garden. I’ll walk you through this. Ask yourself: what do I really know about how computers work? For me as a historian, does it matter?
  • SG on information theory, Claude Shannon, Alan Turing, and a potted history of information from earliest times
  • Session Leaders on the Readings
  • Discussion
  • AI Forecasting Challenge by Nicholas Carlini
  • Determine session leaders for next week

Readings

  • Emily Bender. “Thought experiment in the National Library of Thailand”. Medium, May 24, 2023. link
  • Thomas Haigh. “Conjoined Twins: Artificial Intelligence and the Invention of Computer Science.” Communications of the ACM 66, no. 6 (May 24, 2023): 33–37. link; our library.
  • Thomas Haigh. “There Was No ‘First AI Winter.’” Communications of the ACM 66, no. 12 (November 17, 2023): 35–39. link; our library.
  • Stephen Wolfram. “What is ChatGPT Doing… And Why Does It Work?”. Writings blog, February 14, 2023. link. This is a very long piece; read the sections ‘It’s Just Adding One Word at a Time’; ‘Where Do the Probabilities Come From?’; and ‘So…What is ChatGPT Doing, and Why Does It Work’. Go deeper if you’re so inclined.

☞ Week 4 Eliza and the Illusion of Intelligence

You’ve no doubt heard about the ‘imitation game’? That was Turing’s answer to the problem of how we can know whether something is ‘intelligent’. But Turing actually sidesteps the question…

Activities

  • Freewriting; plant in our digital garden. If you had access to a truly ‘intelligent’ machine, what would it do for you as a historian? What would it do to you as a historian? What are the issues, as you see them?
  • SG on the problem of ‘intelligence’ and the troubling connections with, well, everything
  • Session Leaders on the readings
  • Discussion
  • Have a conversation with Eliza.
  • Determine session leaders for next week

Readings


☞ Week 5 The OpenAI Civil War of 4.20 pm November 17 2023 - 1.14 am November 22 2023

For one strange weekend in November of 2023, the competing philosophies, governance, business models, and imagined dangers of AI dominated the news. The real dangers of AI never got a look in at all…

Activities

  • Freewriting; plant in our digital garden. Quickly search a major daily newspaper (eg, Washington Post, The Guardian) for stories about AI. Characterise the discourse: how is AI framed? Look at Carleton University or the University of Ottawa websites, and do the same thing again. How much trouble are we in?
  • SG on how we talk about AI and why that matters
  • Session Leaders on the readings
  • Discussion
  • Determine session leaders for next week

Readings


☞ Week 6 The corpuses/corpses of AI

Collections, copyright, intellectual property, labour: the actual dangers of AI and why the Luddites are right. And oh boy, ‘publish or perish’ is tailor-made for text generation abuse.

Activities

  • Let’s play Semantle
  • Freewriting. How do you organize information?
  • SG on datasets, power, & Luddites
  • Session Leaders on the readings
  • Discussion
  • Determine session leaders for next week

Readings


☞ Week 7 Current Deployments of AI in the GLAM sector

We take a brisk look at current uses of AI in various parts of the galleries, libraries, archives, and museums sector. What issues do practitioners identify?

This was the interface for the Met Museum’s ‘Gen Studio’ which no longer exists, though the source code is at https://github.com/microsoft/GenStudio.

Activities

  • Freewriting: imagine a local museum. How would you deploy some sort of ‘AI’ technology to enhance visitor experience? To connect communities? To better understand the collection? Or…?
  • Session Leaders on the readings
  • Discussion
  • SG Explains the Plan for the Second Part of this Semester

FALL READING WEEK

Use this time to get acquainted with what comes next. In these ‘hands on’ sessions, we will as a group try our hands at a variety of approaches. Some of these might work easily, some might take more effort. Sometimes you might need to set yourself up in small teams, sometimes you might be able to work entirely on your own. It will depend on the constellation of interests and abilities in the classroom. For each session, I will set the scene for what we are trying to do, I will draw attention to the elements that I think are most interesting, and I will provide whatever other resources might be necessary. Depending on how things go, we might try more complex things than are listed here; there will be a bit of flexibility over these next few weeks. Watch for updates.

It is the process that matters, not the final result or ‘completion’ of a task. The best work will exhibit critical curiosity about each stage in the process, and will chase down the implications.

You will document your process, your findings, your questions, your concerns, your roadblocks. At all times, you’ll be thinking about the bigger picture: what does this mean for the doing of history? What does this mean for the work of a historian? What does this imply for public engagement with history?

You’ll submit your documentation to me each week as per usual, as text files written in Markdown. I will plant these in our digital garden.

Finally, make yourself acquainted with this document:


☞ Week 8 Hands ON: Training your own GPT-2 model

First, we’ll try our hand at a version of GPT-2 (from 2019) that can run in a spreadsheet. Windows users will be at an advantage here, so maybe pair up. “Spreadsheets are all you need.ai”. Click on the link for the Excel binary to download the spreadsheet; also watch the video.

Then, using Google Colab and following along with this tutorial from the Programming Historian by former Carleton student Chantal Brousseau, you will train and interrogate your own large language model. Some of these datasets might be useful.

What kinds of questions might this approach be useful for? Find a corpus of materials. Share resources. Document your choices. Document your training. Document your results.


☞ Week 9 Hands ON: Reading an AI Image

Last week, you selected your own data and trained your own LLM to explore and experiment with what patterns you might find, and to think through what it might mean for historical research. This week, we’ll work backwards from images to try to understand the underlying dataset. Can we, as Eryk Salvaggio has argued, understand them as infographics? And if so, what might we learn about the representation of Canadian history in these models?

Following his loose methodology, select moments from Canadian history and let’s work out what is going on in the dataset that powers Craiyon or NightCafe. Other options are possible. Do not pay for anything.

Again, document everything.


☞ Week 10 Hands ON: CLIP

(addition, Sept 19: This piece by Arnold & Tilton on computer vision and the MET collection is worth talking about.)

We’ll use the LLM package to install the CLIP multi-modal combined image and text model (into Colab again). We’ll then create a corpus of historical imagery and project these images into the model, to create a kind of image search engine. Then we’ll use Salvaggio’s method to explore the results.
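The ‘image search engine’ idea rests on one move: every image and every query becomes a list of numbers (an embedding), and we rank images by how closely their numbers point in the same direction as the query’s. Here is a toy sketch of that ranking step only. It does not call CLIP or the LLM package; the file names and the three-number vectors are invented for illustration, whereas real embeddings come from the model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, image_vectors, top_k=2):
    """Rank image names by similarity to the query, most similar first."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in image_vectors.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical embeddings: invented numbers standing in for model output.
image_vectors = {
    "rideau_canal_1900.jpg": [0.9, 0.1, 0.0],
    "parliament_fire_1916.jpg": [0.1, 0.9, 0.2],
    "lumber_raft_1880.jpg": [0.8, 0.2, 0.1],
}

# Hypothetical embedding of a text query such as "boats on the water".
query = [1.0, 0.0, 0.0]
print(search(query, image_vectors))
```

When you document your process, note that nothing in this ranking ‘knows’ what a canal is; it only measures closeness between vectors, which is why the training data behind those vectors matters so much.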

Google Colab notebooks are shared with you in ‘sandbox’ mode. You can save a copy to your own Google drive. You can share the resulting notebooks by clicking on ‘file’ > ‘download’ > ‘download .ipynb’ and then dropping the file into the course share webpage. Add comment cells as appropriate to document your process and thoughts.

Alternatively, you can give Teachable Machines a whirl: make a historical imagery classifier.


☞ Week 11 Hands ON: LLMs as Natural Language Text Processors

We’ll use OCR software and the LLM package again to transcribe images of text and handwriting, then correct errors. The LLM will handle error correction and then organize the corrected text into structured data.

Or we’ll try, at least.

As an aside, an interesting thing you can do with structured data, once you have it, is project it into a knowledge graph embedding space. Here’s a case study where we used GPT-3 to do that to generate ‘hot tips’ for the antiquities trade.


☞ Week 12 Hands ON: LLM as a Research Assistant

We’ll look at the GPT-Researcher project, and then develop our own version (again, using the LLM package), aimed at the Chronicling America newspaper database. Maybe we can build something for use with Canadiana.ca.


Part Two (Winter 2025)

Precise meeting dates are to be determined. Meeting 1 will happen during the first week of the winter term. In the intervals between formal meetings, your groups will use our scheduled class time to come together to work. That can happen in the class space, or elsewhere as suits you. You can of course come together more often if you need to.

You will arrive in the first session having already built a few things. Over the holiday you should have been sketching out ideas about what you want to do.

☞ Meeting 1: What is an ethical and useful thing to do with AI in/for History?

This will be an unconference style workshop. You’ll throw ideas on the board about what you want to do. There’ll be plenty of work to go around; you’ll figure out broadly a topic/approach you want to explore, and come together with other students who want to explore the same ideas. The desired outcome here is that three to five teams will emerge. Ideally each team should have someone on it who is game for getting into the digital weeds (ie, at least one of you should be prepared to do a bit of the ‘coding’, keeping in mind the things I will have been saying the entire time about what that actually means). You will leave the meeting fired up about what you want to understand.


☞ Meeting 2: Backwards Design

For this meeting, you’ll have ready some mockups for the group of what you want to achieve. But you will not present your own work. Rather, you’ll spend the first portion of the class looking at one of the other teams’ work. You will read their materials and your group will present those instead. You will present what you think the end result of their ideal project implementation will be: we start at the end, and then design backwards to figure out how to get there. You’ll highlight potentials and possible perils. Groups whose work is being presented will listen quietly and take notes. This exercise will reveal to you things that you may have missed because you are too close to your project. After the presentation, each group will identify the three main things they’ve learned from hearing others’ interpretation of the work, and will lay out their initial ideas on how to address these.

Mockups can include visual layouts, research outlines, narrative prose, and much more besides. You are welcome to use AI tools to help you think these things through - a paradata document will include all prompts and models consulted.


☞ Meeting 3: Work In Progress (mid term check in)

This meeting will involve each group giving a progress report on how their project is going. Each group should include a discussion of what’s going well, what’s a current issue that is being solved, new opportunities for the project they’ve identified, and any issues that present a serious problem to the work. The class as a whole will listen carefully, and suggest issues to think about or possible solutions.


☞ Meeting 4: Paradata In Progress

The paradata that documents the process of the project as it currently stands should be made available to the group before we meet. In this meeting, other groups will try to replicate aspects of your project from the paradata, to see if there are any elements missing or not fully explicated. Paradata should also relate the process of doing whatever it is you’re doing to broader discussions (literature) on historical method. You may use AI to reverse outline your work; a reverse outline helps you see if you’ve missed important connective tissue in what you are creating. This is just a suggestion. But again, any such use must detail the prompts, models, and iterations in the paradata (you can see things getting recursive quite quickly if you’re not careful).


☞ Meetings 5 and 6

These two meetings, towards the end of the term, will be an opportunity for each group to show off their finished project. Presentations need to carefully situate both the process of building the thing and what the thing itself implies for the practice of history.


☞ Meeting 7: The Writing Of The Book

This meeting will be a book sprint where we collate our materials into a handbook for historians and publish it online. All projects will also be made available through GitHub, with appropriate supporting materials.


☞ All materials for Part 2 are formally due on the last day of term.