I built the original Rest Is History Interrogator in 2023, partly as an experiment to see how effective code generation could be for building hobby projects. It proved very effective, but the original version ended up being a bit of a mess and hard to maintain. I built simple AI functionality into that version but didn't release it because the API costs at the time were too high to justify. Since then, the cost of AI inference has fallen dramatically, to the point where it's reasonable to make the functionality available. It's still a fairly naive RAG (retrieval-augmented generation) implementation, but it works well enough.
This version is a mostly complete rewrite of the original and a demonstration of how much easier it is to build applications with the tools that are now available. Code generation is now sufficiently effective that large parts of application functionality can be built without writing any code at all. Indeed, there are parts of the application code that I've barely even looked at.
We are very nearly at the point where any individual can build tools that make their life or job easier, which is a very exciting prospect.
I'm Matthew. For me, as for many others, podcasts have mostly replaced the radio as the primary way I listen to audio for entertainment, professional information and news. There are a few podcasts, including The Rest Is History, for which I have listened to many if not all episodes.
Some, like EconTalk, have transcripts available to read and search, but most do not. This makes it difficult to discover whether someone said what you think they did, or told a story the way you remember it. Building a searchable database of transcripts of any podcast I might fancy has long been a back-burner project which, like most of my back-burner projects, I didn't have any real expectation of doing. It's not that I couldn't do it, but that the knowledge I would need to acquire wasn't all that useful to me outside of the specific project, so the investment of time was difficult to justify.
That changed when large language models (LLMs) started to become effective ways to help build - or just build - code. The project became more manageable still when OpenAI released Whisper, which can produce high-quality transcriptions of audio in many languages on consumer-grade computer hardware. Then Georgi Gerganov worked minor miracles to port that to C++, dramatically increasing the performance and hence reducing the compute and time required to transcribe. His whisper.cpp project meant I could get 400 episodes of The Rest Is History transcribed using 3 computers (an old i7 server, an i5 Mac Mini and an M1 MacBook Air) in my house over about 72 hours. As this was winter, I even got to keep my office warm by doing it...
The combination of whisper.cpp, GitHub Copilot and the original ChatGPT got the first version of the application built much faster than I could have done it myself, and without having to spend money on transcription services from Google Cloud or AWS. With the launch of GPT-4, it's even easier to write code that works well enough to be useful. Anyone who builds web-based applications professionally will see from the quality of the HTML that I don't.
I've now added some further functionality to complement the simple search. I've used the OpenAI embeddings service to create a better (or at any rate different) version of the search, which allows for more free-form (semantic, in the industry jargon) searches. The original would find nothing for a search of "the wrong shoes affecting global politics" or "Cromwell's behaviour in Ireland", but the new version does. While this is an improvement, it's really just the groundwork to allow an LLM to search the podcast and answer questions about it. That already works rather well, but I need to figure out how to avoid it costing me a fortune.
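For anyone curious what a semantic search over embeddings boils down to, here is a minimal sketch rather than the code the site actually runs. It assumes every transcript chunk has already been turned into a vector by an embeddings service, and it ranks chunks by cosine similarity to the query's vector. The function names, the chunk texts and the tiny three-number vectors are all made up for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def semantic_search(query_vec, indexed_chunks, top_k=3):
    """Rank (text, vector) pairs by similarity to query_vec, best first."""
    scored = [
        (cosine_similarity(query_vec, vec), text)
        for text, vec in indexed_chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical index: in practice the vectors would come from an
# embeddings service, not be typed in by hand.
TOY_INDEX = [
    ("hypothetical chunk about Cromwell in Ireland", [0.9, 0.1, 0.0]),
    ("hypothetical chunk about footwear and diplomacy", [0.1, 0.9, 0.1]),
    ("hypothetical chunk about something unrelated", [0.0, 0.2, 0.9]),
]

# Stand-in for the embedding of a query like "Cromwell's behaviour in Ireland".
toy_query = [0.8, 0.2, 0.0]

for score, text in semantic_search(toy_query, TOY_INDEX, top_k=2):
    print(f"{score:.3f}  {text}")
```

Because the match is on the direction of the vectors rather than on shared words, a query can find a passage that never uses any of the query's terms, which is exactly where a plain keyword search comes up empty.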
This whole thing has been a bit of fun and an experiment in learning how much a below-average developer can get done using the tools that are now available to us. I hope you find it useful.