A notebook that parses a MediaWiki XML export from Cinemorgue to extract actor/actress names and movie titles.
Cinemorgue is an encyclopedia dedicated to documenting which actors and actresses "died" in which movie or TV show. It started as a separate website, but the documentation effort proved too big a job for one person, so the wiki was born, where everyone is allowed to submit additions and corrections directly.
So if you've ever wondered "Has so-and-so ever done a death scene?" or "What's that movie where what'sisname kills such-and-such?", then this index will strive to answer those questions.
This notebook produces a CSV from the wiki's .xml export and focuses only on movies. Each row contains the actor name, movie title, year, and cause of death.
If you are just looking for the cleaned data, it's in the output folder!
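The XML-to-CSV step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. It assumes that each page title is the actor's name and that death entries follow a simplified line format such as `* ''Movie Title'' (1999) [''Role'']: cause`; real Cinemorgue wikitext may differ. The names `LINE_RE`, `iter_pages`, `parse_line`, `extract_rows`, and `write_csv` are hypothetical helpers introduced only for this example.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, Iterator, Optional, Tuple

# ASSUMPTION: a simplified entry such as
#   * ''Movie Title'' (1999) [''Role'']: Shot by the villain.
# Real pages may use links or templates and need a more forgiving pattern.
LINE_RE = re.compile(
    r"^\*\s*'{2,5}(?P<title>.+?)'{2,5}\s*\((?P<year>\d{4})\)[^:]*:\s*(?P<cause>.+)$"
)
FIELDS = ["actor", "movie", "year", "cause"]


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix ElementTree attaches to tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source) -> Iterator[Tuple[str, str]]:
    """Yield (page title, wikitext) pairs from a MediaWiki XML export."""
    title, text = "", ""
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text or ""
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = "", ""
            elem.clear()  # release the finished page's subtree


def parse_line(actor: str, line: str) -> Optional[Dict[str, str]]:
    """Turn one wikitext line into a row, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "actor": actor,  # ASSUMPTION: the page title is the actor's name
        "movie": match.group("title"),
        "year": match.group("year"),
        "cause": match.group("cause").strip(),
    }


def extract_rows(source) -> Iterator[Dict[str, str]]:
    """Yield one row per matching line across all pages in the export."""
    for actor, text in iter_pages(source):
        for line in text.splitlines():
            row = parse_line(actor, line)
            if row is not None:
                yield row


def write_csv(rows: Iterable[Dict[str, str]], out) -> None:
    """Write rows to an open text file object with a header line."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# A made-up, one-page export used only to exercise the sketch.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Jane Doe</title>
    <revision>
      <text>* ''Example Movie'' (1999) [''Hero'']: Shot in the chest.
Not a list line.</text>
    </revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    buffer = io.StringIO()
    write_csv(extract_rows(io.BytesIO(SAMPLE)), buffer)
    print(buffer.getvalue())
```

`iter_pages` also accepts a file path, so the same functions could be pointed at the downloaded dump. Streaming with `iterparse` keeps memory use low on a large export, at the cost of slightly more bookkeeping than loading the whole tree.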
Make sure you have uv installed. You can install it using pip:
pip install uv
The project requires Python 3.9. You have two options:
- Using pyenv (recommended for managing Python versions):
# Install and set Python 3.9 as the project version
pyenv install 3.9
pyenv local 3.9
- Without pyenv:
  - Manually install Python 3.9 from python.org
  - Ensure python --version shows 3.9.x before proceeding
- Download the "Current Pages" xml dump at the bottom of the page from https://cinemorgue.fandom.com/wiki/Special:Statistics. Add the file to the root of the directory.
- Clone the repo
git clone https://github.com/MDeanLindsay/Cinemorgue.git
cd Cinemorgue
- Create and activate virtual environment
uv venv
source .venv/bin/activate  # On Unix/MacOS
# OR
.\.venv\Scripts\activate  # On Windows
- Install dependencies
uv pip install -r requirements.lock
If you need to update dependencies:
- Edit requirements.txt to add/remove packages
- Generate new lock file:
uv pip compile requirements.txt -o requirements.lock
- Install from new lock file:
uv pip install -r requirements.lock
After installation, start Jupyter Lab:
jupyter lab
Open Cinemorgue.ipynb to get the party started.