Commit f2a3819

Merge pull request #3 from openize-com/muhammadumar-patch
MarkItDown v25.6.0: Gemini & Mistral LLM Integration, CLI Upgrades, and Cleaner API
2 parents 54db5f6 + ea7361d commit f2a3819

6 files changed: +187 -57 lines changed

README.md

Lines changed: 44 additions & 23 deletions
@@ -4,12 +4,12 @@
 ![License](https://img.shields.io/badge/license-MIT-green)
 ![Status](https://img.shields.io/badge/status-alpha-orange)

-Openize.MarkItDown for Python is a package that converts documents into Markdown format. It supports multiple file formats, provides flexible output handling, and integrates with LLMs for extended processing.
+Openize.MarkItDown for Python is a package that converts documents into Markdown format. It supports multiple file formats, provides flexible output handling, and integrates with LLMs for extended processing including OpenAI, Claude, Gemini, and Mistral.

 ## Features

 - Convert `.docx`, `.pdf`, `.xlsx`, and `.pptx` to Markdown.
-- Save Markdown files locally or send them to an LLM for processing.
+- Save Markdown files locally or send them to an LLM for processing (OpenAI, Claude, Gemini, Mistral).
 - Structured with the **Factory & Strategy Pattern** for scalability.
 - Works with Windows and Linux-compatible paths.
 - Command-line interface for easy use.
@@ -24,17 +24,23 @@ This package depends on the Aspose libraries, which are commercial products:

 You'll need to obtain valid licenses for these libraries separately. The package will install these dependencies, but you're responsible for complying with Aspose's licensing terms.

+LLM support requires valid API keys and potentially the following dependencies:
+
+- `openai` for OpenAI
+- `anthropic` for Claude
+- `requests` for Gemini and Mistral REST APIs
+
 ## Installation

 ### From TestPyPI

-```sh
+```bash
 pip install openize-markitdown-python
 ```

 ### From Source

-```sh
+```bash
 git clone https://github.com/openize-com/openize-markitdown-python.git
 cd openize-markitdown-python\packages\markitdown
 pip install -e . --verbose
@@ -44,12 +50,15 @@ pip install -e . --verbose

 ### Command Line Interface

-```sh
+```bash
 # Convert a file and save locally
 markitdown document.docx -o output_folder

-# Process with an LLM (requires OPENAI_API_KEY environment variable)
-markitdown document.docx -o output_folder --insert_into_llm
+# Process with an LLM (requires corresponding API key)
+markitdown document.docx -o output_folder --llm openai
+markitdown document.docx -o output_folder --llm claude
+markitdown document.docx -o output_folder --llm gemini
+markitdown document.docx -o output_folder --llm mistral
 ```

 ### Python API
@@ -61,50 +70,62 @@ from openize.markitdown.core import MarkItDown
 input_file = "report.pdf"
 output_dir = "output_markdown"

-# Create MarkItDown instance
-converter = MarkItDown(output_dir)
+# Create MarkItDown instance with desired LLM
+converter = MarkItDown(output_dir, llm_client_name="mistral")

 # Convert document and send output to LLM
-converter.convert_document(input_file, insert_into_llm=True)
+converter.convert_document(input_file)

 print("Conversion completed and data sent to LLM.")
 ```

 ## Environment Variables

-- `ASPOSE_LICENSE_PATH`: Required when using the Aspose Paid APIs. This should be set to the full path of your Aspose license file.
-- `OPENAI_API_KEY`: Required when using the `insert_into_llm=True` option or the `--llm` flag.
-- `OPENAI_MODEL`: Specifies the OpenAI model name (default: `gpt-4`).
+| Variable | Description |
+|-----------------------|--------------------------------------------------------------------|
+| `ASPOSE_LICENSE_PATH` | Path to Aspose license file (required if using paid features) |
+| `OPENAI_API_KEY` | API key for OpenAI integration |
+| `OPENAI_MODEL` | (Optional) Model name for OpenAI (default: `gpt-4`) |
+| `CLAUDE_API_KEY` | API key for Claude integration |
+| `CLAUDE_MODEL` | (Optional) Model name for Claude (default: `claude-v1`) |
+| `GEMINI_API_KEY` | API key for Gemini integration |
+| `GEMINI_MODEL` | (Optional) Model name for Gemini (default: `gemini-pro`) |
+| `MISTRAL_API_KEY` | API key for Mistral integration |
+| `MISTRAL_MODEL` | (Optional) Model name for Mistral (default: `mistral-medium`) |

-To set these variables:
+### Setting Environment Variables

-For Unix-based systems:
+**Unix-based systems:**

 ```bash
 export ASPOSE_LICENSE_PATH="/path/to/license"
-export OPENAI_API_KEY="your-api-key"
-export OPENAI_MODEL="gpt-4"
+export OPENAI_API_KEY="your-openai-key"
+export CLAUDE_API_KEY="your-claude-key"
+export GEMINI_API_KEY="your-gemini-key"
+export MISTRAL_API_KEY="your-mistral-key"
 ```

-For Windows (PowerShell):
+**Windows (PowerShell):**

 ```powershell
 $env:ASPOSE_LICENSE_PATH = "C:\path\to\license"
-$env:OPENAI_API_KEY = "your-api-key"
-$env:OPENAI_MODEL = "gpt-4"
+$env:OPENAI_API_KEY = "your-openai-key"
+$env:CLAUDE_API_KEY = "your-claude-key"
+$env:GEMINI_API_KEY = "your-gemini-key"
+$env:MISTRAL_API_KEY = "your-mistral-key"
 ```

-## Contributing
+## Contributing

-We appreciate your interest in contributing to this project! To ensure a smooth collaboration, please follow these steps when submitting a pull request:
+We appreciate your interest in contributing to this project! To ensure a smooth collaboration, please follow these steps when submitting a pull request:

 1. **Fork & Clone** – Fork the repository and clone it to your local machine.
 2. **Create a Branch** – Use a new branch for your contribution.
 3. **Sign the Contributor License Agreement (CLA)** – Before your first contribution can be accepted, you must sign our CLA via [CLA Assistant](https://cla-assistant.io). You will be prompted to sign it when submitting your first pull request. You can also review the CLA here: [https://cla.openize.com/agreement](https://cla.openize.com/agreement).
 4. **Submit a Pull Request (PR)** – Once your changes are ready, open a PR with a clear description.
 5. **Review & Feedback** – Our maintainers will review your PR and provide feedback if needed.

-By contributing, you agree to the terms of the CLA and confirm that your changes comply with the project's licensing policies.
+By contributing, you agree to the terms of the CLA and confirm that your changes comply with the project's licensing policies.

 ## License
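
The README now pairs each `--llm` / `llm_client_name` value with its own API key variable. A minimal sketch of a pre-flight check built on those documented variable names follows; the helper `missing_api_key` and the `API_KEY_VARS` mapping are hypothetical illustrations, not part of the package.

```python
import os

# Provider name -> API key variable, taken from the README's variable table.
API_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "claude": "CLAUDE_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "mistral": "MISTRAL_API_KEY",
}


def missing_api_key(provider, environ=None):
    """Return the name of the unset API key variable, or None if it is set.

    Hypothetical helper for illustration only. Raises ValueError for
    providers the README does not list.
    """
    environ = os.environ if environ is None else environ
    try:
        var_name = API_KEY_VARS[provider]
    except KeyError:
        raise ValueError(f"Unknown LLM client: {provider}") from None
    # An empty string counts as unset, matching the clients' `if not api_key`.
    return None if environ.get(var_name) else var_name


if __name__ == "__main__":
    missing = missing_api_key("mistral")
    if missing:
        print(f"Set {missing} before running with --llm mistral")
```

Running a check like this before constructing `MarkItDown(output_dir, llm_client_name=...)` would surface a missing key as a clear message rather than a `ValueError` raised from inside the client constructor.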

packages/markitdown/README.md

Lines changed: 42 additions & 30 deletions
@@ -4,12 +4,12 @@
 ![License](https://img.shields.io/badge/license-MIT-green)
 ![Status](https://img.shields.io/badge/status-alpha-orange)

-Openize.MarkItDown for Python converts documents into Markdown format. It supports multiple file formats, provides flexible output handling, and integrates with LLMs for extended processing.
+Openize.MarkItDown for Python converts documents into Markdown format. It supports multiple file formats, provides flexible output handling, and integrates with popular LLMs for post-processing, including OpenAI, Claude, Gemini, and Mistral.

 ## Features

 - Convert `.docx`, `.pdf`, `.xlsx`, and `.pptx` to Markdown.
-- Save Markdown files locally or send them to an LLM for processing.
+- Save Markdown files locally or send them to an LLM (OpenAI, Claude, Gemini, Mistral).
 - Structured with the **Factory & Strategy Pattern** for scalability.
 - Works with Windows and Linux-compatible paths.
 - Command-line interface for easy use.
@@ -24,73 +24,85 @@ This package depends on the Aspose libraries, which are commercial products:

 You'll need to obtain valid licenses for these libraries separately. The package will install these dependencies, but you're responsible for complying with Aspose's licensing terms.

-## Installation
+LLM integration may require the following additional packages or valid API credentials:
+
+- `openai` (for OpenAI)
+- `anthropic` (for Claude)
+- `requests` (used for Gemini and Mistral REST APIs)

-### From TestPyPI
+## Installation

-```sh
+```bash
 pip install openize-markitdown-python
 ```

 ## Usage

 ### Command Line Interface

-```sh
+```bash
 # Convert a file and save locally
 markitdown document.docx -o output_folder

-# Process with an LLM (requires OPENAI_API_KEY environment variable)
-markitdown document.docx -o output_folder --insert_into_llm
+# Process with an LLM (requires appropriate API key)
+markitdown document.docx -o output_folder --llm openai
+markitdown document.docx -o output_folder --llm claude
+markitdown document.docx -o output_folder --llm gemini
+markitdown document.docx -o output_folder --llm mistral
 ```

 ### Python API

 ```python
 from openize.markitdown.core import MarkItDown

-# Define input file and output directory
 input_file = "report.pdf"
 output_dir = "output_markdown"

-# Create MarkItDown instance
-converter = MarkItDown(output_dir)
-
-# Convert document and send output to LLM
-converter.convert_document(input_file, insert_into_llm=True)
-
-print("Conversion completed and data sent to LLM.")
+converter = MarkItDown(output_dir, llm_client_name="gemini")
+converter.convert_document(input_file)

+print("Conversion completed and data sent to Gemini.")
 ```

 ## Environment Variables

-- `ASPOSE_LICENSE_PATH`: Required when using the Aspose Paid APIs. This should be set to the full path of your Aspose license file.
-- `OPENAI_API_KEY`: Required when using the `insert_into_llm=True` option or the `--llm` flag.
-- `OPENAI_MODEL`: Specifies the OpenAI model name (default: `gpt-4`).
+The following environment variables are used to control license and LLM access:

-To set these variables:
+| Variable | Description |
+|---------------------|------------------------------------------------------------|
+| `ASPOSE_LICENSE_PATH` | Required to activate Aspose license (if using paid APIs) |
+| `OPENAI_API_KEY` | Required for OpenAI integration |
+| `OPENAI_MODEL` | (Optional) OpenAI model name (default: `gpt-4`) |
+| `CLAUDE_API_KEY` | Required for Claude integration |
+| `CLAUDE_MODEL` | (Optional) Claude model name (default: `claude-v1`) |
+| `GEMINI_API_KEY` | Required for Gemini integration |
+| `GEMINI_MODEL` | (Optional) Gemini model name (default: `gemini-pro`) |
+| `MISTRAL_API_KEY` | Required for Mistral integration |
+| `MISTRAL_MODEL` | (Optional) Mistral model name (default: `mistral-medium`) |

-For Unix-based systems:
+### Setting Environment Variables

+**Unix-based (Linux/macOS):**
 ```bash
 export ASPOSE_LICENSE_PATH="/path/to/license"
-export OPENAI_API_KEY="your-api-key"
-export OPENAI_MODEL="gpt-4"
+export OPENAI_API_KEY="your-openai-key"
+export CLAUDE_API_KEY="your-claude-key"
+export GEMINI_API_KEY="your-gemini-key"
+export MISTRAL_API_KEY="your-mistral-key"
 ```

-For Windows (PowerShell):
-
+**Windows PowerShell:**
 ```powershell
 $env:ASPOSE_LICENSE_PATH = "C:\path\to\license"
-$env:OPENAI_API_KEY = "your-api-key"
-$env:OPENAI_MODEL = "gpt-4"
+$env:OPENAI_API_KEY = "your-openai-key"
+$env:CLAUDE_API_KEY = "your-claude-key"
+$env:GEMINI_API_KEY = "your-gemini-key"
+$env:MISTRAL_API_KEY = "your-mistral-key"
 ```

 ## License

 This package is licensed under the MIT License. However, it depends on Aspose libraries, which are proprietary, closed-source libraries.

-⚠️ Users must obtain a valid license for Aspose libraries separately. This repository does not include or distribute any proprietary components.
-
-
+⚠️ You must obtain valid licenses for Aspose libraries separately. This repository does not include or distribute any proprietary components.
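
The CLI source is not part of this diff, so the following is only a hypothetical `argparse` sketch of a parser that would accept the invocations documented in the README (`markitdown document.docx -o output_folder --llm gemini`). The function name `build_parser`, the long option `--output-dir`, and the help strings are assumptions.

```python
import argparse

# Provider names documented for the --llm flag in this commit's README.
LLM_CHOICES = ["openai", "claude", "gemini", "mistral"]


def build_parser():
    """Build a parser mirroring the documented CLI (illustrative only)."""
    parser = argparse.ArgumentParser(prog="markitdown")
    parser.add_argument("input_file", help="Document to convert to Markdown")
    parser.add_argument("-o", "--output-dir", default=None,
                        help="Folder that receives the Markdown output")
    # Leaving --llm unset means "convert and save locally only".
    parser.add_argument("--llm", choices=LLM_CHOICES, default=None,
                        help="Send the converted Markdown to this LLM")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.input_file, args.output_dir, args.llm)
```

Restricting `--llm` with `choices` rejects an unsupported provider at parse time, before any conversion work starts.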

packages/markitdown/setup.cfg

Lines changed: 3 additions & 2 deletions
@@ -1,7 +1,7 @@

 [metadata]
 name = openize-markitdown-python
-version = 25.5.0
+version = 25.6.0

 author = Openize
 author_email = [email protected]
@@ -33,7 +33,8 @@ install_requires =
     aspose-cells-python>=23.0.0
     aspose-slides>=23.0.0
     openai>=1.0.0
-    anthropic>=3.0.0
+    anthropic>=0.3.11
+    requests>=2.25.0 # Needed for Gemini and Mistral HTTP API calls

 [options.packages.find]
 where = src

packages/markitdown/setup.py

Lines changed: 4 additions & 0 deletions
@@ -21,6 +21,10 @@ def install_if_missing(package, module_name=None):
     ("aspose-slides", "asposeslides"),
     ("openai", "openai"),
     ("anthropic", "anthropic"),
+    ("requests", "requests"), # Required for Gemini/Mistral REST API
+    # Optional SDKs (uncomment if using them instead of raw HTTP)
+    # ("google-generativeai", "google.generativeai"),
+    # ("mistralai", "mistralai"),
 ]

 # Install missing dependencies before proceeding
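
The body of `install_if_missing` is not shown in this hunk. As a rough sketch of how the `(package, module)` pairs above could be checked without importing or installing anything, the hypothetical helper `find_missing` below uses `importlib.util.find_spec`; it is an illustration under that assumption, not the project's implementation.

```python
import importlib.util


def find_missing(requirements):
    """Return the pip package names whose import module cannot be located.

    `requirements` is a list of (pip_package, module_name) pairs, the same
    shape as the list in setup.py. Hypothetical helper for illustration.
    """
    missing = []
    for package, module_name in requirements:
        try:
            spec = importlib.util.find_spec(module_name)
        except ImportError:
            # find_spec imports parent packages for dotted names such as
            # "google.generativeai" and raises if the parent is absent.
            spec = None
        if spec is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    print(find_missing([("requests", "requests"), ("mistralai", "mistralai")]))
```

The `except ImportError` branch matters for the commented-out `google.generativeai` entry, since a dotted module name triggers an import of its parent package.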

packages/markitdown/src/openize/markitdown/llm_strategy.py

Lines changed: 85 additions & 0 deletions
@@ -81,6 +81,86 @@ def process(self, md_file):
         except Exception as e:
             logging.exception(f"Unexpected error processing {md_file}: {e}")

+class GeminiClient(LLMStrategy):
+    def __init__(self):
+        self.api_key = os.getenv("GEMINI_API_KEY")
+        self.model = os.getenv("GEMINI_MODEL", "gemini-pro")
+
+        if not self.api_key:
+            raise ValueError("Missing Gemini API key. Please set it in the environment.")
+
+        self.api_url = f"https://generativelanguage.googleapis.com/v1beta/models/{self.model}:generateContent"
+
+    def process(self, md_file):
+        try:
+            import requests
+
+            with open(md_file, "r", encoding="utf-8") as file:
+                content = file.read()
+
+            headers = {
+                "Content-Type": "application/json",
+                "Authorization": f"Bearer {self.api_key}"
+            }
+            data = {
+                "contents": [
+                    {"parts": [{"text": content}]}
+                ]
+            }
+
+            response = requests.post(self.api_url, headers=headers, json=data)
+            response.raise_for_status()
+
+            reply = response.json()
+            text = reply["candidates"][0]["content"]["parts"][0]["text"]
+            logging.info(f"Gemini Response for {md_file}: {text}")
+
+        except FileNotFoundError:
+            logging.error(f"Markdown file not found: {md_file}")
+        except Exception as e:
+            logging.exception(f"Gemini processing error: {e}")
+class MistralClient(LLMStrategy):
+    def __init__(self):
+        self.api_key = os.getenv("MISTRAL_API_KEY")
+        self.model = os.getenv("MISTRAL_MODEL", "mistral-medium")
+
+        if not self.api_key:
+            raise ValueError("Missing Mistral API key. Please set it in the environment.")
+
+        self.api_url = "https://api.mistral.ai/v1/chat/completions"
+
+    def process(self, md_file):
+        try:
+            import requests
+
+            with open(md_file, "r", encoding="utf-8") as file:
+                content = file.read()
+
+            headers = {
+                "Authorization": f"Bearer {self.api_key}",
+                "Content-Type": "application/json"
+            }
+            data = {
+                "model": self.model,
+                "messages": [
+                    {"role": "system", "content": "Process this Markdown content."},
+                    {"role": "user", "content": content}
+                ]
+            }
+
+            response = requests.post(self.api_url, headers=headers, json=data)
+            response.raise_for_status()
+
+            result = response.json()
+            message = result["choices"][0]["message"]["content"]
+            logging.info(f"Mistral Response for {md_file}: {message}")
+
+        except FileNotFoundError:
+            logging.error(f"Markdown file not found: {md_file}")
+        except Exception as e:
+            logging.exception(f"Mistral processing error: {e}")
+
+
 class LLMFactory:
     @staticmethod
     def get_llm(client_name: str) -> LLMStrategy:
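
One point worth checking in `GeminiClient`: Google's Gemini REST documentation shows API keys being sent in an `x-goog-api-key` header (or a `key` query parameter), whereas `Authorization: Bearer ...` is the form used for OAuth access tokens, so the header above may be rejected when `GEMINI_API_KEY` holds a plain API key. The sketch below keeps the same URL and payload shape as the diff and only changes the auth header; the helper names `build_gemini_request` and `extract_gemini_text` are hypothetical, and the network call itself is left as a comment.

```python
def build_gemini_request(api_key, model, content):
    """Return (url, headers, payload) for a generateContent call.

    Hypothetical helper: same endpoint and body as GeminiClient in the diff,
    but with the API key in the x-goog-api-key header.
    """
    url = (
        "https://generativelanguage.googleapis.com/v1beta/models/"
        f"{model}:generateContent"
    )
    headers = {
        "Content-Type": "application/json",
        "x-goog-api-key": api_key,
    }
    payload = {"contents": [{"parts": [{"text": content}]}]}
    return url, headers, payload


def extract_gemini_text(reply):
    """Pull the first candidate's text out of a decoded JSON reply."""
    return reply["candidates"][0]["content"]["parts"][0]["text"]


# Sending the request (not executed here; requires the `requests` package):
#   url, headers, payload = build_gemini_request(key, "gemini-pro", markdown)
#   response = requests.post(url, headers=headers, json=payload, timeout=30)
#   response.raise_for_status()
#   text = extract_gemini_text(response.json())
```

The `timeout=30` value in the commented call is an arbitrary choice; `requests` applies no timeout by default, so the `requests.post` calls in both new clients can block indefinitely on a stalled connection.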
@@ -89,5 +169,10 @@ def get_llm(client_name: str) -> LLMStrategy:
             return OpenAIClient()
         elif client_name == "claude":
             return ClaudeClient()
+        elif client_name == "gemini":
+            return GeminiClient()
+        elif client_name == "mistral":
+            return MistralClient()
         else:
             raise ValueError(f"Unknown LLM client: {client_name}")
+
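
Each new provider adds another `elif` branch to `get_llm`. A possible alternative, sketched here with stand-in classes rather than the real clients, is a dictionary registry that maps names to classes; `_REGISTRY` and the stub classes are hypothetical and only illustrate the dispatch.

```python
class LLMStrategy:
    """Stand-in for the project's strategy base class."""

    def process(self, md_file):
        raise NotImplementedError


class GeminiClient(LLMStrategy):
    """Stub used only to demonstrate dispatch; not the real client."""


class MistralClient(LLMStrategy):
    """Stub used only to demonstrate dispatch; not the real client."""


# Name -> class. Adding a provider becomes a one-line change here.
_REGISTRY = {
    "gemini": GeminiClient,
    "mistral": MistralClient,
}


def get_llm(client_name):
    """Instantiate the strategy registered under `client_name`."""
    try:
        client_cls = _REGISTRY[client_name]
    except KeyError:
        # Same error message as the if/elif version in the diff.
        raise ValueError(f"Unknown LLM client: {client_name}") from None
    return client_cls()
```

The registry keeps the error behavior of the existing factory while making the supported names available as `_REGISTRY.keys()`, for example to populate CLI choices.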
