Commit 065733f

Merge branch 'main' of https://github.com/Redliana/cmm-data
# Conflicts:
#	.github/ISSUE_TEMPLATE/bug_report.md
#	.github/ISSUE_TEMPLATE/feature_request.md
#	.gitignore
#	CONTRIBUTING.md
#	LICENSE
#	README.md
2 parents c8d9e74 + 11aab38 commit 065733f

77 files changed

Lines changed: 14218 additions & 460 deletions

Lines changed: 15 additions & 25 deletions
@@ -1,42 +1,32 @@
 ---
-name: Bug Report
-about: Report a bug in the preprocessing pipeline or data
+name: Bug report
+about: Create a report to help us improve
 title: '[BUG] '
 labels: bug
 assignees: ''
 ---
 
-## Bug Description
+**Describe the bug**
 A clear and concise description of what the bug is.
 
-## Steps to Reproduce
-1. Go to '...'
-2. Run command '...'
+**To Reproduce**
+Steps to reproduce the behavior:
+1. Import '...'
+2. Call function '...'
 3. See error
 
-## Expected Behavior
+**Expected behavior**
 A clear and concise description of what you expected to happen.
 
-## Actual Behavior
-What actually happened, including error messages.
-
-## Error Output
+**Error message**
 ```
-Paste any error messages or stack traces here
+Paste the full error message here
 ```
 
-## Environment
-- OS: [e.g., macOS 14.0, Ubuntu 22.04]
-- Python version: [e.g., 3.11.5]
-- Repository version/commit: [e.g., v1.0.0 or commit hash]
-
-## Data Files Affected
-List any specific data files or commodities affected:
-- [ ] MCS 2025 Industry Trends
-- [ ] Salient Commodity Data
-- [ ] World Production Data
-- [ ] MRDS Deposits
-- [ ] Unified Corpus
+**Environment:**
+- OS: [e.g., macOS 14.0, Ubuntu 22.04]
+- Python version: [e.g., 3.11.0]
+- cmm_data version: [e.g., 0.1.0]
 
-## Additional Context
+**Additional context**
 Add any other context about the problem here.
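
Note on the new **Environment** section: the three fields it asks for can be gathered programmatically. The following is a minimal sketch, not part of this commit. It assumes the installed distribution is named `cmm-data` (the name used with `pip install` in the release template); the helper names `collect_environment` and `format_environment` are hypothetical.

```python
import platform
from importlib import metadata


def collect_environment(dist_name: str = "cmm-data") -> dict:
    """Gather the three fields the bug report template asks for.

    `dist_name` is an assumption: it should be the name the package
    was installed under with pip.
    """
    try:
        pkg_version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The package is not installed in this interpreter's environment.
        pkg_version = "not installed"
    return {
        "OS": platform.platform(),
        "Python version": platform.python_version(),
        "cmm_data version": pkg_version,
    }


def format_environment(env: dict) -> str:
    """Render the fields as the Markdown bullets used in the template."""
    return "\n".join(f"- {key}: {value}" for key, value in env.items())


if __name__ == "__main__":
    print(format_environment(collect_environment()))
```

Running the script prints three bullet lines that can be pasted directly under **Environment:** in a new issue.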
Lines changed: 19 additions & 20 deletions
@@ -1,31 +1,30 @@
 ---
-name: Feature Request
-about: Suggest a new feature or enhancement
+name: Feature request
+about: Suggest an idea for this project
 title: '[FEATURE] '
 labels: enhancement
 assignees: ''
 ---
 
-## Feature Description
-A clear and concise description of the feature you'd like.
+**Is your feature request related to a problem? Please describe.**
+A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
 
-## Use Case
-Explain the problem this feature would solve or the use case it enables.
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
 
-## Proposed Solution
-Describe how you envision this feature working.
+**Describe alternatives you've considered**
+A clear and concise description of any alternative solutions or features you've considered.
 
-## Alternatives Considered
-Describe any alternative solutions or features you've considered.
+**Additional context**
+Add any other context or screenshots about the feature request here.
 
-## Category
-What area does this feature relate to?
-- [ ] Data preprocessing
-- [ ] New data source integration
-- [ ] Fine-tuning pipeline
-- [ ] Evaluation/benchmarking
-- [ ] Documentation
+**Dataset/Loader affected**
+- [ ] USGSCommodityLoader
+- [ ] USGSOreDepositsLoader
+- [ ] OSTIDocumentsLoader
+- [ ] PreprocessedCorpusLoader
+- [ ] GAChronostratigraphicLoader
+- [ ] NETLREECoalLoader
+- [ ] OECDSupplyChainLoader
+- [ ] Visualizations
 - [ ] Other
-
-## Additional Context
-Add any other context, mockups, or examples here.

.github/RELEASE_TEMPLATE.md

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
+# CMM Data v0.1.0
+
+**Critical Minerals Modeling Data Access Library** - Initial Release
+
+## Highlights
+
+- Unified API for accessing 7 critical minerals datasets
+- 80+ mineral commodities from USGS Mineral Commodity Summaries
+- 3,298 preprocessed documents for LLM training
+- Built-in visualizations for production charts
+- Pip-installable with optional geo/viz dependencies
+
+## Installation
+
+```bash
+pip install cmm-data
+
+# With all features
+pip install "cmm-data[full]"
+```
+
+## Quick Start
+
+```python
+import cmm_data
+
+# View available datasets
+print(cmm_data.get_data_catalog())
+
+# Load lithium production data
+df = cmm_data.load_usgs_commodity("lithi", "world")
+print(df[['Country', 'Prod_t_est_2022', 'Reserves_t']])
+
+# List critical minerals
+print(cmm_data.list_critical_minerals())
+```
+
+## Data Loaders
+
+| Loader | Dataset | Records |
+|--------|---------|---------|
+| `USGSCommodityLoader` | USGS MCS 2023 | 80+ commodities |
+| `USGSOreDepositsLoader` | USGS NGDB | 356 fields |
+| `PreprocessedCorpusLoader` | CMM Corpus | 3,298 documents |
+| `OECDSupplyChainLoader` | OECD/IEA | Trade & policy data |
+| `GAChronostratigraphicLoader` | Geoscience Australia | 9 surfaces |
+| `NETLREECoalLoader` | NETL | REE in coal |
+| `OSTIDocumentsLoader` | DOE OSTI | Technical reports |
+
+## Requirements
+
+- Python 3.9+
+- pandas >= 2.0.0
+- numpy >= 1.24.0
+
+## Documentation
+
+- [README](https://github.com/PNNL-CMM/cmm-data#readme)
+- [API Reference](https://pnnl-cmm.github.io/cmm-data/)
+- [Examples](https://github.com/PNNL-CMM/cmm-data/tree/main/examples)
+
+## What's Changed
+
+* Initial release of CMM Data package
+* 7 data loaders for critical minerals datasets
+* Visualization module with production charts
+* Comprehensive documentation and examples
+* GitHub Actions CI/CD workflows
+* Pre-commit hooks for code quality
+
+**Full Changelog**: https://github.com/PNNL-CMM/cmm-data/commits/v0.1.0
+
+---
+
+*Developed by Pacific Northwest National Laboratory (PNNL)*

.github/workflows/docs.yml

Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,63 @@
1+
name: Documentation
2+
3+
on:
4+
push:
5+
branches: [ main ]
6+
paths:
7+
- 'docs/**'
8+
- 'src/**'
9+
- '.github/workflows/docs.yml'
10+
workflow_dispatch:
11+
12+
# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
13+
permissions:
14+
contents: read
15+
pages: write
16+
id-token: write
17+
18+
# Allow only one concurrent deployment
19+
concurrency:
20+
group: "pages"
21+
cancel-in-progress: false
22+
23+
jobs:
24+
build:
25+
runs-on: ubuntu-latest
26+
steps:
27+
- name: Checkout
28+
uses: actions/checkout@v4
29+
30+
- name: Set up Python
31+
uses: actions/setup-python@v5
32+
with:
33+
python-version: '3.11'
34+
35+
- name: Install dependencies
36+
run: |
37+
python -m pip install --upgrade pip
38+
pip install -e ".[viz]"
39+
pip install -r docs/requirements.txt
40+
41+
- name: Build Sphinx documentation
42+
run: |
43+
cd docs
44+
make html
45+
46+
- name: Setup Pages
47+
uses: actions/configure-pages@v4
48+
49+
- name: Upload artifact
50+
uses: actions/upload-pages-artifact@v3
51+
with:
52+
path: 'docs/_build/html'
53+
54+
deploy:
55+
environment:
56+
name: github-pages
57+
url: ${{ steps.deployment.outputs.page_url }}
58+
runs-on: ubuntu-latest
59+
needs: build
60+
steps:
61+
- name: Deploy to GitHub Pages
62+
id: deployment
63+
uses: actions/deploy-pages@v4
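
Note on the `paths` filter in `docs.yml`: a push to `main` starts this workflow only if at least one changed file matches one of the three listed patterns (a manual `workflow_dispatch` run bypasses the filter). The sketch below approximates that rule so the behavior is easier to reason about. It is an illustration under assumptions, not GitHub's matcher: it relies on `fnmatch`, whose `*` also matches `/`, and the function name and the example file paths are hypothetical.

```python
from fnmatch import fnmatchcase

# Path filters copied from the `on.push.paths` section of docs.yml.
DOCS_PATH_FILTERS = ["docs/**", "src/**", ".github/workflows/docs.yml"]


def push_triggers_docs_build(changed_files, filters=DOCS_PATH_FILTERS):
    """Return True if at least one changed file matches a path filter.

    Approximation only: fnmatch's `*` matches any characters including
    `/`, which is close to how `**` behaves in these three patterns but
    is not GitHub's actual filter implementation.
    """
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in filters
    )


if __name__ == "__main__":
    # Hypothetical change sets for illustration.
    print(push_triggers_docs_build(["docs/index.rst"]))
    print(push_triggers_docs_build(["README.md"]))
```

Under this approximation, a commit touching only files outside `docs/`, `src/`, and the workflow file itself would not rebuild the documentation.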

.github/workflows/publish.yml

Lines changed: 73 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,73 @@
1+
name: Publish Package
2+
3+
on:
4+
release:
5+
types: [published]
6+
workflow_dispatch:
7+
8+
jobs:
9+
build:
10+
runs-on: ubuntu-latest
11+
steps:
12+
- name: Checkout
13+
uses: actions/checkout@v4
14+
15+
- name: Set up Python
16+
uses: actions/setup-python@v5
17+
with:
18+
python-version: '3.11'
19+
20+
- name: Install build dependencies
21+
run: |
22+
python -m pip install --upgrade pip
23+
pip install build twine
24+
25+
- name: Build package
26+
run: python -m build
27+
28+
- name: Check package
29+
run: twine check dist/*
30+
31+
- name: Upload build artifacts
32+
uses: actions/upload-artifact@v4
33+
with:
34+
name: dist
35+
path: dist/
36+
37+
# Publish to Test PyPI first (optional)
38+
publish-test:
39+
runs-on: ubuntu-latest
40+
needs: build
41+
if: github.event_name == 'workflow_dispatch'
42+
environment: test-pypi
43+
permissions:
44+
id-token: write
45+
steps:
46+
- name: Download artifacts
47+
uses: actions/download-artifact@v4
48+
with:
49+
name: dist
50+
path: dist/
51+
52+
- name: Publish to Test PyPI
53+
uses: pypa/gh-action-pypi-publish@release/v1
54+
with:
55+
repository-url: https://test.pypi.org/legacy/
56+
57+
# Publish to PyPI on release
58+
publish:
59+
runs-on: ubuntu-latest
60+
needs: build
61+
if: github.event_name == 'release'
62+
environment: pypi
63+
permissions:
64+
id-token: write
65+
steps:
66+
- name: Download artifacts
67+
uses: actions/download-artifact@v4
68+
with:
69+
name: dist
70+
path: dist/
71+
72+
- name: Publish to PyPI
73+
uses: pypa/gh-action-pypi-publish@release/v1
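
Note on job gating in `publish.yml`: `build` always runs, while the two publish jobs are selected by the `if:` conditions on `github.event_name`. The short sketch below restates that logic in Python to make the two paths explicit. It is only a model of the conditions shown in the workflow; the function name is hypothetical, and it assumes the `build` job succeeds (both publish jobs declare `needs: build`).

```python
def jobs_for_event(event_name: str) -> list:
    """Return the publish.yml jobs expected to run for a trigger event.

    Assumes `build` succeeds; a failed build would skip both publish jobs.
    """
    jobs = ["build"]  # build has no `if:` condition
    if event_name == "workflow_dispatch":
        jobs.append("publish-test")  # manual run: upload to Test PyPI
    if event_name == "release":
        jobs.append("publish")  # published release: upload to PyPI
    return jobs


if __name__ == "__main__":
    for event in ("workflow_dispatch", "release"):
        print(event, "->", jobs_for_event(event))
```

In other words, a manual run never reaches PyPI, and a published release never goes through Test PyPI.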
