
jenul-ferdinand/md2data


md2data: Markdown → Structured Data


A high-performance tool that parses Markdown documents and converts them into structured data formats: JSON, YAML, XML, or TOML. Built with Rust for speed and reliability, with bindings for Node.js and Python. Available as both a Rust library and a CLI tool.

How It Works

md2data processes a document in three stages:

Markdown (input)
   ↓
Parser (Markdown → AST)
   ↓
Serializer (AST → JSON/YAML/TOML/XML)

Markdown is parsed into an Abstract Syntax Tree (AST) using pulldown-cmark, a CommonMark-compliant parser written in Rust. The AST is then serialized to your chosen output format using serde.
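
To make the parse-then-serialize flow concrete, here is a minimal sketch of the same pipeline in plain Python. It is not md2data's implementation: it handles only headings and paragraphs, leaves inline markup untouched, and the node shape (`type`, `level`, `text`, `children`) is an assumption chosen for illustration, not the schema md2data emits.

```python
import json
import re


def parse_markdown(text):
    """Parse a tiny Markdown subset (ATX headings and paragraphs) into a toy AST."""
    nodes = []
    paragraph = []  # lines of the paragraph currently being collected

    def flush_paragraph():
        # Emit the collected lines as one paragraph node, if there are any.
        if paragraph:
            nodes.append({"type": "paragraph", "text": " ".join(paragraph)})
            paragraph.clear()

    for line in text.splitlines():
        heading = re.match(r"^(#{1,6})\s+(.*)$", line)
        if heading:
            flush_paragraph()
            nodes.append({
                "type": "heading",
                "level": len(heading.group(1)),
                "text": heading.group(2).strip(),
            })
        elif line.strip():
            paragraph.append(line.strip())
        else:
            flush_paragraph()  # a blank line ends the current paragraph
    flush_paragraph()
    return {"type": "document", "children": nodes}


def serialize(ast, fmt="json"):
    """Serialize the toy AST; only JSON is implemented in this sketch."""
    if fmt != "json":
        raise ValueError(f"unsupported format in this sketch: {fmt}")
    return json.dumps(ast, indent=2)


ast = parse_markdown("# Hello World\n\nThis is a **markdown** document.")
print(serialize(ast))
```

Keeping the parser and the serializer as separate functions mirrors the diagram above: adding another output format only touches the serialization step.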

Architecture

The core parsing and serialization logic can be found in crates/md2data.

The bindings for Python and Node.js are in bindings/python (PyO3 & maturin) and bindings/node (napi-rs).

Features

  • Multiple output formats: Convert Markdown to JSON, YAML, TOML, or XML.
  • High performance: Written in Rust with zero-cost abstractions.
  • Python and Node.js support: Bindings are available for both languages.
  • Structured output: Generates a clean data representation of your Markdown.

Getting Started

Installation

📦 Cargo

cargo install md2data

🦀 Manual installation for Rust

git clone https://github.com/jenul-ferdinand/md2data.git
cd md2data
cargo build --release

⚡ Node.js

npm install md2data

🐍 Python

pip install md2data

Usage

As a Rust library
use md2data::{convert_str, OutputFormat, ParsingMode};

fn main() {
    let markdown = r#"# Hello World

This is a **markdown** document."#;

    // Convert the same input to two different output formats.
    let json = convert_str(markdown, OutputFormat::Json, ParsingMode::Minified).unwrap();
    let yaml = convert_str(markdown, OutputFormat::Yaml, ParsingMode::Minified).unwrap();

    println!("{json}\n{yaml}");
}
On the command line
# Convert to JSON (default)
md2data input.md

# Specify output format
md2data input.md --format yaml
md2data input.md --format toml
md2data input.md --format xml

# Read from stdin
echo "# Hello World" | md2data - --format json

# Output to file
md2data input.md --format json -o output.json
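
If you need to call the CLI from a script, the commands above can be wrapped as follows. This is a sketch that uses only the flags shown in this README (`--format` and `-o`). The helper names `build_command` and `run_md2data` are hypothetical, and it assumes the converted document is written to stdout when `-o` is omitted.

```python
import subprocess


def build_command(input_path, fmt="json", output_path=None):
    """Build an argv list using only the flags shown in the examples above."""
    cmd = ["md2data", input_path, "--format", fmt]
    if output_path is not None:
        cmd += ["-o", output_path]
    return cmd


def run_md2data(input_path, fmt="json", output_path=None):
    """Run the CLI and return its stdout; requires md2data on PATH.

    Assumption: without -o, the converted document is printed to stdout.
    """
    result = subprocess.run(
        build_command(input_path, fmt, output_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


print(build_command("input.md", "yaml"))
# → ['md2data', 'input.md', '--format', 'yaml']
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.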
In a Node.js script
const { convert } = require('md2data');

const markdown = `
# Hello World

This is a **markdown** document.
`;

const json = convert(markdown, 'json');
const yaml = convert(markdown, 'yaml');
const toml = convert(markdown, 'toml');
const xml = convert(markdown, 'xml');

console.log(json);
In a Python script
from md2data import convert

markdown = """
# Hello World

This is a **markdown** document.
"""

json_output = convert(markdown, 'json')
yaml_output = convert(markdown, 'yaml')
toml_output = convert(markdown, 'toml')
xml_output = convert(markdown, 'xml')

print(json_output)
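
A common follow-up is writing every format to disk in one pass. The sketch below assumes `convert(markdown, format)` returns a string, as the `print` call above suggests. The helper `convert_all` is hypothetical and takes the conversion function as a parameter, so the example runs with a stand-in when md2data is not installed.

```python
import tempfile
from pathlib import Path

FORMATS = ("json", "yaml", "toml", "xml")


def convert_all(markdown, convert, out_dir, stem="output"):
    """Convert one Markdown string to every format, writing one file per format.

    `convert` is any callable with the (markdown, format) signature used above.
    Assumption: it returns the converted document as a str.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for fmt in FORMATS:
        path = out_dir / f"{stem}.{fmt}"
        path.write_text(convert(markdown, fmt), encoding="utf-8")
        written.append(path)
    return written


# Stand-in for md2data.convert so the sketch runs without the package installed.
def fake_convert(markdown, fmt):
    return f"{fmt}: {len(markdown)} characters"


with tempfile.TemporaryDirectory() as tmp:
    print([p.name for p in convert_all("# Hello World", fake_convert, tmp)])
```

With the package installed, pass the real function instead: `convert_all(markdown, convert, "out")` after `from md2data import convert`.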

Output Examples

See docs/examples for example outputs.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Development Setup

# Clone the repository
git clone https://github.com/jenul-ferdinand/md2data.git
cd md2data

# Build all components
cargo build

# Run tests
cargo test

# Build Node.js binding
cd bindings/node
npm install
npm run build

# Build Python binding (the previous step left us in bindings/node)
cd ../python
maturin develop

License

MIT License - see LICENSE file for details

Acknowledgments

  • Built with pulldown-cmark for Markdown parsing
  • Inspired by existing tools like md2json and markdown-to-json
  • Designed to provide better multi-format support and cross-language compatibility
  • Classic EmojiOne (now JoyPixels) emojis used for the logo design

Made with Rust 🦀
