diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
index 5560b24..189dc3d 100644
--- a/CODE_OF_CONDUCT.md
+++ b/CODE_OF_CONDUCT.md
@@ -61,7 +61,7 @@ representative at an online or offline event.

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
-[https://kiranparajuli.com.np][contact method].
+[https://kiranparajuli.com.np](https://kiranparajuli.com.np).
All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the
diff --git a/README.md b/README.md
index 20b783b..bc70042 100644
--- a/README.md
+++ b/README.md
@@ -1,122 +1,110 @@
A very lightweight Markdown Parser powered by Regex
```
- 2. An indent of 4 can make the block a code
-
- For a codeblock inside a list
- it should be indented at least twice.
-3. **List:**
- - Ordered: `{any digit}. Item 1`
- - Unordered: `-|+|* Item 1`
- - Checklist: `-|+|*|{digit}. [ ] Item 1` (Can be _ordered_ or _unordered_)
- - Can be Lazy
- - Items must have the same intent to be included in the same list
- - Allows other _paragraph items_ inside content (recursive lex and parsing)
-4. **Quote:**
- - Levels: 0 to infinity
- - Lines must be the same indent to be included within the same quote
- - Allows other _paragraph items_ inside content (recursive lex and parsing)
- - Can be Lazy
-5. **Image:**
- - Link: String (Required)
- - Alt text: String (Required)
- - Title: String (Optional)
- - Width: Number (Optional)
- - Height: Number (Optional)
- - Indentation: NOT IMPLEMENTED YET
-6. **Comment:**
- - Lexer contains it
- - Parser also contains it
- - Example: ``
-7. **Line:**
- - Defined as: `---`
- - Consecutive lines are merged into one
-8. **Table:**
- - Equal number of cell counts
- - Equal number of indentations
- - Cell content should allow emphasis
- - Table heading is optional
- - Table heading is separated by `|---|,|:--:|`
-9. **Newline:**
- - Consecutive newlines are merged into one
-10. **Paragraph:**
- - Anything else
- - Line Breaks:
-
- 1. If a line ends with 2 or more than 2 spaces, then, a line break is inserted.
- 2. Otherwise, the lines are merged into one.
-11. **HTML:**
- - Lexer contains it
- - Parser also contains it
- - No escaping for HTML tags
- - Parsed as it is
-12. **Front Matter:**
- - Only lexer contains it
- - Starts and Ends with `---` (it can be surrounded by whitespaces, but should have exactly 3 dashes without spaces in between)
- - If array or object are provided in exact JSON literal format, then they are parsed as JS objects and arrays
- - Otherwise, they are parsed as strings
- - Example: `---\ntags: ["a", "b", "c"]\n---` is parsed as `{tags: ["a", "b", "c"]}`
-
-
-### 🎺 Emphasis
-Emphasis can be inside the content of any paragraph types. Even emphasis items can have emphasis inside 🤩.
-
-1. **Bold:** wrapped inside `**` | `__` | odd number of `*` | `_`
-2. **Italic:** wrapped inside `*` | `_` | even number of `*` | `_`
-3. **Code:** wrapped inside backticks
-4. **Strike:** wrapped inside `~~` | even number of `~`
-5. **Underline:** wrapped inside `++` | even number of `+`
-6. **Link:** wrapped as `[title](url 'title')` where `title` is optional
-7. **Image:** wrapped as `![alt text](url 'title' width height)` where `title`, `width` & `height` are optional
-
-### 🛹 Escaping
-1. Escaping is done by using `\` before the character to be escaped.
-2. If you need text like `# text` but don't want it to be treated as a heading, then you can escape it as `\# text
-3. Escaping is done for the following characters:
- - `*`, `_`, `[`, `]`, `(`, `)`, `!`, `~`, `+`, `<`, `>`, `&`, `"`, `'`
-4. Nothing is escaped in the lexer (content wise)
-5. Everything is escaped inside of `code` and `codeblock`
-6. Non HTML characters are escaped inside of other tokens
-
-Ref: https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
-
-### 👻 HTML Sanitization
-**A BIG TODO**
+## 🎠 Installation
+
+```bash
+npm i htmlmark
+```
+
+## 💠 Usage
+
+```js
+import HtmlMark from 'htmlmark';
+
+const opts = {
+  indent: 4,
+  highlightFn: (code, lang) => {
+    return code; // placeholder: return the highlighted code here
+  },
+  useLinkRefs: true
+};
+
+const htmlmark = new HtmlMark(opts);
+
+htmlmark.tokenize("## Hello World"); // returns the tokens
+htmlmark.parse("## Hello World"); // returns the HTML code
+```
+
+## 🎡 Options
+
+| Option | Type | Default | Description |
+|-------------|------------|-------------|---------------------------------------------------|
+| indent | `number` | `4` | Number of spaces (or tabs) to use for indentation |
+| highlightFn | `function` | `undefined` | Function to highlight the code |
+| useLinkRefs | `boolean` | `true` | Whether to use link references or not |
+
+
+## 🎢 APIs
+- `tokenize(markdown: string): Token[]`:
+ Returns the tokens lexed from the markdown string
+- `parse(markdown: string): string`:
+ Returns the HTML code from the markdown string
+- `getFrontMatter(markdown: string): FrontMatter{}`:
+ Returns the front matter from the markdown string
+
+### Lexer
+The provided markdown string is scanned line by line and checked against a set of regex patterns to produce markdown tokens. A typical token has the following structure:
+
+```json
+{
+  "indent": 0,
+  "level": 1,
+  "raw": "# Heading One Text",
+  "setext": false,
+  "type": "heading",
+  "value": "Heading One Text",
+  "tokens": [{
+    "raw": "Heading One Text",
+    "type": "text",
+    "value": "Heading One Text"
+  }]
+}
+```
+
+### Front Matter
+The front matter is the metadata of the markdown file. It is written in YAML format, must be placed at the top of the markdown file, and is enclosed between two lines that each contain exactly three hyphens (`---`).
+
+#### Example:
+
+```md
+---
+title: Hello World
+date: 2021-01-01
+author: John Doe
+---
+
+## Hello World
+Lorem ipsum dolor sit amet
+```
+
+The above markdown file will produce the following front matter:
+
+```json
+{
+  "title": "Hello World",
+  "date": "2021-01-01",
+  "author": "John Doe"
+}
+```
+
+## 💁 Contributing to HtmlMark
+Contributions are always welcome, no matter how large or small. Before contributing, please read the [code of conduct](https://github.com/kiranparajuli589/htmlmark/blob/main/CODE_OF_CONDUCT.md 'code of conduct'). You can also find the development guide [here](https://github.com/kiranparajuli589/htmlmark/blob/main/CONTRIBUTING.md 'here').
+
+## 📝 License
+GNU GENERAL PUBLIC LICENSE v3.0 © [Kiran Parajuli](https://kiranparajuli.com.np 'Kiran Parajuli')