This repository has been archived by the owner on Nov 18, 2024. It is now read-only.
Structured (work in progress)

Extracts structured data from unstructured input. Programming language agnostic. Uses llama.cpp.

About Structured

The project started as a Go port of https://github.com/jxnl/instructor/, but evolved into a more general-purpose library.

Structured maps data from an arbitrary JSON schema onto an arbitrary Go struct (or just plain JSON).

It also features a language-agnostic HTTP server that you can set up in front of llama.cpp.

It is focused on llama.cpp. Support for other vendor APIs (like OpenAI or Anthropic) might be added in the future.

Key Features

  1. Language-agnostic HTTP server
  2. Go library with a simple API
  3. Model agnostic
  4. Focused on llama.cpp

Installation

Download the latest release from the releases page.

Alternatively, you can clone the repository and build it yourself:

git clone git@github.com:distantmagic/structured.git
cd structured
go build

How It Works

sequenceDiagram
    You->>Structured: JSON schema + data
    Structured->>llama.cpp: extract
    llama.cpp->>Structured: extracted entity
    Structured->>Structured: validates extracted entity (double check)
    Structured-->>llama.cpp: retry if validation fails
    Structured->>You: JSON matching your schema
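The extract, validate, and retry flow in the diagram can be sketched in Go roughly as follows. This is a minimal illustration and not the library's actual implementation: `extractFunc`, `validateRequired`, and `extractWithRetry` are hypothetical names, validation is reduced to checking required keys, and the retry budget is assumed to mean one initial attempt plus `maxRetries` retries.

```go
package main

import "fmt"

// extractFunc stands in for a call to llama.cpp (hypothetical signature).
type extractFunc func(attempt int) (map[string]any, error)

// validateRequired checks only that every required key is present.
// A real validator would check the whole JSON schema.
func validateRequired(entity map[string]any, required []string) error {
	for _, key := range required {
		if _, ok := entity[key]; !ok {
			return fmt.Errorf("missing required property %q", key)
		}
	}
	return nil
}

// extractWithRetry calls extract until the entity validates or the
// retry budget (assumed: 1 attempt + maxRetries retries) is used up.
func extractWithRetry(extract extractFunc, required []string, maxRetries int) (map[string]any, error) {
	var lastErr error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		entity, err := extract(attempt)
		if err == nil {
			err = validateRequired(entity, required)
		}
		if err == nil {
			return entity, nil
		}
		lastErr = err
	}
	return nil, fmt.Errorf("extraction failed after %d retries: %w", maxRetries, lastErr)
}

func main() {
	// Fake extractor: the first answer misses "hello", the second is valid.
	fake := func(attempt int) (map[string]any, error) {
		if attempt == 0 {
			return map[string]any{}, nil
		}
		return map[string]any{"hello": "world"}, nil
	}

	entity, err := extractWithRetry(fake, []string{"hello"}, 3)
	if err != nil {
		panic(err)
	}
	fmt.Println(entity["hello"])
}
```

Keeping validation separate from extraction means the caller only ever sees an entity that passed the check, or an error that wraps the last failure.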

HTTP API

Start a server and point it to your local llama.cpp instance:

./structured \
	--llamacpp-host 127.0.0.1 \
	--llamacpp-port 8081 \
	--port 8080

The Structured server connects to llama.cpp to extract the data.

Extract Entity

Include schema and data in your POST body. The server will respond with JSON matching your schema:

Request:
POST http://127.0.0.1:8080/extract/entity
{
  "schema": {
    "type": "object",
    "properties": {
      "hello": {
        "type": "string"
      }
    },
    "required": ["hello"]
  },
  "data": "Say 'world'"
}

Response:
{
  "hello": "world"
}
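Because the server speaks plain HTTP and JSON, any language can call it. As an illustration, the request above could be sent from Go roughly like this. It is a sketch under assumptions: `buildExtractBody` and `extractEntity` are hypothetical helpers, and the `application/json` content type and the `200 OK` success status are assumed, since neither is confirmed above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// extractRequest mirrors the request body shown above.
type extractRequest struct {
	Schema map[string]any `json:"schema"`
	Data   string         `json:"data"`
}

// buildExtractBody encodes the schema and the input text as JSON.
func buildExtractBody(schema map[string]any, data string) ([]byte, error) {
	return json.Marshal(extractRequest{Schema: schema, Data: data})
}

// extractEntity posts the body to baseURL + "/extract/entity" and
// decodes the JSON response into a generic map.
func extractEntity(client *http.Client, baseURL string, schema map[string]any, data string) (map[string]any, error) {
	body, err := buildExtractBody(schema, data)
	if err != nil {
		return nil, err
	}

	// Assumption: the server accepts an application/json body.
	resp, err := client.Post(baseURL+"/extract/entity", "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	// Assumption: a successful extraction answers with 200 OK.
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}

	var entity map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&entity); err != nil {
		return nil, err
	}
	return entity, nil
}

func main() {
	schema := map[string]any{
		"type": "object",
		"properties": map[string]any{
			"hello": map[string]any{"type": "string"},
		},
		"required": []string{"hello"},
	}

	entity, err := extractEntity(http.DefaultClient, "http://127.0.0.1:8080", schema, "Say 'world'")
	if err != nil {
		fmt.Println("extraction failed:", err)
		return
	}
	fmt.Println(entity["hello"])
}
```

Running it requires the Structured server from the previous step to be listening on port 8080.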

Programmatic Usage

Instead of using the HTTP API, you can use the Go library directly.

The API may change until all features are implemented.

Initializing the Mapper

Point it to your local llama.cpp instance:

import (
	"net/http"

	"github.com/distantmagic/paddler/llamacpp"
	"github.com/distantmagic/paddler/netcfg"
	"github.com/distantmagic/structured/structured"
)

// entityExtractor sends extraction requests to llama.cpp at http://127.0.0.1:8081.
var entityExtractor *structured.EntityExtractor = &structured.EntityExtractor{
	LlamaCppClient: &llamacpp.LlamaCppClient{
		HttpClient: http.DefaultClient,
		LlamaCppConfiguration: &llamacpp.LlamaCppConfiguration{
			HttpAddress: &netcfg.HttpAddressConfiguration{
				Host:   "127.0.0.1",
				Port:   8081,
				Scheme: "http",
			},
		},
	},
	// Retry budget used when the extracted entity fails validation.
	MaxRetries: 3,
}

Extracting Structured Data from String

After initializing the mapper, you can extract structured data from a string by providing a JSON schema and the string:

import (
	"fmt"

	"github.com/distantmagic/structured/structured"
)

responseChannel := make(chan structured.EntityExtractorResult)

go entityExtractor.ExtractFromString(
	responseChannel,
	map[string]any{
		"type": "object",
		"properties": map[string]any{
			"name": map[string]string{
				"type": "string",
			},
			"surname": map[string]string{
				"type": "string",
			},
			"age": map[string]string{
				"description": "Age in years.",
				"type":        "integer",
			},
		},
	},
	"I am John Doe - living for 40 years and I still like to play chess.",
)

for result := range responseChannel {
	if result.Error != nil {
		panic(result.Error)
	}

	// map[name:John, surname:Doe, age:40]
	fmt.Print(result.Result)
}

Mapping Extracted Result onto an Arbitrary Struct

Once you obtain the result, you can map it onto an arbitrary struct:

import (
	"fmt"

	"github.com/distantmagic/structured/structured"
)

type myTestPerson struct {
	Name    string `json:"name"`
	Surname string `json:"surname"`
	Age     int    `json:"age"`
}

func DoUnmarshalsToStruct(result structured.EntityExtractorResult) {
	var person myTestPerson

	// Populate person from the extraction result.
	err := structured.UnmarshalToStruct(result, &person)
	if err != nil {
		panic(err)
	}

	fmt.Println(person.Name)    // John
	fmt.Println(person.Surname) // Doe
}
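For intuition, one common way to map a generic result onto a struct in Go is a JSON round trip: encode the map, then decode it into the target so that the `json` struct tags drive the field matching. The sketch below shows that general technique with a hypothetical `mapToStruct` helper. It is not necessarily how `structured.UnmarshalToStruct` works internally.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type person struct {
	Name    string `json:"name"`
	Surname string `json:"surname"`
	Age     int    `json:"age"`
}

// mapToStruct copies a generic map into target by encoding it as JSON
// and decoding it again; target must be a pointer to a struct.
func mapToStruct(source map[string]any, target any) error {
	encoded, err := json.Marshal(source)
	if err != nil {
		return err
	}
	return json.Unmarshal(encoded, target)
}

func main() {
	// Stand-in for an extracted entity.
	extracted := map[string]any{"name": "John", "surname": "Doe", "age": 40}

	var p person
	if err := mapToStruct(extracted, &p); err != nil {
		panic(err)
	}
	fmt.Println(p.Name, p.Surname, p.Age)
}
```

The round trip costs an extra encode and decode, but it reuses the standard library's tag handling and type checks instead of hand-written reflection.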

See Also

Paddler - (work in progress) llama.cpp load balancer, supervisor and request queue
