Merge new changes #1

Open
wants to merge 125 commits into
base: master
Commits
97e423c
ci test
frozenkp Jun 4, 2018
24e7acf
ci test
frozenkp Jun 4, 2018
b2ea4a0
add codecov
frozenkp Jun 4, 2018
a826557
add codecov
frozenkp Jun 4, 2018
9ec4d8f
script update
frozenkp Jun 4, 2018
f7e2046
script update
frozenkp Jun 4, 2018
1875c71
add codecov badge
frozenkp Jun 4, 2018
233c513
Add test cases for C0
5teven1in Jun 5, 2018
2b597df
Add test cases for C1
5teven1in Jun 5, 2018
cc2e34d
fix heap (declared)
frozenkp Jun 5, 2018
40fa13a
Format confusion.go
5teven1in Jun 5, 2018
e92e615
Add test cases for C0
5teven1in Jun 5, 2018
7255c67
Add test cases for cross_fold
5teven1in Jun 5, 2018
9102224
add kdtree test
frozenkp Jun 6, 2018
1b9e563
Create cluster_extra_test.go
Jun 6, 2018
6b3f7dd
format
frozenkp Jun 6, 2018
b7662dc
Add test cases for mat
5teven1in Jun 5, 2018
d74ab2f
add kdtree test
frozenkp Jun 6, 2018
6338cc3
format
frozenkp Jun 6, 2018
77b8faa
C0
kshs010239 Jun 8, 2018
d74c48e
recover
kshs010239 Jun 8, 2018
30071eb
some test for C9
kshs010239 Jun 14, 2018
80bc1ac
some test for C0
kshs010239 Jun 14, 2018
bf90755
testcase
kshs010239 Jun 14, 2018
cebc60d
add knn_test
frozenkp Jun 7, 2018
99c3bfa
Update README.md
5teven1in Jun 8, 2018
b601ffa
error
kshs010239 Jun 14, 2018
2e9588e
Add Convey "Test more code"
Jun 8, 2018
eff9fb2
Update em_test.go
Jun 8, 2018
9bfe206
Update em_test.go
Jun 8, 2018
1e1b5f1
Format code
5teven1in Jun 16, 2018
6af3dea
Fix repo info
5teven1in Jun 16, 2018
2a48e1f
Update .travis.yml and coverage script
5teven1in Jun 16, 2018
c367040
Fix variable name in convey
5teven1in Jun 16, 2018
b75b54a
Fix import cycle
5teven1in Jun 16, 2018
46d6837
Merge pull request #212 from frozenkp/add-testing
Sentimentron Jul 1, 2018
be0d096
Removing the not-very-helpful info println
ryanthecubfan Aug 27, 2018
7374d36
Merge pull request #215 from ryanthecubfan/remove-optimisations-println
Sentimentron Aug 30, 2018
7d17054
resolves #222
Mar 13, 2019
019f101
change gonum matrix definitions to match with current gonum version
Soypete Mar 20, 2019
6e8d84a
remove vendor
Soypete Mar 20, 2019
2eb4fbd
Merge pull request #1 from Soypete/fix/matrix
Soypete Mar 20, 2019
e219301
Adapt for go 1.12 - Issue #225
Mar 22, 2019
993a961
Ask Travis to test Go 1.11 and Go 1.12
Sentimentron Mar 23, 2019
fbd5c5a
Merge pull request #224 from Soypete/master
Sentimentron Mar 23, 2019
b42867a
Merge branch 'master' into issue222
Sentimentron Mar 23, 2019
392c775
Drop support for Go 1.8
Sentimentron Mar 23, 2019
744cde5
Merge pull request #227 from Sentimentron/add-go-12
Sentimentron Mar 23, 2019
54a9a24
Merge pull request #226 from tboudalier/master
Sentimentron Mar 23, 2019
5d79ed7
Merge pull request #223 from sugarme/issue222
Sentimentron Mar 23, 2019
ac9fa85
Fix for error that happens on Go 1.11 and above
Sentimentron Mar 23, 2019
82e59c8
Merge pull request #228 from Sentimentron/fix-11
Sentimentron Mar 23, 2019
1fd926a
linear_models: update to use new mat64 API
Sentimentron Jun 16, 2019
864198d
clustering: update to new mat64 API
Sentimentron Jun 16, 2019
495bb91
neural: update to new mat64 API
Sentimentron Jun 18, 2019
850c914
metrics: fixing pairwise CloneFrom
Sentimentron Jun 18, 2019
c3cae57
Merge pull request #233 from Sentimentron/fixes-jun-19
Sentimentron Jun 19, 2019
bffc4a5
Replace Panics with error returns to BernoulliNBClassifier Fit method…
JustinJudd Jul 17, 2019
fadd963
Added String method to meet Classifier interface
JustinJudd Jul 17, 2019
6fcc2b4
Merge pull request #235 from JustinJudd/classInt
Sentimentron Jul 25, 2019
ae8d7f7
adds error checking for err variable that were being left unchecked
yaserazfar Dec 26, 2019
3e43e74
Merge pull request #240 from yaserazfar/check_unchecked_errors
Sentimentron Dec 30, 2019
8848652
Added Decision Tree Classifier
Yushgoel Jul 16, 2020
d1228c5
Adding Integration For Fixed Data Grid in Predict And Evaluate
Yushgoel Jul 18, 2020
16eac7d
Adding Regression Trees
Yushgoel Jul 18, 2020
08529c4
Added Comments for Regressor
Yushgoel Jul 18, 2020
c083759
Adding Changes
Yushgoel Jul 22, 2020
065f45d
Bump Go versions
AlekSi Jul 22, 2020
c793b1d
Restore workaround
AlekSi Jul 22, 2020
c40ae76
Require Go 1.13+
AlekSi Jul 22, 2020
b16b60f
Adding Example script for CART
Yushgoel Jul 23, 2020
c0c3b2e
Fixing Sorting
Yushgoel Jul 25, 2020
a6614fa
Merge pull request #247 from AlekSi/patch-1
Sentimentron Jul 25, 2020
abed408
Updating Dataset + Naming
Yushgoel Jul 26, 2020
91a27e3
Fixing Comments
Yushgoel Jul 27, 2020
ef751e6
Adding cart_test.go
Yushgoel Jul 27, 2020
2d2af0a
Removing Clutter
Yushgoel Jul 28, 2020
1954aae
Changing name of Use_not
Yushgoel Jul 30, 2020
d587340
Renaming Impurity Functions
Yushgoel Jul 30, 2020
7276108
Adding Documentation
Yushgoel Jul 30, 2020
7f8ce6d
Removing Panics
Yushgoel Jul 31, 2020
ae2338c
Updating package level details
Yushgoel Jul 31, 2020
9d1ac82
Optimizing Loss Calculation
Yushgoel Aug 1, 2020
6a42fcd
catching nInstances == 0
Yushgoel Aug 1, 2020
cd2b86a
Changing var name
Yushgoel Aug 1, 2020
8ae385c
Complexity Analysis for Algorithm
Yushgoel Aug 1, 2020
db7f9de
Merge pull request #2 from sjwhitworth/master
Yushgoel Aug 1, 2020
4e5315a
Merge pull request #3 from sjwhitworth/master
Yushgoel Aug 1, 2020
cad05a0
Updating Logistic.go
Yushgoel Aug 1, 2020
cee05df
Merge pull request #1 from Yushgoel/cart
Yushgoel Aug 1, 2020
e55a329
Fixing Bug
Yushgoel Aug 1, 2020
2a54c10
Merge pull request #4 from Yushgoel/cart
Yushgoel Aug 1, 2020
b689fe0
Fixing Typo + tmp file
Yushgoel Aug 3, 2020
27b86ce
Delete tmp
Yushgoel Aug 6, 2020
c39ef51
Merge pull request #249 from Yushgoel/cart_reviewed
Sentimentron Aug 6, 2020
6aa37ac
Merge pull request #5 from sjwhitworth/master
Yushgoel Aug 14, 2020
452acba
Adding Isolation + Fixing previous import issue
Yushgoel Aug 24, 2020
d20c03e
Adding isolation_test
Yushgoel Aug 25, 2020
5a66fb9
Adding Example Script
Yushgoel Aug 27, 2020
0270ec8
IsolationForest in trees.go
Yushgoel Aug 27, 2020
333997b
Adding Comments
Yushgoel Aug 30, 2020
fef3034
Merge pull request #6 from Yushgoel/IsolationForest
Yushgoel Aug 30, 2020
a380d19
Removed Changing Seed
Yushgoel Sep 6, 2020
e3a09cf
File paths
Yushgoel Sep 8, 2020
6fed29e
Merge pull request #250 from Yushgoel/IsolationForest_reviewed
Sentimentron Sep 8, 2020
76577c4
Create dataframe_go.go
Yushgoel Oct 27, 2020
47b57c4
Replacing Nest with Switch Case
Yushgoel Nov 4, 2020
66e5d57
Merge pull request #8 from Yushgoel/dataFrameCompatibility
Yushgoel Nov 4, 2020
4aff2a9
Removing cyclic import
Yushgoel Nov 4, 2020
5e8076f
Merge branch 'dataFrameCompatibilityReviewed' of https://github.com/Y…
Yushgoel Nov 4, 2020
9ed5a13
Merge pull request #256 from Yushgoel/dataFrameCompatibilityReviewed
Sentimentron Nov 22, 2020
7240e2c
Adding Go module files
niko-dunixi Nov 27, 2020
7580cac
Fixing compilation errors that break the library
niko-dunixi Nov 27, 2020
294d65f
Merge pull request #259 from paul-nelson-baker/master
Sentimentron Nov 27, 2020
6489b3b
Fix id3 model loading
Oliveirakun Jan 8, 2021
d33eb47
Fix random forest model loading
Oliveirakun Jan 10, 2021
cde96fa
Merge pull request #261 from Oliveirakun/fix-model-load
Sentimentron Jan 17, 2021
093beec
Update go.sum
wonyonyon May 1, 2021
00d4cfd
Merge pull request #266 from wonyonyon/patch-1
Sentimentron May 11, 2021
d0cad66
Fix typo in hello world example
louisguitton Sep 5, 2021
947ee72
Merge pull request #268 from louisguitton/fix-typo-hello-world
Sentimentron Sep 6, 2021
0ae13fe
Example now pulls from correct filepath for dataset
EliDavis3D Oct 5, 2021
a8b69c2
Merge pull request #269 from EliDavis3D/patch-1
Sentimentron Oct 14, 2021
0f33e2f
feat: allow missing values when parsing csvs
samyshehata Dec 18, 2022
74ae077
Merge pull request #284 from sshehata/master
Sentimentron Dec 28, 2022
3 changes: 3 additions & 0 deletions .gitignore
@@ -14,3 +14,6 @@

# go test coverprofiles
*.coverprofile

#vim
*.sw*
13 changes: 9 additions & 4 deletions .travis.yml
@@ -1,10 +1,9 @@
language: go
go:
- "1.8"
- "1.9"
- "1.10"
- 1.13.x
- 1.14.x
env:
# Temporary workaround for go 1.6
# Temporary workaround for Go 1.6+
- GODEBUG=cgocheck=0
before_install:
- sudo apt-get update -qq
@@ -14,3 +13,9 @@ before_install:
install:
- go get github.com/smartystreets/goconvey/convey
- go get -v ./...

script:
- ./coverage.sh

after_success:
- bash <(curl -s https://codecov.io/bash)
3 changes: 2 additions & 1 deletion README.md
@@ -4,6 +4,7 @@ GoLearn
<img src="http://talks.golang.org/2013/advconc/gopherhat.jpg" width=125><br>
[![GoDoc](https://godoc.org/github.com/sjwhitworth/golearn?status.png)](https://godoc.org/github.com/sjwhitworth/golearn)
[![Build Status](https://travis-ci.org/sjwhitworth/golearn.png?branch=master)](https://travis-ci.org/sjwhitworth/golearn)<br>
[![Code Coverage](https://codecov.io/gh/sjwhitworth/golearn/branch/master/graph/badge.svg)](https://codecov.io/gh/sjwhitworth/golearn)

[![Support via Gittip](https://rawgithub.com/twolfson/gittip-badge/0.2.0/dist/gittip.png)](https://www.gittip.com/sjwhitworth/)

@@ -39,7 +40,7 @@ func main() {
// Load in a dataset, with headers. Header attributes will be stored.
// Think of instances as a Data Frame structure in R or Pandas.
// You can also create instances from scratch.
rawData, err := base.ParseCSVToInstances("datasets/iris.csv", false)
rawData, err := base.ParseCSVToInstances("datasets/iris.csv", true)
if err != nil {
panic(err)
}
5 changes: 5 additions & 0 deletions base/csv.go
@@ -187,6 +187,11 @@ func ParseCSVBuildInstancesFromReader(r io.ReadSeeker, attrs []Attribute, hasHea
}
}
for i, v := range record {
// support missing values
if v == "" {
continue
}

u.Set(specs[i], rowCounter, specs[i].attr.GetSysValFromString(strings.TrimSpace(v)))
}
rowCounter++
Expand Down
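The `continue` in the hunk above leaves a cell untouched when its CSV field is empty. A minimal standalone sketch of the same idea, using only the standard library, could look like this. `parseRows` is a hypothetical helper and not part of golearn, and storing values in a `[][]float64` is an assumption made to keep the example self-contained.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// parseRows is a hypothetical helper (not golearn API). It parses CSV text
// into rows of float64 and leaves a cell at its zero value when the field
// is empty, mirroring the `continue` in the hunk above.
func parseRows(data string) ([][]float64, error) {
	r := csv.NewReader(strings.NewReader(data))
	records, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	rows := make([][]float64, 0, len(records))
	for _, record := range records {
		row := make([]float64, len(record))
		for i, v := range record {
			// Skip missing values; the cell keeps its zero value.
			if v == "" {
				continue
			}
			f, err := strconv.ParseFloat(strings.TrimSpace(v), 64)
			if err != nil {
				return nil, err
			}
			row[i] = f
		}
		rows = append(rows, row)
	}
	return rows, nil
}

func main() {
	rows, err := parseRows("1.5,,3\n,2,\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(rows)
}
```

As in the hunk, the empty-string comparison happens before trimming, so a field that contains only spaces is not treated as missing.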
66 changes: 66 additions & 0 deletions base/dataframe_go.go
@@ -0,0 +1,66 @@
package base

import (
"fmt"
"reflect"
"strconv"

"github.com/rocketlaunchr/dataframe-go"
)

// ConvertDataFrameToInstances converts a dataframe-go DataFrame object to a GoLearn FixedDataGrid. This allows for compatibility between dataframe-go and GoLearn's ML models.
// df is the DataFrame object. classAttrIndex is the index of the class Attribute in the data.
func ConvertDataFrameToInstances(df *dataframe.DataFrame, classAttrIndex int) FixedDataGrid {

// Creating Attributes based on Dataframe
names := df.Names()
attrs := make([]Attribute, len(names))

newInst := NewDenseInstances()

for i := range names {
col := df.Series[i]
if reflect.TypeOf(col.Value(0)).Kind() == reflect.String {
attrs[i] = new(CategoricalAttribute)
attrs[i].SetName(names[i])
} else {
attrs[i] = NewFloatAttribute(names[i])
}
}

// Add the attributes
newSpecs := make([]AttributeSpec, len(attrs))
for i, a := range attrs {

newSpecs[i] = newInst.AddAttribute(a)
}
// Adding the class attribute
newInst.AddClassAttribute(attrs[classAttrIndex])

// Allocate space
nRows := df.NRows()
newInst.Extend(df.NRows())

// Write the data based on DataType
for i := 0; i < nRows; i++ {
for j := range names {
col := df.Series[j]

var val string
switch v := col.Value(i).(type) {
case string:
val = v
case int64:
val = strconv.FormatInt(v, 10)
case float64:
val = fmt.Sprintf("%f", v)
case float32:
val = fmt.Sprintf("%f", v)
}

newInst.Set(newSpecs[j], i, newSpecs[j].GetAttribute().GetSysValFromString(val))
}
}

return newInst
}
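The type switch in `ConvertDataFrameToInstances` has no `default` branch, so a value of any other type is written as an empty string. A hedged sketch of one way to surface that case is shown here. `valueToString` is a hypothetical helper, not golearn or dataframe-go API, and returning an error for unsupported or nil values is an assumption about the desired behaviour.

```go
package main

import (
	"fmt"
	"strconv"
)

// valueToString is a hypothetical helper (not golearn or dataframe-go API).
// It mirrors the type switch in the hunk above but reports nil and
// unsupported types instead of silently producing an empty string.
func valueToString(value interface{}) (string, error) {
	switch v := value.(type) {
	case string:
		return v, nil
	case int64:
		return strconv.FormatInt(v, 10), nil
	case float64:
		return fmt.Sprintf("%f", v), nil
	case float32:
		return fmt.Sprintf("%f", v), nil
	case nil:
		return "", fmt.Errorf("missing value")
	default:
		return "", fmt.Errorf("unsupported type %T", value)
	}
}

func main() {
	for _, v := range []interface{}{"setosa", int64(3), 2.5, float32(1.25), true} {
		s, err := valueToString(v)
		fmt.Println(s, err)
	}
}
```

Whether the caller should skip, default, or abort on such an error depends on how the resulting grid is used.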
2 changes: 1 addition & 1 deletion base/error.go
@@ -80,6 +80,6 @@ func WrapError(err error) error {
}

func FormatError(err error, format string, args ...interface{}) error {
description := fmt.Sprintf(format, args)
description := fmt.Sprintf(format, args...)
return DescribeError(description, err)
}
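The one-line change in `FormatError` matters because `fmt.Sprintf(format, args)` passes the whole slice as a single argument, while `args...` expands it into separate arguments. A small sketch of the difference follows. `render` is a hypothetical function written only for this illustration.

```go
package main

import "fmt"

// render is a hypothetical helper used only to contrast the two calls.
// wrong passes the slice as ONE argument; right expands it with "...".
func render(format string, args []interface{}) (wrong, right string) {
	wrong = fmt.Sprintf(format, args)    // the slice fills the first verb only
	right = fmt.Sprintf(format, args...) // each element fills its own verb
	return wrong, right
}

func main() {
	wrong, right := render("%d items in %s", []interface{}{3, "cart"})
	fmt.Println("without ...:", wrong)
	fmt.Println("with ...:   ", right)
}
```

With the expansion, the result is `3 items in cart`. Without it, the second verb has no argument left, so the output contains `%!s(MISSING)`.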
22 changes: 22 additions & 0 deletions base/error_test.go
@@ -0,0 +1,22 @@
package base

import (
. "github.com/smartystreets/goconvey/convey"
"testing"
)

func TestId3(t *testing.T) {
Convey("Doing a error test", t, func() {
var _gerr GoLearnError
gerr := &_gerr
gerr.attachFormattedStack()
s := gerr.Error()
So(s, ShouldNotBeNil)
err := DescribeError("test", nil)
So(err, ShouldNotBeNil)
err = WrapError(nil)
So(err, ShouldNotBeNil)
s = wrapLinesWithTabPrefix("123\ntest\n")
So(s, ShouldEqual, "\t123\n\ttest\n\t")
})
}
43 changes: 41 additions & 2 deletions base/mat_test.go
@@ -9,13 +9,13 @@ import (
func TestInlineMat64Creation(t *testing.T) {

Convey("Given a literal array...", t, func() {
mat := mat.NewDense(4, 3, []float64{
X := mat.NewDense(4, 3, []float64{
1, 0, 1,
0, 1, 1,
0, 0, 0,
1, 1, 0,
})
inst := InstancesFromMat64(4, 3, mat)
inst := InstancesFromMat64(4, 3, X)
attrs := inst.AllAttributes()
Convey("Attributes should be well-defined...", func() {
So(len(attrs), ShouldEqual, 3)
@@ -34,6 +34,45 @@ func TestInlineMat64Creation(t *testing.T) {
So(val, ShouldAlmostEqual, 1.0)
})

Convey("Getting size should work...", func() {
attrLen, rows := inst.Size()
So(attrLen, ShouldEqual, 3)
So(rows, ShouldEqual, 4)
})

Convey("Getting row string should work...", func() {
So(inst.RowString(0), ShouldEqual, "0")
})

Convey("Getting attribute not in it should error...", func() {
Y := mat.NewDense(1, 4, []float64{1, 2, 3, 4})
ins := InstancesFromMat64(1, 4, Y)
attr := ins.AllAttributes()
_, err := inst.GetAttribute(attr[3])
So(err.Error(), ShouldEqual, "Couldn't find a matching attribute")
})

Convey("Generate human-readable summary...", func() {
output := inst.String()
So(output, ShouldStartWith, "Instances with")
So(output, ShouldContainSubstring, "Attributes:")
So(output, ShouldContainSubstring, "Data:")
})

})

}

func TestStringWithExceedMaxRow(t *testing.T) {
Convey("Given a long literal array...", t, func() {
v := make([]float64, 35, 35)
X := mat.NewDense(35, 1, v)
inst := InstancesFromMat64(35, 1, X)
output := inst.String()
So(output, ShouldStartWith, "Instances with")
So(output, ShouldContainSubstring, "Attributes:")
So(output, ShouldContainSubstring, "Data:")
So(output, ShouldContainSubstring, "undisplayed")

})
}
6 changes: 3 additions & 3 deletions base/serialize.go
@@ -49,17 +49,17 @@ func (f *FunctionalTarReader) GetNamedFile(name string) ([]byte, error) {
if err != nil {
return nil, WrapError(err)
}

if int64(len(ret)) != hdr.Size {
if int64(len(ret)) < hdr.Size {
log.Printf("Size mismatch, got %d byte(s) for %s, expected %d (err was %s)", len(ret), hdr.Name, hdr.Size, err)
} else {
return nil, WrapError(fmt.Errorf("Size mismatch, expected %d byte(s) for %s, got %d", hdr.Size, hdr.Name, len(ret)))
}
}
if err != nil {
return nil, err
}

returnCandidate = ret
break
}
}
if returnCandidate == nil {
Expand Down
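The size check in `GetNamedFile` treats the two mismatch directions differently: a short read is only logged, while a read longer than the header's declared size is an error. A standalone sketch of that decision is given here. `checkSize` is a hypothetical helper, not golearn API, and reporting the declared size as the "expected" value is an assumption about the intended message wording.

```go
package main

import (
	"fmt"
	"log"
)

// checkSize is a hypothetical helper (not golearn API). It compares the
// number of bytes read for a tar entry against the size declared in its
// header. A short read is only logged; a long read is an error.
func checkSize(name string, got int, declared int64) error {
	switch {
	case int64(got) == declared:
		return nil
	case int64(got) < declared:
		log.Printf("size mismatch, got %d byte(s) for %s, expected %d", got, name, declared)
		return nil
	default:
		return fmt.Errorf("size mismatch, expected %d byte(s) for %s, got %d", declared, name, got)
	}
}

func main() {
	fmt.Println(checkSize("model.json", 10, 10)) // sizes agree
	fmt.Println(checkSize("model.json", 8, 10))  // short read: logged only
	fmt.Println(checkSize("model.json", 12, 10)) // long read: error
}
```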
112 changes: 112 additions & 0 deletions clustering/cluster_extra_test.go
@@ -0,0 +1,112 @@
package clustering

import (
. "github.com/smartystreets/goconvey/convey"
"testing"
)

func Test(t *testing.T) {
Convey("Only m[0]", t, func() {
m1 := ClusterMap(make(map[int][]int))
m1[0] = []int{1, 2}

m2 := ClusterMap(make(map[int][]int))
m2[0] = []int{1, 2}

ret, err := m1.Equals(m2)
So(err, ShouldBeNil)
So(ret, ShouldBeTrue)

})

Convey("Nothing in m", t, func() {
m1 := ClusterMap(make(map[int][]int))

m2 := ClusterMap(make(map[int][]int))

ret, err := m1.Equals(m2)
So(err, ShouldBeNil)
So(ret, ShouldBeTrue)

})

Convey("Many elements in m", t, func() {
m1 := ClusterMap(make(map[int][]int))
m1[0] = []int{1, 2, 3, 4, 5}
m1[1] = []int{11, 12, 13, 14, 15}

m2 := ClusterMap(make(map[int][]int))
m2[0] = []int{1, 2, 3, 4, 5}
m2[1] = []int{11, 12, 13, 14, 15}

ret, err := m1.Equals(m2)
So(err, ShouldBeNil)
So(ret, ShouldBeTrue)

})

Convey("m[0] not the same", t, func() {
m1 := ClusterMap(make(map[int][]int))
m1[1] = []int{1, 2, 3}
m1[0] = []int{4, 5}

m2 := ClusterMap(make(map[int][]int))
m2[1] = []int{1, 2, 3}
m2[0] = []int{6, 5}

_, err := m1.Equals(m2)
So(err, ShouldNotBeNil)
})

Convey("m[0] size diff", t, func() {
m1 := ClusterMap(make(map[int][]int))
m1[1] = []int{1, 2, 3}
m1[0] = []int{4, 5}

m2 := ClusterMap(make(map[int][]int))
m2[1] = []int{1, 2, 3}

_, err := m1.Equals(m2)
So(err, ShouldNotBeNil)
})

Convey("m[1] size diff", t, func() {
m1 := ClusterMap(make(map[int][]int))
m1[1] = []int{1, 3}
m1[0] = []int{4, 5}

m2 := ClusterMap(make(map[int][]int))
m2[1] = []int{1, 2, 3}
m1[0] = []int{4, 5}

_, err := m1.Equals(m2)
So(err, ShouldNotBeNil)
})

Convey("m[1] duplicate", t, func() {
m1 := ClusterMap(make(map[int][]int))
m1[1] = []int{1, 1}
m1[0] = []int{4, 5}

m2 := ClusterMap(make(map[int][]int))
m1[1] = []int{1, 1}
m1[0] = []int{4, 5}

_, err := m1.Equals(m2)
So(err, ShouldNotBeNil)
})

Convey("m[0] duplicate", t, func() {
m1 := ClusterMap(make(map[int][]int))
m1[1] = []int{1, 2}
m1[0] = []int{4, 4}

m2 := ClusterMap(make(map[int][]int))
m1[1] = []int{1, 2}
m1[0] = []int{4, 4}

_, err := m1.Equals(m2)
So(err, ShouldNotBeNil)
})

}
2 changes: 1 addition & 1 deletion clustering/dbscan.go
@@ -1,9 +1,9 @@
package clustering

import (
"gonum.org/v1/gonum/mat"
"github.com/sjwhitworth/golearn/base"
"github.com/sjwhitworth/golearn/metrics/pairwise"
"gonum.org/v1/gonum/mat"
"math/big"
)

2 changes: 1 addition & 1 deletion clustering/dbscan_test.go
@@ -2,10 +2,10 @@ package clustering

import (
"bufio"
"gonum.org/v1/gonum/mat"
"github.com/sjwhitworth/golearn/base"
"github.com/sjwhitworth/golearn/metrics/pairwise"
. "github.com/smartystreets/goconvey/convey"
"gonum.org/v1/gonum/mat"
"math"
"math/big"
"os"