Commit a735581

write up initial irnet work
1 parent a036584 commit a735581

3 files changed, with 40 additions and 10 deletions.

README.md

Lines changed: 34 additions & 5 deletions
@@ -4,7 +4,7 @@ Requirements:
 * install `docker`
 * install `curl`
 * Make sure docker allows at least 3GB of RAM (see `Docker`>`Preferences`>`Advanced`
-or equivalent)
+or equivalent) for sqlova, or 5GB of RAM for irnet.

 ## sqlova

@@ -78,16 +78,45 @@ Some questions about [iris.csv](https://en.wikipedia.org/wiki/Iris_flower_data_s
 | how many setosa rows are there | 50 | `SELECT count(col0) FROM iris WHERE Species = ? ['setosa']` |

 There are plenty of types of questions this model cannot answer (and that aren't covered
-in the dataset it is trained on, or in the sql it is permitted to generate). I hope to
-track research in the area and substitute in models as they become available:
+in the dataset it is trained on, or in the sql it is permitted to generate).
+
+## irnet
+
+This wraps up a published pretrained model for IRNet (https://github.com/microsoft/IRNet).
+The model released so far isn't Bert-flavored, and I haven't completely nailed down all the
+details of running it, so don't judge the model by playing with it here.
+
+Fetch and start irnet running as an api server on port 5050:
+
+```
+docker run --name irnet -d -p 5050:5050 -v $PWD/cache:/cache paulfitz/sqlova
+```
+
+Be super patient! Especially on the first run, when a few large models need to
+be downloaded and unpacked.
+
+You can then ask questions of individual csv files as before, or several csv files
+(just repeat `-F "[email protected]"`) or a simple sqlite db with tables related by foreign keys.
+In this last case, the model can answer using joins.
+
+```
+curl -F "[email protected]" -F "q=what city is The Firm headquartered in?" localhost:5050
+# Answer: SELECT T1.city FROM locations AS T1 JOIN organizations AS T2 WHERE T2.company = 1
+curl -F "[email protected]" -F "q=who is the CEO of Omni Cooperative" localhost:5050
+# Answer: SELECT T1.name FROM people AS T1 JOIN organizations AS T2 WHERE T2.company = 1
+curl -F "[email protected]" -F "q=what company has Dracula as CEO" localhost:5050
+# Answer: SELECT T1.company FROM organizations AS T1 JOIN people AS T2 WHERE T2.name = 1
+```
+
+## Other models
+
+I hope to track research in the area and substitute in models as they become available:

 * [WikiSQL leaderboard](https://github.com/salesforce/WikiSQL#leaderboard)
 * [Spider leaderboard](https://yale-lily.github.io/spider)
-* [IRNet](https://github.com/microsoft/IRNet) - I've started work on [supporting this](https://github.com/paulfitz/mlsql/tree/master/irnet).
 * [Spider Schema GNN](https://github.com/benbogin/spider-schema-gnn)
 * Is there any code for [X-SQL](https://www.microsoft.com/en-us/research/publication/x-sql-reinforce-context-into-schema-representation/)?
 * [SyntaxSQL](https://github.com/taoyds/syntaxSQL)
 * [NL2SQL Challenge](https://tianchi.aliyun.com/competition/entrance/231716/information)
 * A term paper including a Sqlova reimplementation with tweaks: [Search Like a Human: Neural Machine Translation for Database Search](https://web.stanford.edu/class/cs224n/reports/custom/15709203.pdf)
 * [NL2SQL-BERT](https://github.com/guotong1988/NL2SQL-BERT) gives an example of how to add features derived from the table content to improve results.
-
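
The curl calls in the README diff send the question as a form field named `q` (the server reads `request.form['q']`) along with an uploaded file. The same request could be made from Python roughly as sketched below, using only the standard library. The helper names `encode_multipart` and `ask` are made up for this illustration. The name of the file upload field is not legible in the text above, so it is left as a `file_field` argument rather than guessed. The response is returned as raw text, since its format is not shown here.

```python
import os
import urllib.request
import uuid


def encode_multipart(fields, files):
    """Build a multipart/form-data body.

    fields: dict mapping field name -> text value
    files:  dict mapping field name -> (filename, bytes content)
    Returns (content_type_header, body_bytes).
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(('Content-Disposition: form-data; name="%s"' % name).encode("utf-8"))
        lines.append(b"")
        lines.append(value.encode("utf-8"))
    for name, (filename, content) in files.items():
        lines.append(delimiter)
        lines.append(
            ('Content-Disposition: form-data; name="%s"; filename="%s"'
             % (name, filename)).encode("utf-8")
        )
        lines.append(b"Content-Type: application/octet-stream")
        lines.append(b"")
        lines.append(content)
    lines.append(delimiter + b"--")  # closing delimiter
    lines.append(b"")
    return "multipart/form-data; boundary=" + boundary, b"\r\n".join(lines)


def ask(url, question, file_field, path):
    """POST a question plus one data file and return the raw response text.

    file_field is the upload field name the server expects (an assumption
    left to the caller; it is not recoverable from the README text).
    """
    with open(path, "rb") as f:
        content = f.read()
    content_type, body = encode_multipart(
        {"q": question},
        {file_field: (os.path.basename(path), content)},
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the container from the README running, a call would look something like `ask("http://localhost:5050", "who is the CEO of Omni Cooperative", file_field, "companies.sqlite")`, where `file_field` is whatever upload field name the server accepts.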

companies.sqlite

16 KB
Binary file not shown.

irnet/server/prediction_server.py

Lines changed: 6 additions & 5 deletions
@@ -152,11 +152,12 @@ def handle_request0(request):
     q = request.form['q']

     # brute force removal of any old requests
-    subprocess.run([
-        "rm",
-        "-rf",
-        "/cache/case_*"
-    ])
+    if not TRIAL_RUN:
+        subprocess.run([
+            "bash",
+            "-c",
+            "rm -rf /cache/case_*"
+        ])
     key = "case_" + str(uuid.uuid4())
     data_dir = os.path.join('/cache', key)
     os.makedirs(os.path.join(data_dir, 'data'), exist_ok=True)
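
The change above matters because `subprocess.run` with an argument list does not go through a shell, so `rm` receives the literal string `/cache/case_*` and nothing expands the wildcard. Routing the command through `bash -c` restores the expansion. An alternative that avoids shelling out is to expand the pattern in Python. The sketch below is one way that might look. `remove_old_cases` is a hypothetical helper, not code from this commit, and its `trial_run` flag only mimics the `TRIAL_RUN` guard seen in the diff.

```python
import glob
import os
import shutil


def remove_old_cases(cache_dir, trial_run=False):
    """Delete every case_* entry directly under cache_dir.

    Returns the sorted list of paths that were removed. When trial_run is
    true nothing is deleted, mirroring the TRIAL_RUN guard in the diff.
    """
    if trial_run:
        return []
    removed = []
    # glob expands the wildcard here, which is the step the shell would do.
    for path in glob.glob(os.path.join(cache_dir, "case_*")):
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path, ignore_errors=True)
        else:
            os.remove(path)
        removed.append(path)
    return sorted(removed)
```

In the handler this would be called as `remove_old_cases('/cache', trial_run=TRIAL_RUN)` before the new `case_` directory is created.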
