Commit 1b8e989

updating readme
Signed-off-by: Vanessa Sochat <[email protected]>
1 parent 3d1615e commit 1b8e989

File tree

1 file changed: +13 -8 lines changed

Diff for: README.md

@@ -101,14 +101,16 @@ discourse API, and probably is redundant.
 
 ### 3. Cluster Posts
 
-We are going to try using [Doc2Vec](https://radimrehurek.com/gensim/models/doc2vec.html) on the sentences for each post, and then generating embeddings, and using kmeans for the embeddings. Since I didn't want to install a ton of Python libraries on my host, I decided
-to build a container to generate the notebook.
+We are going to try using [Doc2Vec](https://radimrehurek.com/gensim/models/doc2vec.html) on the sentences for each post, and then generating embeddings, and using kmeans for the embeddings. Since I didn't want to install a ton of Python libraries on my host, I decided to build a container to generate the notebook.
 
 ```bash
 $ docker build -t vanessa/askci-cluster-gensim .
 ```
 
-Then run the container, and map port 8888 to expose the notebook.
+You actually don't need to do this if you don't want to, the container
+is provided on [Docker Hub](https://hub.docker.com/r/vanessa/askci-cluster-gensim/tags) (note
+that I've also tagged a version for the date of data export). Either way,
+then run the container, and map port 8888 to expose the notebook.
 
 ```bash
 $ docker run -it -p 8888:8888 vanessa/askci-cluster-gensim
@@ -118,13 +120,16 @@ $ docker run -it -p 8888:8888 vanessa/askci-cluster-gensim
 decided to use one here to make it easy to show the work on GitHub and
 generate plots inline.
 
-What you'll need to do is interact with the notebook
+If you run the container and make changes that you want to keep,
+what you'll need to do is interact with the notebook
 in your browser (given the URL that you are provided) and then Download
 to your computer to save. If we bind directories there could be a whole
-mess of weird permissions, so this seems like a reasonable approach.
+mess of weird permissions, but if you want to try that, it would work too.
 
-What I wound up doing is copying the notebook and data files that I needed out
-of the container, after saving:
+I'm not a fan of click to download and then (still) needing to move and rename
+the file, so what I wound up doing is copying the notebook and data files that I needed out
+of the container, after saving. For a container named "amazing_ganguly", you
+can copy both notebooks and data generated (if you run them):
 
 ```bash
 # Notebooks
@@ -133,7 +138,7 @@ $ docker cp amazing_ganguly:/home/jovyan/cluster-analysis-tags.ipynb cluster-ana
 
 # Data Output
 $ docker cp amazing_ganguly:/home/jovyan/askci-post-tsne-179x2.json docs/askci-post-tsne-179x2.json
-for num in {1..7}; do
+for num in {1..10}; do
 docker cp amazing_ganguly:/home/jovyan/askci-tags-ica-embeddings-ncomps-${num}.json docs/askci-tags-ica-embeddings-ncomps-${num}.json
 done
 ```
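
The data-copy step in this diff hardcodes the container name and repeats the same `docker cp` pattern for every file. As a rough sketch only (not part of the commit), the same step could be wrapped in a small script. The `CONTAINER`, `DEST`, and `DRY_RUN` variables and the `copy_out` and `copy_outputs` helpers are hypothetical names introduced here for illustration. The file names and the `/home/jovyan` path are taken from the README, and the loop bound of 10 matches the updated line. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: copy the data outputs named in the README out of a running container.
# CONTAINER, DEST and DRY_RUN are illustrative variables, not part of the README.
set -eu

CONTAINER="${CONTAINER:-amazing_ganguly}"   # container name used in the README example
DEST="${DEST:-docs}"                        # output folder used in the README example
DRY_RUN="${DRY_RUN:-1}"                     # 1 = print commands only, 0 = run docker cp

# Copy one file from /home/jovyan in the container into DEST.
copy_out() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "docker cp ${CONTAINER}:/home/jovyan/$1 ${DEST}/$1"
    else
        docker cp "${CONTAINER}:/home/jovyan/$1" "${DEST}/$1"
    fi
}

# Copy the t-SNE output plus the ten ICA embedding files.
copy_outputs() {
    [ "$DRY_RUN" = "1" ] || mkdir -p "$DEST"
    copy_out "askci-post-tsne-179x2.json"
    num=1
    while [ "$num" -le 10 ]; do
        copy_out "askci-tags-ica-embeddings-ncomps-${num}.json"
        num=$((num + 1))
    done
}

copy_outputs
```

To copy for real, set `DRY_RUN=0` and pass the actual container name, which `docker ps --format '{{.Names}}'` lists for running containers. The `while` loop stands in for the README's `{1..10}` brace expansion so that the sketch also runs under a plain POSIX `sh`.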
