Commit f5077b0

started to put the finishing touches on other bits of doc

1 parent 8460c9d

File tree: 3 files changed, +37 −195 lines


INSTRUCTIONS.md (+26 −88)

@@ -6,14 +6,13 @@ been freshly installed and be mostly unmodified. The standard Ubuntu image from
 etc...) will work perfectly.
 
 If you are comfortable with the technology involved in building Glance and don't want to
-use the "default" configuration for some reason then of course you can just build it yourself. To that end we also
-provide a list of software prerequisites and a brief description of the build structure. Most people should try and use
+use the "default" configuration for some reason then of course you can just build it yourself. Most people should try and use
 Ubuntu 16 however, as this has been well tested and we provide a number of helpful scripts for installations with that OS.
 
 The guide does assume very basic familiarity with the use of a bash shell on Ubuntu, though all commands to be executed
 are listed verbatim and explained.
 
-### Installing Glance on Ubuntu 16 (the first path)
+### Installing Glance on Ubuntu 16
 
 1. Check that the packages already installed on your server are up to date:
 ```bash
@@ -193,9 +192,11 @@ Glance's database: This can be done with the following command:
 an optional integer to use as a seed when randomly (uniformly) selecting student records to include in a survey.
 ```
 
-The first 3 options are compulsory, whilst the remaining 4 are optional depending on the surveys you wish to generate.
-When run, the above command will create surveys in the Glance database and return links where they may be
-accessed and conducted by instructors.
+The first 3 options are compulsory, whilst the remaining 4 are optional depending on the
+surveys you wish to generate. When run, the above command will create surveys in the Glance
+database and return their ids (long strings of numbers and letters like:
+`6d7ca6f2-7970-4f24-881b-4f84c0386c63`). These may be used to access surveys in a web browser,
+as detailed in the next step.
 
 **Note**: Such a large number of options are included in the `glance-cli` tools in order to support configurability,
 however they also create edge cases where combinations of commands or options may fail to behave as they should.
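The ids the added lines describe are standard UUIDs. As an illustrative sketch (not part of the original guide), an id captured from `generate` output can be sanity-checked from the shell before building URLs from it; the example id is the one quoted above:

```shell
# Hypothetical check that an id copied from `generate` output has the
# expected UUID shape before it is pasted into a survey URL.
ID="6d7ca6f2-7970-4f24-881b-4f84c0386c63"
UUID_RE='^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
if echo "$ID" | grep -Eq "$UUID_RE"; then
  echo "looks like a survey id"
else
  echo "not a valid survey id" >&2
fi
```

This catches the common copy-paste mistakes (truncated ids, trailing punctuation) before they turn into confusing 404-style failures in the browser.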
@@ -205,14 +206,27 @@ Glance's database: This can be done with the following command:
 
 Effort has been made to make the error reporting of the `glance-cli` tools fairly comprehensive. If you encounter such
 an error, please adjust your combination of command line options accordingly. If you encounter no error, but Glance
-still isn't generating surveys as you believe it should, please submit an [issue]().
+still isn't generating surveys as you believe it should, please submit an [issue](https://github.com/NewcastleComputingScience/student-outcome-accelerator/issues/new).
 
 **Note**: The optional `random-seed` option is used if you wish to ensure that `glance-cli` generates the exact same
 surveys as on a previous execution (selects the same students etc...). Most of the time it can be safely ignored.
 
-5. Check the surveys are working.
+5. Check the surveys are working in your browser of choice. You can access them at
+_http://server.address/index.html#survey/{id}_ where _{id}_ is the long string of characters
+produced by the _generate_ command. If you would like to check that a collection of surveys is
+working, the URL to use is similar: _http://server.address/index.html#collection/{id}_ where _{id}_
+corresponds to one of the long strings listed as a collection in the output of _generate_.
+
+**Note**: _server.address_ corresponds to the IP address or domain name associated with the
+server on which you are setting up Glance. If you are trying to test the surveys from the same
+machine you should instead use _http://localhost/index..._.
+
+6. Share individual surveys or collections with instructors using the links from the previous step.
+Once they have completed some (a simple process which is briefly explained [here](README.md)), you
+will wish to download their results. One way of doing this is simply to take a backup of the
+Glance database, which we explain later. Another way is to run the following command:
+
 
-6. Download some results.
 
 ### Performing miscellaneous tasks
 
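The URL scheme introduced in the new step 5 can be sketched in a few lines of shell; the host and id below are placeholders (not values from this commit), to be replaced with your server's address and an id printed by `generate`:

```shell
# Hypothetical sketch of the URLs from step 5. HOST and SURVEY_ID are
# placeholders for your server address and a real id from `generate`.
HOST="localhost"
SURVEY_ID="6d7ca6f2-7970-4f24-881b-4f84c0386c63"
SURVEY_URL="http://${HOST}/index.html#survey/${SURVEY_ID}"
COLLECTION_URL="http://${HOST}/index.html#collection/${SURVEY_ID}"
echo "$SURVEY_URL"
echo "$COLLECTION_URL"
# The part after '#' is a client-side fragment, so a quick reachability
# check only needs index.html itself, e.g.:
#   curl -s -o /dev/null -w '%{http_code}\n' "http://${HOST}/index.html"
```

Note the fragment (`#survey/...`) is never sent to the server, which is why a plain request for `index.html` is enough to confirm the server side is up.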
@@ -230,86 +244,10 @@ Glance's database: This can be done with the following command:
 
 ### Support and issues
 
-If you have any issues with any of the steps in this Guide, please submit an issue here according to the following template:
-
-### Other information
-
-If you have followed the above guides successfully you may safely ignore this section.
-
-##### Software prerequisites for Glance
-
-
-##### Glance build description
-
-
-1. Create Postgres database with correct details using the following two commands:
-```
-psql -c 'create user postgres createdb'
-psql -c 'create database glance_eval' -U postgres
-```
-
-2. Download and extract datafiles from provided NCL Dropoff link.
-
-3. Start the `sbt` console in the `soar` directory by executing the `sbt` command.
-
-4. Once the sbt console has started, generate the database schema using the following command:
-```
-glance-evalJVM/flywayMigrate
-```
-
-5. Unfortunately the prepackaged versions of the cli tools are failing silently at the moment. I'm figuring out why as
-we speak, but in the mean time run the `transform` job (which prepares sql12 data for insertion into the glance
-database) using the following command in sbt:
-```
-glance-eval-cli/run transform -c /Location/Of/CSClusterSessions.csv -r /Location/Of/RecapSessions.csv -m /Location/Of/NessMarks.csv -o /Directory/To/Write/Transformed/Csvs -p CSC -y 2015 -s 2
-```
-
-6. Run the `generate` job (which creates the surveys in the glance database) using the following command in sbt:
-```
-glance-eval-cli/run generate -i /Location/Of/marks.csv --modules CSC3621,CSC3222,CSC2026
-```
-**Note** that `marks.csv` is generated by the previous job.
-
-7. Run `load-support` job (which loads cluster and recap info transformed by step 5, into the glance database) using the
-following command in sbt:
-```
-glance-eval-cli/run load-support -c /Location/Of/clusterSessions.csv -r /Location/Of/recapSessions.csv
-```
-**Note** that `clusterSessions.csv` and `recapSessions.csv` are generated by the `transform` job.
-
-8. Exit the `sbt` console and manually execute .sql dump in `glance-eval-cli/cs-surveys-bin` against the glance
-postgres database. This loads module titles, descriptions, keywords, and start dates/durations (where needed).
-
-9. Restart the sbt console (using `sbt`).
-
-10. Start the survey app with the following command:
-```
-glance-evalJVM/reStart
-```
-Once you have done this, the api is available [here](http://localhost:8080), whilst the front-end is available
-[here](http://localhost:12345/glance-eval/js/target/scala-2.11/classes/index-dev.html).
-
-11. If you want to load specific surveys (rather than the default) then you need to use the following url structure:
-`.../index-dev.html#survey/{id}` where `{id}` corresponds to the uuid string for the survey in question. E.g:
-
-```
-.../index-dev.html#survey/13927f7f-ded8-4862-a61f-66b7dd90b709
-```
-
-I suggest we keep a list of the links and the surveys they correspond to so that we can quickly load them at the
-start of each in person session.
-
-### Updating (no data changes)
+If you have any issues with any of the steps in this Guide, please submit an issue
+[here](https://github.com/NewcastleComputingScience/student-outcome-accelerator/issues/new) and we
+will try to help.
 
-1. Pull down the soar repo
-2. re-run `glance-evalJVM/flyWayMigrate`
-3. Perform steps 8-9 above.
 
-### Updating (data changes)
-1. Pull down the soar repo
-2. Backup database with the `pg_dump` utility or similar
-3. Run `glance-evalJVM/flywayClean` then `glance-evalJVM/flywayMigrate`. **Note** that the clean command will wipe the
-database.
-4. Rerun steps 4-9 above.
 
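The database-backup route mentioned in the new step 6 (and in the removed "Updating" notes, which named `pg_dump`) can be sketched as below. This is a hypothetical illustration, assuming the `glance_eval` database and `postgres` user from this guide; the commands are echoed rather than executed because they need a reachable Postgres instance:

```shell
# Hypothetical backup/restore sketch for the Glance database, assuming
# the glance_eval database and postgres user created earlier in the guide.
BACKUP_FILE="glance_eval_backup.sql"
DUMP_CMD="pg_dump -U postgres -d glance_eval -f ${BACKUP_FILE}"
RESTORE_CMD="psql -U postgres -d glance_eval -f ${BACKUP_FILE}"
# Echoed rather than run here; execute them against a running Postgres:
echo "backup:  ${DUMP_CMD}"
echo "restore: ${RESTORE_CMD}"
```

A plain-SQL dump like this is easy to inspect and to restore with `psql`, which suits the small survey-results use case described above.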
README.md (+9 −104)

@@ -1,110 +1,15 @@
-## Student Outcome Accelerator
+## Student Outcome Accelerator (SOAR) - Glance
 
-**NOTE**: This documentation is woefully out of date - watch this space.
+Welcome to the source code repository for Glance, a web application designed to measure the accuracy
+of instructors' models of student performance in higher education institutions.
 
-This repo contains the Soar framework for performing analytics jobs on data from Higher education. Soar is being
-developed as part of a HEFCE funded project, the aims of which include improving the quality of student and teacher
-experiences at Universities.
+The tool was created as part of a [HEFCE](http://www.hefce.ac.uk/) funded project at
+[Newcastle University](http://www.ncl.ac.uk/), focusing on "Human in the Loop" Learning analytics.
+It remains under active development.
 
-Broadly we will focus on the concept of "Human in the Loop" analytics, in order to allow faculty members to implicitly
-train and retrain sophisticated predictive models of student performance without expensive interventions by data
-scientists etc... More on that here when we know what "that" is.
+### How it works
 
-The rest of this readme describes the modules of the Soar framework, as well as instructions for their installation
-and use.
+### Use cases
 
-### Core
+### Using Glance at your University
 
-This module contains the base datastructures, types and helper methods upon which the rest of the project depends. There
-is no need to separately build the Core module; it will be automatically built where needed by the other modules.
-
-### Model
-
-This module contains a minimal and untested spark job for producing models which predict the outcomes
-of student/module pairings based upon past performance.
-
-If this model is successful it could be expanded upon to incorporate more sources of data, and to
-act as a long running web service.
-
-In order to build and run the model, you will need the following pre-requisites:
-
-* Java jdk 8
-* Sbt 13.*
-* Spark 2.1.0
-
-Everything else will be fetched by Sbt. You can normally install sbt as a debian package (or
-through homebrew on OSX). Otherwise, get it [here](http://www.scala-sbt.org/download.html).
-
-Assuming you have all the pre-requisites, run the following commands:
-
-1. `sbt clean`
-2. `sbt assembly`
-3. `sudo ./submit.sh input_file.csv output_directory`
-
-I will expand in the immediate future to allow interactive on-demand predictions from a generated
-or loaded model.
-
-**Note**: The submit script requires sudo at the moment because the quickest fix for specifying an
-output location for spark's log4j logs was to use the system /var/log/ dir. Will modify to user home
-soon but that requires a little bit of platform specific runtime hackery so its on the todo list...
-
-### Evaluation
-
-This module contains code used in the empirical evaluation of various parts of Soar. Over time as we describe and
-perform experiments for the purposes of publication, they will be added to this module.
-
-In order to make our evaluation of Soar as reproducible as possible, this module may be assembled into a single
-executable jar, much like the **model** module.
-
-The first round of experiments evaluating Soar simply uses surveys containing samples of student marks to compare the
-predictive accuracy of the `ALS` algorithm employed in **model** to domain experts (teachers of modules). As such,
-the evaluation jar only has two functions at the moment: to generate these surveys and to read completed surveys and
-calculate measures of accuracy for the predictions they contain.
-
-Assuming you have all the pre-requisites (which are the same as for **model**), you may build the evaluation jar
-using the following commands:
-
-1. `sbt clean`
-2. `sbt install`
-3. `sbt evaluation/assembly`
-
-Once this has finished running, you can then generate the surveys using the following command:
-
-4. `sudo ./submit.sh input.csv output_directory`
-
-As with executing the **model** jar, `input.csv` contains a list of Student/Module scores in the form _Student Number,
-Module Code, Score_. The submit.sh file contains some default command line options which you may well want to change,
-such as what modules you would like to generate surveys for.
-
-If you would like to examine the other command line options, you can execute the evaluation jar directly with the
-following command (from within the project root directory):
-
-`java -jar evaluation/target/scala-2.11/soar-eval.jar --help`
-
-This will produce a traditional unix style help dialogue:
-
-> Soar Evaluation Survey generator 0.1.x
->
-> Usage: SoarEvalGen [options]
->
-> -i, --input <file>  input is a required .csv file containing student/module scores. Format "StudentNumber, Module Code, Percentage"
->
-> -o, --output <directory>  output is a required parameter specifying the directory to write the surveys to.
->
-> -e, --elided e.g. 20  elided is an optional parameter specifying how many student records to partially elide in the generated surveys.
->
-> -m, --modules e.g. CSC1021, CSC2024...  modules is the list of modules for which to elide a students records. Only one module record will be elided per student. One survey is generated per elided module code.
->
-> -c, --common e.g. CSC2024  common is an optional parameter specifying an additional module to elide student records for in *all* generated surveys.
->
-> -s, --seed <int>  seed is an optional parameter specifying a number to use as a seed when randomly selecting student records to elide.
-
-Once you have executed the job, you will see that within the specified output directory, a folder has been created for
-each of the modules specified (except the module specified as common to all surveys, if any). Inside each of these
-folders is a file called `survey.csv` which may be directly opened with a spreadsheet program.
-
-Those student/module scores which we would like module leaders to predict have been given the place holder
-value _-1.0_ for clarity. All scores following such a negative placeholder score have been elided.
-
-Please keep track of which survey file belongs to which module code, as it may be harder to tell once they have been
-filled in by members of staff.
SCHEMA.md (+2 −3)

@@ -22,9 +22,8 @@ them. You can see which visualisations require which files [here](VISUALISATIONS
 are unlikely to be an exhaustive description of all the useful student data available to each institution. Instead they
 are simply the data which is needed to power Glance's various visualisations.
 
-If you have any questions about the files specified in this document, please create an issue with the following format:
-
-TODO: Write an issue template for reporting schema issues.
+If you have any questions about the files specified in this document, please create an issue
+[here](https://github.com/NewcastleComputingScience/student-outcome-accelerator/issues/new).
 
 ## Files
 
