diff --git a/Lab-2-solutions.ipynb b/Lab-2-solutions.ipynb new file mode 100644 index 0000000..eeb8168 --- /dev/null +++ b/Lab-2-solutions.ipynb @@ -0,0 +1,1017 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "uk7yc0nadBGa" + }, + "source": [ + "# Lab 2\n", + "\n", + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github//afarbin/DATA1401-Spring-2020/blob/master/Labs/Lab-2/Lab-2.ipynb)\n", + "\n", + "## Submitting lab solutions\n", + "\n", + "At the end of the previous lab, you should have set up a \"Solutions\" directory in your Google Drive, with a fork of the class git repository that pulls from Dr. Farbin's version and pushes to your own fork. \n", + "\n", + "Unfortunately, due to a typo in the previous lab, you probably forked the 2019 version of the GitHub repository for this course. Unless you noticed and corrected the error, you'll have to fork again.\n", + "\n", + "In addition, due to some problems with the setup in Google Colab, you will be submitting your solutions to your fork using the web interface. Instructions on how to use the command line are in this notebook, but we suggest you do not follow them unless you are working in a Jupyter notebook and not Google Colab." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "OMNaOnRksNK3" + }, + "source": [ + "You may also choose to delete the fork from your GitHub account. 
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "J_R64sQDqv0A" + }, + "source": [ + "## Repeating last steps of Lab 1\n", + "\n", + "### Create your own fork\n", + "We will create a new fork where you can keep track of and submit your work, following [these instructions](https://help.github.com/articles/fork-a-repo/).\n", + "\n", + "Go to github.com and log in.\n", + "\n", + "Next, create a fork of the [2020 class repository](https://github.com/afarbin/DATA1401-Spring-2020). Click the link and press the \"Fork\" button on the top right. Select your account as the place where you want to put the fork.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "edTvE6rOqv0C" + }, + "source": [ + "### Make a local clone (Advanced)\n", + "\n", + "Before we get started, please mount your Google Drive by clicking the file icon on the left, then clicking \"Mount Drive\", and following the instructions as you did in the previous lab.\n", + "\n", + "If you did complete Lab 1 and therefore created a 2019 fork and a local clone in your Google Drive, delete the local clone:\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "2u6B-rfNr1wN" + }, + "outputs": [], + "source": [ + "!rm -rf drive/My\\ Drive/Data-1401-Repo" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "BDVI5nu8-2RH" + }, + "source": [ + "Now we will check out your fork in your Google Drive / Colab. If you will be doing everything on your own computer instead of Google Colab/Drive, you are welcome to install Git on your computer and perform the following steps (appropriately modified) on your computer instead.\n", + "\n", + "Start by listing the contents of your current directory." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "e5tXg0f8qv0D" + }, + "outputs": [], + "source": [ + "%cd /content/drive/My\\ Drive\n", + "!ls" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "WYsyYcg1qv0J" + }, + "source": [ + "Make a new directory:" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "Z7noY1hMqv0L" + }, + "outputs": [], + "source": [ + "!mkdir Data-1401-Repo\n", + "%cd Data-1401-Repo" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "fwsBdTnYqv0Q" + }, + "source": [ + "From the GitHub page for your fork, press the green \"Clone or download\" button and copy the URL.\n", + "\n", + "Go to your notebook and use the following command to clone the repository, pasting the URL you just copied:\n" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "8w42MH6Jqv0S" + }, + "outputs": [], + "source": [ + "# What you paste here should look like:\n", + "#!git clone https://github.com/ \u001b[40;33;01m/proc/kcore\u001b[0m\n", + "lrwxrwxrwx 1 root root 13 Feb 6 19:12 \u001b[01;36mfd\u001b[0m -> \u001b[01;34m/proc/self/fd\u001b[0m\n", + "crw-rw-rw- 1 root root 1, 7 Feb 6 19:12 \u001b[40;33;01mfull\u001b[0m\n", + "crw-rw-rw- 1 root root 10, 229 Feb 6 19:12 \u001b[40;33;01mfuse\u001b[0m\n", + "drwxrwxrwt 2 root root 40 Feb 6 19:12 \u001b[30;42mmqueue\u001b[0m\n", + "crw-rw-rw- 1 root root 1, 3 Feb 6 19:12 \u001b[40;33;01mnull\u001b[0m\n", + "lrwxrwxrwx 1 root root 8 Feb 6 19:12 \u001b[01;36mptmx\u001b[0m -> \u001b[40;33;01mpts/ptmx\u001b[0m\n", + "drwxr-xr-x 2 root root 0 Feb 6 19:12 \u001b[01;34mpts\u001b[0m\n", + "crw-rw-rw- 1 root root 1, 8 Feb 6 19:12 \u001b[40;33;01mrandom\u001b[0m\n", + "drwxrwxrwt 2 root root 40 Feb 6 19:15 \u001b[30;42mshm\u001b[0m\n", + "lrwxrwxrwx 
1 root root 15 Feb 6 19:12 \u001b[01;36mstderr\u001b[0m -> \u001b[40;33;01m/proc/self/fd/2\u001b[0m\n", + "lrwxrwxrwx 1 root root 15 Feb 6 19:12 \u001b[01;36mstdin\u001b[0m -> \u001b[40;33;01m/proc/self/fd/0\u001b[0m\n", + "lrwxrwxrwx 1 root root 15 Feb 6 19:12 \u001b[01;36mstdout\u001b[0m -> \u001b[40;33;01m/proc/self/fd/1\u001b[0m\n", + "crw-rw-rw- 1 root root 5, 0 Feb 6 19:12 \u001b[40;33;01mtty\u001b[0m\n", + "crw-rw-rw- 1 root root 1, 9 Feb 6 19:12 \u001b[40;33;01murandom\u001b[0m\n", + "crw-rw-rw- 1 root root 1, 5 Feb 6 19:12 \u001b[40;33;01mzero\u001b[0m\n", + "\u001b]0;root@b280feab87e4: /dev\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/dev\u001b[00m# quit\n", + "bash: quit: command not found\n", + "\u001b]0;root@b280feab87e4: /dev\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/dev\u001b[00m# exit\n", + "exit\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "H4QTSFJg-Qp1", + "colab_type": "text" + }, + "source": [ + "Answer: Looks like I don't have any SSDs. " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "7P9EG0KOqvz2", + "colab_type": "text" + }, + "source": [ + "## Text File Manipulation\n", + "\n", + "As explained in lecture, Unix stores most information in text files. For example, the list of all users and their home directories is stored in \"/etc/passwd\". Let's get some familiarity with the most commonly used commands to manipulate files.\n", + "\n", + " - You can see the contents of a file using the \"cat\" (concatenate) command. Try executing \"cat /etc/passwd\". You'll get a huge list that will go by your screen quickly. \n", + " \n", + " - To go through the file page by page, you can use the \"less\" or \"more\" commands. \n", + " \n", + " - You can see the first or last N (N=10 by default) lines of a file using the \"head\" or \"tail\" commands. For example, \"tail -20 /etc/passwd\" will list the last 20 lines. 
\n", + " \n", + " - You can search a text file using the \"grep\" command, which takes a string keyword as the first argument and a filename as the second, and by default prints out every line in the file that contains the string. So for example you can do \"grep \\$USER /etc/passwd\" to find the line corresponding to your account. Some useful flags: \n", + " \n", + " - \"-i\" ignores the case of the keyword\n", + " - \"-v\" displays those lines that do NOT match \n", + " - \"-n\" precedes each matching line with the line number \n", + " - \"-c\" prints only the total count of matched lines \n", + " \n", + " For example, \"grep -c \\$USER /etc/passwd\" should show that you are in the password file just once. \n", + " \n", + " - The \"wc\" (word count) command counts the number of lines, words, and characters in a file. By default \"wc\" gives you all three numbers, but the \"-w\", \"-l\", or \"-c\" flags limit the output to the word, line, or character (byte) count, respectively. \n", + "\n", + "*Exercise 4:* Count how many lines in the password file contain the letter \"w\". 
" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "UlsANMuf2qMs", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 595 + }, + "outputId": "b312de92-5a8a-480a-b9b5-b147479bd97b" + }, + "source": [ + "!/bin/bash --noediting\n" + ], + "execution_count": 5, + "outputs": [ + { + "output_type": "stream", + "text": [ + "bash: cannot set terminal process group (124): Inappropriate ioctl for device\n", + "bash: no job control in this shell\n", + "\u001b]0;root@b280feab87e4: /content\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/content\u001b[00m# cat /etc/passwd\n", + "root:x:0:0:root:/root:/bin/bash\n", + "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n", + "bin:x:2:2:bin:/bin:/usr/sbin/nologin\n", + "sys:x:3:3:sys:/dev:/usr/sbin/nologin\n", + "sync:x:4:65534:sync:/bin:/bin/sync\n", + "games:x:5:60:games:/usr/games:/usr/sbin/nologin\n", + "man:x:6:12:man:/var/cache/man:/usr/sbin/nologin\n", + "lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin\n", + "mail:x:8:8:mail:/var/mail:/usr/sbin/nologin\n", + "news:x:9:9:news:/var/spool/news:/usr/sbin/nologin\n", + "uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin\n", + "proxy:x:13:13:proxy:/bin:/usr/sbin/nologin\n", + "www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin\n", + "backup:x:34:34:backup:/var/backups:/usr/sbin/nologin\n", + "list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin\n", + "irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin\n", + "gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin\n", + "nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin\n", + "_apt:x:100:65534::/nonexistent:/usr/sbin/nologin\n", + "systemd-network:x:101:104:systemd Network Management,,,:/run/systemd/netif:/usr/sbin/nologin\n", + "systemd-resolve:x:102:105:systemd Resolver,,,:/run/systemd/resolve:/usr/sbin/nologin\n", + "messagebus:x:103:107::/nonexistent:/usr/sbin/nologin\n", + "nvidia-persistenced:x:104:108:NVIDIA Persistence 
Daemon,,,:/nonexistent:/sbin/nologin\n", + "\u001b]0;root@b280feab87e4: /content\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/content\u001b[00m# wc /etc/passwd\n", + " 23 33 1243 /etc/passwd\n", + "\u001b]0;root@b280feab87e4: /content\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/content\u001b[00m# grep wc -i W /etc/passwd\n", + "grep: W: No such file or directory\n", + "\u001b]0;root@b280feab87e4: /content\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/content\u001b[00m# grep -c W -i /etc/passwd\n", + "3\n", + "\u001b]0;root@b280feab87e4: /content\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/content\u001b[00m# exit\n", + "exit\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SZuhLbD8qvz5", + "colab_type": "text" + }, + "source": [ + "## Redirection\n", + "\n", + "Unix provides programs with \"pipes\" for input and output. Most of what you see on the screen when you run a program was written to the \"stdout\" (standard output) pipe. Other pipes are \"stdin\" (standard input) and \"stderr\" (standard error), where error messages are written.\n", + "\n", + "As discussed in lecture, the basic commands of Unix are simple, but you can chain them to do complicated things. Redirection is how you chain these commands, directing the output of one command to the input of the next.\n", + "\n", + "As an example, consider the \"cat\" command. Cat takes stdin and outputs it to stdout. Type \"cat\", press enter, and type some text to confirm. You can get back to the command prompt by pressing \"control-c\" (sends the terminate signal) or \"control-d\" (end of file character). Note that from now on we will use the convention: \"control-d\" = \"^D\"\n", + "\n", + "*Exercise 5a:* Using \"cat\" and redirection you can write things into a file. The \">\" symbol directs stdout into a file. Try \"cat > favorite-colors-list.txt\" and then type in your 3 favorite colors, each on its own line. 
Use \"^D\" to end your input." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "H5vxtcXnqvz6", + "colab_type": "text" + }, + "source": [ + "Use \"cat\", \"more\", or \"less\" to confirm that your file is as you expect. \">>\" allows you to append to the file. \n", + "\n", + "*Exercise 5b:* Append 2 more colors to your file." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "twRKNaGy3XGw", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 95 + }, + "outputId": "cfc613bc-1641-4d3b-e3c0-57b4f4e1ef26" + }, + "source": [ + "!/bin/bash --noediting" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "stream", + "text": [ + "bash: cannot set terminal process group (124): Inappropriate ioctl for device\n", + "bash: no job control in this shell\n", + "\u001b]0;root@f065f1fd4329: /content\u0007\u001b[01;32mroot@f065f1fd4329\u001b[00m:\u001b[01;34m/content\u001b[00m# " + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DZODNKiAqvz8", + "colab_type": "text" + }, + "source": [ + "The \"sort\" command sorts what it sees on stdin. Instead of taking input from the terminal, you can direct the shell to take stdin from a file using \"<\". Try \"sort < favorite-colors-list.txt\" and \"sort < favorite-colors-list.txt > sorted-favorite-colors-list.txt\".\n", + "\n", + "Finally, instead of redirecting input / output into files, you can directly chain one program into another using \"|\". So for example, you can do \"cat /etc/passwd | grep -i \\$USER | wc -l\". \n", + "\n", + "*Exercise 5c:* Use redirection to count the number of users on TACC with your first name. Copy the command you used into the box below." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "oP9XlZl_3iZD", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!/bin/bash --noediting" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "v5IaZXNyqvz_", + "colab_type": "text" + }, + "source": [ + "## Git\n", + "\n", + "`git` is a Version Control System (VCS), typically used to organize the source code of software projects, but also well suited to documents or web pages. An instance of a `git` server stores repositories, each typically containing the code relevant to a specific project. Users create local `clones` of repositories, change and develop the local copies of the code, `commit` the changes to their local repository, `push` to the server as a contribution, \n", + "`pull` updates from the server, and `merge` changes between local and remote versions. \n", + "\n", + "Besides cloning, repositories can be branched or forked. A repository generally starts with a `master` branch that evolves as pull requests are merged in. Creating a new branch from an existing branch creates a snapshot of the code, which can evolve independently or be merged in later. Branches are easy to make and delete, and can serve various purposes. They can represent a stable version of a software package, or a parallel development for a different operating system. A fork of a repository is a standalone instance of the repository which can be stored and managed independently from the original, where you can work independently without constraints or interference. \n", + "\n", + "[GitHub](github.com) provides a massive publicly accessible instance of a `git` system. Besides sharing code, it lets projects be developed by the open source community. It provides tools for managing your repository and a wiki for documentation. Contributions to public software on GitHub generally require making a merge request, which is then judged by the managers of the repository. 
That's why most software packages encourage you to create a new fork, so you can work independently.\n", + "\n", + "Let's take a look at some repositories:\n", + "\n", + "* [This class](https://github.com/afarbin/DATA1401-Spring-2020)\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "J_R64sQDqv0A", + "colab_type": "text" + }, + "source": [ + "## Plan\n", + "\n", + "You made a clone of the class repository at the start of this lab. We will create a new fork where you can keep track of and submit your work, following [these instructions](https://help.github.com/articles/fork-a-repo/).\n", + "\n", + "Go to github.com and log in.\n", + "\n", + "Next, let's create a fork of the [class repository](https://github.com/afarbin/DATA1401-Spring-2019). Click the link and press the \"Fork\" button on the top right. Select your account as the place where you want to put the fork.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "edTvE6rOqv0C", + "colab_type": "text" + }, + "source": [ + "Now we will check out your fork in your Google Drive / Colab.\n", + "\n", + "Note: Jupyter allows you to run shell commands directly in a notebook. We will use `!` and `%` to call shell commands directly in this notebook. Follow along yourself. Either create a new notebook or open a terminal. \n", + "\n", + "Start by listing the contents of your current directory." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "e5tXg0f8qv0D", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 316 + }, + "outputId": "0565d766-8273-4902-d036-5cb277510178" + }, + "source": [ + "!/bin/bash --noediting\n", + "%cd /content/drive/My\\ Drive\n", + "!ls" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "stream", + "text": [ + "bash: cannot set terminal process group (125): Inappropriate ioctl for device\n", + "bash: no job control in this shell\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# %cd /content/drive/My\\ Drive\n", + "bash: fg: no job control\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# 1ls\n", + "bash: 1ls: command not found\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# !ls\n", + "bash: !ls: event not found\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# cd /content/drive/My\\ Drive\n", + "bash: cd: /content/drive/My Drive: No such file or directory\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# cd /content\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# cd /drive\n", + "bash: cd: /drive: No such file or directory\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# cd drive\n", + "bash: cd: drive: No such file or directory\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# " + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WYsyYcg1qv0J", + 
"colab_type": "text" + }, + "source": [ + "Make a new directory:" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "Z7noY1hMqv0L", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!mkdir Data-1401-Repo\n", + "%cd Data-1401-Repo" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fwsBdTnYqv0Q", + "colab_type": "text" + }, + "source": [ + "From the GitHub page for your fork, press the green \"Clone or download\" button and copy the URL.\n", + "\n", + "Go to your notebook and use the following command to clone the repository, pasting the URL you just copied:\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "8w42MH6Jqv0S", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# What you paste here should look like:\n", + "#!git clone https://github.com//DATA1401-Spring-2020.git" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "cOAuqTVUqv0V", + "colab_type": "text" + }, + "source": [ + "Go into the directory:" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "b1Ew4tEZqv0X", + "colab_type": "code", + "colab": {} + }, + "source": [ + "%cd DATA1401-Spring-2020\n", + "!ls" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "IrhWToc-qv0a", + "colab_type": "text" + }, + "source": [ + "We will now connect your fork to the original so you can pull changes from there. 
\n", + "\n", + "Check remote status:" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "JxtMYR-9qv0c", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!git remote -v" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9ud3X0fBqv0f", + "colab_type": "text" + }, + "source": [ + "Now use the original class URL to set your upstream:" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "pgJlKxBqqv0h", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!git remote add upstream https://github.com/afarbin/DATA1401-Spring-2020.git" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "id2yUEt9qv0k", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!git remote -v" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "sAkgeJ6Iqv0n", + "colab_type": "text" + }, + "source": [ + "From now on, you can get the newest version of class material by using:" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "AGDsfTFLqv0o", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!git pull" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "u9RAhs5b4vXY", + "colab_type": "text" + }, + "source": [ + "We will submit your Lab 1 using git at the next Lab." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "PPfGmFQI40HR", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + } + ] +} \ No newline at end of file diff --git a/Labs/Lab-1/Lab_1_solutions.ipynb b/Labs/Lab-1/Lab_1_solutions.ipynb new file mode 100644 index 0000000..c41e877 --- /dev/null +++ b/Labs/Lab-1/Lab_1_solutions.ipynb @@ -0,0 +1,957 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "colab": { + "name": "Lab-1.ipynb", + "provenance": [], + "collapsed_sections": [] + } + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "O5vg8KKRq0sy", + "colab_type": "text" + }, + "source": [ + "# Lab 1\n", + "\n", + "## Python Notebooks on Google Colab\n", + "\n", + "Data 1401's Labs, Homework, and Exams will all be in the form of iPython notebooks. You may already be familiar with Python notebooks if you have used Jupyter before, for example in Data 1301. If so, you are welcome to use whatever means you have to run Jupyter notebooks for this course, though you may get limited support. Our primary means of running Python notebooks will be through [Google Colab](https://colab.research.google.com) and we will be storing files on Google Drive.\n", + "\n", + "You will need a Google account. If you do not have one or you wish to use a different account for this course, please follow [these instructions](https://edu.gcfglobal.org/en/googledriveanddocs/getting-started-with-google-drive/1/) to make an account.\n", + "\n", + "Once you are ready with your account, you can continue in Colab. 
Click on the following badge to open this notebook in Colab:\n", + "\n", + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github//afarbin/DATA1401-Spring-2020/blob/master/Labs/Lab-1/Lab-1.ipynb)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "FVt_1hPt1dAK", + "colab_type": "text" + }, + "source": [ + "## Notebooks in Colab\n", + "\n", + "You now are presumably in Colab. A word of caution: by default, Google Colab does not save your notebooks, so if you close your session, you will lose your work.\n", + "\n", + "So first thing: from the file menu above select \"Save a copy in Drive\"." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "x0JBL_RFrDDj", + "colab_type": "text" + }, + "source": [ + "## Storing Notebooks in Google Drive\n", + "A better way to work is to save your notebooks directly into Google Drive and upload directly to Git (where you will be downloading and uploading your homework). In order to properly set up Git, we'll need to work more directly in your Google Drive.\n", + "\n", + "On the left sidebar, press the file icon to see a listing of files accessible to this Notebook. Then press \"Mount Drive\" and follow the instructions to mount your Google Drive in this notebook. A new cell will be inserted into this notebook; when you run it by pressing the play button, it will instruct you to follow a link to log into your Google Account and enable access to your Drive in another tab. Finally you will copy a link from the new tab back into the cell in this notebook. Once you are done, press refresh under files in the left sidebar and you should have \"drive/My Drive\" appear." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "hwJ6wJk3tiLv", + "colab_type": "text" + }, + "source": [ + "## Github\n", + "All the class material will be stored on github. You will also submit your homework using github. 
To do so, you will need a github account.\n", + "\n", + "If you do not already have a github account or wish to create a new one for this course, create one:\n", + "* Browse to [github.com](https://github.com).\n", + "* Click the green “Sign up for GitHub” button.\n", + "* Follow instructions for creating an account.\n", + "* Make sure you remember your github username and password.\n", + "\n", + "Write an email to the course TA titled \"Data 1401: Github account\" with your github username (not your password) as the contents.\n", + "\n", + "## Google Groups\n", + "\n", + "Class announcements will be made via Google Groups. If you did not already receive an invite to the class Google group, had trouble with the invite, or wish to use a different email address, write an email to the course TA titled \"Data 1401: Google Group\" with your preferred email.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TjfIzdQZqvzk", + "colab_type": "text" + }, + "source": [ + "## Introduction: Unix, Git, and Jupyter\n", + "\n", + "This lab aims to introduce you to basic Unix, familiarize you with iPython notebooks, and get you set up to submit your homework." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "C_LmOgzFqvzp", + "colab_type": "text" + }, + "source": [ + "\n", + "\n", + "### Terminal, Shell, and ssh\n", + "\n", + "\n", + "The terminal is a simple program that generally runs another program, taking mostly keyboard input from you, passing it to this other program, and taking the output of the program and displaying it on the screen for you.\n", + "\n", + "The terminal usually runs a program called a shell. Shells present a command prompt where you can type in commands, which are then executed when you press enter. In most shells, there are some special commands which the shell itself will execute. 
Everything else you type in, the shell will assume is the name of a program you want to run and arguments you want to pass to that program. So if the shell doesn't recognize something you type in, it'll try to find a program with a name that is the same as the first word you gave it. \n", + "\n", + "### Shell in Colab\n", + "\n", + "Unfortunately, Google Colab does not allow you to open a terminal window. Jupyter does, so if you are running in Jupyter (which most of you will not be), you may choose to open a terminal window by returning to the Jupyter file list tab and selecting new terminal from the top right.\n", + "\n", + "For Colab, we will have to do something non-ideal, but functional. There are several ways to execute shell commands from within a Python notebook. For example, you can use any shell command by putting \"!\" in front of the command:\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "KJ5f-WO0wcAv", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!ls\n", + "!echo \"----------\"\n", + "!ls sample_data" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8f-n4AXFw-dD", + "colab_type": "text" + }, + "source": [ + "Unfortunately, every time you use \"!\" a new environment is created and the state is reverted to the original state. 
Try to understand the difference between the following two sets of commands:\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "99nrBYTWxZJr", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!echo \"Technique 1:\"\n", + "!ls\n", + "!cd sample_data\n", + "!ls" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "2-Znf97Lxl-Z", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!echo \"Technique 2:\"\n", + "!ls ; cd sample_data ;ls" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4x9n1rAkxyYl", + "colab_type": "text" + }, + "source": [ + "Notebooks allow a bit of \"magic\" (using \"%\") to avoid some of these limitations:\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "vLBPTX4rx3gd", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!echo \"Technique 3:\"\n", + "!ls \n", + "%cd sample_data \n", + "!ls" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "U8XpvPjcyH0w", + "colab_type": "text" + }, + "source": [ + "For our purposes, we are just going to explicitly start a new shell and interact with it in the output cell. Execute the following cell. You will be able to type and execute commands. Look around a bit using \"ls\" and \"cd\". You can stop the cell from running by typing \"exit\"." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "MIDFitLZyuZy", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!/bin/bash --noediting" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "q-4hfZBywW25", + "colab_type": "text" + }, + "source": [ + "While in this instance your shell is running in this notebook, you can also run terminals natively on your own computer. On Linux or MacOS, you just have to run a program called terminal. In Windows you can start a \"command prompt\". 
\n", + "\n", + "\n", + "Type \"ls\" into the terminal and press enter. The shell will find a program called \"ls\", a standard tool in Unix, and run it. \"ls\" lists the contents (files and directories) of your current directory. If you are just starting in this course, you probably only see the git repository you cloned. \n", + "\n", + "A subtle point to realize here is that while the terminal is running in the browser that is running on the computer in front of you, the shell is actually running on a machine on Google hardware. The shell prompt typically displays the name of the machine you are using. What you are not seeing is that there is an intermediate program between the terminal running on your computer and the shell running on Google. This intermediary program takes your input from the terminal, sends it over the network to Google, and brings back the responses for your terminal to display.\n", + "\n", + "A bit of extra information. If you start a terminal on your own computer, the shell runs locally. The \"ls\" command would then list the contents of a directory on your computer. You can typically connect to Unix computers by invoking a shell running on that machine over the network. In this case, you would have to start this intermediary program yourself. The program is called \"ssh\" (secure shell). You can \"ssh\" to another machine from your machine by simply typing \"ssh\" followed by the machine name or IP address. Most likely you would be prompted for a password, after which you would be dropped into the prompt of a shell running on the remote machine. \n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "51Eya4LBqvzs", + "colab_type": "text" + }, + "source": [ + "## Programs and Environment Variables\n", + "\n", + "You have a listing of your current directory, but you don't know where that directory resides. You can see what directory you are in using the command \"pwd\" (print working directory). 
Issue the command and look at the response. You'll get a slash (\"/\") separated list, known as the path, of the directory hierarchy of your current working directory. On Colab, this will start with \"/content\".\n", + "\n", + "Now back to thinking about the command prompt. Since \"ls\" is a program, it must be stored somewhere. It is clearly not in your working directory, because you didn't see it when you executed \"ls\". We can ask the shell to tell us where it found \"ls\" using the \"which ls\" command. Note that \"which\" is also a program. \"which ls\" comes back with \"/bin/ls\", telling you the \"ls\" program is sitting in the \"/bin\" directory of the system. \n", + "\n", + "Let's see what else is in there by issuing an \"ls /bin\" command. You will get a long list of programs. You can run any of these programs by just typing their names and pressing enter. You may be able to guess what some of these programs do, but if you want to know, most of them provide help through the \"--help\" or \"-h\" flag. For example, execute \"ls --help\". For more information about a program or command, you can use Unix's manual pages using the \"man\" command. Try typing \"man ls\". Note that you will need to press space to scroll through lengthy manual pages and \"q\" to exit back to the shell prompt. \n", + "\n", + "Another interesting command is \"echo\". \"echo\" simply prints whatever you put after it to the screen. Try executing \"echo Hello World.\"\n", + "\n", + "At this point, you may wonder how the shell knew to look for programs in \"/bin\". The shell keeps a list of places to look for programs in an environment variable with the name \"PATH\". The shell keeps a table that maps string variable names to string values. When the shell starts, its configuration files set some environment variables that it uses. 
You can see the full list of defined environment variables using the command \"printenv\".\n", + "\n", + "You can use an environment variable in a shell by prepending the name of the variable with a dollar sign character (\"\\$\"). So you can print out the PATH environment variable using the command \"echo $PATH\". What you will see is a colon (\":\") separated list of directories that the shell will search (in order) whenever you type a command.\n", + "\n", + "You can set your own environment variables. Different shells have different syntax. Let's first figure out what shell we are running. \n", + "\n", + "*Exercise 1:* Use the \"echo\" command to print out the value of the \"SHELL\" environment variable:" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "YS7YFiPwqvzu", + "colab_type": "text" + }, + "source": [ + "!/bin/bash --noediting" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "mCqhdAkMkFqs", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 119 + }, + "outputId": "e7eeb4aa-d47c-48cb-a31c-46fbd60273de" + }, + "source": [ + "!/bin/bash --noediting\n" + ], + "execution_count": 7, + "outputs": [ + { + "output_type": "stream", + "text": [ + "bash: cannot set terminal process group (122): Inappropriate ioctl for device\n", + "bash: no job control in this shell\n", + "\u001b]0;root@cc4296d07c6b: /content\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content\u001b[00m# echo $SHELL\n", + "/bin/bash\n", + "\u001b]0;root@cc4296d07c6b: /content\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content\u001b[00m# exit\n", + "exit\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "YoEgruUhqvzw", + "colab_type": "text" + }, + "source": [ + "## Navigating Directories\n", + "\n", + "You can change your current directory using the \"cd\" shell command. Note that \"cd\" is not a standalone Unix program; it is built into the shell. 
Once in a directory, you can use the \"ls\" command to list the contents or \"pwd\" to remind yourself of your current working directory. You can move back one level in your current directory hierarchy using \"cd ..\". In general, \"..\" represents the path to a directory one level above your current directory, \"../..\" represents two levels up, and so on. \".\" represents the current directory. If the last item of your PATH environment variable is \".\", it tells the shell to also look in your current directory for commands. Finally, the \"~\" character always refers to your home directory.\n", + "\n", + "Some other file manipulation commands:\n", + "\n", + " - The \"mkdir\" command creates new directories. \n", + " - \"cp\" and \"mv\" allow you to copy and move (or rename) files, taking 2 arguments: the original path/filename and the target path/filename. \n", + " - The \"rm\" and \"rmdir\" commands remove (delete) files and directories.\n", + "\n", + "\n", + "*Exercise 2:* Using the \"cd\" command, navigate into the \"drive/My\\ Drive\" directory. Create a new directory called \"Data-1441\", and another directory inside \"Data-1441\" called \"Lab-1-Solutions\". Perform the rest of the lab in this directory." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "A16VzZ3G0J8x", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 714 + }, + "outputId": "4a1a576d-2378-4ed0-c7d2-89e8ca284ea7" + }, + "source": [ + "!/bin/bash --noediting\n" + ], + "execution_count": 12, + "outputs": [ + { + "output_type": "stream", + "text": [ + "bash: cannot set terminal process group (122): Inappropriate ioctl for device\n", + "bash: no job control in this shell\n", + "\u001b]0;root@cc4296d07c6b: /content\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content\u001b[00m# !cd\n", + "cd drive\n", + "\u001b]0;root@cc4296d07c6b: /content/drive\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive\u001b[00m# mkdir Data-1441\n", + "mkdir: cannot create directory ‘Data-1441’: Operation not supported\n", + "\u001b]0;root@cc4296d07c6b: /content/drive\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive\u001b[00m# ls\n", + "\u001b[0m\u001b[01;34m'My Drive'\u001b[0m\n", + "\u001b]0;root@cc4296d07c6b: /content/drive\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive\u001b[00m# pwd\n", + "/content/drive\n", + "\u001b]0;root@cc4296d07c6b: /content/drive\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive\u001b[00m# cd /content\n", + "\u001b]0;root@cc4296d07c6b: /content\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content\u001b[00m# mkdir Data-1441\n", + "\u001b]0;root@cc4296d07c6b: /content\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content\u001b[00m# ls\n", + "\u001b[0m\u001b[01;34mData-1441\u001b[0m \u001b[01;34mdrive\u001b[0m \u001b[01;34msample_data\u001b[0m\n", + "\u001b]0;root@cc4296d07c6b: /content\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content\u001b[00m# pwd\n", + "/content\n", + "\u001b]0;root@cc4296d07c6b: /content\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content\u001b[00m# !cd 
Data-1441\n", + "cd /content Data-1441\n", + "bash: cd: too many arguments\n", + "\u001b]0;root@cc4296d07c6b: /content\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content\u001b[00m# cd Data-1441\n", + "\u001b]0;root@cc4296d07c6b: /content/Data-1441\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/Data-1441\u001b[00m# pwd\n", + "/content/Data-1441\n", + "\u001b]0;root@cc4296d07c6b: /content/Data-1441\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/Data-1441\u001b[00m# cd /content/drive/My\\ Drive\n", + "\u001b]0;root@cc4296d07c6b: /content/drive/My Drive\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive/My Drive\u001b[00m# mkdir Data-1441\n", + "mkdir: cannot create directory ‘Data-1441’: File exists\n", + "\u001b]0;root@cc4296d07c6b: /content/drive/My Drive\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive/My Drive\u001b[00m# mkdir Data-1441\n", + "mkdir: cannot create directory ‘Data-1441’: File exists\n", + "\u001b]0;root@cc4296d07c6b: /content/drive/My Drive\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive/My Drive\u001b[00m# rm Data-1441\n", + "rm: cannot remove 'Data-1441': Is a directory\n", + "\u001b]0;root@cc4296d07c6b: /content/drive/My Drive\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive/My Drive\u001b[00m# rmdir Data-1441\n", + "\u001b]0;root@cc4296d07c6b: /content/drive/My Drive\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive/My Drive\u001b[00m# mkdir Data-1441\n", + "\u001b]0;root@cc4296d07c6b: /content/drive/My Drive\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive/My Drive\u001b[00m# cd /content/drive/My\\ Drive/ Data-1441 \n", + "bash: cd: too many arguments\n", + "\u001b]0;root@cc4296d07c6b: /content/drive/My Drive\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive/My Drive\u001b[00m# cd Data-1441\n", + 
"\u001b]0;root@cc4296d07c6b: /content/drive/My Drive/Data-1441\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive/My Drive/Data-1441\u001b[00m# pwd\n", + "/content/drive/My Drive/Data-1441\n", + "\u001b]0;root@cc4296d07c6b: /content/drive/My Drive/Data-1441\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive/My Drive/Data-1441\u001b[00m# mkdir Lab-1-Solutions\n", + "\u001b]0;root@cc4296d07c6b: /content/drive/My Drive/Data-1441\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive/My Drive/Data-1441\u001b[00m# pwd\n", + "/content/drive/My Drive/Data-1441\n", + "\u001b]0;root@cc4296d07c6b: /content/drive/My Drive/Data-1441\u0007\u001b[01;32mroot@cc4296d07c6b\u001b[00m:\u001b[01;34m/content/drive/My Drive/Data-1441\u001b[00m# exit\n", + "exit\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "40-LEVappBw4", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 122 + }, + "outputId": "e39055c3-81ab-497c-ec0d-0ceea90650b8" + }, + "source": [ + "from google.colab import drive\n", + "drive.mount('/content/drive')" + ], + "execution_count": 9, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly\n", + "\n", + "Enter your authorization code:\n", + "··········\n", + "Mounted at /content/drive\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "o38c4lbsqvzy", + "colab_type": "text" + }, + "source": [ + "## Exploring Unix 
Filesystem\n", + "\n", + "You can look at the root directory of the system by issuing \"ls /\". As explained in lecture, Unix uses the file system to communicate with devices and between processes. \"/etc\" keeps the configuration files of the system. \"/bin\" and \"/sbin\" store most of the standard Unix programs. \"/usr\" stores installed programs and their associated files, with \"/usr/bin\" usually storing the commands you can run. \n", + "\n", + "*Exercise 3:* List the \"/dev\" directory. How many SSD storage devices do you see? How many partitions does each device have? (Answer in the box below.)" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "yNj2LXzP2ksl", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 527 + }, + "outputId": "ac6f2d69-9c6d-43c3-a331-4b07a8070fba" + }, + "source": [ + "!/bin/bash --noediting\n" + ], + "execution_count": 1, + "outputs": [ + { + "output_type": "stream", + "text": [ + "bash: cannot set terminal process group (124): Inappropriate ioctl for device\n", + "bash: no job control in this shell\n", + "\u001b]0;root@b280feab87e4: /content\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/content\u001b[00m# cd /dev\n", + "\u001b]0;root@b280feab87e4: /dev\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/dev\u001b[00m# ls\n", + "\u001b[0m\u001b[01;36mcore\u001b[0m \u001b[40;33;01mfull\u001b[0m \u001b[30;42mmqueue\u001b[0m \u001b[01;36mptmx\u001b[0m \u001b[40;33;01mrandom\u001b[0m \u001b[01;36mstderr\u001b[0m \u001b[01;36mstdout\u001b[0m \u001b[40;33;01murandom\u001b[0m\n", + "\u001b[01;36mfd\u001b[0m \u001b[40;33;01mfuse\u001b[0m \u001b[40;33;01mnull\u001b[0m \u001b[01;34mpts\u001b[0m \u001b[30;42mshm\u001b[0m \u001b[01;36mstdin\u001b[0m \u001b[40;33;01mtty\u001b[0m \u001b[40;33;01mzero\u001b[0m\n", + "\u001b]0;root@b280feab87e4: /dev\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/dev\u001b[00m# ls core\n", + "\u001b[0m\u001b[01;36mcore\u001b[0m\n", + 
"\u001b]0;root@b280feab87e4: /dev\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/dev\u001b[00m# ls -lh\n", + "total 0\n", + "lrwxrwxrwx 1 root root 11 Feb 6 19:12 \u001b[0m\u001b[01;36mcore\u001b[0m -> \u001b[40;33;01m/proc/kcore\u001b[0m\n", + "lrwxrwxrwx 1 root root 13 Feb 6 19:12 \u001b[01;36mfd\u001b[0m -> \u001b[01;34m/proc/self/fd\u001b[0m\n", + "crw-rw-rw- 1 root root 1, 7 Feb 6 19:12 \u001b[40;33;01mfull\u001b[0m\n", + "crw-rw-rw- 1 root root 10, 229 Feb 6 19:12 \u001b[40;33;01mfuse\u001b[0m\n", + "drwxrwxrwt 2 root root 40 Feb 6 19:12 \u001b[30;42mmqueue\u001b[0m\n", + "crw-rw-rw- 1 root root 1, 3 Feb 6 19:12 \u001b[40;33;01mnull\u001b[0m\n", + "lrwxrwxrwx 1 root root 8 Feb 6 19:12 \u001b[01;36mptmx\u001b[0m -> \u001b[40;33;01mpts/ptmx\u001b[0m\n", + "drwxr-xr-x 2 root root 0 Feb 6 19:12 \u001b[01;34mpts\u001b[0m\n", + "crw-rw-rw- 1 root root 1, 8 Feb 6 19:12 \u001b[40;33;01mrandom\u001b[0m\n", + "drwxrwxrwt 2 root root 40 Feb 6 19:15 \u001b[30;42mshm\u001b[0m\n", + "lrwxrwxrwx 1 root root 15 Feb 6 19:12 \u001b[01;36mstderr\u001b[0m -> \u001b[40;33;01m/proc/self/fd/2\u001b[0m\n", + "lrwxrwxrwx 1 root root 15 Feb 6 19:12 \u001b[01;36mstdin\u001b[0m -> \u001b[40;33;01m/proc/self/fd/0\u001b[0m\n", + "lrwxrwxrwx 1 root root 15 Feb 6 19:12 \u001b[01;36mstdout\u001b[0m -> \u001b[40;33;01m/proc/self/fd/1\u001b[0m\n", + "crw-rw-rw- 1 root root 5, 0 Feb 6 19:12 \u001b[40;33;01mtty\u001b[0m\n", + "crw-rw-rw- 1 root root 1, 9 Feb 6 19:12 \u001b[40;33;01murandom\u001b[0m\n", + "crw-rw-rw- 1 root root 1, 5 Feb 6 19:12 \u001b[40;33;01mzero\u001b[0m\n", + "\u001b]0;root@b280feab87e4: /dev\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/dev\u001b[00m# quit\n", + "bash: quit: command not found\n", + "\u001b]0;root@b280feab87e4: /dev\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/dev\u001b[00m# exit\n", + "exit\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "H4QTSFJg-Qp1", + "colab_type": 
"text" + }, + "source": [ + "Answer: Looks like I dont have any SSDs " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "7P9EG0KOqvz2", + "colab_type": "text" + }, + "source": [ + "## Text File Manipulation\n", + "\n", + "As explained in lecture, Unix stores most information in text files. For example, the list of all users and their home directories are stored in \"/etc/passwd\". Let get some familiarity with the most commonly used commands to manipulate files.\n", + "\n", + " - You can see the contents contents a file using the \"cat\" (concatenate) command. Try executing \"cat /etc/passwd\". You'll get a huge list that will go by your screen quickly. \n", + " \n", + " - To go through the file page by page, you can use the \"less\" or \"more\" commands. \n", + " \n", + " - You can see the first or last N (N=10 by default) lines of a file using \"head\" or \"tail\" commands. For example \"tail -20 /etc/passwd\" will list the last 20 lines. \n", + " \n", + " - You can search a test file using the \"grep\" command, which takes a string keyword as the first argument and a filename as the second, and by default prints out every line in the file that contrains the string. So for example you can do \"grep \\$USER /etc/passwd\" to find the line corresponding to your account. Some useful flags: \n", + " \n", + " - \"-i\" ignores the case of the keyword\n", + " - \"-v\" display those lines that do NOT match \n", + " - \"-n\" precede each matching line with the line number \n", + " - \"-c\" print only the total count of matched lines \n", + " \n", + " For example \"grep -c \\$USER /etc/passwd\" should show that you are in the password file just once. \n", + " \n", + " - The \"wc\" (word count) command counts the number of lines, words, and characters in a file. By default \"wc\" gives you all three numbers, but \"-w\", \"-l\", or \"-c\" flags \n", + "\n", + "*Exercise 4:* Count how many lines in the password file contain the letter \"w\". 
" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "UlsANMuf2qMs", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 595 + }, + "outputId": "b312de92-5a8a-480a-b9b5-b147479bd97b" + }, + "source": [ + "!/bin/bash --noediting\n" + ], + "execution_count": 5, + "outputs": [ + { + "output_type": "stream", + "text": [ + "bash: cannot set terminal process group (124): Inappropriate ioctl for device\n", + "bash: no job control in this shell\n", + "\u001b]0;root@b280feab87e4: /content\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/content\u001b[00m# cat /etc/passwd\n", + "root:x:0:0:root:/root:/bin/bash\n", + "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n", + "bin:x:2:2:bin:/bin:/usr/sbin/nologin\n", + "sys:x:3:3:sys:/dev:/usr/sbin/nologin\n", + "sync:x:4:65534:sync:/bin:/bin/sync\n", + "games:x:5:60:games:/usr/games:/usr/sbin/nologin\n", + "man:x:6:12:man:/var/cache/man:/usr/sbin/nologin\n", + "lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin\n", + "mail:x:8:8:mail:/var/mail:/usr/sbin/nologin\n", + "news:x:9:9:news:/var/spool/news:/usr/sbin/nologin\n", + "uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin\n", + "proxy:x:13:13:proxy:/bin:/usr/sbin/nologin\n", + "www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin\n", + "backup:x:34:34:backup:/var/backups:/usr/sbin/nologin\n", + "list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin\n", + "irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin\n", + "gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin\n", + "nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin\n", + "_apt:x:100:65534::/nonexistent:/usr/sbin/nologin\n", + "systemd-network:x:101:104:systemd Network Management,,,:/run/systemd/netif:/usr/sbin/nologin\n", + "systemd-resolve:x:102:105:systemd Resolver,,,:/run/systemd/resolve:/usr/sbin/nologin\n", + "messagebus:x:103:107::/nonexistent:/usr/sbin/nologin\n", + "nvidia-persistenced:x:104:108:NVIDIA Persistence 
Daemon,,,:/nonexistent:/sbin/nologin\n", + "\u001b]0;root@b280feab87e4: /content\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/content\u001b[00m# wc /etc/passwd\n", + " 23 33 1243 /etc/passwd\n", + "\u001b]0;root@b280feab87e4: /content\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/content\u001b[00m# grep wc -i W /etc/passwd\n", + "grep: W: No such file or directory\n", + "\u001b]0;root@b280feab87e4: /content\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/content\u001b[00m# grep -c W -i /etc/passwd\n", + "3\n", + "\u001b]0;root@b280feab87e4: /content\u0007\u001b[01;32mroot@b280feab87e4\u001b[00m:\u001b[01;34m/content\u001b[00m# exit\n", + "exit\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SZuhLbD8qvz5", + "colab_type": "text" + }, + "source": [ + "## Redirection\n", + "\n", + "Unix provides programs with \"pipes\" for input and output. Most of what you see on the screen when you run a program was written to the \"stdout\" (standard output) pipe. Other pipes are \"stdin\" (standard input) and \"stderr\" (standard error), where error messages are written.\n", + "\n", + "As discussed in lecture, the basic commands of Unix are simple, but you can chain them to do complicated things. Redirection is how you chain these commands, directing the output of one command to the input of the next.\n", + "\n", + "As an example, consider the \"cat\" command. Cat takes stdin and outputs it to stdout. Type \"cat\", press enter, and confirm this by typing a few lines. You can get back to the command prompt by pressing \"control-c\" (sends an interrupt signal) or \"control-d\" (end of file character). Note that from now on we will use the convention: \"control-d\" = \"^D\".\n", + "\n", + "*Exercise 5a:* Using \"cat\" and redirection you can write things into a file. The \">\" symbol directs stdout into a file. 
Try \"cat > favorite-colors-list.txt\" and then type in your 3 favorite colors, each on its own line. 
Use \"^D\" to end your input." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "H5vxtcXnqvz6", + "colab_type": "text" + }, + "source": [ + "Use \"cat\", \"more\", or \"less\" to confirm that your file is as you expect it. \">>\" allows you to append to the file. \n", + "\n", + "*Exercise 5b:* Append 2 more colors to your file." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "twRKNaGy3XGw", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 95 + }, + "outputId": "cfc613bc-1641-4d3b-e3c0-57b4f4e1ef26" + }, + "source": [ + "!/bin/bash --noediting" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "stream", + "text": [ + "bash: cannot set terminal process group (124): Inappropriate ioctl for device\n", + "bash: no job control in this shell\n", + "\u001b]0;root@f065f1fd4329: /content\u0007\u001b[01;32mroot@f065f1fd4329\u001b[00m:\u001b[01;34m/content\u001b[00m# " + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DZODNKiAqvz8", + "colab_type": "text" + }, + "source": [ + "The \"sort\" command sorts what it sees on stdin. Instead of taking input from the terminal, you can direct the shell to take stdin from a file using \"<\". Try \"sort < favorite-colors-list.txt\" and \"sort < favorite-colors-list.txt > sorted-favorite-colors-list.txt\".\n", + "\n", + "Finally, instead of redirecting input / output to and from files, you can directly chain one program into another using \"|\". So for example, you can do \"cat /etc/passwd | grep -i \\$USER | wc -l\". \n", + "\n", + "*Exercise 5c:* Use redirection to count the number of users on TACC with your first name. Copy the command you used into the box below." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "oP9XlZl_3iZD", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!/bin/bash --noediting" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "v5IaZXNyqvz_", + "colab_type": "text" + }, + "source": [ + "## Git\n", + "\n", + "`git` is a Version Control System (VCS), typically used to organize the source code of software projects but also well suited to documents or web pages. An instance of a `git` server stores repositories, each typically containing the code relevant to a specific project. Users create local `clones` of repositories, change and develop the local copies of the code, `commit` the changes to their local repository, `push` to the server as a contribution, \n", + "`pull` updates from the server, and `merge` changes between local and remote versions. \n", + "\n", + "Besides cloning, repositories can be branched or forked. A repository generally starts with a `master` branch that evolves as contributed changes are merged in. Creating a new branch from an existing branch creates a snapshot of the code which can evolve independently or be merged in later. Branches are easy to make and delete, and can serve various purposes. They can represent a stable version of a software package, or a parallel development for a different operating system. A fork of a repository is a standalone instance of the repository which can be stored and managed independently from the original, where you can work independently without constraints or interference. \n", + "\n", + "[GitHub](https://github.com) provides a massive publicly accessible instance of a `git` system. Besides sharing code, it allows projects to be developed by the open source community. It provides tools for managing your repository and a wiki for documentation. 
Contributions to public software on GitHub generally require making a merge request (called a \"pull request\" on GitHub), which is judged by the managers of the repository. 
That's why most software packages encourage you to create a new fork, so you can work independently.\n", + "\n", + "Let's take a look at some repositories:\n", + "\n", + "* [This class](https://github.com/afarbin/DATA1401-Spring-2020)\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "J_R64sQDqv0A", + "colab_type": "text" + }, + "source": [ + "## Plan\n", + "\n", + "You made a clone of the class repository at the start of this lab. We will create a new fork where you can keep track of and submit your work, following [these instructions](https://help.github.com/articles/fork-a-repo/).\n", + "\n", + "Go to github.com and log in.\n", + "\n", + "Next, let's create a fork of the [class repository](https://github.com/afarbin/DATA1401-Spring-2020). Click the link and press the \"Fork\" button on the top right. Select your account as the place for the fork.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "edTvE6rOqv0C", + "colab_type": "text" + }, + "source": [ + "Now we will check out your fork in your Google Drive / Colab.\n", + "\n", + "Note: Jupyter allows you to run shell commands directly in a notebook. We will use `!` and `%` to call shell commands directly in this notebook. Follow along yourself. Either create a new notebook or open a terminal. \n", + "\n", + "Start by listing the contents of your current directory." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "e5tXg0f8qv0D", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 316 + }, + "outputId": "0565d766-8273-4902-d036-5cb277510178" + }, + "source": [ + "!/bin/bash --noediting\n", + "%cd /content/drive/My\\ Drive\n", + "!ls" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "stream", + "text": [ + "bash: cannot set terminal process group (125): Inappropriate ioctl for device\n", + "bash: no job control in this shell\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# %cd /content/drive/My\\ Drive\n", + "bash: fg: no job control\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# 1ls\n", + "bash: 1ls: command not found\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# !ls\n", + "bash: !ls: event not found\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# cd /content/drive/My\\ Drive\n", + "bash: cd: /content/drive/My Drive: No such file or directory\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# cd /content\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# cd /drive\n", + "bash: cd: /drive: No such file or directory\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# cd drive\n", + "bash: cd: drive: No such file or directory\n", + "\u001b]0;root@d32f44febcc8: /content\u0007\u001b[01;32mroot@d32f44febcc8\u001b[00m:\u001b[01;34m/content\u001b[00m# " + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WYsyYcg1qv0J", + 
"colab_type": "text" + }, + "source": [ + "Make a new directory:" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "Z7noY1hMqv0L", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!mkdir Data-1401-Repo\n", + "%cd Data-1401-Repo" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fwsBdTnYqv0Q", + "colab_type": "text" + }, + "source": [ + "From the github page for your fork, press the green \"Clone or download\" button and copy the URL.\n", + "\n", + "Goto to your notebook and use the following command to clone the repository, pasting the URL you just copied:\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "8w42MH6Jqv0S", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# What you past here should look like:\n", + "#!git clone https://github.com//DATA1401-Spring-2020.git" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "cOAuqTVUqv0V", + "colab_type": "text" + }, + "source": [ + "Go into the directory:" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "b1Ew4tEZqv0X", + "colab_type": "code", + "colab": {} + }, + "source": [ + "%cd DATA1401-Spring-2020\n", + "!ls" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "IrhWToc-qv0a", + "colab_type": "text" + }, + "source": [ + "We will now connect your fork to the original so you can pull changes from there. 
\n", + "\n", + "Check remote status:" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "JxtMYR-9qv0c", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!git remote -v" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9ud3X0fBqv0f", + "colab_type": "text" + }, + "source": [ + "Now use the original class URL to set your upstream:" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "pgJlKxBqqv0h", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!git remote add upstream https://github.com/afarbin/DATA1401-Spring-2020.git" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "id2yUEt9qv0k", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!git remote -v" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "sAkgeJ6Iqv0n", + "colab_type": "text" + }, + "source": [ + "From now on, you can get the newest version of class material from the upstream repository (assuming its default branch is `master`) by using:" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "AGDsfTFLqv0o", + "colab_type": "code", + "colab": {} + }, + "source": [ + "!git pull upstream master" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "u9RAhs5b4vXY", + "colab_type": "text" + }, + "source": [ + "We will submit your Lab 1 using git at the next Lab." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "PPfGmFQI40HR", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + } + ] +} \ No newline at end of file diff --git a/Labs/Lab-2/Lab-2-solutions.ipynb b/Labs/Lab-2/Lab-2-solutions.ipynb new file mode 100644 index 0000000..eeb8168 --- /dev/null +++ b/Labs/Lab-2/Lab-2-solutions.ipynb @@ -0,0 +1,1017 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "uk7yc0nadBGa" + }, + "source": [ + "# Lab 2\n", + "\n", + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github//afarbin/DATA1401-Spring-2020/blob/master/Labs/Lab-2/Lab-2.ipynb)\n", + "\n", + "## Submitting lab solutions\n", + "\n", + "At the end of the previous lab, you should have set up a \"Solutions\" directory in your Google Drive, with a fork of the class git repository that pulls from Dr. Farbin's version and pushes to your own fork. \n", + "\n", + "Unfortunately, due to a typo in the previous lab, you probably forked the 2019 version of the GitHub repository for this course. Unless you noticed and corrected the error, you'll have to fork again.\n", + "\n", + "In addition, due to some problems with the setup in Google Colab, we will be submitting our solutions to your fork using the web interface. Instructions on how to use the command line are in this notebook, but we suggest you do not follow them unless you are working in a Jupyter notebook and not Google Colab." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "OMNaOnRksNK3" + }, + "source": [ + "You may also choose to delete the fork from your GitHub account. 
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "J_R64sQDqv0A" + }, + "source": [ + "## Repeating last steps of Lab 1\n", + "\n", + "### Create your own fork\n", + "We will create a new fork where you can keep track and submit your work, following [these instructions](https://help.github.com/articles/fork-a-repo/).\n", + "\n", + "Goto to github.com and log in.\n", + "\n", + "Next, create a fork of the [2020 class repository](https://github.com/afarbin/DATA1401-Spring-2020). Click the link and press the \"Fork\" button on the top right. Select your repository as where you want to place the fork.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "edTvE6rOqv0C" + }, + "source": [ + "### Make a local clone (Advanced)\n", + "\n", + "Before we get started, please mount your Google Drive using by clicking the file icon on the left, then clicking \"Mount Drive\", and following the instructions as you did in the previous lab.\n", + "\n", + "If you did complete Lab 1 and therefore created a 2019 fork and a local clone in you Google Drive, delete the local clone:\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "2u6B-rfNr1wN" + }, + "outputs": [], + "source": [ + "!rm -rf drive/My\\ Drive/Data-1401-Repo" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "BDVI5nu8-2RH" + }, + "source": [ + "Now we will check out your fork in your Google Drive / Colab. If you will be doing everything on your own computer instead of Google Colab/Drive, you are welcome to install Git on your computer and perform the following steps (appropriately modified) on your computer instead.\n", + "\n", + "Start by listing the contents of your current directory." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "e5tXg0f8qv0D" + }, + "outputs": [], + "source": [ + "%cd /content/drive/My\\ Drive\n", + "!ls" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "WYsyYcg1qv0J" + }, + "source": [ + "Make a new directory:" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "Z7noY1hMqv0L" + }, + "outputs": [], + "source": [ + "!mkdir Data-1401-Repo\n", + "%cd Data-1401-Repo" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "fwsBdTnYqv0Q" + }, + "source": [ + "From the GitHub page for your fork, press the green \"Clone or download\" button and copy the URL.\n", + "\n", + "Go to your notebook and use the following command to clone the repository, pasting the URL you just copied:\n" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "8w42MH6Jqv0S" + }, + "outputs": [], + "source": [ + "# What you paste here should look like:\n", + "#!git clone https://github.com/\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# Test your solution here\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mplayer_move\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mno_winner\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 3\u001b[0m \u001b[0mdraw_game_board\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mno_winner\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m\u001b[0m in \u001b[0;36mplayer_move\u001b[0;34m(board, player)\u001b[0m\n\u001b[1;32m 9\u001b[0m 
\u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 10\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Bad move, try again.\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 11\u001b[0;31m \u001b[0;32mif\u001b[0m \u001b[0mmove\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mboard\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mplayer\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlocation\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 12\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m\u001b[0m in \u001b[0;36mmove\u001b[0;34m(matrix, players, coordinates)\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcoordinates\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0;32mif\u001b[0m \u001b[0mmatrix\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m==\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;31m##if there is a 0, means empty spcae\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 7\u001b[0m \u001b[0mmatrix\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mplayers\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 8\u001b[0m \u001b[0;32mreturn\u001b[0m 
\u001b[0;32mTrue\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mTypeError\u001b[0m: list indices must be integers or slices, not str" + ] + } + ], + "source": [ + "# Test your solution here\n", + "player_move(no_winner, 1)\n", + "draw_game_board(no_winner)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Exercise 9:* Use all of the previous exercises to implement a full tic-tac-toe game, where an appropriate board is drawn, 2 players are repeatedly asked for the coordinates of the location where they wish to place a mark, and the game status is checked until a player wins or a draw occurs." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Write your solution here" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Test your solution here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Exercise 10:* Test that your game works for 5x5 Tic Tac Toe. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Test your solution here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Exercise 11: (Extra Credit)* Develop a version of the game where one player is the computer. Note that you don't need to do an extensive search for the best move. You can have the computer simply protect against losing and otherwise try to win with straight or diagonal patterns." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Write your solution here" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Test your solution here" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/Labs/Lab-4_solutions.ipynb b/Labs/Lab-4_solutions.ipynb new file mode 100644 index 0000000..19a759e --- /dev/null +++ b/Labs/Lab-4_solutions.ipynb @@ -0,0 +1,642 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Lab 4\n", + "\n", + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github//afarbin/DATA1401-Spring-2020/blob/master/Labs/Lab-4/Lab-4.ipynb)\n", + "\n", + "In this lab we will become familiar with distributions, histograms, and functional programming. \n", + "\n", + "\n", + "### Uniform Distribution\n", + "Let's start by generating some fake random data. You can get a random number between 0 and 1 using the Python `random` module as follows:" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The Value of x is 0.6785059203026328\n" + ] + } + ], + "source": [ + "import random ##random number generator\n", + "x=random.random()\n", + "print(\"The Value of x is\", x)\n", + "\n", + "#x = random.randint(1,10) use this to get integers between 1 and 10 (inclusive)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Every time you call `random.random()`, you will get a new number.\n", + "\n", + "*Exercise 1:* Using random, write a function `generate_uniform(N, mymin, mymax)` that returns a Python list containing N random numbers between the specified minimum and maximum values. 
Note that you may want to quickly work out on paper how to map numbers between 0 and 1 onto another range. " + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.802661814643326, 0.4138621192599856, 0.23675942482369805, 0.7409487120320195, 0.9263021849889973, 0.3446698507966053]\n" + ] + } + ], + "source": [ + "# Skeleton\n", + "def generate_uniform(N,x_min,x_max):\n", + "    out = []\n", + "    ### BEGIN SOLUTION\n", + "\n", + "    out_list=list()\n", + "\n", + "    #random_numbers=x.random.random()\n", + "    #for _ in range(N):\n", + "    #    random_numbers.append(my_uniform())\n", + "\n", + "    for _ in range(N):\n", + "        ##random.random() is between 0 and 1, so scale it by the width of the range and shift it by x_min\n", + "        out_list.append(x_min+(x_max-x_min)*random.random())\n", + "    return out_list\n", + "\n", + "    ### END SOLUTION\n", + "\n", + "my_uniform=generate_uniform(6,-10,10)\n", + "print(my_uniform)" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Data Type: <class 'list'>\n", + "Data Length: 1000\n", + "Type of Data Contents: <class 'float'>\n", + "Data Minimum: 0.00010718203590565079\n", + "Data Maximum: 0.9962669512121353\n" + ] + } + ], + "source": [ + "# Test your solution here\n", + "data=generate_uniform(1000,-10,10)\n", + "print (\"Data Type:\", type(data))\n", + "print (\"Data Length:\", len(data))\n", + "if len(data)>0: \n", + "    print (\"Type of Data Contents:\", type(data[0]))\n", + "    print (\"Data Minimum:\", min(data))\n", + "    print (\"Data Maximum:\", max(data))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Exercise 2a:* \n", + "Write a function that computes the mean of values in a list." 
##im actually confused as to why m=0. but thats a story for another time ig., ASK!!\n", + " \n", + " ### BEGIN SOLUTION\n", + "\n", + " length_of_data=len(data)\n", + " sum_of_data=sum(data)\n", + " m=sum_of_data/length_of_data \n", + " \n", + " ### END SOLUTION\n", + " \n", + " return m" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Mean of Data: 0.5087112882398924\n" + ] + } + ], + "source": [ + "# Test your solution here\n", + "print (\"Mean of Data:\", mean(data))" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0.5775340177574386" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "## double checked with numpy, ignore\n", + "import numpy as np\n", + "\n", + "data=np.array([0.802661814643326, 0.4138621192599856, 0.23675942482369805, 0.7409487120320195, 0.9263021849889973, 0.3446698507966053])\n", + "data.mean()\n", + "\n", + "##ayeeee it worked" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Exercise 2b:* \n", + "Write a function that computes the variance of values in a list." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [], + "source": [ + "# Skeleton\n", + "def variance(data):\n", + " m=0.\n", + " \n", + " ### BEGIN SOLUTION\n", + " ##mean_of_data=m ##defined in the cell above ###CANT DO THIS, HAS TO BE REASSINGED WHEN MAKING A NEW FUNCTION\n", + " m=mean(data)\n", + " variance=sum((x-m)**2 for x in data)/len(data) ##look at how the formula of variance is structured, just like what you did for mean \n", + " \n", + " \n", + " ### END SOLUTION\n", + " \n", + "\n", + " return variance" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Variance of Data: 0.0660279425851481\n" + ] + } + ], + "source": [ + "# Test your solution here\n", + "print (\"Variance of Data:\", variance(data))" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0.0660279425851481" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "## double checked with numpy, ignore\n", + "\n", + "data=np.array([0.802661814643326, 0.4138621192599856, 0.23675942482369805, 0.7409487120320195, 0.9263021849889973, 0.3446698507966053])\n", + "std1=data.std()\n", + "\n", + "std1**2 ##Standard deviation is the square root of the variance\n", + " \n", + "##first code is correct\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Histogramming" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Exercise 3:* Write a function that bins the data so that you can create a histogram. 
An example of how to implement histogramming is the following logic:\n", + "\n", + "* User inputs a list of values `x` and optionally `n_bins`, which defaults to 10.\n", + "* If `x_min` and `x_max` are not supplied, use the minimum and maximum of the values in `x`.\n", + "* Determine the bin size (`bin_size`) by dividing the range of the data (`x_max-x_min`) by the number of bins.\n", + "* Create a list of zeros of size `n_bins`, call it `hist`.\n", + "* Loop over the values in `x`\n", + "  * Loop over the values in `hist` with index `i`:\n", + "    * If x is between `x_min+i*bin_size` and `x_min+(i+1)*bin_size`, increment `hist[i]`.\n", + "    * For efficiency, try to use `continue` or `break` to move on to the next bin or data point.\n", + "* Return `hist` and the corresponding list of bin edges (i.e. of `x_min+i*bin_size`). " + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "metadata": {}, + "outputs": [], + "source": [ + "# Solution\n", + "def histogram(x,n_bins=10,x_min=None,x_max=None):\n", + "    ### BEGIN SOLUTION\n", + "\n", + "    ##x_min and x_max default to None; when the caller does not supply them, use the range of the data\n", + "    if x_min is None:\n", + "        x_min=min(x)\n", + "    if x_max is None:\n", + "        x_max=max(x)\n", + "\n", + "    hist_range=x_max-x_min\n", + "    bin_size=hist_range/n_bins\n", + "    hist=[0]*n_bins\n", + "    bin_edges=[x_min+i*bin_size for i in range(n_bins)] ##one lower edge per bin\n", + "\n", + "    for j in x:\n", + "        for i in range(n_bins):\n", + "            low=x_min+i*bin_size\n", + "            high=x_min+(i+1)*bin_size\n", + "            ##bins are half-open [low, high); the last bin also includes x_max\n", + "            if j >= low and (j < high or (i == n_bins-1 and j <= x_max)):\n", + "                hist[i]+=1\n", + "                break ##each value belongs to exactly one bin\n", + "\n", + "    #print (hist) dont need this actually, its in the cell below. 
doing this will give you double the outputs i think\n", + "    ### END SOLUTION\n", + "\n", + "    return hist,bin_edges" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]\n" + ] + } + ], + "source": [ + "# Test your solution here\n", + "\n", + "h,b=histogram(data,100)\n", + "print(h)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Exercise 4:* Write a function that uses the histogram function from the previous exercise to create a text-based \"graph\". For example, the output could look like the following:\n", + "```\n", + "[ 0, 1] : ######\n", + "[ 1, 2] : #####\n", + "[ 2, 3] : ######\n", + "[ 3, 4] : ####\n", + "[ 4, 5] : ####\n", + "[ 5, 6] : ######\n", + "[ 6, 7] : #####\n", + "[ 7, 8] : ######\n", + "[ 8, 9] : ####\n", + "[ 9, 10] : #####\n", + "```\n", + "\n", + "where each line corresponds to a bin and the number of `#` characters is proportional to the count in that bin." + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "metadata": {}, + "outputs": [], + "source": [ + "# Solution\n", + "def draw_histogram(x,n_bins,x_min=None,x_max=None,character=\"#\",max_character_per_line=20):\n", + "    ### BEGIN SOLUTION\n", + "\n", + "    if x_min is None:\n", + "        x_min=min(x)\n", + "    if x_max is None:\n", + "        x_max=max(x)\n", + "\n", + "    bin_size=(x_max-x_min)/n_bins\n", + "\n", + "    ##reuse histogram() from Exercise 3 instead of repeating the binning loop\n", + "    hist,bin_edges=histogram(x,n_bins,x_min,x_max)\n", + "\n", + "    ##scale the counts so the fullest bin gets max_character_per_line characters\n", + "    max_count=max(hist)\n", + "    for i in range(n_bins):\n", + "        n_char=int(max_character_per_line*hist[i]/max_count) if max_count>0 else 0\n", + "        print(\"[\"+str(x_min+i*bin_size)+\" , \"+str(x_min+(i+1)*bin_size)+\"] : \"+character*n_char)\n", + "\n", + "    ### END SOLUTION\n", + "\n", + "    return hist,bin_edges" + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": {}, + "outputs": [], + "source": [ + "# Test your solution here\n", + "\n", + "h,b=draw_histogram(data,20)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Functional Programming\n", + "\n", + "*Exercise 5:* Write a function that applies a boolean function (one that returns true/false) to every element in data and returns a list of the indices of the elements where the result was true. Use this function to find the indices of entries greater than 0.5." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def where(mylist,myfunc):\n", + "    out= []\n", + "\n", + "    ### BEGIN SOLUTION\n", + "\n", + "    # Fill in your solution here \n", + "\n", + "    ### END SOLUTION\n", + "\n", + "    return out" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Test your solution here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Exercise 6:* The `in_range(mymin,mymax)` function below returns a function that tests if its input is between the specified values. Write corresponding functions that test:\n", + "* Even\n", + "* Odd\n", + "* Greater than\n", + "* Less than\n", + "* Equal\n", + "* Divisible by" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def in_range(mymin,mymax):\n", + "    def testrange(x):\n", + "        return x<mymax and x>=mymin\n", + "    return testrange\n", + "\n", + "# Examples:\n", + "F1=in_range(0,10)\n", + "F2=in_range(10,20)\n", + "\n", + "# Test of in_range\n", + "print (F1(0), F1(1), F1(10), F1(15), F1(20))\n", + "print (F2(0), F2(1), F2(10), F2(15), F2(20))\n", + "\n", + "print (\"Number of Entries passing F1:\", len(where(data,F1)))\n", + "print (\"Number of Entries passing F2:\", len(where(data,F2)))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "### BEGIN SOLUTION\n", + "\n", + "    # Fill in your solution here \n", + "\n", + "### END SOLUTION" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Test your solution" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Exercise 7:* Repeat the previous exercise using `lambda` and the built-in Python functions `sum` and `map` instead of your solution above." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "### BEGIN SOLUTION\n", + "\n", + "    # Fill in your solution here \n", + "\n", + "### END SOLUTION" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Monte Carlo\n", + "\n", + "*Exercise 7:* Write a \"generator\" function called `generate_function(func,x_min,x_max,N)` that, instead of generating a flat distribution, generates a distribution with the functional form coded in `func`. Note that `func` will always be > 0. \n", + "\n", + "Use the test function below and your histogramming functions above to demonstrate that your generator is working properly.\n", + "\n", + "Hint: A simple, but slow, solution is to draw a random number test_x within the specified range and another number p between the min and max of the function (which you will have to determine). If p<=function(test_x), then append test_x to the output. If not, repeat the process, drawing two new numbers. Repeat until you have the specified number of generated numbers, N. For this problem, it's OK to determine the min and max by numerically sampling the function. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def generate_function(func,x_min,x_max,N=1000):\n", + "    out = list()\n", + "    ### BEGIN SOLUTION\n", + "\n", + "    # Fill in your solution here \n", + "\n", + "    ### END SOLUTION\n", + "\n", + "    return out" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# A test function\n", + "def test_func(x,a=1,b=1):\n", + "    return abs(a*x+b)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Exercise 8:* Use your function to generate 1000 numbers that are normally distributed, using the `gaussian` function below. 
Confirm that the mean and variance of the data are close to the mean and variance you specify when building the Gaussian. Histogram the data. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import math\n", + "\n", + "def gaussian(mean, sigma):\n", + "    def f(x):\n", + "        ##normalized Gaussian: exp(-(x-mean)^2/(2*sigma^2)) / (sigma*sqrt(2*pi))\n", + "        return math.exp(-((x-mean)**2)/(2*sigma**2))/(sigma*math.sqrt(2*math.pi))\n", + "    return f\n", + "\n", + "# Example Instantiation\n", + "g1=gaussian(0,1)\n", + "g2=gaussian(10,3)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Exercise 9:* Combine your `generate_function`, `where`, and `in_range` functions above to create an integrate function. Use your integrate function to show that approximately 68% of a normal distribution lies within one standard deviation of the mean." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def integrate(func, x_min, x_max, n_points=1000):\n", + "    integral=0.\n", + "\n", + "    return integral" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}