|
1 | 1 |
|
2 | | -# Good coding practices {#sec-coding} |
3 | | - |
4 | | -## Use an IDE to write and run code {#sec-ide} |
| 2 | +# Coding {#sec-coding} |
5 | 3 |
|
6 | | -Unless you're a psychopath, you were probably not thinking of writing your code in plain TXT files. Most of us use an **Integrated Development Environment (IDE)** to code. IDEs provide some conveniences to the developer, like: |
7 | 4 |
|
8 | | -- Keyboard **shortcuts**. |
9 | | -- **Linting**: hints about potential problems in your code. |
10 | | -- **Formatters**: tools that reformat your code to fit the standard style of a particular language, making it prettier and more readable. |
11 | | - |
12 | | -There are countless IDEs. Some of them are oriented toward a particular programming language (like RStudio for R or PyCharm for Python), while others are more flexible and allow you to customize them to work with potentially any language (e.g., VS Code). Pick the IDE that best suits your needs. |
13 | | - |
14 | | -Make sure your IDE knows which folder you are working in. Often this just means opening the folder from inside the IDE. Most IDEs will make your life easier if you do! |
15 | | - |
16 | | -::: callout-tip |
17 | | - |
18 | | -## Positron |
19 | | - |
20 | | -If you use R, Python, or a combination of both, Positron might be a good idea. It's optimized for both, and since it is built on the same open-source codebase as VS Code, it offers almost as much flexibility. |
21 | | - |
22 | | -::: |
23 | | - |
24 | | -::: callout-note |
25 | | - |
26 | | -## Recommended readings |
27 | | - |
28 | | -- Heiss, Andrew. 2024. “Guide to Generating and Rendering Computational Markdown Content Programmatically with Quarto.” November 4, 2024. https://doi.org/10.59350/pa44j-cc302. |
29 | | -- Heiss, Andrew. 2025. “How to Open a Folder as a Positron Project with macOS Quick Actions.” July 22, 2025. https://doi.org/10.59350/pt66c-57w73. |
30 | | -- Heiss, Andrew. 2025. “How to Use Positron’s Connections Pane with DuckDB.” July 10, 2025. https://doi.org/10.59350/w37d8-vj489. |
31 | | -- Heiss, Andrew. 2025. “Use Positron to Run R Inside a Docker Image Through SSH.” July 5, 2025. https://doi.org/10.59350/fredm-56671. |
32 | | -- Heiss, Andrew. 2025. “Open Files in External Programs with Positron or Visual Studio Code.” July 3, 2025. https://doi.org/10.59350/87hpe-4ah24. |
33 | | -- Navarro, Danielle. 2023. “Beware the IDEs of Windows (Subsystem for Linux).” July 2, 2023. https://blog.djnavarro.net/posts/2023-07-02_the-ides-of-wsl/. |
34 | | -- Silge, Julia. 2025. “Release an R package with Positron” https://juliasilge.com/blog/r-pkg-release/. |
35 | | -- Silge, Julia. 2025. “Positron in action with #TidyTuesday orca encounters Positron” https://juliasilge.com/blog/orcas-positron/. |
36 | | -- Velásquez, Isabella and Velásquez, Gustavo, E. 2025. “Porting my favorite RStudio color theme to Positron”. https://ivelasq.rbind.io/blog/positron-theme/. |
37 | | - |
38 | | -::: |
39 | | - |
40 | | - |
41 | | -## Programming environments {#sec-venv} |
42 | | - |
43 | | -When you install a programming language and simply start writing and running code, you are probably working in a **global environment**. This means every package you install is now available to every script you run anywhere on your machine. Scripts from different projects will use the same packages. |
44 | | - |
45 | | -This is fairly inconvenient. If you update a package for a particular project, and the update includes breaking changes (i.e., code written for the older version no longer works with the new one), the other projects that rely on the older version will stop working. |
46 | | - |
47 | | -A solution is to create a separate environment for each project (this is called a **virtual environment**), in which you can install whichever version of any package you need without touching the packages installed for other projects. Most (serious) programming languages provide convenient tools to create a virtual environment for a project. |
48 | | - |
49 | | -### In Python (`venv`) |
50 | | - |
51 | | -If you are using Python, `venv` used to be the way to go. In your terminal or command prompt (not your Python console), you would navigate to the project folder and then run the following lines: |
52 | | - |
53 | | -```bash |
54 | | -python -m venv .venv # create the virtual environment |
55 | | -.venv/Scripts/activate # activate it (Windows); on macOS/Linux: source .venv/bin/activate |
56 | | -``` |
57 | | - |
58 | | -This creates a new folder in your project named `.venv`. This folder now includes a **clean installation** of whatever Python version you are using. Activating the virtual environment is a critical step to make sure we are working in the project-specific environment, not the global environment. Some IDEs will detect and automatically activate virtual environments present in your project folder. We now can install our desired packages: |
59 | | - |
60 | | -```bash |
61 | | -python -m pip install numpy pandas |
62 | | -``` |
63 | | - |
64 | | -Now we can register which packages we have installed and which versions of them. This step will ensure that anyone can generate an identical virtual environment to ours, which will make sure we are all using the same packages, facilitating computational reproducibility. |
65 | | - |
66 | | -```bash |
67 | | -python -m pip freeze > requirements.txt |
68 | | -``` |
69 | | - |
70 | | -This generates a file (`requirements.txt`) in the project root folder that lists all the installed packages and their versions. This way, anyone (maybe yourself, on a different computer) who has access to this file can install them in a new virtual environment by running these lines: |
71 | | - |
72 | | -```bash |
73 | | -python -m venv .venv # create virtual environment |
74 | | -.venv/Scripts/activate # activate it (Windows); on macOS/Linux: source .venv/bin/activate |
75 | | -python -m pip install -r requirements.txt # install packages in requirements.txt |
76 | | -``` |
77 | | - |
78 | | -This workflow is also convenient because virtual environments can take up a fair amount of storage. Instead of sending your virtual environment to your collaborators, you simply give them the instructions for installing an identical virtual environment on their machines. |
79 | | - |
80 | | -### In Python (`uv`) |
81 | | - |
82 | | -uv is a more recent tool that provides even more convenient (and fast!) features to work with virtual environments. To use uv, follow the installation instructions from its website. The steps you need to run to achieve the equivalent of what we did using `venv` are: |
83 | | - |
84 | | -```bash |
85 | | -uv init # create virtual environment |
86 | | -uv add numpy pandas # install packages |
87 | | -uv lock # register installed packages in uv.lock |
88 | | -``` |
89 | | - |
90 | | -By initializing the virtual environment, uv creates it along with some other convenient files and directories. We then installed the packages `numpy` and `pandas`. Finally, we saved the list of installed packages and their versions in a new file (named `uv.lock`), which is equivalent to `requirements.txt` in the previous section. To reproduce this environment, one would simply run: |
91 | | - |
92 | | -```bash |
93 | | -uv sync |
94 | | -``` |
95 | | - |
96 | | -And that's it! While `uv` is the state of the art when it comes to handling virtual environments in Python, there are multiple other options, and one should keep an eye on upcoming tools. |
97 | | - |
98 | | -### In R (`renv`) |
99 | | - |
100 | | -It used to be inconvenient to deal with virtual environments in R until the `renv` R package was created. This package follows the same logic as `venv` and `uv`. The main difference is that we run its commands from the R console, instead of the terminal or command prompt: |
101 | | - |
102 | | -```r |
103 | | -library(renv) # load the renv package (should be previously installed) |
104 | | - |
105 | | -init() # initialize renv |
106 | | -install(c("dplyr", "tidyr")) # install dplyr and tidyr R packages |
107 | | -snapshot() # register installed packages in `renv.lock` |
108 | | -``` |
109 | | - |
110 | | -The previous lines will create a new R environment, install the `dplyr` and `tidyr` R packages, and generate a `renv.lock` file in the project root folder, which lists the installed packages and their versions. Finally, to install the packages from the `renv.lock` file: |
111 | | - |
112 | | -```r |
113 | | -library(renv) # load the renv package (should be previously installed) |
114 | | - |
115 | | -restore() # install packages |
116 | | -``` |
117 | | - |
118 | | -## Style |
119 | | - |
120 | | -### Variable names |
121 | | - |
122 | | -**Variable and function names**: concise, meaningful, consistent |
123 | | - |
124 | | -```r |
125 | | -x <- sr * n # bad |
126 | | -time_domain <- sample_rate * row_number # better |
127 | | -``` |
128 | | - |
129 | | -### Design |
130 | | - |
131 | | -::: callout-note |
132 | | - |
133 | | -- Navarro, Danielle. 2023. “Software Design by Example.” May 31, 2023. https://blog.djnavarro.net/posts/2023-05-31_software-design-by-example/. |
134 | | - |
135 | | -::: |
136 | | - |
137 | | -## Paths |
138 | | - |
139 | | -Inside your code, you may need to tell your machine where to find the files you want to work with. **File paths** are strings of characters that indicate where a file is saved in the file system of your machine. Different programming languages may deal with file paths a bit differently, but we can distinguish two types of file paths: **absolute paths** and **relative paths**. |
140 | | - |
141 | | -### Absolute file paths |
142 | | - |
143 | | -These specify the **complete location of a file**, from the root folder of your hard drive to the file itself. For instance, this is an absolute file path: |
144 | | - |
145 | | -``` |
146 | | -C:/Users/Me/Documents/my-study/data/myfile.txt |
147 | | -``` |
148 | | - |
149 | | -While absolute paths do the job, their use is discouraged because it **harms reproducibility**. In the previous example, our project lives in the `my-study` folder, which is located in `C:/Users/Me/Documents/` on my machine. If our collaborator works on their own machine, their user may have a different name, like `Collab`, as opposed to our user `Me`. Our script will then no longer find `myfile.txt` where the absolute path indicates, and will give our collaborator an error. Even if our users were named the same, our collaborator may have simply downloaded the `my-study` folder somewhere outside of their Documents folder, for instance to their Desktop (`C:/Users/Collab/Desktop/my-study`). They will run into the same error. |
150 | | - |
151 | | -One might be tempted to invite our poor collaborator to edit the paths in the script so that it finds the files on their own computer. This is not good practice. Instead, one may use **relative paths**. |
152 | | - |
153 | | -### Relative paths |
154 | | - |
155 | | -Relative paths do not specify the complete location of the file. Instead, they assume that the user has set their **working directory** to the folder where the script or the project lives, or to some intermediate folder whose path is common to both us and our collaborator. This is a relative path, for instance: |
156 | | - |
157 | | -``` |
158 | | -data/myfile.txt |
159 | | -``` |
160 | | - |
161 | | -Assuming that both we and our collaborator have set the current working directory to the `my-study` folder, wherever it may be located on our respective machines, our script will find `myfile.txt` successfully in both cases. When you open your project as a folder in your IDE, most IDEs also set the working directory to that folder, which automatically makes relative paths work. |
162 | | - |
163 | | -In conclusion, use relative paths whenever possible. Some packages exist so that using relative paths is even more convenient. In R, the package [**here**](https://here.r-lib.org/) allows you to simply write the following line (while at the same time taking care of some technical stuff associated with paths): |
164 | | - |
165 | | -```r |
166 | | -library(here) |
167 | | - |
168 | | -path <- here("data", "myfile.txt") |
169 | | -``` |
170 | | - |
171 | | -In Python, the [**pathlib**](https://docs.python.org/3/library/pathlib.html) standard module^[A standard module is simply a Python package that is included with a basic installation of Python, which means you do not have to install it before using it.] provides a similar convenience: |
172 | | - |
173 | | -```python |
174 | | -from pathlib import Path |
175 | | -path = Path("data", "myfile.txt") |
176 | | -``` |
177 | | - |
178 | | - |
179 | | -## Dealing with probabilistic events |
180 | | - |
181 | | -::: callout-caution |
182 | | - |
183 | | -In preparation. |
184 | | - |
185 | | -::: |
186 | | - |
187 | | -::: callout-note |
188 | | - |
189 | | -## Recommended readings |
190 | | - |
191 | | -- Navarro, Danielle. 2023. “Fine-Grained Control of RNG Seeds in R.” December 27, 2023. https://blog.djnavarro.net/posts/2023-12-27_seedcatcher/. |
192 | | - |
193 | | -::: |
194 | | - |
195 | | -## Big datasets |
196 | | - |
197 | | -::: callout-caution |
198 | | - |
199 | | -In preparation. |
200 | | - |
201 | | -::: |
202 | | - |
203 | | -## Automating your work |
204 | | - |
205 | | -::: callout-note |
206 | | - |
207 | | -## Recommended readings |
208 | | - |
209 | | -- Heiss, Andrew. 2020. “Automatically Zip up Subdirectories with Make.” January 10, 2020. https://doi.org/10.59350/t1hra-4a041. |
210 | | -- Navarro, Danielle. 2023. “Makefiles. Or, the Balrog and the Submersible.” June 30, 2023. https://blog.djnavarro.net/posts/2023-06-30_makefiles/. |
211 | | -- Vanhove, Jan. 2017. “Automatise repetitive tasks.” January 31, 2017. https://janhove.github.io/posts/2017-01-31-automatise-repetitive-tasks/. |
212 | | - |
213 | | -::: |