Keyman Code Style Guide
This guide introduces a consistent style for documenting Keyman source code using Doxygen, and code format preferences. For now, existing code may not follow these guidelines, but new code should.
When making pull requests, it may be necessary to edit existing code that doesn't match the styling. If you are making a minor change, then don't rewrite the entire function to match the new style. However, any significant new blocks of code should follow this guide. It may be appropriate to open two PRs: one to regularise the format of the code, and a follow-up that actually makes the code change.
We'll be using this style guide for:
- C/C++
- Swift
- Java
- Python
- Delphi (Object Pascal)
- TypeScript
- PHP
- Bash scripts
- And anywhere else that makes sense.
At this time, we will not be extracting code comments using doxygen to generate documentation, just following this style guide.
We don't want to get into a drawn-out style guide fight. The priority is to format your code so that it is easy to read.
The few items listed are hopefully sufficient. We're not religious on line lengths, nor even on brace style (the shorter form is more common these days, so we've leaned towards that). Tabs do cause trouble across platforms and systems though.
Style guides can end up becoming a stupid PR comment fight and we are not interested in that. If you come across something that is hard to format, aim for readability ahead of consistency -- or find an alternative way to write it if you are fighting the language for readability.
Each source file should contain the following header lines:
/*
 * Keyman is copyright (C) SIL Global. MIT License.
 *
 * Created by <author> on yyyy-mm-dd
 *
 * (Optional description of this file)
 */
Note that this comment is not a doxygen comment; use /*, not /**.
The 'Created by' line is optional.
(Formatting options for some languages, such as C/C++, are defined in .clang-format and are picked up by the VS Code formatter.)
In brief:
- Spaces, not tabs
- 2-space indents
- Opening brace on the same line as the statement:
  if(foo) {
    bar();
  } else {
    baz();
  }
- Wrap long lines (> 130 characters)
- Always use braces, and always put the braced body on a new line:
  NO:
  if(foo) bar();
  if(itWorked) { baz(); return 0; }
  YES:
  if(foo) {
    bar();
  }
  if(itWorked) {
    baz();
    return 0;
  }
- Function parameter lists:
  Short function signatures:
  YES:
  void short_func(int param) {
  Long signatures:
  NO:
  km_core_state_debug_item const *km_core_state_debug_items(km_core_state const *state, size_t *num_items) {
  NO:
  km_core_state_debug_item const *km_core_state_debug_items(km_core_state const *state,
                                                            size_t *num_items) {
  YES:
  km_core_state_debug_item const *
  km_core_state_debug_items(
    km_core_state const *state,
    size_t *num_items
  ) {
  or YES:
  km_core_state_debug_item const *km_core_state_debug_items(
    km_core_state const *state,
    size_t *num_items
  ) {
We follow the Javadoc syntax to mark comment blocks. These have the general form:
/**
 * Brief summary.
 *
 * Detailed description. More detail.
 * @see Some reference
 *
 * @param <name> Parameter description.
 * @return Return value description.
 */
Example:
/**
 * Returns a compressed version of a string.
 *
 * Compresses an input string using the foobar algorithm.
 *
 * @param uncompressed The input string.
 * @return A compressed version of the input string.
 */
std::string compress(const std::string& uncompressed);
This is the allowed set of doxygen tags that can be used (note that we use @ rather than \ for improved readability).
- @param
  Describes function parameters. We recommend using two spaces on either side of the parameter names and lining up descriptions for greater readability.
  We don't normally mark input parameters, but [in,out] and [out] parameters should be marked:
  /**
   * @param[in,out] modifiers The modifier bitmap
   */
- @return
  Describes return values.
- @see
  Describes a cross-reference to classes, functions, methods, variables, files or a URL.
  Example:
  /**
   * Available kinds of implementations.
   *
   * @see process::network::PollSocketImpl
   */
- @file
  Describes a reference to a file. It is required when documenting global functions, variables, typedefs, or enums in separate files.
- @link and @endlink
  Describes a link to a file, class, or member.
- @example
  Describes source code examples.
- @image
  Describes an image.
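As a minimal sketch of how several of these tags might be combined on one function, consider the example below. The names modifier_flags, MOD_SHIFT and clear_shift are hypothetical and chosen only for illustration; they are not part of the Keyman code base.

```cpp
#include <cstdint>

/** Bit flags for modifier keys (hypothetical values, for illustration only). */
enum modifier_flags : std::uint32_t {
  MOD_SHIFT = 0x01,
  MOD_CTRL  = 0x02,
  MOD_ALT   = 0x04
};

/**
 * Removes the Shift bit from a modifier bitmap.
 *
 * Clears MOD_SHIFT in place and reports whether it had been set.
 *
 * @see modifier_flags
 *
 * @param[in,out]  modifiers  The modifier bitmap
 * @return  true if the Shift bit was set before the call.
 */
bool clear_shift(std::uint32_t &modifiers) {
  const bool was_set = (modifiers & MOD_SHIFT) != 0;
  modifiers &= ~static_cast<std::uint32_t>(MOD_SHIFT);
  return was_set;
}
```

The parameter is marked [in,out] because the function both reads and modifies it; a plain input parameter would carry no direction marker.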
We wrap long descriptions by aligning with previous line:
/**
 * @param uncompressed The input string that requires a very long
 *                     description and an even longer description on this
 *                     line as well.
 */
Example:
/**
 * Prefix used to name Keyman keyboards in order to distinguish
 * them from other Javascript objects.
 */
extern const std::string KEYMAN_KEYBOARD_NAME_PREFIX;
Example:
/**
 * Buffer storing the current context, text before the input caret.
 * The buffer is null terminated, and CurContext[0] is furthest from
 * the caret.
 */
WCHAR CurContext[MAXCONTEXT];
Or, if you have a short description, you can use ///< after the field:
WCHAR CurContext[MAXCONTEXT]; ///< Current context, null terminated
Example:
/**
 * Returns a pointer to the last n characters in the current context buffer.
 *
 * Returns a pointer to the character in the current context buffer which will
 * have at most n valid characters remaining until the null terminating
 * character. e.g. it will be one code unit less than bufsize if that would
 * have meant splitting a surrogate pair.
 *
 * @param[in] n The maximum number of valid characters (code points), not
 *              WCHAR size (code units)
 * @return Pointer to the start position for a buffer of maximum n
 *         characters
 */
WCHAR *Buf(int n);
Example:
/**
 * Provides an interface between Keyman Core input processing and the
 * application text store. This is an abstract base class with common
 * core functionality and context cache management.
 */
class AppContext
{
Credit to: http://mesos.apache.org/documentation/latest/doxygen-style-guide/
The "rules" in this section may not apply to other programming languages used in the Keyman code base.
Windows™ is the original platform on which Keyman was written. As a result, there is legacy code whose style, conventions, and language extensions are not consistent with the common coding style we are adopting as we integrate common code across multiple platforms. Windows, Linux, and macOS platform code will inevitably use extensions and other features needed to integrate with the native platform, breaking the guidelines for the common C++ code.
As of 25-Feb-2022 the toolchain supports C11 and C++14.
Use C or C99 types and calling conventions for any API boundary, because C has the broadest foreign function interface (FFI) support across languages. Within a module's implementation, more complex data types are permitted.
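The following is a minimal sketch of that split, not code from Keyman: the boundary function takes only C types and a caller-supplied buffer, while the implementation behind it uses std::string. The names example_get_greeting and make_greeting, and the 0/1 return codes, are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

namespace {
// Implementation detail: free to use C++ types such as std::string.
std::string make_greeting(const std::string &name) {
  return "Hello, " + name;
}
}  // namespace

extern "C" {

/**
 * Writes a greeting for name into a caller-supplied buffer.
 *
 * @param       name      Null-terminated name
 * @param[out]  buf       Buffer that receives the null-terminated greeting
 * @param       buf_size  Size of buf in bytes
 * @return  0 on success, 1 if an argument is null or buf is too small.
 */
int example_get_greeting(const char *name, char *buf, std::size_t buf_size) {
  if(name == nullptr || buf == nullptr) {
    return 1;
  }
  const std::string greeting = make_greeting(name);
  if(greeting.size() + 1 > buf_size) {
    return 1;
  }
  // Copy the text together with its null terminator.
  std::memcpy(buf, greeting.c_str(), greeting.size() + 1);
  return 0;
}

}  // extern "C"
```

Because the signature contains only pointers, integers and a C calling convention, it can be called from any language with a C FFI, and no C++ type crosses the boundary.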
Use of the C++ Standard Template Library, where possible, is preferred over home-grown or additional library data types.
Don’t use exceptions.
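One common alternative to throwing is to return a status code and pass results back through an output parameter. The sketch below illustrates that pattern under assumed names (parse_status, parse_digit); it is not taken from the Keyman sources, and other error-reporting patterns are possible.

```cpp
#include <string>

/** Result codes for parse_digit (hypothetical names). */
enum parse_status {
  PARSE_OK = 0,
  PARSE_EMPTY_INPUT,
  PARSE_NOT_A_DIGIT
};

/**
 * Converts a one-character string to its decimal digit value.
 *
 * @param       text   The input string
 * @param[out]  value  Receives the digit value on success; unchanged on failure
 * @return  PARSE_OK on success, otherwise the reason for failure.
 */
parse_status parse_digit(const std::string &text, int &value) {
  if(text.empty()) {
    return PARSE_EMPTY_INPUT;
  }
  if(text.size() != 1 || text[0] < '0' || text[0] > '9') {
    return PARSE_NOT_A_DIGIT;
  }
  value = text[0] - '0';
  return PARSE_OK;
}
```

The caller checks the returned status before using value, so every failure path is visible at the call site.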
Macros are not a hard no, but prefer inline functions, enums, and const variables to macros.
Macros are used for including libraries, for example KMN_API in keyboardprocessor_bits.h.
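As a rough illustration of that preference, the sketch below shows three macro definitions (in comments) next to typed replacements. All names are hypothetical and exist only for this example.

```cpp
#include <cstddef>

// Instead of: #define MAX_CONTEXT_LENGTH 64
const std::size_t MAX_CONTEXT_LENGTH = 64;

// Instead of: #define STATE_IDLE 0 and #define STATE_ACTIVE 1
enum processor_state {
  STATE_IDLE,
  STATE_ACTIVE
};

// Instead of: #define CLAMP_LENGTH(n) ((n) > 64 ? 64 : (n))
inline std::size_t clamp_length(std::size_t n) {
  return n > MAX_CONTEXT_LENGTH ? MAX_CONTEXT_LENGTH : n;
}
```

Unlike the macro forms, these are type checked, respect scope and namespaces, and evaluate their argument exactly once.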
Templates are permitted but should be used in private implementation detail and not for APIs or public interfaces. Ensure template code is well commented.
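One way to follow this, sketched here under assumed names, is to keep the template in an unnamed namespace inside the source file and expose only a non-template function. join and join_keyboard_ids are hypothetical and not part of any Keyman API.

```cpp
#include <string>
#include <vector>

namespace {

/**
 * Joins the items of a container into one string.
 *
 * Template helper kept in an unnamed namespace so that it stays a private
 * implementation detail of this source file.
 *
 * @param  items      Any iterable container of std::string
 * @param  separator  Text placed between adjacent items
 * @return  The joined string; empty if items is empty.
 */
template <typename Container>
std::string join(const Container &items, const std::string &separator) {
  std::string result;
  bool first = true;
  for(const auto &item : items) {
    if(!first) {
      result += separator;
    }
    result += item;
    first = false;
  }
  return result;
}

}  // namespace

/** Public, non-template entry point (hypothetical). */
std::string join_keyboard_ids(const std::vector<std::string> &ids) {
  return join(ids, ", ");
}
```

Callers see only join_keyboard_ids, so the template can change or be removed without affecting the public interface.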
The Keyman Common Core uses the namespace km::core.
The KMX keyboard processor is nested within that namespace using kmx, that is km::core::kmx.
Other keyboard processor implementations can follow the same pattern, for example LDML can use km::core::ldml.
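A short sketch of this nesting follows. The functions qualified_name and processor_name are hypothetical placeholders. Nested blocks are used because the compact form namespace km::core::kmx { ... } requires C++17, which is newer than the C++14 toolchain mentioned above.

```cpp
#include <string>

namespace km {
namespace core {

/** Hypothetical helper shared by all keyboard processors. */
inline std::string qualified_name(const std::string &processor) {
  return "km::core::" + processor;
}

namespace kmx {
/** Hypothetical KMX-specific function. */
inline std::string processor_name() {
  return qualified_name("kmx");
}
}  // namespace kmx

namespace ldml {
/** Hypothetical LDML-specific function. */
inline std::string processor_name() {
  return qualified_name("ldml");
}
}  // namespace ldml

}  // namespace core
}  // namespace km
```

Code inside km::core::kmx can call names from km::core without qualification, while the two processor_name functions do not clash because each lives in its own nested namespace.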
Proprietary language extensions are not permitted in the core cross-platform code. Platform integration components of Keyman may need to use extensions, in which case that is acceptable.
There has not previously been a naming convention, as long as naming was consistent within each logical module or project.
Follow the convention of the project you are adding a file to. Otherwise, filenames should be all lowercase and can include underscores (_).
Use snake case.
std::string my_string;
All upper case.
Use snake case. All lowercase, with underscores between words.
Data members of classes use the same naming as variables (snake case), but with an m_ prefix.??
Note: the Keyman core has used a leading underscore (_); however, this is not recommended in C++. A leading underscore followed by a lowercase letter should be OK (see "Using leading underscore C++").
Data members of structs, both static and non-static, are named like ordinary nonmember variables.
Use snake_case.
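The sketch below puts these naming rules side by side. It rests on two assumptions that the text above does not state outright: that "all upper case" applies to constants, and that snake case applies to type and function names as well as variables. All identifiers are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Assumption: "all upper case" is read here as applying to constants.
const std::size_t MAX_NAME_LENGTH = 8;

// Struct data members are named like ordinary variables.
struct key_event {
  int virtual_key;
  bool is_down;
};

// Class data members take the m_ prefix.
class keyboard_info {
public:
  explicit keyboard_info(const std::string &keyboard_name)
    : m_keyboard_name(keyboard_name.substr(0, MAX_NAME_LENGTH)) {
  }

  const std::string &keyboard_name() const {
    return m_keyboard_name;
  }

private:
  std::string m_keyboard_name;
};
```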
Header files shall contain include guards to avoid including the same header multiple times. For example:
#ifndef CLASSNAME_H
#define CLASSNAME_H
#endif
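To show where the header's contents sit relative to the guard, here is a slightly fuller sketch. The file name context_cache.h and the function context_cache_size are hypothetical.

```cpp
// File: context_cache.h (hypothetical file name)
#ifndef CONTEXT_CACHE_H
#define CONTEXT_CACHE_H

#include <cstddef>

/** Returns the number of cached context items (stub for illustration). */
inline std::size_t context_cache_size() {
  return 0;
}

#endif  // CONTEXT_CACHE_H
```

Everything the header declares goes between #define and #endif, so a second inclusion in the same translation unit expands to nothing.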