feat: add doc overviews (#316)

BCDevOps · Aug 15, 2023 · 9b43b38
1 parent 6e877ec
commit 9b43b38
Showing 5 changed files with 90 additions and 2 deletions.
diff --git a/docs/_sidebar.md b/docs/_sidebar.md
@@ -2,12 +2,18 @@
 
 * [Home](/)
 * Onboarding Planning
-** [Onboarding Process](/onboarding.md)
+** [Overview](/planning_overview.md)
 
 * Output Application Events
-** [Funbucks for Flentbit Configuration](/fluentbit.md)
+** [Overview](/app_event_overview.md)
+
+* Send Events to AWS
+** [Overview](/send_overview.md)
+** [Funbucks for Fluentbit Configuration](/fluentbit.md)
+** [Funbucks Onboarding](/onboarding.md)
 
 * Events to OpenSearch Documents
+** [Overview](/esp_overview.md)
 ** [Testing Event Stream Processing](/testing.md)
 ** [Deployment](/deployment.md)
 

diff --git a/docs/app_event_overview.md b/docs/app_event_overview.md
@@ -0,0 +1,18 @@
+
+## Best practices for creating and logging application events
+
+* Ensure messages are informative, necessary and logged at an appropriate level (More information: https://sematext.com/blog/logging-levels/)
+* Event messages are messages. Don't dilute (pollute) your message with data structures, telemetry, metrics or other things that should be logged in different fields and/or events.
+* Ensure your logs are understandable when output at any level.
+
+## Determine if you need additional event streams
+
+An application will have more than one event stream. Things like trace analytics, metrics and reporting should go into separate streams with their own format and be sent to indices dedicated to storing that type of data. Your team should work with 1Team to determine whether new or existing indices will handle your event stream.
+
+## Commercial off-the-shelf Software
+
+It is unlikely that your application will log directly in Elastic Common Schema format. It is becoming more common for applications to have the option of logging as JSON. A good first step is to check your application's configuration options.
+
+## Other activities
+
+All servers sending documents should regularly synchronize their clocks. In most cases, the server will already have been set up to do this.
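
As an illustration of the guidance in docs/app_event_overview.md that data belongs in its own fields rather than in the message, here is a minimal Python sketch of JSON logging. The field names (`@timestamp`, `log.level`, `order.id`), the `fields` key passed through `extra`, and the `build_logger` helper are assumptions made for this example, not a schema or API confirmed by this repository.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, keeping the message free of data."""

    def format(self, record: logging.LogRecord) -> str:
        doc = {
            # Illustrative field names only; check the target index schema.
            "@timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "log.level": record.levelname.lower(),
            "message": record.getMessage(),
        }
        # Structured data travels in its own fields, not in the message text.
        doc.update(getattr(record, "fields", {}))
        return json.dumps(doc)


def build_logger(name: str, stream=None) -> logging.Logger:
    """Hypothetical helper: a logger that writes JSON lines to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `build_logger("orders").info("Order accepted", extra={"fields": {"order.id": "A-1001"}})` keeps the message short and readable while the identifier lands in a separate field.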
diff --git a/docs/esp_overview.md b/docs/esp_overview.md
@@ -0,0 +1,35 @@
+# Event Stream Processing Overview
+
+How to use the Event Stream Processing Lambda to manipulate your events and place them in the correct index in OpenSearch.
+
+## When to use Lambda
+
+We encourage using the Lambda-based parsers for data enrichment. We do not recommend that teams implement their own versions of things like:
+
+* GeoIP lookup
+* User Agent parsing
+
+There are certain "awkward" or commonly needed tasks that we also provide parsers for:
+
+* key to path (transforming a key like "data.key" to an object)
+* field hashing
+* and more
+
+To activate this parsing, either send metadata that enables it or work with us to define standard parsing based on your data's fingerprint. Functionally, there is no difference between sending the metadata and relying on the fingerprint to define the parsing. Metadata sent with the data overrides the fingerprint's metadata.
+
+## Fingerprint your data
+
+Teams should discuss with us the data they plan on sending and work out the fields and values to send so that the data is recognized as belonging to a supported schema and put into the correct index. This also allows us to set up standard processing for your fingerprint if desired.
+
+The following table summarizes why you might want to let the fingerprint define the processing rather than sending the desired processing with the data.
+
+| Configuration | Pro | Con |
+| --- | --- | --- |
+| Client | <ul><li>Flexible and agile</li></ul> | <ul><li>Getting it right is up to you</li><li>Rolling out to multiple clients/projects is harder</li><li>Slightly more data sent</li></ul> |
+| Fingerprint | <ul><li>Immediately leverage improvements</li><li>No client rollout required</li></ul> | <ul><li>Opaque</li><li>Improvements could break your logs</li><li>No control over change timing</li><li>Fingerprint conflicts</li></ul> |
+
+### Index, ID and Hashes
+
+We insist that the data fingerprint be used to determine the index. This lets us centrally control which index the data is sent to. Index management is central to managing cost and performance.
+
+We recommend that the data fingerprint be used to determine the ID and hash. In some cases, it may be necessary to tweak these on the client-side.
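
To make two ideas from docs/esp_overview.md concrete, here is a minimal Python sketch of a key-to-path transform and of sent metadata overriding fingerprint metadata. This is not the Lambda's actual implementation: the function names, the `.` separator default, and the shape of the metadata dictionaries are all assumptions made for illustration.

```python
from typing import Any


def key_to_path(event: dict[str, Any], separator: str = ".") -> dict[str, Any]:
    """Expand keys such as "data.key" into nested objects."""
    result: dict[str, Any] = {}
    for key, value in event.items():
        node = result
        *parents, leaf = key.split(separator)
        for part in parents:
            # Create intermediate objects; an existing non-dict value is replaced.
            child = node.get(part)
            if not isinstance(child, dict):
                child = {}
                node[part] = child
            node = child
        node[leaf] = value
    return result


def resolve_parsers(
    fingerprint_meta: dict[str, Any], sent_meta: dict[str, Any]
) -> dict[str, Any]:
    """Hypothetical merge: metadata sent with the event wins over the fingerprint's."""
    return {**fingerprint_meta, **sent_meta}
```

For example, `key_to_path({"data.key": 1})` yields a `data` object containing `key`, and any option present in both metadata dictionaries takes its value from the metadata sent by the client.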
diff --git a/docs/planning_overview.md b/docs/planning_overview.md
@@ -0,0 +1,5 @@
+Your application's events will go on a journey: they are created, sent over HTTP to AWS, parsed by a Lambda function and finally added to OpenSearch. Creating suitable event documents and sending them to be added to OpenSearch is your team's responsibility. Your team should explore which events in your application are valuable, how you will dispatch the events, what your data management requirements are, and how you plan on forwarding the events to OpenSearch.
+
+It is also your team's responsibility to ensure that the documents stored in OpenSearch can be correlated with other logs. You are free to internally call your production environment a fun name like "RazzleDazzle." But if you send that as the environment value, your logs won't show up when someone searches for the infinitely more boring "production." The same problem occurs if you put that environment value in a non-standard field. This is partly handled by the strict schema used by each of the index patterns. Your team should explore the existing schemas and the values stored in OpenSearch.
+
+We can provide advice and examples of how teams have successfully made their applications' events available in OpenSearch. We can also assist with how to go from an event to an event document that fits existing schemas and uses data values consistent with those of other applications.
diff --git a/docs/send_overview.md b/docs/send_overview.md
@@ -0,0 +1,24 @@
+
+## Best practices for sending your documents
+
+Any tool that can log in to AWS and send signed HTTP requests can send your events. The preferred tool for sending this data is FluentBit. If you are sending on-premises logs, it is your only option.
+
+FluentBit is a highly flexible tool. If you have questions about how to run or configure it, please refer to the FluentBit documentation. A fair amount of the configuration is copy-paste boilerplate that every solution should include, such as sending information about the host. 1Team can provide sample configurations and work with your team to help get FluentBit sending data successfully.
+
+### Sending data directly
+
+Your event stream can be sent directly from code, without ever logging to a file and having a tool like FluentBit read it. Your code will need to log in to AWS, renew its token and sign the requests sent to Kinesis.
+
+## Data Lifecycle
+
+Your team is responsible for your data's lifecycle. We encourage you to independently store and manage the event stream you send to OpenSearch. We do not store the data as sent. If there are errors or dropped data, you will need to resend the data.
+
+The data lifecycles of indices in OpenSearch are optimized for capacity and performance. The lifecycles may be altered in the future for these reasons.
+
+## Funbucks
+
+Our solution for generating the FluentBit configuration is called Funbucks. The tool was created to consistently output the configuration across on-premises and OpenShift servers. It enables rapid configuration of OpenShift products by allowing you to run FluentBit locally for testing and then transform the files into a YAML configuration.
+
+## Pipeline
+
+You should have a robust pipeline that can accommodate multiple destinations for your event stream documents.
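
For the "Sending data directly" section of docs/send_overview.md, the following is a minimal Python sketch of handing one event to Kinesis from code. It assumes a client object with a `put_record(StreamName=..., Data=..., PartitionKey=...)` method; the boto3 Kinesis client has a method of that shape and signs its own requests. Whether this pipeline accepts records in exactly this form is an assumption, and the stream name and partition key are placeholders. The sketch does not cover logging in or renewing the token.

```python
import json
from typing import Any, Protocol


class KinesisLike(Protocol):
    """Anything with a boto3-style put_record method (assumed interface)."""

    def put_record(self, *, StreamName: str, Data: bytes, PartitionKey: str) -> dict: ...


def send_event(
    client: KinesisLike,
    stream_name: str,
    event: dict[str, Any],
    partition_key: str,
) -> dict:
    """Serialize one event document as compact JSON and put it on the stream."""
    data = json.dumps(event, separators=(",", ":")).encode("utf-8")
    # The client is responsible for credentials and request signing.
    return client.put_record(
        StreamName=stream_name, Data=data, PartitionKey=partition_key
    )
```

Passing the client in as a parameter keeps the function testable with a stub and leaves credential handling to whatever AWS client your team already uses.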