
Cleaning up Mesos from paasta readthedocs - PAASTA-18313 #3954

Merged: 4 commits, Oct 16, 2024 (diff shown from 1 commit).
73 changes: 29 additions & 44 deletions docs/source/about/glossary.rst
Member:

we should maybe update the smartstack entry in the glossary too - we don't really run smartstack anymore (or at least, we no longer run synapse), but we still use the "smartstack" naming to refer to the mesh/mesh config

that said, might be better to do that cleanup in another PR so that the scope doesn't grow too much here :p

@@ -1,37 +1,54 @@
Glossary
========

**App**
~~~~~~~~

Marathon app. A unit of configuration in Marathon. During normal
operation, one service "instance" maps to one Marathon app, but during
deploys there may be more than one app. Apps contain Tasks.

**Docker**
~~~~~~~~~~

Container `technology <https://www.docker.com/whatisdocker/>`_ that
PaaSTA uses.

**Kubernetes**
~~~~~~~~~~~~~~

`Kubernetes <https://kubernetes.io/>`_ (a.k.a. k8s) is the open-source system on which Yelp runs many compute workloads.
In Kubernetes, the control plane distributes tasks to servers called Kubelets (a.k.a. kube nodes or Kubernetes agents), which run them.

**Kubernetes Node**
~~~~~~~~~~~~~~~~~~~

A node is a worker machine in a Kubernetes cluster that runs Pods.
In our case, it's usually a virtual machine provisioned via AWS EC2 Fleets or AutoScalingGroups.

**Kubernetes Horizontal Pod Autoscaler (HPA)**
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It's a Kubernetes feature that automatically scales the number of pods in a deployment based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics).
Member:
to be consistent with the other definitions:

Suggested change
It's a Kubernetes feature that automatically scales the number of pods in a deployment based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics).
A Kubernetes feature that automatically scales the number of pods in a deployment based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics).
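For illustration of the glossary entry above — not a PaaSTA-specific config, just a plain upstream Kubernetes manifest with hypothetical names — an HPA targeting average CPU utilization might look like:

```yaml
# Illustrative HPA manifest (hypothetical service name); scales a
# Deployment between 2 and 10 replicas to hold average CPU near 50%.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```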


**clustername**
~~~~~~~~~~~~~~~

A shortname used to describe a PaaSTA cluster. Use ``paasta
list-clusters`` to see them all.

**Kubernetes pod**
Member:

technically a pod in k8s is capitalized Pod - but we use pod a lot and it might get annoying to always capitalize

that said, if we want to be consistent we should s/pod/Pod

~~~~~~~~~~~~~~~~~~~

Atomic deployment unit for PaaSTA workloads at Yelp and in all Kubernetes clusters: one or more related containers that share a network namespace.
At Yelp, each pod is an individual instance of one of our services, and many pods can run on each server.
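A minimal illustrative Pod manifest (hypothetical names — a sketch of the "one or more containers" idea, not a PaaSTA-generated object):

```yaml
# Smallest useful Pod: one container running a service image.
apiVersion: v1
kind: Pod
metadata:
  name: example-service
spec:
  containers:
    - name: main
      image: example-registry/example-service:latest
```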

**instancename**
~~~~~~~~~~~~~~~~

Logical collection of Mesos tasks that comprise a Marathon app. service
name + instancename = Marathon app name. Examples: main, canary.
Logical collection of Kubernetes pods that comprise a Kubernetes Deployment. service
Member:
maybe something like:

Suggested change
Logical collection of Kubernetes pods that comprise a Kubernetes Deployment. service
Logical collection of Kubernetes pods that comprise an application deployed on Kubernetes. service

would be simpler for folks that don't know what a pod is? we haven't really defined what a Deployment is yet in k8s in this glossary otherwise :p

Contributor (author):
That's a good idea, will also add what a deployment is to the glossary :p

name + instancename = Kubernetes Deployment. Examples: main, canary. Each instance represents a running
version of a service with its own configuration and resources.

**namespace**
~~~~~~~~~~~~~

An haproxy/SmartStack concept grouping backends that listen on a
particular port. A namespace may route to many healthy Marathon
instances. By default, the namespace in which a Marathon job appears is
particular port. A namespace may route to many healthy paaSTA
Member:
Suggested change
particular port. A namespace may route to many healthy paaSTA
particular port. A namespace may route to many healthy PaaSTA

instances. By default, the namespace in which a Kubernetes deployment appears is
Member:
hmm, not sure if we should use deployment or Deployment here - or abstract this out a bit and just call this a "Kubernetes application" or "an application running on Kubernetes"?

Contributor (author):

I wonder if we should just say "paasta instance", since an application is also not defined here

Contributor (author):

will also define kubernetes namespace further up in the glossary

its instancename.

**Nerve**
@@ -40,32 +57,6 @@ its instancename.
A service announcement `daemon <https://github.com/airbnb/nerve>`_
that registers services in zookeeper to be discovered.

**Marathon**
~~~~~~~~~~~~

A `Mesos Framework <https://mesosphere.github.io/marathon/>`_
designed to deploy stateless services.

**Mesos**
~~~~~~~~~

A `Cluster/Scheduler <http://mesos.apache.org/>`_ that interacts
with other `Framework <https://docs.mesosphere.com/frameworks/>`_
software to run things on nodes.

**Mesos Master**
~~~~~~~~~~~~~~~~

A machine running a Mesos Master process, responsible for coordination
but not responsible for actually running Marathon or Tron jobs. There
are several Masters, coordinating as a quorum via Zookeeper.

**Mesos Slave**
~~~~~~~~~~~~~~~

A machine running a Mesos Slave process, responsible for running
Marathon or Tron jobs as assigned by the Mesos Master.

**PaaSTA**
~~~~~~~~~~

@@ -87,12 +78,6 @@ The brand name for Airbnb’s Nerve + Synapse service discovery solution.

A local haproxy daemon that runs on yocalhost

**Task**
~~~~~~~~

Marathon task. A process (usually inside a Docker container) running on
a machine (a Mesos Slave). One or more Tasks constitutes an App.

**soa-configs**
~~~~~~~~~~~~~~~

@@ -107,5 +92,5 @@ services.
**Zookeeper**
~~~~~~~~~~~~~

A distributed key/value store used by Mesos for coordination and
A distributed key/value store used by PaaSTA for coordination and
persistence.
2 changes: 1 addition & 1 deletion docs/source/about/paasta_principles.rst
@@ -54,7 +54,7 @@ a particular app in a theoretical PaaS:
+=============================================+=====================================+
| :: | :: |
| | |
| $ cat >marathon-cluster.yaml <<EOF | |
| $ cat >kubernetes-cluster.yaml <<EOF | |
| web: | |
| env: | |
| PRODUCTION: true | $ paas config:set PRODUCTION=true |
72 changes: 16 additions & 56 deletions docs/source/about/smartstack_interaction.rst
@@ -38,70 +38,30 @@ can be reached.
What Would Happen if PaaSTA Were Not Aware of SmartStack
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

PaaSTA uses `Marathon <https://mesosphere.github.io/marathon/>`_ to deploy
PaaSTA uses `Kubernetes <https://kubernetes.io/>`_ to deploy
long-running services. At Yelp, PaaSTA clusters are deployed at the
``superregion`` level. This means that a service could potentially be deployed
on any available host in that ``superregion`` that has resources to run it. If
PaaSTA were unaware of the Smartstack ``discover:`` settings, Marathon would
naively deploy tasks in a potentially "unbalanced" manner:
PaaSTA were unaware of the SmartStack ``discover:`` settings, the Kubernetes scheduler would
naively deploy pods in a potentially "unbalanced" manner:
Member:
i don't think we ported any of the mesos-smartstack logic for placement constraints to k8s - i think the most we do atm is have a global topology spread constraint that tries to balance things out across AZs, but that's independent of what the smartstack config is


.. image:: unbalanced_distribution.svg
:width: 700px

With the naive approach, there is a total of six tasks for the superregion, but
With the naive approach, there are a total of six pods for the superregion, but
four landed in ``region 1``, and two landed in ``region 2``. If
the ``discover`` setting were set to ``habitat``, there would be habitats
**without** tasks available to serve anything, likely causing an outage.
**without** pods available to serve anything, likely causing an outage.
Member:
it might be worth copying over some of the words from our internal docs - in reality, there probably wouldn't be an outage in this case since (iirc, just doing this off the top of my head) most services have a list for their advertise section and the documented behavior internally is

If you have a list in the "advertise" section and there are no healthy instances in the smaller location type, smartstack will start discovering your service instances in the larger region (see last point in the "advertise" paragraph)

...that said: i'm not sure if we carried this over to the actual non-smartstack world in which we currently live :P

Contributor (author):

I will delete this whole section, "What Would Happen if PaaSTA Were Not Aware of SmartStack", because it doesn't make sense: paasta is not aware of smartstack config


In a world with configurable SmartStack discovery settings, the deployment
system (Marathon) must be aware of these and deploy accordingly.
system (Kubernetes) must be aware of these and deploy accordingly.
Member:
we can probably remove this since k8s/paasta aren't actually doing anything placement-wise with the smartstack settings :p


What A SmartStack-Aware Deployment Looks Like
How to set PaaSTA to be aware of SmartStack
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By taking advantage of
`Marathon Constraint Language <https://mesosphere.github.io/marathon/docs/constraints.html>`_
, specifically the
`GROUP_BY <https://mesosphere.github.io/marathon/docs/constraints.html#group_by-operator>`_
operator, Marathon can deploy tasks in such a way as to ensure a balanced number
of tasks in each latency zone.

Example: Balanced deployment to every habitat
*********************************************

For example, if the SmartStack setting
were ``discover: habitat`` [1]_, we Marathon could enforce the constraint
``["habitat", "GROUP_BY"]``, which will ask Marathon to distribute tasks
evenly between the habitats[2]_:

.. image:: balanced_distribution.svg
:width: 700px

Example: Deployment balanced to each region
*******************************************

Similarly, if the ``discover`` setting were set to ``region``, the equivalent
Marathon constraint would ensure an equal number of tasks distributed to each region.

.. image:: balanced_distribution_region.svg
:width: 700px

Even though there some habitats in this diagram that lack the service, the
``discover: region`` setting allows clients to utilize *any* process as long
as it is in the local region. The Marathon constraint of ``["region", "GROUP_BY"]``
ensures that tasks are distributed equally over the regions, in this case three
in each.


.. [1] Technically PaaSTA should be using the smallest value of the ``advertise``
setting, tracked in `PAASTA-1253 <https://jira.yelpcorp.com/browse/PAASTA-1253>`_.
.. [2] Currently the ``instances:`` count represents the total number of
instances in the cluster. Eventually with `PAASTA-1254 <https://jira.yelpcorp.com/browse/PAASTA-1254>`_
the instance count will be a per-discovery-location setting, meaning there
will always be an equal number of instances per location. (With ``instances: 6``
and a ``discovery: habitat``, and three habitats, the total task count would be
18, 6 in each habitat.)
PaaSTA is not natively aware of SmartStack. To make it aware — or, more specifically, to make the Kubernetes scheduler aware — we can use Pod Topology Spread Constraints.
To balance pods across Availability Zones (AZs) in Kubernetes, we use `topology spread constraints <https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/>`_,
assigned per instance of a service via the ``topology_spread_constraints`` key in soa-configs.
Member:
I think this might be better positioned as a section about how PaaSTA is not aware of the smartstack settings for a service atm + what we currently do to spread things out
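As an illustrative sketch of the soa-configs key mentioned above — the exact schema and field names here are assumptions, not the authoritative PaaSTA format — a per-instance topology spread constraint could look like:

```yaml
# kubernetes-<cluster>.yaml in soa-configs (hypothetical example)
main:
  instances: 6
  topology_spread_constraints:
    - topology_key: topology.kubernetes.io/zone  # spread across AZs
      max_skew: 1                                # at most 1 pod of imbalance
      when_unsatisfiable: ScheduleAnyway         # soft constraint: prefer, don't block
```

These keys mirror the upstream Kubernetes ``topologySpreadConstraints`` fields (``topologyKey``, ``maxSkew``, ``whenUnsatisfiable``) in snake_case.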


How SmartStack Settings Influence Monitoring
--------------------------------------------
@@ -116,7 +76,7 @@ Example: Checking Each Habitat When ``discover: habitat``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If SmartStack is configured to ``discover: habitat``, PaaSTA configures
Marathon to balance tasks to each habitat. But what if it is unable to do that?
Kubernetes to balance tasks to each habitat. But what if it is unable to do that?
Member:
this sentence isn't true anymore - we unconditionally try to balance across habitats these days


.. image:: replication_alert_habitat.svg
:width: 700px
@@ -154,7 +114,7 @@ Example: Checking Each Region When ``discover: region``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If SmartStack is configured to ``discover: region``, PaaSTA configures
Marathon to balance tasks to each region. But what if it is unable to launch
Kubernetes to balance tasks to each region. But what if it is unable to launch
Member:
same here - with the additional bit that with the yelpy definition of "region" from y/habitat, we don't actually have any clusters that span a region since all the superregions in which we have k8s clusters only contain a single region :p

all the tasks, but there were tasks running in that region?

.. image:: replication_noalert_region.svg
@@ -189,9 +149,9 @@ components of the same service on different ports. In PaaSTA we call these
api:
proxy_port: 20002

The corresponding Marathon configuration in PaaSTA might look like this::
The corresponding Kubernetes configuration in PaaSTA might look like this::

#marathon.yaml
#kubernetes.yaml
main:
instances: 10
cmd: myserver.py
@@ -214,7 +174,7 @@ the same Nerve namespace. Consider this example::
main:
proxy_port: 20001

#marathon.yaml
#kubernetes.yaml
main:
instances: 10
cmd: myserver.py
@@ -238,7 +198,7 @@ Sharding is another use case for using alternative namespaces::
main:
proxy_port: 20001

#marathon.yaml
#kubernetes.yaml
shard1:
instances: 10
registrations: ['service.main']
11 changes: 1 addition & 10 deletions docs/source/contributing.rst
@@ -29,7 +29,7 @@ System Package Building / itests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

PaaSTA is distributed as a debian package. This package can be built and tested
with ``make itest_xenial``. These tests make assertions about the
Member:
lol, what a throwback!

with ``make itest_<os_codename>``. These tests make assertions about the
packaging implementation.


@@ -71,12 +71,3 @@ it is a little tricky.
* ``eval "$(.tox/py27/bin/register-python-argcomplete ./tox/py27/bin/paasta)"``
Member:
oof, this is super out of date - but we can probably fix that in a later PR since I'm not actually sure that tab-complete works (although, i haven't tried it in a while and i assume we don't generate fish completions :p)


* There is a simple integration test. See the itest/ folder.

Upgrading Components
--------------------

As things progress, there will come a time that you will have to upgrade
PaaSTA components to new versions.

* See `Upgrading Mesos <upgrading_mesos.html>`_ for how to upgrade Mesos safely.
* See `Upgrading Marathon <upgrading_marathon.html>`_ for how to upgrade Marathon safely.
4 changes: 0 additions & 4 deletions docs/source/generated/paasta_tools.rst
@@ -13,7 +13,6 @@ Subpackages
paasta_tools.frameworks
paasta_tools.instance
paasta_tools.kubernetes
paasta_tools.mesos
paasta_tools.metrics
paasta_tools.monitoring
paasta_tools.paastaapi
@@ -71,9 +70,6 @@ Submodules
paasta_tools.log_task_lifecycle_events
paasta_tools.long_running_service_tools
paasta_tools.mac_address
paasta_tools.marathon_dashboard
paasta_tools.mesos_maintenance
paasta_tools.mesos_tools
paasta_tools.monitoring_tools
paasta_tools.monkrelaycluster_tools
paasta_tools.nrtsearchservice_tools
17 changes: 4 additions & 13 deletions docs/source/installation/example_cluster.rst
@@ -24,11 +24,6 @@ everything with ``docker-compose down && docker-compose run playground``.
Getting Started
---------------

Mesos
~~~~~
To launch a running Mesos cluster, then run ``docker-compose run playground``
and you'll be dropped into a shell with the paasta\_tools package installed in development mode.

Kubernetes
~~~~~~~~~~
To instead launch a Kubernetes cluster, run
@@ -47,9 +42,7 @@ Try it out
The cluster includes a git remote and docker registry. The git remote
contains an example repo but you can add more if you want.

The mesos and marathon webuis are exposed on your docker host
on port 5050, 8080, 8081. So load them up if you want to watch. Then in
the playground container:
In the playground container:

::

@@ -63,9 +56,8 @@

Scaling The Cluster
-------------------
If you want to add more capacity to the cluster, you can increase the number of Mesos agents/Kubernetes Nodes:
If you want to add more capacity to the cluster, you can increase the number of Kubernetes Nodes:

``docker-compose scale mesosslave=4`` or
``docker-compose scale kubernetes=4``


@@ -79,9 +71,8 @@ Some but not all of the paasta command line tools should work. Try:
paasta status -s hello-world

Scribe is not included with this example cluster. If you are looking for
logs, check ``/var/logs/paasta_logs`` and syslog on the mesosmaster for
the output from cron. Also note that all the slaves share the host's
docker daemon.
logs, check syslog on the kubernetes node that the pod is running on for the output from cron.
Member:
nit:

Suggested change
logs, check syslog on the kubernetes node that the pod is running on for the output from cron.
logs, check syslog on the Kubernetes node that the pod is running on for the output from cron.

You can get the host the pod is running on by adding "-v" to the command above.

Cleanup
-------