
AWS Lambda Packaging

Getting Started

Use the make recipes to build and deploy lambda layers for this project. The layer builds use a docker container with the AWS Lambda build system for python (3.6 at the time of writing; the exact version depends on the project virtualenv).

The layer packages are based on the pyproject.toml specs: poetry export identifies the project dependencies and poetry build creates a pip wheel for the project. The layer zip archives are generated in /tmp/{project_name}/*.zip.
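
A rough sketch of what those recipes do under the hood (illustrative flags; the make recipes may use different options):

$ poetry export --without-hashes -f requirements.txt -o requirements.txt
$ poetry build -f wheel
$ ls dist/*.whl   # the project wheel that is installed into the layer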

  • make docker-boto-libs checks the boto libraries in an AWS Lambda runtime
  • make docker-base-build prepares the docker base image to build layers
  • make docker-base-shell opens a bash shell in a base container
  • make docker-test-build prepares the docker test image to run tests
  • make docker-test-shell opens a bash shell in a test container
  • make docker-test-run runs the docker test image
    • this runs a pytest suite against a pip installation that mirrors the layer contents
    • the test setup and test suite are not trivial (they might be fragile)
  • make layer-package builds one or more lambda layers
    • it reports on the location of the layer artifacts and their sizes
    • it does not publish anything to AWS S3 or Lambda
  • make layer-size reports the total size of each layer
    • this report is also provided by make layer-package
    • this assumes that make layer-package has been run already
  • make layer-publish pushes updated layers to AWS S3 and Lambda
    • this assumes that make layer-package has been run already
    • use source aws_profile.sh and aws-profile to manage AWS credentials
    • use make aws-check and make aws-settings to verify AWS details
    • this assumes that an AWS S3_BUCKET is defined and allows uploads
    • this assumes that AWS credentials are available and allow publishing
    • this calculates zip archive metadata to detect whether an update is required
  • make function-publish pushes a test lambda function to AWS Lambda
    • at present, the AWS console is required to update the layer it uses
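
Putting these recipes together, a typical end-to-end run looks something like this (a sketch; credential handling is via the aws_profile.sh helpers noted above):

$ make docker-base-build
$ make layer-package
$ make layer-size
$ source aws_profile.sh    # then pick a profile with aws-profile
$ make aws-check
$ make layer-publish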

After make layer-package, check that the layer sizes are within limits, and then use make layer-publish to release new versioned layers.

$ make layer-size

MAX BYTES   262144000 bytes
                28458 bytes in py36-lambda-project-0.1.0-nodeps.zip
            249618460 bytes in py36-lambda-project-0.1.0-pyarrow.zip
            254553398 bytes in py36-lambda-project-0.1.0-sql.zip
            228347643 bytes in py36-lambda-project-0.1.0-xarray.zip
            209424398 bytes in py36-lambda-project-0.1.0.zip

These package sizes are just an example; as this lambda packaging project evolves, the optional packages and some details of the package optimizations may change. Packages that include optional extra dependencies use a suffix to indicate the extras included, but check the project pyproject.toml for details. Some layer packages with additional extras are too large for AWS Lambda. If a project requires optional extras that exceed the AWS Lambda limits, it will need a docker container or some other custom solution that isolates only the required dependencies. Additional details about optimizing packages are noted below.

AWS Lambda Notes

The following notes relate to packaging this project for AWS lambda. The creation of a lambda layer for the project is not trivial, for several reasons:

  • the library versions and APIs are complicated
  • coordinate reference system (CRS) handling is complex in the python geospatial libraries
  • the size of the dependency libraries and restrictions on lambda package size

The builds follow the guidelines from the AWS knowledge center, which recommends using the lambci/lambda Docker images to simulate the live Lambda environment and create a layer that is compatible with the runtimes that you specify. For more information, see lambci/lambda on the Docker website. Note that the lambci/lambda images are not an exact copy of the Lambda environment and some files may be missing. The AWS Serverless Application Model (AWS SAM) also uses the lambci/lambda Docker images when you run sam local start-api.
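
The general pattern behind those guidelines, outside of this project's make recipes, is to run pip inside the build image so that any compiled wheels match the Lambda runtime (a minimal sketch; the target directory and archive name are illustrative):

$ docker run --rm -v "$PWD":/var/task lambci/lambda:build-python3.6 \
    pip install -r requirements.txt -t python/lib/python3.6/site-packages
$ zip -r9 example-layer.zip python   # zip the python/ prefix into a layer archive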


Lambda Limits

See https://docs.aws.amazon.com/lambda/latest/dg/limits.html

  • Function and layer storage: 75 GB
  • Deployment package size
    • 250 MB (unzipped, including layers)
    • 50 MB (zipped, for direct upload)
    • 3 MB (console editor)
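
The 250 MB unzipped limit is the one these layer builds run up against; a quick way to check a built archive against it (a sketch, using one of the archive names from the example above):

$ unzip -q py36-lambda-project-0.1.0.zip -d /tmp/layer-check   # archive from /tmp/{project_name}/
$ du -sb /tmp/layer-check   # must stay under 262144000 bytes unzipped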

AWS SDK packages

Use poetry show -t --no-dev, pip freeze, or pipdeptree to check the installed versions, and pin common dependencies so that the lambda layers use the same, consistent versions.

AWS Lambda bundles the python SDK in its runtimes, but AWS advises that bundling it into a project layer is a best practice, to control the SDK version.
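
For example, the project's boto specs can be pinned to the runtime versions reported by make docker-boto-libs below (illustrative pins; match them to the current runtime):

# pyproject.toml
[tool.poetry.dependencies]
boto3 = "1.15.16"
botocore = "1.18.16"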

See also the current versions of botocore bundled in the Lambda runtimes, listed in the AWS Lambda runtimes documentation.

Also consider what is supported by aiobotocore, which pins specific botocore versions.

To view the packages installed in the lambci/lambda image for python 3.6 (as of Nov 2020):

$ make docker-boto-libs
...
/var/runtime/boto3-1.15.16.dist-info
/var/runtime/botocore-1.18.16.dist-info
/var/runtime/botocore-1.18.18.dist-info  # weird - two botocore versions!
/var/runtime/certifi-2020.6.20.dist-info
/var/runtime/chardet-3.0.4.dist-info
/var/runtime/idna-2.10.dist-info
/var/runtime/jmespath-0.10.0.dist-info
/var/runtime/python_dateutil-2.8.1.dist-info
/var/runtime/s3transfer-0.3.3.dist-info
/var/runtime/six-1.15.0.dist-info
/var/runtime/urllib3-1.25.11.dist-info
...

The layer_create_zip.sh:clean_aws_packages function removes all of these SDK packages from the layer zip files. It might not discriminate between package versions that differ from the SDK versions. (It currently requires checking the boto* package specs in pyproject.toml, requirements.dev, and the layer_create_zip.sh:pin_lambda_sdk function; some manual specs are required because the AWS Lambda production system might be running older versions than those in the docker image at the time a container is run.)
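
As a rough illustration of that pruning (not the actual layer_create_zip.sh code), the runtime-provided packages are stripped from the staged site-packages before zipping, along these lines:

$ cd "$LAYER_STAGING_DIR/python"          # hypothetical staging directory
$ rm -rf boto3* botocore* s3transfer*     # plus the other runtime packages listed above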

The lambda/layer_builds.sh script uses the SDK versions provided to try to pin dependencies to those versions. For project dependencies that require incompatible versions, a pip check should identify the problem for that layer during the build.
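
For instance, running pip check in the build environment after installing the pinned set surfaces such conflicts; the output below is the success case:

$ python -m pip check
No broken requirements found.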

Splitting Layers for Large Packages

There are several large packages for scientific python projects:

$ du -sh /opt/python/lib/python3.6/site-packages/* | grep -E '^[0-9]*M' | sort
11M	/opt/python/lib/python3.6/site-packages/numba
14M	/opt/python/lib/python3.6/site-packages/numpy
24M	/opt/python/lib/python3.6/site-packages/pandas
24M	/opt/python/lib/python3.6/site-packages/pyproj
32M	/opt/python/lib/python3.6/site-packages/numpy.libs
44M	/opt/python/lib/python3.6/site-packages/fiona
55M	/opt/python/lib/python3.6/site-packages/rasterio
57M	/opt/python/lib/python3.6/site-packages/llvmlite

The pip show command can list a package's dependencies and the packages that depend on it, e.g.

$ python -m pip show boto3
Name: boto3
Version: 1.12.49
Summary: The AWS SDK for Python
Home-page: https://github.com/boto/boto3
Author: Amazon Web Services
Author-email: UNKNOWN
License: Apache License 2.0
Location: /opt/conda/envs/aws-lambda-layer-packing/lib/python3.6/site-packages
Requires: jmespath, s3transfer, botocore
Required-by: moto, aws-sam-translator

The dependency graph can be displayed and explored using poetry show -t and poetry show -t {package}. For example, the llvmlite package is a dependency of numba, which is a dependency of fastparquet, which also depends on pandas and therefore numpy:

$ poetry show -t fastparquet
fastparquet 0.3.3 Python support for Parquet file format
├── numba >=0.28
│   ├── llvmlite >=0.33.0.dev0,<0.34
│   ├── numpy >=1.15
│   └── setuptools *
├── numpy >=1.11
├── pandas >=0.19
│   ├── numpy >=1.13.3
│   ├── python-dateutil >=2.6.1
│   │   └── six >=1.5
│   └── pytz >=2017.2
├── six *
└── thrift >=0.11.0
    └── six >=1.7.2

Using pipdeptree can also identify package dependencies, including reverse dependencies. For example, since it is useful to remove the packages that lambda already provides, such as boto3, pipdeptree can help to find anything that depends on them:

$ pip install pipdeptree
$ pipdeptree --help
$ pipdeptree -r -p boto3
boto3==1.12.49
  - aws-sam-translator==1.25.0 [requires: boto3~=1.5]
    - cfn-lint==0.34.1 [requires: aws-sam-translator>=1.25.0]
      - moto==1.3.14 [requires: cfn-lint>=0.4.0]
  - moto==1.3.14 [requires: boto3>=1.9.201]

To isolate the dependency tree to only the required package libs, without any development libs, it can help to create a clean virtualenv and install only the required packages. After the initial analysis of the dependency tree, install the package extras and repeat the analysis.

# create and activate a new venv any way you like
$ poetry install  # only required packages
$ pip install pipdeptree
$ pipdeptree -p boto3  # what does boto3 depend on
$ pipdeptree -r -p boto3  # what depends on boto3

In a similar way, continue with dependencies of a dependency, such as the dependencies of boto3 and so on.
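
For example, following the boto3 tree one level down (pip show above lists botocore and s3transfer among its requirements):

$ pipdeptree -p botocore        # what botocore depends on
$ pipdeptree -r -p s3transfer   # what else depends on s3transfer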
