Describe the bug, including details regarding any error messages, version, and platform.
I am trying to use pyarrow.fs.S3FileSystem in a pod running on AWS EKS. The cluster is configured so that the pod assumes an IAM role via Pod Identity Association.
S3FileSystem seems to have no way to obtain credentials from the Pod Identity Association directly. When instantiated with no arguments, as in S3FileSystem(), it gains no access (for example, get_file_info fails with ACCESS_DENIED).
I am able to give S3FileSystem access, however, by first manually obtaining temporary credentials (access key, secret key and session token) from the Pod Identity Association (e.g. through boto3) and then either passing them as arguments when instantiating S3FileSystem, or storing them in the environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN.
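For illustration, the workaround of passing the credentials to the constructor might look roughly like the sketch below. It assumes boto3 in the pod can resolve the Pod Identity credentials through its default provider chain; the helper names (fetch_pod_identity_credentials, credentials_to_s3fs_kwargs, make_s3_filesystem) are made up for this example.

```python
def fetch_pod_identity_credentials():
    """Resolve temporary credentials via boto3's default provider chain.

    Assumption: the installed boto3/botocore version can pick up the
    EKS Pod Identity credentials on its own.
    """
    import boto3  # imported lazily so the pure helper below has no dependencies

    creds = boto3.Session().get_credentials()
    if creds is None:
        raise RuntimeError("boto3 could not resolve any AWS credentials")
    # Frozen credentials are a consistent snapshot of
    # (access_key, secret_key, token).
    return creds.get_frozen_credentials()


def credentials_to_s3fs_kwargs(creds):
    """Map a boto3-style credentials snapshot to S3FileSystem keyword arguments."""
    return {
        "access_key": creds.access_key,
        "secret_key": creds.secret_key,
        "session_token": creds.token,
    }


def make_s3_filesystem(region=None):
    """Build an S3FileSystem from a single set of temporary credentials."""
    from pyarrow import fs

    kwargs = credentials_to_s3fs_kwargs(fetch_pod_identity_credentials())
    if region is not None:
        kwargs["region"] = region
    # The credentials are fixed at construction time, so this instance
    # stops working once they expire.
    return fs.S3FileSystem(**kwargs)
```

An instance created this way keeps the credentials it was given, so it has to be recreated once they expire.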
However, in some systems, e.g. Ray Train, an instance of S3FileSystem is created and potentially used for a very long time (e.g. for the duration of training a large model, which could take days). Relying on a single set of expiring credentials is therefore limiting.
It is worth noting that if S3FileSystem is instantiated without credentials in the constructor arguments, I can keep updating the temporary credentials in the environment variables as they expire, and the instance of S3FileSystem will use the refreshed ones on every method call (such as get_file_info). This is the only way I have found to give S3FileSystem long-term access through Pod Identity Association beyond the expiry of a single set of temporary credentials.
However, updating environment variables has its own drawbacks:
- The environment variables affect the entire Python process; it is not possible to give specific credentials to a single instance of S3FileSystem.
- With some libraries that use S3FileSystem internally during a long operation (e.g. Ray Train), the user does not necessarily get a reliable opportunity (e.g. through a callback) to update the environment variables. Doing it on a separate Python thread is always an option, but that risks race conditions.
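To make the second drawback concrete, here is a rough sketch of the separate-thread variant. The function names and the fetch_credentials callable are hypothetical; fetch_credentials is assumed to return an object with access_key, secret_key and token attributes (for example, boto3 frozen credentials).

```python
import os
import threading


def export_credentials(creds, environ=os.environ):
    """Write one set of temporary credentials into the environment.

    The three variables are written one after another, so code running
    concurrently could observe a mix of old and new values. This is the
    race condition mentioned above.
    """
    environ["AWS_ACCESS_KEY_ID"] = creds.access_key
    environ["AWS_SECRET_ACCESS_KEY"] = creds.secret_key
    environ["AWS_SESSION_TOKEN"] = creds.token


def start_env_refresher(fetch_credentials, interval_seconds, environ=os.environ):
    """Export credentials once, then refresh them periodically in a daemon thread.

    Returns a threading.Event; call .set() on it to stop the refresher.
    """
    # Initial export happens synchronously so callers can rely on it.
    export_credentials(fetch_credentials(), environ)
    stop = threading.Event()

    def _loop():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval_seconds):
            export_credentials(fetch_credentials(), environ)

    threading.Thread(target=_loop, daemon=True).start()
    return stop
```

The interval would have to be shorter than the credential lifetime, and the refresh still applies to the whole process rather than to one filesystem instance.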
For the reasons above, it would be very beneficial if S3FileSystem could obtain access through EKS Pod Identity Association automatically (internally) and maintain that access beyond the expiry of a single set of temporary credentials.
Component(s)
Python