BigQueryWriteClient fails with UNAUTHENTICATED exception when private-service-connect endpoint is configured #2105

Open
SujithSiddireddy opened this issue May 4, 2023 · 6 comments
Labels
api: bigquerystorage Issues related to the googleapis/java-bigquerystorage API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@SujithSiddireddy

Environment

  • Using bigquerystorage version 2.34.2
  • Using openjdk version 1.8.0_362
  • Application is running on a custom GCP Compute Engine VM instance

Steps to reproduce

  1. Create a private service connect endpoint in GCP (https://cloud.google.com/vpc/docs/configure-private-service-connect-apis#create-endpoint)
  2. Verify that the endpoint created above is reachable from the client VM
  3. Create a BigQueryWriteClient object, specifying valid credentials and the custom endpoint (the endpoint created in step 1)
  4. Create a JsonStreamWriter object using the write client created in step 3
  5. Append data using the JsonStreamWriter
  6. The append operation fails with an UNAUTHENTICATED exception

Code example

// The default endpoint used by BigQueryWriteClient is "bigquerystorage.googleapis.com:443"
// A private service connect endpoint is created with the endpoint name "samplepscendpoint"
// Hence, the private endpoint for bigquerystorage will be "bigquerystorage-samplepscendpoint.p.googleapis.com:443"
//   Refer : https://cloud.google.com/vpc/docs/about-accessing-google-apis-endpoints
// nslookup on bigquerystorage-samplepscendpoint.p.googleapis.com is returning the correct internal ip as configured
String customEndpoint = "bigquerystorage-samplepscendpoint.p.googleapis.com:443";
FileInputStream serviceAccountStream = new FileInputStream(serviceAccountFilePath);
Credentials credentials = ServiceAccountCredentials.fromStream(serviceAccountStream);
BigQueryWriteSettings bigQueryWriteSettings = BigQueryWriteSettings.newBuilder()
                .setCredentialsProvider(FixedCredentialsProvider.create(credentials))
                .setEndpoint(customEndpoint)
                .build();
BigQueryWriteClient bigQueryWriteClient = BigQueryWriteClient.create(bigQueryWriteSettings);
JsonStreamWriter streamWriter = JsonStreamWriter.newBuilder(tableName, bigQueryWriteClient).build();
// build a jsonArray with some data
ApiFuture<AppendRowsResponse> future = streamWriter.append(jsonArray);
// Append fails with UNAUTHENTICATED exception
// When customEndpoint is not explicitly set, data is appended as expected

// The same custom endpoint logic is working fine with BigQuery LegacyStreaming
// The default endpoint for BigQuery legacy streaming is "https://bigquery.googleapis.com"
String customHost = "https://bigquery-samplepscendpoint.p.googleapis.com";
BigQueryOptions bigqueryOptions = BigQueryOptions.newBuilder()
                .setCredentials(credentials)
                .setProjectId(projectId)
                .setHost(customHost)
                .build();
BigQuery client = bigqueryOptions.getService();
// build InsertAllRequest with data
InsertAllResponse response = client.insertAll(insertAllRequest);
// insertAll uploads data as expected
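
The endpoint naming used in the comments above (service name, a hyphen, the Private Service Connect endpoint name, then `.p.googleapis.com`) can be sketched as a small helper. This is only an illustration: the class and method names are hypothetical and not part of any Google library, and the helper assumes a `host[:port]` input without a URL scheme.

```java
// Hypothetical helper (not part of any Google library): derives the
// Private Service Connect hostname from a default googleapis.com endpoint.
final class PscEndpointNames {
    private static final String DEFAULT_SUFFIX = ".googleapis.com";
    private static final String PSC_SUFFIX = ".p.googleapis.com";

    private PscEndpointNames() {}

    // defaultEndpoint: "service.googleapis.com" or "service.googleapis.com:443" (no scheme).
    // pscEndpointName: the name given to the PSC endpoint, e.g. "samplepscendpoint".
    static String toPscEndpoint(String defaultEndpoint, String pscEndpointName) {
        int colon = defaultEndpoint.lastIndexOf(':');
        String host = colon >= 0 ? defaultEndpoint.substring(0, colon) : defaultEndpoint;
        String port = colon >= 0 ? defaultEndpoint.substring(colon) : "";
        if (!host.endsWith(DEFAULT_SUFFIX)) {
            throw new IllegalArgumentException("Not a googleapis.com endpoint: " + defaultEndpoint);
        }
        // Strip ".googleapis.com" to get the bare service name.
        String service = host.substring(0, host.length() - DEFAULT_SUFFIX.length());
        return service + "-" + pscEndpointName + PSC_SUFFIX + port;
    }

    public static void main(String[] args) {
        System.out.println(toPscEndpoint("bigquerystorage.googleapis.com:443", "samplepscendpoint"));
        System.out.println(toPscEndpoint("bigquery.googleapis.com", "samplepscendpoint"));
    }
}
```

For the legacy streaming client, the resulting host would still need the `https://` prefix before being passed to `setHost`.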

Stack trace

Exception in thread "main" com.google.api.gax.rpc.UnauthenticatedException: io.grpc.StatusRuntimeException: UNAUTHENTICATED: Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.
        at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:116)
        at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:41)
        at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:86)
        at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:66)
        at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
        at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:67)
        at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1132)
        at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
        at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
        at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
        at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
        at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:574)
        at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:544)
        at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
        at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
        at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
        at com.google.api.gax.grpc.ChannelPool$ReleasingClientCall$1.onClose(ChannelPool.java:541)
        at io.grpc.internal.DelayedClientCall$DelayedListener$3.run(DelayedClientCall.java:489)
        at io.grpc.internal.DelayedClientCall$DelayedListener.delayOrExecute(DelayedClientCall.java:453)
        at io.grpc.internal.DelayedClientCall$DelayedListener.onClose(DelayedClientCall.java:486)
        at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:576)
        at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
        at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:757)
        at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:736)
        at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
        at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
        Suppressed: com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed
                at com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57)
                at com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112)
                at com.google.cloud.bigquery.storage.v1.BigQueryWriteClient.getWriteStream(BigQueryWriteClient.java:441)
                at com.google.cloud.bigquery.storage.v1.JsonStreamWriter$Builder.<init>(JsonStreamWriter.java:472)
                at com.google.cloud.bigquery.storage.v1.JsonStreamWriter$Builder.<init>(JsonStreamWriter.java:418)
                at com.google.cloud.bigquery.storage.v1.JsonStreamWriter.newBuilder(JsonStreamWriter.java:395)
                at example.WriteApiDefaultMode$DataWriter.initialize(WriteApiDefaultMode.java:135)
                at example.WriteApiDefaultMode.writeToDefaultStream(WriteApiDefaultMode.java:45)
                at example.Main.writeWithDefaultMode(Main.java:60)
                at example.Main.main(Main.java:39)
Caused by: io.grpc.StatusRuntimeException: UNAUTHENTICATED: Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.
        at io.grpc.Status.asRuntimeException(Status.java:539)
        ... 17 more
@idan-at

idan-at commented Feb 4, 2024

@Neenu1995 any updates on this one? I also see setHost fails to work (using version 2.37.0)

@guderkar

Hey, we ran into this issue as well when using spark-bigquery-connector, which uses this library as a dependency.

We were able to work around this issue by faking the DNS entry in /etc/hosts, like so:

<psc-private-ip> bigquerystorage.googleapis.com
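
One way to confirm from the JVM's point of view that the override above took effect is to check what the default hostname resolves to. The following is a minimal sketch, not something from this thread: the class and method names are hypothetical, and it assumes the PSC endpoint uses an RFC 1918 address (such as 10.X.X.X), which is what `InetAddress.isSiteLocalAddress()` detects for IPv4.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic: reports whether a hostname resolves only to
// private (site-local) addresses as seen by this JVM's resolver.
final class PscDnsCheck {
    private PscDnsCheck() {}

    static boolean resolvesToPrivateAddress(String host) {
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                // For IPv4, site-local covers 10/8, 172.16/12 and 192.168/16.
                if (!address.isSiteLocalAddress()) {
                    return false;
                }
            }
            return true;
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "bigquerystorage.googleapis.com";
        System.out.println(host + " resolves to private address only: "
                + resolvesToPrivateAddress(host));
    }
}
```

If the PSC endpoint was created with an address outside those ranges, compare the resolved address against the configured one instead.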

@yriveiro

I have the same problem, but in my case the Private Service Connect endpoint has an A record, bigquerystorage-internal.p.googleapis.com, pointing to an IP in a private zone.

The nodes where the Spark job runs resolve the DNS name correctly (checked with dig) but still get the UNAUTHENTICATED error.

Did you verify that faking the DNS entry actually routes the data through Private Service Connect?

@guderkar

Yes, I tested it and it works. We also have a separate DNS entry for PSC, but it does not work with the BigQuery Storage endpoint.

# Define the project, dataset, and table
project_id = "my-project-id"
dataset = "my-dataset"
table = "my-table"
 
# bigquerystorage.googleapis.com must be faked in /etc/hosts to resolve to the PSC private IP (10.X.X.X)
# bigquerystorage-internal.p.googleapis.com does not work properly because of a bug in
# the underlying Java library, see https://github.com/googleapis/java-bigquerystorage/issues/2105
 
# Read BigQuery table
df = (
    spark.read.format("bigquery")
    .option("bigQueryHttpEndpoint", "https://bigquery-internal.p.googleapis.com")  # this works normally
    .option("bigQueryStorageGrpcEndpoint", "bigquerystorage.googleapis.com:443")   # this must be faked to resolve to the PSC private IP; otherwise you hit a VPC service control denial, or an UNAUTHENTICATED error if bigquerystorage-internal.p.googleapis.com:443 is configured
    .option("viewsEnabled", "true")
    .option("credentials", creds_b64)
    .option("parentProject", project_id)
    .option("project", project_id)
    .option("dataset", dataset)
    .option("table", table)
    .load()
)

@peter-gergely-horvath

Dear @Neenu1995, would it be possible to prioritize this issue? It bites all enterprise customers really badly through the downstream components like spark-bigquery-connector that build upon this library.

@Neenu1995
Contributor

@PhongChuong
