pinecone-io · cwaddingham · Sep 15, 2024
diff --git a/confluent-rss-newsbot/.env.example b/confluent-rss-newsbot/.env.example
@@ -0,0 +1,18 @@
+# Fetch news from the past 24 hours by default (value is in seconds)
+LATEST_NEWS_TIMEFRAME=86400
+
+# Use https://platform.openai.com/ to get an API key
+OPENAI_API_KEY=
+
+# Use https://app.pinecone.io to get an API key
+PINECONE_API_KEY=
+
+# Navigate to Indexes under your Project to retrieve the Index name
+PINECONE_INDEX=
+
+# Retrieve the following from the Confluent Cloud Console.
+CONFLUENT_BOOTSTRAP_SERVERS=
+CONFLUENT_API_KEY=
+CONFLUENT_API_SECRET=
+CONFLUENT_INPUT_TOPIC=
+CONFLUENT_OUTPUT_TOPIC=
diff --git a/confluent-rss-newsbot/.eslintrc.json b/confluent-rss-newsbot/.eslintrc.json
@@ -0,0 +1,3 @@
+{
+  "extends": "next/core-web-vitals"
+}
diff --git a/confluent-rss-newsbot/.gitignore b/confluent-rss-newsbot/.gitignore
@@ -0,0 +1,54 @@
+# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.
+
+# dependencies
+/node_modules
+/.pnp
+.pnp.js
+
+# testing
+/coverage
+
+# next.js
+/.next/
+/out/
+
+# production
+/build
+
+# misc
+.DS_Store
+*.pem
+*.swp
+
+# debug
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+/logs
+nohup.out
+output.txt
+
+# Python
+.venv/
+
+# local env files
+.env
+.env*.local
+.env*.development
+client.properties
+
+# vercel
+.vercel
+
+# typescript
+*.tsbuildinfo
+next-env.d.ts
+.env
+
+# pnpm
+pnpm-lock.yaml
+/test-results/
+/playwright-report/
+/playwright/.cache/
+.azure
+
diff --git a/confluent-rss-newsbot/README.md b/confluent-rss-newsbot/README.md
@@ -0,0 +1,69 @@
+# Building a Real-Time News Intelligence Chatbot with Confluent, Pinecone, and Azure OpenAI
+
+In this example, we'll build a full-stack application that uses Retrieval Augmented Generation (RAG) powered by [Pinecone](https://pinecone.io) and streaming data from [Confluent](https://www.confluent.io/) to deliver accurate and contextually relevant responses about current news events in a chatbot.
+
+Our application will:
+
+1. Poll a given news source (Reddit news subreddits by default) to retrieve the latest news stories via RSS.
+2. Stream these stories to Confluent for real-time processing.
+3. Use OpenAI to generate embeddings for the news articles, which are then stored in Pinecone.
+4. Provide a chat interface for asking about the latest news and world events, using the stored embeddings to retrieve relevant context for each response.
+
+By the end of this tutorial, you'll have a real-time news intelligence chatbot that answers questions about current events using freshly ingested news stories.
+
+## Step 1: Setting Up Your Next.js Application
+
+First, clone the repository, then install the Python and Node.js dependencies:
+
+```bash
+git clone git@github.com:pinecone-field/confluent-demo.git
+cd confluent-demo
+python -m venv .venv
+source .venv/bin/activate
+pip install -r requirements.txt
+npm install
+```
+
+## Step 2: Configure Environment Variables
+
+Create a `.env.local` file in your project root (you can use `.env.example` as a starting point) and add the following:
+
+```bash
+PINECONE_API_KEY=your_pinecone_api_key
+PINECONE_INDEX=your_pinecone_index_name
+PINECONE_NAMESPACE=your_pinecone_namespace
+OPENAI_API_KEY=your_azure_openai_api_key
+CONFLUENT_BOOTSTRAP_SERVERS=your_confluent_bootstrap_servers
+CONFLUENT_API_KEY=your_confluent_api_key
+CONFLUENT_API_SECRET=your_confluent_api_secret
+```
+
+## Running the Application
+
+The application consists of three processes:
+
+1. The news poller
+2. The news processor
+3. The Next.js development server
+
+You can run each of these in a separate terminal window, or add a script to your `package.json` that runs them together (the example below assumes the `concurrently` package is installed):
+
+```json
+{
+  "scripts": {
+    "dev": "next dev",
+    "server": "python src/scripts/server.py",
+    "start": "concurrently \"npm run server\" \"npm run dev\""
+  }
+}
+```
+
+Then you can start the entire application with:
+
+```bash
+npm run start
+```
+
+## Conclusion
+
+You've now built a real-time news intelligence chatbot that combines Retrieval Augmented Generation (RAG), powered by Pinecone and OpenAI, with news data streamed through Confluent. This application demonstrates how to integrate streaming ingestion, vector search, and a language model into a single chatbot.
diff --git a/confluent-rss-newsbot/next-steps.md b/confluent-rss-newsbot/next-steps.md
@@ -0,0 +1,93 @@
+# Next Steps after `azd init`
+
+## Table of Contents
+
+1. [Next Steps](#next-steps)
+2. [What was added](#what-was-added)
+3. [Billing](#billing)
+4. [Troubleshooting](#troubleshooting)
+
+## Next Steps
+
+### Provision infrastructure and deploy application code
+
+Run `azd up` to provision your infrastructure and deploy to Azure (or run `azd provision` then `azd deploy` to accomplish the tasks separately). Visit the service endpoints listed to see your application up-and-running!
+
+To troubleshoot any issues, see [troubleshooting](#troubleshooting).
+
+### Configure environment variables for running services
+
+Configure environment variables for running services by updating `settings` in [main.parameters.json](./infra/main.parameters.json).
+
+### Configure CI/CD pipeline
+
+1. Create a workflow pipeline file locally. The following starters are available:
+   - [Deploy with GitHub Actions](https://github.com/Azure-Samples/azd-starter-bicep/blob/main/.github/workflows/azure-dev.yml)
+   - [Deploy with Azure Pipelines](https://github.com/Azure-Samples/azd-starter-bicep/blob/main/.azdo/pipelines/azure-dev.yml)
+2. Run `azd pipeline config` to configure the deployment pipeline to connect securely to Azure.
+
+## What was added
+
+### Infrastructure configuration
+
+To describe the infrastructure and application, `azure.yaml` along with Infrastructure as Code files using Bicep were added with the following directory structure:
+
+```yaml
+- azure.yaml     # azd project configuration
+- infra/         # Infrastructure as Code (bicep) files
+  - main.bicep   # main deployment module
+  - app/         # Application resource modules
+  - shared/      # Shared resource modules
+  - modules/     # Library modules
+```
+
+Each bicep file declares resources to be provisioned. The resources are provisioned when running `azd up` or `azd provision`.
+
+- [app/pinecone-rag-demo.bicep](./infra/app/pinecone-rag-demo.bicep) - Azure Container Apps resources to host the 'pinecone-rag-demo' service.
+- [shared/keyvault.bicep](./infra/shared/keyvault.bicep) - Azure KeyVault to store secrets.
+- [shared/monitoring.bicep](./infra/shared/monitoring.bicep) - Azure Log Analytics workspace and Application Insights to log and store instrumentation logs.
+- [shared/registry.bicep](./infra/shared/registry.bicep) - Azure Container Registry to store docker images.
+
+For more information, see the [Bicep](https://aka.ms/bicep) language documentation.
+
+### Build from source (no Dockerfile)
+
+#### Build with Buildpacks using Oryx
+
+If your project does not contain a Dockerfile, [Buildpacks](https://buildpacks.io/) with [Oryx](https://github.com/microsoft/Oryx/blob/main/doc/README.md) are used to create an image for the services in `azure.yaml` and get your containerized app onto Azure.
+
+To produce and run the docker image locally:
+
+1. Run `azd package` to build the image.
+2. Copy the *Image Tag* shown.
+3. Run `docker run -it <Image Tag>` to run the image locally.
+
+#### Exposed port
+
+Oryx will automatically set `PORT` to a default value of `80` (port `8080` for Java). Additionally, it will auto-configure supported web servers such as `gunicorn` and ASP.NET Core to listen on the target `PORT`. If your application already listens on the port specified by the `PORT` variable, the application will work out of the box. Otherwise, you may need to perform one of the steps below:
+
+1. Update your application code or configuration to listen on the port specified by the `PORT` variable.
+2. (Alternatively) Search for `targetPort` in a .bicep file under the `infra/app` folder, and update the variable to match the port used by the application.
+
+## Billing
+
+Visit the *Cost Management + Billing* page in Azure Portal to track current spend. For more information about how you're billed, and how you can monitor the costs incurred in your Azure subscriptions, visit [billing overview](https://learn.microsoft.com/azure/developer/intro/azure-developer-billing).
+
+## Troubleshooting
+
+Q: I visited the service endpoint listed, and I'm seeing a blank page, a generic welcome page, or an error page.
+
+A: Your service may have failed to start, or it may be missing some configuration settings. To investigate further:
+
+1. Run `azd show`. Click on the link under "View in Azure Portal" to open the resource group in Azure Portal.
+2. Navigate to the specific Container App service that is failing to deploy.
+3. Click on the failing revision under "Revisions with Issues".
+4. Review "Status details" for more information about the type of failure.
+5. Observe the log outputs from Console log stream and System log stream to identify any errors.
+6. If logs are written to disk, use *Console* in the navigation to connect to a shell within the running container.
+
+For more troubleshooting information, visit [Container Apps troubleshooting](https://learn.microsoft.com/azure/container-apps/troubleshooting).
+
+### Additional information
+
+For additional information about setting up your `azd` project, visit our official [docs](https://learn.microsoft.com/azure/developer/azure-developer-cli/make-azd-compatible?pivots=azd-convert).
diff --git a/confluent-rss-newsbot/next.config.js b/confluent-rss-newsbot/next.config.js
@@ -0,0 +1,19 @@
+/** @type {import('next').NextConfig} */
+const nextConfig = {
+  webpack: (config, { isServer }) => {
+    if (!isServer) {
+      config.resolve.fallback = {
+        ...config.resolve.fallback,
+        net: false,
+        tls: false,
+        fs: false,
+        child_process: false,
+        crypto: require.resolve('crypto-browserify'),
+        path: false,
+      };
+    }
+    return config;
+  },
+};
+
+module.exports = nextConfig;