18 changes: 18 additions & 0 deletions confluent-rss-newsbot/.env.example
@@ -0,0 +1,18 @@
# By default, fetch news from the past 24 hours (value in seconds)
LATEST_NEWS_TIMEFRAME=86400

# Use https://platform.openai.com/ to get an API key
OPENAI_API_KEY=

# Use https://app.pinecone.io to get an API key
PINECONE_API_KEY=

# Navigate to Indexes under your Project to retrieve the Index name
PINECONE_INDEX=

# Retrieve the following from the Confluent Cloud Console.
CONFLUENT_BOOTSTRAP_SERVERS=
CONFLUENT_API_KEY=
CONFLUENT_API_SECRET=
CONFLUENT_INPUT_TOPIC=
CONFLUENT_OUTPUT_TOPIC=
3 changes: 3 additions & 0 deletions confluent-rss-newsbot/.eslintrc.json
@@ -0,0 +1,3 @@
{
  "extends": "next/core-web-vitals"
}
54 changes: 54 additions & 0 deletions confluent-rss-newsbot/.gitignore
@@ -0,0 +1,54 @@
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.

# dependencies
/node_modules
/.pnp
.pnp.js

# testing
/coverage

# next.js
/.next/
/out/

# production
/build

# misc
.DS_Store
*.pem
*.swp

# debug
npm-debug.log*
yarn-debug.log*
yarn-error.log*
/logs
nohup.out
output.txt

# Python
.venv/

# local env files
.env
.env*.local
.env*.development
client.properties

# vercel
.vercel

# typescript
*.tsbuildinfo
next-env.d.ts
.env

#pnpm
pnpm-lock.yaml
/test-results/
/playwright-report/
/playwright/.cache/
.azure

69 changes: 69 additions & 0 deletions confluent-rss-newsbot/README.md
@@ -0,0 +1,69 @@
# Building a Real-Time News Intelligence Chatbot with Confluent, Pinecone, and Azure OpenAI

In this example, we'll build a full-stack application that uses Retrieval Augmented Generation (RAG) powered by [Pinecone](https://pinecone.io) and streaming data from [Confluent](https://www.confluent.io/) to deliver accurate and contextually relevant responses about current news events in a chatbot.

Our application will:

1. Poll a given news source (Reddit news subreddits by default) to retrieve the latest news stories via RSS.
2. Stream these stories to Confluent for real-time processing.
3. Use OpenAI to generate embeddings for the news articles, which are then stored in Pinecone.
4. Provide a chat mechanism to query about the latest news and events in the world, using the stored embeddings to provide contextually relevant responses.
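
The polling step can be sketched roughly as follows. This is a minimal illustration, not the repository's poller: it assumes an RSS 2.0 feed with `item` and `pubDate` elements (some feeds use Atom instead), the function name `recent_stories` and the sample feed are hypothetical, and the cutoff mirrors the `LATEST_NEWS_TIMEFRAME` setting from `.env.example`.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Mirrors LATEST_NEWS_TIMEFRAME in .env.example (seconds in 24 hours).
LATEST_NEWS_TIMEFRAME = 86400


def recent_stories(rss_xml, now, timeframe_seconds=LATEST_NEWS_TIMEFRAME):
    """Return (title, link) pairs for RSS items published within the timeframe."""
    cutoff = now - timedelta(seconds=timeframe_seconds)
    stories = []
    for item in ET.fromstring(rss_xml).iter("item"):
        # pubDate uses the RFC 822 date format, which email.utils can parse.
        published = parsedate_to_datetime(item.findtext("pubDate"))
        if published >= cutoff:
            stories.append((item.findtext("title"), item.findtext("link")))
    return stories


# Hypothetical feed used only for illustration.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <item><title>Fresh story</title><link>https://example.com/fresh</link>
    <pubDate>Tue, 02 Jan 2024 09:00:00 +0000</pubDate></item>
  <item><title>Stale story</title><link>https://example.com/stale</link>
    <pubDate>Sat, 30 Dec 2023 09:00:00 +0000</pubDate></item>
</channel></rss>"""

now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
print(recent_stories(SAMPLE_FEED, now))
# → [('Fresh story', 'https://example.com/fresh')]
```

Passing `now` in explicitly keeps the filter deterministic, which makes it easy to test without depending on the wall clock.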

By the end of this tutorial, you'll have a real-time news intelligence chatbot that provides accurate responses about current events, ensuring a more effective and engaging user experience.
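
To make the retrieval half of RAG concrete, here is a small in-memory stand-in for the similarity search that a vector database performs. It is a sketch under stated assumptions: it does not call the Pinecone or OpenAI APIs, the helper names and toy vectors are hypothetical, and cosine similarity is assumed as the ranking metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, stored, k=2):
    """Rank stored (id, vector) pairs by similarity to the query and return the best k ids."""
    ranked = sorted(
        stored,
        key=lambda pair: cosine_similarity(query_vector, pair[1]),
        reverse=True,
    )
    return [story_id for story_id, _ in ranked[:k]]


# Toy 3-dimensional "embeddings"; real embedding vectors have far more dimensions.
stored = [
    ("election-results", [0.9, 0.1, 0.0]),
    ("storm-warning", [0.0, 0.2, 0.9]),
    ("budget-vote", [0.8, 0.3, 0.1]),
]
print(top_k([1.0, 0.2, 0.0], stored))
# → ['election-results', 'budget-vote']
```

In the full application, the text of the top-ranked stories would be added to the chat prompt so the model answers from current news rather than from its training data alone.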

## Step 1: Setting Up Your Next.js Application

First, clone the repository and install the Python and Node.js dependencies:

```bash
git clone git@github.com:pinecone-field/confluent-demo.git
cd confluent-demo
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
npm install
```

## Step 2: Configure Environment Variables

Create a `.env.local` file in your project root (`.env.example` lists the available variables) and add the following:

```bash
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_INDEX=your_pinecone_index_name
PINECONE_NAMESPACE=your_pinecone_namespace
OPENAI_API_KEY=your_azure_openai_api_key
CONFLUENT_BOOTSTRAP_SERVERS=your_confluent_bootstrap_servers
CONFLUENT_API_KEY=your_confluent_api_key
CONFLUENT_API_SECRET=your_confluent_api_secret
```
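
As an illustration of how the Confluent settings above might be consumed, the sketch below maps them onto the standard librdkafka client properties used for SASL/PLAIN authentication over TLS. Whether this repository's scripts build their configuration this way is an assumption; the function name `kafka_client_config` and the sample values are hypothetical.

```python
REQUIRED_SETTINGS = (
    "CONFLUENT_BOOTSTRAP_SERVERS",
    "CONFLUENT_API_KEY",
    "CONFLUENT_API_SECRET",
)


def kafka_client_config(env):
    """Build a librdkafka-style config dict from an environment-like mapping."""
    # Fail fast with a readable message instead of a connection error later on.
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return {
        "bootstrap.servers": env["CONFLUENT_BOOTSTRAP_SERVERS"],
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",
        "sasl.username": env["CONFLUENT_API_KEY"],
        "sasl.password": env["CONFLUENT_API_SECRET"],
    }


# Placeholder values for illustration only; never hard-code real credentials.
example_env = {
    "CONFLUENT_BOOTSTRAP_SERVERS": "broker.example.com:9092",
    "CONFLUENT_API_KEY": "example-key",
    "CONFLUENT_API_SECRET": "example-secret",
}
print(kafka_client_config(example_env)["bootstrap.servers"])
# → broker.example.com:9092
```

In a real script you would pass `os.environ` (after loading `.env.local`) instead of `example_env`, and hand the resulting dictionary to your Kafka producer or consumer.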

## Step 3: Running the Application

The application consists of three processes:

1. The news poller

2. The news processor

3. The Next.js development server

You can create a custom script in your `package.json` to run all of these concurrently, or you can run them in separate terminal windows.

```json
{
  "scripts": {
    "dev": "next dev",
    "server": "python src/scripts/server.py",
    "start": "concurrently \"npm run server\" \"npm run dev\""
  }
}
```

Then you can start the entire application with:

```bash
npm run start
```

## Conclusion

You've now built a real-time news intelligence chatbot that uses Retrieval Augmented Generation (RAG) powered by Pinecone and OpenAI, using Confluent to stream news data. This application demonstrates how to integrate multiple technologies to create a powerful and contextually relevant chatbot.
93 changes: 93 additions & 0 deletions confluent-rss-newsbot/next-steps.md
@@ -0,0 +1,93 @@
# Next Steps after `azd init`

## Table of Contents

1. [Next Steps](#next-steps)
2. [What was added](#what-was-added)
3. [Billing](#billing)
4. [Troubleshooting](#troubleshooting)

## Next Steps

### Provision infrastructure and deploy application code

Run `azd up` to provision your infrastructure and deploy to Azure (or run `azd provision` then `azd deploy` to accomplish the tasks separately). Visit the service endpoints listed to see your application up and running!

To troubleshoot any issues, see [troubleshooting](#troubleshooting).

### Configure environment variables for running services

Configure environment variables for running services by updating `settings` in [main.parameters.json](./infra/main.parameters.json).

### Configure CI/CD pipeline

1. Create a workflow pipeline file locally. The following starters are available:
   - [Deploy with GitHub Actions](https://github.com/Azure-Samples/azd-starter-bicep/blob/main/.github/workflows/azure-dev.yml)
   - [Deploy with Azure Pipelines](https://github.com/Azure-Samples/azd-starter-bicep/blob/main/.azdo/pipelines/azure-dev.yml)
2. Run `azd pipeline config` to configure the deployment pipeline to connect securely to Azure.

## What was added

### Infrastructure configuration

To describe the infrastructure and application, `azure.yaml` along with Infrastructure as Code files using Bicep were added with the following directory structure:

```yaml
- azure.yaml        # azd project configuration
- infra/            # Infrastructure as Code (bicep) files
  - main.bicep      # main deployment module
  - app/            # Application resource modules
  - shared/         # Shared resource modules
  - modules/        # Library modules
```

Each bicep file declares resources to be provisioned. The resources are provisioned when running `azd up` or `azd provision`.

- [app/pinecone-rag-demo.bicep](./infra/app/pinecone-rag-demo.bicep) - Azure Container Apps resources to host the 'pinecone-rag-demo' service.
- [shared/keyvault.bicep](./infra/shared/keyvault.bicep) - Azure Key Vault to store secrets.
- [shared/monitoring.bicep](./infra/shared/monitoring.bicep) - Azure Log Analytics workspace and Application Insights to collect and store instrumentation logs.
- [shared/registry.bicep](./infra/shared/registry.bicep) - Azure Container Registry to store Docker images.

For more information, see the [Bicep](https://aka.ms/bicep) language documentation.

### Build from source (no Dockerfile)

#### Build with Buildpacks using Oryx

If your project does not contain a Dockerfile, `azd` uses [Buildpacks](https://buildpacks.io/) with [Oryx](https://github.com/microsoft/Oryx/blob/main/doc/README.md) to create an image for the services in `azure.yaml` and get your containerized app onto Azure.

To build and run the Docker image locally:

1. Run `azd package` to build the image.
2. Copy the *Image Tag* shown.
3. Run `docker run -it <Image Tag>` to run the image locally.

#### Exposed port

Oryx will automatically set `PORT` to a default value of `80` (port `8080` for Java). Additionally, it will auto-configure supported web servers such as `gunicorn` and `ASP.NET Core` to listen on the target `PORT`. If your application already listens on the port specified by the `PORT` variable, the application will work out of the box. Otherwise, you may need to perform one of the steps below:

1. Update your application code or configuration to listen on the port specified by the `PORT` variable.
1. (Alternatively) Search for `targetPort` in a .bicep file under the `infra/app` folder, and update the variable to match the port used by the application.
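
For the first option, the application only needs to read `PORT` at startup and bind to that value. A minimal sketch of the idea, assuming a Python service (the helper name `resolve_port` is hypothetical, and the fallback of `80` matches the Oryx default described above):

```python
import os


def resolve_port(env=None, default=80):
    """Return the port to listen on: the PORT variable if set, otherwise the default."""
    env = os.environ if env is None else env
    return int(env.get("PORT", default))


print(resolve_port({"PORT": "3000"}))  # → 3000
print(resolve_port({}))                # → 80
```

The returned value would then be passed to whatever server the service starts, so the same code works both locally and inside the container.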

## Billing

Visit the *Cost Management + Billing* page in Azure Portal to track current spend. For more information about how you're billed, and how you can monitor the costs incurred in your Azure subscriptions, visit [billing overview](https://learn.microsoft.com/azure/developer/intro/azure-developer-billing).

## Troubleshooting

Q: I visited the service endpoint listed, and I'm seeing a blank page, a generic welcome page, or an error page.

A: Your service may have failed to start, or it may be missing some configuration settings. To investigate further:

1. Run `azd show`. Click on the link under "View in Azure Portal" to open the resource group in Azure Portal.
2. Navigate to the specific Container App service that is failing to deploy.
3. Click on the failing revision under "Revisions with Issues".
4. Review "Status details" for more information about the type of failure.
5. Observe the log outputs from Console log stream and System log stream to identify any errors.
6. If logs are written to disk, use *Console* in the navigation to connect to a shell within the running container.

For more troubleshooting information, visit [Container Apps troubleshooting](https://learn.microsoft.com/azure/container-apps/troubleshooting).

### Additional information

For additional information about setting up your `azd` project, visit our official [docs](https://learn.microsoft.com/azure/developer/azure-developer-cli/make-azd-compatible?pivots=azd-convert).
19 changes: 19 additions & 0 deletions confluent-rss-newsbot/next.config.js
@@ -0,0 +1,19 @@
/** @type {import('next').NextConfig} */
const nextConfig = {
  webpack: (config, { isServer }) => {
    if (!isServer) {
      config.resolve.fallback = {
        ...config.resolve.fallback,
        net: false,
        tls: false,
        fs: false,
        child_process: false,
        crypto: require.resolve('crypto-browserify'),
        path: false,
      };
    }
    return config;
  },
};

module.exports = nextConfig;