update indexing notebook (#73)
Updated indexing notebook. 

For now we're using the version from this
[PR](https://github.com/ml6team/fondant/pulls) to allow using reusable
components in notebooks.

* Converted the custom cleaning component and chunking component to
lightweight components.
* Added TODOs to change after having a stable release 
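
The converted components themselves are not shown on this page. As a rough illustration only, the kind of logic a cleaning step and a chunking step typically wrap might look like the sketch below. The function names `clean_text` and `chunk_text`, their parameters, and the whitespace-collapsing behaviour are assumptions made for illustration; they are not taken from this commit or from the Fondant API.

```python
# Hypothetical sketch: plain-Python versions of a cleaning and a chunking step.
# Names, parameters, and behaviour are illustrative assumptions, not the
# actual components changed in this commit.


def clean_text(text: str) -> str:
    """Collapse newlines and repeated whitespace into single spaces."""
    return " ".join(text.split())


def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into character chunks of at most chunk_size characters,
    where consecutive chunks share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop early enough that the last chunk is not fully contained
    # in the one before it.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i : i + chunk_size] for i in range(0, last_start, step)]


if __name__ == "__main__":
    cleaned = clean_text("A simple\n\nRAG   indexing\npipeline")
    print(cleaned)
    print(chunk_text(cleaned, chunk_size=12, chunk_overlap=4))
```

In a lightweight component, functions like these would sit inside the component's transform step, so the notebook can define them inline instead of relying on a separate Dockerfile and `fondant_component.yaml`.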


Future Todos: 
* There is an error when using the custom cleaning component, related to
the way the component text is parsed (related to the presence of `\n`).
Requires further investigation. (This PR)
* Change write index components to lightweight component (This PR)
* Switch over the two remaining notebooks to lightweight components as
well (future PRs)

---------

Co-authored-by: Matthias Richter <[email protected]>
PhilippeMoussalli and mrchtr authored Feb 9, 2024
1 parent ff5aa21 commit f6c4e5d
Showing 18 changed files with 3,970 additions and 2,057 deletions.
21 changes: 3 additions & 18 deletions README.md
@@ -22,28 +22,13 @@ Check out the Fondant [website](https://fondant.ai/) if you want to learn more a

### A simple RAG indexing pipeline

-A [**notebook**](./src/pipeline.ipynb) with a simple Fondant pipeline to index your data into a
+A [**notebook**](./src/indexing.ipynb) with a simple Fondant pipeline to index your data into a
RAG system.

### Iterative tuning of a RAG indexing pipeline

A [**notebook**](./src/evaluation.ipynb) which iteratively runs a Fondant
-[indexing pipeline](./src/pipeline_index.py) and [evaluation pipeline](./src/pipeline_eval.py) with
-different parameters for comparison. You can inspect the data between every step to make
-informed choices on which parameters to try.
-
-### Auto-tuning of a RAG indexing pipeline
-
-<p>
-A <a href="./src/parameter_search.ipynb"><b>notebook</b></a> which allows you to automatically search for the
-optimal parameter settings using different methods
-</p>
-<br>
-<p align="center">
-<a href="./src/parameter_search.ipynb">
-<img src="./art/iteration.png" width="800px"/>
-</a>
-</p>
+pipeline to evaluate a RAG system using [RAGAS](https://github.com/explodinggradients/ragas/tree/main/src/ragas).

## Getting started

@@ -84,4 +69,4 @@ fondant --help
There are two options to run the pipeline:

- [**Via python files and the Fondant CLI:**](https://fondant.ai/en/latest/pipeline/#running-a-pipeline) how you should run Fondant in production
-- [**Via a Jupyter notebook**](./src/pipeline.ipynb): ideal to learn about Fondant
+- [**Via a Jupyter notebook**](./src/indexing.ipynb): ideal to learn about Fondant
4 changes: 2 additions & 2 deletions requirements.txt
@@ -1,3 +1,3 @@
-fondant==0.9.0
+fondant[component,aws,azure,gcp,docker]==0.10.1
notebook==7.0.6
-weaviate-client==3.25.3
+weaviate-client==3.25.3
18 changes: 0 additions & 18 deletions src/components/aggregate_eval_results/Dockerfile

This file was deleted.

16 changes: 0 additions & 16 deletions src/components/aggregate_eval_results/fondant_component.yaml

This file was deleted.

1 change: 0 additions & 1 deletion src/components/aggregate_eval_results/requirements.txt

This file was deleted.

16 changes: 0 additions & 16 deletions src/components/aggregate_eval_results/src/main.py

This file was deleted.

13 changes: 0 additions & 13 deletions src/components/text_cleaning/Dockerfile

This file was deleted.

11 changes: 0 additions & 11 deletions src/components/text_cleaning/fondant_component.yaml

This file was deleted.

1 change: 0 additions & 1 deletion src/components/text_cleaning/requirements.txt

This file was deleted.

18 changes: 0 additions & 18 deletions src/components/text_cleaning/src/main.py

This file was deleted.
