Hackernews data source #1378

Open
wants to merge 5 commits into
base: staging
from

Conversation

@nekronos-gh nekronos-gh commented Nov 21, 2023

feat: Add HackerNews Data Source Integration and Documentation

This pull request introduces a new data source integration for HackerNews in EvaDB. The integration allows users to connect to HackerNews and retrieve data from various tables such as "items," "users," "top_stories," "new_stories," "best_stories," "ask_stories," "show_stories," "job_stories," and "updates."

Changes Made:

  1. Data Source Integration:

    • Added a new directory evadb/third_party/databases/hackernews/ containing the necessary files: __init__.py, requirements.txt, and hackernews_handler.py.
    • Implemented the HackerNewsHandler class in hackernews_handler.py that inherits from DBHandler and provides methods for connecting, disconnecting, checking connection, and retrieving data.
  2. Configuration Parameters:

    • Configuration parameters such as max_item for limiting rows, a list of tables, and the definition of columns for each table are provided during initialization.
  3. Documentation:

    • Created hackernews.rst under evadb/docs/source/reference/databases with information on the HackerNews data source integration. Updated _toc.yml to include the new documentation.
    • Documented each table's purpose and the type of information they store.
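As an illustration of the handler shape described in the list above, here is a minimal, self-contained sketch. It is not the code from this PR: the class name `HackerNewsHandlerSketch`, the `select` method, the `fetch` parameter, and the table-to-endpoint mapping are all hypothetical, and the class does not inherit from EvaDB's real `DBHandler`. It assumes the public Hacker News Firebase API and covers only the story-list tables.

```python
import json
from typing import Any, Callable, Dict, List
from urllib.request import urlopen

# Base URL of the public Hacker News API.
HN_API = "https://hacker-news.firebaseio.com/v0"

# Hypothetical mapping from the table names in this PR to API endpoints.
STORY_ENDPOINTS: Dict[str, str] = {
    "top_stories": "topstories",
    "new_stories": "newstories",
    "best_stories": "beststories",
    "ask_stories": "askstories",
    "show_stories": "showstories",
    "job_stories": "jobstories",
}


def fetch_json(path: str) -> Any:
    """Fetch one JSON document from the Hacker News API."""
    with urlopen(f"{HN_API}/{path}.json", timeout=10) as resp:
        return json.load(resp)


class HackerNewsHandlerSketch:
    """Illustrative stand-in for a handler; not the class added in this PR."""

    def __init__(self, max_item: int = 10,
                 fetch: Callable[[str], Any] = fetch_json):
        self.max_item = max_item  # upper bound on rows returned per table
        self._fetch = fetch       # injectable so the sketch can run offline
        self.connected = False

    def connect(self) -> None:
        # The API needs no credentials; one cheap request checks reachability.
        self._fetch("maxitem")
        self.connected = True

    def disconnect(self) -> None:
        self.connected = False

    def check_connection(self) -> bool:
        return self.connected

    def get_tables(self) -> List[str]:
        return sorted(STORY_ENDPOINTS)

    def select(self, table: str) -> List[dict]:
        """Return up to max_item rows (item records) for a story table."""
        if not self.connected:
            raise RuntimeError("call connect() first")
        ids = self._fetch(STORY_ENDPOINTS[table])[: self.max_item]
        return [self._fetch(f"item/{item_id}") for item_id in ids]
```

Passing a fake `fetch` callable lets the row limit be exercised without network access, which may be convenient in CI.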

This commit introduces a new data source integration for HackerNews in EvaDB. The integration allows users to connect to HackerNews and retrieve data from various tables such as "items," "users," "top_stories," "new_stories," "best_stories," "ask_stories," "show_stories," "job_stories," and "updates."
…ckerNews data source integration. The documentation provides a brief description of each table, outlining their purpose and the type of information they store. The tables covered include: ``items``, ``users``, ``top_stories``, ``new_stories``, ``best_stories``, ``ask_stories``, ``show_stories``, ``job_stories``, and ``updates``.

1. Added table descriptions in reStructuredText (rst) format under `evadb/docs/source/reference/databases/hackernews.rst`.
2. Updated the table of contents (`evadb/docs/_toc.yml`) to include the new HackerNews documentation.
Contributor

@github-actions github-actions bot left a comment

👋 Hello @Nekronos-SPN, thanks for submitting an EVA DB PR 🙏 To allow your work to be integrated as seamlessly as possible, we advise you to:

  • ✅ Verify that your PR is up-to-date with the georgia-tech-db/eva master branch. If your PR is behind, you can update your code by clicking the 'Update branch' button or by running git pull and git merge master locally.
  • ✅ Verify that all EVA DB Continuous Integration (CI) checks are passing.
  • ✅ Reduce changes to the absolute minimum required for your bug fix or feature addition.

@xzdandy
Collaborator

xzdandy commented Nov 21, 2023

There are conflicts from #1362. Please fix. Thanks!

@nekronos-gh
Author

Hello @kaushikravichandran!
I am having trouble fixing the errors reported by CircleCI.

The long integration test returns the following error, even though the library is included in requirements.txt:
ModuleNotFoundError: No module named 'requests_html'

The linter test reports the following:
Code was reformatted or you have unstaged changes.

Could you give me a hand with this one?

3 participants