
CorrelAid/h4sg25_cdl_challenge


CDL @ Hack4SocialGood 2025: Extracting Funding Amounts from State Funding Programs

  • Goal: develop a method to automatically extract funding amounts from funding programs.
  • The problem can be framed as a Named Entity Recognition (NER) or information extraction task.
  • The data consists of funding programs listed in the German federal government's "Förderdatenbank" (funding database). These have been scraped and published here.
  • Background: the project attempts to quantify how much the German government spends on promoting democracy. As a first step, a classifier was developed to identify democracy funding programs. The next step is extracting the funding amounts. An article on the project (in German) is available here.

Data

  • The data originates from the website: www.foerderdatenbank.de
  • A description of the scraped dataset, as well as the link to the data, can be found here.
  • An example of how to read the data with Python is available here.

Possible Approaches

  • NER using the Python package spaCy.
  • Fine-tuning language models like BERT.
  • In-context learning with generative LLMs.
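
Before investing in any of the approaches above, a rule-based baseline can give a reference point to compare against. The following is a minimal sketch of such a baseline, not one of the listed approaches: it uses a regular expression to find euro amounts written in German number notation. The function name extract_amounts, the pattern, and the list of magnitude words are assumptions made for illustration; real program texts will contain formats this pattern misses (percentages, ranges, amounts written out in words).

```python
import re

# Assumed multipliers for German magnitude words; extend as needed.
_MULTIPLIERS = {
    "million": 1_000_000,
    "millionen": 1_000_000,
    "mio": 1_000_000,
    "milliarde": 1_000_000_000,
    "milliarden": 1_000_000_000,
    "mrd": 1_000_000_000,
}

# A number in German notation (e.g. 1.000.000,50), an optional magnitude
# word, then a euro marker. This pattern is an illustrative assumption.
_AMOUNT_RE = re.compile(
    r"(?P<number>\d{1,3}(?:\.\d{3})+(?:,\d+)?|\d+(?:,\d+)?)"
    r"\s*(?P<mult>Millionen|Million|Mio\.?|Milliarden|Milliarde|Mrd\.?)?"
    r"\s*(?:Euro|EUR|€)",
    re.IGNORECASE,
)


def extract_amounts(text: str) -> list[float]:
    """Return all euro amounts found in `text`, in order of appearance."""
    amounts = []
    for match in _AMOUNT_RE.finditer(text):
        # German notation: "." separates thousands, "," marks decimals.
        number = match.group("number").replace(".", "").replace(",", ".")
        value = float(number)
        mult = match.group("mult")
        if mult:
            value *= _MULTIPLIERS[mult.lower().rstrip(".")]
        amounts.append(value)
    return amounts


example = "Die Förderung beträgt bis zu 50.000 Euro, maximal 1,5 Mio. EUR pro Vorhaben."
print(extract_amounts(example))  # [50000.0, 1500000.0]
```

A baseline like this is cheap to run over the whole dataset, and its errors show which cases a learned model actually needs to handle.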

Important Considerations

  • The method should be evaluated using suitable metrics such as the F1 score or accuracy.
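
As an illustration of the evaluation, the sketch below computes micro-averaged precision, recall, and F1 over extracted amounts. It assumes that each document has a hand-labeled list of gold amounts and that a prediction counts as correct only when it matches a gold amount exactly; both the function name precision_recall_f1 and this matching rule are assumptions, and a looser rule (for example a tolerance on the value) may suit the data better.

```python
def precision_recall_f1(predicted, gold):
    """Micro-averaged precision, recall and F1 over per-document amounts.

    `predicted` and `gold` are lists with one entry per document; each
    entry is a collection of amounts. Matching is exact (an assumption).
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same documents")

    tp = fp = fn = 0
    for pred, true in zip(predicted, gold):
        pred_set, true_set = set(pred), set(true)
        tp += len(pred_set & true_set)   # amounts found and correct
        fp += len(pred_set - true_set)   # amounts found but not in gold
        fn += len(true_set - pred_set)   # gold amounts that were missed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical predictions and labels for two documents.
predicted = [[50000.0, 1500000.0], [100.0]]
gold = [[50000.0], [100.0, 200.0]]
print(precision_recall_f1(predicted, gold))
```

In this example there are two correct amounts, one spurious amount, and one missed amount, so precision, recall, and F1 all come out at 2/3.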

Setup

  1. Install uv

  2. Run uv sync in the repository root to install the dependencies.

New Data

New Plan

  1. Extract the list of NGOs and the amount of money each received from the government in 2023.
  2. Compare it to the list of NGOs.
  3. Calculate what percentage of the total amount went to organizations whose goal is strengthening democracy.
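
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the funding data is taken to be a mapping from NGO name to the amount received in 2023, the comparison list is assumed to contain the organizations that aim to strengthen democracy, and names are matched after simple normalization. The function name democracy_share and the example organizations are hypothetical.

```python
def _normalize(name: str) -> str:
    """Normalize an organization name for matching (assumed rule)."""
    return name.strip().casefold()


def democracy_share(funding_by_ngo: dict[str, float],
                    democracy_ngos: set[str]) -> float:
    """Percentage of total funding that went to democracy organizations."""
    total = sum(funding_by_ngo.values())
    if total == 0:
        return 0.0
    wanted = {_normalize(name) for name in democracy_ngos}
    democracy_total = sum(
        amount
        for name, amount in funding_by_ngo.items()
        if _normalize(name) in wanted
    )
    return 100 * democracy_total / total


# Hypothetical data: amounts received in 2023, in euros.
funding_2023 = {"Org A": 300_000, "Org B": 500_000, "Org C": 200_000}
democracy_orgs = {"org a", "Org C "}
print(democracy_share(funding_2023, democracy_orgs))  # 50.0
```

Exact name matching is the weakest part of this sketch. Organization names often differ between sources (legal suffixes, abbreviations), so the comparison step may need fuzzy matching or manual review.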

All presentation data

Go directly to the presentation here, or to the entire folder here.
