Upgrade Core part1 (#430)
* it builds

* somewhat works

* update of manage.py

* minor improvements

* migration fixes

* progress bar for CVE, CPE, CWE CLI import

* - fix: missing 1 required positional argument: 'id' when creating a Collector
- move modules configuration to shared

* docstring update, remove methods docstrings

* fix collector registration

* show warnings for deprecation of SQLAlchemy

---------

Co-authored-by: Jan Polonsky <[email protected]>
multiflexi and Jan Polonsky authored Dec 12, 2024
1 parent f133e32 commit b95077a
Showing 73 changed files with 3,918 additions and 3,333 deletions.
16 changes: 9 additions & 7 deletions .github/dependabot.yml
@@ -19,19 +19,21 @@ updates:
interval: "weekly"
ignore:
- dependency-name: "slackclient"
# - package-ecosystem: "pip"
# directory: "/src/core"
# schedule:
# interval: "weekly"
- package-ecosystem: "pip"
- package-ecosystem: "pip"
directory: "/src/core"
schedule:
interval: "weekly"
ignore:
- dependency-name: "SQLAlchemy"
- package-ecosystem: "pip"
directory: "/src/presenters"
schedule:
interval: "weekly"
- package-ecosystem: "pip"
- package-ecosystem: "pip"
directory: "/src/publishers"
schedule:
interval: "weekly"
- package-ecosystem: "npm"
- package-ecosystem: "npm"
directory: "/src/gui"
schedule:
interval: "weekly"
22 changes: 12 additions & 10 deletions README.md
@@ -139,8 +139,8 @@ dictionaries and a current list of CVEs preloaded in Taranis NG.

2. Upload the dictionary to the proper path, and import into the database
```bash
gzcat official-cpe-dictionary_v2.3.xml.gz | \
docker exec -i taranis-ng_core_1 python manage.py dictionary --upload-cpe
zcat official-cpe-dictionary_v2.3.xml.gz | \
docker exec -i taranis-ng-core-1 python manage.py dictionary --upload-cpe
```

3. Download the official CVE list from
@@ -149,8 +149,8 @@ in xml.gz format.

4. Upload the dictionary to the proper path, and import into the database
```bash
gzcat allitems.xml.gz | \
docker exec -i taranis-ng_core_1 python manage.py dictionary --upload-cve
zcat allitems.xml.gz | \
docker exec -i taranis-ng-core-1 python manage.py dictionary --upload-cve
```

5. Download the official CWE list from
@@ -159,10 +159,12 @@ in xml.zip format.

6. Upload the dictionary to the proper path, and import into the database
```bash
gzcat cwec_latest.xml.zip | \
docker exec -i taranis-ng_core_1 python manage.py dictionary --upload-cwe
zcat cwec_latest.xml.zip | \
docker exec -i taranis-ng-core-1 python manage.py dictionary --upload-cwe
```

Some Linux distributions might provide gzcat instead of zcat.

### Using the default stop lists for better tag cloud

1. Visit Configuration -> Word Lists.
@@ -177,12 +179,12 @@ gzcat cwec_latest.xml.zip | \
## About...

This project was inspired by [Taranis3](https://github.com/NCSC-NL/taranis3),
a great tool made by NCSC-NL. Currently, NCSC-NL has a new tool for producing advisories,
with a different approach to communicating with the world. There was no funding to maintain or
a great tool made by NCSC-NL. Currently, NCSC-NL has a new tool for producing advisories,
with a different approach to communicating with the world. There was no funding to maintain or
further develop NCSC-NL's Taranis3.

It aims to become a next generation of this category of tools. The project was made in collaboration
with a wide group of European CSIRT teams who are developers and users of Taranis3, and would not be
It aims to become a next generation of this category of tools. The project was made in collaboration
with a wide group of European CSIRT teams who are developers and users of Taranis3, and would not be
possible without their valuable input especially during the requirements collection phase.
The architecture and design of new Taranis NG is a collective brain child of this community.

5 changes: 3 additions & 2 deletions docker/Dockerfile.core
@@ -1,4 +1,4 @@
FROM python:3.10-alpine3.14 AS build_shared
FROM python:3.13-alpine3.20 AS build_shared

WORKDIR /build_shared/

@@ -17,7 +17,7 @@ COPY ./src/core/sse/forward.c .
RUN gcc -o forward forward.c


FROM python:3.10-alpine3.14 AS production
FROM python:3.13-alpine3.20 AS production

WORKDIR /app/

@@ -45,6 +45,7 @@ RUN \
apk add --no-cache --virtual .build-deps \
gcc \
g++ \
git \
build-base\
libc-dev\
zlib-dev \
2 changes: 2 additions & 0 deletions docker/docker-compose.yml
@@ -60,6 +60,8 @@ services:
LDAP_SERVER: "${LDAP_SERVER}"
LDAP_BASE_DN: "${LDAP_BASE_DN}"
LDAP_CA_CERT_PATH:
SQLALCHEMY_WARN_20: "1"
PYTHONWARNINGS: "default"

OPENID_LOGOUT_URL: ""
WORKERS_PER_CORE: "1"
10 changes: 7 additions & 3 deletions docker/prestart_core.sh
@@ -6,14 +6,18 @@ echo "Running sse forward in the background..."
/usr/local/bin/forward --sender-port 5000 --client-port 5001 &

echo "Running migrations..."
python /app/db_migration.py db upgrade head
python db_migration.py
echo "Migrations are done."

if [ "$(python ./manage.py collector --list | wc -l)" == 0 ] && [ x"$SKIP_DEFAULT_COLLECTOR" != "xtrue" ]; then
if [ "$(python ./manage.py collector --list | grep 'Total:' | cut -d ' ' -f2)" == 0 ] && [ x"$SKIP_DEFAULT_COLLECTOR" != "true" ]; then
(
echo "Creating default collector..."
echo "Reading API key from file..."
API_KEY=$(cat "/run/secrets/api_key")
python ./manage.py collector --create --name "Default Docker Collector" --description "A local collector node configured as a part of Taranis NG default installation." --api-url "http://collectors/" --api-key "$API_KEY"
) &
else
echo "Default collector already exists or SKIP_DEFAULT_COLLECTOR is set to true."
fi

echo "Done."
echo "prestart.sh finished."
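
Review note on the new `if` line in `prestart_core.sh`: the left operand is still prefixed (`x"$SKIP_DEFAULT_COLLECTOR"`) while the right operand lost its prefix (`"true"`), so as written the second test can never be false and `SKIP_DEFAULT_COLLECTOR=true` would not skip anything. The Python sketch below models the apparent intent of the check, not the script itself. The function name is hypothetical, and the `Total: <n>` output format of `manage.py collector --list` is an assumption inferred from the `grep 'Total:' | cut -d ' ' -f2` pipeline.

```python
def should_create_default_collector(list_output: str, skip_env: str = "") -> bool:
    """Return True when no collector is registered and skipping was not requested."""
    total = None
    for line in list_output.splitlines():
        if "Total:" in line:  # same filter as: grep 'Total:'
            fields = line.split(" ")  # same split as: cut -d ' ' -f2
            if len(fields) > 1:
                total = fields[1]
            break  # assumption: only one summary line is printed
    # Compare like with like: the raw variable against the raw literal.
    return total == "0" and skip_env != "true"
```

In the shell script, the equivalent would be to keep the `x` prefix on both sides of the comparison or to drop it from both.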
249 changes: 121 additions & 128 deletions src/bots/bots/analyst_bot.py
@@ -1,128 +1,121 @@
"""AnalystBot class."""

import re

from .base_bot import BaseBot
from managers.log_manager import logger
from shared.schema import news_item
from shared.schema.parameter import Parameter, ParameterType
from remote.core_api import CoreApi


class AnalystBot(BaseBot):
"""AnalystBot class.
This class represents a bot for news items analysis.
Attributes:
type (str): The type of the bot.
name (str): The name of the bot.
description (str): The description of the bot.
parameters (list): The list of parameters for the bot.
regexp (list): The list of regular expressions for data analysis.
attr_name (list): The list of attribute names for extracted data.
news_items (list): The list of news items.
news_items_data (list): The list of news items data.
Methods:
execute(preset): Executes the bot with the given preset.
execute_on_event(preset, event_type, data): Executes the bot on an event with the given preset, event type, and data.
"""

type = "ANALYST_BOT"
name = "Analyst Bot"
description = "Bot for news items analysis"

parameters = [
Parameter(0, "SOURCE_GROUP", "Source Group", "OSINT Source group to inspect", ParameterType.STRING),
Parameter(0, "REGULAR_EXPRESSION", "Regular Expression", "Regular expression for data analysis", ParameterType.STRING),
Parameter(0, "ATTRIBUTE_NAME", "Attribute name", "Name of attribute for extracted data", ParameterType.STRING),
]

parameters.extend(BaseBot.parameters)

regexp = []
attr_name = []
news_items = []
news_items_data = []

def execute(self, preset):
"""Execute the analyst bot with the given preset.
Parameters:
preset (Preset): The preset containing the parameter values.
Raises:
Exception: If an error occurs during execution.
"""
try:
source_group = preset.parameter_values["SOURCE_GROUP"] # noqa F841
regexp = preset.parameter_values["REGULAR_EXPRESSION"]
attr_name = preset.parameter_values["ATTRIBUTE_NAME"]
interval = preset.parameter_values["REFRESH_INTERVAL"]

# support for multiple regexps
regexp = regexp.split(";;;")
attr_name = attr_name.split(";;;")
if len(regexp) > len(attr_name):
regexp = regexp[: len(attr_name)]
elif len(attr_name) > len(regexp):
attr_name = attr_name[: len(regexp)]

bots_params = dict(zip(attr_name, regexp))
limit = BaseBot.history(interval)
logger.info(f"{preset.name}: running with date limit {limit}")
news_items_data, code = CoreApi.get_news_items_data(limit)
if code == 200 and news_items_data is not None:
for item in news_items_data:
if item:
news_item_id = item["id"]
title = item["title"]
preview = item["review"]
content = item["content"]

analyzed_text = " ".join([title, preview, content])

attributes = []
for key, value in bots_params.items():
uniq_list = []
# print('Key:', key, 'Regex:', value, flush=True)
for finding in re.finditer(value, analyzed_text):
if len(finding.groups()) > 0:
found_value = finding.group(1)
else:
found_value = finding.group(0)
# print('Found:', found_value, flush=True)
if found_value not in uniq_list:
uniq_list.append(found_value)

# app is checking combination ID + Value in DB before INSERT (attribute_value_identical)
# so check for some duplicity here (faster)
for found_value in uniq_list:
binary_mime_type = ""
binary_value = ""
news_attribute = news_item.NewsItemAttribute(key, found_value, binary_mime_type, binary_value)
attributes.append(news_attribute)

if len(attributes) > 0:
logger.debug(f"Processing item id: {news_item_id}, {item['collected']}, Found: {len(attributes)}")
news_item_attributes_schema = news_item.NewsItemAttributeSchema(many=True)
CoreApi.update_news_item_attributes(news_item_id, news_item_attributes_schema.dump(attributes))

except Exception as error:
BaseBot.print_exception(preset, error)

def execute_on_event(self, preset, event_type, data):
"""Execute the specified preset on the given event.
Parameters:
preset (Preset): The preset to execute.
event_type (str): The type of the event.
data (dict): The data associated with the event.
Raises:
Exception: If there is an error while executing the preset.
"""
try:
source_group = preset.parameter_values["SOURCE_GROUP"] # noqa F841
regexp = preset.parameter_values["REGULAR_EXPRESSION"] # noqa F841
attr_name = preset.parameter_values["ATTRIBUTE_NAME"] # noqa F841

except Exception as error:
BaseBot.print_exception(preset, error)
"""AnalystBot class."""

import re

from .base_bot import BaseBot
from managers.log_manager import logger
from shared.config_bot import ConfigBot
from shared.schema import news_item
from remote.core_api import CoreApi


class AnalystBot(BaseBot):
"""AnalystBot class.
This class represents a bot for news items analysis.
Attributes:
type (str): The type of the bot.
name (str): The name of the bot.
description (str): The description of the bot.
parameters (list): The list of parameters for the bot.
regexp (list): The list of regular expressions for data analysis.
attr_name (list): The list of attribute names for extracted data.
news_items (list): The list of news items.
news_items_data (list): The list of news items data.
Methods:
execute(preset): Executes the bot with the given preset.
execute_on_event(preset, event_type, data): Executes the bot on an event with the given preset, event type, and data.
"""

type = "ANALYST_BOT"
config = ConfigBot().get_config_by_type(type)
name = config.name
description = config.description
parameters = config.parameters
regexp = []
attr_name = []
news_items = []
news_items_data = []

def execute(self, preset):
"""Execute the analyst bot with the given preset.
Parameters:
preset (Preset): The preset containing the parameter values.
Raises:
Exception: If an error occurs during execution.
"""
try:
source_group = preset.parameter_values["SOURCE_GROUP"] # noqa F841
regexp = preset.parameter_values["REGULAR_EXPRESSION"]
attr_name = preset.parameter_values["ATTRIBUTE_NAME"]
interval = preset.parameter_values["REFRESH_INTERVAL"]

# support for multiple regexps
regexp = regexp.split(";;;")
attr_name = attr_name.split(";;;")
if len(regexp) > len(attr_name):
regexp = regexp[: len(attr_name)]
elif len(attr_name) > len(regexp):
attr_name = attr_name[: len(regexp)]

bots_params = dict(zip(attr_name, regexp))
limit = BaseBot.history(interval)
logger.info(f"{preset.name}: running with date limit {limit}")
news_items_data, code = CoreApi.get_news_items_data(limit)
if code == 200 and news_items_data is not None:
for item in news_items_data:
if item:
news_item_id = item["id"]
title = item["title"]
preview = item["review"]
content = item["content"]

analyzed_text = " ".join([title, preview, content])

attributes = []
for key, value in bots_params.items():
uniq_list = []
# print('Key:', key, 'Regex:', value, flush=True)
for finding in re.finditer(value, analyzed_text):
if len(finding.groups()) > 0:
found_value = finding.group(1)
else:
found_value = finding.group(0)
# print('Found:', found_value, flush=True)
if found_value not in uniq_list:
uniq_list.append(found_value)

# app is checking combination ID + Value in DB before INSERT (attribute_value_identical)
# so check for some duplicity here (faster)
for found_value in uniq_list:
binary_mime_type = ""
binary_value = ""
news_attribute = news_item.NewsItemAttribute(key, found_value, binary_mime_type, binary_value)
attributes.append(news_attribute)

if len(attributes) > 0:
logger.debug(f"Processing item id: {news_item_id}, {item['collected']}, Found: {len(attributes)}")
news_item_attributes_schema = news_item.NewsItemAttributeSchema(many=True)
CoreApi.update_news_item_attributes(news_item_id, news_item_attributes_schema.dump(attributes))

except Exception as error:
BaseBot.print_exception(preset, error)

def execute_on_event(self, preset, event_type, data):
"""Execute the specified preset on the given event.
Parameters:
preset (Preset): The preset to execute.
event_type (str): The type of the event.
data (dict): The data associated with the event.
Raises:
Exception: If there is an error while executing the preset.
"""
try:
source_group = preset.parameter_values["SOURCE_GROUP"] # noqa F841
regexp = preset.parameter_values["REGULAR_EXPRESSION"] # noqa F841
attr_name = preset.parameter_values["ATTRIBUTE_NAME"] # noqa F841

except Exception as error:
BaseBot.print_exception(preset, error)
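
For reviewers, the matching logic inside `execute` can be read in isolation from the Taranis NG plumbing. The following is a minimal sketch under stated assumptions: the function names `pair_patterns` and `extract_attributes` are hypothetical and do not exist in the repository, and plain tuples stand in for `NewsItemAttribute` objects. It shows how the `;;;`-separated parameters are paired and how each regular expression yields de-duplicated values.

```python
import re


def pair_patterns(regexps: str, attr_names: str, sep: str = ";;;") -> dict[str, str]:
    """Split both parameter strings and pair attribute names with patterns."""
    patterns = regexps.split(sep)
    names = attr_names.split(sep)
    # zip() stops at the shorter list, which has the same effect as the
    # explicit truncation of the longer list in the bot.
    return dict(zip(names, patterns))


def extract_attributes(text: str, params: dict[str, str]) -> list[tuple[str, str]]:
    """Return (attribute name, value) pairs, de-duplicated per attribute."""
    found = []
    for name, pattern in params.items():
        seen = []
        for match in re.finditer(pattern, text):
            # Use the first capture group when the pattern defines one,
            # otherwise fall back to the whole match.
            value = match.group(1) if match.groups() else match.group(0)
            if value not in seen:
                seen.append(value)
        found.extend((name, value) for value in seen)
    return found
```

A possible use: `extract_attributes(" ".join([title, preview, content]), pair_patterns(regexp, attr_name))`, where the arguments are the values read from the preset.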