Shadowserver dynamic config #2372

Merged
merged 75 commits into from
Dec 18, 2023
Changes from 73 commits
19a1972
remove obsolete tests and data
elsif2 Apr 11, 2023
2dec6ec
remove json parser - csv provides better performance
elsif2 Apr 11, 2023
0454961
dynamic configuration model
elsif2 Apr 11, 2023
0a39e0d
revised tests
elsif2 Apr 12, 2023
94b22fb
Updated to reset report type on reload #2361
elsif2 May 8, 2023
b2f9bc3
Added schema download on startup and additional logging
elsif2 May 23, 2023
d5cf063
Added version support to the schema update function.
elsif2 May 23, 2023
1e6ea89
Documentation and style updates.
elsif2 May 28, 2023
fc3f5b0
Added schema.json.test.license.
elsif2 May 30, 2023
b996e0e
Updates in response to feedback.
elsif2 Jul 27, 2023
661a964
Removed file_format parameter
elsif2 Jul 28, 2023
a045bee
Minor changes based on feedback 2023-08-24
elsif2 Aug 24, 2023
0660a89
Added VAR_STATE_PATH check.
elsif2 Aug 24, 2023
33370cf
Changes based on feedback 2023-08-25.
elsif2 Aug 25, 2023
bd76ab7
Added INTELMQ_SKIP_INTERNET check
elsif2 Aug 25, 2023
6e5e110
Added debug logging for CI test.
elsif2 Aug 25, 2023
01dcd5e
Refactored test_download_schema to utilize mocking.
elsif2 Aug 25, 2023
9314c84
Added docstring for test_update_schema().
elsif2 Aug 28, 2023
2f11b2a
Removed logging output.
elsif2 Aug 29, 2023
46f2ca7
Removed the assertion regarding report fields.
elsif2 Aug 31, 2023
d0311e0
remove obsolete tests and data
elsif2 Apr 11, 2023
b5416c7
remove json parser - csv provides better performance
elsif2 Apr 11, 2023
876a414
dynamic configuration model
elsif2 Apr 11, 2023
b917a94
revised tests
elsif2 Apr 12, 2023
eafa15b
Updated to reset report type on reload #2361
elsif2 May 8, 2023
b2753cb
Added schema download on startup and additional logging
elsif2 May 23, 2023
fd0a8fd
Added version support to the schema update function.
elsif2 May 23, 2023
357aad5
Documentation and style updates.
elsif2 May 28, 2023
37c6745
Added schema.json.test.license.
elsif2 May 30, 2023
ee8ce87
Updates in response to feedback.
elsif2 Jul 27, 2023
4a73f0b
Removed file_format parameter
elsif2 Jul 28, 2023
e413fb5
Minor changes based on feedback 2023-08-24
elsif2 Aug 24, 2023
df6e622
Added VAR_STATE_PATH check.
elsif2 Aug 24, 2023
9195213
Changes based on feedback 2023-08-25.
elsif2 Aug 25, 2023
cc48565
Added INTELMQ_SKIP_INTERNET check
elsif2 Aug 25, 2023
16daee4
Added debug logging for CI test.
elsif2 Aug 25, 2023
f102f2c
Refactored test_download_schema to utilize mocking.
elsif2 Aug 25, 2023
b103282
Added docstring for test_update_schema().
elsif2 Aug 28, 2023
356b956
Removed logging output.
elsif2 Aug 29, 2023
c72d553
Removed the assertion regarding report fields.
elsif2 Aug 31, 2023
3b60c2f
Skip and log a warning message for fields not in the IDF.
elsif2 Oct 16, 2023
28d306d
Merge branch 'shadowserver-dynamic-config' of https://github.com/cert…
elsif2 Oct 16, 2023
afce131
Merge branch 'develop' into shadowserver-dynamic-config
elsif2 Oct 24, 2023
473f6a6
remove obsolete tests and data
elsif2 Apr 11, 2023
a33fa64
remove json parser - csv provides better performance
elsif2 Apr 11, 2023
cd3338a
dynamic configuration model
elsif2 Apr 11, 2023
b081509
revised tests
elsif2 Apr 12, 2023
c6108d6
Updated to reset report type on reload #2361
elsif2 May 8, 2023
308ec67
Added schema download on startup and additional logging
elsif2 May 23, 2023
9ecf366
Added version support to the schema update function.
elsif2 May 23, 2023
9c4a1a4
Documentation and style updates.
elsif2 May 28, 2023
e4f9ac4
Added schema.json.test.license.
elsif2 May 30, 2023
460344f
Updates in response to feedback.
elsif2 Jul 27, 2023
fec1fd2
Removed file_format parameter
elsif2 Jul 28, 2023
fe2a37c
Minor changes based on feedback 2023-08-24
elsif2 Aug 24, 2023
ec066ce
Added VAR_STATE_PATH check.
elsif2 Aug 24, 2023
d1427f3
Changes based on feedback 2023-08-25.
elsif2 Aug 25, 2023
ae54e7c
Added INTELMQ_SKIP_INTERNET check
elsif2 Aug 25, 2023
e4e5063
Added debug logging for CI test.
elsif2 Aug 25, 2023
1280482
Refactored test_download_schema to utilize mocking.
elsif2 Aug 25, 2023
2a60d2e
Added docstring for test_update_schema().
elsif2 Aug 28, 2023
e401e2c
Removed logging output.
elsif2 Aug 29, 2023
66ae9f5
Removed the assertion regarding report fields.
elsif2 Aug 31, 2023
e04dfee
Skip and log a warning message for fields not in the IDF.
elsif2 Oct 16, 2023
6f23883
Updated convert_http_host_and_url and added category_or_detail.
elsif2 Oct 31, 2023
606fc10
Merge branch 'shadowserver-dynamic-config' of https://github.com/cert…
elsif2 Oct 31, 2023
a0b34cb
Avoid exception when a conversion function is not available in the cu…
elsif2 Oct 31, 2023
61c756d
Added exception for missing schema and added intelmq user to the cron…
elsif2 Nov 4, 2023
a3a3aee
Merge branch 'shadowserver-dynamic-config' into develop
elsif2 Nov 13, 2023
307386d
Documentation update.
elsif2 Nov 13, 2023
ac04471
Removed old unsorted doc and updated the taxonomy functions for the s…
elsif2 Nov 16, 2023
04c63a4
Merge branch 'develop' into shadowserver-dynamic-config
elsif2 Nov 16, 2023
7a7a6a6
Merge branch 'develop' into shadowserver-dynamic-config
kamil-certat Nov 27, 2023
0c0cb68
Merge branch 'develop' into shadowserver-dynamic-config
elsif2 Dec 12, 2023
4743ba9
Merge branch 'develop' into shadowserver-dynamic-config
aaronkaplan Dec 18, 2023
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -148,11 +148,13 @@
- added support for `Subject NOT LIKE` queries,
- added support for multiple values in ticket subject queries.
- `intelmq.bots.collectors.rsync`: Support for optional private key, relative time parsing for the source path, extra rsync parameters and strict host key checking (PR#2241 by Mateo Durante).
- `intelmq.bots.collectors.shadowserver.collector_reports_api`:
- The 'json' option is no longer supported as the 'csv' option provides better performance.

#### Parsers
- `intelmq.bots.parsers.shadowserver._config`:
- Reset detected `feedname` at shutdown to re-detect the feedname on reloads (PR#2361 by @elsif2, fixes #2360).
- `intelmq.bots.parsers.shadowserver._config`:
- Switch to dynamic configuration to decouple report schema changes from IntelMQ releases.
Comment on lines +159 to +165

The changelog entry is in the wrong section (3.2.0) instead of 3.2.2

can be fixed with #2447

- Added 'IPv6-Vulnerable-Exchange' alias and 'Accessible-WS-Discovery-Service' report. (PR#2338)
- Removed unused `p0f_genre` and `p0f_detail` from the 'DNS-Open-Resolvers' report. (PR#2338)
- Added 'Accessible-SIP' report. (PR#2348)
24 changes: 0 additions & 24 deletions docs/unsorted/shadowserver.md

This file was deleted.

174 changes: 61 additions & 113 deletions docs/user/bots.md
@@ -937,11 +937,6 @@ The resulting reports contain the following special field:

**Parameters (also expects [feed parameters](#feed-parameters) and [cache parameters](#cache-parameters)):**

**`country`**

(required, string) **Deprecated:** The country you want to download the reports for. Will be removed in IntelMQ version
4.0.0, use *reports* instead.

**`apikey`**

(required, string) Your Shadowserver API key.
@@ -956,7 +951,27 @@ The resulting reports contain the following special field:

**`types`**

(optional, string/array of strings) An array of strings (or a list of comma-separated values) with the names of report types you want to process. If you leave this empty, all the available reports will be downloaded and processed (i.e. 'scan', 'drones', 'intel', 'sandbox_connection', 'sinkhole_combined'). The possible report types are equivalent to the file names given in the section Supported Reports of the [Shadowserver parser](#intelmq.bots.parsers.shadowserver.parser_json).
(optional, string/array of strings) An array of strings (or a list of comma-separated values) with the names of report types you want to process. If you leave this empty, all the available reports will be downloaded and processed (i.e. 'scan', 'drones', 'intel', 'sandbox_connection', 'sinkhole_combined'). The possible report types are equivalent to the file names given in the section Supported Reports of the [Shadowserver parser](#intelmq.bots.parsers.shadowserver.parser).
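
Because `types` accepts either an array or a comma-separated string, both forms can be reduced to one list before use. The following is a minimal sketch; `normalize_types` is a hypothetical helper written for illustration and is not part of IntelMQ.

```python
from typing import List, Optional, Union


def normalize_types(value: Optional[Union[str, List[str]]]) -> List[str]:
    """Return report types as a clean list; an empty list means 'all types'."""
    if value is None:
        return []
    if isinstance(value, str):
        value = value.split(',')
    # Strip whitespace and drop empty entries such as those from "scan,,intel".
    return [item.strip() for item in value if item and item.strip()]
```

With this helper, `"scan, drones"` and `["scan", "drones"]` are treated identically, and an unset or empty value falls through to the "download everything" case.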

**Sample configuration**

```yaml

shadowserver-collector:
  description: Our bot responsible for getting reports from Shadowserver
  enabled: true
  group: Collector
  module: intelmq.bots.collectors.shadowserver.collector_reports_api
  name: Shadowserver_Collector
  parameters:
    destination_queues:
      _default: [shadowserver-parser-queue]
    file_format: csv
    api_key: "$API_KEY_received_from_the_shadowserver_foundation"
    secret: "$SECRET_received_from_the_shadowserver_foundation"
  run_mode: continuous

```

---

@@ -2101,12 +2116,10 @@ No additional parameters.

---

### Shadowserver <div id="intelmq.bots.parsers.shadowserver.parser" /> <div id="intelmq.bots.parsers.shadowserver.parser_json" />
### Shadowserver <div id="intelmq.bots.parsers.shadowserver.parser" />

Parses various reports from Shadowserver.
The Shadowserver parser operates on CSV-formatted data.

There are two Shadowserver parsers, one for data in `CSV` format and one for data in `JSON` format. The latter was added
in IntelMQ 2.3 and is meant to be used together with the Shadowserver API collector.

**How this bot works?**

@@ -2135,8 +2148,7 @@ correct mapping of the columns:

**Module:**

`intelmq.bots.parsers.shadowserver.parser` (for CSV data)
`intelmq.bots.parsers.shadowserver.parser_json` (for JSON data)
`intelmq.bots.parsers.shadowserver.parser`

**Parameters:**

@@ -2150,108 +2162,44 @@ correct mapping of the columns:

**Supported reports:**

These are the supported report types and their corresponding file name for automatic detection:

| Report Type (`feedname`) | File Name |
|-----------|-----------|
| Accessible-ADB | `scan_adb` |
| Accessible-AFP | `scan_afp` |
| Accessible-AMQP | `scan_amqp` |
| Accessible-ARD | `scan_ard` |
| Accessible-Cisco-Smart-Install | `cisco_smart_install` |
| Accessible-CoAP | `scan_coap` |
| Accessible-CWMP | `scan_cwmp` |
| Accessible-MS-RDPEUDP | `scan_msrdpeudp` |
| Accessible-FTP | `scan_ftp` |
| Accessible-Hadoop | `scan_hadoop` |
| Accessible-HTTP | `scan_http` |
| Accessible-Radmin | `scan_radmin` |
| Accessible-RDP | `scan_rdp` |
| Accessible-Rsync | `scan_rsync` |
| Accessible-SMB | `scan_smb` |
| Accessible-Telnet | `scan_telnet` |
| Accessible-Ubiquiti-Discovery-Service | `scan_ubiquiti` |
| Accessible-VNC | `scan_vnc` |
| Blacklisted-IP (deprecated) | `blacklist` |
| Blocklist | `blocklist` |
| Compromised-Website| `compromised_website` |
| Device-Identification-IPv4 | `device_id` |
| Device-Identification-IPv6 | `device_id6` |
| DNS-Open-Resolvers | `scan_dns` |
| Honeypot-Amplification-DDoS-Events | `event4_honeypot_ddos_amp` |
| Honeypot-Brute-Force-Events | `event4_honeypot_brute_force` |
| Honeypot-Darknet | `event4_honeypot_darknet` |
| Honeypot-HTTP-Scan | `event4_honeypot_http_scan` |
| HTTP-Scanners | `hp_http_scan` |
| ICS-Scanners | `hp_ics_scan` |
| IP-Spoofer-Events | `event4_ip_spoofer` |
| Microsoft-Sinkhole-Events-IPv4 | `event4_microsoft_sinkhole` |
| Microsoft-Sinkhole-Events-HTTP | `event4_microsoft_sinkhole_http` |
| NTP-Monitor | `scan_ntpmonitor` |
| NTP-Version | `scan_ntp` |
| Open-Chargen | `scan_chargen` |
| Open-DB2-Discovery-Service | `scan_db2` |
| Open-Elasticsearch | `scan_elasticsearch` |
| Open-IPMI| `scan_ipmi` |
| Open-IPP | `scan_ipp` |
| Open-LDAP | `scan_ldap` |
| Open-LDAP-TCP | `scan_ldap_tcp` |
| Open-mDNS | `scan_mdns` |
| Open-Memcached | `scan_memcached` |
| Open-MongoDB | `scan_mongodb` |
| Open-MQTT | `scan_mqtt` |
| Open-MSSQL | `scan_mssql` |
| Open-NATPMP | `scan_nat_pmp` |
| Open-NetBIOS-Nameservice | `scan_netbios` |
| Open-Netis | `netis_router` |
| Open-Portmapper | `scan_portmapper` |
| Open-QOTD | `scan_qotd` |
| Open-Redis | `scan_redis` |
| Open-SNMP | `scan_snmp` |
| Open-SSDP | `scan_ssdp` |
| Open-TFTP | `scan_tftp` |
| Open-XDMCP | `scan_xdmcp` |
| Outdated-DNSSEC-Key| `outdated_dnssec_key` |
| Outdated-DNSSEC-Key-IPv6 | `outdated_dnssec_key_v6` |
| Sandbox-URL | `cwsandbox_url` |
| Sinkhole-DNS | `sinkhole_dns` |
| Sinkhole-Events | `event4_sinkhole` |
| Sinkhole-Events IPv4 | `event4_sinkhole` |
| Sinkhole-Events IPv6 | `event6_sinkhole` |
| Sinkhole-HTTP-Events | `event4_sinkhole_http`/`event6_sinkhole_http` |
| Sinkhole-HTTP-Events IPv4 | `event4_sinkhole_http` |
| Sinkhole-HTTP-Events IPv6 | `event6_sinkhole_http` |
| Sinkhole-Events-HTTP-Referer| `event4_sinkhole_http_referer`/`event6_sinkhole_http_referer` |
| Sinkhole-Events-HTTP-Referer IPv4 | `event4_sinkhole_http_referer` |
| Sinkhole-Events-HTTP-Referer IPv6 | `event6_sinkhole_http_referer` |
| Spam-URL | `spam_url` |
| SSL-FREAK-Vulnerable-Servers | `scan_ssl_freak` |
| SSL-POODLE-Vulnerable-Servers | `scan_ssl_poodle`/`scan6_ssl_poodle` |
| Vulnerable-Exchange-Server* | `scan_exchange` |
| Vulnerable-ISAKMP | `scan_isakmp` |
| Vulnerable-HTTP | `scan_http` |
| Vulnerable-SMTP | `scan_smtp_vulnerable` |

\* This report can also contain data on active webshells (column `tag` is `exchange;webshell`), and are therefore not
only vulnerable but also actively infected.

In addition, the following legacy reports are supported:

| Legacy Report Type | Successor Report Type | File Name |
|--------------------|-----------------------|-----------|
| Amplification-DDoS-Victim | Honeypot-Amplification-DDoS-Events | `ddos_amplification` |
| CAIDA-IP-Spoofer | IP-Spoofer-Events | `caida_ip_spoofer` |
| Darknet | Honeypot-Darknet | `darknet` |
| Drone | Sinkhole-Events | `botnet_drone` |
| Drone-Brute-Force | Honeypot-Brute-Force-Events, Sinkhole-HTTP-Events | `drone_brute_force` |
| Microsoft-Sinkhole | Sinkhole-HTTP-Events | `microsoft_sinkhole` |
| Sinkhole-HTTP-Drone | Sinkhole-HTTP-Events | `sinkhole_http_drone` |
| IPv6-Sinkhole-HTTP-Drone | Sinkhole-HTTP-Events | `sinkhole6_http` |

More information on these legacy reports can be found
in [Changes in Sinkhole and Honeypot Report Types and Formats](https://www.shadowserver.org/news/changes-in-sinkhole-and-honeypot-report-types-and-formats/)
.
The report configuration is stored in a `shadowserver-schema.json` file downloaded from https://interchange.shadowserver.org/intelmq/v1/schema.

The parser will attempt to download a schema update on startup when the *auto_update* option is enabled.

Schema downloads can also be scheduled as a cron job for the `intelmq` user:

```bash
02 01 * * * intelmq.bots.parsers.shadowserver.parser --update-schema
```

For air-gapped systems, automation is required to download the file elsewhere and copy it to `VAR_STATE_PATH/shadowserver-schema.json`.
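
For the air-gapped case, the copy step can validate the file and put it in place atomically. This is only a sketch: `install_schema` is a hypothetical helper, and it assumes the schema is a JSON object at the top level. Only the target file name comes from the text above.

```python
import json
import os
import tempfile


def install_schema(source_path: str, state_dir: str) -> str:
    """Validate a downloaded schema file and copy it into place atomically."""
    with open(source_path, encoding='utf-8') as handle:
        schema = json.load(handle)  # raises ValueError on malformed JSON
    # Assumption: a usable schema is a non-empty JSON object.
    if not isinstance(schema, dict) or not schema:
        raise ValueError('schema must be a non-empty JSON object')

    target = os.path.join(state_dir, 'shadowserver-schema.json')
    # Write to a temporary file in the same directory, then rename it, so a
    # running parser never observes a half-written schema.
    fd, tmp_path = tempfile.mkstemp(dir=state_dir, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', encoding='utf-8') as tmp:
            json.dump(schema, tmp)
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return target
```

Here `state_dir` stands for the directory that `VAR_STATE_PATH` resolves to on the target system.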

The parser will automatically reload the configuration when the file changes.
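
The diff does not show how the parser detects a changed file. One common approach, sketched here with a hypothetical `SchemaWatcher` class, is to compare the file's modification time before each use and reload only when it differs.

```python
import json
import os
from typing import Any, Dict, Optional


class SchemaWatcher:
    """Reload a JSON file only when its modification time changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._mtime_ns: Optional[int] = None
        self.schema: Dict[str, Any] = {}

    def refresh(self) -> bool:
        """Return True if the file was (re)loaded, False if unchanged."""
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns == self._mtime_ns:
            return False
        with open(self.path, encoding='utf-8') as handle:
            self.schema = json.load(handle)
        # Record the timestamp only after a successful load.
        self._mtime_ns = mtime_ns
        return True
```

Calling `refresh()` at the start of each processing cycle costs one `stat` call when nothing has changed.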


**Schema contract**

Once set in the schema, the `classification.identifier`, `classification.taxonomy`, and `classification.type` fields will remain static for a specific report.

The schema revision history is maintained at https://github.com/The-Shadowserver-Foundation/report_schema/.
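
The contract can be checked mechanically when comparing two schema revisions. The sketch below assumes each report entry keeps its fixed values under a `constant_fields` key. That layout is an assumption made for illustration, not something this diff confirms, and `contract_violations` is a hypothetical helper.

```python
from typing import Any, Dict, List

STATIC_FIELDS = (
    'classification.identifier',
    'classification.taxonomy',
    'classification.type',
)


def contract_violations(old: Dict[str, Any], new: Dict[str, Any]) -> List[str]:
    """List reports whose static classification fields differ between schemas."""
    problems = []
    for report, old_entry in old.items():
        new_entry = new.get(report)
        if new_entry is None:
            continue  # a removed report is outside this check
        # Assumed layout: fixed values live under 'constant_fields'.
        old_const = old_entry.get('constant_fields', {})
        new_const = new_entry.get('constant_fields', {})
        for field in STATIC_FIELDS:
            if field in old_const and old_const[field] != new_const.get(field):
                problems.append(f'{report}: {field} changed')
    return problems
```

An empty result means every report present in both revisions kept its three classification values.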


**Sample configuration**

```yaml
shadowserver-parser:
  bot_id: shadowserver-parser
  name: Shadowserver Parser
  enabled: true
  group: Parser
  groupname: parsers
  module: intelmq.bots.parsers.shadowserver.parser
  parameters:
    destination_queues:
      _default: [file-output-queue]
    auto_update: true
  run_mode: continuous
```
---

### Shodan <div id="intelmq.bots.parsers.shodan.parser" />
24 changes: 9 additions & 15 deletions intelmq/bots/collectors/shadowserver/collector_reports_api.py
@@ -34,15 +34,13 @@ class ShadowServerAPICollectorBot(CollectorBot, HttpMixin, CacheMixin):
A list of strings or a comma-separated list of the mailing lists you want to process.
types (list):
A list of strings or a string of comma-separated values with the names of report types you want to process. If you leave this empty, all the available reports will be downloaded and processed (i.e. 'scan', 'drones', 'intel', 'sandbox_connection', 'sinkhole_combined').
file_format (str): File format to download ('csv' or 'json'). The default is 'json' for compatibility. Using 'csv' is recommended for best performance.
"""

country = None
api_key = None
secret = None
types = None
reports = None
file_format = None
rate_limit: int = 86400
redis_cache_db: int = 12
redis_cache_host: str = "127.0.0.1" # TODO: type could be ipadress
@@ -66,15 +64,15 @@ def init(self):
self.logger.warn("Deprecated parameter 'country' found. Please use 'reports' instead. The backwards-compatibility will be removed in IntelMQ version 4.0.0.")
self._report_list.append(self.country)

if self.file_format is not None:
if not (self.file_format == 'csv' or self.file_format == 'json'):
raise ValueError('Invalid file_format')
else:
self.file_format = 'json'
self.logger.info("For best performance, set 'file_format' to 'csv' and use intelmq.bots.parsers.shadowserver.parser.")

self.preamble = f'{{ "apikey": "{self.api_key}" '

def check(parameters: dict):
for key in parameters:
if key == 'file_format':
return [["error", "The file_format parameter is no longer supported. All reports are CSV."]]
elif key == 'country':
return [["warning", "Deprecated parameter 'country' found. Please use 'reports' instead. The backwards-compatibility will be removed in IntelMQ version 4.0.0."]]
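
As written, the loop returns at the first matching key, so a configuration containing both `country` and `file_format` produces only one of the two messages, depending on key order. The standalone sketch below shows a variant that reports both. It keeps the `[level, message]` shape from the diff. Returning `None` when there is nothing to report is an assumption about what the caller expects.

```python
from typing import List, Optional


def check(parameters: dict) -> Optional[List[List[str]]]:
    """Return [level, message] pairs for problematic parameters, or None."""
    messages = []
    if 'file_format' in parameters:
        messages.append(["error", "The file_format parameter is no longer supported. All reports are CSV."])
    if 'country' in parameters:
        messages.append(["warning", "Deprecated parameter 'country' found. Please use 'reports' instead."])
    # Assumption: callers treat None as "no findings".
    return messages or None
```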

def _headers(self, data):
return {'HMAC2': hmac.new(self.secret.encode(), data.encode('utf-8'), digestmod=hashlib.sha256).hexdigest()}
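
The `_headers` method signs the exact request body with HMAC-SHA256 and sends the hex digest in an `HMAC2` header. A self-contained sketch of the same computation follows. The key, secret and report ID are placeholders, and building the body with `json.dumps` instead of string concatenation is a substitution made here, not what the collector does.

```python
import hashlib
import hmac
import json


def hmac2_header(secret: str, body: str) -> dict:
    """Return the HMAC2 header for a request body as hex-encoded HMAC-SHA256."""
    digest = hmac.new(secret.encode(), body.encode('utf-8'), digestmod=hashlib.sha256)
    return {'HMAC2': digest.hexdigest()}


# json.dumps escapes quotes and backslashes, unlike manual string concatenation.
body = json.dumps({'apikey': 'example-key', 'id': 'example-report-id'})
headers = hmac2_header('example-secret', body)
```

The digest covers the body byte for byte, so the string that is signed must be the same string that is sent.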

@@ -123,11 +121,7 @@ def _report_download(self, reportid: str):
data = self.preamble
data += f',"id": "{reportid}"}}'
self.logger.debug('Downloading report with data: %s.', data)

if (self.file_format == 'json'):
response = self.http_session().post(APIROOT + 'reports/download', data=data, headers=self._headers(data))
else:
response = self.http_session().get(DLROOT + reportid)
response = self.http_session().get(DLROOT + reportid)
response.raise_for_status()

return response.text
@@ -144,7 +138,7 @@ def process(self):

for item in reportslist:
filename = item['file']
filename_fixed = FILENAME_PATTERN.sub('.' + self.file_format, filename, count=1)
filename_fixed = FILENAME_PATTERN.sub('.csv', filename, count=1)
if self.cache_get(filename):
self.logger.debug('Processed file %r (fixed: %r) already.', filename, filename_fixed)
continue
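
With `file_format` gone, the file name is always rewritten to end in `.csv`. `FILENAME_PATTERN` is defined outside the hunks shown, so the pattern below is an assumption chosen only to make the sketch runnable, and `fix_filename` is a hypothetical wrapper.

```python
import re

# Assumption: the real FILENAME_PATTERN is defined outside the hunks shown here.
FILENAME_PATTERN = re.compile(r'\.(csv|json)$')


def fix_filename(filename: str) -> str:
    """Force a '.csv' extension on a report file name that ends in .csv or .json."""
    return FILENAME_PATTERN.sub('.csv', filename, count=1)
```

Under this assumed pattern, names without a matching extension pass through unchanged.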