Extracts web domain and IP address, implements rendering functions and tests #1944
This PR partially resolves mandiant#1907. It extracts web domains and IP addresses, and implements rendering functions and tests.

These changes likely don't require updates to the documentation, but if some users want to, they should be able to repurpose many of the extraction functions without too much trouble.

Unfortunately, I'll probably be unavailable during the next few days, but this weekend, I'll ensure the PR passes the CI tests. I'll probably also add some more tests for the rendering functions.

Please let me know if you have any questions or suggestions!

Below is example output for the default mode:

    +------------------------------+
    | IP addresses and web domains |
    |------------------------------+
    | google.com                   |
    | 192.123.232.08               |
    | my-w3bs1te.net               |
    | maliciooous.r4ndom-site.uhoh |
    | whoops.net                   |
    +------------------------------+

Here is example output for verbose and vverbose modes:

    +-----------------------------------------------------------+
    | IP addresses and web domains                              |
    |-----------------------------------------------------------+
    | google.com                                                |
    |    |----IP address:                                       |
    |        |----192.0.0.1                                     |
    |    |----Functions used to communicate with google.com:    |
    |        |----InternetConnectA                              |
    |        |----HttpOpenRequestA                              |
    |        |----FtpGetFileA                                   |
    |    |----3 occurrances                                     |
    |                                                           |
    | 192.123.232.08                                            |
    |    |----Functions used to communicate with 192.123.232.08:|
    |        |----...                                           |
    |                                                           |
    +-----------------------------------------------------------+
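For readers who want to reproduce the default-mode layout, here is a minimal sketch in plain Python. It is not the PR's rendering code: the function name `render_single_column` is hypothetical, and the border style is simply copied from the example output above.

```python
from typing import List


def render_single_column(title: str, rows: List[str]) -> str:
    """Draw a one-column box table in the style of the example output.

    Hypothetical helper for illustration only; not part of capa.
    """
    # The column is as wide as the longest cell, including the title.
    width = max(len(cell) for cell in [title, *rows])
    border = "+" + "-" * (width + 2) + "+"
    # The header separator in the example starts with "|" and ends with "+".
    separator = "|" + "-" * (width + 2) + "+"

    lines = [border, f"| {title.ljust(width)} |", separator]
    lines.extend(f"| {row.ljust(width)} |" for row in rows)
    lines.append(border)
    return "\n".join(lines)


table = render_single_column(
    "IP addresses and web domains",
    ["google.com", "192.123.232.08", "my-w3bs1te.net"],
)
print(table)
```

Every row is padded to the same width, so the right-hand border lines up regardless of how long each domain or address is.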
Very cool, I'll have to take a closer look at this in the upcoming week! Thanks for the suggestions.
Thanks @mr-tz! I'm just working on a couple of bugs, so I'll let you know when it's done!
    CD = Path(__file__).resolve().parent.parent.parent

    # these constants are also defined in capa.main
    # defined here to avoid a circular import
Perhaps define a script with all these constants inside? That would be better than having repeated code.
    from capa.render.result_document import ResultDocument
    from capa.features.extractors.base_extractor import FeatureExtractor

    CD = Path(__file__).resolve().parent.parent.parent
Does CD stand for current directory?
    for tuple in obj:
        strings.append(tuple[0])

    return strings
This can be a list comprehension: return [tuple[0] for tuple in obj]
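As a quick check that the suggested comprehension behaves like the loop, here is a small sketch. The function names `first_items_loop` and `first_items` and the sample data are made up for illustration; the loop variable is renamed to `item` because `tuple` shadows the built-in type.

```python
from typing import Any, Iterable, List, Sequence


def first_items_loop(obj: Iterable[Sequence[Any]]) -> List[Any]:
    # Mirrors the loop from the diff, with the variable renamed.
    strings = []
    for item in obj:
        strings.append(item[0])
    return strings


def first_items(obj: Iterable[Sequence[Any]]) -> List[Any]:
    # The reviewer's suggestion: build the list in one expression.
    return [item[0] for item in obj]


# Hypothetical sample input: (string, offset) pairs.
sample = [("google.com", 0x10), ("whoops.net", 0x20)]
print(first_items(sample))
```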
""" | ||
    invalid_list = ["win", "exe", "dll", "med"]  # add more to this list

    for domain in invalid_list:
This loop could be a single line: return string not in invalid_list
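The loop body is not shown in this excerpt, so which one-liner is equivalent depends on what the loop actually compares. The sketch below covers two guesses, both labeled as assumptions; the function names are hypothetical. Since the call site reads `if not invalid_domain(string)`, both versions return True when the string should be rejected.

```python
INVALID_LIST = ["win", "exe", "dll", "med"]  # values taken from the diff


def invalid_domain_exact(string: str) -> bool:
    # Assumption: the elided loop compares `string` to each entry for equality.
    return string in INVALID_LIST


def invalid_domain_suffix(string: str) -> bool:
    # Assumption: the elided loop instead looks at the last dot-separated
    # label, so that a file name such as "kernel32.dll" is rejected.
    return string.rsplit(".", 1)[-1].lower() in INVALID_LIST


print(invalid_domain_exact("dll"))            # exact match against the list
print(invalid_domain_suffix("kernel32.dll"))  # match on the final label
print(invalid_domain_suffix("google.com"))
```

If the list grows, a `frozenset` would make the membership test constant-time, though with four entries the difference is negligible.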
    if re.search(DOMAIN_PATTERN, string):
        if not invalid_domain(string):
            try:
                domain_counts[string] += 1
domain_counts[string] = domain_counts.get(string, 0) + 1
This way you don't need a try block: the first occurrence of a string no longer raises KeyError, and the code is shorter.
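A short sketch of the `dict.get` pattern, next to `collections.Counter` from the standard library, which does the same bookkeeping. The input list is made up for illustration; nothing here is taken from the PR beyond the variable name `domain_counts`.

```python
from collections import Counter

# Hypothetical input: strings already identified as domains.
strings = ["google.com", "whoops.net", "google.com"]

# The reviewer's pattern: default to 0 when the key is missing.
domain_counts = {}
for string in strings:
    domain_counts[string] = domain_counts.get(string, 0) + 1

# Standard-library alternative that produces the same counts.
counter = Counter(strings)

print(domain_counts)
print(counter.most_common(1))
```

`Counter` also offers `most_common()`, which could be convenient when reporting the number of occurrences per domain.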

    elif is_ip_addr(string):
        try:
            ip_counts[string] += 1
Same as above
Can this be closed as superseded by #2031?
@mr-tz Yes, I'll go ahead and close it!