- Motivation.mp4 : Teaser video for project's motivation.
- Data Collection Scripts :
  - 2.1 task1a.py : collects tweets using the search API and stores them in Result.csv
  - 2.2 task1b.py : collects tweets using the streaming API and stores them in tweetstream1.csv
  - 2.3 task2a.py : collects tweets using the search API and stores them in mongodb (collection used: search, database used: twitterdb)
  - 2.4 task2b.py : collects tweets using the streaming API and stores them in mongodb (collection used: search, database used: twitterdb)
- clustering.py : includes the logic to compute the Jaccard similarity among tweets and to cluster tweets based on the similarity score.
  To run clustering.py, use the command: `python clustering.py n.json SampleInitialSeeds.txt`
- SampleInitialSeeds.txt : file providing the starting points (seeds), i.e. the tweet ids from which the comparison starts.
- Sample data used for testing : n.json
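
The similarity-based clustering described above can be illustrated with a minimal sketch. This is not the code in clustering.py: the whitespace tokenization, the function names `jaccard` and `cluster_by_seeds`, and the rule of assigning each tweet to its most similar seed are all assumptions made for illustration.

```python
from typing import Dict, List


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two texts: |A ∩ B| / |A ∪ B| over word sets."""
    # Assumption: tokens are lowercase whitespace-separated words.
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def cluster_by_seeds(tweets: Dict[str, str], seed_ids: List[str]) -> Dict[str, List[str]]:
    """Assign every tweet id to the seed tweet it is most similar to.

    `tweets` maps tweet id -> text; `seed_ids` are the starting tweet ids
    (hypothetical stand-in for the contents of the seeds file).
    """
    clusters: Dict[str, List[str]] = {seed: [] for seed in seed_ids}
    for tweet_id, text in tweets.items():
        # Pick the seed with the highest similarity; ties go to the earlier seed.
        best_seed = max(seed_ids, key=lambda s: jaccard(text, tweets[s]))
        clusters[best_seed].append(tweet_id)
    return clusters
```

With tweet texts keyed by id and a list of seed ids, `cluster_by_seeds(tweets, seed_ids)` returns one list of tweet ids per seed. A single assignment pass is the simplest choice; an iterative scheme that recomputes cluster representatives would be a natural extension.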
imaditiagg/traceInfoPath
Tracing the Information paths to sources : Rebuilding the Information Pathways in Trending HashTags on Twitter