Commit

Merge pull request #1 from drewmcmillan/AddUserAgent
Add user agent
Drew McMillan authored May 23, 2018
2 parents e6a1c04 + 548d4ba commit 669ccc9
Showing 3 changed files with 7 additions and 3 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -18,7 +18,8 @@ where `light-mc-crawler-config.json` looks something like this:
  "limit": "/music/",
  "httpsOnly": true,
  "showHttpLinksDuring": false,
- "showHttpLinksAfter": true
+ "showHttpLinksAfter": true,
+ "userAgent": "light-mc-crawler Mixed Content Crawler"
  }
  ```
  This will crawl `https://www.example.com` and any pages coming off it with `/music/` in the url.
3 changes: 2 additions & 1 deletion example.json
@@ -5,5 +5,6 @@
  "limit": "/music/",
  "httpsOnly": true,
  "showHttpLinksDuring": false,
- "showHttpLinksAfter": true
+ "showHttpLinksAfter": true,
+ "userAgent": "light-mc-crawler Mixed Content Crawler"
  }
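
The new `userAgent` key is optional: `index.js` falls back to a built-in string when the config omits it. A minimal sketch of that fallback follows. The names `DEFAULT_USER_AGENT` and `resolveUserAgent` are hypothetical and do not appear in the repository, where the commit inlines the `||` expression instead.

```javascript
// Hypothetical helper for illustration; the commit inlines this fallback.
const DEFAULT_USER_AGENT = 'light-mc-crawler Mixed Content Crawler'

function resolveUserAgent (config) {
  // `||` treats a missing key and an empty string the same way,
  // so both end up with the default user agent.
  return (config && config.userAgent) || DEFAULT_USER_AGENT
}

// A config with the key set, and one without it.
const withKey = JSON.parse('{"limit": "/music/", "userAgent": "my-bot/1.0"}')
const withoutKey = JSON.parse('{"limit": "/music/"}')

console.log(resolveUserAgent(withKey))
console.log(resolveUserAgent(withoutKey))
```

Keeping the default in one constant would also avoid repeating the same string literal in both places where `index.js` reads the option.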
4 changes: 3 additions & 1 deletion index.js
@@ -28,6 +28,7 @@ module.exports = (options) => {
  crawler.respectRobotsTxt = false
  crawler.parseHTMLComments = false
  crawler.parseScriptTags = false
+ crawler.userAgent = options.userAgent || "light-mc-crawler Mixed Content Crawler"
  crawler.maxDepth = config.maxDepth || 1


@@ -99,6 +100,7 @@ function runLighthouse (url, config, callback) {
  stats.pageCount++
  var mixedContent = require.resolve('lighthouse/lighthouse-core/config/mixed-content.js')
  var chromeFlags = config.chromeFlags || '--headless --disable-gpu';
+ var userAgent = config.userAgent || 'light-mc-crawler Mixed Content Crawler'
  const args = [
  url,
  '--output=json',
@@ -107,7 +109,7 @@ function runLighthouse (url, config, callback) {
  '--disable-cpu-throttling',
  '--disable-storage-reset',
  '--disable-network-throttling',
- '--chrome-flags=' + chromeFlags,
+ '--chrome-flags=' + chromeFlags + '--user-agent=' + userAgent,
  `--config-path=${mixedContent}`
  ]
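
Note on the `--chrome-flags` line: with the default values, the concatenation produces `--chrome-flags=--headless --disable-gpu--user-agent=light-mc-crawler Mixed Content Crawler`. There is no space before `--user-agent`, and the user agent itself contains spaces. A minimal sketch of one way to build the string follows. `buildChromeFlags` is a hypothetical helper, not part of this commit, and the sketch assumes that whatever parses `--chrome-flags` honours double quotes, which should be checked against the Lighthouse version in use.

```javascript
// Hypothetical helper, not part of the commit: appends a quoted
// --user-agent flag to an existing Chrome flag string.
function buildChromeFlags (baseFlags, userAgent) {
  // Assumption: the user agent contains no double quotes of its own.
  const uaFlag = '--user-agent="' + userAgent + '"'
  // Join with a single space so the flags stay separate tokens.
  return '--chrome-flags=' + [baseFlags.trim(), uaFlag].filter(Boolean).join(' ')
}

const arg = buildChromeFlags(
  '--headless --disable-gpu',
  'light-mc-crawler Mixed Content Crawler'
)
console.log(arg)
// --chrome-flags=--headless --disable-gpu --user-agent="light-mc-crawler Mixed Content Crawler"
```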
