The world's largest salmon run occurs in Bristol Bay, Alaska, every summer. The exact timing and magnitude of the run are not known in advance, but the Alaska Department of Fish and Game (ADFG) publishes a prediction at the start of every year. In 2018 a record 62.3 million salmon returned, 21% more than ADFG predicted. The goal of this project is to develop a more accurate prediction of the Bristol Bay salmon run than the one currently provided by ADFG.
- Data fetch/scrape
  - fishing boat activity = 'http://globalfishingwatch.org/'
  - historical salmon catches = 'http://www.adfg.alaska.gov/index.cfm?adfg=commercialbyareabristolbay.harvestsummary'
  - historical salmon counts (what escapes upriver) = 'http://www.adfg.alaska.gov/index.cfm?adfg=commercialbyareabristolbay.salmon#fishcounts'
  - historical salmon catches in "test" fishery = 'https://www.bbsri.org/'
  - stream and lake levels = 'https://waterdata.usgs.gov'
  - tide activity = 'http://kapadia.github.io/usgs/reference/api.html'
  - land weather = 'https://graphical.weather.gov/xml/'
  - marine weather = 'https://www.ncdc.noaa.gov/data-access/marineocean-data'
  - more marine data to check out = 'https://www.ncdc.noaa.gov/cdo-web/datasets'
  - NOAA web data services guide = 'https://www.ndbc.noaa.gov/docs/ndbc_web_data_guide.pdf'
  - tides & currents = 'https://opendap.co-ops.nos.noaa.gov/'
    - Port Moller 9463502
    - Adak Island 9461380
    - Unalaska 9462620
    - St Paul Island 9464212
    - more stations here = 'https://opendap.co-ops.nos.noaa.gov/stations/index.jsp'
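As a starting point for the tides & currents fetch, here is a minimal sketch that builds request URLs for the four stations listed above. It assumes the NOAA CO-OPS `datagetter` REST endpoint rather than the OPeNDAP server linked in the list, and the `application` value, the function names, and the default product are illustrative choices, not part of the project.

```python
from urllib.parse import urlencode

# Station IDs copied from the tides & currents list.
STATIONS = {
    "Port Moller": "9463502",
    "Adak Island": "9461380",
    "Unalaska": "9462620",
    "St Paul Island": "9464212",
}

# Assumption: the CO-OPS "datagetter" REST endpoint, not the OPeNDAP server
# linked in the list. Check the CO-OPS API documentation before relying on it.
BASE_URL = "https://api.tidesandcurrents.noaa.gov/api/prod/datagetter"


def build_tide_url(station_id, begin_date, end_date, product="hourly_height"):
    """Return a request URL for one station and date range (dates as YYYYMMDD)."""
    params = {
        "station": station_id,
        "begin_date": begin_date,
        "end_date": end_date,
        "product": product,
        "datum": "MLLW",
        "units": "metric",
        "time_zone": "gmt",
        "format": "csv",
        "application": "bristol_bay_salmon",  # hypothetical identifier
    }
    return BASE_URL + "?" + urlencode(params)


def fetch_tides(station_id, begin_date, end_date):
    """Download the CSV text for one station (requires network access)."""
    from urllib.request import urlopen

    url = build_tide_url(station_id, begin_date, end_date)
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")
```

Keeping URL construction separate from the download makes it possible to check the request parameters offline before hitting the service.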
- Data clean and transform
  - Verify data integrity
  - Standardize column names
  - Convert all files to CSV
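The three cleaning steps could look roughly like the sketch below, which uses only the standard library. The function names, the tab-delimited input format, and the "every row has as many fields as the header" integrity check are assumptions made for illustration; the real sources may need different checks.

```python
import csv
import re


def standardize_column(name):
    """Lower-case a header and collapse non-alphanumeric runs into underscores."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    return cleaned.strip("_").lower()


def verify_rows(header, rows):
    """Return the 1-based data row numbers whose field count differs from the header."""
    return [i for i, row in enumerate(rows, start=1) if len(row) != len(header)]


def to_clean_csv(src_path, dst_path, delimiter="\t"):
    """Rewrite a delimited text file as CSV with standardized column names."""
    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src, delimiter=delimiter)
        header = [standardize_column(c) for c in next(reader)]
        rows = list(reader)

    # Integrity check (assumed): refuse to convert files with ragged rows.
    bad = verify_rows(header, rows)
    if bad:
        raise ValueError(f"rows with wrong field count: {bad}")

    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(rows)
```

For example, `standardize_column("Total Catch (lbs)")` gives `total_catch_lbs`, so the same column from different sources ends up with one name.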
- AWS setup
  - Spin up a micro EC2 instance
  - Create an S3 bucket
  - Set up SageMaker
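The EC2 and S3 steps can be scripted with `boto3`; the sketch below is one possible shape, not the project's actual setup. The `t2.micro` instance type, the default region, and the helper names are assumptions, the AMI ID and bucket name must be supplied by the caller, and SageMaker setup (roles, notebook instance) is not covered here.

```python
def bucket_kwargs(bucket_name, region):
    """Build arguments for the S3 create_bucket call.

    us-east-1 must omit CreateBucketConfiguration; every other region
    needs a LocationConstraint naming that region.
    """
    kwargs = {"Bucket": bucket_name}
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def instance_kwargs(ami_id, instance_type="t2.micro"):
    """Build arguments for launching exactly one EC2 instance."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,  # assumed reading of "micro"
        "MinCount": 1,
        "MaxCount": 1,
    }


def provision(bucket_name, ami_id, region="us-west-2"):
    """Create the bucket and launch the instance (needs AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without it

    s3 = boto3.client("s3", region_name=region)
    s3.create_bucket(**bucket_kwargs(bucket_name, region))

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**instance_kwargs(ami_id))
    return response["Instances"][0]["InstanceId"]
```

Building the keyword arguments in separate functions lets the region handling be checked without touching an AWS account.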
- Train the model
  - Use built-in algorithms with SageMaker
  - Pre-process with Jupyter
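The plan does not name a built-in algorithm. Assuming XGBoost is the one chosen, its CSV training input must have no header row and the target in the first column, so a pre-processing cell in Jupyter might end with something like the sketch below. The function name and the alphabetical feature ordering are illustrative assumptions.

```python
import csv


def write_training_csv(rows, target, dst_path):
    """Write dict rows as a headerless CSV with the target in the first column.

    Returns the column order used, so the same order can be applied
    to the data sent for prediction later.
    """
    if not rows:
        raise ValueError("no rows to write")

    # Fixed, reproducible feature order (alphabetical is an arbitrary choice).
    features = sorted(k for k in rows[0] if k != target)

    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for row in rows:
            writer.writerow([row[target]] + [row[f] for f in features])

    return [target] + features
```

Because the file has no header, the returned column order is the only record of which column is which; it is worth saving alongside the CSV in the S3 bucket.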
- Evaluate the model
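Since the goal is to beat the ADFG forecast, one reasonable evaluation is to compare percent errors of the model and of ADFG over the same years. The sketch below shows the metric and applies it to the 2018 figures from the introduction; the choice of metric, the function names, and the reading of "21% more than predicted" as relative to the forecast are assumptions.

```python
def percent_error(predicted, actual):
    """Signed forecast error as a percentage of the actual value."""
    return (predicted - actual) / actual * 100.0


def mean_absolute_percent_error(predicted, actual):
    """Average of |percent_error| over paired forecasts and observations."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(percent_error(p, a)) for p, a in zip(predicted, actual))
    return total / len(actual)


# 2018 figures from the introduction: 62.3 million returned, 21% above forecast.
# Assumption: the 21% is relative to the forecast, so forecast = actual / 1.21.
actual_2018 = 62.3                    # millions of salmon
adfg_2018 = actual_2018 / 1.21        # implied forecast, about 51.5 million
baseline_error = percent_error(adfg_2018, actual_2018)  # about -17.4 percent
```

A model would count as an improvement on this baseline if its mean absolute percent error over held-out years is lower than ADFG's over the same years.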