Stata code to merge together CDC Natality data from 2000-2004. Eventually will add code to merge 2005-2008 as well.

arebe/cdc-natality

This repo contains Stata code for merging natality (birth certificate) microdata available from the CDC.

The data are individual-level birth records and include demographic information on the parents and medical information about each birth.

The original public-use data is available here:
http://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm

However, the original data are in raw .dat format. For the raw data to be meaningful, one must associate data locations with variable names. One way to do this is to download the User's Guide for each data set (available at the above link) and write a program to read in the raw data. Alternatively, you can visit the National Bureau of Economic Research's excellent website and download the pre-formatted Stata data here:
http://www.nber.org/data/vital-statistics-natality-data.html
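
For readers who take the first route, the following is a minimal sketch of what "associating data locations with variable names" could look like. It is written in Python purely for illustration (the code in this repo is Stata), and it assumes the raw file is fixed-width text. The variable names and column positions in `LAYOUT` are hypothetical placeholders, not taken from any User's Guide; the real positions must be looked up in the guide for each year.

```python
# Hypothetical layout: variable name -> (start, end) column positions,
# 1-indexed and inclusive. These positions are invented for illustration
# and are NOT the actual CDC natality file layout.
LAYOUT = {
    "birth_year": (1, 4),
    "birth_month": (5, 6),
    "mother_age": (7, 8),
}


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of stripped strings."""
    record = {}
    for name, (start, end) in layout.items():
        # Convert 1-indexed inclusive positions to a Python slice.
        record[name] = line[start - 1:end].strip()
    return record


def read_fixed_width(path, layout=LAYOUT):
    """Yield one parsed dict per non-empty line of a raw fixed-width file."""
    with open(path, encoding="latin-1") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if line:
                yield parse_record(line, layout)


# A made-up record, parsed with the made-up layout above.
example = parse_record("20000327")
```

With a real layout in place, `read_fixed_width` could be pointed at an unzipped raw file and iterated record by record. Downloading the pre-formatted Stata files from NBER avoids this step entirely.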

To use the code in this repo, the original data must first be unzipped and converted to Stata format (.dta).

The input datasets are:
natl2000.dta
natl2001.dta
natl2002.dta
natl2003.dta
natl2004.dta

The output datasets are:
natl2000_rev.dta
natl2001_rev.dta
natl2002_rev.dta
natl2003_rev.dta
natl2004_rev.dta
natl_master.dta
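
Before running the Stata code, it may help to confirm that all five input files are in place. The sketch below is a hypothetical helper, written in Python for illustration only and not part of this repo. It assumes all inputs sit in a single directory, and it infers the `_rev` naming pattern from the file lists above.

```python
from pathlib import Path

# Years covered by the file lists above: 2000 through 2004.
YEARS = range(2000, 2005)


def input_name(year):
    """Name of the per-year input dataset, e.g. natl2000.dta."""
    return f"natl{year}.dta"


def output_name(year):
    """Name of the per-year revised output, e.g. natl2000_rev.dta."""
    return f"natl{year}_rev.dta"


def missing_inputs(data_dir, years=YEARS):
    """Return the input file names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [
        input_name(year)
        for year in years
        if not (data_dir / input_name(year)).is_file()
    ]
```

Calling `missing_inputs(".")` from the data directory returns an empty list when every input is present. The merged `natl_master.dta` is not covered by `output_name`, since it is a single file rather than one per year.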

This code requires a significant amount of RAM: the merged natl_master dataset is approximately 6 GB, and Stata holds the dataset it is working on in memory.
