Record Details

General Social Survey (GSS)

Harvard Dataverse (Africa Rice Center, Bioversity International, CCAFS, CIAT, IFPRI, IRRI and WorldFish)

 
 
Field Value
 
Title General Social Survey (GSS)
 
Identifier https://doi.org/10.7910/DVN/DDDLEQ
 
Creator Damico, Anthony
 
Publisher Harvard Dataverse
 
Description analyze the general social survey (gss) with r

the general social survey (gss) has served as america's mood ring since 1972. data-driven social scientists can compare political beliefs by demography, look at attitude trends, make emile durkheim and max weber (pronounced durk-veber) proud. in contrast to high-frequency tracking polls that capture newspaper headlines, the gss has sustained a (now biennial) set of questions over four decades.

most analysts start with the cumulative, cross-sectional file (interviews conducted 1972 - present). given the sprawling nature of that cumulative data set, you'd better read the documentation and understand the eccentricities of each of the variables you want to use before you send anything off for peer review. for example, many of the five thousand variables include missing values due to split-sample questions. not to say it's bad data - it's damn useful. you try administering a survey that stays relevant for almost half a century. otherwise, leave it to the national opinion research center (norc) at the university of chicago. ...and the national science foundation to foot the bill.

on the main gss page, norc offers two online query tools - nesstar and sda - meaning you can point-and-click your way to some basic statistics. the nesstar system smells like a fixer-upper, but berkeley's sda (survey documentation and analysis) site offers a great way to confirm that you're broadly analyzing the data correctly before you start writing r code to laser-focus on your research question.


the general social survey only gets asked of noninstitutional adults, because everyone already knows what kids' political beliefs are: more candy, no homework. this new github repository contains two scripts:

1972-2012 cumulative cross-sectional - analysis examples.R
  • download, import, save the 1972-2012 cross-sectional table onto your local computer
  • load it back up (so the downloading and importing can be skipped next time)
  • limit the table to the variables needed for an example analysis
  • create a weight and primary sampling unit variable based on berkeley's specifications
  • construct the complex sample survey object
  • run a treasure trove of political analyses
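
the middle steps of that list (build a weight, construct the design, run an analysis) can be sketched with the survey package. this is a toy illustration, not the script itself: the data frame, every column name (stratum, psu, base_wt, adj, polview) and the weight formula are made up for the example. the real gss variable names and berkeley's actual weight specification have to come from the documentation.

```r
library(survey)

# toy stand-in for a gss extract: every column name and value below is made up
toy <- data.frame(
  stratum = c(1, 1, 1, 1, 2, 2, 2, 2),
  psu     = c(1, 1, 2, 2, 1, 1, 2, 2),
  base_wt = c(1.2, 0.8, 1.0, 1.5, 0.9, 1.1, 1.3, 0.7),
  adj     = c(1, 1, 2, 2, 1, 1, 1, 1),
  polview = c("liberal", "conservative", "liberal", NA,
              "conservative", "liberal", "conservative", "liberal")
)

# build a single analysis weight from its (hypothetical) components
toy <- transform(toy, wt = base_wt * adj)

# construct the complex sample survey object:
# psus nested within strata, one weight per respondent
toy_design <- svydesign(
  ids     = ~psu,
  strata  = ~stratum,
  weights = ~wt,
  data    = toy,
  nest    = TRUE
)

# weighted shares of each category; na.rm = TRUE drops the missing response,
# the same kind of gap that split-sample questions leave behind
shares <- svymean(~factor(polview), toy_design, na.rm = TRUE)
print(shares)
```

nest = TRUE is there because the toy psu ids restart at 1 inside each stratum. if the ids were unique across the whole file, it would not be needed.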

replicate berkeley sda.R
  • download, import, save the 1972-2010 (no typo there - this is not the more current 1972-2012) cross-sectional table onto your local computer
  • load it back up (so the downloading and importing can be skipped next time)
  • limit the table to the variables needed for an example analysis
  • create a weight and primary sampling unit variable based on berkeley's specifications
  • construct the complex sample survey object
  • print statistics and standard errors matching the target replication table
  • loop through each confidence interval on that table as well, using shiny new software born from this thread
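
the last two steps (statistics, standard errors, then confidence intervals row by row) might look roughly like this. again a toy sketch: the data, the column names (year, stratum, psu, wt, agree) and the choice of svyby() plus confint() are assumptions made for illustration, and may not be the functions the replication script actually relies on.

```r
library(survey)

# hypothetical toy data: two survey years, two strata, two psus per stratum
toy <- data.frame(
  year    = rep(c(2008, 2010), each = 8),
  stratum = rep(c(1, 1, 1, 1, 2, 2, 2, 2), times = 2),
  psu     = rep(c(1, 1, 2, 2), times = 4),
  wt      = c(1.2, 0.8, 1.0, 1.5, 0.9, 1.1, 1.3, 0.7,
              1.0, 1.4, 0.6, 1.1, 1.2, 0.9, 1.0, 0.8),
  agree   = c(1, 0, 1, 1, 0, 1, 0, 0,
              1, 1, 0, 1, 0, 0, 1, 1)
)

toy_design <- svydesign(
  ids = ~psu, strata = ~stratum, weights = ~wt, data = toy, nest = TRUE
)

# weighted proportion agreeing, one row per year
by_year <- svyby(~agree, ~year, toy_design, svymean)

# point estimates and standard errors, the columns a replication table shows
est <- coef(by_year)
se  <- SE(by_year)

# one confidence interval per row of the table
ci <- confint(by_year)

# loop through each interval and print it next to its estimate
for (i in seq_len(nrow(ci))) {
  cat(rownames(ci)[i], ": estimate", round(est[i], 3),
      "se", round(se[i], 3),
      "ci", round(ci[i, 1], 3), "to", round(ci[i, 2], 3), "\n")
}
```

matching a published table then comes down to comparing est, se and ci against the posted numbers at whatever rounding the table uses.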


click here to view these two scripts



for more detail about the general social survey (gss), visit:

notes:

berkeley's sda website currently hosts release #1 of the 1972-2012 cross-sectional gss file, which is why the replication code above won't match their posted quick tables exactly. i kept bugging them until they ran the 1972-2010 release #2 data set through their same code, available in my github repository. those numbers match. squeaky wheel, baby.


confidential to sas, spss, stata, and sudaan users: why are you still dialing up to the internet after we've discovered fiber optics? time to transition to r. :D