Measurement and Analysis of Public Opinion: An Analytic Framework (2022)

Chapter: 3B Alternatives to Probability-Based Surveys Representative of the General Population for Measuring Attitudes - Ashley Amaya

Previous Chapter: 3A Drawing Inferences from Public Opinion Surveys: Insights for Intelligence Reports - Ren Bautista
Suggested Citation: "3B Alternatives to Probability-Based Surveys Representative of the General Population for Measuring Attitudes - Ashley Amaya." National Academies of Sciences, Engineering, and Medicine. 2022. Measurement and Analysis of Public Opinion: An Analytic Framework. Washington, DC: The National Academies Press. doi: 10.17226/26390.

3B

Alternatives to Probability-Based Surveys Representative of the General Population for Measuring Attitudes

Ashley Amaya

EXECUTIVE SUMMARY

Probability-based surveys of the general population are not always possible or appropriate. In some instances, an alternative approach may be more fit-for-purpose (Baker et al., 2010). Analysts, in other words, may accept a less-than-ideal design that balances timeliness, costs, ethical considerations, and data quality for the specific question they need to address. While no data source (including probability-based general population surveys) is without error, it may be possible to identify a source that is sufficient to address the analyst’s objectives. In some cases, this source may be better suited to the question than a probability-based general population survey.

Unfortunately, there is no one-size-fits-all data collection method. Instead, analysts should identify their objectives (i.e., what question[s] they need to address) and priorities (e.g., timeliness, minimal bias), then evaluate various data sources to identify which one(s) may accommodate their project needs. For example, one analyst might require a precise estimate of public trust in a current regime whereas another may want to measure change in trust over time. The first analyst will need to prioritize data accuracy and minimize bias whereas the second may prioritize consistent measurement over multiple time points, even if each individual point suffers bias. As a result, the two analysts may opt to analyze different datasets.

This chapter should help analysts answer two primary questions: (a) Should or can I use a probability-based general population survey? (b) What alternatives exist to probability-based general population surveys, what are their strengths and weaknesses, and under what circumstances are they most appropriate?

In addition to addressing these primary questions, this chapter also includes a discussion of how and when one may consider using multiple datasets and offers some general ethical considerations for using any type of data. Because analysts face a myriad of questions and combinations of priorities, this chapter cannot act as a list of frequently asked questions offering direct yes/no answers or step-by-step instructions. Instead, it seeks to empower analysts to evaluate their data options by providing sufficient information on various data types and situations.

RULING OUT PROBABILITY-BASED GENERAL POPULATION SURVEYS AND DETERMINING NECESSARY CRITERIA FOR ALTERNATIVE DATA SOURCES

There are countless reasons that a probability-based survey of the general population (GP) may not be necessary or feasible. In some cases, a probability-based GP survey may not be affordable, or a sufficiently reliable and complete sample frame may not be available for the target population, introducing the potential for coverage bias (FAO, 1996). The topic of interest may not be measurable via a survey because it is an abstract construct not amenable to a survey question (introducing construct validity issues), or respondents may not know the answer or may be unwilling to provide an accurate response (introducing measurement error). Researchers may not be able to interview certain types of respondents because they cannot be accessed (e.g., due to a natural disaster) or because potential respondents refuse to participate; in these situations, a probability-based GP survey may suffer nonresponse error. Beyond data quality, the data may be dated and no longer relevant, the ultimate consumers may need the data more quickly than collection reasonably allows, the data may have been gathered in a way that violates U.S. or in-country ethics norms or laws, or the sample size may be insufficient for the chosen analyses.

It is important to acknowledge the reason for ruling out a GP probability-based survey prior to selecting an alternative data source for two reasons. First, it acts as a double-check on the research priorities and goals. If research priorities and goals do not sufficiently justify rejecting a GP survey, it may be worthwhile to reconsider using this type of survey. Let’s consider a hypothetical question that an analyst might be tasked with answering: How has support for the current regime changed over time? To address this question, it is important to have 1) a repeated measure of support over the time period of interest and 2) sufficient sample sizes to detect the target level of change. Let’s also assume that a GP dataset is available for multiple time points but has been rejected because it suffers coverage bias: it consistently excludes residents living in outlying areas that are difficult, time-consuming, or costly to access. This suggests that the individual point estimates (i.e., level of support at any point in time) may be biased, but it is possible, though not certain, that the change over time (i.e., the measure required to address the research question) is unbiased. In this case, the reason for rejection does not map to the analyst’s priorities, so the analyst may opt to use the available survey.

Second, understanding the reason for rejection will help analysts understand what they need or need to avoid when searching for alternative data sources. For example, if an analyst seeks to understand government support by an ethnic group and no GP survey includes sufficient sample sizes for subgroup analyses, determining what would be a sufficient sample size and seeking out data sources that meet that criterion would be important.

ALTERNATIVES TO PROBABILITY-BASED GENERAL POPULATION SURVEYS

This section introduces various alternatives to probability-based GP surveys. They can be divided into a two-by-two matrix: designed versus organic data (Groves, 2011) and primary versus secondary data. Designed data are systematically planned by a researcher and collected to address a particular research question. They can take the form of surveys or qualitative research. With designed data, researchers should have all necessary information available to address the specific research question for which the study was designed; such data should have high construct validity.1 Organic data are those that have been collected for another purpose or are a by-product of another process (e.g., credit card transactions, tweets). Because such data were not collected to address the particular question of interest, they may suffer from low construct validity (i.e., they measure something slightly different than what the analyst wishes to measure). They may also suffer from other forms of bias (e.g., coverage error, in which the full population is not represented) or raise ethical challenges, such as the lack of consent from individuals to having their cell phone geolocation data used to study migration patterns (Lai et al., 2019).

___________________

1 Note that this does not mean that all designed data can answer all questions for all uses. It is limited to addressing questions for which it was designed. This means that secondary users may find that their analyses still suffer poor construct validity.


The analyst may use primary or secondary data sources to address their question. Primary data sources are created or collected by the analyst/researcher and used for the purpose of their own research question. In other words, the researcher has control over the methods and full access to the data, and the process is transparent. They control what is collected and when. Secondary data are collected by others and may be accessible by analysts either at the microlevel (i.e., a complete dataset with a record for each unit of analysis) or in summary form (e.g., a table of percent distributions). Secondary data are beneficial in that they are generally free or low-cost since the analyst did not have to pay for collection. They are also immediately available since they have already been collected. In this instance, though, the IC analyst has no control over the process or what data are available, and the methods used to collect them may not be fully transparent.

Examples of each type of data are found in Table 3B-1. Primary designed data includes surveys and qualitative research designed by the researcher, while secondary designed data includes similar types of data designed and collected by others. Primary organic data often includes nonsurvey data that is appended to survey response data. Paradata, for example, are administrative data points collected as a result of the survey. They include items such as the date the interview was completed, the language in which the interview was conducted, and the name or ID of the interviewer.

TABLE 3B-1 Examples of Data Sources by Type

Designed
  Primary:
  • Researcher-designed nonprobability survey responses (such as those using M-Turk1)
  • Researcher-designed focus group data
  Secondary:
  • Afrobarometer2
  • European Social Survey3
  • UNICEF’s U-Report4
Organic
  Primary:
  • Survey paradata (e.g., interview date, interview language, interviewer ID)
  Secondary:
  • Tweets (via the Twitter API5)
  • Google Trends6

__________________

1 www.mturk.com.

2 http://afrobarometer.org.

3 https://www.europeansocialsurvey.org.

4 https://www.unicef.org/innovation/U-Report.

5 https://developer.twitter.com/en/docs/.

6 https://support.google.com/trends/answer/4365533?hl=en.


These data points are considered primary organic (not designed) because they are created by the researcher as part of an administrative process, such as determining interviewer pay or identifying the optimal time of day to contact potential respondents. They are not meant or collected for purposes of substantive analysis to address the research question. Secondary organic data includes all other nonsurvey and non-qualitative-research data that were not collected by the researcher. Tweets, for example, are not surveys or qualitative research. They are also not created as a result of a research process.

As may already be evident, there are numerous data sources that may serve as alternatives to probability-based GP surveys. Despite the overwhelming number of options, this framework (primary versus secondary × designed versus organic) can be used to determine some general strengths and weaknesses of a given dataset, including datasets not discussed in this report, as a starting point for evaluation.

The remainder of this section is limited to data sources most relevant to studying social attitudes. For each source, a definition is provided, strengths and weaknesses are discussed, best-use cases are included, and ethics are considered. Where appropriate, best practices for collecting primary data are included. Examples are also provided throughout this section. Where possible, the examples chosen are related to use cases that may be of interest to the IC, but it was more important to provide examples that had been validated (i.e., were a strong use case) than to provide IC-related examples.

Note that all data sources evolve over time. This section is a snapshot of what is available at the time of writing (2021). It should be used as a baseline and supplemented with more up-to-date and context-specific information when choosing a data source.

Also note what this section is not designed to provide. It cannot provide a universal list of “good” and “bad” datasets because no such list exists. As noted in “Drawing Inferences from Public Opinion Surveys: Insights for Intelligence Reports” (“Drawing Inferences”) earlier in this Analytic Framework, the appropriateness of a dataset depends on the use case. A dataset that is well suited for addressing one question may be merely adequate for another and completely inappropriate for a third. The researcher must determine their research objective, analytic plan, and set of priorities (e.g., getting an answer quickly is most important even if there are some biases in the results). The researcher’s priorities may then be compared to the strengths and weaknesses outlined here to provide a high-level account of the appropriateness of a dataset.


Probability-Based Surveys Other Than Those Meant to Represent the General Population

It may be easiest to consider what is covered in this type of data by first discussing what is not included. Probability-based surveys that have nearly complete coverage and/or are meant to collect a representative sample of the general population are excluded from this discussion (these types of surveys are covered in the “Drawing Inferences” paper).


In some situations, it may be feasible to field a probability-based survey, but the sample for such a survey cannot or should not be representative of the general population. In such situations, an alternative probability-based design may be more appropriate. Three situations where an alternative design may be appropriate are:

  1. Analysts would like to make inference to the general population, but the sampling frame is missing units, for example, residents of a particular region, members of a minority group, or an age cohort;
  2. Analysts wish to make comparisons among subgroups of adults, but some subgroups (e.g., adults 65 years of age or older, minority ethnic group) may be small, and a representative sample of the population would not yield enough interviews among the small subgroups to produce reliable comparison points; or
  3. Analysts need to study a rare population (e.g., adults with a particular occupation) for which a sample frame is not available and screening a representative sample of a general population frame would be cost-prohibitive.

Incomplete Coverage

Assuming that the USG will undertake primary data collection (i.e., collect new data), researchers can alter the survey methods prior to data collection to accommodate their analytic needs. In the first example, the researchers cannot draw a representative sample of the general population. The resulting estimates may suffer coverage bias (Groves, 2004; see also “Drawing Inferences” for additional details). That is, individuals who are not represented on the frame are different from those who are on the frame. Researchers should assess the risk of coverage bias by asking whether there is a theoretical reason that excluded individuals should have different attitudes than those who are included and whether that difference is large enough to affect the overall estimates. It may be impractical, for example, to interview residents of the Galapagos Islands when conducting a national survey in Ecuador. Even if the residents of the islands have significantly different attitudes than the rest of the country, their exclusion is unlikely to bias national estimates since they represent only 0.2 percent of the total population.2 This assessment can also be applied to secondary data sources, such as surveys conducted by aid organizations that are limited to a subset of regions in a country.
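To make this assessment concrete, the short Python sketch below applies the standard decomposition in which the coverage bias of a mean equals the excluded share of the population multiplied by the difference between the covered and excluded means. The population share and attitude values are hypothetical and chosen only to echo the Galapagos example.

```python
# Illustrative only: how much can excluding a small group bias a national mean?
# Coverage bias of the covered-population mean = excluded share * (covered mean - excluded mean).

def coverage_bias(excluded_share: float, covered_mean: float, excluded_mean: float) -> float:
    """Bias of an estimate based only on the covered population, relative to the full population."""
    return excluded_share * (covered_mean - excluded_mean)

# Hypothetical values: 0.2 percent of the population is excluded, and their support for the
# regime is assumed to be 30 percentage points lower than everyone else's (an extreme case).
bias = coverage_bias(excluded_share=0.002, covered_mean=0.55, excluded_mean=0.25)
print(f"Bias in the national estimate: {bias:.4f}")  # 0.0006, i.e., about 0.06 percentage points
```

Even under this extreme assumption, the national estimate moves by well under a tenth of a percentage point, which is why the exclusion of very small groups is often tolerable.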

If coverage bias is a risk, researchers may consider using a secondary sampling frame, mode, or survey. Continuing with the Galapagos example, it may be impractical to conduct in-person interviews, but a population register may have telephone numbers for these residents, and the survey could be conducted by phone.

___________________

2 https://www.citypopulation.de/en/ecuador/admin/.


If researchers conduct analyses of preexisting survey data, they may seek out a similar survey conducted among the missing population and blend the data to create a single national estimate (for a discussion of blending data, see “Integrating Data Across Sources” [“Integrating Data”] later in this Analytic Framework).

Various weighting methods may also be employed to reduce or remove biases. The Keeter adjustment (Keeter, 1995), for example, which is traditionally used in U.S. telephone surveys to account for nontelephone households, identifies individuals who used to be part of the missing group and weights them up to represent the missing group. Returning to the Galapagos example, a question might ask respondents whether they have lived in the Galapagos within the past 5 years. Individuals who answer affirmatively would be treated as a proxy for the missing inhabitants and weighted accordingly.3 Alternatively, if the sampling frame only undercovers a subgroup of the population without completely excluding it, traditional weighting methods may be used to minimize or eliminate bias from the estimates (Valliant, Dever, and Kreuter, 2013). While it is best to consider the weighting plan during the design phase, it may be possible to construct new weights on secondary or preexisting primary datasets depending on what data are available.
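A minimal sketch of this kind of proxy weighting is shown below, assuming a pandas DataFrame of respondents and hypothetical population counts; it is meant only to illustrate the logic of weighting proxies up to stand in for a missing group, not the exact Keeter procedure.

```python
# Hypothetical sketch: inflate the weights of "proxy" respondents (those who recently lived in
# the excluded area) so that they also represent the residents the frame cannot reach.
import pandas as pd

df = pd.DataFrame({
    "resp_id": [1, 2, 3, 4, 5, 6],
    "base_weight": [100.0] * 6,
    "proxy": [True, False, False, True, False, False],  # lived in the excluded area in past 5 years
})

missing_pop = 800.0          # assumed count of people the frame cannot reach
proxy_pop_on_frame = 200.0   # assumed count of people like the proxies who are on the frame

# Proxies must represent themselves plus the missing group; all other weights are unchanged.
factor = (proxy_pop_on_frame + missing_pop) / proxy_pop_on_frame
df["adj_weight"] = df["base_weight"]
df.loc[df["proxy"], "adj_weight"] = df["base_weight"] * factor

print(df[["resp_id", "proxy", "base_weight", "adj_weight"]])
```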

Finally, if resources and time permit, researchers may wish to invest in creating a frame of the missing units. This may be an especially useful investment if repeated surveys are expected over time. The frame could be built once and used multiple times over several years. NORC at the University of Chicago has used this method to improve coverage4 of the U.S. Postal Service’s Computerized Delivery Sequence (CDS) File list of addresses, a list that undercovers5 rural addresses and Native American reservations, among others. NORC uses listers to identify and append addresses that are missing from or incomplete on the CDS and uses this enhanced frame to draw address-based samples throughout the decade. Similar listing procedures may be used to create a frame of missing housing units. Geospatial sampling, for example, may be useful when no frame is available (Eckman and Himelein, 2020). It involves creating a map of the area of interest and overlaying a sampling grid. Cells of the grid are randomly selected and listers are tasked with enumerating (and in some cases interviewing) all housing units within the sampled cell.
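The sketch below illustrates the mechanics of geospatial sampling described above: lay a grid over a bounding box and randomly select cells for listers to enumerate. The coordinates, grid size, and number of sampled cells are arbitrary placeholders.

```python
# Hypothetical geospatial sampling sketch: overlay a rectangular grid on an area of interest
# and draw a simple random sample of cells; listers would then enumerate housing units in each.
import random

random.seed(2021)

min_lon, max_lon = 43.0, 44.0    # placeholder bounding box
min_lat, max_lat = 36.0, 37.0
n_rows, n_cols = 20, 20          # 400 cells in total

cell_w = (max_lon - min_lon) / n_cols
cell_h = (max_lat - min_lat) / n_rows

cells = [
    (min_lon + c * cell_w, min_lat + r * cell_h,              # lower-left corner of the cell
     min_lon + (c + 1) * cell_w, min_lat + (r + 1) * cell_h)  # upper-right corner
    for r in range(n_rows) for c in range(n_cols)
]

sampled_cells = random.sample(cells, k=25)
print(f"Selected {len(sampled_cells)} of {len(cells)} cells; first sampled cell: {sampled_cells[0]}")
```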

___________________

3 Note that the Keeter adjustment has not been tested for use cases other than the nontelephone population. The example described here is hypothetical and it, along with any other use case, would require validation before it is applied in practice.

4 https://www.norc.org/Research/Projects/Pages/2010-national-sample-frame.aspx.

5 https://www.aapor.org/Education-Resources/Reports/Report-on-Online-Panels.aspx.


Rare Populations

In the second and third examples described above, the researcher is not interested in estimates of the general population but in subgroups. In these situations, the researcher may opt to draw a stratified or clustered sample (Rafferty, n.d.). Stratification exists when the sample frame is divided into mutually exclusive groups. One may, for example, have a national register of individuals that can be divided by ethnic group. More realistically, the frame may be divided by region. Samples are drawn independently for each stratum or group. How each sample is drawn is up to the researcher. If a researcher wanted to ensure a nationally representative sample, he/she could allocate the sample proportionally to stratum size (a probability proportional to size [PPS] approach), selecting the same fraction of individuals from each stratum. If we had a stratified sample of states in the United States, for example, we might draw a sample of 10,000 people in California (0.025 percent of the population) and a sample of 147 people in Wyoming (also 0.025 percent of the population). However, a researcher interested in comparing states would want to adopt a strategy to overrepresent small populations (in this case, residents of Wyoming) to ensure sufficient sample sizes, and thus may wish to select an equal number of sampled persons per stratum (e.g., 1,000 in California and 1,000 in Wyoming). Even with oversamples, sampling weights (or base weights) allow researchers to create estimates representative of the entire population of interest.
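The arithmetic behind these two allocation strategies is simple enough to sketch in a few lines of Python. The population figures below are rounded approximations used only to reproduce the California/Wyoming illustration; base weights show how an equal-allocation oversample can still support estimates for the combined population.

```python
# Illustrative comparison of proportionate vs. equal allocation across two strata.
strata_pop = {"California": 39_500_000, "Wyoming": 580_000}   # approximate populations

# Proportionate allocation: the same sampling fraction (0.025 percent) in every stratum.
fraction = 0.00025
proportionate = {s: round(n * fraction) for s, n in strata_pop.items()}

# Equal allocation: oversample the small stratum so state-to-state comparisons are possible.
equal = {s: 1_000 for s in strata_pop}

# Base (design) weights = stratum population / stratum sample size.
base_weights = {s: strata_pop[s] / equal[s] for s in strata_pop}

print("Proportionate allocation:", proportionate)   # roughly 9,875 and 145
print("Equal allocation:", equal)
print("Base weights under equal allocation:", base_weights)
```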

Moreover, when the rare population is the only population of interest, stratification can be used to improve the efficiency of the sample. For example, let us assume the researcher wishes to survey Iraqi Kurds, who make up only 16 percent of the Iraqi population (Mohamed, 2014). It would be inefficient and cost-prohibitive to draw a representative sample of the general population and screen households for Kurdish inhabitants. Instead, researchers may create three strata (the number is up to the researcher and used here only as an example) based on the population density of Kurds: 1) provinces in Iraqi Kurdistan, 2) large cities with a significant Kurdish population, such as Baghdad and Mosul, and 3) the rest of Iraq. One may opt to draw the majority of the sample in the first two strata and little to none in the third stratum, where Kurds are unlikely to live. This method preserves a probability design while improving the efficiency of the sample by increasing the proportion of Kurds in the sample.

Clustering may also be used to efficiently oversample rare populations. Clustering involves selecting a sampling unit that has more than one eligible person in it (Thomas, 2021). This could be a neighborhood block or a household, among other things. In this case, the researcher would interview multiple people within the unit. A researcher may, for example, interview all eligible household members in a household that contains members of the rare population.


This approach may significantly reduce data collection costs and improve the timeliness of data collection, but it can increase the design effect when individuals within the cluster (e.g., household) have relatively homogeneous attitudes (Kish and Frankel, 1974). This, in turn, reduces the statistical efficiency of the sample. In other words, it introduces more noise into survey estimates, making it more difficult for estimates to reach statistical significance.
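Kish’s widely used approximation makes the trade-off explicit: the design effect is roughly 1 + (m − 1)ρ, where m is the average number of interviews per cluster and ρ is the intraclass correlation of the attitude within clusters. The values in the sketch below are hypothetical.

```python
# Design effect from clustering (Kish approximation) and its impact on effective sample size.
def design_effect(avg_cluster_size: float, icc: float) -> float:
    """deff = 1 + (m - 1) * rho."""
    return 1 + (avg_cluster_size - 1) * icc

n_interviews = 2_000  # hypothetical number of completed interviews

for m, rho in [(1, 0.10), (3, 0.10), (3, 0.40)]:
    deff = design_effect(m, rho)
    effective_n = n_interviews / deff
    print(f"cluster size {m}, ICC {rho}: deff = {deff:.2f}, effective n = {effective_n:.0f}")
```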

Overall, analysis of rare populations (either individually or comparatively) generally requires an oversample that must be planned before data collection. This means that the researcher may have to collect primary data or identify a secondary data source that had similar analytic goals. If neither of these is feasible, researchers may have to turn to nonprobability sources and organic data.

Nonprobability Surveys


The difference between a probability-based and nonprobability survey is how the sample is drawn. The probabilities of selection are unknown for nonprobability samples. This occurs when no frame exists, or a frame exists but the size of it is unknown to the researcher. Nonprobability samples are frequently used in situations similar to the following:

  • A sampling frame is unavailable and cannot be created within the timeline and budget. Researchers may, for example, wish to measure voter turnout and vote choice in real time on the day of the election. A list of registered voters may not be available, may be incomplete, or may not contain contact information (such as phone numbers) to facilitate same-day contact.
  • Researchers are interested in studying a rare population for which stratification cannot sufficiently improve efficiency or is cost-prohibitive. Researchers may wish to interview female sex workers in China in the midst of a sexually transmitted infection epidemic (Verdery et al., 2015). This population is extremely rare, and a population frame is unlikely to have data that could be used to stratify it. Moreover, these individuals are likely untrusting of authorities, so “cold” contact from a probability-based frame is unlikely to be sufficient to gain cooperation.
  • Cost constraints make a probability-based design infeasible.

Nonprobability Online Panels

There are multiple types of nonprobability surveys that may be used to facilitate the above research objectives (Vehovar, 2016). One option may be the use of nonprobability online panels (Kennedy, 2018).


Individuals receive internet pop-ups or invitations through loyalty programs (among other means) inviting them to join a survey panel. Once they opt to join the panel, the panel vendor will invite them to complete various surveys (sometimes many per week). While it may be possible to measure how many people see the invitation to join the panel, the population is unknown, so a probability of selection for joining the panel cannot be calculated.

Nonprobability panels offer several advantages. They are generally less expensive than probability-based designs. Panel vendors typically collect a lot of information from the panelists (e.g., demographics, rare health attributes), making it possible to target specialty subgroups (e.g., veterans) or stratify the sample in ways not possible with most probability-based frames. Some online panel vendors also “route” respondents (Unangst et al., 2019), purposely sampling panelists that they believe have the attributes of interest (e.g., a specific political party affiliation). While this improves efficiency and reduces costs, it also yields a purposive, not random, sample. Panel surveys are also conducted (primarily) by an email invitation to complete a web survey, making them much more timely than some probability-based designs.

Some panel vendors may implement quota sampling,6 a method in which targets are set for specific subdomains. The researcher may, for example, require that 50 percent of completed interviews come from men and 50 percent from women. Once the target has been reached for a given subdomain, it is “closed.” Potential respondents who fall into closed subdomains are deemed ineligible and not asked to complete the rest of the survey. Quota sampling can ensure that the sample distribution mimics the general population or, if studying a rare population, ensure sufficient sample sizes for small subdomains. It can also be cost-efficient when using postpaid incentives, since the researcher does not need to pay incentives for “extra” interviews in the closed subdomains. It also helps to limit the amount of variance introduced by weighting adjustments. However, quota sampling can also create a false sense of security. Individuals within a given subdomain (e.g., men) are often not representative of the group as a whole. They may be more cooperative or socially engaged, easier to identify a priori, or may differ on other attributes. Quota sampling often fails to correct biases in point estimates of nonprobability surveys (DiSogra et al., 2011).
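The sketch below illustrates the quota logic described above, closing a subdomain once its target is reached; the targets and the simulated stream of panelists are hypothetical.

```python
# Minimal quota-sampling sketch: screen respondents out once their subdomain's quota is filled.
targets = {"men": 500, "women": 500}
completed = {"men": 0, "women": 0}

def screen(subdomain: str) -> bool:
    """Return True if the respondent may continue; False if the subdomain is closed."""
    if completed[subdomain] >= targets[subdomain]:
        return False          # quota closed; respondent deemed ineligible
    completed[subdomain] += 1
    return True

# Simulate a stream of panelists that skews toward one group, as panels often do.
stream = ["men"] * 700 + ["women"] * 450
accepted = sum(screen(s) for s in stream)
print("Completed interviews:", completed, "| total accepted:", accepted)
```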

However, nonprobability panels generally suffer from undercoverage (Baker et al., 2010). They often do not include individuals who are not online, and they skew wealthier, more educated, and younger. Other coverage issues may be apparent in different countries. These biases generally skew point estimates. Weights and modeling approaches frequently fail to correct the biases, and in some cases they make them worse (Dutwin and Buskirk, 2017).

___________________

6 https://www.questionpro.com/blog/quota-sampling/.


Nonprobability samples also preclude the use of some statistical methods. Probabilities of selection, for example, cannot be calculated, so standard errors and confidence intervals cannot be accurately produced (Baker et al., 2013).7

Moreover, not all nonprobability panels are created equal (Kennedy et al., 2016). Analysts should consider the methods for panel creation and panel maintenance. The European Society for Opinion and Marketing Research (ESOMAR) has published a list of questions (and explanations of the importance of each question) that researchers should ask vendors when considering whether to use their panel/data and/or evaluating different vendors (ESOMAR, 2021). Some vendors have published answers to these questions on their websites, but all should be willing to answer these questions. If they are not willing to provide this information, that may be an indicator of poor practices and poor data quality. Moreover, researchers have developed a framework that outlines the necessary conditions that must be met in nonprobability surveys to produce unbiased data (Mercer et al., 2017).

Intercept Surveys

A second nonprobability design is the intercept survey. Intercept surveys involve “standing” in a particular location and interviewing all or a fraction of individuals who pass by. Exit polls are an example of in-person intercept surveys in which interviewers stand outside of voting precincts and ask people leaving the precinct for whom they voted (AAPOR, 2007). Intercept surveys may also be conducted online using a method called river sampling. Just like some of the recruitment methods used to construct nonprobability panels, an ad or banner is published on a website and individuals who click on it (think of fishing in the river and the fish that bite the line) are asked to complete a survey. Ad managers such as Facebook’s Ads Manager8 can use browsing history, social network group memberships, and other metadata to help target the survey to specific individuals (e.g., social activists).

Similar to nonprobability panels, intercept surveys are frequently less expensive than probability-based designs. In the case of online intercepts, they can also be used to collect large amounts of data quickly, and metadata targeting can be used to improve the efficiency of the sample in identifying individuals from rare populations. In-person intercept surveys (e.g., exit polls) can also produce relatively accurate estimates when the placement of the interviewers facilitates a random selection of potential locations and rules guide interviewers on whom to sample (Armstrong, 2019).

___________________

7 Note that this limitation often does not prevent researchers from attempting to create and report inaccurate standard errors.

8 https://poweradspy.com/facebook-ads-market-research-survey-campaigns/.


For example, a design in which a random sample of polling precincts is selected and interviewers are instructed to attempt to interview every 10th person who exits would have a good chance of producing accurate estimates. A hand-picked selection of precincts in which interviewers simply must interview 100 voters would not.
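A sketch of that kind of systematic selection rule is below; the precinct count, number of exiting voters, and sampling interval are invented for illustration.

```python
# Hypothetical exit-poll selection: random sample of precincts, then every 10th exiting voter,
# with a random start so the pattern is not predictable.
import random

random.seed(7)

precincts = [f"precinct_{i:03d}" for i in range(1, 201)]   # 200 precincts (illustrative)
sampled_precincts = random.sample(precincts, k=20)

def voters_to_approach(n_exiting_voters: int, interval: int) -> list:
    start = random.randint(1, interval)                     # random starting voter
    return list(range(start, n_exiting_voters + 1, interval))

for precinct in sampled_precincts[:3]:
    approached = voters_to_approach(n_exiting_voters=850, interval=10)
    print(precinct, "-> approach", len(approached), "voters; first few:", approached[:5])
```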

Intercept surveys may, however, suffer weaknesses similar to those of other nonprobability designs, such as coverage bias, limitations on statistical analyses, and differences in data quality due to nuances between intercept designs. In some instances, intercept surveys may yield more biased data than other types of nonprobability sample surveys (Lehdonvirta et al., 2020). In-person exit polls have, moreover, suffered differential nonresponse in which one party is more likely to participate, introducing nonresponse bias (Silver, 2008). In recent years, the accuracy of exit polls has been further challenged by the coronavirus pandemic, which has led to a higher proportion of early voting in several countries (Bronner and Rakich, 2020). If election day voters differ from early voters and this phenomenon is not correctly accounted for in the models, the exit polls will be inaccurate.

Chain-Referral Methods (Snowball Sampling and Respondent-Driven Sampling)

A final type of nonprobability survey uses chain-referral methods such as snowball sampling (Goodman, 1961) and respondent-driven sampling, or RDS (Heckathorn, 1997). These designs are used to study networks as well as rare populations in which members of the population likely know each other or may be untrusting of outsiders, such as the researcher. They are most commonly used to study at-risk populations in public health (e.g., homelessness [Bernard, Daňková, and Vašát, 2018]), but have also been used to study rare indigenous populations (Mullo, Sánchez-Borrego, and Pasades-del-Amo, 2020) and could be useful to study other groups.

The samples start with identifying some eligible individuals (often called “seeds”), interviewing them, then asking them to recruit or refer other eligible individuals (“sprouts”). Researchers attempt to interview the sprouts, ask them for referrals to others, and so on.

There are two distinctions9 between snowball sampling and RDS that deserve note as they affect data quality. First, snowball samples use nonprobability methods to identify seeds, whereas RDS may use either nonprobability or probability methods. Second, snowball samples do not attempt to track or control for who referred whom. RDS samples do.

___________________

9 https://www.statisticshowto.com/respondent-driven-sampling/.


As a whole, RDS uses snowball sampling procedures and applies mathematical models to account for probabilities of referral (i.e., selection) in an attempt to eliminate biases. This makes RDS arguably less prone to biases but much harder to implement.
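The invented example below shows why tracking referrals and network size matters: one common family of RDS estimators down-weights respondents who report large personal networks, since they are more likely to be recruited. It is a simplified illustration, not a full RDS estimator.

```python
# Simplified, hypothetical illustration of degree-based RDS weighting.
import pandas as pd

df = pd.DataFrame({
    "resp_id":         [1, 2, 3, 4, 5, 6],
    "recruiter":       [None, 1, 1, 2, 3, 3],   # who referred whom (seeds have no recruiter)
    "degree":          [40, 5, 12, 8, 20, 4],   # reported number of eligible people known
    "supports_policy": [1, 0, 1, 0, 1, 0],
})

# Weight each case by the inverse of its reported network size.
df["rds_weight"] = 1.0 / df["degree"]

naive = df["supports_policy"].mean()
weighted = (df["supports_policy"] * df["rds_weight"]).sum() / df["rds_weight"].sum()
print(f"Unweighted estimate: {naive:.2f} | degree-weighted estimate: {weighted:.2f}")
```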

Chain-referral methods such as snowball sampling and RDS are among the most efficient methods for identifying extremely rare populations. When the population is at risk or the topic is extremely sensitive, this approach may be more successful than other nonprobability methods (e.g., an online intercept survey ad placed on relevant websites). Referrals aid in gaining cooperation and building rapport in a way that is more difficult with a cold call.

Unfortunately, the use of chain-referral methods may also introduce several challenges. These methods can be costly and labor-intensive, and building a sufficient sample size can be time-consuming. It can also be difficult to adequately define eligibility criteria and to frame network size questions (Johnston et al., 2008). Some seeds may not spawn enough sprouts, limiting the sample size and introducing bias into the estimates (Truong et al., 2013). Additionally, the models used in RDS are not always successful in eliminating bias, and RDS often increases variance in the estimates (Goel and Salganik, 2010; McCreesh et al., 2012), further limiting statistical efficiency.

Ultimately, nonprobability surveys are often less expensive and less time-consuming to implement, and they may be better equipped to identify rare populations. However, they are also much more likely to suffer from biases due to coverage error and unknown (or known but uncontrollable) sampling error. Moreover, the variation in data quality makes it critical that the IC ask many questions regardless of whether the data are primary or secondary. If primary, the questions may follow the ESOMAR list and aid in the choice of a vendor. If secondary, the IC should obtain and review the methodology report that may accompany the data.

Qualitative Data Collection


A final designed data source is qualitative data collection. Qualitative data collection includes a variety of research methods that summarize data using nonstatistical means (Busetto, Wick, and Gumbinger, 2020). Data may still be collected using questionnaires or interviews, but they are typically summarized in narrative form. This tool may be especially useful when conducting foundational or exploratory research to understand cultural norms and societal practices. It is also useful for deeper dives into “why” and “how” questions. For example, a survey may be useful to measure general attitudes toward individuals with HIV/AIDS, whereas qualitative research may be more appropriate to collect information on how individuals perceive risk factors for contracting HIV (Tarimo et al., 2013).

Qualitative research can generally take six forms.10 This paper focuses on the two most relevant for USG researchers: one-on-one in-depth interviews (that is, semi-structured interviews) and focus groups. For more information on other types of qualitative research, see Merriam and Tisdell (2009).

Similar to a survey, one-on-one in-depth interviews include an interviewer and respondent and a basic set of questions. Unlike a survey, the questionnaire is a guideline, and the conversation is much less standardized. The interviewer is free to follow up with respondents on what they said. Most if not all of the questions are open-ended, unlike most survey questions that have a set of response options.

___________________

10 https://www.questionpro.com/blog/qualitative-research-methods.


Focus groups are similar to one-on-one interviews except that they are conducted with a group of respondents at once and allow one respondent to build on or negate what another respondent said. In general, one-on-one interviews are more time-consuming than focus groups but are more appropriate for sensitive topics where it may be inappropriate, embarrassing, or risky to share details in front of others.

Because both one-on-one interviews and focus groups are analyzed qualitatively, fewer cases are needed than in traditional surveys, which require sufficient sample sizes for quantitative analyses. The total number of cases is often driven by affordability, but some research suggests that most themes are discoverable in as few as three focus groups or between 5 and 50 in-depth interviews (Dworkin, 2012; Guest, Namey, and McKenna, 2016). These counts should only be used as a starting point. The actual number of interviews or focus groups required will depend on the topic, variability of attitudes, and recruiting methods. A good rule of thumb is that data collection may stop when each new interview or focus group is adding little to no new information. Because qualitative research typically requires far fewer cases than quantitative methods, it is often less resource-intensive in terms of time, labor, and price.11
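The rule of thumb about stopping when little new information emerges can be tracked quite mechanically. The sketch below, with invented theme codes standing in for output from qualitative coding, simply counts how many previously unseen themes each additional interview contributes.

```python
# Hypothetical saturation check: count new themes contributed by each successive interview.
interviews = [
    {"trust", "cost"},        # interview 1
    {"trust", "access"},      # interview 2
    {"cost", "fear"},         # interview 3
    {"trust", "fear"},        # interview 4
    {"access"},               # interview 5 adds nothing new
]

seen = set()
for i, themes in enumerate(interviews, start=1):
    new = themes - seen
    seen |= themes
    print(f"Interview {i}: {len(new)} new theme(s): {sorted(new)}")
# When several consecutive interviews stop adding new themes, data collection may stop.
```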

Even with fewer cases, it is important to achieve a sample of individuals that represents the population of interest. To this end, a researcher may place an ad to recruit potential individuals and ask a series of screening questions to those that express interest in participating. Screening questions may be based on a series of demographics or correlates of the outcome of interest and may be used to ensure a representative distribution. Individuals in overrepresented groups (e.g., wealthy, higher educated) may be told that the study is full and that they have been placed on a wait list. To get a representative sample, it may be necessary to recruit from multiple sources. This could include posting on a listserv, placing an advertisement on social media (similar to river sampling as discussed in the nonprobability section) and/or partnering with local leaders (e.g., community center managers) to post fliers or communicate the opportunity by word of mouth. Recruitment rules may also be required to limit the number of individuals per household that may participate and screen for past participation in these types of studies.

In addition to recruitment techniques, researchers also need to select an appropriate interviewer or moderator, the person who will conduct the interviews or focus groups (Morrison-Beedy, Côté-Arsenault, and Feinstein, 2001).

___________________

11 While total costs for qualitative research may be less than surveys, the cost per completed interview is often higher since interviewers who conduct qualitative research typically require more skills, experience, and training than survey interviewers, incentives are generally higher, and each interview is longer than many surveys.


Moderating is difficult and requires a skilled individual who is familiar with the research objectives, knows when to probe further, knows when to sit quietly and give respondents time to voice their opinions, is able to pick up on cues (verbal and nonverbal), does not lead the respondents, and can navigate different personalities in group settings. When choosing moderators, consider whether they have special training in focus group and in-depth interviewing. This may include university coursework or special accreditation from institutes such as the Burke Institute.12 It may also be beneficial to hire different moderators for different types of respondents. For example, it may facilitate more open conversation to have a Black American moderate a group of Black Americans and an Asian American moderate a group of Asian Americans.

Researchers also need to equip the moderator with the appropriate tools. This will include developing items such as an interviewer guide, an informed consent text, a screening and recruitment protocol, and training on the purpose of the study. Just like developing a sampling plan and questionnaire for a survey, developing these types of materials takes time and multiple rounds of review but is made easiest when building off of a clear research question.

While qualitative research can offer rich information that is often not available from a survey, it is not without its limitations. Most importantly, qualitative data make it more difficult to produce quantifiable statistics. Researchers may instead want to use qualitative analytical techniques to identify themes and ideas (Warren, 2020). This is a specialized skill set that requires an individual trained in qualitative data analytics (often a social scientist, historian, or anthropologist). Moreover, the use of formal qualitative methods and trained staff minimizes the risk of subjective interpretation of the data.

Additionally, if researchers are not careful to recruit a representative sample of the population, the narrative may suffer significant biases, overemphasizing one perspective or missing several opinions. This can occur even when quotas are set. Let us assume, for example, that researchers place age quotas on recruitment and then place recruitment materials around school campuses to recruit the youngest age group. While this should yield a sufficient count of young adults, the young adults will likely skew more educated and wealthier than the population as a whole. As mentioned above, researchers may need to recruit from multiple sources, balance on multiple demographics, and consider what other variables (e.g., sufficient tech savviness for handling online surveys) may be correlated with the outcome of interest when evaluating whether recruited individuals likely represent all types of opinions and attitudes.

___________________

12 https://www.burkeinstitute.com/SeminarDescription/Detail/Q01.


Social Media

The most common type of organic data used to study attitudes is social media data, which include but are not limited to text-, video-, and image-based posts and metadata such as hours spent on the social media platform, number of friends, and click rates on ads.


For the purposes of this discussion, social media data do not include nonprobability surveys fielded on social media platforms (see the discussion of intercept surveys above for more information on these types of data).

Research conducted in this space has covered topics from climate change to consumer confidence to election outcomes (Ceron et al., 2013; O’Connor et al., 2010;13 Williams et al., 2015) and has used data from Facebook, Twitter, Reddit, Wikipedia, Google, and Instagram, among others. Social media data may assist researchers in:

  • Providing an immediate reaction to a sudden event such as a coup or natural disaster (e.g., text analytics of posts containing relevant key words)
  • Estimating turnout for an upcoming rally or protest (e.g., number of likes to an event’s page compared to other recent events in the country)
  • Understanding some types of networks such as the influence of party leaders (e.g., number of Twitter followers or retweets)

The largest benefit of social media data is its real-time nature. Unlike surveys, which may take significant time to plan and field, individuals react to news via social media quickly. Social media data are also stored and, in some cases, offer a historical record. This may allow researchers not only to gain a snapshot of attitudes as events unfold but also to analyze data leading up to the event. Retrospective analyses are rare in surveys because prior time points may not have been collected or because asking respondents to recount past attitudes creates measurement error due to forgetting, telescoping, or the reframing of past attitudes in light of current events (Tourangeau, Rips, and Rasinski, 2000). As such, social media data may be the most appropriate data source to use while unexpected events are unfolding.

While social media data are often considered free or inexpensive and easy to access, this is frequently a misconception. On the one hand, tools such as the Twitter14 and Reddit15 application programming interfaces (APIs) are free to use and offer a variety of options to download different types of data. On the other hand, many social network platforms are entirely inaccessible, making access rather than cost the driving factor. For example, messages and phone calls on WhatsApp use end-to-end encryption, which means that no one, not even WhatsApp administrators, can access them (Barrett, 2021).16

___________________

13 Note that the trend identified by O’Connor and colleagues (2010) did not hold over time (Conrad et al., 2019).

14 https://developer.twitter.com/en/products/twitter-api/academic-research.

15 https://www.reddit.com/dev/api/.


Even platforms that offer free access may limit the quantity or format of available data. Google Trends,17 for example, provides normalized descriptive statistics of a sample of Google searches, but it does not provide a full dataset of all searches. Data access may also differ depending on the organization. The Zillow API,18 for example, allows any user to scrape a finite number of data points per day, but Zillow also has a full database, ZTRAX,19 that is accessible only to academics. Additionally, some companies may require lengthy contract negotiations before the data may be accessible. Such negotiations are often labor-intensive and costly.

Even platforms that do allow free access can be expensive to analyze. Many data sources require advanced programming skills in languages such as R or Python, comprehension of nonrectangular datasets (e.g., JSON files), or advanced analytic skills (e.g., text analytics, network analysis). Individuals who typically analyze survey data will likely need additional training to analyze social media data, and acquiring those skills takes time. That time affects both cost and how quickly social media data can be analyzed, offsetting the real-time benefit of social media data.
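As a flavor of what that processing looks like, the sketch below reads a hypothetical local JSON export of posts and counts keyword mentions by day. The file name, field names, and keywords are assumptions, not a real platform’s schema or API.

```python
# Hypothetical sketch: tally keyword mentions by day from a JSON export of social media posts.
import json
from collections import Counter
from datetime import datetime

KEYWORDS = {"protest", "strike", "rally"}   # illustrative keywords

def load_posts(path: str) -> list:
    # Expects a JSON array of objects like {"created_at": "2021-06-01T10:00:00", "text": "..."}
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def mentions_by_day(posts: list) -> Counter:
    counts = Counter()
    for post in posts:
        text = post.get("text", "").lower()
        if any(word in text for word in KEYWORDS):
            day = datetime.fromisoformat(post["created_at"]).date()
            counts[day] += 1
    return counts

if __name__ == "__main__":
    posts = load_posts("posts_export.json")   # hypothetical local file
    for day, n in sorted(mentions_by_day(posts).items()):
        print(day, n)
```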

Social media data also suffer from significant biases. Most importantly, only a small proportion of the total population is on any given social media platform. For example, while over 24 million people in India use Twitter (Statista, 2022), this only accounts for 1.6 percent of the total population. Even fewer post to the platform on a regular basis, and users are significantly more likely to be male than the general population (Hootsuite, 2019). In the United States, social media users skew younger and/or more liberal than the general population (Wojcik and Hughes, 2019). To the extent that demographics are correlated with the outcome of interest, estimates will suffer from sizable coverage bias when attempting to make population estimates.

Social media, like surveys, is also subject to measurement error. That is, social media content may not be accurate. Individuals may wish to present their “best” self, so their posts do not paint the full picture of their attitudes and beliefs (Schober et al., 2016). In some countries, they may also avoid posting information on sensitive topics (e.g., critiques of the regime) out of concern for retaliation. This type of filtering suggests that social media data may suffer from missingness error and may not represent all viewpoints as they exist in the population. Other forms of error are introduced by “fake” accounts.

___________________

16 International privacy laws are rapidly changing, and access to a number of platforms is likely to change over time.

17 https://support.google.com/trends/answer/4365533?hl=en.

18 https://documenter.getpostman.com/view/9197254/SzRuZCCj?version=latest.

19 https://www.zillow.com/research/ztrax/.


Over half of all links to popular websites are likely posted by bots (Wojcik et al., 2018). Including tweets from these automated accounts can add significant noise and bias into all forms of analysis.

Given space limitations, this discussion is limited to the largest and most common sources of bias, but it is critical to understand all potential error sources when considering whether or how to use social media data. Luckily, several researchers have outlined frameworks for considering error in social media data and, more broadly, Big Data (Amaya, Biemer, and Kinyon, 2020; Hsieh and Murphy, 2017). We recommend consulting these frameworks for a deeper understanding of the error sources, for brainstorming the different types of social media available, and for identifying potential solutions. Given these significant biases, however, the use of social media data should generally be limited to the scenarios outlined above. Specifically, the strongest social media analyses typically limit inference to active social media users (though this population may be of little interest to the IC), assess social networks, or are used as a last resort when data are needed immediately and no other source is available.

Other Data Sources

A host of additional data sources rarely measure social attitudes directly but may offer important insights into the economic, social, environmental, and cultural welfare of a nation. Because they are not used to measure attitudes, we have limited discussion of these other data sources to a brief summary in Table 3B-2. References are included to provide researchers with the tools to learn more about each data source.

These data sources are constantly evolving. As a result, researchers should conduct a thorough review of the current literature to stay up to date on best practices for using and evaluating such datasets. Moreover, the nascent nature of these data sources means that examples of their use and evaluation are typically limited to the United States and Western European countries.

BLENDED DATA

As alluded to in the previous section, there may be situations where it is useful or necessary to blend multiple data sources. While there are multiple ways data may be blended (de Leeuw, 2005), this section is limited to the three most common approaches. Moreover, methods on how to blend data are discussed in the “Integrating Data” paper. This section discusses types of blended data and their strengths and weaknesses.


TABLE 3B-2 Summary of Popular Alternative Data Sources Used to Measure Societal Trends Other Than Attitudes

Images

Definition: Pictures or videos that may be coded (manually or using machine learning) for a given feature. They may be collected as part of a survey or organically.

Strengths: May provide more detailed or nuanced information than is available from a single survey question. May be collected both as part of a designed survey and in organic data (e.g., Instagram).

Weaknesses: Images typically need to be coded first, requiring machine learning programs or significant labor investment. Suffer from different errors than surveys, such as an incomplete picture of the situation.

Unique Ethical Considerations: PII (faces, license plates, addresses) should be obfuscated.

Examples: FoodAPS experimented with having respondents upload images of receipts instead of keeping a diary (May, 2018). Satellite imagery of night light pollution has been used to identify areas of poverty (Yeh et al., 2020). Drone pictures have been used to estimate crowd size at political rallies (Choi-Fitzpatrick and Juskauskas, 2015).

Sensors

Definition: Generally used to capture data on a single or related set of variables over time. May include geolocation sensors on cell phones, wearable devices, or temperature gauges, among other things.

Strengths: Provide information that is difficult or impossible to collect through surveys. Often provide repeated measures over time.

Weaknesses: Organic sensor data may come from multiple platforms (e.g., FitBit, Apple), requiring significant processing. Individual-level data often require explicit consent. Suffer from errors not found in surveys (e.g., dead batteries).

Unique Ethical Considerations: In some situations, researchers may be legally or ethically compelled to turn over GPS data to aid in a criminal investigation.

Examples: Researchers have tested the feasibility of using a FitBit to measure physical activity (Eckman, Furberg, and Amaya, 2018). Cell phone geolocations have been used to study migration patterns (Lai et al., 2019). Air quality is measured using sensors at 342 monitoring stations in 240 cities across India (Singh, 2019).

Specimen Collection

Definition: Typically collected with a survey; the researcher may collect samples (e.g., blood, urine, soil, water) to be analyzed in a lab.

Strengths: Provides a more complete picture of the correlates, moderators, and mediators of attitudes. Avoids some biases associated with surveys (e.g., nonresponse, measurement).

Weaknesses: May require specialized data collection tools, data collectors, or analysts, adding complexity, time, and expense. Often more invasive than a survey (e.g., drawing blood).

Unique Ethical Considerations: It may be necessary to share the results of lab tests with respondents and provide additional resources, or to report the results to authorities (e.g., HIV-positive test results).

Examples: The Dioxin Exposure Study supplemented survey data with soil, river, house dust, and blood samples to assess the effect of chemical contamination on various factors (University of Michigan, n.d.). Researchers in Cary, North Carolina, collected wastewater samples across the city to test for opioid exposure (Duvallet et al., 2020).

Administrative

Definition: Typically organic secondary data that exist in rectangular datasets but were designed for administrative purposes (e.g., medical records, credit card transactions, individuals receiving unemployment benefits).

Strengths: May suffer from less measurement error than survey data (e.g., date of vaccination). Provide information that is difficult or impossible to collect through surveys.

Weaknesses: Often considered confidential by businesses or governments, making them inaccessible. May be outdated or incomplete.

Unique Ethical Considerations: Medical records may require higher security and transfer requirements than other types of PII.

Examples: Zillow and other real estate databases have been tested to gather information about housing features (Amaya et al., 2020). The National Immunization Survey supplements survey data with immunization records from medical providers (CDC, n.d.).

Different Frames, Same People, Different Variables

In this scenario, different variables from the same person are collected using different means. For example, a survey that is primarily administered in person by an interviewer may include an audio computer-assisted self-interview (ACASI) component for sensitive items, in which the interviewer hands a set of headphones and the computer to respondents so that they can listen to the questions and provide answers without the interviewer seeing the responses. Researchers may also include a consent question on a survey, asking permission to append other information about the individual, such as Twitter use or employment records, to the respondent’s survey data (Trappmann et al., 2010).

Collecting variables for the same individual from different sources can improve data quality, reduce respondent burden, and may reduce costs. Some questions are prone to measurement error because respondents do not know or do not want to provide the correct answer. While interviewer-administered surveys may be best for gaining cooperation or reaching a population with low literacy, they may also introduce more measurement error on sensitive questions (which questions are considered sensitive depends on the country) (Gribble et al., 2000). An alternative data collection mode may improve data quality by removing the interviewer and facilitating more truthful responses. In situations where respondents do not know the information in question (e.g., what types of attitudes they post on social media), linking the information directly may also reduce the burden on the respondent to find or recall it and provide more accurate information (Boudreaux et al., 2015). In some cases, linking to alternative data sources is nearly free and may be less costly than adding questions to a survey, which increases interviewer labor costs.

These types of blended data also come with weaknesses. There may be additional setup costs to program an alternative mode or an API to pull organic data from its original source. Data linkage to other sources often requires consent that some individuals may not be willing to provide; if individuals who do not consent differ from those who do, analyses of the blended data may be biased (Sakshaug and Kreuter, 2012). Data linkage across data sources also requires matching (Harron et al., 2017). In some cases, this is relatively straightforward, but in other cases it can be time-consuming, algorithmically difficult, and prone to inaccurate matches (Lohr and Raghunathan, 2017). For example, the individual’s name may be misspelled in one record, or the person may have moved, making it difficult to match using the address.
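
As a rough illustration of why matching can be difficult, the sketch below links hypothetical survey records to hypothetical administrative records using a simple string-similarity score from the Python standard library; the names, ZIP codes, and thresholds are invented, and production linkage typically relies on dedicated record-linkage software and blocking strategies (see Harron et al., 2017).

```python
# Minimal sketch of matching records on noisy identifiers. All values and
# thresholds are hypothetical.
from difflib import SequenceMatcher

survey = [{"id": "S1", "name": "Maria Gonzales", "zip": "27511"},
          {"id": "S2", "name": "John Smith", "zip": "10001"}]
records = [{"id": "R9", "name": "Maria Gonzalez", "zip": "27511"},   # misspelled surname
           {"id": "R7", "name": "Jon Smith", "zip": "10003"}]        # moved, new ZIP

def name_similarity(a, b):
    """Return a 0-1 similarity score between two name strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

for s in survey:
    for r in records:
        score = name_similarity(s["name"], r["name"])
        same_zip = s["zip"] == r["zip"]
        # Accept a link only when the name is close AND the geography agrees;
        # otherwise flag the pair for clerical review.
        if score >= 0.9 and same_zip:
            print(f"{s['id']} <-> {r['id']} linked (score={score:.2f})")
        elif score >= 0.8:
            print(f"{s['id']} <-> {r['id']} possible match, review (score={score:.2f})")
```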


Different Frames, Different People, Same Variables

Another type of blended data exists when different methods are used to collect the same information from different people. In some situations, it may be too expensive to collect all interviews using the most rigorous method, so a secondary method is used as a supplement. For example, researchers may opt to administer the same survey using both a probability-based sample and a nonprobability sample, then combine the interviews obtained from each source and analyze them as one dataset. Multimode surveys are another way to blend multiple methods to save money: invitations to complete the survey via the least expensive mode (e.g., web or interactive voice response [IVR]) are sent first, and people who do not respond are invited to complete the survey in a more expensive mode (e.g., by mail). In some cases, a third, even more expensive method may be used to gain cooperation from the most difficult-to-reach individuals (e.g., in person). This approach is used by most U.S. Census Bureau surveys, such as the American Community Survey (U.S. Census Bureau, 2014).

Some of these blended data sources are relatively easy to work with; others are more challenging and carry greater risks. In general, using a dataset that has a single sampling frame but multiple modes is straightforward and does not require special weighting or analytic methods. Blended data with multiple probability-based frames require some customization. It is necessary to deduplicate the frames prior to data collection (Murphy, Pulliam, and Lucas, 2004); in other words, the same person should not have a probability of selection on both frames. If deduplication is impossible or impractical, then it is necessary to collect information on the survey to determine whether the respondent could have been sampled from the other frame. These data should be incorporated into the base weight calculation so that each weight reflects the respondent’s overall probability of selection (Buskirk and Best, 2012). While this sounds complicated, these methods are well documented (Baffour et al., 2016), and if the selection probabilities are accounted for, researchers may treat multiframe surveys much like single-frame probability-based surveys. Blended samples that use at least one nonprobability frame are the most challenging to account for statistically, but it is possible to blend probability and nonprobability samples using methods such as those developed by Dever (2018). For more details on statistical techniques to blend these types of data, see the “Integrating Data” paper.
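
The sketch below illustrates one common form of the base-weight calculation under the simplifying assumption that the two frames are sampled independently; the selection probabilities are hypothetical, and the weighting literature cited above (Buskirk and Best, 2012; Baffour et al., 2016) covers the full range of estimators.

```python
# Minimal sketch of a dual-frame base-weight adjustment, assuming the two
# frames are sampled independently. Probabilities below are hypothetical.

def dual_frame_base_weight(p_frame_a, p_frame_b):
    """Base weight = 1 / overall inclusion probability.

    For a person eligible on both frames, the chance of being selected at
    least once is p_A + p_B - p_A * p_B (independent draws). A person on
    only one frame has p = 0 for the other frame.
    """
    p_overall = p_frame_a + p_frame_b - p_frame_a * p_frame_b
    return 1.0 / p_overall

# Respondent reachable on both an address frame and a phone frame.
print(round(dual_frame_base_weight(1 / 5000, 1 / 8000), 1))   # ~3077.2
# Respondent reachable only on the address frame.
print(round(dual_frame_base_weight(1 / 5000, 0.0), 1))        # 5000.0
```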

Different Frames, Different People, Different Variables (Same Construct)

Finally, blended data may be defined as the use of multiple data sources to measure the same construct. This could be the use of variables from different sources that are meant to measure the same thing to inform decisions. In this case, the analysis is done on each variable or data source individually, but the results from all sources are considered in addressing the research question. Economists, for example, use multiple indicators of economic strength to make policy decisions and predict future growth (Amadeo and Anderson, 2022). Alternatively, one may use a single dataset but conduct the analysis in multiple ways to determine its sensitivity, observing whether different analytical approaches yield the same results. All statistical tests include assumptions (e.g., a t-test assumes independent observations), but data do not always fulfill each assumption. Using multiple approaches to analyze the same set of data allows researchers to minimize the threat of statistical error due to the weakness of a given analytic test. Using multiple measures (either from different data sources or from different analytic approaches) may be time-consuming, but doing so provides the researcher with the most confidence that the results are robust and accurate, even more so than a single analysis of a single probability-based general population survey.
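
As a simple illustration of this kind of sensitivity check, the sketch below analyzes the same hypothetical data with a parametric and a nonparametric test and compares the conclusions; the data and group labels are invented.

```python
# Minimal sketch of a sensitivity check: analyze the same (hypothetical) data
# with two tests that make different assumptions and compare the conclusions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
group_a = rng.normal(loc=3.2, scale=1.0, size=200)   # e.g., trust scores, region A
group_b = rng.normal(loc=3.5, scale=1.3, size=180)   # e.g., trust scores, region B

# Parametric test (Welch's version relaxes the equal-variance assumption).
t_stat, t_p = stats.ttest_ind(group_a, group_b, equal_var=False)

# Nonparametric alternative (no normality assumption).
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"Welch t-test:   p = {t_p:.3f}")
print(f"Mann-Whitney U: p = {u_p:.3f}")
# If both approaches point to the same conclusion, the finding is more robust
# to the assumptions of any single test.
```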

All in all, blending data using any of the methods described above can yield important and sizable advantages for cost and data quality. However, if collecting primary data, it is important to design the study in a way that facilitates blending. A researcher could, for example, use the same question wording and collect the same information when blending data from multiple frames and multiple people, or add a confirmation screen for variables like email addresses or the spellings of names that will be used in the linkage process.

ETHICAL CONSIDERATIONS

In addition to considering data sources that are accurate, timely, and budget-conscious, researchers must also consider the ethical nature of the data. When considering the use of any of the data sources discussed in this chapter, researchers should consider three questions:

  • How were the data collected, including what data were collected?
  • How were/are they stored?
  • How are they used?

When collecting data, it is critical to ensure that U.S. laws as well as laws in the country of interest are adhered to. This may include submitting the research protocol to a U.S. or in-country institutional review board (APA, 2017). For survey data, this also includes collection of informed consent from adults and consent or assent from minors, depending on the situation (Office for Human Research Protections, 2016). Participants should understand what they are being asked to do, what their rights are, and how their data will be used. However, ethical data collection involves more than informed consent. If researchers are collecting information on sensitive items, then special attention should be paid to making resources available for adverse reactions (e.g., providing crisis telephone numbers when asking about suicide). Proper cultural norms should be followed to make sure all respondents feel comfortable and at ease; for example, it may be inappropriate to use an interviewer who is the opposite sex of the respondent. Interviewers should also be properly trained to avoid any type of coercion, such as threatening respondents into participating or attempting to elicit preselected responses. Not only do these protocols ensure data are collected ethically, they also improve data quality: respondents who do not trust the researchers are prone to provide inaccurate answers.

In some instances, it may also be possible to collect informed consent for organic data. For example, if one is appending social media data to an individual’s survey data, it would be necessary and possible to ask for consent to link the data in the survey. In other cases, it is not possible. The Twitter API allows researchers to download tweets from public accounts. In these circumstances, it is implied that publicly available information can be ethically used for research purposes. (We debate this under the usage discussion later in this section.)

Data (regardless of source) should be stored on secure servers. The U.S. federal government provides rules on what is required for data to be considered “secure” (NIST, 2004). These guidelines vary by the source of data. For example, publicly available data and data for which participants have no expectation of privacy do not require secure storage, whereas data that include personal health information require researchers to store them in a “FIPS Moderate” environment. In general, the more sensitive the information and/or the higher the risk that an individual could be identified from a data source, the more security measures are required.

This concept becomes slightly more complicated when considering blended data. Consider a survey dataset that has no sensitive information and no PII. Now, consider that the researcher appended information on distance to the nearest hospital, distance to the nearest school, and distance to the nearest grocery or farm stand, all of which are publicly available. The blended dataset has a much higher risk for identification of the individual because it creates a triangulation of geographies—there are only so many housing units that are exactly X distance from a hospital, Y distance to a school, and Z distance from a grocery. As a result, the blended data may need to be stored more securely than any of the individual sources.
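
A toy example may make the triangulation risk concrete: in the sketch below, each publicly available distance variable alone matches several hypothetical housing units, but their combination isolates a single unit. All values are invented.

```python
# Toy illustration of the triangulation risk described above. All values are invented.
housing_units = [
    {"unit": 1, "km_hospital": 2.4, "km_school": 0.8, "km_grocery": 1.1},
    {"unit": 2, "km_hospital": 2.4, "km_school": 0.8, "km_grocery": 3.0},
    {"unit": 3, "km_hospital": 2.4, "km_school": 1.9, "km_grocery": 1.1},
    {"unit": 4, "km_hospital": 5.2, "km_school": 0.8, "km_grocery": 1.1},
]

# Distance values appended to one "de-identified" survey record.
target = {"km_hospital": 2.4, "km_school": 0.8, "km_grocery": 1.1}

candidates = housing_units
for key, value in target.items():
    candidates = [u for u in candidates if u[key] == value]
    print(f"after matching on {key}: {len(candidates)} candidate unit(s)")

# The candidate pool shrinks 4 -> 3 -> 2 -> 1; with real data the same logic can
# single out a household, so the blended file may need stricter storage controls.
```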

Data usage should receive consideration equal to that given to data collection and storage. Before using data, the researcher must consider the terms of the informed consent. For example, most informed consent statements include a sentence stating that the results will be used for research purposes only. Data from these sources should not be used to identify individuals or to affect their access to programs or their legal status. Violating these terms is not only illegal but also makes it more challenging for future researchers to gain cooperation and collect accurate information from marginalized groups.

REFERENCES

Amadeo, K., and Anderson, S.G. 2022. “Are We Headed Into Another Recession? Check These Indicators First.” January 20. The Balance (blog). https://www.thebalance.com/leading-economic-indicators-definition-list-of-top-5-3305862.

Amaya, A., P.P. Biemer, and D. Kinyon. 2020. “Total Error in a Big Data World: Adapting the TSE Framework to Big Data.” Journal of Survey Statistics and Methodology 8(1):89–119. https://doi.org/10.1093/jssam/smz056.

AAPOR (American Association for Public Opinion Research). 2007. “Explaining Exit Polls” (online information sheet), September. https://www.aapor.org/Education-Resources/Election-Polling-Resources/Explaining-Exit-Polls.aspx.

APA (American Psychological Association). 2017. “Frequently Asked Questions About Institutional Review Boards,” September. https://www.apa.org/advocacy/research/defending-research/review-boards.

Armstrong, M. 2019. “How Accurate Are the Exit Polls?” Statista Infographics (blog), December 11. https://www.statista.com/chart/20272/how-accurate-are-uk-exit-polls/.

Baffour, B., M. Haynes, M. Western, D. Pennay, S. Misson, and A. Martinez. 2016. “Weighting Strategies for Combining Data from Dual-Frame Telephone Surveys: Emerging Evidence from Australia.” Journal of Official Statistics 32(3):549–578. https://doi.org/10.1515/jos-2016-0029.

Baker, R., S. Blumberg, J.M. Brick, M.P. Couper, M. Courtright, M. Dennis, D. Dillman, M.R. Frankel, P. Garland, R.M. Groves, C. Kennedy, J. Krosnick, S. Lee, P.J. Lavrakas, M. Link, L. Piekarski, K. Rao, D. Rivers, R.K. Thomas, and D. Zahs. 2010. “Research Synthesis: AAPOR Report on Online Panels.” Public Opinion Quarterly 74(4):711–781. https://www.aapor.org/Education-Resources/Reports/Report-on-Online-Panels.aspx.

Baker, R., J.M. Brick, N.A. Bates, M. Battaglia, M.P. Couper, J.A. Dever, K.J. Gile, and R. Tourangeau. 2013, June. “Report of the AAPOR Task Force on Nonprobability Sampling.” Working paper, American Association for Public Opinion Research, Alexandria, Virginia. https://www.aapor.org/aapor_main/media/mainsitefiles/nps_tf_report_final_7_revised_fnl_6_22_13.pdf.

Barrett, B. 2021. “WhatsApp Fixes Its Biggest Encryption Loophole.” Wired, September 10. https://www.wired.com/story/whatsapp-end-to-end-encrypted-backups/.

Bernard, J., H. Daňková, and P. Vašát. 2018. “Ties, Sites and Irregularities: Pitfalls and Benefits in Using Respondent-Driven Sampling for Surveying a Homeless Population.” International Journal of Social Research Methodology 21(5):603–618. https://doi.org/10.1080/13645579.2018.1454640.

Boudreaux, M.H., K.T. Call, J. Turner, B. Fried, and B. O’Hara. 2015. “Measurement Error in Public Health Insurance Reporting in the American Community Survey: Evidence from Record Linkage.” Health Services Research 50(6):1973–1995. https://doi.org/10.1111/1475-6773.12308.


Bronner, L., and N. Rakich. 2020. “Exit Polls Can Be Misleading—Especially This Year.” FiveThirtyEight (blog), November 2. https://fivethirtyeight.com/features/exit-polls-can-be-misleading-especially-this-year.

Busetto, L., W. Wick, and C. Gumbinger. 2020. “How to Use and Assess Qualitative Research Methods.” Neurological Research and Practice 2(1). https://doi.org/10.1186/s42466-020-00059-z.

Buskirk, T.D., and J. Best. 2012. “Venn Diagrams, Probability 101 and Sampling Weights Computed for Dual Frame Telephone RDD Designs.” Proceedings of the Joint Statistical Conference. http://www.asasrms.org/Proceedings/y2012/Files/304351_72969.pdf.

CDC (Centers for Disease Control and Prevention). n.d. “About the National Immunization Surveys” (online data sheet). https://www.cdc.gov/vaccines/imz-managers/nis/about.html.

Ceron, A., L. Curini, S.M. Iacus, and G. Porro. 2013. “Every Tweet Counts? How Sentiment Analysis of Social Media Can Improve Our Knowledge of Citizens’ Political Preferences with an Application to Italy and France.” New Media & Society 16(2):340–358. https://doi.org/10.1177/1461444813480466.

Choi-Fitzpatrick, A., and T. Juskauskas. 2015. “Up in the Air: Applying the Jacobs Crowd Formula to Drone Imagery.” Procedia Engineering 107(2015):273–281. https://doi.org/10.1016/j.proeng.2015.06.082.

Conrad, F.G., J.A. Gagnon-Bartsch, R.A. Ferg, M.F. Schober, J. Pasek, and E. Hou. 2019. “Social Media as an Alternative to Surveys of Opinions About the Economy.” Social Science Computer Review 39(4):495–508. https://doi.org/10.1177/0894439319875692.

de Leeuw, E.D. 2005. “To Mix or Not to Mix Data Collection Modes in Surveys.” Journal of Official Statistics 21(2):233–255.

Dever, J.A. 2018. “Combining Probability and Nonprobability Samples to Form Efficient Hybrid Estimates: An Evaluation of the Common Support Assumption.” Proceedings of the 2018 Federal Committee on Statistical Methodology Research Conference, Washington, DC, March 7–9, 2018. https://copafs.org/wp-content/uploads/2020/05/COPAFSA4_Dever_2018FCSM.pdf.

DiSogra, C., C. Cobb, E. Chan, and J.M. Dennis. 2011. “Calibrating Nonprobability Internet Samples with Probability Samples Using Early Adopter Characteristics.” American Statistical Association. http://www.asasrms.org/Proceedings/y2011/Files/302704_68925.pdf.

Dutwin, D., and T.D. Buskirk. 2017. “Apples to Oranges or Gala versus Golden Delicious?” Public Opinion Quarterly 81(S1):213–239. https://doi.org/10.1093/poq/nfw061.

Duvallet, C., B.D. Hayes, T.B. Erickson, P.R. Chai, and M. Matus. 2020. “Mapping Community Opioid Exposure Through Wastewater-Based Epidemiology as a Means to Engage Pharmacies in Harm Reduction Efforts.” Preventing Chronic Disease 17(E91):1–4. https://doi.org/10.5888/pcd17.200053.

Dworkin, S.L. 2012. “Sample Size Policy for Qualitative Studies Using In-Depth Interviews.” Archives of Sexual Behavior 41(6):1319–1320. https://doi.org/10.1007/s10508-012-0016-6.

Eckman, S., and K. Himelein. 2020. “Methods of Geo-Spatial Sampling.” In Data Collection in Fragile States: Innovations from Africa and Beyond, edited by J. Hoogeveen and U. Pape, 103–128. New York: Springer.

Eckman, S., R. Furberg, and A. Amaya. 2018. “Combining Survey and Wearable Data on Exercise and Sleep 2018.” RTI International (slideshow), October 28. https://www.bigsurv.org/bigsurv18/uploads/264/260/Eckman_BigSurv_wearables.pdf.

ESOMAR (European Society for Opinion and Marketing Research). 2021. “Questions to Help Buyers of Online Samples” (online question and answer sheet). https://esomar.org/uploads/attachments/ckqqecpst00gw9dtrl32xetli-questions-to-help-buyers-of-online-samples-2021.pdf.


FAO (Food and Agriculture Organization of the United Nations). 1996. Conducting Agricultural Censuses and Surveys (FAO Statistical Development Series no. 6). Rome: FAO.

Goel, S., and M.J. Salganik. 2010. “Assessing Respondent-Driven Sampling.” Proceedings of the National Academy of Sciences 107(15):6743–6747. https://doi.org/10.1073/pnas.1000261107.

Goodman, L. 1961. “Snowball Sampling.” The Annals of Mathematical Statistics 32(1):148–170.

Gribble, J.N., H.G. Miller, J.A. Catania, L. Pollack, and C.F. Turner. 2000. “The Impact of T-ACASI Interviewing on Reported Drug Use Among Men Who Have Sex with Men.” Substance Use and Misuse 35(6–8):869–890.

Groves, R. 2011. “‘Designed Data’ and ‘Organic Data.’” May 31. U.S. Census Bureau. https://www.census.gov/newsroom/blogs/director/2011/05/designed-data-and-organic-data.html.

Groves, R.M. 2004. Survey Errors and Survey Costs. Hoboken, NJ: Wiley & Sons.

Guest, G., E. Namey, and K. McKenna. 2016. “How Many Focus Groups Are Enough? Building an Evidence Base for Nonprobability Sample Sizes.” Field Methods 29(1):3–22. https://doi.org/10.1177/1525822x16639015.

Harron, K., C. Dibben, J. Boyd, A. Hjern, M. Azimaee, M.L. Barreto, and H. Goldstein. 2017. “Challenges in Administrative Data Linkage for Research.” Big Data & Society 4(2). https://doi.org/10.1177/2053951717745678.

Heckathorn, D.D. 1997. “Respondent-Driven Sampling: A New Approach to the Study of Hidden Populations.” Social Problems 44(2):174–199. https://doi.org/10.2307/3096941.

Hootsuite. 2019. “Digital 2019: India” (online slide share). https://www.slideshare.net/DataReportal/digital-2019-india-january-2019-v01.

Hsieh, P., and J. Murphy. 2017. “Total Twitter Error: Decomposing Public Opinion Measurement on Twitter from a Total Survey Error Perspective.” In Total Survey Error in Practice, edited by P.P. Biemer, E.D. de Leeuw, S. Eckman, B. Edwards, F. Kreuter, L.E. Lyberg, N.C. Tucker, and B.T. West, 23–46. Hoboken, NJ: Wiley & Sons.

Johnston, L.G., M. Malekinejad, C. Kendall, I.M. Iuppa, and G.W. Rutherford. 2008. “Implementation Challenges to Using Respondent-Driven Sampling Methodology for HIV Biological and Behavioral Surveillance: Field Experiences in International Settings.” AIDS and Behavior 12(S1):131–141. https://doi.org/10.1007/s10461-008-9413-1.

Keeter, S. 1995. “Estimating Telephone Noncoverage Bias with a Telephone Survey.” Public Opinion Quarterly 59(2):196–217. https://doi.org/10.1086/269469.

Kennedy, C. 2018. “What are Nonprobability Surveys?” Pew Research Center, August 6. https://www.pewresearch.org/fact-tank/2018/08/06/what-are-nonprobability-surveys/.

Kennedy, C., A. Mercer, S. Keeter, N. Hatley, K. McGeeney, and A. Gimenez. 2016. “Evaluating Online Nonprobability Surveys.” Pew Research Center, May 2. https://www.pewresearch.org/methods/2016/05/02/evaluating-online-nonprobability-surveys/.

Kish, L., and M.R. Frankel. 1974. “Inference from Complex Samples.” Journal of the Royal Statistical Society: Series B (Methodological) 36(1):1–22. https://doi.org/10.1111/j.2517-6161.1974.tb00981.x.

Kreuter, F. and C. Casas-Cordero. 2010. “Paradata.” Working paper no. 136, German Council for Social and Economic Data, Berlin. https://www.konsortswd.de/wp-content/uploads/RatSWD_WP_136.pdf.

Kreuter, F., G.C. Haas, F. Keusch, S. Bähr, and M. Trappmann. 2018. “Collecting Survey and Smartphone Sensor Data with an App: Opportunities and Challenges Around Privacy and Informed Consent.” Social Science Computer Review 38(5):533–549. https://doi.org/10.1177/0894439318816389.


Lai, S., E.Z. Erbach-Schoenberg, C. Pezzulo, N.W. Ruktanonchai, A. Sorichetta, J. Steele, T. Li, C.A. Dooley, and A.J. Tatem. 2019. “Exploring the Use of Mobile Phone Data for National Migration Statistics.” Palgrave Communications 5(1). https://doi.org/10.1057/s41599-019-0242-9.

Lehdonvirta, V., A. Oksanen, P. Räsänen, and G. Blank. 2020. “Social Media, Web, and Panel Surveys: Using Non Probability Samples in Social and Policy Research.” Policy & Internet 13(1):134–155. https://doi.org/10.1002/poi3.238.

Lohr, S.L., and T.E. Raghunathan. 2017. “Combining Survey Data with Other Data Sources.” Statistical Science 32(2):293–312. https://doi.org/10.1214/16-STS584.

May, L. 2018. “Improving Efficiencies on FoodAPS with Online Food Logs.” Working paper, Bureau of Labor Statistics, Washington, DC. https://www.bls.gov/cex/foodaps.pdf.

McCreesh, N., S. Frost, J. Seeley, J. Katongole, M.N. Tarsh, R. Ndunguse, F. Jichi, N.L. Lunel, D. Maher, L.G. Johnston, P. Sonnenberg, A.J. Copas, R.J. Hayes, and R.G. White. 2012. “Evaluation of Respondent-Driven Sampling.” Epidemiology 23(1):138–147. https://doi.org/10.1097/ede.0b013e31823ac17c.

Mercer, A.W., F. Kreuter, S. Keeter, and E.A. Stuart. 2017. “Theory and Practice in Nonprobability Surveys.” Public Opinion Quarterly 81(S1):250–271. https://doi.org/10.1093/poq/nfw060.

Merriam, S.B., and E.J. Tisdell. 2009. Qualitative Research: A Guide to Design and Implementation, 3rd edition. Hoboken, NJ: John Wiley & Sons.

Mohamed, B. 2014. “Who Are the Iraqi Kurds?” Pew Research Center, August 20, 2014. https://www.pewresearch.org/fact-tank/2014/08/20/who-are-the-iraqi-kurds/.

Morrison-Beedy, D., D. Côté-Arsenault, and N.F. Feinstein. 2001. “Maximizing Results with Focus Groups: Moderator and Analysis Issues.” Applied Nursing Research 14(1):48–53. https://doi.org/10.1053/apnr.2001.21081.

Mullo, H., I. Sánchez-Borrego, and S. Pasadas-del-Amo. 2020. “Respondent-Driven Sampling for Surveying Ethnic Minorities in Ecuador.” Sustainability 12(21). https://doi.org/10.3390/su12219102.

Murphy, J., P. Pulliam, and R. Lucas. 2004. “Sample Frame Deduplication in the World Trade Center Health Registry: Minimizing Overcoverage and Cost.” Proceedings of the Joint Statistical Meetings. https://www.rti.org/sites/default/files/resources/murphy_jsm2004_paper.pdf.

NIST (National Institute of Standards and Technology). 2004. “Standards for Security Categorization of Federal Information and Information Systems” (online policy statement). https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.199.pdf.

O’Connor, B., R. Balasubramanyan, B.R. Routledge, and N.A. Smith. 2010. “From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series.” Conference paper, Fourth International AAAI Conference on Weblogs and Social Media, May 23–26. Washington, DC. https://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/viewFile/1536/1842.

Office for Human Research Protections, U.S. Department of Health and Human Services. 2016. “Attachment A: Minimal Risk Informed Consent Models.” July 21. https://www.hhs.gov/ohrp/sachrp-committee/recommendations/attachment-a-minimal-risk-informed-consent-models/index.html.

Rafferty, A. n.d. “Introduction to Complex Survey Design.” Working paper, U.K. Data Services, Colchester, United Kingdom. https://dam.ukdataservice.ac.uk/media/440347/rafferty.pdf.

Sakshaug, J.W., and F. Kreuter. 2012. “Assessing the Magnitude of Non-Consent Biases in Linked Survey and Administrative Data.” Survey Research Methods 6(2):113–122.

Schober, M.F., J. Pasek, L. Guggenheim, C. Lampe, and F.G. Conrad. 2016. “Social Media Analyses for Social Measurement.” Public Opinion Quarterly 80(1):180–211. https://doi.org/10.1093/poq/nfv048.


Silver, N. 2008. “Ten Reasons Why You Should Ignore Exit Polls.” FiveThirtyEight (blog), November 4. https://fivethirtyeight.com/features/ten-reasons-why-you-should-ignore-exit/.

Singh, H. 2019. “What is Air Quality Index and How is it Calculated?” JagranJosh, November 6. https://www.jagranjosh.com/general-knowledge/what-is-air-quality-index1573026691-1.

Statista Research Department. 2022. “Leading Countries Based on Number of Twitter Users as of October 2021” (online fact sheet). Statista, January 28. https://www.statista.com/statistics/242606/number-of-active-twitter-users-in-selected-countries/.

Tarimo, E.A., T.W. Kohi, M. Bakari, and A. Kulane. 2013. “A Qualitative Study of Perceived Risk for HIV Transmission among Police Officers in Dar es Salaam, Tanzania.” BMC Public Health 13(1). https://doi.org/10.1186/1471-2458-13-785.

Thomas, L. 2021. “An Introduction to Cluster Sampling.” Scribbr, October 5. https://www.scribbr.com/methodology/cluster-sampling/.

Tourangeau, R., L.J. Rips, and K. Rasinski. 2000. The Psychology of Survey Response, 1st edition. Cambridge, United Kingdom: Cambridge University Press.

Trappmann, M., S. Gundert, C. Wenzig, and D. Gebhardt. 2010. “PASS – A Household Panel Survey for Research on Unemployment and Poverty.” Schmollers Jahrbuch 130(4):609–622.

Truong, H.H.M., M. Grasso, Y.H. Chen, T.A. Kellogg, T. Robertson, A. Curotto, W.T. Steward, and W. McFarland. 2013. “Balancing Theory and Practice in Respondent-Driven Sampling: A Case Study of Innovations Developed to Overcome Recruitment Challenges.” PLoS ONE 8(8):e70344. https://doi.org/10.1371/journal.pone.0070344.

Unangst, J., A.E. Amaya, H.L. Sanders, J. Howard, A. Ferrell, K. Karon, and J.A. Dever. 2019. “A Process for Decomposing Total Survey Error in Probability and Nonprobability Surveys: A Case Study Comparing Health Statistics in U.S. Internet Panels.” Journal of Survey Statistics and Methodology 8(1):62–88. https://doi.org/10.1093/jssam/smz040.

U.S. Census Bureau. 2014. “American Community Survey Design and Methodology (January 2014).” https://www2.census.gov/programs-surveys/acs/methodology/design_and_methodology/acs_design_methodology_ch07_2014.pdf.

University of Michigan. n.d. “University of Michigan Dioxin Exposure Study” (online information portal). https://sph.umich.edu/dioxin/index.html.

Valliant, R., J.A. Dever, and F. Kreuter. 2013. Practical Tools for Designing and Weighting Survey Samples (Statistics for Social and Behavioral Sciences Book 51), 2013 edition. New York: Springer.

Vehovar, V. 2016. “Nonprobability Sampling.” In The Sage Handbook of Survey Methodology edited by S. Steinmetz and V. Toepoel, 329–346. Thousand Oaks, CA: Sage.

Verdery, A.M., M.G. Merli, J. Moody, J.A. Smith, and J.C. Fisher. 2015. “Respondent-Driven Sampling Estimators under Real and Theoretical Recruitment Conditions of Female Sex Workers in China.” Epidemiology 26(5):661–665. https://doi.org/10.1097/ede.0000000000000335.

Warren, K. 2020. “Qualitative Data Analysis Methods 101: The “Big 6” Methods + Examples.” Grad Coach, May. https://gradcoach.com/qualitative-data-analysis-methods/.

Williams, H.T., J.R. McMurray, T. Kurz, and F. Hugo Lambert. 2015. “Network Analysis Reveals Open Forums and Echo Chambers in Social Media Discussions of Climate Change.” Global Environmental Change 32:126–138. https://doi.org/10.1016/j.gloenvcha.2015.03.006.

Wojcik, S., and A. Hughes. 2019. “Sizing Up Twitter Users.” Pew Research Center, April 24. https://www.pewresearch.org/internet/2019/04/24/sizing-up-twitter-users/.

Wojcik, S., S. Messing, A. Smith, and L. Rainie. 2018. “Bots in the Twittersphere.” Pew Research Center, April 9. https://www.pewresearch.org/internet/2018/04/09/bots-in-the-twittersphere/.


Yeh, C., A. Perez, A. Driscoll, G. Azzari, Z. Tang, D. Lobell, S. Ermon, and M. Burke. 2020. “Using Publicly Available Satellite Imagery and Deep Learning to Understand Economic Well-Being in Africa.” Nature Communications 11(1). https://doi.org/10.1038/s41467-020-16185-w.

Next Chapter: 3C Ascertaining True Attitudes in Survey Research - Kanisha D. Bond