# Introduction

*Sampling* is the process of selecting a set of entities (individuals, objects, or events) from a larger population of such entities, preferably in such a manner that the sample is representative of that population. But before any sampling can take place, researchers must consider and plan for two important pieces in the overall process of sampling:

- Defining the population of interest
- Determining a way to gain access to individuals in the population of interest

Once these two issues have been properly considered and addressed for a given study, the researchers will be better informed and prepared to determine a procedure for actually selecting individuals to contact (i.e., sampling).

==(Sampling methods are the topic for next week.)==

# Defining a Population

Key features to consider when defining a population:

- *Element*
    - The entity of primary interest (i.e., an observational unit).
    - This term is used to refer to members of the target population, and it can also refer to members of the sampling frame.
- *Sampling unit*
    - This is the actual entity that is being sampled from the population.
    - Depending on the specific situation and nature of the study, the sampling unit may be the same entity as an element, or it may be a cluster of elements.
- *Extent of the population* refers to the set of criteria that defines a target population. This may include characteristics such as…
    - Germane characteristics (features or properties of elements that are theoretically or practically relevant to the purpose of the study and its research questions; e.g., new elementary-school teachers, patients diagnosed with schizophrenia, students currently enrolled in a four-year university)
    - Other biodemographic characteristics (e.g., human, currently living, literate)
    - Temporal extent (time of membership in the population)
    - Geographic parameters

> [!example]+
> Suppose a researcher wants to conduct a study that would make some inference about all the current residents of Minneapolis (specifically, those within city limits) and decides to randomly select people from the most recent Minneapolis telephone directory. So, the target population would be all the residents of Minneapolis, but the accessible population would be everyone who was listed in the directory (the directory itself would be the sampling frame). The target population would differ from the accessible population because the latter would not include certain groups of people, such as those with unlisted numbers, those without telephones, and new residents who have moved to Minneapolis since the printing of the directory. Further, the accessible population would likely include people who live just outside city limits (but still within the greater Minneapolis area) and those who have moved away since the printing of the directory.

There is also an important conceptual distinction to make when considering a population for a study: the target population versus the accessible population.

## Target Population

The *target population* is the group of elements for which the survey investigator wants to make inferences by using the sample statistics. In other words, this is the population to which the researcher wants to generalize findings.

- Target populations are finite in size (i.e., at least theoretically, they can be counted).
- They have some time restrictions (i.e., they exist within a specified time frame).
- They are observable (i.e., they can be accessed).

## Accessible Population

The *accessible population* (also known as the *available* or *survey population*) is the population from which the researcher can actually select a sample. The accessible population is operationalized by the *sampling frame*, which is a listing (or other collection of source materials) containing all accessible members of a population. (Sampling frames are introduced in a following section.)

The target and accessible populations will rarely be the same.

%%Appears to be the same as the *frame population*; def. (Groves, p.45): the set of target population members that has a chance to be selected into the survey sample.%%
%%How do these relate to the *survey population*?%%

# Sampling Frame

Defined in the most general terms, a *sampling frame* is a mechanism of access to a population. The sampling frame (also known simply as the frame) for a given study is the set of materials, resources, information, and/or procedures that provides access to the elements of the population of interest. It represents those elements that have a chance of being selected (i.e., accessible elements).

As implied by its definition, a sampling frame is operationalized in one of two general ways:

- *Listing*
    - A *listing* (also known as a *list frame*) is an explicit list of existing records or other similar information resource showing all (potentially) accessible individual elements in a population. This would include some type of identification or contact information.
    - Examples: An organization’s membership roll, a university’s email directory, a list of addresses in a neighborhood, voter registration rolls, and customer databases.
- *Procedural frame*
    - For some target populations, there is no definite list of known elements before sampling occurs. So, the researcher creates a structured set of steps that guides those who are conducting the data collection.
    - Often this involves seeking out individuals who happen to be at a particular place or event that makes them accessible for sampling.
    - Such a procedure acts as an operational representation of the population by outlining where and/or how the population can be reached.
    - When a procedural frame is used, there tends to be a significant overlap between the sampling frame and the sampling methods^[Sampling methods refer to the overall strategy or plan for selecting participants from the sampling frame. This includes the specific methods used to choose the participants (e.g., snowball sampling, purposive sampling). This topic is covered in depth in the next chapter.]. This is because, in the absence of a pre-existing list or database (i.e., a traditional list frame), the procedures for identifying and accessing participants essentially become part of both the frame and the sampling design.
    - Examples: Area frames (such as a neighborhood), random-digit dialing (RDD), exit polls, approaching people attending a football game.

Ideally, a sampling frame should contain (a) every element of the target population and (b) each element of the target population exactly once. Of course, both criteria are virtually never met in full for a real sampling frame. Shortcomings with respect to these two criteria are the source of the various coverage problems.

# Coverage Problems for Sampling Frames

Recall that the sampling frame represents the accessible population, not necessarily the target population. There are various potential problems that can occur regarding how well a sampling frame corresponds to the actual target population. *Coverage errors* refer to the potential inaccuracy and bias in survey statistics that can arise due to the various possible coverage problems.

There are four basic types of coverage issues:

1. *Undercoverage*
2. *Ineligible units*
3. *Clusters*
4. *Duplicates*

Each general type of frame coverage problem is shown below with a prototype diagram where **F** is an element in the sampling frame, and **P** is an element in the target population.

## Proper (Complete) Coverage

This is the ideal situation where there are no coverage issues. In this situation, there is a perfect one-to-one mapping of each member of the population to an element in the sampling frame. It is mentioned here as a benchmark for reference.

![[__/coverage_proper.svg]]

## Undercoverage

*Undercoverage* refers to the situation where a legitimate element in the target population does not appear in the sampling frame. Such eligible members of the target population cannot appear in any sample drawn for the study. These are sometimes referred to as *missing elements*.

![[__/coverage_missing_elements.svg]]

Undercoverage is potentially the most worrisome and serious of all coverage errors. It threatens to produce errors of non-observation in survey statistics because parts of the target population cannot be included in any study that uses the frame.

> [!example]+ Examples: Undercoverage
> - Conducting a study using a telephone directory with only landline phone numbers as the sampling frame can lead to undercoverage because it excludes individuals who primarily use mobile phones.
> - Suppose a study uses a database of customer addresses as its listing, but this database is not regularly updated. This listing may miss people who have moved or changed their contact information, leading to undercoverage.
> - If a sampling frame only includes residential addresses and does not account for homeless individuals, it will underrepresent this segment of the population.
> - In political polling, relying solely on voter registration lists can lead to undercoverage because it excludes eligible voters who are not yet registered.
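
To make the consequence of undercoverage concrete, here is a minimal Python sketch based on the landline example above. Everything in it is an assumption made for illustration: the person ids, the ages, and the use of age as the trait $y$ are invented and are not data from any real study. The sketch identifies the missing elements, confirms that no sample drawn from the frame can contain them, and checks numerically that the gap between the covered mean and the full population mean matches the decomposition given in the Quantitative Noncoverage Bias section.

```python
import math
import random
from statistics import mean

# Hypothetical target population: person id -> (age, has a listed landline?)
# All ids and ages are invented for illustration only.
population = {
    "p01": (72, True), "p02": (65, True), "p03": (58, True),
    "p04": (61, True), "p05": (50, True), "p06": (54, True),
    "p07": (23, False), "p08": (31, False),
    "p09": (37, False), "p10": (29, False),
}

# The sampling frame (the directory) lists only people with a landline.
frame = {pid for pid, (_, listed) in population.items() if listed}

# Missing elements: in the target population but absent from the frame.
missing = set(population) - frame

# Any sample drawn from the frame can never contain a missing element.
rng = random.Random(0)  # fixed seed so the run is repeatable
ever_selected = set()
for _ in range(1000):
    ever_selected.update(rng.sample(sorted(frame), 3))
assert ever_selected.isdisjoint(missing)

# Means of y (age) for the full, covered, and noncovered groups.
mu = mean(age for age, _ in population.values())
mu_c = mean(population[pid][0] for pid in frame)
mu_u = mean(population[pid][0] for pid in missing)

noncoverage_rate = len(missing) / len(population)  # N_u / N

bias_direct = mu_c - mu                          # mu_c - mu
bias_product = noncoverage_rate * (mu_c - mu_u)  # (N_u / N)(mu_c - mu_u)
assert math.isclose(bias_direct, bias_product)

print(f"noncoverage rate = {noncoverage_rate:.2f}")
print(f"mu = {mu:.1f}, mu_c = {mu_c:.1f}, mu_u = {mu_u:.1f}")
print(f"bias = {bias_direct:.1f}")
```

With these invented numbers, the covered mean is 60, the noncovered mean is 30, and the overall mean is 48, so a mean computed from the frame alone overstates the population mean age by 12 years (0.4 × 30).
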
## Ineligible Units

An *ineligible unit* (also known as a *blank* or *foreign element*) is an element in the sampling frame that is mapped to an element that is not in the target population. In other words, these are elements that are not members of the target population but are members of the frame population. This type of coverage problem is also called *overcoverage*.

![[__/coverage_ineligible_units.svg]]

Undercoverage and ineligible units (overcoverage) can also be depicted as a Venn diagram:

![[__/population_frame_overlap.svg]]

The presence of ineligible units is generally not as severe a problem as undercoverage, and it is easier to deal with when it is not extensive. When ineligible units are identified on the frame before selection begins, they are easily purged. More often, ineligible units cannot be identified until data collection begins. After sampling, they can be identified in a preliminary screening step and dropped from the sample.

> [!example]+ Examples: Ineligible units
> - Business numbers in a frame of telephone numbers when studying the telephone household target population.
> - Say the target population is all currently enrolled undergraduates at NDSU for the current Fall semester, and the sampling frame is the listserv for undergraduate students. If any students have transferred or dropped out early in the semester, their NDSU email may still be active and included in the current listserv.

## Clusters

A *cluster* is the situation where multiple elements of the target population are linked to a single element in the sampling frame. So, the selection of such an element in the frame can lead to two or more eligible units from the population.

![[__/coverage_clusters.svg]]

## Duplicates

*Duplicates* refers to the situation where several elements in the sampling frame are mapped onto a single element in the target population. In sample surveys using a frame with this issue, the duplicated elements may be overrepresented.

![[__/coverage_duplicates.svg]]

## Complex Coverage Issues

A frame may have any combination of these different coverage problems. In particular, there can be a combination of duplication and clustering in which multiple frame elements map to multiple target population elements (many-to-many mappings).

![[__/coverage_complex_mapping.svg]]

# Quantitative Noncoverage Bias

Say that we are trying to estimate the mean of some particular trait of people that can be expressed in numerical terms. Let's call this variable $y$. Now, consider the following rationale using the quantities defined here:

| Symbol | Quantity |
| --- | --- |
| $\large \mu$ | Mean of $y$ for the full target population |
| $\large \mu_c$ | Mean of $y$ for the segment of the population (subpopulation) consisting of eligible units that are covered by the sampling frame |
| $\large \mu_u$ | Mean of $y$ for the segment of the population (subpopulation) consisting of eligible units that are not covered by the sampling frame |
| $\large N$ | Size of the full target population |
| $\large N_c$ | Size of the subpopulation of eligible units covered by the sampling frame |
| $\large N_u$ | Size of the subpopulation of eligible units not covered by the sampling frame |

The size of the full target population is the sum of the sizes of the covered ($N_c$) and noncovered ($N_u$) subpopulations:

$\large N~=~N_c~+~N_u$

Noncoverage rate (NCR):

$\large R_u~=~\frac{N_u}{N}$

The noncoverage bias of a sample estimator ($\bar{y}$) is the difference between the mean of $y$ for the units covered by the frame ($\mu_c$) and the true overall mean of $y$ for the full target population ($\mu$):

$\large \operatorname{Bias}\{\bar{y}\}~=~\mu_c~-~\mu$

The overall mean is the size-weighted average of the two subpopulation means, $\mu = (N_c\,\mu_c + N_u\,\mu_u)/N$. Substituting this into the expression above and using $N - N_c = N_u$, the bias can be shown to be the product of the noncoverage rate ($R_u = N_u/N$) and the difference between the covered ($\mu_c$) and noncovered ($\mu_u$) means:

$\large \operatorname{Bias}\{\bar{y}\}~=~\frac{N_u}{N}\left(\mu_c~-~\mu_u\right)$

==So, what does this imply about the nature of noncoverage bias?==

# Dealing with Coverage Issues

There are a few general options for dealing with frame problems:

- Correct the frame when possible (e.g., remove all duplicate entries from the frame if identifiable)
- Redefine the target population to match the frame
- Post-collection statistical adjustments
- Use multiple sampling frames
- Ignore the problems with the frame and admit the possibility of coverage errors

Of course, there are limitations and drawbacks with each of these, so there is no guarantee that any of them is a viable or acceptable option for any given survey.

# Populations in Qualitative Research

In quantitative research, the concept of a population refers to the entire group of elements that the researcher aims to draw conclusions about. Broadly speaking, the goal of such research is to generalize findings from an observed sample to this larger population using statistical methods.

In contrast, the concept of population in qualitative research is less about generalization and more about uncovering and understanding the truth and meaning regarding complex phenomena within specific contexts. Furthermore, qualitative research tends to focus on a target group that is relevant to the research question, often defined by shared experiences, behaviors, or characteristics.

# Sampling Frames in Qualitative Research

In qualitative research, the concept of a sampling frame is more flexible and less structured. The focus is on selecting participants who can provide rich, detailed insights relevant to the research question. There are a few notable characteristics:

- Such sampling frames should be more flexible and dynamic.
- There is often a need for purposeful selection where participants can be intentionally selected based on specific characteristics or their ability to contribute valuable information (purposive sampling).
- Adaptability is often needed as the sampling approach may change as new insights emerge. It is possible that sampling evolves during the research process to develop or refine emerging theories (theoretical sampling).
- There is a definite focus on depth over breadth.
    - Rich data collection: Emphasis on obtaining detailed information to deeply understand the phenomenon.
    - Contextual understanding: Focus on the meanings and experiences of participants within their specific contexts.
- These do not necessarily need to be exhaustive lists as a complete sampling frame is often unnecessary (or even irrelevant in some instances).

To better highlight these distinct features, the major differences between qualitative and quantitative sampling frames are given in the following table.

| Characteristic | Quantitative research | Qualitative research |
| --- | --- | --- |
| Construction of the sampling frame | Requires a comprehensive sampling frame (or, at least, as comprehensive as possible). | Uses a conceptual framework to identify relevant participants without needing a complete list. |
| Purpose of sampling | The overarching aim is for statistical representativeness in order to generalize findings (although generalizability may sometimes be limited). | Aims to gain in-depth understanding from specific cases without the intention of generalizing statistically. |
| Sampling techniques^[This topic is covered in greater depth in the next chapter.] | Employs probability and non-probability methods depending on research goals and practical constraints of a given study. | Typically utilizes non-probability methods like purposive and snowball sampling to better facilitate information richness. |

%% POPULATIONS & SAMPLING FRAMES # Populations in QL research - Parallels and Applications: - Defined Group or Context: Both paradigms start by identifying a group relevant to the study's objectives. - Depth Over Breadth: Qualitative research prioritizes in-depth understanding of a phenomenon within a particular group rather than generalizing to a larger population. - Theoretical Population: Sometimes, qualitative researchers conceptualize a theoretical population to which their findings might be transferable. - Implications in Qualitative Research: - Transferability: Instead of seeking generalizability, qualitative studies aim for findings to be transferable to similar contexts or groups. - Contextual Focus: Emphasis is placed on the context in which participants experience the phenomenon, acknowledging that meanings are constructed socially and culturally. # Sampling Frames in Qualitative Research - In Quantitative SRM: - A sampling frame is a comprehensive list or database of all elements in the population from which a sample can be drawn. - It is crucial for ensuring that every member has an equal chance of selection, minimizing sampling bias. - In Qualitative Research: - Sampling frames are less formal and may not involve exhaustive lists. - Access to participants often relies on networks, organizations, or communities relevant to the study. - The sampling frame may be constructed iteratively as the study progresses. - In qualitative research, procedural frames are common.
For example, in ethnographic research, the researcher may gain access to participants through gatekeepers or snowball sampling, where each participant helps identify others. - Parallels and Applications: - Identifying Potential Participants: Both paradigms require a method for identifying who can be included in the study. - Access Mechanisms: Qualitative researchers might use gatekeepers, key informants, or community contacts to reach participants. - Flexibility: The sampling frame in qualitative research is adaptable, allowing researchers to refine their participant pool based on emerging insights. - Implications in Qualitative Research: - Non-Probability Sampling: Since the goal isn't statistical generalization, a precise sampling frame is less critical. - Purposeful Inclusion: The sampling frame is designed to include individuals who can provide rich, relevant data. # Logical Congruence and Meaningfulness Your inquiry is logically congruent. While the concepts originate from quantitative research, they have meaningful parallels in qualitative research: - Populations in qualitative research are defined in terms of contexts or groups central to the phenomenon. - Sampling Frames exist but are more flexible and may not be exhaustive lists; they are tools to access information-rich participants. - Sampling focuses on depth, using non-probability methods to select participants who can provide the most insight. These adaptations align with the goals of qualitative research, which prioritizes understanding over generalization. # Summary The concepts of populations, sampling frames, and sampling from SRM are applicable to qualitative research, albeit in adapted forms: - Populations are contextually bound groups rather than statistical entities. - Sampling Frames are tools for accessing relevant participants rather than comprehensive lists. - Sampling methods are strategic and purposive, aiming for depth of understanding. 
Understanding these parallels enhances the rigor of qualitative research by ensuring purposeful participant selection and data collection strategies that align with the research objectives. --- COVERAGE PROBLEMS & NONCOVERAGE BIAS # Parallels in Qualitative Research for Coverage Problems and Noncoverage Bias While qualitative research does not aim for statistical generalization, coverage issues can still affect the depth and breadth of insights. Let's explore how coverage problems and noncoverage bias manifest in qualitative research. 1. Representation of Perspectives - Selective Inclusion: If the sampling methods exclude certain voices or perspectives, the findings may be skewed. - Marginalized Groups: Hard-to-reach or marginalized populations may be underrepresented, limiting the study's comprehensiveness. 2. Sampling Frame Limitations - Access Networks: Relying on specific networks (e.g., professional organizations, online forums) may inadvertently exclude individuals not connected to these channels. - Gatekeeper Bias: Using intermediaries to access participants can lead to a sample that reflects the gatekeepers' biases or networks. 3. Noncoverage Bias in Qualitative Research - Perspective Bias: The absence of certain viewpoints can lead to an incomplete understanding of the phenomenon. - Theoretical Saturation Issues: Without diverse perspectives, reaching true theoretical saturation may be hindered. # Addressing Coverage Problems in Qualitative Research 1. Expanding the Sampling Frame - Multiple Access Points: Utilize various channels to reach participants, such as community centers, social media, and public events. - Inclusive Criteria: Define participant inclusion criteria broadly enough to encompass diverse experiences relevant to the research question. 2. Sampling Strategies - Maximum Variation Sampling: Intentionally include participants with different backgrounds, experiences, and characteristics to capture a wide range of perspectives. 
- Snowball Sampling with Caution: While useful, be aware that snowball sampling can perpetuate homogeneity if participants refer similar individuals. 3. Reflexivity and Awareness - Researcher Reflexivity: Continually reflect on how personal biases and assumptions may influence participant selection. - Critical Examination of Exclusions: Identify who is not included and consider how their absence might affect the findings. 4. Community Engagement - Collaborative Approaches: Involve community members or representatives in the research design to ensure broader coverage. - Building Trust: Establish relationships with different groups to improve access and willingness to participate. # Implications for Qualitative Research - Depth and Breadth of Insights: Addressing coverage issues enhances the richness and comprehensiveness of the data. - Credibility and Trustworthiness: Including diverse perspectives strengthens the study's credibility. - Transferability: Broader coverage allows findings to be more transferable to similar contexts or groups. # Summary While the terminology of coverage problems and noncoverage bias originates from quantitative research, the underlying concerns about inclusivity and representation are highly relevant to qualitative research. Key takeaways include: - Awareness of Exclusions: Recognize that certain voices may be missing and consider the impact on the study's findings. - Intentional Inclusivity: Employ strategies to include a wide range of perspectives, enhancing the study's depth. - Reflective Practice: Continuously evaluate and adjust sampling methods to address potential biases. %%