Natalie Lambert

I am interested in the patterns of behavior, organizational structures, and shared mental models that exist within large-scale communication datasets. I’m particularly interested in developing new computational methodologies that remove barriers to conducting high-quality and trustworthy research on large-scale datasets collected from online spaces and social media platforms. For any online social behavior of interest, “datasets are distributed widely across space with differing means of access,” and stored within “incompatible formats on different platforms and computing environments” (Lee, Fielding, & Blank, 2008, p. 8). Consequently, many studies of online behaviors are limited to case studies of individual websites or interaction platforms. There is therefore a great need for ways to identify, collect, and standardize very large-scale datasets of online phenomena when the individual pieces of data are fragmented across the Internet in the form of hundreds or thousands of unique websites and discussion spaces. I am developing a framework that will enable scholars to identify a “population” of all websites that contain evidence of a phenomenon of interest, so that data can be collected by drawing a representative sample from that population, thereby increasing the trustworthiness and generalizability of findings from analyses of online data. I am also working to develop network analysis measures that can account for the influence of technological affordances on communication in online spaces. For example, when a member of an online community posts a comment directed at the discussion group at large, the researcher has to decide whom to record as having received that message, because there are often a multitude of potential receivers (i.e., lurkers or silent members of the space) but no data concerning who actually attended to the message.
The manner in which the researcher records this message has an enormous impact on subsequent analyses of the communication network. I am testing multiple methods of recording such data, evaluating each approach’s impact on the larger communication network, and assessing the theoretical assumptions that underlie each approach. Large-scale datasets offer tremendous research possibilities, but big data scholars have many challenges to overcome in order to ensure that their research reflects the varied, lived experiences of the individuals whom these datasets represent.