Let AI Do the Heavy Knowledge Lifting

Much has been written about the avalanche of AI bombarding our engagement in the digital realm. For instance, copywriters and editors on LinkedIn lament the myth that an em-dash means “AI did it.” YouTube videos and podcasts [1,2] are helping viewers recognise when “AI did it.”

Bloggers draw attention to the AI-slop served up in the marketing world [3], and those who work online express irritation about how intrusive it is, with a “How can I help you?” at every click. Now Firefox has “Kit.”  

So, what does AI like ChatGPT do? Basically, it gathers information from a very large digital data pool (very quickly) and synthesises that encoded information for a purpose. However, its purpose rests with the instruction given or the question asked. So, bullshit in, bullshit out: I learned that in the very early 1980s when processing survey data in a corporate environment.

AI’s capacity to synthesise information is incredible, but its thinking process, the putting together of that information, is purely (and only) rational. AI does not have “ah hah” moments, flashes of brilliance, creative spurts, or a sense of wonder when patterns are recognised. These non-rational thinking processes have generated some of the most outstanding art and scientific breakthroughs. For example, the chemist August Kekulé dreamed of a snake biting its tail, which led to the understanding of the benzene molecule’s cyclic structure [4], and René Descartes imagined the world as a grand machine, laying the groundwork for machine consciousness [5].

AI is also not inspired by angels, nor does it struggle with demons. Nor does AI respond with its instincts, as do all flesh and blood creatures. AI is a machine; non-rational and irrational thinking processes are exclusively human in comparison.

So, considering that the various types of knowledge (14, according to some [6]) can be loosely categorised as personal, social, and digital, we know the following: First, AI has no access at all to the personal level. Even if it collects your health data via your smartwatch, it is data, and data is always second-hand. Second, the knowledge we share about our experiences is already detached when it is encoded with language (words, numbers, symbols, images) to become social knowledge. Then AI must digitalise that second-hand, social knowledge, making what it spits out third-hand information. Third, AI is limited with respect to its thinking process. It is incapable of being unpredictable, following a hunch, or taking an imaginative leap: Those are human strengths.

So, let AI do the heavy knowledge lifting with its logical, systematic, and methodical processes and instead focus on the blessing of being human: unpredictable, gutsy, and imaginative.

  1. NOVA PBS Official. (2025, Oct. 12). How to Detect Deepfakes: The Science of Recognizing AI Generated Content. https://www.youtube.com/watch?v=GMoOCKkcd_w
  2. NOVA PBS Official. (2025, Aug. 26). The Deepfake Detective | Particles of Thought. https://www.youtube.com/watch?v=nG2_GhNdTek
  3. Robinson, Stephan. (2025, Oct. 21). AI Slop is Creating New Freelance Work: Why Businesses Still Need Human Experts in 2025. https://www.peopleperhour.com/discover/guides/ai-slop-is-creating-new-freelance-work-why-businesses-still-need-human-experts-in-2025/
  4. Read, John (1957). From Alchemy to Chemistry. Courier Corporation.  
  5. Sanderson, Daniel. (2025, Oct. 11). The Role of Imagination in Scientific Hypotheses and Memory and Imagination. https://www.planksip.org/the-role-of-imagination-in-scientific-hypothesis-and-memory-and-imagination-1760233400612/
  6. Drew, C. (2023, March 2). The 14 Types of Knowledge. https://helpfulprofessor.com/types-of-knowledge/

Identifying the Two-Dimensional Monotone Monologue of Artificial Intelligence

Hany Farid, an expert in identifying deepfakes, or AI-generated footage of events that have never occurred, claims that human beings can correctly recognize AI-generated audio only slightly better than chance [1,2]. That sure makes our responses to what we hear and see on social media and professional platforms important: Arguably, madness is a product of responding to what is not real.

Farid’s tools for identification are primarily technical. He uses machines (computers) to do it, but there are other ways of figuring out if what you are seeing and hearing is AI-generated.

Unlike machine consciousness, human consciousness is sensual: we inhabit a meat suit and gather data through it. Two senses are essential for sifting authentic content from AI content. First, the voice of AI-generated content has a particular scripted tone—a tone that transfers even into the academic works/self-help books I edit.

Then there are visual clues—an overly dramatic edge, five fingers and a thumb, shadows that contradict (or none), limbs that come out of or disappear into other objects. As AI is improved and the pixel arrangements become more seamless, the visual clues will become less evident.

A third helpful sense for differentiating authentic footage from AI-generated footage is what some call intuition, a flexible, non-rational, experience-based insight [3]. I call it soul. If you tune into what you see with your soul, you can feel the soul or absence of it in what is presented. AI generates a deadness. Even when the creator tries to animate what AI produces, it comes across as a two-dimensional monotone monologue with no spark of life behind it. Inevitably, intelligent human beings are going to grow tired of what Stephan Robinson calls AI-slop, or “the flood of low-quality AI outputs that look convincing at first glance but miss the mark in accuracy, tone, usability, or brand fit” [4].

Of course, AI is not all bad. It is a blessing when one wants to check a fact or definition, find an answer to a quick question, or is gathering and synthesizing information. On social media, sometimes a story is told that tugs at the heart, and your heart might just open. But AI is not going to save us from anything except the heavy lifting, nor replace any jobs except those held by bureaucrats and call-centre staff. The world is full of forms to complete and chatbots helping one do it.

Still, it is time to see AI for what it is: a machine, a tool that echoes what has been put into language and published digitally by the collective of human consciousness. Essentially, AI has no access to what is. It cannot access the messiness of being in a body and rubbing shoulders with others. It only has access to the experiences that human beings put into language. Its condition for existence is “data.” Data is always second-hand.

But most important of all, remember AI will never enjoy those non-rational moments of insight that expand human consciousness.

  1. NOVA PBS Official. (2025, Oct. 12). How to Detect Deepfakes: The Science of Recognizing AI Generated Content. https://www.youtube.com/watch?v=GMoOCKkcd_w
  2. NOVA PBS Official. (2025, Aug. 26). The Deepfake Detective | Particles of Thought. https://www.youtube.com/watch?v=nG2_GhNdTek
  3. Rephrasely Media. (2023, Jan. 15). Instinct vs. Intuition. https://rephrasely.com/usage/instinct-vs-intuition
  4. Robinson, Stephan. (2025, Oct. 21). AI Slop is Creating New Freelance Work: Why Businesses Still Need Human Experts in 2025. https://www.peopleperhour.com/discover/guides/ai-slop-is-creating-new-freelance-work-why-businesses-still-need-human-experts-in-2025/

Turnitin Terrors!

As an editor for academics, postgraduates, and nonfiction writers, I was given the opportunity in the last couple of months to see just how unreliable and terrifying Turnitin is. For those who don’t know, Turnitin is a standard plagiarism detector that many academic institutions use to measure the extent to which a writer has presented a published author’s work as their own.

When I majored in Anthropology in the early 80s, the threshold was more than four words in a row; now, it is five. Some institutions choose a 10% similarity score, others 12%. These are arbitrary measures.

Witness This

The candidate did her first submission to Turnitin. The similarity score was 9%, acceptable for that institution, but there were swathes of text in a meticulously constructed 200-page literature review of already-developed models and frameworks that were flagged. As her editor, I supported her decision to rework some of those passages, and she did. To our surprise, the percentage went up by 4%, beyond what was acceptable to the institution.

Intrigued, I compared the two reports. In the second report, pages and pages that had not been flagged before were flagged. Most intriguing was that the opening sentence was flagged in the first report but not in the second. Whereas in the first report literally nothing in the methodology chapter was flagged, in the second, passages were flagged, especially the section on sampling. There are only so many ways one can explain the difference between probability and non-probability sampling before one runs out of options. More intriguing still was that the research objectives and questions were flagged, as well as some verbatim interviews with participants and arbitrary phrases like “in Table 5.3.”

I wondered, “Has someone published her work in the month we have been working on it?” With the third report, at the eleventh hour, we managed to get the similarity score down to 10%—the very edge of acceptability for that institution.

The arbitrariness of the measure (five words) and the percentage permitted is just part of the problem.

How many ways can a standard research claim be made?

First, it is easy to present five words in a row in the same order as a published author because of how English is structured. Consider, for example, research report statements like, “A qualitative phenomenological approach was used in this study.” I suppose one can replace “used” with “applied” or “employed,” or start with the introductory phrase “in this study,” but I know those variations have been used in numerous research reports because I have edited literally hundreds of them over the last 20 years, and there are not hundreds of ways to disclose the approach employed.

Second, due to the “publish or perish” mentality in academic contexts, so much has been published about so many topics in so many fields that any possible way to phrase the same ideas and link them has been used up. There is no new way of conveying your ideas and findings without being flagged by Turnitin, especially in a literature review. If a writer is looking to do an overview of a much-written-about field, like stress or leadership, that writer is at a disadvantage because many more published authors have tried to find different ways of saying the same thing. Moreover, in academic research, one is working with constructs and concepts that have been woven into models, frameworks, and theories. How many ways can the five components of a model be listed without presenting them in the same order as an already-published author?

What about Voice and Cadence?

So what makes a formal research report original, rather than stolen? What makes the difference? I would claim it is in the cadence of the voice and the consistency of the cadence, bearing in mind that any writer is influenced by the voice of those they read.

Cadence is a level of language that Turnitin is not tuned into. And the irony? Turnitin is itself an AI program, and generative AI is the biggest plagiariser of all—it steals the content and voice of whatever has ever been published on the Web by anyone you could imagine. How cruel of academia to terrify students with such arbitrary measures of their contribution to knowledge and truth, and even more so, leave it to a machine that has no understanding of voice and how a voice is developed to make that judgment.  

Seven Steps to a Less Painful Publishing Process for Academic Papers

So, you are required or invited to submit a paper based on your PhD for publication in a journal.

Whether required or invited, bear in mind that besides acknowledging your supervisor/mentor as an author of the paper, you have far fewer words to contribute your sliver of truth—some journals allow as little as 3,000 words [1]. Others offer more space, but presenting a PhD dissertation of 60,000 to 120,000 words [2] in just 8,000 or even 12,000 words may prove challenging.

Here are seven steps to reduce the pain of publishing an academic paper based on your PhD. 

Pinning Down the Purpose

An article requires a precise focus because you have far fewer words to play with. Are you showcasing a methodology? Describing a case study? Presenting a framework for consideration based on your literature review? Offering a meta-analysis? Reporting on the results of your research? Your purpose will determine your research questions, and your research questions, in turn, will determine what parts of your dissertation are relevant to answer that research question.

For example, you might have conducted broad research about what enables and obstructs firms from adopting data analytics and choose to focus on just the human resource factors in your paper. In that case, you would alert the reader to broader research conducted, but your focus would be on the human resource factors highlighted by the research rather than the technological enablers and obstructions.

Establishing the Journal Agenda

Depending on what you want to demonstrate or argue, the point of presenting your sliver of truth determines which journals to consider. If you intend to describe the merits of a qualitative case study, sending your work to a journal that publishes correlational research based on hypothetical constructs is pointless. Likewise, erudite models presented for testing require targeting discipline-specific audiences, and journals that value empirical evidence may not welcome a meta-analysis of constructs to include in a proposed framework.

Using Keywords

A helpful way of narrowing down possible publishers is plugging the keywords for your dissertation into Google Scholar, importing those titles into Zotero, and generating a list of journals where the articles include the same keywords. This exercise lets you identify journal titles where your sliver of truth might be accommodated.

You can also cast an eye over your references list. Where have the minds with whom you have been developing your thinking about the issue been publishing their work? It is important that you identify a receptive audience.

Identifying the Requirements

Once you have a short list, explore those journal requirements. At this point, the word count is the most important consideration. If what you wish to present, based on your Purpose, the journal’s Agenda, and the article’s Keywords, cannot be accommodated in 4,000 words, exclude those journals from your list. If you must pad your paper unnecessarily to reach 12,000 words, consider excluding those journals from your list. Now, you have a shortlist of journals for which you can write with a view to submitting your sliver of truth.

Crafting the Content

The fifth step is crafting your sliver of truth according to the requirements of your first choice on the shortlist. How much of your methodology must be secured to make your sliver of truth credible? Some journals require precise detail, others just a concise description covering all the bases—sample selection, data collection, data documentation, and data analysis—all within the required methodological and ethical parameters.

Some journals insist on specific headings—Introduction, Literature Review, Methods, Discussion, and Conclusion; others permit the development of a unique and coherent argument where you can choose more exciting, meaningful, thematic headings.

Carving the Product

The sixth step demands that you begin to carve out and let go of anything in the dissertation that is not central to achieving your Purpose, demonstrating congruence with the journal’s Agenda, and deliberating upon your Keywords. Sometimes, it might be whole chapters you exclude; for example, if you present a theoretical framework for testing based on a meta-analysis of the literature in your field, your methodology, discussion, and conclusion chapters can be excluded, for the most part. I say “for the most part” because you might want to suggest the method by which the model was tested in your research as a research possibility in the Conclusion to your paper. And there you have the grounds for writing your next paper based on the results of testing the framework you proposed.

Other papers might require a golden thread that cuts to the chase with a precise summary of aspects of the research, and that means carving away constructs that are not required to support the sliver of truth you offer.

Proofread and Polish

Once you are clear about your Purpose, have identified the journal’s Agenda, have used your Keywords to generate a shortlist of potential publishers read by a receptive audience, have settled on a first choice that allows enough words to achieve your aims, and have crafted and carved the paper to support your answers to the research questions and conclusions, consider a professional proofread and polish. That final step ensures your paper is well-structured and coherent and that the style is congruent with the journal’s requirements.

Remember, a well-written, coherent and well-structured paper that follows the style of the journal you chose has a better chance of being taken seriously, reaching a reviewer’s desk, and earning meaningful feedback. And if the answer from the journal editor is yes with no amendments or even yes with revisions, you are well on your way to contributing your sliver of truth and seeing your name in print.

References

[1] Monash University. (2025). Writing a journal article. Student Academic Success. https://www.monash.edu/student-academic-success/excel-at-writing/how-to-write/journal-article

[2] Campus Team. (2025). Tips for writing a PhD dissertation: FAQs answered. Times Higher Education. https://www.timeshighereducation.com/campus/tips-writing-phd-dissertation-faqs-answered

Time to Upskill Your Academic Credentials?

Try cultivating a 15-minutes-a-day writing habit

How do you find time to write a master’s thesis or PhD dissertation when working full-time and still have space for a life? Because the truth is, if you wait to find the time to feel inspired to write, the chances are you will miss the boat or sink it. And if you write on top of a deadline, you risk burnout.

So, rather than missing the boat, sinking your dream, or burning out your life energy, consider devoting just 15 minutes a day to writing up your research report.

Finding the Time

Consider that the average master’s thesis is between 30 and 100 pages and a PhD between 50 and 450 pages [1], depending on your discipline and institution. If we take an average of, say, 200 pages, and note that there are 365 days in a year, then writing just 140 focused words daily, the equivalent of five sentences or half a page (in a double-spaced 12-point Times Roman font), would more than cover the word count. And writing the equivalent of five sentences in just fifteen minutes is a breeze if you have done some reading, thinking, and journaling over a weekend.
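If you like to see the arithmetic, here is a back-of-the-envelope check. The figure of 250 words per double-spaced page is an assumed rule of thumb, not a standard from the text:

```python
# Back-of-the-envelope check of the 15-minutes-a-day habit.
# WORDS_PER_PAGE is an assumed rule of thumb for a double-spaced
# 12-point page, not a fixed standard.
WORDS_PER_PAGE = 250
TARGET_PAGES = 200    # the "average" report length used above
DAILY_WORDS = 140     # roughly five sentences
DAYS = 365

target_words = TARGET_PAGES * WORDS_PER_PAGE   # total words needed
yearly_output = DAILY_WORDS * DAYS             # words written in a year

print(target_words, yearly_output)   # 50000 51100
```

A year of five sentences a day leaves you with a small surplus over a 200-page report.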

The Importance of Planning

It is best to prioritise creating an overall outline for the thesis or dissertation in a working document—your first 15 minutes could be used to generate chapter headings. The general sequence of a thesis is an introduction (problem definition and statement, the purpose of the research and how you will go about it), a literature review (the social and knowledge context and a theoretical frame for understanding the problem), the methods used in detail, the documentation of the results or the analysis or interpretation of the data (statistics and/or themes), and the conclusions you have reached based on your analysis.

The next priority would be to outline the required sections within each chapter because there are requirements when writing a thesis. For example, you have to address the ethics affecting your research design. More broadly speaking, the reader must be convinced that conducting this research is worthwhile (relevant), well-thought-through and executed. That will ensure your conclusions are trustworthy, a considered opinion rather than fiction. 

Once you have a working outline—because the outline and headings will change as you progress and they become more creative, descriptive, and meaningful—set up a linked table of contents to facilitate you moving around the document with speed and then commit to keeping a journal of what you read and think in odd moments or when reading in your field or processing data. Once you know where you are going with your writing, you can peruse your journal in your fifteen minutes each day and place statements within the outline you have created. Sentence by sentence, your outline will grow into the final report you will submit. Essentially, having an outline breaks what could be experienced as an overwhelming task into bite-sized pieces: You will simply fill in the spaces for fifteen minutes a day for a year.

Uninspired Moments

Long-term writing projects inevitably risk running out of steam because we cannot feel inspired to write on demand every day, even for fifteen minutes, and even if we love writing and have flexible schedules. Remember that there are many mundane tasks when writing a thesis: listing References, generating tables to support your claims, and creating diagrams to show links between concepts and themes. If you feel completely uninspired about constructing sentences, you can spend that fifteen minutes on those mundane tasks, filling in blanks that would otherwise be left for the final stretch.

Just 1% of your day

Albert Einstein allegedly said, “Anyone can be a genius, if they pick just one specific subject and study it diligently just 15 minutes each day.” [2] Whether you desire to be a genius or just upskill and graduate with a master’s or PhD, 15 minutes of daily focused activity, just 1% of your day [3], can assist even the busiest person to complete a master’s or PhD research report without burning out or disrupting your work-life balance.

If you start with a working outline, your fifteen minutes can be spent filling in the gaps, sentence by sentence, as you build the vehicle that will carry you forth in your profession. When feeling inspired is challenging, remember that listing just eight References or creating a table or figure brings you a step closer to achieving your dream of graduating with that sought-after master’s or PhD degree.

References

[1] Stapleton, Andrew. (2024). How long is a thesis or dissertation? [The data]. Academia Insider. https://academiainsider.com/how-long-is-a-thesis-or-dissertation/

[2] Albert Einstein quotes. (2024). Quotefancy.com. https://quotefancy.com/quote/763678/Albert-Einstein-Anyone-can-be-a-genius-if-they-pick-just-one-specific-subject-and-study

[3] Idea Vision Action. (2018, January 30). Create Your Dream Life, 15 Minutes a Day. https://ideavisionaction.com/personal-development/create-your-dream-life-15-minutes-a-day/

AI is Limited!

I quietly chortled when I found this cartoon on “my wall” the other night. Several interactions back, I suggested to “my coach” (#remotania) that in a world where AI writes for people, the odd grammar and spelling error will make the difference between feeling like you have been addressed by a machine or encountered a real human being.

That aside, there is an avalanche of concern and fear porn about AI and its capabilities. Many assume AI is so ‘intelligent’ that it threatens the value of human consciousness. Yes, in a worst-case scenario, homo tech may turn against homo sapiens and seek to extinguish them. Realistically, it will probably just make life complicated and inspire us to find ways around it, because AI’s limitations extend far beyond the irritation of being addressed by chatbots and of automatic subtitles misheard and written up. Let me elaborate.

At its root, AI is dependent on homo sapiens’ continued existence. First, how will machine intelligence, homo tech, keep running without a homo sapien to maintain it? Machines are made of parts separately produced and put together—they are made, not created. There are industries built around that. Homo sapiens created homo tech. AI’s potential is, therefore, less than that of its creator, in the same way as homo sapiens are less than their Creator. Second, from where will homo tech access power? Batteries do not last forever. They have to be charged and replaced, and access to electrical charge is becoming a problem the world over. Gas lines break for whatever reason, manufactured components are subject to trade and weather events, and the electrical power grid is vulnerable to being fried by a solar flare. I mean, what could go wrong with machine intelligence under those conditions?

In the same vein, how will homo tech endure the nonsterile and hostile conditions that are the foundation of the organic realm? Earth is an evolving and creative process that has unfolded over billions of years. Being organic requires creations to rot so other, more evolved creations can grow, which is a messy process. How will homo tech, or the transhuman dream for humanity, endure hailstorms and floods, the fine dust in Africa gumming up the moving parts, and the searing heat that embrittles the insulation keeping the former two away from the moving parts and circuits?

Of course, it is not just the pseudo-human robots people worry about but also the algorithms that control human consciousness; however, there is always a choice. I work on a freelance platform. If I followed the possibilities its algorithms suggest, I would be completely off track—the platform has no idea what I have done there in the last two years, despite the gigs I have completed (with 100% success and mostly stellar ratings), because it is designed to collect specific information rather than my experience of the job. Moreover, algorithms, if they work, are easy to manipulate: You have a choice. After losing Buddy, my canine travelling companion, in an accident, I watched my Facebook wall fill with animal stories and animal rescue concerns begging for funds. So, AI may limit its offerings and censor posts and information on social media platforms, but it is not in control of what I see—it is just very limited. Besides, if I do not want to see something, I think nothing of clicking the X in the corner of a post.

Finally, AI does not have a heart. For example, I receive an email thanking me for attending a meeting from which I was absent, or an invitation to pay for a product I am already using. The programming that processed the information did not consider who attended the meeting, nor did AI understand the contents of the email it was programmed to send. AI does not have a personal touch, and in the future, that is what will make the difference.

Like it or not, AI is here to stay, and I concede it is brilliant at gathering information and can process that information, the bytes, faster than any human. However, intelligence is not just about processing information. It is also about relating to others, feeling inspired to think or do, and creating something new from the information processed. The human brain is organic, a pulsating spiral of connections and constellations, not a linear series of 1s and zeros. So, to survive the avalanche of AI, corporations, economic collectives, and vocations will need to focus on their hearts: They will need to be entities that relate personally and whose service is humanized.

Simplifying Sampling Strategies

Ideally, if one wants to know the answer to a research question, for example, to what extent people’s buying behavior is influenced by an advertisement, one would need to access everyone who viewed that advertisement.


Accessing everyone concerned, however, is well-nigh impossible; therefore, when conducting research, one accesses a sample or portion of those who viewed the advertisement.

The question that arises, then, is how representative that sample is, because if it is not representative of the population who have seen the advertisement, the ability to generalize the results of the research will be limited. That brings us to a discussion of the various types of samples and their limitations for generalizing results.

There are basically two types of sampling: probability sampling and non-probability sampling. For the most part, quantitative methods, or methods that crunch numbers, require a probability sample.

Probability Sampling

Four kinds of probability sampling exist (Shin, 2020), all of which require a sampling frame, in other words, a database of elements that pertain to the population on which the research is focused and from whom (or which, if you are doing research in the life sciences) you will gather a sample.

In probability sampling, the size of the sample counts. A general rule is the smaller the sampling frame, the higher the percentage of participants chosen, because many statistical equations require a certain number of responses (f) to be executed. For example, a chi-square test requires an expected frequency of more than 5 for any cross-tabulated variable (cell), so larger samples are essential (Gravetter & Wallnau, 2005).
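To make the expected-frequency rule concrete, here is a minimal sketch, with made-up counts, of how the expected frequency for each cell of a cross-tabulation is derived:

```python
# Expected frequency for each cell of a cross-tabulation:
# E = row total * column total / grand total.
# The observed counts below are hypothetical.
observed = [
    [12, 8],    # group 1: satisfied, unsatisfied
    [20, 10],   # group 2: satisfied, unsatisfied
]

row_totals = [sum(row) for row in observed]            # [20, 30]
col_totals = [sum(col) for col in zip(*observed)]      # [32, 18]
grand_total = sum(row_totals)                          # 50

expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# The rule of thumb: every expected frequency should exceed 5.
print(expected, all(cell > 5 for row in expected for cell in row))
```

With these counts, every expected frequency clears the threshold; shrink any row or column total far enough and a cell would drop below 5, which is why larger samples matter.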

The most representative sample would be a simple random sample, in which every person in the population in which you are interested has an equal chance of being chosen to participate in the research. For example, if you were interested in the needs of homeowners, you might access a list of all ratepayers in a city; ratepayers in your city would be the sampling frame. Likewise, if I were researching the extent to which patients are satisfied with their treatment at a specific hospital, I would access the database of everyone who has visited that hospital over the last three years (the sampling frame), assign a number to each of those patients, and then choose a sample using a random-number table or generator. That ensures that every patient has an equal chance of being chosen.
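In code, the random-number-generator version of this procedure might look like the following sketch, with a hypothetical list of patient IDs standing in for the hospital database:

```python
import random

# A sketch of simple random sampling. The sampling frame is a
# hypothetical list of 1,000 patient IDs; the seed is only for
# reproducibility of the illustration.
random.seed(42)

sampling_frame = [f"patient_{n:04d}" for n in range(1, 1001)]
sample = random.sample(sampling_frame, k=100)  # every patient equally likely

print(len(sample), sample[:3])
```

`random.sample` draws without replacement, so no patient is chosen twice.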

Even simple random sampling, however, may not be as random as one thinks. Some patients may have died over the last three years and others moved without providing a forwarding address. In other instances, the sampling frame may not be ideal. For example, if one is trying to establish the market for fridges in a particular area and the only accessible sampling frame is ratepayers, ratepayers may have a different demographic from those who do not pay rates by virtue of not being able to afford to buy a home, so in the end, the data represent only the market for fridges among people who own homes. If one wants a more accurate view, one must find a different sampling frame, for example, people who access electricity, because that would include both home and apartment dwellers.

A second option is a systematic random sampling strategy, where one samples every, for example, 5th or 10th person on a list in a database. For example, I might use a data provider’s list of smartphone numbers and call every 10th number, or begin with a map of the suburbs and visit every fifth house on every fifth block to establish whether residents have seen the advertisement and, if so, to what extent their buying behavior was influenced by it. Arguably, approaching every fifth person entering a mall, shopping center, or even store would also be considered systematic random sampling.
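The every-k-th-element idea can be sketched in a few lines; the frame of phone numbers here is invented, and the random start keeps the choice of which "every 10th" sequence is drawn unbiased:

```python
import random

# A sketch of systematic random sampling: pick a random starting point,
# then take every k-th element of the frame. Phone numbers are invented.
random.seed(7)

frame = [f"+00-555-{n:04d}" for n in range(5000)]
k = 10
start = random.randrange(k)   # random start in [0, k)
sample = frame[start::k]      # every 10th entry thereafter

print(len(sample), sample[:2])
```

A frame of 5,000 numbers sampled at every 10th entry always yields 500 numbers, whatever the starting point.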

The random nature of systematic random sampling may be compromised by participants choosing not to answer an unknown number, or by having the phone put down or the door slammed in one’s face. And there is no guarantee people will respond to emails requesting their participation, so very often a systematic random sample is not random but a volunteer sample, in other words, a sample of people willing to participate in the research.

So, given that simple and systematic random samples may not be possible, because a suitable sampling frame may not exist and/or those chosen decline to participate, the next best bet is a stratified random sample. Here one divides the population into groups with similar attributes (Health Knowledge, n.d.), for example, people living in standalone homes and people living in apartments, and randomly samples each group. Or, if I am exploring the effects of an advertisement for a particular fridge, I might access an electrical company's customer database and randomly sample only those who use a certain number of units of electricity, because it takes a certain number of units to run a fridge in addition to other appliances. Stratified random sampling is also useful if one wants to make comparisons. In a study of the health outcomes of nursing staff across seven hospitals, each with a different number of nursing staff, it would be appropriate to sample from each hospital proportionally, so the hospitals with more nursing staff constitute a larger proportion of the sample. And if I am going to use chi-square, I had best ensure the samples from the smaller hospitals are large enough that, on any cross-tabulation, the expected frequency in each cell is more than 5.
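Proportional allocation across strata is simple arithmetic. A sketch with hypothetical staff counts for seven hospitals and an overall sample of 250:

```python
# Hypothetical nursing staff counts at seven hospitals.
staff = {"A": 900, "B": 600, "C": 450, "D": 300, "E": 150, "F": 75, "G": 25}
total = sum(staff.values())  # 2,500 nurses in all
n = 250                      # desired overall sample size

# Each hospital contributes in proportion to its share of the total.
allocation = {h: round(n * count / total) for h, count in staff.items()}
print(allocation)

# Hospital G contributes only 2 nurses under strict proportionality;
# for a chi-square analysis one would over-sample the smallest strata
# to keep expected cell frequencies above 5.
```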

A final probability strategy is cluster sampling. For example, if I am conducting research in education about the efficacy of a particular Math module, there may be five classes at one school using that module and seven at another school, and I might choose just one class from each school based on the assumption that the classes not chosen would demonstrate the same dynamics as the classes I chose for my sample.
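Cluster sampling selects whole groups rather than individuals. A sketch with hypothetical class lists for the two schools:

```python
import random

# Hypothetical clusters: classes using the Math module at two schools.
classes = {
    "School 1": ["1A", "1B", "1C", "1D", "1E"],
    "School 2": ["2A", "2B", "2C", "2D", "2E", "2F", "2G"],
}

random.seed(7)
# Randomly select one whole class (cluster) per school; every pupil in
# the chosen class is then included in the study.
chosen = {school: random.choice(class_list)
          for school, class_list in classes.items()}
print(chosen)
```

The working assumption, as noted above, is that the unchosen classes would exhibit the same dynamics as the ones selected.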

Non-Probability Sampling

Non-probability sampling includes convenience sampling, quota sampling, purposive sampling, and snowball sampling.

Quota sampling is a strategy most often used by market researchers, who are given a quota of specific types of people to recruit. For example, if the research question is who is buying nonfiction books, interviewers might be asked to recruit a certain number of adolescents, young adults, and adults over the age of 40 based on the proportion of those categories in the general population, so that, ideally, the sample represents the proportions of those age groups in the population.
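Quota sampling can be sketched as a gatekeeper that accepts willing respondents only while their group's quota remains open. The quotas and the stream of passers-by below are hypothetical:

```python
# Hypothetical quotas proportional to the age profile of the population.
quotas = {"adolescent": 15, "young adult": 35, "over 40": 50}
filled = {group: 0 for group in quotas}

def recruit(group):
    """Accept a willing respondent only if that group's quota is open."""
    if filled[group] < quotas[group]:
        filled[group] += 1
        return True
    return False  # quota full; the interviewer moves on

# Interviewers keep approaching people until every quota is filled.
stream = ["over 40", "young adult", "adolescent"] * 60  # assumed passers-by
for person in stream:
    recruit(person)

print(filled)  # matches the quotas once enough people have been approached
```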

Convenience sampling is often used in the social sciences and humanities because (a) participants are protected by the ethic of informed consent and (b) participants have the right to withdraw from participation at any point and without prejudice. Given those constraints, convenience sampling is the easiest way to recruit available and willing participants.

With convenience samples, representativeness is severely compromised because those who volunteer to participate may have very different profiles from those who do not. For example, social media has become a popular means of distributing surveys, but one has to bear in mind that not everyone uses Facebook and/or Instagram and that those who volunteer may be people with time on their hands rather than busy professionals. That may skew the sample towards people who are not employed or are underemployed, and the less random the sample, the less reliable the statistical manipulations. However, while convenience samples may not be appropriate for asking how many and to what extent, they can answer the what, how, and why questions, or assist with describing components and processes and explaining their connections.

A third non-probability strategy is purposive sampling, which means earmarking specific individuals who would make suitable participants and inviting them to participate. This strategy is most often used in qualitative research, the assumption being that the person can comment on the focus of the research. For example, if one is exploring the meaning of boredom, it would be pointless to include people who claim they are never bored. Purposive sampling is the most time- and cost-effective strategy for recruitment, but the least representative and generalizable, and it yields the most time-consuming data to process. One can apply software that helps, but software only helps: it does not distill the understanding for one.

Snowball sampling is generally used in the social sciences to access groups that are difficult to reach. For example, before it became trendy to be part of the LGBT+ community, one would ask an interviewee to nominate two members of the community who would be willing to share their experiences or opinions, and those two would in turn nominate two more, and thus the sample would grow. The danger of such samples is that one ends up examining a subculture of the culture on which one is focused.

Defining the Inclusion and Exclusion Criteria

When writing about the sampling strategy chosen, it is critically important to define both the inclusion and exclusion criteria for your sample. For example, if one is going to explore the effects of secondary trauma among neighborhood watch volunteers, it is critical that those participating (a) have experienced secondary trauma within a particular timeframe, (b) are active members of a neighborhood watch, and (c) are volunteers and not paid security personnel. Being paid security personnel would be a reason to exclude them from the sample.

Likewise, if one is exploring the impact of being terminated from one's employment for not having had the corporate-mandated jab, one would not interview those who were not working in a corporation that mandated the jab, those who obeyed the mandate, or those who did not have their employment terminated for refusing the jab. None of those potential participants would be able to speak about the experience of being terminated for that particular reason. Similarly, it would be pointless to ask people who have not viewed an advertisement how it affected them. So, if one were using a survey method, the first filter question might be, “Have you viewed the said advertisement?” If not, the survey would be terminated with that person. Of course, information about how many people did and did not view the advertisement would be useful, but the latter could not offer an opinion about an advertisement they have not seen.

Some Conclusions

Samples, and the strategies used to choose them, are important because applying statistics, making valid claims about what the data says, and generalizing the findings of the research all depend on the sample accurately representing the population in which you are interested and about which you are making claims. At the same time, sampling in the humanities and social sciences is subject to biases that make representativeness questionable, because, ethically, a researcher has little control over who chooses to participate and/or drop out. Moreover, a sample may not be as random as assumed, because the return rate for a questionnaire, even one sent with a self-addressed and stamped envelope or by email, might be skewed towards those who have the time and motivation to complete it.

There are ways and means of evaluating the extent to which a sample is representative after the fact. One way is to compare the demographic data of the sample (age, education, income, gender, etc.) to the demographics of the general population about which you intend to generalize, if such data are available. That not only underlines the importance of collecting at least some basic demographic data but also allows one to understand which categories of the population may skew the results. Knowing, for example, that people over the age of 60 are over-represented in a sample allows a researcher to temper the interpretation of the processed data. On the other hand, if one can show that the demographics of the sample match those of the population on which one is focusing, that strengthens one's ability to generalize the results to that population. So, collecting the relevant demographic facts about the sample is not just about being able to introduce the sample; it also allows one to assess the degree to which the sample represents the population in which one is interested.
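That comparison can be made concrete by lining up sample shares against known population shares. The figures below are hypothetical, and the five-percentage-point tolerance is an arbitrary illustration, not a standard:

```python
# Hypothetical population proportions and sample counts (n = 200).
population = {"under 30": 0.35, "30-60": 0.45, "over 60": 0.20}
sample_counts = {"under 30": 40, "30-60": 80, "over 60": 80}
n = sum(sample_counts.values())

for group, pop_share in population.items():
    sample_share = sample_counts[group] / n
    if sample_share > pop_share + 0.05:
        flag = "over-represented"
    elif sample_share < pop_share - 0.05:
        flag = "under-represented"
    else:
        flag = "roughly matched"
    print(f"{group}: population {pop_share:.0%}, sample {sample_share:.0%} -> {flag}")
```

Here the over-60s would be flagged as over-represented, signaling that interpretation of the results should be tempered accordingly.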

Finally, it would be well to remember that sampling and the associated statistics, even for a probability sample, are based on probabilities, not certainties. So, even when people insist that one attend to the science, as if science offers truth, bear in mind that even hard science is not about proof or truth but about what is most probably true, all things considered.

References    

Shin, T. (2020, October 25). Four types of random sampling techniques explained with visuals. Towards Data Science. https://towardsdatascience.com/four-types-of-random-sampling-techniques-explained-with-visuals-d8c7bcba072a

Health Knowledge. (n.d.). Methods of sampling from a population. https://www.healthknowledge.org.uk/public-health-textbook/research-methods/1a-epidemiology/methods-of-sampling-population

Gravetter, F. J., & Wallnau, L. B. (2005). Essentials of statistics for the behavioral sciences (5th ed.). Wadsworth.

Proposing Research: A Brief Summary

If you have been following the posts shared over the past several months, you should at least understand the process involved when writing a proposal for conducting research.


You would know that it all begins with a research question and then a series of informed decisions about how that research question could be answered.

Through a review of the literature, you have identified the language you will use to frame and convey your understanding of the phenomenon on which you have focused—the field, paradigm, theories, concepts and/or models you will use to make sense of the data to be collected.

You would also be aware that the way you answer the question, the method, with all its assumptions and limitations, needs to be able to answer the research question posed.

Finally, you will have realized that you have to write in a way that demonstrates the literature reviewed and the method chosen are based on an informed and considered decision-making process.

Essentially, when writing a proposal, you are presenting an argument to convince the audience that the research question is worth answering and that the method by which you will answer it will deliver valid and/or trustworthy results that add value to the field's knowledge base and/or humanity in general.

None of this can be done, of course, without acknowledging others who have raised the same kinds of questions and/or used the same kinds of methods, which brings us to the issue of academic styles, and inevitably, citations and reference entries.

The primary academic styles are APA, Harvard (all 64 versions registered on Zotero), Chicago (notes-bibliography or author-date), and MLA. Consider the following examples of how the same journal article would be formatted in the different styles:

APA

Butler, K. (2001). Defining diaspora, refining a discourse. Diaspora, 10(2), 189‒219. https://doi.org/10.1353/dsp.2011.0014

  • As Butler (2001) explained, “Quote” (p. 190).
  • (Butler, 2001, p. 190).

Harvard

Butler, K 2001, ‘Defining diaspora, refining a discourse’, Diaspora, vol. 10, no. 2, pp. 189–219, <https://doi.org/10.1353/dsp.2011.0014>.

  • As Butler (2001, p. 190) explained, “Quote.”
  • (Butler 2001, p. 190).

In some forms of Harvard, the page number would be introduced with a colon:

  • As Butler (2001:190) explained, “Quote.”
  • (Butler 2001:190).

Chicago

Butler, K. 2001. “Defining diaspora, refining a discourse.” Diaspora 10, no. 2: 189–219. https://doi.org/10.1353/dsp.2011.0014

  • As Butler (2001, 190) explained, “Quote.”
  • (Butler 2001, 190).

MLA

Butler, K. “Defining diaspora, refining a discourse.” Diaspora, vol. 10, no. 2, 2001, pp. 189–219. https://doi.org/10.1353/dsp.2011.0014

  • As Butler explained, “Quote” (190).
  • (Butler 190).

Notice that all styles document the same basic details in the References (or for MLA, Works Cited) lists, but the order of the information and punctuation or absence thereof is different. The basic information is

  • Author (which could be an organization, for example, the World Health Organization),
  • date of publication,
  • name of the work,
  • name of the container (if a journal or news source, include the volume and issue numbers as well as page numbers) or publisher,
  • the URL or DOI number.

Moreover, all but MLA expect that, when citing an author, the date of publication be documented at least the first time that author is mentioned in a paragraph. And all but Harvard would format the list with a hanging indent.

To make it even more confusing, each journal and institution makes minute changes to the main styles to “make it their own.” Given that, the best course of action is to seek out the style guide for the journal or institution and, failing that, to be absolutely consistent in how citations and reference entries are presented.

Validity in Qualitative Studies

The validity of a qualitative approach and its methods is determined by how trustworthy the data is. Demonstrating that can take some convincing, because quantitative methods remain the dominant paradigm even in the social sciences, and from that perspective conclusions based on qualitative methods have little validity: the data is anecdotal, narrative in nature, and subjective.

Photo by Jo Szczepanska on Unsplash: a common means of processing raw qualitative data into themes is to use different-colored Post-it notes.

The late Dreyer Kruger, a leading phenomenologist in his time, impressed upon us undergraduates in the Psychology Department at Rhodes University the importance of being “rigorous, systematic, and methodical” in the explication of experiential texts, or people's descriptions of what it means to be anything a human being has the potential to be or experience. In qualitative approaches, one is not chasing answers to how much or how many variables affect a phenomenon but attempting to elucidate what it means to be subject to a phenomenon, be it being a leader or follower, a narcissist or borderline, a perpetrator or victim of a crime, or a marketer or customer loyal to a particular brand.

Rigorous means being thoughtful, deliberate, and diligent about interrogating the personal and theoretical lenses with which one approaches the data and the biases within those lenses. Human beings inevitably see the world from a point of view, however broad, and one must be honest about the limitations of one's vision.

Systematic means keeping a record of the decisions made and their implementation at the level of both the data and the interpretation, so that one has an audit trail that can be followed. One explicates rather than analyzes because one is attempting to understand the complexity of the whole, not break the whole down into its component parts.

Methodical means one has a goal. The focus in qualitative research is directed to answering a particular research question, and one pursues that goal step by step, consciously and deliberately, with both forethought and afterthought. It means revisiting the data already explicated as new themes emerge to consider whether a new theme is evident in, or at least does not contradict, the experiences in already-processed transcripts. So, rather than being methodical in a linear sense, it is being methodical in a circular fashion, allowing what arises initially to inform one's interpretation and allowing what arises later to inform one's earlier interpretation.

Because the participant is recognized as a subject, and the insight sought concerns the world or phenomenon from his or her point of view, one engages the participant in the research as a subject rather than an object. One means of establishing validity in qualitative research is therefore to take the transcript, preferably a readable summary, and, in the best case, the distilled common themes back to the participant to check in. The intent is to ensure the participant does not feel his or her meaning was misrepresented. What I call collaborative validity is also an attempt to give back to those who made themselves available for the research.

Another form of validity is intersubjective validity, which operates at the level of processing the texts. For example, one might ask colleagues or friends to identify themes independently and then compare the themes in order to challenge one another and come to some degree of consensus about what is essential for understanding the phenomenon, whether the focus is burnout, comfort shopping, or loyalty to a brand or organization.

Triangulation is therefore also an important means of ensuring validity in qualitative research, insofar as one involves several participants (data triangulation), processors of the data (investigator triangulation), and lenses (theoretical triangulation).

Finally, reflexivity, or disciplined self-reflection about one's lenses and the process, “is perhaps the most distinctive feature of qualitative research” (Banister et al., 1994, p. 149). In qualitative research, the influence of the researcher's life experience on the construction of knowledge is centralized rather than marginalized, and reflexivity involves not only being honest about one's personal and theoretical lenses but also continuously and critically examining the process of the research to reveal biases, values, and assumptions that have a bearing on one's interpretation. This is most often done by keeping a journal that documents what one did, when, and why, and the decisions made with their rationales. Including a brief summary of this audit trail allows the reader to evaluate the validity of the conclusions of the research.

So, if you choose to apply a qualitative method in your research, be prepared to raise your level of self-awareness; disclose and critically interrogate your own theoretical lenses and personal biases; reflect upon and defend every research decision made; and engage in an ever-deepening spiral of understanding of the phenomenon you chose to examine. 

References

Kruger, D. (1979). An introduction to phenomenological psychology. Juta.

Banister, P., Burman, E., Parker, I., Taylor, M., & Tindall, C. (1994). Qualitative methods in psychology: A research guide. Open University Press.

The Value of Validity II

Internal and External Validity

In the previous blog, I attended to specific types of validity that need to be addressed when applying a quantitative research design, including content validity, face validity, criterion-related validity, and construct validity.


As if that were not enough, one also has to design a methodology around internal validity, the extent to which conclusions about relationships between variables are likely to be true given the measures used, the research setting, and the whole research design, and external validity, the extent to which one may generalize from the sample studied to the defined target population as well as to other populations in time and space.

Experimental techniques involve measuring the effect of an independent variable on a dependent variable under highly controlled conditions, for example, one measures how stressed a participant is before and after an intervention. Such designs usually allow for high degrees of internal validity. There are a number of extraneous factors, however, that may threaten the internal validity of even an experimental design:

  • History factors pertain to specific events that occur between first and second measurements in addition to the experimental variables. For example, when seeking to measure the effectiveness of a post-traumatic stress intervention, a traumatic event between the pre-intervention and post-intervention tests may affect the degree to which the intervention offered can be said to be effective.
  • Maturation factors pertain to processes that occur within participants due to the passage of time as opposed to specific events. For example, participants becoming hungry and tired between pre- and post-tests on the same day may well affect the results. If measuring the effectiveness of meditation for lowering stress levels, for instance, by the time the post-test is executed, the participants may be feeling exhausted or bored, or be worrying about what is going on at home due to their prolonged absence, and that may affect their responses.
  • Testing factors pertain to the effects of taking a test upon the scores of a second test, particularly if the same test is used to compare pre-test and post-test scores in, for example, language proficiency tests where one gives the test, gives a lesson, and then uses the same test to assess the change in participants’ proficiency. In this instance, it might be preferable to use and compare the results of two tests that have been shown to have high convergent validity.
  • Instrumentation factors pertain to changes in the calibration of a measurement tool. For example, when using a peak flow meter to measure the breath force of a person suffering from asthma, it is advisable to use the same brand of peak flow meter throughout, because different brands have different calibrations. In other words, if one measured the effectiveness of an asthma medication using different brands of peak flow meter, the pre- and post-medication measures obtained would likely be misleading. Likewise, an observer may have read more about a topic between pre- and post-intervention observations and note aspects in the post-treatment phase that he or she would not have thought to note in the pre-treatment phase. Changes may thus be noted that are a product not of the intervention or treatment but of the observer's increased knowledge.
  • Statistical regression factors occur where participants with extreme scores are included in the analysis. Most often in statistical analyses, especially a Pearson product-moment correlation, outliers, or participants with extreme scores, would be excluded from the analysis.
  • Selection factor biases occur due to the differential selection of participants, which is why most quantitative research designs ideally use random samples and/or discuss the criteria for selection in detail upfront. For example, recruiting volunteers from social media may capture a particular demographic, because not everyone participates on social media platforms: Generation X rather than Baby Boomers, say, and/or people who are unemployed or underemployed and have the time to complete questionnaires. Moreover, the characteristics of those who volunteer may differ from those who do not. Selection bias threatens the generalizability of the data unless one is looking at a construct that is peculiar to the demographic.
  • Experimental mortality pertains to the differential loss of respondents from the comparison groups. For example, if one is examining adolescent development in a longitudinal research design, the chances are that over the five years’ duration of the research, some adolescents who participated in the first stages of the study may move away or lose interest.
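The statistical-regression point above, excluding extreme scores before computing a Pearson correlation, can be sketched as a simple trim. The scores and the two-standard-deviation cut-off are hypothetical choices for illustration, not a universal rule:

```python
import statistics

# Hypothetical test scores with one extreme outlier (147).
scores = [52, 55, 48, 61, 57, 50, 59, 54, 147]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)

# Drop any score more than two standard deviations from the mean
# before running the correlation.
trimmed = [x for x in scores if abs(x - mean) <= 2 * sd]
print(trimmed)  # the outlier 147 is excluded
```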

Four factors might jeopardize external validity or representativeness of one’s research findings:

  • Reactive or interaction effects of testing occur where a pretest might increase the scores on a post-test because practice makes perfect. This threat may be overcome, at least to some degree, by comparing the pretest and post-test means for the sample, or by using an equivalent test with high convergent validity.
  • Reactive effects of experimental arrangements may also affect the external validity of one's findings. Experimental settings are often artificial, and one cannot ignore the Hawthorne effect: when people know they are being observed, contributing to research data, or having their personality assessed, their behavior may change. Moreover, we may ask participants to answer the questions as honestly as possible, but that does not guarantee they will. Questions may also be interpreted differently by different people. With respect to a question like “Do you often feel angry?”, how often is often? My often may not be your often. Even for the statement “I feel angry most of the time,” what does most mean? Ultimately, the results of any experimental design, even in hard science, can be questioned on the grounds that the experimental situation can only ever approximate reality.
  • Multiple-treatment interference occurs when the effects of earlier treatments are not erasable. Moreover, a participant may be undergoing treatments other than the one being tested. One might be testing the merits of meditation as a stress reliever, but the participant may also be engaged in therapy as well as a host of other means of alleviating stress, so one cannot be sure that it was the meditation that reduced the stress level, or indeed whether being in therapy interfered with the efficacy of the meditation, because therapy often only works if a certain degree of anxiety is present on the part of the patient.
  • The interaction effects of selection biases and the experimental variable may also threaten the validity of research. Clearly, selection biases may negatively affect both internal and external validity, so it is critical that you think carefully about how you will select participants and about the extent to which those selection criteria will limit the generalizability of the findings. For example, it may be true that students are more likely to embrace remote work, but if the only participants selected are students, one cannot generalize the findings to people on the edge of retirement. In most instances, perfectly random selection is not possible because it would require a list of everyone in the population of interest from which to select a random sample.

Finally, there is ecological validity, or the extent to which the results of the research can be applied to real-life situations. For example, an actual driving test would have more ecological validity than a simulated driving test.

In most instances, a methodology chapter would include a section devoted to discussing the particular threats to the validity of the research design and the extent to which the findings may be limited by the selection of participants, the methods of measurement chosen, and the context in which the research will be or was undertaken. The conclusion to the research would then remind the reader of those threats so that the reader can take the limitations into account when generalizing the findings.

The most important aspect to remember when discussing validity issues is to discuss only those that pertain to your research. For example, a once-off cross-sectional design would not be subject to maturation factors or participants dropping out, whereas discussing those issues is critical in a longitudinal design. So be clear about which types of validity apply to your research, and focus on the threats posed by your particular design and selection criteria in order to make those threats, and how you will deal with them, explicit.

Remember, too, that there is no perfect research design, so it is a question of being aware of what kinds of threats to the validity of your findings exist, making your reader aware of those threats, developing strategies to minimize those threats, and then being honest about the extent to which your findings can be depended upon for making decisions.