How to Use a Codebook for Qualitative Research Analysis

Small business owners and home-based businesses rarely have room in their market research budgets for expensive software to analyze the qualitative data they collect for business development.

However, you can use an ordinary word processing application to conduct text analysis for qualitative market research. These processes can be applied to the analysis of qualitative data collected from survey research, focus group sessions, and in-depth interviews.

The Basics of Coding

The first step in qualitative data analysis is coding. A code is a "label" to tag a concept or a value found in a narrative or text.

Codebooks include definitions of themes and sub-themes that are used as references for the coding of narrative text. The themes can be those actually expressed by the respondents, which are called "in vivo codes," or those that are constructed or inferred by the researcher. Each theme and sub-theme is assigned a specific number that can be used for sorting the text data and for finding the relevant passages in the narrative text again for deeper analysis.

The bottom line is that coding improves reliability because it creates structure and agreement about important definitions, constructs, and themes.

Reliability Issues

Determining which code should be assigned to a particular passage of text is not always obvious. A common reliability problem is that coders or raters do not always code similar passages of text in exactly the same way. Reliability can be improved by using clearly defined categories for coding.

Reliability across two or three coders can be calculated. This inter-rater (or inter-coder) reliability index shows whether the researcher needs to revise the coding scheme. The formula for calculating the index is shown below.

Reliability = # of code agreements / # of total codes

where # of total codes = # of code agreements + # of code disagreements


  1. Code Agreements = passages for which the coders chose the same code
  2. Code Disagreements = passages for which the coders chose different codes
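As a rough illustration of the formula above, the following Python sketch computes the index for two coders. The function name and the sample codes are assumptions made for this example only; they do not come from a real study. The sketch also assumes both coders coded the same passages in the same order.

```python
def percent_agreement(coder_a, coder_b):
    """Return # of code agreements / # of total codes for two coders."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must code the same passages")
    if not coder_a:
        raise ValueError("No coded passages to compare")
    # An agreement is a passage that received the same code from both coders.
    agreements = sum(1 for a, b in zip(coder_a, coder_b) if a == b)
    disagreements = len(coder_a) - agreements
    return agreements / (agreements + disagreements)


# Hypothetical codes assigned by two coders to the same five passages
coder_1 = ["4.05", "4.10", "4.105", "4.15", "4.20"]
coder_2 = ["4.05", "4.10", "4.15", "4.15", "4.205"]

reliability = percent_agreement(coder_1, coder_2)
print(f"Reliability = {reliability:.2f}")  # 3 agreements out of 5 codes: 0.60
```

A low value suggests that the category definitions may need to be revised before more text is coded.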


There are three very different coding strategies:

  1. Codebook creation according to theory
  2. Coding by induction according to "grounded theory"
  3. Coding by ontological categories

Coding According to Theory

When taking a theory-based approach to the creation of a codebook, the market researcher creates a list of concepts based on those found in the research questions or the hypothesis. Using analytical frameworks and analysis grids, the researcher works through the narrative and codes the text according to theoretical reasoning.

Reference Table

As shown in the table below, decimal numeric codes are used to identify themes. Identifying the themes and codes in this manner enables an easy sorting process during the data analysis. The codebook table is separate from the data recording table that is shown in Step 1. The codebook table serves as a reference, but it is not an active table.

In addition to the definitions of the themes and sub-themes, the codebook may contain the criteria used to include or exclude text instances in the thematic categories. Careful, logical design of the codebook and the indexing structure makes coding easier and, ultimately, makes the findings easier to report. A recommended format for a codebook is conceptually logical and follows the conventional sequential structure of an outline.

Level 1   Level 2   Level 3   Theme
4.0000                        Music in the Barrio (Venezuela)
          4.05                Nucleo – music classes M-F afternoon and Sat. mornings
          4.10                Each one teach one – basic agreement
                    4.105     As music students learn, they coach younger students
          4.15                Ensemble music strengthens the sense of community
          4.20                Classical music paves way for social change / social justice
                    4.205     Self-identity of young musicians forever altered
                    4.215     Orchestra elevates social standing and fosters inclusion
Excerpt of 3-Level Code Book Table
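
For readers who keep their coded excerpts in a spreadsheet or script rather than a word processor, here is a minimal Python sketch of how the decimal codes can drive sorting and lookup. The codebook entries are taken from the table above; the passage IDs, the `coded_passages` list, and the function name are hypothetical and exist only for illustration.

```python
from decimal import Decimal

# Reference codebook (excerpt from the table above): code -> theme
CODEBOOK = {
    "4.0000": "Music in the Barrio (Venezuela)",
    "4.05": "Nucleo – music classes M-F afternoon and Sat. mornings",
    "4.10": "Each one teach one – basic agreement",
    "4.105": "As music students learn, they coach younger students",
    "4.15": "Ensemble music strengthens the sense of community",
    "4.20": "Classical music paves way for social change / social justice",
    "4.205": "Self-identity of young musicians forever altered",
    "4.215": "Orchestra elevates social standing and fosters inclusion",
}

# Hypothetical data recording table: (passage ID, assigned code)
coded_passages = [("P7", "4.205"), ("P2", "4.05"), ("P5", "4.15"), ("P3", "4.105")]


def sort_by_code(rows):
    """Order coded passages by the numeric value of their code."""
    # Decimal avoids float rounding and keeps the outline order 4.10 < 4.105 < 4.15.
    return sorted(rows, key=lambda row: Decimal(row[1]))


sorted_rows = sort_by_code(coded_passages)
for passage_id, code in sorted_rows:
    # The codebook is only consulted as a reference; it is never modified.
    print(passage_id, code, CODEBOOK[code])
```

Storing the codes as text and converting them only for sorting preserves the way each code is written in the codebook, such as the trailing zero in 4.10.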