From Speech to Text: Transcription


Universities and the wider scientific community use transcription to convert audio and video material into text data that is easier to use in research. In addition to interview studies, the coronavirus pandemic has greatly increased the number of video meetings and webinars, and recordings of these, such as Teams meetings, can also be sent in for transcription.

Another major theme since 2020 has been the accessibility of video and audio material. Since the European Accessibility Directive took effect, the production of a text equivalent for audiovisual publications has been mandatory. For example, a podcast episode needs a text equivalent in order for the content to reach all potential users. Below, we will go through the various types of transcription and a few things you should take into account when planning transcription.

Which level of transcription is required?

For transcription of research interviews, you can choose either basic-level or exact-level transcription. In basic transcription, speech is transcribed without non-verbal sounds or filler words, whilst exact transcription aims for an exact text equivalent of the recorded speech. In other words, a basic transcription is slightly more edited and more reader-friendly, whereas an exact transcription allows a closer look at features of speech and non-linguistic factors, such as yawns or pauses. Transcriptions can also be produced in standard language, which means that colloquial elements of the material are replaced with their standard-language equivalents.

The material transcribed can be, for example, an unstructured interview (thematic interviews, group interviews), a structured interview, or a semi-structured interview (interviews using forms). We can tailor the transcription to the needs of the study by, for example, anonymising it, adding timestamps, or using an agreed project-specific transcription notation.

What should I take into account when planning a transcription project?

Pay attention to factors such as sound quality already when making the recording. Make sure that the speakers can be heard, that they speak close enough to the microphone and, if it is a video call, that the Internet connection is stable. Note that we charge based on the length of the whole recording, so you may want to edit out any empty or irrelevant parts. File size has a significant effect on how easily the material can be transferred and processed, so the version you send us should be as small as possible, especially for video files, e.g. in low resolution.

When ordering, please provide at least the following information: which language or languages are spoken on the recording, which level of transcription you want, how many speakers there are on the recording, and whether the material contains specialist terminology or specific dialects. This way, we can start the transcription process as soon as possible and choose the right professionals for the project.

Read more about Lingsoft's transcription service here.
