Audio describing foreign films

Agnieszka Szarkowska, University of Warsaw and Anna Jankowska, Jagiellonian University in Kraków


This article presents the main challenges of audio describing foreign films: synchronising the translation of foreign language dialogue with audio description, identifying speakers, describing culture-bound elements, and dealing with intertextuality. The findings are discussed with reference to an exploratory study carried out among Polish viewers with visual impairments. The solutions proposed in this article include the name insertion strategy, audio introductions and a number of strategies to deal with culture-specific items in audio description, such as explicitation, naming, generalisation, specification and retention. The results of the study also demonstrate the feasibility of adopting the Translation Studies toolkit for the benefit of audio description.


Audio description, foreign films, media accessibility, blind and partially sighted viewers, voiceover, synchronisation, audiovisual translation, culture-bound items.

1. Introduction

Audio description, both as an academic endeavour and as professional practice, is usually tackled from an intra-lingual and intra-cultural perspective. An overwhelming majority of currently audio described films are monolingual and their scope does not go beyond one culture, most frequently that of the original production. This may be quite surprising given the increasing globalisation of audiovisual products, manifested, inter alia, by a large number of foreign (usually American) imports. In this article, we attempt to bridge the gap between the growing need for audio describing foreign films and the scarcity of such description both in academic research and on the AD market. The point of departure for this article is the current situation in Poland, where this study took place.

Until recently, audio description for foreign films in Poland was scarce. The vast majority of currently audio described content in Poland consists of domestic productions. However, the newly amended media law1, which obliges broadcasters to provide at least 10% of their programming with subtitling for the hearing impaired, audio description and sign language interpreting, may strengthen the emerging trend towards audio describing foreign films. Interestingly, unlike their counterparts in subtitling countries, Polish blind and partially sighted viewers are ‘lucky’ enough to have the foreign dialogue translated and read out to them in the form of voiceover (VO; for more information on voiceover in Poland see Garcarz 2007 and Szarkowska 2009). The missing component is audio description.

Indeed, one of the most difficult challenges in audio describing foreign films is the need to provide blind and partially sighted viewers not only with a description of what can be seen on screen, but also with a translation of the foreign dialogue. The latter has usually been done either through audio subtitling (AST), also known as spoken subtitles (see Theunisz 2002; Orero 2007; Braun and Orero 2010; Remael 2012; Szarkowska forthcoming), or through voiceover (see Szarkowska and Jankowska 2012). Another important issue when describing foreign films is how to render various elements of foreign culture, especially those that may be immediately recognisable to the primary target audience in the country of the original production, but are less known to the secondary target audience in foreign markets (for more on primary and secondary audience see O’Sullivan 2011).

This article begins with outlining how the translation of foreign dialogue can be combined and synchronised with AD. Then, we discuss the challenges and solutions related to identifying speakers, describing foreign places and rendering culture-bound elements and dealing with intertextuality. The solutions we propose are based on AD strategies, which we hope marks a departure from the objective-subjective paradigm in AD towards an approach drawing on the conceptual framework of Translation Studies. Finally, the article presents a novel solution, i.e. audio introductions, as a supplementary tool which can be used in audio description for foreign films. The solutions and conclusions proposed here are based on a qualitative exploratory study Audio describing foreign films2.

2. Research project on AD for foreign films

This paper reports on a qualitative study “Audio describing foreign films” carried out in the years 2010-2012, the first research project of its kind in Poland. The findings presented here are not based on numerical results, but rather on numerous discussions and unstructured interviews with visually impaired viewers and audio describers regarding a number of issues related to describing the films selected for the purposes of this study: Volver (2006, dir. Pedro Almodóvar), Matchpoint (2005, dir. Woody Allen), Vicky Cristina Barcelona (2008, dir. Woody Allen), Midnight in Paris (2011, dir. Woody Allen), Man on Wire (2008, dir. James Marsh) and Big Fish (2003, dir. Tim Burton). The findings also draw on an analysis of the existing Polish AD scripts for Wadjda (2012, dir. Haifaa Al-Mansour) and Fill the Void (2012, dir. Rama Burshtein), and of the English AD script for The Memoirs of a Geisha (2005, dir. Rob Marshall).

For each film, the following procedure was adopted: first, an AD script was written and consulted with visually impaired viewers. After revisions, the script was recorded and mixed in a professional studio. Finally, open screenings for blind and partially sighted viewers were held, followed by post-screening discussions and questionnaires.

3. Technical and linguistic issues related to combining AD and VO
3.1 Synchronising foreign dialogue, its translation and audio description

One of the challenges while audio describing foreign films is the appropriate combination of the original dialogue, its translation (in the form of VO or AST) and audio description3. The original dialogue needs to be audible so that target viewers can hear and recognise the voices of the original actors, but the volume of the original dialogue should be appropriately adjusted in order to enable the viewers to hear the translation of the dialogue. In Poland, which is often perceived as a stronghold of voiceover (see Glaser 1991; Garcarz 2007; Szarkowska 2009), the original dialogue in foreign films is usually audible and the voiceover translation is provided with a slight delay so that viewers can hear the beginning of the original dialogue (for more information on synchronisation in VO see Franco et al. 2010). Since voiceover, like subtitling, typically contains an abbreviated version of the original dialogue, viewers can often hear not only the beginning of the original dialogue, but also its end — the translation being shorter than the original utterance.

The possibility of hearing the clearly distinguishable voices of the original actors in the background, in addition to the translation, is an important factor contributing to the understanding and enjoyment of the film. A good example of how poor sound mixing, which makes the original voices hard to hear, can spoil the understanding and enjoyment of a film is the audio described version of Volver released on DVD in the UK. Many viewers complained about not being able to distinguish who was speaking at a given point, since the volume of the original dialogue was lowered too much and all the dialogues were read out by one female voice talent in the form of spoken subtitles (Leen Petré, RNIB, personal communication).

In the course of discussions following the screenings of foreign audio described films in our study, the vast majority of viewers stressed the importance of an appropriate synchronisation of all the audio tracks in the film: the original dialogue, its voiceover translation and the AD script as well as important sounds and music. According to the participants of the study, the optimum synchronisation is when the original dialogue can be heard in the background and when AD does not overlap with the original actors’ voices or with the voiceover translation, as shown in Figure 1:

Figure 1. Optimum synchronisation of AD with the original dialogue and VO translation

Should the above-mentioned synchronisation be impossible, for instance owing to limited time available for AD, many viewers stated that it would be acceptable for AD to overlap with and to cover some of the original dialogue, as shown in Figure 2.

Figure 2. Acceptable synchronisation of AD with VO translation

What turned out to be unacceptable to viewers was the option of AD overlapping with the voiceover translation (see Figure 3).

Figure 3. Unacceptable synchronisation of AD with VO translation

The findings are consistent with a number of AD guidelines, where it is stressed that AD should not overlap with dialogue: “description should only occur during non-dialogue pauses; description should never occur over dialogue, musical numbers or sound effects unless absolutely necessary” (Milligan et al. n.d.: 4).
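The three cases shown in Figures 1 to 3 can be summarised as a simple rule. The following minimal Python sketch is an illustration only and was not part of the study; the names `Interval` and `classify_ad_placement`, the use of start and end times in seconds, and the three labels are our assumptions. It checks a single AD segment against the timings of the original dialogue and of the voiceover translation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Interval:
    """A stretch of audio, with start and end times in seconds (hypothetical model)."""
    start: float
    end: float


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two intervals share any stretch of time."""
    return a.start < b.end and b.start < a.end


def classify_ad_placement(ad: Interval,
                          original: List[Interval],
                          voiceover: List[Interval]) -> str:
    """Label one AD segment according to the viewers' stated preferences."""
    # AD over the voiceover translation was rejected by viewers (Figure 3).
    if any(overlaps(ad, vo) for vo in voiceover):
        return "unacceptable"
    # AD over the original dialogue only was tolerated when time is short (Figure 2).
    if any(overlaps(ad, od) for od in original):
        return "acceptable"
    # AD in a pause, clear of both tracks, was preferred (Figure 1).
    return "optimum"


# Example: one original utterance with a shorter, slightly delayed voiceover.
original = [Interval(0.0, 4.0)]
voiceover = [Interval(0.5, 3.0)]

print(classify_ad_placement(Interval(4.5, 6.0), original, voiceover))  # optimum
print(classify_ad_placement(Interval(3.2, 4.5), original, voiceover))  # acceptable
print(classify_ad_placement(Interval(2.5, 3.5), original, voiceover))  # unacceptable
```

The check against the voiceover comes first because an AD segment that covers both tracks should still be flagged as unacceptable.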

3.2 Identifying speakers in audio described voiced-over films

Given the nature of voiceover in Poland, whereby one — usually male — voice talent reads out the translation of all dialogues, it may sometimes be difficult for viewers, sighted and visually impaired alike, to identify who is speaking at a particular time. This problem is particularly pertinent in scenes with many people speaking simultaneously.

One solution is to use the name insertion strategy, i.e. to include speakers’ names in the AD script immediately before their utterance. In one of the experiments from the study, we presented visually impaired viewers with two versions of AD to the same scene from Big Fish, featuring four characters sitting at a table over dinner. The first version contained only a few names inserted just before a character said something. The other version had the speaker’s name almost every time they spoke. Interestingly, while the name insertion strategy was appreciated by a number of viewers with visual impairments, who confirmed it helped them correctly identify the speaker, some viewers stressed that inserting too many names in the AD reminded them that they were watching a film, disrupting their immersion in the story and preventing them from suspending their disbelief.

The golden rule concerning the identification of speakers would therefore seem to be to include speakers’ names in the AD script, particularly in the exposition phase of the film, to help viewers recognise who is talking and learn their voices (for more on the exposition phase in AD see Remael and Vercauteren 2007). The name insertion strategy is also recommended in scenes with many speakers on screen and when introducing new characters.

Table 1 presents a sample from the Polish AD script to Midnight in Paris, demonstrating the name insertion strategy (fragments of the AD script with names are marked [AD]):

Polish VO translation and the name insertion strategy in AD | Original English dialogue and back-translated Polish AD
[AD] Ojciec Inez, John: | [AD] Inez’s father, John:
Nasi zapaleni turyści. | There are our sightseers.
Już mnie mdli od widoku czerwonych bulwarków i bistro. […] | If I never see another charming boulevard or bistro again, I... […]
Hemingway nazwał to „Ruchomym świętem”. | What did Hemingway say? He called it “A moveable feast”.
[AD] Matka Inez, Helen: | [AD] Inez’s mother, Helen:
W dzisiejszych korkach daleko by się nie ruszył. […] | In this traffic nothing moves. […]
Nie należę do frankofilów. | I’m not a big Francophile.
Nienawidzi ich polityków4. | John hates their politics.
Oni za Ameryką też nie przepadają. | Certainly been no friends to the United States.
Trudno ich winić, że nie wleźli za nami w tę iracką króliczą norę, prawda? | Well, you can’t exactly blame them for not following us down that rabbit’s hole in Iraq, can you?

Table 1. Example of the name insertion strategy from Midnight in Paris

Admittedly, the name insertion strategy has already been used in the audio description of domestic/monolingual5 productions. It is however worth noting that its employment in audio described foreign films can be particularly useful in facilitating the recognition of characters given the presence of multiple simultaneous language tracks: original dialogue, voiceover translation and audio description.

In the course of our study, we noted an interesting solution with regard to naming characters in AD proposed by some members of our visually impaired audience. They suggested that they would not mind if a character was named using a set of synonyms, instead of one term only. This goes against AD principles, according to which a character should be referred to with the same term throughout the film.

4. Culture-bound elements

Culture-bound elements, also known as ‘culturemes,’ can be defined as “formalised, socially and juridically embedded phenomena that exist in a particular form or function in only one of the two cultures being compared” (Katan 2009: 71). Pedersen (2011: 43) terms such elements ‘extra-linguistic cultural references’ (ECRs). He defines an ECR as a

reference that is attempted by means of any cultural linguistic expression, which refers to an extralinguistic entity or process. The referent of the said expression may prototypically be assumed to be identifiable to a relevant audience as this referent is within the encyclopaedic knowledge of this audience (2011: 43).

ECRs encompass “a wide array of semantic fields: from geography and traditions to institutions and technologies” (Katan 2009: 71). Díaz Cintas and Remael (2007: 201) enumerate a number of categories that ECRs may fall into, like geographical, ethnographic and socio-political references. In this paper we focus on geographical and ethnographic references.

A number of authors have proposed sets of global- and local-level strategies (also referred to as ‘procedures’, ‘techniques’, etc.) which can be used to tackle culture-bound elements when transferring a foreign film into another language and culture (Hejwowski 2004; Katan 2004; Leppihalme 1994; Newmark 1988; Vinay and Darbelnet 1995; for a general discussion on strategies see Kearns 2009; for classifications specifically about AVT see Díaz Cintas and Remael 2007, Gottlieb 1997, Pedersen 2011).

Pedersen (2011: 75) offers a set of local-level translation strategies to deal with ECRs in subtitling, elements of which will be adopted for AD here. He divides the strategies into two groups: (1) source-oriented, like retention, specification, and direct translation and (2) target-oriented, like generalisation, substitution and omission. Some authors have provided similar sets of strategies for audio description. For instance, Mazur (2014) outlines the strategies aimed at audio describing emotions: literalness, explicitation, generalisation, omission and a combination of strategies. On a more general level, Walczak and Figiel (2013) talk about foreignisation or domestication of AD scripts. All in all, it is important to note that this strategy-based approach marks a significant shift in studies on audio description, away from the objective-subjective paradigm (cf. Szymańska and Strzymiński 2010) towards a more scientific model, drawing on the theoretical framework of Translation Studies.

In what follows we present two sets of techniques to be used in AD for foreign films: one for describing geographical references, and the other for dealing with ethnographic references.

4.1 Geographical references

Among elements of foreign culture that are particularly important in audio description are geographical references (Szarkowska 2012). Film characters are always presented in a setting which needs to be given an accurate term in AD. On the one hand, this term should be relevant to the story world, yet on the other hand, it is important for audio describers, just like translators, to decide whether the place being described is likely to be known to the target audience or not. In the course of the project, we have identified four main strategies of describing foreign places in AD:

4.1.1 Naming

The strategy of naming the foreign place in the film is seemingly the most straightforward and unproblematic. If a character is presented in a certain place, it would seem that this place should simply be named accordingly in audio description. For instance, in Woody Allen’s Matchpoint the gallery where the main character, Chris, meets his brother-in-law’s ex-girlfriend, Nola, is Tate Modern. Now, the question for the audio describer is whether the name of the gallery should be retained as it is, or whether it should be supplemented with some kind of explanatory description for Polish viewers. The decision whether to use the naming strategy or a more descriptive strategy each time depends on the film context and the envisaged familiarity of the target viewers with the place in question.

Naming is a strategy particularly useful when dealing with easily recognisable international landmarks, as shown in Figures 4 and 5.

Figure 4. Naming strategy: the Eiffel Tower

AD: Zamglony zarys Wieży Eiffla wyłania się zza kremowych budynków.

English back translation: A misty outline of the Eiffel Tower emerges from behind creamy buildings.

Figure 5. Naming strategy: the Houses of Parliament

AD: Przed nimi panorama Londynu – Big Ben i Parlament.

English back translation: In front of them London panorama: Big Ben and the (Houses of) Parliament.

The two examples above are rather straightforward as the names presented are easily recognisable. There are other situations, however, where the naming strategy can come in useful even though the places are not immediately obvious from the screen, as shown in Figure 6 from Vicky Cristina Barcelona, where Ben and Vicky are walking along La Rambla. This major tourist attraction is not in any way hinted at in the dialogue nor is it given much attention in the scene. To the person who knows Barcelona, however, the place is immediately recognisable thanks to numerous stalls with flowers and birds.

Figure 6. Naming strategy: La Rambla

AD: Ben i Vicky na Rambli, przy straganie z ptakami.

English back translation: Ben and Vicky on La Rambla, next to a stall with birds.

By weaving foreign names into the AD script, the naming strategy offers viewers an opportunity to experience the foreign flavour of the film. It accurately names the place presented. It may be claimed that the strategy assumes some background knowledge on the part of viewers, as it does not explain the foreign name in any way. This can be regarded both as an advantage and as a downside. On the positive side, hearing the foreign name may evoke memories or associations among viewers who are familiar with the place. On the negative side, however, the term will not evoke any associations in those viewers who are not familiar with it, so the audio describer needs to decide each time whether the naming strategy will be sufficient for the viewers. Interestingly, some viewers in our study pointed out that the naming strategy, as opposed to description without naming, allows them to research a particular place being described if they are unfamiliar with it.

4.1.2 Explicitation

Traditionally, explicitation has been understood, after Vinay and Darbelnet (1995), as “the process of introducing information into the target language which is present only implicitly in the source language, but which can be derived from the context or situation” (Klaudy 1998: 80). For the purpose of AD, explicitation refers to introducing into the AD script some words accompanying proper names/place names, making the function or the nature of those places explicit.

Unlike the naming strategy described in the previous section, the explicitation strategy not only provides viewers with the name of the place, such as Tate Modern, but also supplements this name with a descriptive term, usually relating to the function or character of the place, as in: the Tate Modern gallery (see Figure 7).


Figure 7. Explicitation: the Tate Modern gallery

AD: Chris zmierza w kierunku galerii Tate Modern.

English back translation: Chris is heading towards the Tate Modern gallery.


To give another example, in Matchpoint, the main character, Chris, is offered a position in his father-in-law’s office located in the famous Swiss Re building in London, also known as the Gherkin, designed by Norman Foster. This prestigious location in the City of London may not be well known to all Polish viewers, so in the AD script the explicitation strategy was used, supplementing the name with a more explanatory term (see Figure 8).

Figure 8. Explicitation: the Gherkin office block

AD: Biurowiec “Gherkin”.

English back translation: ‘The Gherkin office block’.

The limited time available in this scene for audio description (1-2 seconds) prevented the audio describer from including a more descriptive and explanatory phrase. The explicitating term biurowiec (‘office block’), however, allowed the viewers to understand the character of the place.

In a similar vein, the explicitation strategy was used in the opening sequence of Midnight in Paris, featuring a number of Parisian landmarks, to describe the Grand Palace (see Figure 9). The name itself was preceded by the term gmach (‘edifice’), which not only explained to the viewers that the name refers to a building, but also gave them an indication of its great size.

Figure 9. Explicitation: the edifice of Grand Palace

AD: W oddali gmach Grand Palais.

English back translation: ‘In the distance the edifice of Grand Palace.’

The explicitation strategy has yet another advantage: it may come in useful for stylistic reasons when the inflected foreign term would sound clumsy or unnatural by itself. When accompanied by an explicitating word, it can be smoothly interwoven into the AD script.

4.1.3 Describing without naming

To continue with the example of Tate Modern, the strategy of describing a foreign place without naming it in a film would contain a description of the physical features of the building and perhaps some more background information, but refrain from directly specifying its name, as in: A large, brick industrial building with a tall chimney, former power station on a river bank.

Although this strategy was not used in any of the films audio described in this study, it may be justifiable to avoid giving the name of the place immediately in some cases, where suspense may be required by the story and not revealing the place name is important to the plot.

Last but not least, the strategy of describing without naming may also be used by audio describers who are unfamiliar with a foreign setting shown in a film (and thus unable to name it). This points to a different set of competences required from audio describers in domestic vs. foreign films: AD for foreign films is more similar to translation, as the competences required from the describer/translator include familiarity with the source culture.

4.1.4 Describing and naming

The last strategy identified in the study with regard to the treatment of foreign places in the AD script was describing and naming (see Figures 10 and 11). This strategy can be used whenever time allows the describer to include not only the name of the foreign place presented in the film, but also a fuller description of its characteristic features. This offers the viewers an opportunity not only to learn where the scene takes place, but also to complete their mental picture of the place with additional information. Thus, a description of Tate Modern with the describing and naming strategy could be as follows: A large, brick industrial building with a tall chimney: Tate Modern.

Figure 10. Describing and naming strategy: Moulin Rouge

AD: Czerwony młyn góruje nad fasadą teatru Moulin Rouge.

English back translation: A red windmill towers above the Moulin Rouge theatre façade.

Figure 11. Describing and naming strategy: London Eye

AD: Powoli obraca się diabelski młyn – London Eye

English back translation: A ferris wheel, the London Eye, is moving slowly.

This strategy may be particularly useful when the place being described is probably not immediately recognisable by the target audience or when its description would enhance the atmosphere of the scene. Needless to say, it can only be used when there is ample time available for AD.

4.1.5 Combining the strategies

The strategies described above can also be used in combination. For instance, the explicitation strategy can be combined with the describing and naming strategy, as in Vicky Cristina Barcelona:

AD: Dziewczyny fotografują fasadę i wnętrze katedry Sagrada Familia o organicznych kształtach. Czerwono-zieloną rzeźbę Miro Jeune fille s'évadant. Spacerują na tarasie widokowym kamienicy Casa Milà. Fasada budynku przypomina fale.

English back translation: The girls are taking pictures of the façade and the interior of the biomorphic Sagrada Familia cathedral, the red and green sculpture of Miró’s Jeune fille s'évadant. They are strolling on a viewing terrace of the Casa Milà historic house. The façade resembles waves.

To recap, when audio describing foreign films, describers have a number of strategies at their disposal. Depending on the time constraints, the requirements of the fabula and their assumptions regarding the familiarity of the target audience with the places being described, they can employ more or less descriptive strategies. Peppering the AD script for a foreign film with names of foreign places offers the viewers a chance to experience the flavour of the ‘foreign’ and to better immerse themselves in the story world. At the same time, however, it requires the describer to be familiar with the foreign culture and ready to research the foreign elements for the purposes of the AD script and for the benefit of the visually impaired audience.
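The considerations listed in this recap can be read as a rough decision procedure. The Python sketch below is a deliberately simplified illustration and not a rule derived from the study; the function name `suggest_place_strategy` and its four yes/no parameters are our assumptions, and in practice the describer's judgement of context cannot be reduced to such flags.

```python
from enum import Enum


class PlaceStrategy(Enum):
    NAMING = "naming"
    EXPLICITATION = "explicitation"
    DESCRIBING_WITHOUT_NAMING = "describing without naming"
    DESCRIBING_AND_NAMING = "describing and naming"


def suggest_place_strategy(describer_knows_name: bool,
                           withhold_name_for_plot: bool,
                           likely_familiar_to_audience: bool,
                           ample_time: bool) -> PlaceStrategy:
    """Suggest a strategy for a foreign place (hypothetical simplification)."""
    # The name cannot or should not be given: suspense, or an unidentified setting.
    if withhold_name_for_plot or not describer_knows_name:
        return PlaceStrategy.DESCRIBING_WITHOUT_NAMING
    # An unfamiliar place with enough time: give both the features and the name.
    if not likely_familiar_to_audience and ample_time:
        return PlaceStrategy.DESCRIBING_AND_NAMING
    # An unfamiliar place with little time: add a short explanatory term to the name.
    if not likely_familiar_to_audience:
        return PlaceStrategy.EXPLICITATION
    # A landmark the audience is assumed to recognise: the name alone.
    return PlaceStrategy.NAMING


# The Eiffel Tower: widely known, so the name alone is suggested.
print(suggest_place_strategy(True, False, True, False).value)   # naming
# The Gherkin with 1-2 seconds available: name plus 'office block'.
print(suggest_place_strategy(True, False, False, False).value)  # explicitation
```

The sketch leaves out one case mentioned above: a familiar place may still be described and named when the description would enhance the atmosphere of the scene.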

4.2 Ethnographic references

Apart from geographical references, when describing foreign films audio describers are likely to encounter other elements of foreign culture, such as ethnographic references. These can include different types of food, clothing, measurements, education, proper names, literature, government, entertainment, sports, currency, etc. (see also Díaz Cintas and Remael 2007: 201 and Pedersen 2011: 59).

Just as in the case of other types of audiovisual translation like subtitling, dubbing and voiceover, ethnographic references may also pose problems in audio description. Depending on their role in the plot and the context they appear in, they may need to be referred to in the AD script by employing a set of strategies, similar to those used in other types of audiovisual translation (see Díaz Cintas and Remael 2007: 201, Pedersen 2011: 75). Below we present how some strategies proposed by Pedersen (2011) for subtitling can also be applied to ethnographic references in audio description.

4.2.1 Retention

According to Pedersen (2011: 77), retention is the most source-oriented strategy and it consists in “keeping ST elements in the TT”. By preserving a culture-bound element, retention brings blind and partially sighted viewers closer to the source culture, allowing them to immerse themselves in the foreign. A major downside of this strategy, however, is that it contains no explanation regarding the culture-bound element and as such offers “no guidance whatsoever to the TT audience” (Pedersen 2011: 78). As a result, the culture-bound element may not be understood by the audience.

Figures 12 and 13 below present two examples of the retention strategy in AD. The first one comes from a scene in Midnight in Paris, a brief shot of a group of men playing the French game of boules in a Parisian park. In the Polish AD to this film, the term boules was retained, although during the consultations with visually impaired viewers we also considered using a generalisation strategy (‘balls’ instead of ‘boules’). In the other example, from The Memoirs of a Geisha, the original Japanese term okiya was used in the English AD to denote the house of a geisha.

Figure 12. Retention: boules

AD: Mężczyźni grają w bule.

English back translation: Men play boules.

Figure 13. Retention: okiya

AD: In the okiya, Chiyo is being dressed for her debut as a geisha in training.

In both examples, the use of the original terms adds to the foreign flavour of the AD script, reminding the audience they are watching a film set in a different culture. The terms were not explained in any way, as the first one was deemed insignificant for the plot, and the other was clear from the context.

4.2.2 Specification

Pedersen (2011: 79) defines specification as “retaining the ECR in its untranslated form, but adding information that is not present in the ST, making the TT ECR more specific than the ST ECR”. He notes that this strategy is sometimes also referred to as explicitation. In this sense, specification can thus be considered a complex strategy, consisting of retention and explanation combined.

Figure 14 presents an example of the specification strategy as used in the English audio description in The Memoirs of a Geisha. In this scene, the term ‘geta’ was used for the very first time in the film and as such it was probably thought to need an explanatory note. Figure 15 shows a similar example from Wadjda, where the main character, the girl named Wadjda, is praying with her mother. They are wearing khimars, which are named in AD and explained using the specification strategy.

Overall, in our study we observed that the specification strategy is often used when a culture-bound element is described for the first time (see also Walczak and Figiel 2013). Once it is introduced and explained, it is often referred to using the retention strategy (see section 4.2.4).

Figure 14. Specification: geta

AD: Mameha takes out onto a cobbled street to practise walking in geta, the raised wooden clogs worn by geisha.

Figure 15. Specification: khimars

AD: W pokoju. Wadjda stoi obok matki. Modlą się. Są ubrane w chimary, czyli okrycia wierzchnie okrywające wszystko z wyjątkiem ich twarzy. Matka ma na sobie czerwony chimar. Wadjda – czarny.

English back translation: Room. Wadjda stands next to her mother. They pray. They wear khimars, that is outer garments covering everything except their faces. The mother’s khimar is red, Wadjda’s is black.


The undeniable advantage of using the specification strategy is that the culture-bound element is both retained, which preserves authenticity, and at the same time explained; as a result, viewers are more likely to understand its nature. On the negative side, as noted by Pedersen (2011: 82), this strategy could sometimes be regarded as patronising, particularly by those viewers who are familiar with the source culture presented in the film.

4.2.3 Generalisation

As the name suggests, generalisation “entails replacing an ECR referring to something specific by something more general” (Pedersen 2011: 85), most often by using a superordinate term instead of the original one. Generalisation can be quite a useful strategy in AD for foreign films, particularly when the foreign culture-bound element does not have an equivalent in the target language, or is unknown to the audio describer (and difficult to research). This was the case in a scene from Fill the Void, as shown in Figure 16, where a traditional “keffiyeh” was described in more general terms as a “turban-looking headscarf.”

Figure 16. Generalisation

AD: Matka Shiry to kobieta w średnim wieku. […] Na głowie, tak jak inne zamężne kobiety, nosi zwiniętą jak turban chustę.

English back translation: Shira’s mother is a middle-aged woman. […] She is wearing a turban-looking headscarf, just like other married women.

Generalisation makes it easier for the visually impaired audience to comprehend what is being described, by using a term more familiar to them. Its potential downside is that it deprives the audience of the foreign flavour of the original culture.

4.2.4 Combining the strategies

Like the strategies used to render geographical references, the strategies for describing culture-bound elements in AD can also be combined. In our study, a particularly frequent combination was specification followed by retention. When a culture-bound element was encountered in a film for the first time, it was often the specification strategy that was employed. This allowed audio describers to introduce a foreign concept to the audience and to explain it. When the reference re-appeared later in the film, the retention strategy was used with no additional explanation (see Figure 17).


Figure 17. Combining the strategies: retention + specification

AD: W rozklekotanej terenówce trzy kobiety. Dwie w czarnych nikabach, jedna w czarnej burce, czyli stroju zakrywającym twarz.

English back translation: Three women sit in a rickety off-roader. Two of them wear black niqabs, one wears a black burqa, a garment that covers the body and the face.
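
The first-mention rule described in this section (specification when a culture-bound element first appears, retention afterwards) can be written down as a small procedure. The Python sketch below is purely illustrative and was not part of the study; the function name `describe`, the `introduced` set and the wording format are hypothetical choices made for this example.

```python
def describe(term, gloss, introduced):
    """Return the AD wording for a culture-bound term.

    First mention: specification (the term followed by a short gloss).
    Later mentions: retention (the term alone, with no explanation).
    `introduced` is a set of terms already explained in this film.
    """
    if term in introduced:
        return term                    # retention
    introduced.add(term)
    return f"{term}, {gloss}"          # specification


# Hypothetical usage with the gloss from the back translation above.
introduced = set()
gloss = "a garment that covers the body and the face"
first = describe("burqa", gloss, introduced)
later = describe("burqa", gloss, introduced)
```

Keeping the set of introduced terms per film rather than per describer reflects the assumption that each film's audience hears the explanation once; a real script would of course also depend on the time available in each gap.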


The presence of elements from foreign culture and their treatment in AD is a vast research topic which merits a separate study and goes beyond the limited scope of this article. The goal of this section has only been to signal the potential problems and solutions available to audio describers when describing foreign films, to emphasise the importance of the socio-cultural dimension of AD as well as to draw some important parallels between AD and translation. We also want to open a discussion on the competencies of audio describers in terms of decoding and coding intercultural texts, bearing in mind that the image cannot always be considered universal and, as a result, it is often impossible to describe it objectively.

4.3 Intertextuality

Apart from geographical and ethnographic references, audio describers sometimes face a more subtle set of difficulties: intertextual references to other works. The notion of intertextuality was coined by Julia Kristeva (1986 [1969]) to denote a dialogue that a text enters into with another text or text type through a quote, allusion, borrowing, etc. (see Kaźmierczak 2012: 18). Since intertextuality may also imply references to other non-printed texts, such as visual arts and moving pictures, some scholars emphasise its relations to intersemioticity and intermediality (see Kaźmierczak 2012: 20). Federici (2007: 153) notes that “In the postmodern aesthetics, the mingling of literary genres is accompanied by quotations and allusions to other forms of art and also to popular expressions such as comics or television programmes, so intertextuality has become intermediality”. It is beyond the scope of this article to enter into terminological debates on the nature of intertextuality; suffice it to say that whatever the term, a reference to another oeuvre embedded in a film may constitute an important aspect of the film and as such should be rendered in AD, even though it will probably not be recognised by all viewers alike.

In this study, two important instances of intertextuality were identified, both from Midnight in Paris. They were both manifested through the presence of visual references to other works of art. The first example of intertextuality comes from the opening scene of the film. The main characters, Gil and his fiancée Inez, are standing on a green wooden bridge over a lily pond in Claude Monet’s garden in Giverny. The framing of the scene resembles Monet’s famous painting Bridge over a Pond of Water Lilies (see Figure 18).

Figure 18. Intertextuality: reference to Monet’s water lilies in Midnight in Paris

Given that the film characters are talking throughout the scene, it was impossible to include any indication of this clear allusion to Monet’s painting during the scene itself. This intertextual element, however, was deemed important as it undoubtedly added flavour to the scene. Therefore, it was decided that an audio introduction should be added to the film (see section 4.4), explaining this allusion to the viewers beforehand.

Another interesting intertextual element from Midnight in Paris appeared in the scene at Gertrude Stein’s, featuring a heated discussion between Stein herself and Pablo Picasso over his recent painting of his mistress Adriana (see Figure 19). In Picasso’s view, the portrait accurately captures the nature of Adriana. Gertrude Stein, however, argues that it is Picasso’s twisted vision that is depicted, and not the subtle nature of Adriana:

I was just telling Pablo this portrait does not capture Adriana. It has universality but no objectivity. […] Look how he has done her… Dripping with sexual innuendo, carnal to the point of smouldering. Yes, she’s beautiful, but… it’s a subtle beauty, an implied sensuality.

The scene is meant to be comical as the figure of the woman in the painting, with her limbs pointing in different directions and very thin hair, stands in stark contrast with Adriana, charmingly played by Marion Cotillard, who is also witnessing the scene along with the film's main character, Gil. The juxtaposition of the balding female figure, almost abstract in its quality, presented in the painting and adorable Adriana creates a comic effect (see Figure 19).

Quite importantly, the painting later makes a re-appearance in the film’s other time plane, modern times, where Gil with his girlfriend Inez and their American friends see it on display in a museum (see Figure 20). It is therefore important for the audio describer to make the connection and help viewers realise that the two paintings are in fact the same one.

Figure 19. Intertextuality: Picasso’s La Baigneuse in the 1920s

Figure 20. Intertextuality: Picasso’s La Baigneuse in modern times

Interestingly enough, the painting featured in the film is a real work by Picasso known as La Baigneuse (1928). It therefore seems that the best AD strategy in this case would be to name the painting in the AD script, as in the example below6:

AD: Wszyscy zwracają się w kierunku obrazu. To „Kąpiąca się” Picassa. Na błękitnym tle zdeformowana sylwetka nagiej kobiety, zredukowanej do pośladków i długich, powyginanych kończyn.

English back translation: All turn towards the painting. It’s La Baigneuse by Picasso. Against a blue background, a deformed figure of a naked woman, reduced to buttocks and long, twisted limbs.

This solution, however, was not adopted because the time available for the description was very limited, forcing the audio describer to choose between giving the title of the painting (the naming strategy) and simply describing what it presents (description without naming). Owing to the time constraints and the comic character of the scene, the solution used in the study was a metaphorical comparison of the female figure to a stick insect, emphasising the contrast between the figure shown in the painting and the real appearance of Adriana:

AD: Na obrazie kobieta-patyczak.

English back translation: In the picture, a woman like a stick insect.

4.4 Audio introductions

It is often argued that, in contrast to literary translation (where translators can provide readers with a number of paratextual clues such as footnotes and endnotes), audiovisual translation, including audio description, does not have such tools at its disposal. In this article, however, a claim is made that it is possible for the audio describer to include some kind of explanatory paratext by adding an audio introduction (AI, see also Romero Fresco and Fryer 2013). Genette, who coined the term ‘paratext,’ argues that it is

the conveyor of a commentary that is authorial or more or less legitimated by the author, constitutes a zone between text and off-text, a zone not only of transition but also of transaction: a privileged place of a pragmatics and a strategy, of an influence on the public, an influence that – whether well or poorly understood and achieved – is at the service of a better reception for the text and a more pertinent reading of it (1997: 2).

Indeed, the goal of audio introductions is often to serve a better reception of a film as a whole. Although Genette originally conceived of paratext as relating to literary works, the notion can also be applied to audiovisual translation and audio description. Just as a literary work can be accompanied by paratext, so can an audio described film be supplemented with an audio introduction, together forming one cohesive whole, framing the film in a wider context and offering viewers “a more pertinent reading of it,” to use Genette’s words (1997: 2).

In the study described here, audio introductions were prepared for the following films: Volver, Midnight in Paris and Man on Wire. The first two were short (ca. 20 seconds) and read over the opening credits of the films, while Man on Wire was accompanied by a separate audio file with the AI, which was played before the screening of the film itself and lasted about 7 minutes (see Masłowska 2014). In all the films, the main idea behind the AI was to introduce elements that would not be possible to describe in the course of the film owing to time limitations.

4.4.1 Midnight in Paris

As mentioned in section 4.3, the audio introduction to Midnight in Paris was meant to explain to the viewers the intertextual elements which otherwise could not be described in the film, thus serving a foreshadowing and explanatory function (after Remael and Reviers 2013).

Following an explanatory sentence about the sequence of picture-postcard images of Paris presented at the very beginning of the film, the audio introduction moves on to name the place where the characters are presented (Claude Monet’s house), introduces the two main characters with their first names (Gil and Inez), describes the setting they are in (wooden bridge, water lilies) and finally explicitly states that the image resembles Claude Monet’s painting Bridge over a Pond of Water Lilies.

AD: Film „O północy w Paryżu” otwiera sekwencja paryskich widoków: mostów nad Sekwaną, zatłoczonych bulwarów i znanych atrakcji turystycznych. Następnie akcja przenosi się do ogrodu przy domu Claude’a Moneta, gdzie dwoje turystów z Ameryki — Gil i jego narzeczona Inez — rozmawia na drewnianym mostku. Przed nimi na wodzie unoszą się kępy nenufarów. Nad taflą stawu wiszą jasnozielone gałęzie wierzb płaczących, które odbijają się w wodzie. Całość przypomina obraz Claude’a Moneta — „Nenufary”.

English back translation: The film Midnight in Paris opens with a sequence of picture-postcard images of Paris: bridges over the Seine, crowded boulevards and famous landmarks. We then move to the garden at Claude Monet’s house, where two American tourists, Gil and his fiancée Inez, are talking on a wooden bridge. In front of them, on the water, float green flat leaves and cup-shaped petals of water lilies. Long arms of light-green weeping willows hang over the glassy surface of the pond. The image resembles Claude Monet’s painting: “Bridge over a Pond of Water Lilies”.

During post-screening discussions, several blind and partially sighted members of the audience expressed positive opinions regarding the audio introduction and its treatment of Allen’s intertextual and intellectual play with viewers. Some people who lost sight at a later stage in their lives stated they retained a visual memory of this particular painting and that they appreciated the explicit mention of intertextuality in the audio introduction. Others, especially those congenitally blind and partially sighted, said the film and the discussion that ensued encouraged them to look for more information about the painting.

4.4.2 Man on Wire

The Polish audio introduction to the voiced-over version of Man on Wire (see Masłowska 2012) was based on the English AI by Pablo Romero Fresco and Louise Fryer7 (2014). The authors presented their viewers with a 15-minute introduction, detailing various aspects of the film – from general information about the film, explanation of different time planes and description of major characters, to an explicit mention of filmic language (e.g. split screen, animation). The authors also compiled an elaborate questionnaire to find out the participants' views on a number of issues related to the AI. The Polish study aimed to replicate the one in the UK, but only with a view to examining whether Polish viewers would find the idea of audio introductions useful (Masłowska 2014).

In contrast to the British AI, the Polish AI lasted about 7 minutes, roughly half the duration of the British introduction. Its structure was similar to that of its British counterpart. First, general information on the film was presented, followed by information on the languages spoken by the characters and a description of the different time planes used in the film. Finally, the AI provided the audience with a set of descriptions of the main characters.

Unlike British viewers, the Polish audience found the audio introduction too long and tiring to listen to. Many viewers stressed that the AI, particularly the part with descriptions of characters, was difficult to follow as it consisted of descriptions of many similar characters, most of whom were middle-aged men. Some stated, however, that the audio introduction helped them understand the structure of the film and the different time planes, which in turn made the film easier to follow. Most people agreed that they would like to hear introductions to other films. Some also suggested that they would appreciate the option of having the text of the introduction available before as well as after the screening (Masłowska 2014).

4.4.3. Volver

The audio introduction to Volver was part of a larger project on auteur description, i.e. audio description for auteur cinema (Szarkowska 2013). It included the descriptions of the two main characters, Raimunda and her sister Sole, as envisaged by the film director and scriptwriter, Pedro Almodóvar (2006), in the published script, using his turn of phrase and juicy language. The AI was meant to present the two main characters, making it clear from the outset that viewers would be watching an unconventional AD.

AD: Za chwilę obejrzą Państwo film pt. Volver z audiodeskrypcją, która powstała na podstawie scenariusza Pedro Almodóvara. Na początek kilka słów o filmie. Główna bohaterka to 32-letnia Raimunda, grana przez Penélope Cruz. Almodóvar opisuje tę postać tak: Raimunda jest rasowa, o niezaprzeczalnej urodzie, zakorzeniona w ziemi okrągłym i szczodrym tyłkiem; z biustem takim, że wzroku nie można oderwać od dekoltu. Nieustępliwa, stanowcza, żywiołowa, pełna odwagi, a zarazem krucha. Ma męża Paco i córkę Paulę. Siostra Raimundy, Sole, ma słabszy charakter. Jest strachliwa, przesądna i żyje w separacji z mężem.

English back translation: You will see the film Volver with audio description based on the script written by Pedro Almodóvar. First, a few words about the film. The main character of the film is 32-year-old Raimunda, played by Penélope Cruz. Almodóvar describes this character as follows: Raimunda, of an astounding and genuine beauty, is firmly grounded by her luscious rounded bottom and her bosom, which one can hardly take one’s eyes off. Uncompromising, resolute, exuberant, courageous, and fragile at the same time. She is married to Paco and has a daughter, Paula. Raimunda’s sister, Sole, has a weaker character. She’s fearful, superstitious and is separated from her husband.

The viewers’ reactions after the screening — both to the audio introduction and to the auteur AD — were largely positive. In the post-screening interview, many viewers used the exact phrases from the script, praising the juicy language and claiming it was memorable.

All in all, audio introductions seem to be a promising new solution to some of the problems arising when audio describing foreign films. Their usefulness is now being tested in an increasing number of settings and countries (see di Giovanni 2012; Romero Fresco and Fryer 2013; Jankowska 2013; Remael and Reviers 2013). There are still a number of issues that need to be examined empirically in greater detail, such as the optimum duration of AI, its structure, style, and content, to name just a few.

5. Conclusion

This article presented a number of challenges related to audio describing foreign films, which so far has been a rather unexplored territory, both academically and professionally. When we began this project back in 2010, AD for foreign films was non-existent in Poland. A few years later, possibly thanks to new legal regulations obliging broadcasters to offer programmes with AD, the necessity to audio describe foreign films has become a fact. The lack of research on the topic, the small number of experienced describers and limited professional training are still major obstacles to increasing the foreign audio described output on the Polish market. This project has hopefully shown that AD for foreign films is both doable and needed, as stressed by many blind and partially sighted people who attended our screenings.

The study presented here also points to a greater convergence between audio description and Translation Studies in terms of both problems and solutions, such as the treatment of culture-specific elements, which can be challenging when describing or translating a foreign film for a target audience other than the one for which it was originally envisaged. Parallels between AD and translation can also be found in the set of strategies that can be employed when dealing with the abovementioned problems. As rightly noted by Walczak and Figiel (2013), AD strategies, like translation strategies, can be situated on a continuum between foreignisation and domestication.

There are still a number of research issues to be addressed in future studies on audio describing foreign films, for example the impact of closeness/distance between the source and target cultures on the treatment of ECRs in AD; convergences and disparities between VO and audio subtitling when combined with AD; the choice of voices to read out the AD script and the translation of the dialogue; the usefulness of audio introductions for the reception of the film. Last but not least, there is still an insatiable need to conduct more reception studies among larger samples of target viewers in more countries.


This work was supported by research grant No. IP2010 040370 of the Polish Ministry of Science and Higher Education for the years 2010-2012.

Among the many people who helped this study see the light of day is our partially sighted consultant and friend, Wojciech Figiel, to whom we owe a great deal of gratitude. We would like to gratefully acknowledge the help of the following audio describers: Agnieszka Walczak, Irena Michalewicz, and Bogna Olszewska, who passed away prematurely before this article saw the light of day. Thank you all for your passion and meticulousness in creating the AD scripts. Many thanks to Agnieszka Walczak for her comments on an earlier version of the manuscript. We also extend our gratitude to Anna Żórawska and Robert Więckowski from Fundacja Kultury bez Barier and to Monika Cieniewska from the Polish Association of the Blind (PZN) for organizing the screenings with AD. Finally, we would like to thank all of the blind and sighted participants who gave their time and energy for our studies.

Dr Agnieszka Szarkowska is an Assistant Professor at the Institute of Applied Linguistics, the University of Warsaw. She is the founder and head of the Audiovisual Translation Lab (AVT Lab). Her recent research projects include an eyetracking study on subtitling for the deaf and hard of hearing, multilingualism in subtitling, audio description in education, text-to-speech audio description, and audio description for foreign films. She is a member of the European Association for Studies in Screen Translation (ESIST), European Society for Translation Studies (EST) and an honorary member of the Polish Audiovisual Translators Association (STAW). She also works as a freelance subtitler and certified translator. She can be contacted at

Dr Anna Jankowska is an Assistant Lecturer at the UNESCO Chair for Translation Studies and Intercultural Communication of the Jagiellonian University in Krakow. Her recent research projects include studies on the viability of creating audio description scripts through translation from a foreign language, multiculturalism in audio description, audio description for foreign films and the history of audiovisual translation. She is a member of the European Association for Studies in Screen Translation (ESIST) and an honorary member of the Polish Audiovisual Translators Association (STAW). She is also the founder and president of the 7th Sense Foundation, which aims to promote media and culture accessibility. She can be contacted at


