Introduction and goals

The possibilities of communicating through physical behaviour and, particularly, movement are becoming an increasingly active focus of research. Such communication occurs in, for example, gestural accompaniment to speech and face-to-face interaction, but also more independently of verbal language in forms such as dance. Although dance has long been regarded by ethnographers and dance scholars as a ‘universal’ form of communication (cf. Kurath, 1977; Hanna, 1984; Reddish et al., 2013; Karpati et al., 2015; Richter and Ostovar, 2016; Pavlović et al., 2021), the precise nature of that communication is still in need of considerable clarification. In fact, despite the substantial history of research and study of dance, accounts to date remain limited in many respects, leading dance scholars to consider a broader variety of approaches to the study and understanding of dance as a form of communication within temporally and spatially situated sociocultural contexts (see for example: Adshead-Lansdale, 1981, 1988; Foster, 1986; Opacic et al., 2009; Brandstetter and Klein, 2012; Bannerman, 2014; Keevallik, 2018).

In the current paper, in which we use ‘dance’ as a general term to refer to all dance forms globally, we propose a principled approach by which techniques developed for the study of other forms of communication, particularly verbal communication, can be made to apply to dance as well. We see this as offering several potential benefits, including not only new theoretical insights into the nature of this form of communication and its relations to others, but also more directly practical outcomes. These include assisting in the teaching of various versions of repertoire pieces to young dancers, where differences in movement choices that impact on character interpretation might be highlighted; generating new ways of archiving performances digitally; and allowing for novel techniques for learning and memorising new choreographies (see Maiorani, 2021, chapter 4). To further these aims, we set out a detailed methodology both for characterising dance as a form of communication and for investigating the communicative possibilities of dance empirically. More specifically, we show how it is possible to move in a principled fashion from raw movement data gathered using motion capture technology to the interpretations necessary for contextualising such movement as discourse and narrative.

Our account draws on several principles and methods from linguistics, but also requires extensions beyond primarily linguistic approaches to communication. This is necessary because dance works with two quite different aspects of communication, only one of which is commonly emphasised in linguistic work. On the one hand, the effectiveness of dance clearly relies on embodiment and the embodied understanding of potential meanings of physical movements; this has been classified in Peircean semiotic terms as involving iconicity (see, for example, specifically on Peirce and dance: Bannerman, 2010, p. 19). Although iconicity has long been recognised to be at work in verbal language as well (cf., e.g., Dingemanse et al., 2015), theoretical accounts of iconic meaning-making remain under-developed. It is largely this aspect that is meant when the ‘universality’ of dance is invoked. On the other hand, many forms of dance, especially in their early and/or traditional manifestations, are also strongly conventionalised, i.e., symbolic in Peirce’s terms, which directly undercuts claims of universality. Both aspects need to be combined in any adequate view. To achieve this, our account builds directly on broader models of communication developed in linguistics, semiotics and multimodality. We argue that placing research on dance on a foundation of this kind makes dance a legitimate and rewarding target for empirical communication research. We show in particular that constructs developed recently within the field of multimodality studies are now well placed to treat complex communicative forms of this kind.

Although we consider our model to have far broader application, we illustrate our approach with several examples from classical ballet. This is intended neither to raise premature claims of universality nor to suggest that our account is solely relevant for ballet. We focus here on ballet because ballet is highly conventionalised and, moreover, a globally recognised form of dance familiar to most audiences in most cultures, especially in the form of repertoire performances based on commonly recognisable tales and stories. Classical ballet technique also serves as the basis for many diverse and more recently developed dance techniques and so offers an ideal starting point for the current discussion.

The organisation of the paper is as follows. First, we briefly situate our approach with respect to relevant current approaches to the treatment of movement-based communicative forms such as dance and introduce the basics of the semiotic framework that we employ. Second, we set out how we apply this semiotic framework to provide an abstract characterisation of the semantics of ballet. Here we present a detailed example of how the account bridges between physical movement and narratively relevant interpretations of that movement in a principled fashion, supporting aspects of meaning deriving from iconicity and from convention equally. Third, we set out how a programme of empirical research then follows directly from the account of multimodality adopted; this close link established between general model and methodological consequences for empirical study is an additional benefit of the account. We then conclude with a brief summary of what has been achieved, emphasising again how the approach opens up communication-oriented research for a much extended range of semiotic forms.

Previous relevant work

Two broad areas of research are relevant for our approach. In terms of theory, we apply results from multimodal discourse analysis as introduced in Bateman et al. (2017) combined with a model developed particularly for ballet by Maiorani (2017, 2021). Both of these draw significantly on earlier work on multimodality originating in the systemic-functional linguistic and social semiotic traditions (cf. Kress and van Leeuwen, 2001; Kress, 2010); useful overviews of perspectives taken on multimodality and their intellectual forebears can be found in, for example, Jewitt (2014, p. 39) and van Leeuwen (2015). Multimodality in the social semiotic sense developed as a response to the growing realisation in the 1980s and 1990s that ‘language’ as traditionally conceived within linguistics almost never works alone and is most commonly deployed with other forms of expression, ranging from typography, page layout and pictures for written language to intonation, gesture, and body posture for spoken language. Research consequently viewed multimodal communication as an integrated social phenomenon and attempted to develop principles and techniques capable of addressing language together with other forms of expression. This led to the proposal of a unified theoretical and empirical framework principally relying on the exploration of a broad range of “‘grammars’ of specific modes” (van Leeuwen, 2015, p. 449).

Considerable work has now been undertaken in this tradition, but a number of open challenges remain. Jewitt (2013), for example, points to limitations in the scale of studies and the resulting difficulty in pursuing empirical research, even though it is widely accepted that multimodal corpus work, building on techniques developed for linguistic corpora, would be beneficial (for a review, see: Bateman, 2014). Treatments of the iconic components of semiotic systems are also still relatively undeveloped, particularly in the area of movement-based semiotics (cf., e.g., Martinec, 1998; van Leeuwen, 2021). We address some of these issues in the next section where we show how the more recent developments in multimodality theory and practice that we build on can support work at scale while also maintaining an appropriate balance between conventionalisation and iconic, embodied responses to dance. Our study also contributes to a growing trend within the broader research area of multimodality that is seeking to address multi-sensory experiences and phenomenological semiotics so as to strengthen relations with aesthetics (Hansen, 2018). Multimodality consequently not only investigates the diverse processes and practices of representation that shape our knowledge (Kress, 2010, p. 27), but also naturally encourages explicit considerations of dance as a sociocultural form (or forms) of movement-based representation practised in most cultures throughout the world.

In addition to this work growing out of social semiotic multimodality theory, there is also a small but growing body of work now attempting to bring dance within the scope of formal models of communication as well. Here the most relevant account is Patel-Grosz et al.’s (2019) explicit characterisation of a classical South Indian narrative dance form drawing on extensions of discourse representation theory as developed in linguistics (cf. DRT: Kamp and Reyle, 1993). The usual account of discourse adopted in this context is Abusch’s (2013) extension of DRT to visual discourse. The principal empirical direction taken in that work explores potential patterns of non-verbal reference to see if patterns similar to those observed within verbal language appear. We build on this below.

There are also proposals for characterising the physical-material possibilities open to dance in a manner analogous to treatments of phonetics. Most relevant here is the work of Napoli and Kraus (2017) and the formal principles of grouping and segmentation proposed by Charnavel (2019). Charnavel argues that grouping is a cognitive ability shared across domains and modalities and posits six principles of change premised on fundamental perceptual dimensions in the perception of basic human movement. When specified for dance, these give the following six grouping principles: change of moving entity, change of orientation, change of contact point with floor or weight shift, change of direction, change of speed, and change of dynamics/quality (Charnavel, 2019, p. 4). Charnavel then reports on experimental results that examine to what extent these grouping principles actually play a role in perception. More specifically, the experiments performed addressed three hypotheses: (i) are the principles relevant at all? (ii) Do they have different strengths when in competition? And if then applicable, (iii) what are those relative strengths? Both hypotheses (i) and (ii) were strongly supported in segmentation tests. The data also allowed a single coherent ordering of the principles by strength (Charnavel, 2019, p. 15). Several of these grouping principles are therefore important for our own segmentation work as we explain below. In general it should not be the case that any segmentation we propose on functional semiotic grounds violates predictions on perceptual grounds. This is consequently an additional beneficial source of constraint during modelling.

Finally, it is important to emphasise that our work is not intended to replace the classical notation systems used by dance professionals such as Labanotation, created by Rudolf Laban, or Benesh movement notation, created by Rudolf and Joan Benesh, both originally published in the 1950s (see: Laban, 1956; Causley, 1967). These complex notation systems can be used only by professionally trained notators and essentially mark the positions of a dancer’s individual body parts, similarly to notes within a music score, along with some additional physical movement qualities. Critically, these notations are not intended to capture the semiotic and semantic value of a performance or the specific interpretative choices made by different dancers performing the same role and, as a consequence, do not yet make contact with the communicative nature of dance forms, which is our focus here. Moreover, besides the practical difficulty of using these notation systems and the relatively small number of dance professionals in various roles who can actually deploy them, a number of further critiques have been made.

Scholars have noted, for example, how their disembodied manner of encoding dance movement fails to capture discursively significant dancers’ interpretative choices and relevant discursive patterns, as well as important aspects of dance’s very materiality, which is not constituted by unrelated body parts (see, for example, Watts, 2010). For these reasons, the earlier notations are better seen as systems for ‘physical movement analysis’ in dance rather than as providing a ‘dance analysis’, in which ‘dance’ already implies acts of interpretation and communication (see Adshead-Lansdale, 1994, p. 16). They also do not record the semiotic role of a dance performance space and how that space interacts with dance movement when enacting communication (Munjee, 2015; Brandão, 2017). It is these more communication-oriented aspects that are central to our account here.

The approach: constructing a semiotic mode for classical ballet

As noted above, the semiotic foundation of our approach to characterising dance in terms of communicative systems builds on recent developments within multimodality theory that attempt to deal with scale and the support for empirical research by adopting some quite specific extended definitions of some of multimodality’s basic terms. In particular, we rely crucially on the formally defined notion of semiotic mode proposed for multimodality in, for example, Bateman (2016) and Bateman et al. (2017). In contrast, a prominent earlier definition of ‘semiotic mode’ still employed in social semiotic multimodality is Kress’s “a socially shaped and culturally given semiotic resource for making meaning” (Kress, 2010, p. 79). This has the benefit of generality but is less helpful for deriving methodological principles for systematic engagement with complex communicative situations. As a consequence, many researchers in multimodality fall back on sensory or perceptual modes instead, rather than engaging directly with the semiotic aspects of the use of sensory modalities. When approaching new areas of multimodal meaning-making, however, it is highly desirable that stronger methodological principles can be applied, a position argued at length in Bateman (2022). For this reason, we work here solely with the newer definition of semiotic mode as this provides not only a theoretical framework for capturing non-verbal communicative systems but also, as we shall see below, a set of methodological principles by which we can empirically evaluate and develop proposals further.

The definition of semiotic modes proposed by Bateman et al. (2017) sees semiotic modes in terms of three distinct, but related, levels of description. Each level, or ‘stratum’, reflects a differing degree of semiotic abstraction. At the least abstract stratum there are physical regularities that can be measured in some communicative situation, while at the most abstract stratum there are semiotic mode-specific definitions of discourse relations and strategies for guiding discourse coherence. The level between the least and most abstract semiotic strata then takes responsibility for classifying the formal-material structures relevant for each specific semiotic mode into distinguishable qualitative categories. Thus, whereas the material stratum simply ‘measures’ aspects of the material properties pertinent in some semiotic mode, the middle level groups such measures into qualitative classes that state just which ranges of values will be recognised as distinct for subsequent purposes of discourse interpretation. In contrast to the work on dance mentioned above drawing on DRT for characterising discourse, the discourse representations found in semiotic modes are typically defined in terms of Asher and Lascarides’ (2003) segmented discourse representation theory so as to achieve more flexibility in relating form with interpretations. A graphical view of this ‘tri-stratal’ approach to defining semiotic modes adapted from Bateman et al. (2017) is given on the left-hand side of Fig. 1.

Fig. 1: The model of semiotic mode used in the current work. Left: an abstract graphical view of the internal structure of semiotic modes (adapted from Bateman et al., 2017). Right: the kinds of information used to fill in that structure specifically for classical ballet.

Scientifically, a semiotic mode specification is seen as a ‘current best hypothesis’ concerning explanations for observed (and, ideally, measurable) material regularities. Since no restrictions are made on the kinds of materials that may be considered, the model appears equally applicable to movement-based semiotic systems such as dance. The challenge undertaken here is to explore the extent to which this general model of multimodal communication can indeed be applied to more specific medial forms, such as that of classical ballet, in order to provide a foundation for further empirical research.

When approaching a new semiotic system, the first methodological step according to Bateman et al. (2017) is to identify the relevant contributions to each of the three semiotic strata required for defining a semiotic mode. For dance, and then ballet in particular, this means that we must identify the semiotic strata relevant for this form of communication in order to construct a formal relationship between materiality and a range of discourse semantic functions responsible for characterising how ballet as a communicative practice signifies. As noted above, classical ballet is in many respects an ideal target to illustrate this kind of analysis in that the kinds of movements performed (aspects of materiality) are subject to an extremely high degree of conventionalisation (aspects of discourse interpretation).

To fill in the contents of the three semiotic strata we draw on a recent proposal by Maiorani (2021) for a ‘functional grammar of dance’ (FGD). According to Maiorani, a description according to the FGD serves precisely to group measurable behaviours into qualitative equivalence classes that are already primed specifically for discourse interpretation. This matches directly the requirements of a semiotic mode and so offers an initial set of distinctions sufficient for undertaking analysis of actual dance sequences. Thus, while the semiotic mode model provides an overarching framework and general guidelines for pursuing empirical investigation and validation, the specific sets of distinctions necessary for concrete analysis of ballet are imported directly from the FGD. This connection between the abstract semiotic mode definition and the particular constructs of the FGD is shown graphically on the right-hand side of Fig. 1.

It is important to emphasise that this construction is clearly only one ‘slice’ through the full complexity of semiotic modes generally involved in any dance performance. Different components of the materialities involved call for discrete analytical treatments, which may complement the current analysis as required. Thus, while we fully recognise the rich constellation of further relationships created through the interplay between different materialities and means of perception that experiencing a dance performance entails, ranging over music, costume, facial expressions, stage design and much besides, employing semiotic modes in the way suggested provides a principled mechanism for achieving analytic focus without losing sight of the more complex enveloping semiotic spaces involved. Indeed, previous and ongoing work considers several of these broader spaces, including explorations of the possibility of analysing dance and music contextually using models of analysis derived from Systemic-Functional Linguistic theory (Maiorani, 2021), as well as using the FGD to explore the relationship between dance movement and costume (see Maiorani and Liu, 2022).

Two essential dimensions are posited by the FGD for characterising the material distinctions meaningful for the movement-focused interpretation of ballet and these are incorporated directly into our definition of the corresponding semiotic mode. The first dimension responds to the observation that in ballet dancers set up trajectories of movement through spatial displacements; the second captures the fact that during the execution of such trajectories, the hands, arms, legs, feet, head and torso are all independently movable but nevertheless function together. FGD terms this latter dimension of dance organisation projection, understood in the particular sense of the interactive connection between a dancer’s body parts in movement and the space within which that movement occurs over the course of a dance performance; that is: a dancer projects towards objects, people or regions in the environment by extending or directing specific body parts. This serves to ‘refer’ to spatial regions on the stage and so grant them semiotic salience, generally for discoursal and narrative purposes as we shall see below. This extends the simpler notions of reference pursued in the formal accounts of dance mentioned above and is also reminiscent of discourse referent creation in sign language (e.g., Morgan, 2000).

Projections are defined formally in terms of their ‘articulators’ (i.e., the body parts that play roles in pointing) and their respective movements. These physical properties give rise to a set of distinguishable qualitative configurations functionally differentiated according to both their intrinsic directions relative to the dancer’s body (e.g., ‘left arm’, ‘right leg’, etc.) and their orientation with respect to the direction of movement of the dancer (e.g., forwards, backwards, etc.). Thus, arm and hand projections as well as projections involving the legs and feet may be (vertically or horizontally) perpendicular to the direction of movement of the dancer, or may follow or oppose that movement. As Maiorani bases the FGD on Halliday’s Functional Grammar for verbal language (e.g., Halliday and Matthiessen, 2013), she consequently proposes three broad areas of functional meaning for formal dance units as well: functions to do with representing the world, functions to do with interpersonal interaction, and functions to do with the structural organisation of the units of the semiotic system for communication. Projections generally correspond to the first two of these areas of meaning, while their binding into structural elements corresponds to the last.
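The step from measured articulator directions to the qualitative orientation classes just described can be illustrated with a minimal sketch. It is not part of the FGD itself: the function name, the class labels and, in particular, the 30-degree tolerance are assumptions made purely for illustration, as is the convention that the third vector component is the vertical axis.

```python
import math

def classify_projection(articulator, movement, tolerance_deg=30.0):
    """Map a measured articulator direction onto a qualitative class.

    Both arguments are 3D vectors (x, y, z) with z taken as vertical.
    The tolerance and the returned labels are illustrative assumptions.
    """
    dot = sum(a * m for a, m in zip(articulator, movement))
    norm_a = math.sqrt(sum(a * a for a in articulator))
    norm_m = math.sqrt(sum(m * m for m in movement))
    if norm_a == 0 or norm_m == 0:
        raise ValueError("direction vectors must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_m)))
    angle = math.degrees(math.acos(cos_angle))
    if angle <= tolerance_deg:
        return "following"
    if angle >= 180.0 - tolerance_deg:
        return "opposite"
    if abs(angle - 90.0) <= tolerance_deg:
        horizontal = math.hypot(articulator[0], articulator[1])
        if abs(articulator[2]) > horizontal:
            return "perpendicular-vertical"
        return "perpendicular-horizontal"
    # Angles between the bands are left without a qualitative class.
    return "unclassified"
```

The point of the sketch is only that a continuous measurement at the material stratum is reduced to a small set of classes at the middle stratum; where the class boundaries actually lie for ballet is an empirical question.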

Maiorani argues that it is precisely the sequencing of projections anchored into directed displacements that provides the basis for discourse structure generation in classical ballet (Maiorani, 2021). The FGD consequently provides particular structuring principles relevant for the middle stratum of the semiotic mode that we are constructing. The smallest structural unit of motivated movement in the FGD is called Move. Each Move is constituted by a displacement of a dancer across space over a bounded interval of time during which projections are performed. Segmentation between Moves is indicated by a transition from one set of projections to another. The FGD characterises the sequencing of projections in terms of grouping operations similar to, but extending, the grouping operations for dance in general set out by Charnavel (2019) mentioned above.
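As a minimal sketch of the segmentation criterion stated above, a stream of per-frame projection sets can be cut into Moves wherever the set of projections changes. The frame-based input format and the string labels for projections are assumptions for illustration only; the sketch also ignores displacement, which a fuller treatment would need to include.

```python
def segment_moves(frames):
    """Split a sequence of per-frame projection sets into Moves.

    `frames` is a list of frozensets of projection labels (hypothetical
    labels, one set per time step). A new Move starts whenever the set
    of projections differs from the one that opened the current Move.
    Returns a list of (first_frame, last_frame, projections) triples.
    """
    moves = []
    start = 0
    for i in range(1, len(frames) + 1):
        # Close the current Move at the end of the data or on a change.
        if i == len(frames) or frames[i] != frames[start]:
            moves.append((start, i - 1, frames[start]))
            start = i
    return moves
```

For example, six frames carrying the projection sets a, a, b, b, b, a would yield three Moves covering frames 0 to 1, 2 to 4, and 5.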

For ballet, the structural grouping immediately ‘above’ Moves is termed a Minimal Ballet Sequence (hereafter MBS). Formally this consists of two ballet Moves: a beginning Move to establish an initial directed trajectory, and an ending Move to express either a continued direction or an altered direction. Two Moves then provide the minimum number of displacements through which a trajectory in space can be defined: one Move would only include one displacement, which defines direction but not yet a trajectory that may be either maintained or changed as required by the choreography. MBSs are then the building blocks of dance discourse rhetorical patterns: structurally they define the relationship between Moves to be either continuous (i.e., two consecutive Moves in the same direction across space) or varied (i.e., two consecutive Moves in two different directions).
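The structure of an MBS can be sketched as a simple data type. The class and field names below are hypothetical, directions are represented as plain qualitative labels, and the pairing of Moves into non-overlapping consecutive pairs is our assumption about how a Move sequence is grouped.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Move:
    direction: str          # qualitative direction label (assumed format)
    projections: frozenset  # projection labels performed during the Move

@dataclass(frozen=True)
class MinimalBalletSequence:
    beginning: Move  # establishes the initial directed trajectory
    ending: Move     # continues or alters that direction

    @property
    def relation(self):
        """'continuous' if both Moves share a direction, else 'varied'."""
        if self.beginning.direction == self.ending.direction:
            return "continuous"
        return "varied"

def pair_moves(moves):
    """Group a list of Moves into MBSs, two consecutive Moves at a time."""
    if len(moves) % 2 != 0:
        raise ValueError("an MBS needs exactly two Moves")
    return [MinimalBalletSequence(moves[i], moves[i + 1])
            for i in range(0, len(moves), 2)]
```

A sequence of four Moves with directions A, A, A, B would then give one continuous and one varied MBS.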

The final step in filling out the contents of a semiotic mode of ballet is to consider the stratum of discourse semantics. The FGD supports this as well by characterising abstractly the communicative work that distinct kinds of projections achieve during any dance. The performance of projections draws specific spatial regions of the stage into discourse relations. The precise regions identified by any projection depend on the general orientation of the dancer at the time that the projections are performed. Maintaining and changing orientations across a sequence of moves therefore gives rise to a communicative resource capable of manipulating the particular meanings made so that they are only established in relation to the developing discourse. This is shown in detail in the section following.

Projections consequently support the generation and recovery of dance discourse by capturing the performative nature of movement-based communication in ballet; this recognises that the dancer’s body should be seen as moving not only in a physical space but also in a discoursally established contextual space that is presupposed by dancers and which needs to be recognised by the audience. It is this placement in a contextual space that supports interpretation of the body/space interactions that generate dance discourse. Projections then occur when body parts moving in combination physically project, i.e., ‘point’, by extension and direction toward meaningful portions of the stage set. The relationship between movement and the placement of that movement within a contextual space by interpretation also marks out a clear distinction between movement as a purely physical activity, perhaps for exercise or exhibition, and movement as dance as we are conceiving it here.

Putting the semiotic mode to work: from movement to story

With the three semiotic strata filled in as suggested by the FGD, we now set out the major steps in interpretation, showing in particular how the manipulation of projections serves both to organise material distinctions and to relate those distinctions to possible discourse interpretations.

Since projections not only indicate particular areas of the performance space as potential discourse referents but do so in particular ways depending on the precise articulators employed and their movements, they support an extended range of potential ‘referring’ acts. This is captured by associating each structural configuration constituting a projection with an abstract discourse-level predicate. These discourse level predicates characterise physical configurations in terms of their potential discourse roles for the dance. Distinguished physical orientations are consequently placed in correlation with discourse structures that are formally analogous to the abstract notion of ‘event’ found in many approaches to verbal discourse semantics. Distinct types of projections give rise to correspondingly distinct ‘event structures’ as summarised in Table 1.

Table 1 Mappings between physical projection structures and discourse event predicates.

Each of the discourse predicates involves a specified set of roles analogous to thematic roles in syntax and event-based semantic descriptions. By these means, we arrive at a specification that is largely identical to common characterisations of linguistic semantics (cf. Parsons, 1990). The discourse referents filling such roles are then constrained by the actual orientation of the dancer’s body parts with respect to the stage or performance space. Thus, if a physical projection structure involving the arms is performed perpendicularly to the direction of movement, resulting in a CONNECTING discourse event, then the referents identifying what is being connected are given by, first, the dancer him- or herself as the one connecting and, second, potential discourse referents on stage ‘projected’ on either the left or right side depending on which arm is used.

This ‘two-stage’ semantics, whereby an under-specified generic reading is subsequently resolved against the developing discourse context, offers a powerful means of restricting the obvious ‘polysemy’ of movements not only in the movement-based medium of dance but also quite generally, as in the use of gesture. The most direct level of under-specified meaning descriptions corresponds broadly to iconic readings of bodily movement; those iconic readings are then only imbued with further more specific ‘content’ during discourse resolution and so are inherently (but systematically) variable.
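The two-stage reading can be sketched for the one mapping spelled out in the text, namely a perpendicular arm projection giving a CONNECTING event. The role names ‘connector’ and ‘connected’, the dictionary representation and the example referents are all hypothetical; the remaining predicates of Table 1 are deliberately not modelled here.

```python
def connecting_event(dancer, arm, stage_referents=None):
    """Build a CONNECTING event for a perpendicular arm projection.

    `stage_referents` maps a side of the dancer ('left' or 'right') to
    the discourse referent currently located on that side. With no
    context supplied, the event keeps its generic, unresolved reading.
    Role names are illustrative assumptions, not the paper's notation.
    """
    if arm not in ("left", "right"):
        raise ValueError("arm must be 'left' or 'right'")
    referent = None if stage_referents is None else stage_referents.get(arm)
    return {
        "predicate": "CONNECTING",
        "connector": dancer,    # the dancer is the one connecting
        "side": arm,            # side fixed by the articulator used
        "connected": referent,  # filled only by discourse resolution
    }

# Stage 1: generic reading, no referent yet.
generic = connecting_event("Aurora", "left")
# Stage 2: the same projection resolved against a (made-up) context.
resolved = connecting_event("Aurora", "left",
                            {"left": "courtiers", "right": "suitors"})
```

The same physical configuration therefore yields the same predicate in both cases; only the filler of the second role varies with the discourse context.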

Capturing the resolved discourse referents requires maintaining a representation of the stage space and positions of potential referents within that space that can continue to track locations when the dancer changes orientation. Such a representation is best provided by a qualitative description of the regions of the space. Many well formalised qualitative characterisations of spatial regions exist in the literature and it would be an interesting further research topic within ballet to explore empirically just which range of distinctions are used in dance and if there is any variation. For current purposes, the relatively coarse region description of Freksa’s (1992) ‘double-cross’ calculus is sufficient. This calculus identifies regions in space with respect to an oriented ‘double cross’ formed from a vector running from some starting location (sl) to some designated orienting point (sp). With this vector defined, 15 qualitatively distinct and mutually exclusive spatial regions are induced according to where some region is positioned with respect to the orienting vector. Each qualitative region receives a fairly intuitive label such as ‘left-forward’ (lf), ‘right-forward’ (rf), ‘left-perpendicular’ (lp), ‘right-perpendicular’ (rp), ‘left-centre’ (lc), and so on. These regions and their positions relative to the orienting vector are shown graphically in Fig. 2.

Fig. 2: The 15 qualitatively distinguished spatial regions of Freksa’s (1992) double-cross calculus.

The value of adopting such a calculus is that it provides formal operations for tracking where objects are when the orienting vector changes—as, for example, when a dancer changes direction. In addition, for ballet, this two-dimensional plane is also anchored in a three-dimensional space with identified directions ‘upwards’ (designated by the label ‘Top’) and ‘downwards’, generally indicating the ‘Ground’. Adopting a formally specified oriented description for dance trajectories allows us to straightforwardly resolve spatial regions invoked by the dancer’s projections in relation to the dancer’s movement directions as these develop across a dance. Projections performed in sequence then enable more complex discourse propositions to be constructed, first by invoking particular abstract discourse predicates (cf. Table 1) and, second, by successively evoking a changing set of discourse referents.
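To make the region tracking concrete, the following is a minimal sketch of a double-cross style classification in the two-dimensional stage plane. It assumes the standard construction in which the orienting vector, together with the two lines perpendicular to it through its start and end points, partitions the plane into 15 regions. The descriptive region names and the numerical tolerance are our own assumptions and do not reproduce the abbreviated labels of Fig. 2.

```python
def double_cross_region(start, orient, point, eps=1e-9):
    """Classify `point` relative to the vector from `start` to `orient`.

    Returns a (side, band) pair: side is 'left', 'straight' or 'right';
    band is 'back', 'at-start-line', 'centre', 'at-orient-line' or
    'forward'. 3 sides x 5 bands give the 15 regions. Labels and `eps`
    are illustrative; `eps` is not scaled to the size of the stage.
    """
    ax, ay = start
    vx, vy = orient[0] - ax, orient[1] - ay
    length_sq = vx * vx + vy * vy
    if length_sq == 0:
        raise ValueError("start and orienting point must differ")
    dx, dy = point[0] - ax, point[1] - ay
    cross = vx * dy - vy * dx            # > 0: left of the vector
    t = (vx * dx + vy * dy) / length_sq  # position along the vector
    if abs(cross) <= eps:
        side = "straight"
    else:
        side = "left" if cross > 0 else "right"
    if abs(t) <= eps:
        band = "at-start-line"
    elif abs(t - 1.0) <= eps:
        band = "at-orient-line"
    elif t < 0:
        band = "back"
    elif t < 1:
        band = "centre"
    else:
        band = "forward"
    return side, band

# A fixed stage location seen under two orientations of the dancer.
referent = (-1, 2)
before_turn = double_cross_region((0, 0), (0, 1), referent)
after_turn = double_cross_region((0, 0), (1, 0), referent)
```

Under these assumptions, the referent lies left-forward while the dancer is oriented along the positive y axis and left-back once the dancer turns to face along the positive x axis, which is the kind of re-resolution that a change of direction requires.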

We will illustrate this process at work with respect to a particular dance sequence taken from Princess Aurora’s solo in Act I of Sleeping Beauty when Aurora appears for the first time on stage at court for her sixteenth birthday. The solo features the original choreography created by Marius Petipa in 1890 to Tchaikovsky’s score. The fragment we analyse includes all of the distinctions introduced so far and so serves as a particularly suitable example. We base our discussion here on well-known instances of repertoire ballet of this kind in order to reach as wide an audience as possible, even though, as noted above, we see the FGD and our research as being applicable to a far broader spectrum of dance styles not necessarily based on classical ballet vocabulary and stylistic or technical conventions. Maiorani and Liu (2022), for example, present a particularly challenging application of the FGD for the analysis of contemporary dance that does not even involve movement across space.

For orientation in the discussion that follows, Fig. 3 shows a bird’s eye view of Aurora’s movement across the stage in the selected example. The directions and orientation of the stage are indicated using the terms of the double-cross spatial calculus introduced above as well as in more traditional stage direction terms; below we retain only the spatial calculus terms as these are what are used formally for tracking positions when dancers’ orientations and directions of movement vary. Our task is then to demonstrate how a discourse interpretation of this raw movement can be derived according to the semiotic mode model we have introduced.

Fig. 3: The fragment of the Aurora Solo interpreted in this example.

The first step in interpretation is to segment the raw movement using the technical features defined by the middle semiotic stratum of that semiotic mode. This enforces an articulation on the continuous movement in terms of Moves, during which sets of projections are performed, and provides the immediate grouping of those moves into Minimal Ballet Sequences. The first six Minimal Ballet Sequences of Aurora’s solo are shown in Fig. 4. Each MBS consists of two Moves as explained above and these structural relationships contribute directly to the semantic distinctions. Whereas two consecutive Moves in the same direction will be able to realise projections in the same range of meaningful portions of space, since the orientation of the dancer has not changed, consecutive Moves in different directions are able to realise projections in two different ranges of meaningful portions of space (thus creating a situation where the meaning potential is provided by two contrasting ranges of possible projections).

Fig. 4: Aurora’s Solo—first six Minimal Ballet Sequences shown from above with the stage oriented as indicated.

A and B designate contrasting directions of movement and projection.

This level of detail for our example can also be seen in Fig. 4. Here, we see that there are three groups of two MBSs that follow a continuous trajectory (i.e., MBS1 + MBS2, MBS3 + MBS4, and MBS5 + MBS6); each MBS in these groups is made up of two Moves realised in the same direction. The chosen direction is indicated in the figure with an arrow drawn with a solid line; the non-chosen direction (which could be anywhere but is indicated as a specific one for the sake of the discussion) is indicated by arrows drawn with a dashed line. Changes in direction occur at the boundaries between MBS2 and MBS3 and between MBS4 and MBS5. The potential meanings that accrue with a dancer following these alternatives are then developed in terms of projections as follows.
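The grouping just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `group_moves` and the input format (one direction label per Move) are hypothetical, and the labels A and B simply follow the contrasting directions marked in Fig. 4, with the third group read as returning to direction A.

```python
from itertools import groupby
from typing import List, Tuple


def group_moves(directions: List[str]) -> List[Tuple[str, List[int]]]:
    """Pair consecutive Moves into MBSs, then group consecutive MBSs by direction.

    directions[i] is the (assumed) direction label of Move i + 1.
    Returns a list of (direction, [MBS numbers]) in performance order.
    """
    if len(directions) % 2:
        raise ValueError("expected an even number of Moves (two per MBS)")

    # One label per MBS; a mixed label marks an MBS whose two Moves differ.
    mbs_directions = []
    for i in range(0, len(directions), 2):
        first, second = directions[i], directions[i + 1]
        mbs_directions.append(first if first == second else f"{first}/{second}")

    # Consecutive MBSs with the same label form one continuous trajectory.
    groups = []
    numbered = enumerate(mbs_directions, start=1)
    for direction, run in groupby(numbered, key=lambda item: item[1]):
        groups.append((direction, [number for number, _ in run]))
    return groups


# Twelve Moves: four in direction A, four in B, four in A again (assumed labels).
moves = ["A"] * 4 + ["B"] * 4 + ["A"] * 4
print(group_moves(moves))  # [('A', [1, 2]), ('B', [3, 4]), ('A', [5, 6])]
```

The boundaries between the returned groups correspond to the changes of direction between MBS2 and MBS3 and between MBS4 and MBS5.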

The projections performed during movement are derived directly from the directions and orientations of the projecting body parts, i.e., the articulators. This information is combined with the potential discourse referents identified as the ‘targets’ of projections, filling the participant roles of the corresponding abstract discourse predicates shown in Table 1. The result is an under-specified logical form as generally employed in Asher and Lascarides’ (2003) account of discourse semantics, but within which discourse referents are also anchored to spatial regions in a manner similar to that proposed by Schlenker (2018). The spatial regions involved are described in terms of the spatial qualitative calculus introduced above. This superimposes qualitative spatial regions ‘over’ the actual objects, props, and so on that are present on the stage relative to the current movement of the dancer. A graphical version of this coarse spatial region description for the current example with the dancer moving from the back of the stage directly to the front is given in Fig. 5a.

Fig. 5: The stage setting for the Aurora’s Solo superimposed on the 15 qualitatively distinguished spatial regions of Freksa’s (1992) double-cross calculus.

a With the dancer moving from the back directly towards the front of the stage and b with the dancer moving from the back to the front diagonally to the right.

This particular layout is taken from an edition of the ballet performed within the repertoire at the Bolshoi Theatre in Moscow (see Footnote 1): on the right of the stage we find the royal palace colonnade, seats, princes who have arrived to court Aurora and courtesans; on the left, the other side of the colonnade, more seats, some other courting princes, more courtesans, and the King and Queen towards the front. In the back are positioned Aurora’s girlfriends with some guards and more seats, while in the distance there are canals, ships, and other buildings on the water (using a backdrop that recalls a Venetian landscape).

This establishes all the components necessary to derive the discourse interpretation of Aurora’s dance sequence as it unfolds. Referring to Fig. 4 and the six MBSs shown there, we can see that the first move of the first MBS, M1 upper right, moves diagonally forwards and to the right. Moreover, as this is the first in the sequence, the dancer must take up a particular orientation in order to start. All the parts of the body that can project and the dancer’s orientation are noted at this point, and then subsequently similarly all parts of the body that can project are analysed along with their directions of projection. As will become clear, each of these components can involve considerable complexity.

The starting position, projections and their orientations for our example sequence are shown in Table 2. We will use the same type of tabular representation for the discussion of all the moves that follow, since the analyses shown in the tables provide the basis for all subsequent steps of the discourse analysis. The first three columns of the tables show the directly observable physical movements relevant for the semiotic mode of ballet according to the FGD. Following this, the types of discourse events produced (column 5) are read off from the projection structures (column 4) following the associations given in Table 1. The discourse events then introduce potential discourse referents standing in specific roles: for example, the fillers for the agent and goal roles of the CONNECTING event in the first row. Both are simply derivable, since it is the dancer who acts and the downward projection points indexically and straightforwardly towards the ground. We will see more interesting cases below.

Table 2 The initial set of projections of the starting position of the first Move (M1) of Aurora’s solo in the first Act of Sleeping Beauty.

As is typical and to be expected for a ‘starting’ position, all projections are meant to create relationships between agents and places rather than expressing one or more agents’ actions. Thus we see a clear depiction of the dancer’s rather static preparation for the first move, already suggesting its direction and showing an initial address to the audience.

Table 3 then shows the set of further projections as they unfold during the move. Here, again, we see considerable complexity: six distinct projection configurations derived from 10 distinct articulator movements (given by the rows). The process of interpretation is the same as before, although now we will see more dynamic projections and also more differentiated use of the spatial environment. The first row of the table shows a CONNECTING event as was the case with the starting position, but the second major row already indicates that we have entered the dynamic performance of the move. Here, we have a COMING-FROM discourse event, which inherently involves movement and for which we again need to resolve the semantic roles. The agent is the dancer, presenting few difficulties or possibilities for misinterpretation, but the source of the event needs to be abduced from the spatial region that the articulators are pointing to. The spatial regions indicated by the projections are coded directly in terms of the qualitative spatial regions shown in Fig. 2. These can be seen in the final columns of the analysis tables. The qualitative spatial calculus then allows these regions to be tracked formally as the dancer changes directions throughout a dance.

Table 3 The set of projections performed in the first Move (M1) of Aurora’s solo in the first Act of Sleeping Beauty.

In this second segment of the move, therefore, the arm is projecting left and so the qualitative spatial regions that correspond to this bodily orientation are pertinent. The ‘left-hand side’ is represented simply in the double-cross calculus by all regions on the left-hand side, that is: the disjunction {lp, lc, ll}, i.e., left-perpendicular or left-centre or left-(starting)-location. This is the standard way in which disjunctions, or larger regions, are denoted within such calculi: disjunctions naturally give less specific information, as the position of interest is in one of the named spatial regions but it is not known which. This is then the resolution of the discourse referents at this level of abstraction. A similar process operates for all projections and for all moves.
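As a rough illustration of how such region labels can be computed, the following Python sketch classifies a stage position relative to an orienting vector running from a start point `a` to a target point `b`. The function name, the coordinate convention, and the region labels other than those named above are assumptions made for this example: the two-letter scheme (l/s/r for the side, f/p/c/l/b for the band along the vector) is one common way of naming the 15 regions and is not taken from the analysis tables.

```python
from typing import Tuple

Point = Tuple[float, float]

# The coarse 'left-hand side' disjunction named in the text.
LEFT_SIDE = frozenset({"lp", "lc", "ll"})


def double_cross_region(a: Point, b: Point, p: Point, eps: float = 1e-9) -> str:
    """Return a two-letter region label for p relative to the oriented segment a -> b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq <= eps:
        raise ValueError("a and b must be distinct points")
    rx, ry = p[0] - a[0], p[1] - a[1]

    # Side: sign of the 2D cross product (positive means left of a -> b).
    cross = dx * ry - dy * rx
    side = "l" if cross > eps else "r" if cross < -eps else "s"

    # Band: position along a -> b, scaled so that a is at 0 and b is at 1.
    t = (dx * rx + dy * ry) / length_sq
    if t > 1 + eps:
        band = "f"  # beyond b (front)
    elif t >= 1 - eps:
        band = "p"  # on the perpendicular through b
    elif t > eps:
        band = "c"  # between the two perpendiculars (centre)
    elif t >= -eps:
        band = "l"  # on the perpendicular through a (starting location)
    else:
        band = "b"  # behind a (back)
    return side + band


entity = (-2.0, 0.5)                                       # a fixed stage position
print(double_cross_region((0, 0), (0, 1), entity))         # lc
print(double_cross_region((0, 0), (0, 1), entity) in LEFT_SIDE)  # True
print(double_cross_region((0, 1), (0, 0), entity))         # rc
```

The last call reverses the orienting vector: the same stage position now falls on the right-hand side, which is the kind of re-indexing the calculus makes trackable when a dancer changes direction.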

Comparing the two tables, we can observe that Table 2 shows a consistent lack of dynamic relations while foregrounding the creation of relationships: the collection of discourse events shown, CONNECTING, LOCATING, ADDRESSING, and ENGAGING, are typical of a static starting position where projections mostly realise connections rather than actions. In Table 3, the performance of the move proper, the projections become considerably more dynamic—particularly where, for example, the arm is moved towards the left (with respect to the dancer) and ‘straight backwards’: this is therefore opposite to the direction of movement of the Move. This is what motivates the allocation of the discourse event of a COMING-FROM.

The identification of spatial regions within the qualitative calculus then provides sufficient information to resolve, at least partially, sets of potential ‘story’ referents made up of the objects, people, props, and so on positioned at the qualitative spatial regions deployed. For example, as we can see directly from Fig. 5b, when the dancer moves to the right {lp, lc, ll} resolves to the King and Queen, the Princes, the court and so on because these are the stage entities that lie within the identified region at that time. Consequently, taking all the discourse events and their partially resolved discourse referents in the last two columns of Table 3 together yields for this first move, M1, the collection of partially resolved discourse events given in Listing (1):

The referents are still only partially resolved at this stage because there are still disjunctions or sets of entities in play within the qualitative regions identified. Semantic constraints, analogous to selection restrictions for linguistic predicates, could also be drawn on here to pre-restrict likely options, although because we are dealing with a discourse level of representation, such constraints can be expected to be abductive hypotheses rather than strict constraints.
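The partial resolution step can be sketched as a simple lookup. In the following Python example the function name `resolve_referents`, the optional `plausible` filter (standing in for the abductive semantic constraints just mentioned), and above all the assignment of stage entities to individual regions are hypothetical; only the overall outcome for {lp, lc, ll} is modelled on the resolution described for the first move.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Set


def resolve_referents(
    regions: Iterable[str],
    layout: Dict[str, Set[str]],
    plausible: Optional[Set[str]] = None,
) -> FrozenSet[str]:
    """Collect every stage entity lying in any region of a disjunction.

    The result is itself a set of candidates: the referent is only
    partially resolved until discourse context narrows it further.
    """
    candidates: Set[str] = set()
    for region in regions:
        candidates |= layout.get(region, set())
    if plausible is not None:
        # Optional defeasible restriction, e.g. from selection-like constraints.
        candidates &= plausible
    return frozenset(candidates)


# Hypothetical placement of entities in regions, for illustration only.
layout = {
    "lp": {"King and Queen"},
    "lc": {"Princes", "court"},
    "ll": {"court"},
    "rc": {"guards"},
}

left_hand_side = {"lp", "lc", "ll"}
print(sorted(resolve_referents(left_hand_side, layout)))
# ['King and Queen', 'Princes', 'court']
```

Passing a `plausible` set narrows the candidates without committing to a single referent, which keeps the constraint a hypothesis rather than a hard rule.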

The under-specified expressions identified here must then be fed into a final stage of discourse contextualisation where they can be regulated both by the internal coherence of Move sequences and by background knowledge concerning the story. There are several factors for referent resolution that can be usefully explored here, ranging from relative spatial proximity to the dancer to the history of already resolved discourse interpretations of the preceding projections. The fact that the provided framework allows semantic structures to be generated that are very similar to those used in natural language semantics provides the basis for a whole raft of cross-semiotic system comparisons. There are also clearly differences in how these structures are used, however, and these demand further investigation.

In general, the individual components present within a single Move constitute a set of constraints that serve to specify the situation depicted at that point in the dance. Various patterns of repetition and difference have been observed within such move-internal collections. On the one hand, information may be simply repeated; this is generally done for reasons of emphasis, temporal extension, or clarity, and because the movements are considered to fit together best aesthetically. For example, Listing (1) shows that the event of Aurora moving away (COMING-FROM) from the King and Queen, Princes and court occurs twice due to the projections being repeated by different articulators within the move. On the other hand, information can also be accumulated over the move. For example, we see that the dancer, playing the role of Aurora, is appealing to the ‘heavens above’, while also being grounded in the court and engaging and addressing the court and the Princes. The semantics of each MBS is then derived by combining the semantic specifications accumulated for each move. It is these units that play the primary role in interpreting longer stretches of dance.

We show this concretely with respect to Moves 3 and 4 making up MBS2 in the Aurora sequence shown in Fig. 4. The individual discourse semantic specifications for these moves are derived precisely as indicated for the first move, M1, above, although we omit here the tables showing the progression from physical movements to under-specified discourse events for reasons of space. The discourse events resulting from this process are shown in Listings (2) and (3).

As was the case with M1, here there are also several repetitions within and across the individual moves, as well as points of contrast across the moves. The repetitions consist of Aurora locating and connecting herself with respect to the ground and the guards, girlfriends and background props, while the contrast is made by Aurora engaging first with the ‘audience’ and then with the ‘sky’. We consider the repeated, shared content as providing an overall ‘topic’ of the MBS that is modified by the points of contrast. The notion of topic here draws directly on the mechanisms for discourse structure construction set out by Asher and Lascarides (2003)—establishing a shared topic is a crucial precursor to combining contributing discourse segments and allows elements to be grouped into narrative sequences, placed in contrast or parallelism relations, and so on. In addition, these points of contrast are taken to stand in a temporal succession relation induced by the participation of the successive moves in a single MBS.
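The split between shared ‘topic’ and points of contrast can be illustrated with plain set operations. The Python sketch below is a simplification under stated assumptions: the `Event` record with a single `target` role, the function name, and the two small event sets are hypothetical stand-ins that follow the prose description above; they are not the contents of the listings.

```python
from typing import FrozenSet, NamedTuple, Set, Tuple


class Event(NamedTuple):
    """A partially resolved discourse event (simplified to one target role)."""
    predicate: str
    agent: str
    target: str


def topic_and_contrast(
    first: Set[Event], second: Set[Event]
) -> Tuple[FrozenSet[Event], FrozenSet[Event], FrozenSet[Event]]:
    """Split the events of two successive Moves into shared topic and contrasts."""
    topic = frozenset(first & second)      # repeated content: the shared 'topic'
    return topic, frozenset(first - topic), frozenset(second - topic)


# Illustrative event sets for the two Moves of one MBS (assumed, not from the listings).
move_a = {
    Event("LOCATING", "Aurora", "ground"),
    Event("CONNECTING", "Aurora", "guards"),
    Event("ENGAGING", "Aurora", "audience"),
}
move_b = {
    Event("LOCATING", "Aurora", "ground"),
    Event("CONNECTING", "Aurora", "guards"),
    Event("ENGAGING", "Aurora", "sky"),
}

topic, only_first, only_second = topic_and_contrast(move_a, move_b)
print(sorted(e.predicate for e in topic))   # ['CONNECTING', 'LOCATING']
print([e.target for e in only_first])       # ['audience']
print([e.target for e in only_second])      # ['sky']
```

Because the two Moves belong to one MBS, the contrasting events would additionally be ordered in time (first the audience, then the sky); the sketch returns them in that order but does not model the temporal relation itself.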

Combination into a single narrative is made possible by considering just what discourse referents are maintained or changed with respect to the selected discourse predicates. Again, this is very similar to discourse interpretation in language. As Fig. 4 shows for the Moves of MBS1 and MBS2, the dancer (Aurora) does not change direction and so is always moving towards the Princes and away from the location of the King and Queen; this is direction A in Fig. 4. Consequently, her forward projections are always directed to the Princes and court, her backwards projections are directed towards the King and Queen, and the direction A defines her relative right and left sides so that references can be resolved to the backdrop of the stage and the audience side, respectively. The continuation of direction over the course of these two Minimal Ballet Sequences means in effect that projections to the same directions maintain their potential referents and thus can serve as a form of co-reference.

The next two Minimal Ballet Sequences, MBS3 and MBS4, contrast with the previous sequences. In this case, Aurora reverses direction and starts moving towards the opposite side of the stage, shown as direction B in the figure. Now, even though she realises the same types of projections as in MBS1 and MBS2 (which adds an aesthetic quality of cohesive repetition), she follows a different direction and so projections index different portions of space with correspondingly different values. For example, Table 4 shows the set of projections that the dancer forms within her fifth Move. In this case, the dancer is moving towards the left, which means that her right arm and leg are now projecting towards Princes and court instead of the King and Queen: the narrative has taken the opposite direction. This is identified in the table by specifying the disjunction of the rp, rc, and rl regions as before. Listing (4) summarises the partially resolved events produced by this move.

Table 4 Aurora’s arrival set of projections of Move 5 in MBS3 in Act 1 solo of Sleeping Beauty.

These semantics constructed for the individual MBSs are then combined further into larger discourse structures in a precisely analogous fashion to that described for Segmented Discourse Representation Theory by Asher and Lascarides (2003), i.e., by finding applicable discourse relations to dynamically grow encompassing discourse structures. Examining precisely which discourse relations might hold in this semiotic mode and how they will be best defined for ballet is one of the major challenges opened up by the account proposed.

In the restricted range of examples we have discussed so far, the discourse relations employed may be characterised analogously to Asher and Lascarides’ definition of Narration, whereby two situations may be abductively positioned in strict temporal sequence, to their subordinating relation of Elaboration, and to the further coordinating relation of Parallel. Elaboration at MBS level captures the connection between points of contrast within a Minimal Ballet Sequence and that sequence’s ‘shared topic’ as suggested above while Parallel captures structural parallelisms. Elaboration can happen also at Move level (Maiorani, 2021, pp. 24–25), when having reached a set of projections at the end of a move a dancer performs a further set of projections but without moving across space. As Elaboration is not performed at Move level in the MBSs analysed here, this will not be addressed further in this paper.

In our example sequence, Parallels are a prominent discursive feature, present between MBS1 + MBS2 and MBS5 + MBS6 where Aurora realises the same types of projections. These two sets of MBSs are separated by MBS3 and MBS4, where Aurora’s range of projections is modified by the change of direction. Moreover, by performing the same movements she creates not only structural parallelism but also a mirroring effect (as shown in Table 4) as she projects towards different areas and items that are located on the opposite side with respect to MBS1 and MBS2. The discursive parallelism between MBS1, MBS2, MBS5, and MBS6 is therefore interrupted by their mirror images in MBS3 and MBS4. The resulting discourse structure is suggested graphically in Fig. 6 following the usual graphical conventions for depicting SDRT analyses and incorporating the discourse relations discussed so far.

Fig. 6: Discourse structure and relations within the Aurora solo analysed.

MBS: Minimal Ballet Sequence; m1-m12 designate the individual moves of the sequence.

This alternation of trajectories not only sets up a regular structural organisation of the minimal ballet sequences and the Moves they comprise in terms of use of space, it also articulates a regular distribution of projections and the meanings they create, corresponding to Aurora alternating action and interaction between parties. Thus, whereas the Move defines more local semantic values in terms of actions and interactions textually bound by the positions taken up during a move, the larger MBS marks ‘structural’ dance segmentations in terms of continuity or variation, corresponding to two different types of meaning distribution at the higher discursive level. There are many further suggestive strands to follow on the basis of such analyses: for example, the use of parallelism and mirroring appears more reminiscent of structures that would be observed for music and poetry than traditional narrative. Questions of intermediality are therefore not only clearly relevant but also made analytically addressable.

After generating an entire discourse structure for a sequence, it then becomes possible in effect to verbalise the ‘story’ told by that sequence. An approximation for the first six MBSs treated above following the discourse structure derived is:

  • MBS 1-2: (M1) Aurora in the Palace moves from King and Queen to Court, interacting with the Court (M2) and then interacting with the sky and the Audience. (M3 + M4) Aurora poses in the Palace with the Court and interacts with the Audience.

  • MBS 3-4: (M5) Aurora moves back to King and Queen, interacting with them and the Court (M6) and then interacting with the sky and the Audience. (M7 + M8) Aurora poses in the Palace with the Court and interacts with the Audience.

  • MBS 5-6: (M9) Aurora moves again from the King and Queen to the Court, interacting with the Court (M10) and interacting with the sky and the Audience. (M11 + M12) Aurora poses in the Palace with the Court and interacts with the Audience before running back again in a diagonal to re-position herself in the spot where she started (M1), to start another sequence.

It is easy to see how these three sets of MBSs, realised at the start of the first solo Aurora performs in the whole ballet, construe a discourse through which this character introduces herself to her social and physical environment as well as to the audience and in a relevant context. We might then consider how this skeleton narrative might be enriched further by appealing to story knowledge, and also to what extent such verbal ‘glosses’ of dance sequences might improve recipients’ understandings of what is unfolding in a dance sequence, particularly when those recipients may not be experts in the form.

We have now seen how it is possible to provide a step-by-step derivation of a discourse structure and interpretation from a specific dance piece. It is also valuable, both theoretically and practically, to examine what happens to the analysis when we compare different, but related, instances of dance. For example, if we compare this version of Aurora’s solo with another traditional version, performed within the repertoire at the Opéra de Paris with the traditional choreography revisited by Rudolf Nureyev, our analysis begins to reveal both differences, as would be expected, and similarities, or ‘congruencies’, reflecting the higher level of description achieved.

More specifically, both the Bolshoi and the Opéra solos design the same discursive path as shown in Fig. 4 above against a very similar stage set-up (see Footnote 2). However, in Nureyev’s version there are more arm, head, and torso projections made towards the audience and the King and Queen, especially in Moves 2, 4, and 6. This builds a different type of fine-grained discourse, in which Aurora actively elicits engagement between the characters on stage and the audience. Nevertheless, at the MBS level the discourse patterns remain very similar. Discursive variation is consequently created more locally and mostly impacts the character’s interaction with the participants on stage and the audience.

A further interesting set of differences and similarities is revealed when we carry out the same type of analysis on the same solo, with the same traditional choreography, but performed in the unusual space of a TV studio: Kirov’s 1969 traditional version created for a TV broadcast (see Footnote 3). In this version, Aurora does not dance in a big environment but under a semi-circular pavilion. Courtesans interacting among themselves in a garden are vaguely visible through the arches. Aurora is mostly surrounded by her ladies in waiting who play instruments for her in a semi-circle, and it is only with them that she interacts during the six MBSs we have covered in our analysis. The existence of an audience remains completely unacknowledged, which suggests a situation more typical of a film. Only towards the end does the camera widen its scope to reveal the King and Queen, this time sitting on the left front corner of the pavilion, joined by a group of courting princes, who appear in the right front corner at the very end. In this version, therefore, due to the different set-up of the contextual space, Aurora’s projections are mostly directed towards the sky and the playing maids and the solo appears to be more like a game reserved to a group of young ladies in a more private pavilion. Nevertheless, as before, the overall discourse structure is maintained.

Interesting contrastive results are also revealed when we analyse the same solo in Matthew Bourne’s contemporary version of Sleeping Beauty (2012), danced in a modern technique and offering a much more Gothic fairy tale (see Footnote 4). In this version, the story starts in the late Victorian period (1890) at a non-defined European court, moves to the early 20th century and the Edwardian era, and ends in modern times. Aurora is a rebellious teenager who dances barefoot at the picnic organised by her parents in their palace gardens for her birthday. She is courted by several young men but especially by the son of the witch Carabosse in disguise, Caradoc, who is the one who tries to enchant her to sleep with a black rose. Bourne turns Aurora’s solo into a pas-de-deux between the Princess and Caradoc, who dance on the solo’s music while the King and Queen appear worried from the right front corner of the stage; they eventually join Aurora and Caradoc in the dance along with a few surrounding couples of courtesans but remain in a circumscribed position.

What emerges from the analysis in this final case is that both Caradoc’s and Aurora’s arms are continuously interlaced, with Caradoc stopping Aurora from projecting anywhere else. The Princess rarely manages to project one leg outside but she is always moved in a circle by Caradoc and she therefore always remains in a loop controlled by him; the very rare times she manages to project an arm outwards, towards the right or left side of the garden where a few scattered courtesans are lingering, Caradoc projects his corresponding arm in the opposite direction: thus the couple performs an interlaced and contrasting set of projections that highlight their being focused on each other and that enhances their head projections, almost always directed at each other. There is no interaction with or acknowledgement of the audience in this pas-de-deux. Nevertheless, despite these considerable differences, the six MBSs that they perform again follow the same pattern as the traditional solo as analysed above, only in the opposite directions. Even in this modern, subversive version, therefore, the same discursive pattern is repeated at the level of MBS, highlighting the focus on this moment of passage for Aurora, who becomes a young adult and starts taking some distance from the King and Queen and the court, and who is being courted for the first time; a moment of the narrative that is central in all versions.

In summary, we have now demonstrated how minimal ballet sequences establish basic building blocks for the construction of longer ballet sequences. This enables mechanisms for characterising such construction to be imported directly from established formal and functional approaches to discourse structure. Constructing discourse structures for ballet sequences in this manner also makes it possible to empirically address questions generally asked of discourse structures in other communicative forms, particularly for verbal language, from within the particular communicative form of ballet as well. Moreover, as our last comparative analyses show, it becomes possible to examine fine-grained differences in choreographies and to consider their motivations in terms of narrative adaptations as well as adaptations to different medial contexts.

Towards a programme of empirical research

So far we have shown how a formalisation of ballet in terms of a semiotic mode filled in with specific details from the FGD enables us to construct interpretations of abstract, but still closely form-related characterisations of the possibilities of movement-based meaning within ballet. The primary purpose of this characterisation, however, has been to establish a solid foundation for addressing a range of further more detailed questions concerning ballet as a communicative form which would have been difficult, if not impossible, to raise without establishing this linking between the general and the specific. In the remainder of this paper, therefore, we set out how the framework can be used to support empirical investigation drawing directly on the distinct levels of description provided by the semiotic mode.

Each of the distinct semiotic strata in the semiotic mode in fact correlates with particular kinds of data annotation. This can then be used to define annotation schemes for larger sets of data, thereby supporting both corpus-based and experimental methods as we shall now see. We begin with materiality, demonstrating that it is indeed possible to move from the raw physical movements of a dancer to the medium level of semiotic abstraction defined above. For this we have employed a motion capture system with a movement analysis component to extract precisely those details of body parts, movements and directions required for the process of analysis illustrated above to start. Since, however, not all dance data we might want to analyse is available for motion capture, we also provide an extensive annotation scheme that may be applied to any ballet sequences, again drawing on the specification of the FGD as embedded in our semiotic mode account. Finally, we set out some of the questions the discourse semantic description raises and how these can be pursued experimentally.

From movement to abstract ballet descriptions

As explained above, the lowest level of semiotic abstraction simply consists of measurements of selected values in the materiality defined as relevant for the semiotic mode—which in this case is our characterisation of ballet. This demands ways of providing such measurements. Given the prior identification of bodily movement and position as the essential material properties holding in this semiotic mode, it is natural to consider the use of motion capture systems. This is consequently one way in which we can move beyond manual descriptions of data to partially automated annotations as is necessary for larger-scale analysis.

Motion capture technology is increasingly used for a range of communicative activities relying on bodily movements, ranging from gesture accompanying speech (cf. Schüller et al., 2017; Mittelberg, 2018) to a variety of perspectives taken on dance. Some studies, for example, have focused on the automated generation of Labanotation scores directly from motion data (e.g., Ballas et al., 2017; Choensawat et al., 2015; Wang and Miao, 2018), while others have explored relations with audience emotional responses and the possibility of isolating particular movement features capable of assisting choreographers in creating artistic movement (cf. Camurri et al., 2002; Lourens et al., 2010; Vincs and Barbour, 2014). A particularly detailed multilevel set of dance-relevant features offering targets for several kinds of sensor measurements, including motion capture, is given by Camurri et al. (2016). Four levels are defined: physical signals, low-level movement measurements, mid-level trajectories of single or groups of body parts, and a fourth level concerning the communication of expressive qualities. Some of these features might complement those of our own account in the future, although currently there is only limited overlap with the particular properties of movement that we adopt for characterising dance communication.

To support the automated acquisition of these properties, we have built specifically on functionalities of the Perception Neuron inertial Mo-Cap system (NOITOM Ltd., 2015). Developed primarily for gaming and virtual reality applications, this motion capture system provides the ability to perform calibrated full body inertial motion capture with minimal latency, while streaming and logging kinematic data in real time (Fig. 7a). The system’s proprietary software produces a three-dimensional reconstruction of the suit’s wearer and so, once calibrated, coherent motion with the wearer can be visualised for all body segments.

Fig. 7: Sensors (above) and example of analysed data (below) recorded with Junor Souza.

a The motion capture suit worn during the experiments. b Example of projections during an Arabesque as recorded by our bespoke software tool.

A bespoke analysis script was developed in Python that receives the sensor data broadcast by the Axis Neuron software and then analyses them in real time to extract directions and projections of the movements corresponding to the categories of the FGD. A new description is output whenever a change in orientation of any part of the body is recognised. An example of the script operating for the analysis of the First Arabesque seen above in Move M1 in Fig. 4 is shown in Fig. 7b. The video of the full solo executed by Junor Souza, First Soloist with the English National Ballet, and the transcription of the output of the script are available as supplementary material to this paper. The sequence shown in the figure starts from original frame number 2155 in the video.

Although still preliminary, the script correctly captures the changes in the movement and links them to the directions defined by the FGD. For example, as shown in Fig. 7b, the output captures the relative position of all articulators as they establish sets of projections: the direction that the head is facing, the directions in which the arms are projecting by extension, and the relative positions of legs and feet, including their degree of rotation. The latter is additional information about the degree of joint rotation (i.e., legs and feet facing certain directions) that the dancer is capable of achieving in every movement; joint rotation is a characteristic of ballet technique that is directly related to the possibility of achieving higher technical levels. These results demonstrate the potential of the script to provide the necessary basic information present in the left-most columns of Tables 2, 3 and 4 above on which all subsequent analysis builds. In general, one may reasonably expect such automation to be possible for the less abstract strata of materiality and form for all semiotic modes.
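The step from continuous orientation data to qualitative direction labels, with a new description emitted only when something changes, can be illustrated with a minimal sketch. This is not the script used in the study: the axis convention (x to the right, y up, z forward), the label set, the threshold and the function names `classify_direction` and `describe_changes` are all assumptions introduced for illustration, and the sample vectors are invented.

```python
import math

def classify_direction(vec, threshold=0.5):
    """Map a 3D vector to a coarse qualitative label.

    Assumed axes: x = right, y = up, z = front. The labels are
    hypothetical stand-ins for the direction categories of the FGD.
    """
    x, y, z = vec
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0:
        return "neutral"
    x, y, z = x / norm, y / norm, z / norm
    parts = []
    if y > threshold:
        parts.append("up")
    elif y < -threshold:
        parts.append("down")
    if z > threshold:
        parts.append("front")
    elif z < -threshold:
        parts.append("back")
    if x > threshold:
        parts.append("right")
    elif x < -threshold:
        parts.append("left")
    return "-".join(parts) if parts else "neutral"

def describe_changes(frames):
    """Yield (frame, articulator, label) whenever an articulator's label changes.

    `frames` is an iterable of (frame_number, {articulator: vector}) pairs.
    """
    last = {}
    for frame_no, vectors in frames:
        for part, vec in vectors.items():
            label = classify_direction(vec)
            if last.get(part) != label:
                last[part] = label
                yield frame_no, part, label

# Invented sample data: three frames for two articulators.
sample = [
    (0, {"head": (0, 0, 1), "right_arm": (0, 0, 1)}),
    (1, {"head": (0, 0, 1), "right_arm": (0, 1, 0.2)}),
    (2, {"head": (-1, 0, 0.1), "right_arm": (0, 1, 0.2)}),
]
events = list(describe_changes(sample))
```

Only the frames in which a label actually changes produce output, which mirrors the behaviour described above of issuing a new description whenever a change in orientation is recognised.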

Corpus building through data annotation

Another central requirement for larger-scale analysis following the semiotic mode methodology involves the collection of annotated corpora showing data that is presumed to be conformant with the semiotic mode being investigated. This allows all the standard techniques of corpus analysis in linguistics to be applied regardless of the multimodal forms of communication at issue. Following the semiotic mode definition, therefore, we gather annotated data at the level of the materiality of dance, at the level of the categories and structures defined by the FGD, and at the level of the hypothesised discourse structures corresponding to the formal configurations found. This follows an extension of the general methodological approach proposed for multimodal data in Bateman (2022) and as suggested graphically in Fig. 8. We see this as a potential contribution to archiving such data as well.

Fig. 8

A graphical representation of the correlation maintained between semiotic levels of abstraction and distinct kinds of annotations to be used in corpus construction (adapted from: Bateman, 2022).

The lowest level is then the immediate output of the motion capture system described in the previous subsection. The second level is the result of the Python script running on that data. The specification as a whole then serves as a general annotation scheme that combines both the lower-level features, which may be acquired automatically in the manner we have shown, and the structural configurations and higher-level descriptions in terms of discourse events. Following the FGD then helps substantially in acquiring data capable of supporting investigation of the discourse semantic interpretation of ballet. Not only does this allow for an accurate annotation of the choreographic realisations at different structural and semantic levels, it also foregrounds choices that may not be immediately noticeable to simple observation and that nonetheless impact the discursive outputs realised by projections. Several examples are shown below.

We conduct corpus annotation using ELAN (ELAN, 2021), a versatile and multipurpose annotation tool for audio and video materials developed by the Max Planck Institute for Psycholinguistics at Nijmegen (e.g., Brugman and Russel, 2004; Lausberg and Sloetjes, 2009). We have found three key functionalities of ELAN particularly useful for the annotation of dance discourse. First, ELAN adopts a horizontal timeline as a base and allows annotation on multiple layers, called “tiers” (a tier groups annotations that describe the same aspect of the data). These enable the annotation of the temporal and textual structures of dance discourse created by the interaction between different body parts and meaningful portions of space. Second, ELAN supports the creation of tier dependencies and hierarchies, which allows us to make explicit the hierarchical relations between body movements, Moves, and MBSs. And third, ELAN encourages the use of controlled vocabularies for systematising any annotation labels used. As proposed in the semiotic mode methodology and shown in Fig. 8, it is particularly beneficial in multimodality research to link such controlled vocabularies directly with the categories defined in the various levels of the semiotic modes involved. In our research, therefore, the controlled vocabularies are created directly on the basis of the FGD and the discourse semantics.
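The three functionalities just described (tiers on a shared timeline, tier hierarchies and controlled vocabularies) can be pictured with a small plain-Python data structure. The sketch below does not use ELAN or its file format; the class `Tier`, its fields and the vocabulary entries for the Move-level tier are hypothetical, while "varied" and "continuous" are used as MBS labels on the assumption that these two MBS types form the controlled vocabulary of the MBS tier.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Tier:
    """A tier groups annotations describing one aspect of the data."""
    name: str
    vocabulary: frozenset              # controlled vocabulary for this tier
    parent: Optional["Tier"] = None    # tier dependency, if any
    annotations: list = field(default_factory=list)

    def annotate(self, start_ms, end_ms, value):
        """Add a time-aligned annotation, rejecting labels outside the vocabulary."""
        if end_ms <= start_ms:
            raise ValueError("segment must have positive duration")
        if value not in self.vocabulary:
            raise ValueError(f"{value!r} is not in the vocabulary of tier {self.name!r}")
        self.annotations.append((start_ms, end_ms, value))

    def depth(self):
        """Number of ancestor tiers above this one in the hierarchy."""
        return 0 if self.parent is None else 1 + self.parent.depth()

# Hypothetical tier set-up: one MBS tier with a dependent Move-level tier.
mbs_tier = Tier("MBS", frozenset({"varied", "continuous"}))
arm_projection_tier = Tier(
    "Move: interactive projection (arms)",
    frozenset({"POS", "BG"}),          # placeholder labels for illustration
    parent=mbs_tier,
)

mbs_tier.annotate(0, 2000, "varied")
arm_projection_tier.annotate(0, 1000, "POS")
```

Restricting each tier to its own vocabulary is what keeps annotation labels tied to the categories of the corresponding semiotic level rather than to free text.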

Data annotation is then carried out following four main steps. First, we add video files of the dance sequences to be annotated in ELAN: these can be both recordings of actual dance sequences and visualisations obtained from the motion capture software. Second, with the media files added we can create tiers to describe Moves and MBSs drawing on the FGD features illustrated above. At the Move level, this involves tiers that code the following six aspects of dance discourse: physical movement, structure, narrative projections, ‘narrathletic’ enhancers (if any: Maiorani, 2021), interactive projections and modal values of projections (which indicate focus or amplification of meanings depending on the number of limbs projecting in the same direction) in relation to different body parts, i.e., arms, hands, legs, feet, torso and head. At the MBS level, we create one tier to code the discursive trajectory of MBSs (i.e., structural configurations). Third, the video is segmented along the time axis following the formal constraints of the FGD. Since all annotations in ELAN are associated with temporal segments of the video being analysed, segments are created to capture the starting and arrival points of each Move as well as the starting and arrival points of each MBS. The former are hierarchically placed within the latter. Finally, in the last step, we enter specific annotation values for each segment as appropriate, drawing from the FGD-derived controlled vocabularies we created.
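The third step, in which Move segments are placed hierarchically within MBS segments, amounts to a containment check over time intervals. The following sketch shows one way such a check might be expressed; the function name `nest_moves`, the segment labels and all time values are invented for illustration and do not come from the annotated corpus.

```python
def nest_moves(mbs_segments, move_segments):
    """Group Move segments under the MBS segment that contains them.

    Each segment is a (label, start_ms, end_ms) tuple. A ValueError is
    raised if a Move does not lie fully inside any MBS segment.
    """
    nested = {label: [] for label, _, _ in mbs_segments}
    for move, start, end in sorted(move_segments, key=lambda seg: seg[1]):
        for mbs, mbs_start, mbs_end in mbs_segments:
            if mbs_start <= start and end <= mbs_end:
                nested[mbs].append(move)
                break
        else:
            raise ValueError(f"{move} is not contained in any MBS segment")
    return nested

# Invented segments: two MBSs with two Moves each.
mbs_segments = [("MBS-A", 0, 2000), ("MBS-B", 2000, 4000)]
move_segments = [
    ("Move-1", 0, 1000),
    ("Move-2", 1000, 2000),
    ("Move-3", 2000, 3000),
    ("Move-4", 3000, 4000),
]
hierarchy = nest_moves(mbs_segments, move_segments)
```

A check of this kind can flag segmentation errors early, for instance a Move whose arrival point has been placed after the arrival point of the MBS it is meant to belong to.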

The purpose of this fine-grained data annotation echoes that of corpus-based approaches in general. Appropriate annotations enable us to locate and explore patterns in data from several perspectives, as well as structuring the data in a form optimal for deriving material for experiments. We demonstrate this briefly using annotations taken from an analysis of a solo from the ballet Raymonda as performed by First Soloist Junor Souza in the new choreography that Tamara Rojo created for the English National Ballet. Figure 9e focuses on two MBSs selected from this data.

Fig. 9: Video frames (top) and example of data annotation conducted with ELAN (bottom).

a The video frame of the arrival point of Move 15. b The video frame of the arrival point of Move 16. c The video frame of the arrival point of Move 17. d The video frame of the arrival point of Move 18. e MBS 8 and MBS 9 and their constituent Move annotations.

In the figure, annotations of the two selected Minimal Ballet Sequences are shown, labelled MBS8 and MBS9, together with their four constituent Moves, i.e., M15 (Fig. 9a), M16 (Fig. 9b), M17 (Fig. 9c) and M18 (Fig. 9d). The MBSs are captured in a single tier, while below this, the tiers at the Move level code structure, narrative projections and interactive projections in relation to arms and hands, legs and feet, torso and head. Several additional labels are used for Moves to anchor their annotations in physical space and to show directions and other relevant entities, in particular: BG (stage background), GR (ground), LFC (left front corner), MD (move direction), POS (participant/s on stage), RBC (right back corner) and RFC (right front corner). Unlike the qualitative spatial regions used above, the locations here are already anchored to the physical positions of the performance space since this can be observed directly.

The tiers as defined and shown in the figure improve the visibility of the two dimensions of space foregrounded by the FGD: movement in physical space is represented by the annotation of structures, which appear at the same tier level as the annotation of projections enacted in the contextual space of a performance. Consequently, in the solo annotated, it becomes clear that significant repetition can be observed at the Move level, but not at the structural level realised by the MBSs.

This is shown specifically in Fig. 9e. The varied MBS8 and MBS9 are formed by four Moves that follow a mirrored pattern realised by two pairs of identical Moves carried out in reversed order: M15 is the same as M18, whereas M16 is the same as M17. The two varied MBSs both foreground a high number of repetitions at both the structural and the semantic level. For example, in the four Moves (M15 to M18), the structures of the right arm and right hand movements are all vertically perpendicular to move direction, while the interactive projections are all towards participant/s on stage. This is then a further instance of the kind of mirroring that we saw in our first detailed example above, but here revealed directly from the corpus analysis.
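Once Move annotations are available in structured form, patterns of this kind can in principle be searched for automatically. The sketch below shows one possible check for a mirrored sequence and for features shared by all Moves; the function names, the feature names and the values of the `legs_structure` feature are hypothetical placeholders, and only the two shared features paraphrase what is reported above.

```python
def is_mirrored(moves):
    """True if the sequence of Move annotations reads the same in reverse."""
    return len(moves) > 1 and moves == moves[::-1]

def shared_features(moves):
    """Return the feature/value pairs that hold for every Move in the sequence."""
    common = set(moves[0].items())
    for move in moves[1:]:
        common &= set(move.items())
    return dict(common)

# Two hypothetical Move descriptions; "legs_structure" values are placeholders.
move_a = {
    "right_arm_structure": "vertically perpendicular to MD",
    "interactive_projection": "POS",
    "legs_structure": "placeholder-1",
}
move_b = {
    "right_arm_structure": "vertically perpendicular to MD",
    "interactive_projection": "POS",
    "legs_structure": "placeholder-2",
}

# A four-Move sequence of the form A, B, B, A.
sequence = [move_a, move_b, move_b, move_a]
mirrored = is_mirrored(sequence)
repeated = shared_features(sequence)
```

Comparing whole annotation records, rather than single labels, means that two Moves count as identical only if they agree on every annotated tier.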

Moreover, in this specific solo the annotations show that there is more repetition in the varied MBSs than in the continuous ones, which in turn suggests a specific rhetorical use of varied MBSs. Thus, even though Moves may be carried out on the same spatial plane (e.g., along the same diagonal) and movement structures can be repeated several times, changes of MBS direction across that plane (i.e., back and forth) nevertheless introduce a more varied discursive strategy than the similarities at the lower semantic level of Moves alone would suggest.

Figure 9 therefore reveals several aspects of the choreography construing specific discursive strategies that less detailed analysis would be unlikely to reveal and that would certainly not be accessible to broader-scale analysis. Indeed, although annotation at the Move level can capture repetitions across Moves, it is precisely the visualisation of patterns across MBSs that allows us to recognise higher levels of rhetorical strategy. This type of synthetic visualisation would also naturally help us capture in more detail any variation introduced by dancers who interpret the same role differently, together with its immediate repercussions across discursive levels, and so opens the door to a range of detailed empirical investigations that would not otherwise be possible.

Semiotic mode refinements by experimentation

In addition to the distribution-based approach to empirical validation and refinement of the details of a semiotic mode of classical ballet described in the previous subsection, we can now also set out how our approach makes further properties of ballet accessible experimentally. More specifically, we have shown how the development of an abstract, but discoursally oriented description scheme for ballet can play a significant role in bridging the gap between physical movements and their discourse-relevant interpretations. This can now be put to use to support further experiment-based research. This constitutes our final demonstration of how the specification in terms of a semiotic mode provides a direct chain of ‘import’ for a variety of further empirical methods capable of probing the communicative nature of ballet.

The discourse semantics of a semiotic mode can play a central role in experimental research because that semantics is already defined in terms of interpretative hypotheses. These hypotheses may be directly recast as experimental hypotheses so that the accuracy and reach of the model can be evaluated. Showing the results of such experimentation goes beyond the scope of the current paper and so we will limit our discussion here to illustrating the nature of some of the imported methods and questions. Further such explorations can certainly be envisaged, but our goal here is to demonstrate how a particularly productive interaction can now be built between empirical issues raised for verbal discourse and previously inaccessible but very analogous challenges for ballet.

We take two very different kinds of investigation to indicate something of the breadth of phenomena that are made accessible. In both cases we show how the questions and methods that we apply draw on substantial work (primarily) on language and perception that would otherwise have been difficult to apply to ballet at all.

First, considering evaluation of the internal structure of the semiotic mode we have proposed, we mentioned above how one of the central questions raised to date in formal approaches to dance has been that of achieving reference non-verbally. This question is itself a direct extension of questions concerning restrictions on anaphoric reference in verbal discourse (Kamp and Reyle, 1993). We can now pose precisely the same questions for ballet sequences. More generally, this also invites us to consider more carefully the conditions under which indexation of visual entities as discourse entities occurs and to ask whether the discourse structures that result share properties with similar discourse-level inferences proposed for verbal language. One of the most direct ways of addressing these concerns experimentally is to use specifically designed ballet sequences to generate expectations (according to the model) and then to examine whether recipients share those expectations. For this, eye-tracking methods offer a well-established technique for probing mismatches of expectations precisely because attention is strongly driven by hypotheses concerning where information relevant for interpretation will be found (cf., e.g., Itti and Koch, 2001). Manipulating the ballet sequence’s discourse structure further also supports more fine-grained explorations, such as, for example, the investigation of just which kinds of discourse structure are possible and whether these differ from those observed for language. Wolf and Gibson (2005), Danlos (2008), and Egg and Redeker (2008) all provide important discussions of the complexity and forms necessary for verbal discourse, but considerable further investigation is necessary as the question remains unresolved. We see the possibility of triangulating with other semiotic modes in the manner we suggest for ballet as an important direction to develop.

Second, there is extensive psychological work in perception that shows how event segmentation plays an important role in language, film, and everyday life (Zacks et al., 2007). This has also proved useful for researchers investigating the perception of discourse structure across domains and modalities (Lerdahl and Jackendoff, 1983; Popescu et al., 2021; Bläsing, 2015). Relevant empirical studies suggest that boundary perception is a general cognitive ability that is presumed to be governed by defined rules and modulated by additional factors. This then allows us to design concrete empirical investigations to assess the extent to which the kinds of segmentations provided for ballet by our account also mesh with recipients’ perception. Work in perception often applies the method of having participants segment some ongoing communicative sequence and then investigating how segmentations group and what features of the communication correlate most predictively with the boundaries assigned. This can now also be applied directly to dance sequences as the semiotic mode description makes very specific predictions concerning the units of Moves and MBSs that would not be available in purely descriptive notations such as those of Laban or Rudolf and Joan Benesh mentioned above. This then supports testing of whether there is a correlation between the pattern of viewers’ segmentation and the discourse structural organisation based on the FGD model. Thus, again, specific research questions that would otherwise be difficult to operationalise for ballet are brought within reach.
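To make the proposed comparison concrete, the sketch below shows one simple way in which viewers' segmentation responses might be compared with the boundaries predicted by the model. It is an illustration only: no such experiment is reported here, the tolerance-window matching with precision and recall is a stand-in measure chosen for simplicity rather than the method of any cited study, and the function name `boundary_agreement`, the tolerance value and all time points are invented.

```python
def boundary_agreement(model, observed, tolerance=0.5):
    """Compare observed boundary times with model boundary times (in seconds).

    An observed boundary counts as a hit if it falls within `tolerance`
    of a model boundary that has not already been matched. Returns
    (precision, recall).
    """
    matched = set()
    hits = 0
    for t in observed:
        candidates = [
            m for m in model
            if abs(m - t) <= tolerance and m not in matched
        ]
        if candidates:
            matched.add(min(candidates, key=lambda m: abs(m - t)))
            hits += 1
    precision = hits / len(observed) if observed else 0.0
    recall = len(matched) / len(model) if model else 0.0
    return precision, recall

# Invented example: MBS boundaries predicted by the model versus
# boundaries marked by one hypothetical participant.
model_boundaries = [2.0, 4.0, 6.0]
participant_boundaries = [2.1, 3.0, 5.8]
precision, recall = boundary_agreement(model_boundaries, participant_boundaries)
```

The same comparison could be run separately against Move boundaries and MBS boundaries to see which level of the description viewers' segmentations follow more closely.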

In this section, we have shown how working within the semiotic mode approach encourages the adoption of research methods from other fields, particularly the linguistic approaches of corpus work, formal work on discourse structures, and psychological research on attention and event segmentation, so that directly communicative aspects of ballet can be addressed.

Conclusions and outlook

Our discussion in this paper has presented a new approach to ballet as a form of communication that enables raw motion capture data to be processed up to a level of abstract semantic representation of the kind typically used in formal discourse representation frameworks. We have also argued that this framework then makes ballet accessible to both corpus-based and experiment-based investigation, directly importing methods developed for other communicative forms.

The initial step of moving from raw data of a non-verbal kind to formal semantic representations is always problematic when addressing new semiotic systems. The fundamental challenge is one of imposing discourse-appropriate qualitative categories on continuous data, which has generally constituted a major bottleneck for larger-scale research. This issue always arises when the material of expression of a form of communication is continuous and, in many respects, iconic. The approach described here shows how we have met this challenge, opening the door to extensive further empirical investigations. Moreover, although the solution we presented specifically targeted ballet, the general methodology that we have applied is not specific: the theory of semiotic modes employed insists that all forms of communication be approached in a similar fashion, characterising material distinctions to construct structural configurations that are then capable of supporting discourse interpretations.

In many respects, therefore, we see this as an opening gambit for a range of further, more focused semantic investigations for both ballet and other analogous communicative forms. Having produced formal semantic representations on the basis of the performed dance movements, we can turn to a range of questions that hold quite generally for discourse across all forms of communication. Indeed, at the level of abstract semantics described, many differences between communicative forms have already been neutralised, making it more straightforward both to consider interactions between communicative forms (such as between the danced performance and the story or concept that is being represented through the performance) and to employ already existing reasoning and representation tools. Moreover, the notion of discourse inherited from multimodality that we have drawn on here is by no means limited to solely narrative discourse and may be expected to apply equally to communicative forms that are not attempting to ‘tell stories’, thus widening the potential scope of application of the model still further. This possibility can already be seen in our account above simply by relaxing the constraint that projected spatial regions be resolved against story elements—they can equally well stand alone as ‘abstract’ discourse referents should the form (or piece) demand it.

Questions for further work that we now consider particularly pressing involve extending and evaluating the range of dance movements that we can reliably cover, including moves to configurations produced by multiple dancers, ascertaining the full range of discourse structures that ballet supports, exploring the degree to which we can model interactions between background knowledge of likely story developments and the semantic configurations actually produced during a dance, and establishing the degree to which we can use the semantic configurations produced to derive hypotheses for empirical testing—in particular, concerning likely hypotheses for attention allocation to resolve potential discourse referents. There are also several potential practical applications as we noted above. For example, we are exploring the extent to which providing explicit guidance to viewers concerning the placement of projections within a performance might contribute to viewers’ understanding and appreciation of the art form. All of these questions demand, however, that we are first able to provide the semantic configurations required, and this is what we have now demonstrated in prototype form in this paper.

Finally, its generality notwithstanding, we expect that pursuing more precise description will equally reveal valuable differences between communicative forms that would otherwise not have been visible. This must always be a goal of multimodality research of this kind – revealing not only the generalisations, but also the significant differences, that help make each communicative form what it is.