Vocal Expression Through Group Free Improvisation: Somatic Methods and Artistic Risk-Taking
Abstract
This practice-based mixed-methods study investigates how group free vocal improvisation, combined with somatic methods from theater and movement practices, influences singers’ willingness to take artistic risks. Seven graduate-level singers participated in four weekly 90-minute sessions that progressively integrated collaboration, body/movement exploration, spoken text, and vocal expression. Methods were drawn from Pauline Oliveros (Deep Listening), Viewpoints, Alexander Technique, Lessac Kinesensics, and Gaga Movement. The researcher served as facilitator.
Two research questions structure the analysis: (1) How does singers’ relationship to experimentation and process evolve through participation in somatic-improvisational vocal practice? (2) What conditions facilitate singers’ increased openness to uncertainty?
Quantitative data (IRT and AIRT surveys administered pre/post each session) were analyzed descriptively and with exact Wilcoxon signed-rank tests; qualitative data (semi-structured interviews, post-session journals) were analyzed using a framework analysis approach. The central finding is a trajectory from tolerating discomfort to initiating artistic action, described by 6 of 7 participants and supported by a group mean IRT increase from 3.43 to 4.79. This trajectory depends on specific conditions: explicit non-evaluative framing, group cohesion actively constructed through warm-up activities, and pedagogical structures calibrated to individual needs. Structure functions as scaffolding for some participants and as a perfectionism trigger for others; this is the study’s most nuanced finding.
Findings should be interpreted as exploratory given the small sample and practice-based design. No control condition was used.
Key Findings
RQ1: How does the relationship to experimentation evolve?
| Finding | Confidence | Evidence |
|---|---|---|
| Participants describe a trajectory from tolerating discomfort to initiating artistic action | High | 6/7 interviews describe this arc; IRT 3.43 to 4.79 (p = .016, d = 2.12) |
| The trajectory is condition-dependent, not linear or universal | High | P2 reverts to rigidity when group cohesion breaks down (Session 4) |
| “Getting it right” decreases across sessions but is managed, not eliminated | High | 6/7 report reduction; P7 notes self-judgment displaced to after sessions |
| Participants develop somatic awareness of their own risk-taking states | Exploratory | P5, P2, P4 describe body-based markers; others less explicit |
| Artistry and agency are reframed: from performing competence to exploring | High | P3, P4, P6, P2 name conceptual shifts; self-scores show universal increase |
RQ2: What conditions facilitate openness to uncertainty?
| Finding | Confidence | Evidence |
|---|---|---|
| Explicit non-evaluative framing of the space is foundational | High | 7/7 describe the space as non-evaluative; P5 reports contraction without explicit naming |
| Group energy operates through modeling, permission, and contagion | High | 7/7 name the group as significant; P5 named as catalyst by 3 participants |
| Group cohesion must be actively constructed and can degrade | High | P2’s Session 4 experience; games/warm-ups identified as the construction mechanism |
| Structure functions differently for different participants | High | P1: visual scores as bridge. P2, P4: structure triggers perfectionism |
| Environmental framing is necessary but not sufficient | Exploratory | P2: internal rigidity can override environmental conditions |
| The performance frame reactivates anxiety even after four sessions | High | 6/7 name singing-with-audience as hardest Session 4 activity |
Survey Findings
| Finding | Confidence | Evidence |
|---|---|---|
| IRT scores increase for all participants (group mean +1.36) | High | Wilcoxon p = .016, d = 2.12, r = 1.00 (all participants increased) |
| AIRT overall scores increase (group mean +0.45) | High | Wilcoxon p = .043, d = 1.05, r = 1.00 |
| Within-session gains occur at every session | High | Sessions 1, 2, 4 significant (p = .016, .043, .016); Session 3 p = .063 with n = 5; all d ≥ 1.30 |
| Voice domain shows the largest AIRT gain (+0.72) | Exploratory | Raw gain largest but not significant (p = .180); ceiling effects limit power |
Study Design
Participants
Seven graduate-level singers (4 female, 3 male) enrolled in vocal performance or vocal pedagogy programs. All had prior vocal training and ensemble experience. Prior improvisation experience ranged from limited (5 participants) to moderate (1) to significant (1). The participant with the most prior somatic and improvisational experience (P5) serves as a contrasting case throughout the analysis.
Sessions
Four weekly 90-minute group sessions, each combining guided theatrical, vocal, and movement-based improvisation:
| Session | Focus | Key Activities |
|---|---|---|
| 1 (Nov 8) | Collaboration and group sensibility | Alexander-inspired grounding walk, name ball game, group counting game, card-based improvisation (solo, pairs, group) |
| 2 | Body and movement exploration | Memory circle, Viewpoints-inspired movement, body-voice connection exercises |
| 3 | Spoken text and expression | Lessac-influenced text work, graphic scores, Cinderella narrative improvisation |
| 4 | Integrated vocal expression | Tuning meditation, graphic scores with voice, personal repertoire (“sing a song you have negative feelings about”) |
The arc was designed to first establish exploration norms, then gradually raise stakes through increasingly vocal and personal material.
Instruments
Intellectual Risk-Taking Scale (IRT) (Beghetto, 2009): 6-item Likert scale measuring willingness to try new things, make mistakes, and share unpolished ideas. Administered pre and post each session (8 timepoints). 50 of 56 possible administrations completed.
Artistic Risk-Taking Willingness Scale (AIRT): 8-item researcher-designed scale measuring willingness across 4 domains (collaboration, movement, text, voice). Same administration schedule. 50 of 56 possible administrations completed.
Semi-structured interviews: Conducted after all sessions. Two-phase protocol: (1) language/concept calibration, (2) process and transformation. All participants asked the same questions in the same order.
Post-session surveys: Open-ended reflections after each session, plus baseline feelings and Session 4 specific prompts. 25 of 28 possible responses collected.
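Missing data: P3 and P4 were absent from Session 3; P2 was absent from Session 2. Absent participants have no survey or open-ended data for those sessions. These three absences account for the six missing administrations of each scale (one pre and one post per absence) and for the three missing open-ended responses.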
Analysis Method
Qualitative: Framework Analysis
Interview analysis used a framework approach (Ritchie & Spencer, 1994). Because all seven interviews followed the same structured protocol, the primary analytic move was question-by-question cross-case comparison: for each interview question, all seven responses were laid side by side, and patterns of convergence and divergence were identified. These patterns were organized into six themes under the two research questions, and a participant-by-theme matrix was constructed.
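The participant-by-theme matrix described above can be pictured as a simple lookup table. The sketch below is illustrative only: the theme labels are hypothetical shorthand (the source does not list the six theme names), and the coded pairs are taken from the Evidence columns in Key Findings, not from the study's actual coding files.

```python
PARTICIPANTS = ["P1", "P2", "P3", "P4", "P5", "P6", "P7"]

# (participant, theme) pairs drawn from the Key Findings evidence columns.
# Theme labels are hypothetical shorthand, not the study's own theme names.
CODED = [
    ("P5", "somatic awareness"), ("P2", "somatic awareness"), ("P4", "somatic awareness"),
    ("P3", "reframed artistry"), ("P4", "reframed artistry"),
    ("P6", "reframed artistry"), ("P2", "reframed artistry"),
    ("P1", "structure as scaffold"),
    ("P2", "structure as trigger"), ("P4", "structure as trigger"),
]


def build_matrix(coded, participants):
    """Return (themes, matrix) where matrix[participant][theme] is True if coded."""
    themes = sorted({theme for _, theme in coded})
    matrix = {p: {t: False for t in themes} for p in participants}
    for participant, theme in coded:
        matrix[participant][theme] = True
    return themes, matrix


def convergence(matrix, theme):
    """Count how many participants were coded under a theme."""
    return sum(row[theme] for row in matrix.values())


themes, matrix = build_matrix(CODED, PARTICIPANTS)
for theme in themes:
    print(f"{theme}: {convergence(matrix, theme)}/{len(PARTICIPANTS)}")
```

A False cell here only means "not coded in this illustration"; it should not be read as evidence that the participant lacked the experience.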
Quantitative: Survey Analysis
IRT items were scored 1-5 (Strongly disagree to Strongly agree), and each administration was scored as the mean of its 6 items. AIRT was computed as an overall mean and as four paired-domain means (collaboration, movement, text, voice). With n = 7, the primary analysis is descriptive: individual trajectories, group means, and pre-post comparisons. Exact Wilcoxon signed-rank tests were used to test Session 1 pre vs. Session 4 post differences, with Cohen’s d (paired) and rank-biserial correlations as effect size measures.
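As a rough illustration of the scoring just described, the sketch below computes an IRT mean and AIRT overall and domain means for a single administration. Several details are assumptions rather than facts from the study: the item order within the AIRT, the 1-5 range for AIRT items, and the choice to average over whichever items were answered when one is skipped.

```python
from statistics import mean

# Hypothetical item layout: the source says 8 items in 4 paired domains,
# but does not give the item order. Indices are 0-based.
AIRT_DOMAINS = {
    "collaboration": (0, 1),
    "movement": (2, 3),
    "text": (4, 5),
    "voice": (6, 7),
}


def score_irt(responses):
    """Mean of the 6 IRT items (each scored 1-5) for one administration."""
    if len(responses) != 6:
        raise ValueError("IRT expects exactly 6 item responses")
    return mean(responses)


def score_airt(responses):
    """Overall and per-domain means for one AIRT administration.

    `responses` has 8 entries; use None for a skipped item (assumed handling).
    """
    if len(responses) != 8:
        raise ValueError("AIRT expects exactly 8 item responses")
    answered = [r for r in responses if r is not None]
    overall = mean(answered) if answered else None
    domains = {}
    for name, indices in AIRT_DOMAINS.items():
        values = [responses[i] for i in indices if responses[i] is not None]
        domains[name] = mean(values) if values else None
    return overall, domains


irt_demo = score_irt([4, 4, 3, 5, 4, 4])  # made-up responses
airt_overall_demo, airt_domains_demo = score_airt([5, 5, 4, 5, 4, 4, 3, 4])
print(irt_demo, airt_overall_demo, airt_domains_demo)
```

The example responses are made up for demonstration and do not correspond to any participant.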
Triangulation
Quantitative trajectories were compared with qualitative themes and self-reported risk-taking scores. Open-ended survey responses provided session-level texture.
Results
RQ1: How Does Singers’ Relationship to Experimentation Evolve?
The Tolerating-to-Initiating Trajectory
This is the central finding across both data sources. Six of seven participants describe moving from enduring discomfort to actively choosing to engage: not simply “getting more comfortable” but a qualitative shift in agency.
P4’s arc is the most dramatic: “As a classical singer, I’m very rigid in my own practice… at first I was tolerating discomfort… now I feel free to take artistic risks.” P1 is the most candid about initial resistance: “I didn’t want to be there. I was wary.” P5 is the contrasting case: she never experienced significant tolerating because she entered with prior somatic/improv experience and the conditions were right.
P2 is the essential counterexample. Her trajectory is not linear but condition-dependent: when group cohesion was strong (built through games), she could move toward initiating. When it broke down (Session 4, no warm-up games), she reverted to rigidity.
The IRT data makes this trajectory visible quantitatively:
The most striking pattern is the consistency of within-session gains: at every session, post-scores exceed pre-scores for nearly every participant. Each session produced an immediate increase in self-reported intellectual risk-taking that partially reset before the next session.
P1 (brown) shows the clearest individual trajectory: from 2.00 to 4.00, the lowest baseline and one of the largest gains. P7, P4, and P3 reach ceiling (5.0) early, limiting the scale’s ability to track their continued qualitative growth.
Pre-Post Comparison
The overall change from Session 1 pre to Session 4 post:
All participants show gains on both measures. P1 shows the largest IRT gain (2.0 to 4.0). P1 and P2 show the largest AIRT gains, driven primarily by growth in the voice domain.
Wilcoxon signed-rank tests support the IRT change: p = .016, d = 2.12 (very large effect). AIRT overall also reaches significance: p = .043, d = 1.05. The rank-biserial correlation is r = 1.00 for both measures, meaning every participant increased.
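For readers unfamiliar with the two effect sizes, here is a minimal sketch of how they can be computed from paired scores. It assumes the paired Cohen's d is the mean difference divided by the standard deviation of the differences (the source does not say which variant was used) and that the rank-biserial correlation is the matched-pairs form, (W+ - W-) / (W+ + W-). The demo scores are invented, not study data.

```python
from statistics import mean, stdev


def paired_cohens_d(pre, post):
    """Mean of the paired differences divided by their sample SD (assumed variant)."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / stdev(diffs)


def rank_biserial(pre, post):
    """Matched-pairs rank-biserial correlation: (W+ - W-) / (W+ + W-).

    Zero differences are dropped; tied absolute differences share an average rank.
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    if not diffs:
        raise ValueError("all paired differences are zero")
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return (w_plus - w_minus) / (w_plus + w_minus)


# Invented demo scores in which every pair increases.
pre_demo = [2.0, 3.0, 3.5, 4.0]
post_demo = [4.0, 3.5, 4.5, 4.5]
print(paired_cohens_d(pre_demo, post_demo), rank_biserial(pre_demo, post_demo))
```

When every nonzero difference is positive, W- is zero and the correlation equals 1.00, which is the pattern reported for both measures here.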
Full statistical tables and domain-level analyses are in the Detailed Results section below.
RQ2: What Conditions Facilitate Openness to Uncertainty?
Environmental Framing
The explicit naming of the space as non-evaluative was foundational. Every participant describes the space in terms of absence: no right or wrong, no expectations of correctness, no performance pressure.
P5 articulates the consequence of ambiguity: “If we don’t name the room as a space where we can explore, then I lose touch with that part of me that’s willing to just try things. I feel like someone has to tell me, ‘You can try something.’ If no one explicitly tells me that, then I feel like a shrunken version of myself.”
P2 establishes the boundary condition: “There is a point of rigidity where the values of the room still can’t change the fact that I feel inherently rigid today.” Environmental framing is necessary but not always sufficient.
Group Energy
The group is the most consistently named condition (7/7 participants). It operated through modeling (seeing others take risks), social permission (collective engagement removing individual social cost), and contagion (group energy as something you “catch”).
Three participants specifically name P5 as a catalyst for their own engagement. P3 contrasts the study with choir: “It’s looked down upon to make a mistake, so people don’t try things.”
P2 provides the essential counterpoint: when group cohesion degraded in Session 4 (no warm-up games), she reverted to rigidity. The group’s capacity to function as a risk-enabling environment must be actively constructed.
Structure: Scaffolding vs. Trigger
This is the study’s most nuanced finding. Structure functions as scaffolding for some participants and as a perfectionism trigger for others:
| Participant | Relationship to Structure |
|---|---|
| P1 | Visual scores as “bridge between what I was already comfortable with and what we were going towards” |
| P2 | “Having that external structure really activates that desire to adhere to the structure” |
| P4 | Graphic score familiarity triggered “I need to read this perfectly” before group safety overrode it |
| P6 | Least structure = most challenging (“Now do anything” felt daunting) |
| P7 | Initially followed score seriously, then got lost and let go |
The pedagogical implication: facilitators cannot assume that more structure or less structure is universally helpful. The study’s progressive arc may have worked precisely because it offered multiple entry points.
The Transfer Problem
Six of seven participants named singing with the group as audience as the hardest Session 4 activity. Even after four sessions of building safety, the shift from improvisation to performance-adjacent singing reactivated anxiety. The “performance frame” is where the old patterns reassert themselves.
Yet all seven identified specific ways to transfer the experience:
- P3: “I’ll need to instill a sense of it’s ok to make mistakes with any choir I’m in front of.”
- P4: “I might take some of the body work to my voice lessons with my students.”
- P5: “Trying to get into a space of play before I start my work!”
- P6: “I can see myself using vocal improv to free up my voice during practice sessions when a piece feels stuck.”
Two participants (P3, P4) name transfer to their own teaching, not just personal practice. The experience influenced them not only as singers but as future pedagogues. The study created conditions for a shift, but the durability and transferability of that shift to performance contexts remain open questions.
Detailed Results
The sections below provide supporting analyses, domain-level breakdowns, and the full statistical results underlying the core findings above.
“Getting It Right” and Its Dissolution
Six of seven participants describe a change in their relationship to correctness. P6 offers the most detailed account: “The first day, every time I would notice something I felt like I need to change it and make sure it’s correct. But as we went on, it became more about noticing without changing anything, and if I did change, it was based on my instincts, not in order to be correct.”
P3 identifies a layered process: facilitator-imposed expectations dissolved first, then social expectations (through group buy-in), and self-imposed expectations faded last.
P7 introduces a temporal displacement: she didn’t judge during sessions but did after. The space suspended real-time judgment rather than eliminating it.
Somatic Awareness and Vocal Discovery
For four of seven participants, the Session 4 surprises centered on vocal sounds they did not know they could make:
- P4: “The way I could access certain sounds in my voice without effort because I wasn’t thinking about it.”
- P5: “The sounds I heard come out of my mouth!”
- P2: “I made some sounds vocally that I really liked!”
- P1: “Being willing to play around with sounds up in my head voice.”
The somatic methods appear to have created conditions where the voice could do things that conscious, technique-focused effort could not access. P4: “My voice made sounds more easily when I didn’t stop to think.”
Self-Reported Risk Scores
During interviews, participants placed themselves on a 1-5 risk-taking scale before and after the study:
All seven participants reported an increase. P4 shows the largest change (+3.0, from 1 to 4), consistent with his narrative of moving from classical rigidity to freedom. P2 shows the smallest change (+1.0), consistent with her more complex, condition-dependent experience.
AIRT Domain Trajectories
The AIRT measured willingness across four domains: collaboration, movement, text, and voice. Voice showed the largest gain, starting from the lowest baseline.
Voice (pink) and text (purple) start lowest and show the most growth. Collaboration (green) stays near ceiling throughout. The convergence of all four domains toward 5.0 by Session 4 post suggests that the intervention’s effects generalized across expressive modalities rather than being domain-specific, although ceiling effects make this convergence hard to interpret.
Statistical Tests
Session 1 Pre vs. Session 4 Post:
| Measure | n | Pre M (SD) | Post M (SD) | Cohen’s d | p |
|---|---|---|---|---|---|
| IRT | 7 | 3.43 (0.87) | 4.79 (0.37) | 2.12 | .016 |
| AIRT overall | 7 | 4.39 (0.70) | 4.84 (0.37) | 1.05 | .043 |
| AIRT Text | 7 | 4.21 (0.76) | 4.86 (0.38) | 1.16 | .041 |
| AIRT Voice | 6 | 4.08 (1.43) | 4.83 (0.41) | 0.64 | .180 |
| AIRT Movement | 7 | 4.57 (0.61) | 4.86 (0.38) | 0.73 | .103 |
| AIRT Collaboration | 7 | 4.57 (0.45) | 4.79 (0.39) | 0.54 | .180 |
AIRT voice shows the largest raw gain but does not reach significance, likely because of ceiling effects and one missing data point, which reduces n to 6.
Within-session IRT changes:
| Session | n | Mean Gain | Cohen’s d | p |
|---|---|---|---|---|
| 1 | 7 | +0.79 | 1.41 | .016 |
| 2 | 6 | +0.56 | 1.30 | .043 |
| 3 | 5 | +0.60 | 1.38 | .063 |
| 4 | 7 | +0.55 | 1.60 | .016 |
Session 3 shows a comparable effect size (d = 1.38) but misses significance because n drops to 5 (two participants absent).
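The link between sample size and attainable significance can be made concrete. The sketch below enumerates every possible sign pattern to compute an exact two-sided Wilcoxon signed-rank p-value, assuming no zero and no tied differences (an assumption; the study's data may contain ties). With all differences positive, the smallest attainable p is 2 / 2^n, which is 0.015625 for n = 7 and 0.0625 for n = 5. These floors are consistent with the reported .016 and .063, and they show that a two-sided exact test with n = 5 cannot fall below .05 however large the effect.

```python
from itertools import product


def exact_two_sided_p(n, w_plus):
    """Exact two-sided signed-rank p-value for n untied, nonzero differences.

    `w_plus` is the sum of the ranks (1..n) attached to positive differences.
    """
    total = n * (n + 1) // 2
    observed = min(w_plus, total - w_plus)
    hits = 0
    # Under the null, each rank is equally likely to carry a + or - sign.
    for signs in product((0, 1), repeat=n):
        w = sum(rank for rank, s in zip(range(1, n + 1), signs) if s)
        if min(w, total - w) <= observed:
            hits += 1
    return hits / 2 ** n


floor_n7 = exact_two_sided_p(7, 28)  # all 7 differences positive
floor_n5 = exact_two_sided_p(5, 15)  # all 5 differences positive
print(floor_n7, floor_n5)
```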
Caveats: With n = 7, exact tests are used. No multiple comparison correction was applied. Results should be interpreted as supporting evidence alongside qualitative findings, not standalone proof of effect.
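For readers who want to see what a multiple comparison correction would involve, the sketch below implements the Holm step-down adjustment, one common choice. It is not a procedure the study used, and which tests count as one family is a judgment call left to the reader. The demo p-values are invented.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[index])
        # Enforce monotonicity: an adjusted p never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[index] = running_max
    return adjusted


demo_p = [0.01, 0.04, 0.03]  # invented values for illustration
demo_adjusted = holm_adjust(demo_p)
print(demo_adjusted)
```

Passing the p-values from a chosen family of tests in the tables above would give the corresponding adjusted values under that family definition.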
Limitations
Sample size. Seven participants. Findings are exploratory and not generalizable.
Researcher as facilitator. Michal designed, facilitated, and analyzed the sessions. This dual role is inherent to practice-based research but means the analysis cannot be fully separated from facilitation choices.
Self-report measures. Both IRT and AIRT rely on self-assessment. Five of seven participants reported pre-session surveys did not fully capture their state, suggesting noisy baselines.
Ceiling effects. IRT and AIRT both hit 5.0 for several participants by mid-study, limiting sensitivity to continued growth.
Missing data. Two participants missing Session 3 data; one missing Session 2.
No control condition. No comparison group. Observed changes could reflect time, group bonding, demand characteristics, or Hawthorne effects. Qualitative data helps distinguish mechanisms but cannot replace controlled design.
Session 4 design. Session 4 omitted warm-up games present in earlier sessions. This may have degraded group cohesion and represents both a limitation and a finding about the importance of warm-up activities.
Transfer. The study measures experience within the study context only. Transfer to performance or professional contexts was not directly measured.
Acknowledgments
This analysis was developed collaboratively with Claude (Anthropic) under the direction of Kayla Gautereaux. All analytical decisions were made by the research team. Survey scoring, quantitative analysis, and cross-case qualitative analysis were performed computationally; thematic interpretation was guided by the researcher’s direct knowledge of participants and sessions.