About the SONG:
Florespiña («Thorny Flower») is an homage to the famous Galician singer Ana Kiro (1942-2010), a woman ahead of her time who firmly defended her ideas in a difficult context in order to build a better Galician society. She stood for feminism, Galicianism, and progressivism. She was an example of resilience, and we portray her career through the image of gorse, regarded as the national flower of Galicia: a hardy plant that grows in desolate lands and symbolizes the strength of the Galician people, who have risen again and again despite the worst moments. The entry is a musical journey from the artist's past to her future, including a new version of her in the metaverse: a real leader who desired and fought for freedom. Her name deserves to be known by new generations.
About the HUMAN-AI PROCESS:
We started with the lyrics, using the BERTIN-GPT-J-6B model, a version of GPT-J 6B fine-tuned on Spanish, with the aim of generating traditional Galician couplets from the digital songbook Volai-vai. This dataset consists of 500 couplets, each with four lines. With just 250 fine-tuning steps, we got the model to produce coherent results. We then selected the most interesting couplets based on their theme and made stylistic corrections to some words.
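As an illustration of how a small couplet corpus like this could be prepared for fine-tuning, here is a minimal sketch. It is not the pipeline we actually ran: the function name `format_couplets`, the choice of `<|endoftext|>` as a record separator, and the placeholder input lines are all assumptions made for the example.

```python
from typing import List

# Assumption: records are separated with the GPT-style end-of-text marker.
# The source does not say which separator, if any, was used.
EOS = "<|endoftext|>"


def format_couplets(couplets: List[List[str]], lines_per_couplet: int = 4) -> str:
    """Join well-formed couplets into one training text (hypothetical helper).

    Each couplet is a list of lines. Couplets that do not have exactly
    `lines_per_couplet` non-empty lines are skipped.
    """
    records = []
    for lines in couplets:
        cleaned = [line.strip() for line in lines if line.strip()]
        if len(cleaned) != lines_per_couplet:
            continue  # drop malformed entries instead of guessing a fix
        records.append("\n".join(cleaned))
    if not records:
        return ""
    # One couplet per record, each followed by the separator on its own line.
    return "".join(record + "\n" + EOS + "\n" for record in records)


if __name__ == "__main__":
    # Placeholder text only; real input would be the songbook couplets.
    sample = [
        ["line 1", "line 2", "line 3", "line 4"],
        ["too", "short"],
    ]
    print(format_couplets(sample))
```

Keeping one couplet per record is meant to help the model learn the four-line shape, so that generation can be stopped at the separator.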
For the melody, we used Google Magenta to generate different rhythmic variations of «Viva Galicia», one of Ana Kiro's most played songs, obtaining a new version of it for each verse. For the instrumentation, we recorded a Galician bagpipe, a tambourine, and a bass drum and processed them with DDSP, obtaining new artificial sounds that follow the melodic line. In addition, we used Demucs to extract a double bass part, which was also sampled and transformed by AI.
To generate background and atmospheric sounds, we used the text-to-audio model AudioLDM. We fed it prompts taken from the song lyrics or from ideas they evoke, and we deliberately avoided guiding the model too explicitly. This process produced over a hundred sonic ideas, from which we selected the most intriguing ones. We also drew on the DAW WavTool for harmonization, fixed some artifacts, and mastered the track with Logic Pro, Ableton and iZotope Ozone AI RX 10.
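The generate-many-then-select approach described above can be sketched as a simple job list, with several random seeds per prompt. This is only an illustration under stated assumptions: `build_generation_jobs`, the seed scheme, the file naming, and the placeholder prompts are hypothetical, and the actual call to the audio model is left out.

```python
from typing import Dict, List


def build_generation_jobs(prompts: List[str], seeds_per_prompt: int = 4) -> List[Dict]:
    """Pair every prompt with several seeds (hypothetical helper).

    Each job records the prompt, a seed, and an output file name, so that
    the candidates kept after listening can be traced back to their prompt.
    """
    jobs = []
    for prompt_index, prompt in enumerate(prompts):
        text = prompt.strip()
        if not text:
            continue  # skip empty lyric lines
        for seed in range(seeds_per_prompt):
            jobs.append(
                {
                    "prompt": text,
                    "seed": seed,
                    "output": f"idea_{prompt_index:03d}_seed{seed}.wav",
                }
            )
    return jobs


if __name__ == "__main__":
    # Placeholder prompts; real ones came from the lyrics or ideas they evoke.
    jobs = build_generation_jobs(["wind over a hillside", "distant bagpipe drone"], 2)
    for job in jobs:
        print(job["output"], "<-", job["prompt"])
```

With a scheme like this, a few dozen loosely worded prompts are enough to yield more than a hundred candidates to audition.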
AI was also used for voice synthesis. Using the SoftVC VITS Singing Voice Conversion architecture, we trained two models that perform timbre transfer: given an input audio clip from any person, the model outputs an audio clip in Ana Kiro’s voice. We trained two separate models because typical speech and singing voices differ in register. Both models were trained on material provided by CRTVG.
Finally, the main tool for producing the music video was Midjourney, which generated static images from text prompts or reference images. We produced numerous visuals this way and later added motion to them with Runway Gen-2, Pika Labs and LeiaPix, choosing one or another depending on the nature of the input image and the desired motion. Lastly, InsightFaceSwap was used to replace faces in various footage with that of the singer we pay tribute to.